I had a scanned multipage TIFF image and needed to split each page out into individual files.

This is easy to do by leveraging the .NET Framework and C#, but since I did not have the development tools installed on the machine I was using, I opted to use IronPython (via ipy.exe) to quickly script the processing logic.

Using Stack Overflow as a 'blog' engine, I'll provide an answer to my own question. Comments, suggestions, alternatives, etc. are welcome!

+2  A: 

Here is one way to do this - tweak as needed.


import clr
clr.AddReference("System.Drawing")

from System.Drawing import Image
from System.Drawing.Imaging import FrameDimension, ImageFormat
from System.IO import Path

# sourceFilePath - The full path to the TIFF image on disk (e.g. r"C:\files\multipage.tif")
# outputDir - The directory to store the individual files.  Each output file is suffixed with its page number.
def splitImage(sourceFilePath, outputDir):
    img = Image.FromFile(sourceFilePath)
    try:
        name = Path.GetFileNameWithoutExtension(sourceFilePath)
        ext = Path.GetExtension(sourceFilePath)

        # A multipage TIFF exposes its pages through the Page frame dimension.
        for i in range(img.GetFrameCount(FrameDimension.Page)):
            outputFilePath = Path.Combine(outputDir, name + "_" + str(i + 1) + ext)

            # Make page i the active frame, then write it out as a single-page TIFF.
            img.SelectActiveFrame(FrameDimension.Page, i)
            img.Save(outputFilePath, ImageFormat.Tiff)
    finally:
        # Release the file handle that Image.FromFile keeps open.
        img.Dispose()
Kevin Pullin
+1  A: 

One downside to doing it this way is that the image data is decompressed and then re-compressed when it is saved. With lossless compression this only costs time and memory, but if the pages inside the TIFF use JPEG compression, you will lose quality.

There are ways to do this using libtiff directly; I don't know of any other non-commercial tools that can do it. Basically, you need to find the TIFF directory entries in the file that relate to the image data and copy them directly into a new TIFF without decoding and re-encoding them. Depending on how much you want to carry over, you may need to fix offsets in the entries (e.g. if you are also bringing over the metadata).
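To give a rough idea of what "finding the directory entries" involves, here is a minimal sketch in plain Python (standard library only, not libtiff) that walks the chain of image file directories (IFDs) in a classic TIFF and reports each page's Compression tag. The function name `list_tiff_pages` is made up for this example, and it assumes the whole file is already in memory and is not a BigTIFF. It only inspects the structure; actually copying a page out without re-encoding would also mean relocating the strip or tile data and rewriting the offsets that point to it.

```python
import struct

COMPRESSION_TAG = 259   # "Compression" tag number in TIFF 6.0
JPEG_SCHEMES = (6, 7)   # 6 = old-style JPEG, 7 = JPEG


def list_tiff_pages(data):
    """Walk the IFD chain of a classic TIFF held in `data` (bytes).

    Hypothetical helper for illustration. Returns one dict per page with
    the IFD offset, the number of entries, and the Compression value
    (None if the tag is absent).
    """
    # The first two bytes give the byte order: "II" little, "MM" big endian.
    if data[:2] == b"II":
        endian = "<"
    elif data[:2] == b"MM":
        endian = ">"
    else:
        raise ValueError("not a TIFF file")

    # Next come the magic number 42 and the offset of the first IFD.
    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled)")

    pages = []
    seen = set()  # guards against a corrupt file whose IFD chain loops
    while offset != 0 and offset not in seen:
        seen.add(offset)
        (count,) = struct.unpack_from(endian + "H", data, offset)
        compression = None
        for n in range(count):
            # Each entry is 12 bytes: tag, field type, value count, value/offset.
            tag, _ftype, _nvalues, value = struct.unpack_from(
                endian + "HHI4s", data, offset + 2 + 12 * n)
            if tag == COMPRESSION_TAG:
                # A single SHORT is stored left-justified in the 4-byte field.
                (compression,) = struct.unpack(endian + "H", value[:2])
        pages.append({"offset": offset, "entries": count,
                      "compression": compression})
        # The 4 bytes after the last entry hold the next IFD offset (0 = end).
        (offset,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * count)
    return pages
```

Reading a file with `open(path, "rb").read()` and passing the bytes to this function gives one entry per page. Any page whose `compression` value is in `JPEG_SCHEMES` is one that would lose quality if it were decoded and saved again.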

If you are interested in being able to split, merge, remove pages from, or reorder TIFF documents without losing quality (while also running faster and using less memory), take a look at my company's product, DotImage, and in particular the TiffDocument class. This CodeProject article shows how to do it.

Lou Franco