image_extractor tool


Page containing seven archival photos of people in outdoor wear on a trip through the woods.
Page from the Tacoma Mountaineers 1911 scrapbook, courtesy Tacoma Northwest Room.

The second tool was prompted by an archivist who was digitizing scrapbooks with our Zeutschel overhead scanner and wondered whether we could batch extract individual photographs from the full-page images. The image processing here is done with scikit-image, a collection of image processing algorithms that can be combined in different ways depending on the type of content you need to mask and extract.


image_extractor tool code snippet


  • from skimage import io, color, filters, measure, morphology, util: image reading, grayscale conversion for preprocessing, thresholding, and region analysis and labeling
  • from skimage.morphology import binary_closing, remove_small_objects: “closing” object edges and removing noise
  • from scipy.ndimage import binary_fill_holes: “fill” interior holes in identified objects
  • import numpy as np: replace Python loops with array operations for speed
  • import os: file and directory operations and path manipulation
  • from pathlib import Path: cleaner syntax for path manipulation
  • import time: performance logging
  • from PIL import Image: converting to final image format
  • Parameters:
    • min_long_edge = 300
    • max_long_edge = 1000
    • max_photos = 10
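
To show how the imports and parameters above might fit together, here is a minimal sketch of the masking and region-filtering steps. It is not the full script: the function name find_photo_boxes, the 5x5 closing footprint, the 64-pixel noise floor, the Otsu threshold, and the assumption that photographs are darker than the scrapbook page are all illustrative choices that would need tuning for a real collection.

```python
import numpy as np
from scipy.ndimage import binary_fill_holes
from skimage import color, filters, measure
from skimage.morphology import binary_closing, remove_small_objects

# Parameter values taken from the list above (long edges in pixels).
MIN_LONG_EDGE = 300
MAX_LONG_EDGE = 1000
MAX_PHOTOS = 10


def find_photo_boxes(page, min_long_edge=MIN_LONG_EDGE,
                     max_long_edge=MAX_LONG_EDGE, max_photos=MAX_PHOTOS):
    """Return bounding boxes (min_row, min_col, max_row, max_col) of likely photos."""
    # Work in grayscale so a single threshold can separate photo from page.
    gray = color.rgb2gray(page) if page.ndim == 3 else page.astype(float)

    # Assumption: photographs are darker than the surrounding page.
    mask = gray < filters.threshold_otsu(gray)

    # Close small gaps in object edges, drop specks, then fill interior holes.
    mask = binary_closing(mask, np.ones((5, 5), dtype=bool))
    mask = remove_small_objects(mask, min_size=64)
    mask = binary_fill_holes(mask)

    # Label connected regions and keep those whose long edge is in range.
    candidates = []
    for region in measure.regionprops(measure.label(mask)):
        min_r, min_c, max_r, max_c = region.bbox
        long_edge = max(max_r - min_r, max_c - min_c)
        if min_long_edge <= long_edge <= max_long_edge:
            candidates.append((region.area, (min_r, min_c, max_r, max_c)))

    # Keep at most max_photos regions, largest first.
    candidates.sort(key=lambda item: item[0], reverse=True)
    return [bbox for _, bbox in candidates[:max_photos]]
```

Each returned box can be used to slice the page array, for example page[min_row:max_row, min_col:max_col], before saving the crop with PIL.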


Find the full script here


The design of this code is a little different from the previous tool, with simple, human-readable parameters you can adjust depending on the general size of the photos in your collection, the number of photos within an image, and the amount of margin around those photos. The code identifies each photograph within an image placed in the A folder and generates a new derivative in the B folder, where the derivatives are given sequential file names based on the parent item name. The code is currently set up to assign sequential file names by the position of each photograph in a clockwise rotation, but this can be configured to whatever you find intuitive.
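
The ordering and naming step can be sketched as follows. This is one possible reading of "clockwise rotation", not necessarily how the full script does it: the helper names clockwise_order and derivative_names, the choice to start at 12 o'clock around the page center, and the two-digit "_01" suffix pattern are all assumptions made for illustration.

```python
import math
from pathlib import Path


def clockwise_order(boxes, page_shape):
    """Sort (min_row, min_col, max_row, max_col) boxes clockwise around the page center."""
    center_r = page_shape[0] / 2
    center_c = page_shape[1] / 2

    def angle(box):
        # Offset of the box center from the page center.
        d_row = (box[0] + box[2]) / 2 - center_r
        d_col = (box[1] + box[3]) / 2 - center_c
        # Rows grow downward, so atan2(d_col, -d_row) is 0 straight up
        # and increases in the clockwise direction as seen on the page.
        return math.atan2(d_col, -d_row) % (2 * math.pi)

    return sorted(boxes, key=angle)


def derivative_names(parent, count, suffix=".jpg"):
    """Build sequential file names from the parent item's file name."""
    stem = Path(parent).stem
    return [f"{stem}_{i:02d}{suffix}" for i in range(1, count + 1)]
```

With this starting angle, a photo in the top-right corner is numbered before one in the top-left. Swapping the sort key (for example, to sort by row and then column) is all it takes to switch to a reading-order scheme instead.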