How to Extract Metadata from TIFF Image Files

TIFF files carry more metadata than most image formats, from standard EXIF camera data to GeoTIFF spatial coordinates and multi-page document structures. This guide walks through extracting TIFF metadata with ExifTool, Python, and GDAL, then shows how to automate extraction at scale with AI-powered tools.

Fast.io Editorial Team 9 min read

What Metadata Does a TIFF File Contain?

TIFF (Tagged Image File Format) stores metadata in Image File Directory (IFD) entries, a structure that predates and inspired the EXIF standard used in JPEGs. Where JPEG metadata is limited to a handful of well-known fields, TIFF's IFD system supports over 80 baseline tags defined in the TIFF 6.0 specification, plus hundreds of extension and private tags registered by software vendors and standards bodies.

A typical TIFF file can contain several distinct metadata layers:

  • Baseline TIFF tags describe the image dimensions, bit depth, compression method, color space, and resolution. These 13 required fields appear in every valid TIFF file.
  • EXIF data records camera settings like shutter speed, aperture, ISO, focal length, and GPS coordinates. TIFF stores EXIF in a dedicated sub-IFD rather than an embedded binary block.
  • IPTC metadata carries editorial information: photographer credits, captions, keywords, copyright notices, and usage rights. News agencies and stock photo libraries depend on IPTC fields for cataloging.
  • XMP data embeds an XML document with Adobe-standard fields for editing history, color profiles, and custom properties.
  • GeoTIFF tags add coordinate reference systems, map projections, and spatial extents. These tags turn a standard TIFF into a georeferenced raster used in GIS and remote sensing.
  • Private IFD tags in the 32768+ range store vendor-specific data from applications like Photoshop, ImageJ, and medical imaging systems.

This layered structure means that extracting "the metadata" from a TIFF file requires different tools depending on which layer you need. A simple EXIF reader will miss GeoTIFF coordinates entirely, and a GIS tool might skip IPTC credits.
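To make the IFD structure concrete, here is a minimal sketch that parses the header and first IFD of a classic (non-BigTIFF) file using only the standard library. The function names and the tiny in-memory sample file are illustrative assumptions, not part of any library; real files should go through the tools covered below.

```python
import struct

def read_ifd0_entries(data: bytes):
    """Return (tag, field_type, count, raw_value_bytes) for each entry in the first IFD."""
    if data[:2] == b"II":
        endian = "<"   # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"   # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF uses 43)")
    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    entries = []
    for i in range(entry_count):
        start = ifd_offset + 2 + i * 12          # each IFD entry is 12 bytes
        tag, field_type, count = struct.unpack(endian + "HHI", data[start:start + 8])
        entries.append((tag, field_type, count, data[start + 8:start + 12]))
    return entries

def build_sample_tiff() -> bytes:
    """Hypothetical helper: a header plus one IFD holding ImageWidth and ImageLength."""
    header = struct.pack("<2sHI", b"II", 42, 8)   # byte order, magic, offset of IFD0
    ifd = struct.pack("<H", 2)                    # number of entries
    ifd += struct.pack("<HHII", 256, 4, 1, 640)   # ImageWidth, type LONG, count 1
    ifd += struct.pack("<HHII", 257, 4, 1, 480)   # ImageLength, type LONG, count 1
    ifd += struct.pack("<I", 0)                   # offset of next IFD (0 = none)
    return header + ifd

entries = read_ifd0_entries(build_sample_tiff())
for tag, field_type, count, raw in entries:
    print(tag, field_type, count, raw)
```

Every metadata layer described above hangs off entries like these: EXIF, GPS, and GeoTIFF data are reached through specific tag numbers in this directory.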

Reading TIFF Metadata with ExifTool

ExifTool is the most complete command-line tool for reading TIFF metadata across all layers. Written in Perl by Phil Harvey, it handles baseline tags, EXIF, IPTC, XMP, GeoTIFF keys, and most private IFD extensions in a single pass.

Install ExifTool

On macOS, install via Homebrew:

brew install exiftool

On Ubuntu/Debian:

sudo apt install libimage-exiftool-perl

On Windows, download the standalone executable from exiftool.org.

Read All Metadata

Run ExifTool against any TIFF file to dump every recognized tag:

exiftool photo.tiff

This outputs hundreds of lines for a well-tagged TIFF. To narrow the output, use group filters:

exiftool -EXIF:all photo.tiff
exiftool -IPTC:all photo.tiff
exiftool -XMP:all photo.tiff

Extract GeoTIFF Spatial Data

For GeoTIFF files, ExifTool decodes the GeoKeys stored in tags 34735 (GeoKeyDirectoryTag), 34736 (GeoDoubleParamsTag), and 34737 (GeoAsciiParamsTag) and reports them in its GeoTiff group:

exiftool -GeoTiff:all satellite.tif

This returns the decoded keys, such as the model type, projection, and datum. ExifTool also exposes the three raw tags as GeoTiffDirectory, GeoTiffDoubleParams, and GeoTiffAsciiParams when you request them by name, but these are undecoded blocks. For the pixel-to-coordinate mapping, check the ModelTiepointTag and ModelPixelScaleTag, which ExifTool names ModelTiePoint and PixelScale:

exiftool -ModelTiePoint -PixelScale satellite.tif
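As a rough illustration of how the ModelTiepointTag and ModelPixelScaleTag values are used, the sketch below maps a pixel position to model coordinates. It assumes a north-up image with a single tiepoint; the function name and the numbers are made up for the example.

```python
def pixel_to_map(col, row, tiepoint, pixel_scale):
    """Map a raster position to model coordinates for a north-up GeoTIFF.

    tiepoint is (I, J, K, X, Y, Z) and pixel_scale is (ScaleX, ScaleY, ScaleZ),
    in the order the two tags store their values.
    """
    i, j, _k, x0, y0, _z = tiepoint
    scale_x, scale_y, _scale_z = pixel_scale
    x = x0 + (col - i) * scale_x
    y = y0 - (row - j) * scale_y   # rows increase downward, so northing decreases
    return x, y

# Illustrative values (not from a real file): 30 m pixels, upper-left at (500000, 4100000)
tiepoint = (0.0, 0.0, 0.0, 500000.0, 4100000.0, 0.0)
pixel_scale = (30.0, 30.0, 0.0)
print(pixel_to_map(10, 20, tiepoint, pixel_scale))   # (500300.0, 4099400.0)
```

Files that are rotated or sheared carry a full ModelTransformationTag instead, which this sketch does not handle.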

Handle Multi-Page TIFFs

TIFF files can contain multiple pages, each with its own IFD and metadata. ExifTool reads every IFD and assigns each one a family 1 group name (IFD0 for the first page, IFD1 for the second, and so on), but by default it suppresses tags with duplicate names. Add -a -G1 to see every page's tags with their group, and select a single page by its group, for example -IFD1:all for the second page.

To export all metadata as JSON for downstream processing:

exiftool -json -struct multipage.tiff > metadata.json
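For downstream processing, it often helps to split the exported tags by group. The sketch below assumes the export was run with -G1 added, so that each key has the form Group:Tag; the helper name and the sample record are illustrative, with made-up values.

```python
import json
from collections import defaultdict

def group_tags(record: dict) -> dict:
    """Split 'Group:Tag' keys from one ExifTool JSON record into nested dicts."""
    grouped = defaultdict(dict)
    for key, value in record.items():
        group, sep, tag = key.partition(":")
        if sep:
            grouped[group][tag] = value
        else:
            grouped["(none)"][key] = value   # e.g. SourceFile has no group prefix
    return dict(grouped)

# Illustrative record in the shape `exiftool -json -G1` produces; values are made up.
sample = json.loads(
    '[{"SourceFile": "multipage.tiff", "IFD0:ImageWidth": 2480, '
    '"IFD1:ImageWidth": 1240, "XMP-dc:Title": "Scan"}]'
)
for group, tags in group_tags(sample[0]).items():
    print(group, tags)
```

With real output you would replace the sample string with json.load() on the exported file.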

Extract TIFF Metadata with Python

Python offers several libraries for TIFF metadata extraction, each suited to different use cases.

Pillow for Baseline Tags and EXIF

Pillow (the maintained fork of PIL) reads baseline TIFF tags and EXIF data directly:

from PIL import Image
from PIL.ExifTags import TAGS
from PIL.TiffTags import TAGS_V2

with Image.open("photo.tiff") as img:
    # Baseline TIFF tags from the first IFD
    for tag_id, value in img.tag_v2.items():
        info = TAGS_V2.get(tag_id)
        tag_name = info.name if info else f"Unknown({tag_id})"
        print(f"{tag_name}: {value}")

    # EXIF sub-IFD, reached through the ExifOffset pointer (tag 0x8769)
    exif_ifd = img.getexif().get_ifd(0x8769)
    for tag_id, value in exif_ifd.items():
        tag_name = TAGS.get(tag_id, f"Unknown({tag_id})")
        print(f"{tag_name}: {value}")

ExifRead for Lightweight Extraction

The exifread library is a pure-Python option with no compiled dependencies. It handles TIFF, JPEG, HEIC, and several RAW formats:

import exifread

with open("photo.tiff", "rb") as f:
    tags = exifread.process_file(f)
    for tag, value in tags.items():
        print(f"{tag}: {value}")

Tifffile for Scientific and Multi-Page TIFFs

For scientific imaging, microscopy, or multi-page TIFFs, the tifffile library provides low-level access to every IFD:

import tifffile

with tifffile.TiffFile("multipage.tiff") as tif:
    for i, page in enumerate(tif.pages):
        print(f"--- Page {i} ---")
        for tag in page.tags.values():
            print(f"{tag.name}: {tag.value}")

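tifffile exposes each page's tags as a separate object, which makes it the most direct option covered here for per-page metadata in multi-page TIFFs. Pillow can also step through pages with seek(), but it loads only the first page when a file is opened.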

PyExifTool for Full ExifTool Coverage

If you need the same comprehensive tag coverage as the ExifTool CLI but want to stay in Python, pyexiftool wraps the command-line tool in a persistent subprocess:

import exiftool

files = ["scan_001.tiff", "scan_002.tiff", "scan_003.tiff"]
with exiftool.ExifToolHelper() as et:
    metadata = et.get_metadata(files)
    for item in metadata:
        print(item.get("SourceFile"), item.get("EXIF:Make"))

Running ExifTool in batch mode this way avoids the startup cost of launching a new process per file, which matters when processing hundreds or thousands of TIFFs.

Fastio features

Turn TIFF archives into searchable, structured data

Upload TIFF files to a Fast.io workspace and let Metadata Views extract structured fields from both embedded tags and image content. 50 GB free storage, no credit card required.

Extract GeoTIFF Metadata with GDAL and Rasterio

Standard image tools read EXIF and IPTC from GeoTIFF files, but they do not interpret the spatial reference data that makes GeoTIFF useful for mapping and remote sensing. For spatial metadata, you need GDAL or Rasterio, a Python library built on top of GDAL.

GDAL Command Line

The gdalinfo command prints the full spatial metadata of any GeoTIFF:

gdalinfo satellite.tif

This outputs the coordinate reference system (CRS), the geotransform matrix (origin point, pixel size, rotation), band count, data type, and any embedded metadata domains. For machine-readable output:

gdalinfo -json satellite.tif

Rasterio in Python

Rasterio provides a Pythonic interface to GDAL that reads GeoTIFF spatial metadata cleanly:

import rasterio

with rasterio.open("satellite.tif") as src:
    print(f"CRS: {src.crs}")
    print(f"Bounds: {src.bounds}")
    print(f"Resolution: {src.res}")
    print(f"Band count: {src.count}")
    print(f"Data type: {src.dtypes}")
    print(f"Transform: {src.transform}")

    # Read file-level tags (default metadata domain)
    tags = src.tags()
    print(f"File tags: {tags}")

The src.crs attribute returns a CRS object that you can convert with src.crs.to_epsg() or src.crs.to_wkt(). The src.transform attribute is an affine matrix that converts pixel coordinates to geographic coordinates, and src.bounds reports the resulting geographic extent of the image.
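To show what the affine matrix does without requiring Rasterio, here is a small pure-Python sketch. It assumes the six coefficients are given in the order (a, b, c, d, e, f), where x = a*col + b*row + c and y = d*col + e*row + f; the function names and values are illustrative.

```python
def apply_affine(coeffs, col, row):
    """Convert a (col, row) pixel position to (x, y) map coordinates."""
    a, b, c, d, e, f = coeffs
    return (a * col + b * row + c, d * col + e * row + f)

def bounds_from_affine(coeffs, width, height):
    """Return (left, bottom, right, top) for a north-up raster (b == d == 0)."""
    left, top = apply_affine(coeffs, 0, 0)
    right, bottom = apply_affine(coeffs, width, height)
    return (left, bottom, right, top)

# Illustrative transform: 30 m pixels, upper-left corner at (500000, 4100000)
coeffs = (30.0, 0.0, 500000.0, 0.0, -30.0, 4100000.0)
print(bounds_from_affine(coeffs, 100, 200))
# (500000.0, 4094000.0, 503000.0, 4100000.0)
```

The negative e coefficient reflects that row numbers grow downward while northing grows upward.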

When to Use GDAL vs. ExifTool

Use GDAL/Rasterio when you need to work with spatial data: reprojecting coordinates, calculating coverage areas, or feeding metadata into a GIS pipeline. Use ExifTool when you need EXIF camera data, IPTC credits, or XMP editing history from the same GeoTIFF file. For a complete picture, run both.


Automate TIFF Metadata Extraction at Scale

Extracting metadata from a few TIFF files is straightforward. Extracting it from thousands of files in an archive, across a distributed team, or as part of an automated pipeline requires a different approach.

Batch Processing with ExifTool

ExifTool can process entire directory trees and output structured data:

exiftool -r -json -ext tiff -ext tif ./archive/ > all_metadata.json

The -r flag recurses into subdirectories. The -ext flags limit processing to TIFF files only. The JSON output feeds directly into databases, spreadsheets, or analysis scripts.

For selective extraction, pull only the fields you need:

exiftool -r -csv -ext tiff -ext tif -FileName -ImageWidth -ImageHeight -Make -Model -GPSLatitude -GPSLongitude ./archive/ > metadata.csv

Python Pipeline Example

For custom processing logic, combine tifffile with batch iteration:

import tifffile
import json
from pathlib import Path

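results = []
# Match both common extensions, case-insensitively (.tif, .tiff, .TIF, ...)
tiff_paths = (p for p in Path("./archive").rglob("*")
              if p.suffix.lower() in {".tif", ".tiff"})
for tiff_path in tiff_paths:
    with tifffile.TiffFile(tiff_path) as tif:
        file_meta = {"file": str(tiff_path), "pages": []}
        for page in tif.pages:
            # Stringify values so every tag is JSON-serializable
            page_meta = {tag.name: str(tag.value) for tag in page.tags.values()}
            file_meta["pages"].append(page_meta)
        results.append(file_meta)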

with open("metadata.json", "w") as f:
    json.dump(results, f, indent=2)
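If a spreadsheet is the target rather than JSON, the nested results can be flattened to one row per page. This is a minimal sketch using only the standard library; the function name, the chosen fields, and the sample data are assumptions made for illustration.

```python
import csv
import io

def flatten_pages(results, fields):
    """Turn the nested per-file, per-page structure into one flat row per page."""
    rows = []
    for file_meta in results:
        for page_index, page_meta in enumerate(file_meta["pages"]):
            row = {"file": file_meta["file"], "page": page_index}
            for field in fields:
                row[field] = page_meta.get(field, "")   # blank when a page lacks the tag
            rows.append(row)
    return rows

# Illustrative data in the same shape as `results` above; values are made up.
sample_results = [
    {"file": "archive/scan_001.tif",
     "pages": [{"ImageWidth": "2480", "ImageLength": "3508"},
               {"ImageWidth": "1240"}]},
]
fields = ["ImageWidth", "ImageLength"]
rows = flatten_pages(sample_results, fields)

# Write to an in-memory buffer here; swap in open("metadata.csv", "w", newline="") for a file
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["file", "page"] + fields)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```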

AI-Powered Extraction with Fast.io Metadata Views

When TIFF files contain information that standard metadata tags do not capture (scanned documents, handwritten labels on archival photos, embedded text in scientific imagery), traditional extraction tools fall short. Metadata Views takes a different approach: you describe the fields you want in plain language, and AI extracts structured data from the file content itself.

Upload TIFF files to a Fast.io workspace, define columns like "subject," "date taken," "location," or "document type," and Metadata Views populates a sortable, filterable spreadsheet. This works for scanned pages, photographs, and any TIFF where the valuable information lives in the image content rather than the embedded tags.

For teams managing large TIFF archives (museum collections, surveying firms, medical imaging departments), this combines tag-based metadata extraction with content-based extraction in one workspace. Standard EXIF and IPTC data gets indexed automatically through Intelligence Mode, while Metadata Views handles the unstructured content.

Troubleshooting Common TIFF Metadata Issues

TIFF metadata extraction fails in predictable ways. Here are the issues you will hit and how to handle them.

Missing or Stripped Metadata

Some image editors strip metadata during save operations. Photoshop preserves all metadata layers by default, but web export tools, batch converters, and compression utilities often discard EXIF, IPTC, and XMP to reduce file size. If a TIFF has no metadata beyond baseline tags, check whether the file passed through a conversion step.

To verify what metadata exists before processing:

exiftool -a -G1 photo.tiff

The -a flag shows duplicate tags, and -G1 prefixes each tag with its family 1 group name (such as IFD0, ExifIFD, IPTC, or XMP-dc) so you can see exactly which layers are present.
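When checking many files, a quick tally of tags per group is easier to scan than the full listing. The sketch below parses the text output of the command above, assuming the usual "[Group] Tag Name : value" line layout; the sample text is illustrative and the helper name is made up.

```python
import re
from collections import Counter

# Matches lines such as "[IFD0]          Image Width                     : 640"
LINE = re.compile(r"^\[(?P<group>[^\]]+)\]\s+(?P<name>.+?)\s*: (?P<value>.*)$")

def count_tags_by_group(text: str) -> Counter:
    """Count how many tags each group contributes in -G1 style output."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            counts[match.group("group")] += 1
    return counts

# Illustrative output; real files will list many more tags.
sample_output = """\
[IFD0]          Image Width                     : 640
[IFD0]          Image Height                    : 480
[ExifIFD]       ISO                             : 200
[XMP-dc]        Title                           : Scan
"""
print(count_tags_by_group(sample_output))
```

A file whose tally shows only IFD0 has most likely lost its EXIF, IPTC, and XMP layers somewhere upstream.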

BigTIFF and Large File Handling

Standard TIFF files are limited to 4 GB by the 32-bit offset structure. Files larger than 4 GB use the BigTIFF variant (identified by magic number 43 instead of 42 in the header). Most modern tools handle BigTIFF transparently, but older libraries may fail silently or truncate data. ExifTool, tifffile, GDAL, and Rasterio all support BigTIFF.
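Before handing a file to an older library, you can check which variant it is from the first four bytes. This is a minimal sketch based on the magic numbers described above; the function name is illustrative.

```python
import struct

def tiff_variant(header: bytes) -> str:
    """Classify a file as 'classic' or 'bigtiff' from its first four bytes."""
    if len(header) < 4:
        raise ValueError("need at least 4 bytes")
    if header[:2] == b"II":
        endian = "<"   # little-endian
    elif header[:2] == b"MM":
        endian = ">"   # big-endian
    else:
        raise ValueError("not a TIFF file")
    (magic,) = struct.unpack(endian + "H", header[2:4])
    if magic == 42:
        return "classic"
    if magic == 43:
        return "bigtiff"
    raise ValueError(f"unexpected magic number {magic}")

print(tiff_variant(b"II\x2a\x00"))   # classic
print(tiff_variant(b"MM\x00\x2b"))   # bigtiff
```

For a file on disk, pass the result of reading four bytes in binary mode, for example open(path, "rb").read(4).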

Corrupted IFD Chains

Multi-page TIFF files link IFDs in a chain, where each IFD points to the next. If one link breaks (due to file corruption, incomplete writes, or buggy software), all subsequent pages become inaccessible. ExifTool and tifffile typically read up to the broken link and then report a warning or error. Diagnostic tools like tiffinfo from the LibTIFF suite print the offset of each directory they can read, which shows where the chain breaks. The -D flag also reads and decompresses the image data, so damaged strips surface as errors:

tiffinfo -D corrupted.tiff
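The chain itself is simple enough to walk by hand. The following sketch follows the next-IFD pointers in a classic TIFF and stops at the first pointer that is out of range or loops back; the function names and the synthetic test files are assumptions for illustration, and it is not a substitute for tiffinfo on real data.

```python
import struct

def walk_ifd_chain(data: bytes):
    """Return (ifd_offsets, problem); problem is None if the chain ends cleanly."""
    endian = "<" if data[:2] == b"II" else ">"
    (offset,) = struct.unpack(endian + "I", data[4:8])
    offsets, seen = [], set()
    while offset != 0:
        if offset in seen:
            return offsets, f"loop back to offset {offset}"
        if offset + 2 > len(data):
            return offsets, f"IFD offset {offset} is past end of file"
        (count,) = struct.unpack(endian + "H", data[offset:offset + 2])
        next_ptr = offset + 2 + count * 12       # pointer follows the 12-byte entries
        if next_ptr + 4 > len(data):
            return offsets, f"IFD at {offset} is truncated"
        seen.add(offset)
        offsets.append(offset)
        (offset,) = struct.unpack(endian + "I", data[next_ptr:next_ptr + 4])
    return offsets, None

def make_ifd(next_offset: int) -> bytes:
    """Hypothetical helper: an 18-byte IFD with one entry (ImageWidth = 1)."""
    return struct.pack("<HHHIII", 1, 256, 4, 1, 1, next_offset)

header = struct.pack("<2sHI", b"II", 42, 8)
good = header + make_ifd(26) + make_ifd(0)       # IFD0 at 8, IFD1 at 26
broken = header + make_ifd(9999) + make_ifd(0)   # first pointer is out of range

print(walk_ifd_chain(good))     # ([8, 26], None)
print(walk_ifd_chain(broken))   # ([8], 'IFD offset 9999 is past end of file')
```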

Conflicting Metadata Between Layers

A TIFF file can have the same field (like "Author" or "Copyright") set differently in the IPTC, XMP, and EXIF layers. The Metadata Working Group (MWG) guidelines define how readers should reconcile these overlapping fields, and ExifTool's -use MWG option applies those rules through its MWG composite tags, so you do not have to pick a winner by hand.
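If you reconcile layers in your own code instead, the logic can be kept explicit. The sketch below picks the first non-empty value according to a caller-supplied priority list and flags disagreements; the function names, the sample values, and the order used in the example are illustrative assumptions, not a statement of what any standard mandates.

```python
def resolve_field(layers: dict, priority: list):
    """Return (value, source_layer) from the first layer in priority order with a value."""
    for layer in priority:
        value = layers.get(layer)
        if value not in (None, ""):
            return value, layer
    return None, None

def has_conflict(layers: dict) -> bool:
    """True when two or more layers hold different non-empty values."""
    values = {v for v in layers.values() if v not in (None, "")}
    return len(values) > 1

# Made-up copyright values as they might appear in three layers of one file
copyright_by_layer = {
    "EXIF": "(c) 2019 Studio",
    "IPTC": "",
    "XMP": "(c) 2021 Studio Ltd",
}
priority = ["XMP", "IPTC", "EXIF"]   # example order only; choose what your workflow requires

print(resolve_field(copyright_by_layer, priority))   # ('(c) 2021 Studio Ltd', 'XMP')
print(has_conflict(copyright_by_layer))              # True
```

Logging the conflicts alongside the chosen value makes it easier to audit an archive later.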

Frequently Asked Questions

How do I read metadata from a TIFF file?

The fastest method is ExifTool on the command line. Run 'exiftool photo.tiff' to see all recognized metadata tags. For specific layers, use 'exiftool -EXIF:all photo.tiff' for camera data or 'exiftool -IPTC:all photo.tiff' for editorial fields. In Python, use the Pillow library for baseline tags and EXIF, or the exifread library for a lightweight pure-Python option.

What metadata does a TIFF file contain?

TIFF files store metadata in Image File Directory (IFD) entries. A standard TIFF contains baseline tags (dimensions, compression, color space), EXIF data (camera settings, GPS), IPTC fields (credits, keywords, copyright), and XMP data (editing history, custom properties). GeoTIFF files add spatial reference tags for coordinate systems and map projections. The TIFF 6.0 spec defines over 80 baseline tags, with hundreds more available through extensions.

How do I extract GeoTIFF metadata?

Use GDAL's gdalinfo command for a quick readout of the coordinate reference system, geotransform matrix, and band information. In Python, Rasterio provides the same data through a cleaner API: open the file with rasterio.open() and access src.crs for the coordinate system, src.bounds for geographic extent, and src.transform for the affine matrix. ExifTool can also read GeoTIFF keys but does not interpret the spatial math.

Can I view TIFF EXIF data online?

Several web tools accept TIFF uploads for metadata viewing, but they typically only read EXIF and basic tags. They will miss GeoTIFF spatial data, multi-page IFD structures, and private vendor tags. For complete metadata extraction, desktop tools like ExifTool or Python libraries give you access to every tag layer without file size limits or upload restrictions.

How do I handle multi-page TIFF metadata?

Each page in a multi-page TIFF has its own Image File Directory (IFD) with independent metadata. ExifTool reads every page's IFD; add -a -G1 to list each page's tags under its own group (IFD0, IFD1, and so on). In Python, the tifffile library lets you iterate over each page individually with tif.pages, accessing per-page tags through page.tags. Pillow loads only the first page when a file is opened, so you need to call seek() to reach later pages.

What is the difference between TIFF metadata and JPEG metadata?

Both formats support EXIF, IPTC, and XMP metadata, but TIFF's native IFD structure stores this data differently. TIFF embeds EXIF as a sub-IFD with direct tag access, while JPEG wraps EXIF in an APP1 marker segment. TIFF also supports multi-page metadata (one IFD per page), GeoTIFF spatial tags, and a wider range of private extension tags. JPEG metadata is typically limited to a single image.
