How to Extract Metadata from WebP and AVIF Images

WebP and AVIF store EXIF, XMP, and ICC profile metadata inside container structures that older tools often skip entirely. This guide covers how to extract metadata from both formats using ExifTool, webpmux, Pillow, ExifReader, and libheif, with practical commands and code snippets for each approach.

Fast.io Editorial Team

How WebP and AVIF Store Metadata Differently

Both WebP and AVIF support standard metadata types (EXIF, XMP, ICC color profiles), but they package them in fundamentally different container formats. Understanding the container structure explains why some tools extract metadata from JPEG without trouble but return nothing for WebP or AVIF.

WebP uses RIFF (Resource Interchange File Format). The file starts with a RIFF header, followed by a series of chunks identified by four-character codes. Metadata lives in dedicated chunks: 'EXIF' for Exchangeable Image File Format data, 'XMP ' (with a trailing space) for XMP, and 'ICCP' for ICC color profiles. Each chunk stores a 4-byte identifier, a 32-bit little-endian size field, and the raw metadata payload. The WebP spec allows at most one chunk of each metadata type per file. An odd-sized payload is followed by a single zero byte of padding so that every chunk starts at an even offset.
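The chunk layout described above can be walked with a few lines of Python. This is a minimal sketch, not a full WebP parser: the function names list_webp_chunks and metadata_chunks are hypothetical, and it assumes the whole file fits in memory and is well-formed.

```python
import struct

def list_webp_chunks(data: bytes):
    """Return (fourcc, payload_size) pairs for the top-level chunks of a WebP file."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WEBP":
        raise ValueError("not a RIFF/WebP file")
    chunks = []
    offset = 12  # skip 'RIFF', the 4-byte file size, and 'WEBP'
    while offset + 8 <= len(data):
        fourcc = data[offset:offset + 4].decode("ascii", errors="replace")
        (size,) = struct.unpack_from("<I", data, offset + 4)  # little-endian uint32
        chunks.append((fourcc, size))
        # Advance past header and payload; odd-sized payloads carry one padding byte
        offset += 8 + size + (size & 1)
    return chunks

def metadata_chunks(data: bytes):
    """Keep only the chunks that carry EXIF, XMP, or ICC data."""
    wanted = {"EXIF", "XMP ", "ICCP"}
    return [chunk for chunk in list_webp_chunks(data) if chunk[0] in wanted]

# Usage (hypothetical file name):
# with open("photo.webp", "rb") as f:
#     print(metadata_chunks(f.read()))
```

An empty result from metadata_chunks means the file carries no metadata chunks at all, which is useful for telling "no metadata" apart from "tool could not read it".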

AVIF uses ISOBMFF (ISO Base Media File Format). This is the same box-based container that HEIF and MP4 use. The file opens with a File Type Box (ftyp) that identifies it as AVIF, followed by nested boxes for image data, item properties, and metadata. EXIF and XMP are stored as metadata items that are listed in the item information box (iinf) and located through the item location box (iloc), while an ICC profile is carried as a colour information property (colr). The hierarchical box structure is more complex than RIFF chunks, which is why AVIF metadata support arrived later in most tools.
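To make the box structure concrete, here is a small sketch that lists the top-level boxes of an ISOBMFF file and checks the ftyp brands for AVIF. The names list_boxes and looks_like_avif are hypothetical, and the brand check assumes the ftyp box uses the ordinary 8-byte header. It does not descend into the meta box, which is where real EXIF and XMP extraction would have to continue.

```python
import struct
from typing import Optional

def list_boxes(data: bytes, start: int = 0, end: Optional[int] = None):
    """Return (box_type, offset, size) for each box between start and end."""
    end = len(data) if end is None else end
    boxes = []
    offset = start
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)  # big-endian
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            if offset + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # box extends to the end of the data
            size = end - offset
        if size < header or offset + size > end:
            break  # malformed or truncated box: stop rather than guess
        boxes.append((box_type.decode("ascii", errors="replace"), offset, size))
        offset += size
    return boxes

def looks_like_avif(data: bytes) -> bool:
    """Check whether the ftyp box lists an AVIF brand ('avif' or 'avis')."""
    boxes = list_boxes(data)
    if not boxes or boxes[0][0] != "ftyp":
        return False
    _, offset, size = boxes[0]
    # ftyp payload: major brand (4 bytes), minor version (4), compatible brands
    payload = data[offset + 8:offset + size]
    brands = {payload[0:4]} | {payload[i:i + 4] for i in range(8, len(payload), 4)}
    return b"avif" in brands or b"avis" in brands
```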

The practical difference comes down to tool maturity. WebP's RIFF format has been parseable by most metadata libraries since 2014. AVIF's ISOBMFF boxes required explicit support that many libraries only added in 2020 or later. If your extraction tool was built or last updated before AVIF support was added, it will silently return empty results rather than throwing an error.

One more quirk worth noting: neither WebP nor AVIF supports IPTC metadata natively at the container level. Some tools like Exiv2 will derive IPTC values by converting from XMP or EXIF data, but those derived values exist only in the tool's output, not in the actual file. If you need IPTC fields, extract the XMP block and parse it separately.

Reading WebP and AVIF Metadata with ExifTool

ExifTool, created by Phil Harvey, is the most reliable command-line option for reading metadata from both WebP and AVIF. It has supported WebP metadata since version 9.51 and added AVIF read/write support in version 11.79 (December 2019). The current version handles both formats without any special flags or configuration.

Installing ExifTool

macOS (Homebrew):

brew install exiftool

Ubuntu/Debian:

sudo apt install libimage-exiftool-perl

Windows: Download the standalone executable from exiftool.org and add it to your PATH.

Reading All Metadata

Run ExifTool against any WebP or AVIF file to dump all available metadata:

exiftool photo.webp
exiftool photo.avif

Both commands produce the same structured output: file type, MIME type, image dimensions, bit depth, and any embedded EXIF tags (camera model, GPS coordinates, date/time, exposure settings) plus XMP and ICC profile data.

Extracting Specific Fields

Pull just the fields you need with tag flags:

exiftool -Model -DateTimeOriginal -GPSLatitude -GPSLongitude photo.webp

JSON Output for Scripting

For pipeline integration, export as JSON:

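exiftool -json -a -G1 photo.avif > metadata.json

The -a flag requests duplicate tags, which matters for AVIF files where the ISOBMFF container may store the same field in more than one place. In JSON output, entries that share a key are collapsed into one, so -G1 prefixes each tag with its group name to keep the duplicates distinct.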
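For pipelines written in Python, the JSON export can be consumed directly instead of going through a file. The sketch below assumes exiftool is on the PATH; the helper names are hypothetical, and the -G1 option is this sketch's own choice so that every key carries its group name.

```python
import json
import subprocess

def exiftool_command(path: str) -> list:
    """Build the ExifTool invocation (JSON output, duplicates kept, group names added)."""
    return ["exiftool", "-json", "-a", "-G1", path]

def parse_exiftool_json(text: str) -> dict:
    """ExifTool prints a JSON array with one object per input file; return the first."""
    records = json.loads(text)
    if not records:
        raise ValueError("ExifTool returned no records")
    return records[0]

def read_metadata(path: str) -> dict:
    """Run ExifTool on one file and return its tags as a dictionary."""
    result = subprocess.run(
        exiftool_command(path), capture_output=True, text=True, check=True
    )
    return parse_exiftool_json(result.stdout)

# Usage (hypothetical file name):
# tags = read_metadata("photo.avif")
# print(sorted(tags))
```

Splitting command construction and parsing from the subprocess call keeps both pieces testable on machines where ExifTool is not installed.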

Batch Processing

Process an entire directory of mixed formats in one pass:

exiftool -csv -r ./images/ > all_metadata.csv

ExifTool auto-detects the format of each file regardless of extension. This handles the common case where a build tool outputs .webp and .avif files into the same directory and you need a unified metadata report.
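Once the CSV report exists, a short script can flag files that lack a field you care about. This is a sketch under the assumption that the report was produced by the -csv command above, whose first column is SourceFile; the function name files_missing_tag is hypothetical.

```python
import csv

def files_missing_tag(csv_file, tag: str) -> list:
    """Return the SourceFile values of rows where the given column is absent or empty."""
    reader = csv.DictReader(csv_file)
    return [
        row["SourceFile"]
        for row in reader
        if not (row.get(tag) or "").strip()
    ]

# Usage:
# with open("all_metadata.csv", newline="", encoding="utf-8") as f:
#     for path in files_missing_tag(f, "GPSLatitude"):
#         print("no GPS latitude:", path)
```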

Using webpmux for WebP-Specific Extraction

Google's webpmux tool is part of the libwebp package and works directly with WebP's RIFF chunk structure. It is more limited than ExifTool (WebP only, no AVIF), but useful when you need to extract raw metadata blocks for further processing.

Installing webpmux

macOS:

brew install webp

Ubuntu/Debian:

sudo apt install webp

Extracting Raw EXIF Data

webpmux -get exif photo.webp -o photo_exif.bin

This writes the raw EXIF binary blob to a file. The output is a standard TIFF-structured EXIF block that you can then inspect with ExifTool or any EXIF parser:

exiftool photo_exif.bin
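Because the extracted blob is TIFF-structured, a quick sanity check is to read its byte-order mark and magic number before handing it to a parser. The sketch below uses a hypothetical function name, exif_byte_order, and also tolerates the JPEG-style "Exif\0\0" prefix that some writers leave in front of the TIFF data.

```python
import struct

def exif_byte_order(blob: bytes) -> str:
    """Validate a raw EXIF blob's TIFF header and report its byte order."""
    # Some writers prefix the TIFF data with the JPEG APP1 identifier
    if blob.startswith(b"Exif\x00\x00"):
        blob = blob[6:]
    if len(blob) < 8:
        raise ValueError("blob too short to hold a TIFF header")
    if blob[:2] == b"II":
        fmt, order = "<", "little-endian"
    elif blob[:2] == b"MM":
        fmt, order = ">", "big-endian"
    else:
        raise ValueError("missing TIFF byte-order mark")
    (magic,) = struct.unpack_from(fmt + "H", blob, 2)
    if magic != 42:  # TIFF magic number
        raise ValueError("bad TIFF magic number")
    return order

# Usage:
# with open("photo_exif.bin", "rb") as f:
#     print(exif_byte_order(f.read()))
```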

Extracting XMP Data

webpmux -get xmp photo.webp -o photo_xmp.xml

The output is a standard XML file you can parse with any XMP library or even a text editor.
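As one way to parse the extracted packet, the sketch below reads Dublin Core list properties (such as creator or subject) with Python's standard XML parser. The function name dc_values is hypothetical, and the code assumes the packet is well-formed RDF/XML that stores these properties as rdf:li lists, which is the common XMP layout.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def dc_values(xmp: bytes, field: str) -> list:
    """Return the rdf:li values of a Dublin Core property such as 'creator'."""
    # Trailing NUL bytes occasionally pad extracted packets and would break the parser
    root = ET.fromstring(xmp.rstrip(b"\x00"))
    values = []
    for prop in root.iter("{%s}%s" % (NS["dc"], field)):
        for item in prop.iter("{%s}li" % NS["rdf"]):
            if item.text and item.text.strip():
                values.append(item.text.strip())
    return values

# Usage:
# with open("photo_xmp.xml", "rb") as f:
#     print(dc_values(f.read(), "creator"))
```

The same approach extends to other namespaces, including the IPTC-related ones, by adding their URIs to the lookup.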

Extracting ICC Profiles

webpmux -get icc photo.webp -o photo_profile.icc

Stripping All Metadata

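If you need to remove metadata before sharing (for privacy or file size), webpmux can strip metadata chunks. Its documented usage takes one chunk type per -strip call, so run it once for each type you want to remove:

webpmux -strip exif photo.webp -o no_exif.webp
webpmux -strip xmp no_exif.webp -o no_exif_xmp.webp
webpmux -strip icc no_exif_xmp.webp -o clean_photo.webp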

webpmux is particularly handy in build pipelines where you already have libwebp installed for image conversion. It adds no extra dependencies and runs faster than ExifTool for WebP-only workflows because it reads the RIFF chunks directly without format detection overhead.

Fastio features

Stop Running ExifTool on Every Image Manually

Fast.io Metadata Views extract structured data from images at scale. Describe the fields you need, and AI populates a queryable spreadsheet from every file in your workspace. 50 GB free, no credit card required.

Extracting AVIF Metadata with libheif and Exiv2

AVIF's ISOBMFF container requires tools that understand the box structure. Two good options are libheif (a C library with Python bindings) and Exiv2 (a C++ library with a command-line interface).

Exiv2

Exiv2 supports AVIF, HEIF, and CR3 files when built with BMFF support enabled. Most recent package manager installations include this by default, but you can verify:

exiv2 --version --verbose --grep bmff

Look for enable_bmff=1 in the output. If BMFF support is present, reading AVIF metadata works the same as any other format:

exiv2 -pa photo.avif

This prints all Exif, IPTC, and XMP tags that Exiv2 finds. To print only the XMP properties:

exiv2 -px photo.avif

Exiv2 also supports modifying and removing metadata, making it useful when you need to clean AVIF files before distribution.
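In a script, the BMFF check can run once before any AVIF processing starts. The sketch below parses the output of the version command shown earlier in this section; the function names are hypothetical, and it assumes the output contains a line of the form enable_bmff=1 as described above.

```python
import subprocess

def bmff_enabled(version_output: str) -> bool:
    """Return True if the Exiv2 version output reports enable_bmff=1."""
    for line in version_output.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "enable_bmff":
            return value.strip() == "1"
    return False

def exiv2_supports_avif() -> bool:
    """Ask the installed exiv2 binary whether BMFF support was compiled in."""
    result = subprocess.run(
        ["exiv2", "--version", "--verbose", "--grep", "bmff"],
        capture_output=True, text=True,
    )
    return bmff_enabled(result.stdout)

# Usage:
# if not exiv2_supports_avif():
#     raise SystemExit("this exiv2 build cannot read AVIF; use ExifTool instead")
```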

libheif

libheif is a C library for reading and writing HEIF and AVIF files. It parses the ISOBMFF box structure natively and exposes metadata through its API. The command-line tool heif-info shows file structure:

heif-info photo.avif

For programmatic access, the Python bindings (pyheif or pillow-heif) let you extract EXIF data directly:

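import pillow_heif
from PIL import Image

# Register the AVIF opener with Pillow. This assumes a pillow-heif release
# that ships AVIF support; recent Pillow versions can open AVIF without it.
pillow_heif.register_avif_opener()
img = Image.open("photo.avif")
exif_data = img.getexif()

# Print each tag in the base IFD as numeric ID and value
for tag_id, value in exif_data.items():
    print(f"{tag_id}: {value}")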

The pillow-heif library bridges libheif's ISOBMFF parser with Pillow's familiar API, so you can use the same getexif() method across JPEG, WebP, and AVIF files.

Python and JavaScript Libraries for Programmatic Extraction

When you need to extract metadata inside an application rather than from the command line, several libraries handle both WebP and AVIF well.

Python with Pillow

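Pillow (PIL Fork) supports reading EXIF and XMP from both WebP and AVIF. WebP has been supported for years, but native AVIF reading only arrived in recent Pillow releases (the prebuilt wheels include it from 11.3). Older installations need a plugin such as pillow-heif or pillow-avif-plugin.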

from PIL import Image
from PIL.ExifTags import TAGS

img = Image.open("photo.webp")
exif_data = img.getexif()

for tag_id, value in exif_data.items():
    tag_name = TAGS.get(tag_id, tag_id)
    print(f"{tag_name}: {value}")

The same code works for AVIF files. Pillow auto-detects the format and extracts from the appropriate container structure. For XMP data, use img.info.get("xmp") to get the raw XMP XML bytes.

One limitation: Pillow's EXIF support covers standard tags but not all manufacturer-specific MakerNote data. For comprehensive tag coverage, shell out to ExifTool or use the pyexiftool wrapper library.
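A related detail is that getexif() iterates only the base IFD, so GPS values have to be read from their own sub-IFD. The sketch below shows one way to do that with Pillow's get_ifd() and to convert the degrees/minutes/seconds triple to decimal degrees. The function names are hypothetical; the numeric tag IDs (0x8825 for the GPS IFD, 1 to 4 for latitude and longitude with their reference letters) come from the EXIF standard.

```python
def dms_to_decimal(dms, ref: str) -> float:
    """Convert (degrees, minutes, seconds) plus an N/S/E/W reference to decimal degrees."""
    degrees, minutes, seconds = (float(part) for part in dms)
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value

def read_gps(path: str):
    """Return (latitude, longitude) in decimal degrees, or None if the file has no GPS data."""
    from PIL import Image  # imported here so dms_to_decimal works without Pillow

    gps_ifd_tag = 0x8825  # pointer to the GPS sub-IFD
    with Image.open(path) as img:
        gps = img.getexif().get_ifd(gps_ifd_tag)
    if not gps or 2 not in gps or 4 not in gps:
        return None
    latitude = dms_to_decimal(gps[2], gps.get(1, "N"))   # tags 1/2: latitude ref and value
    longitude = dms_to_decimal(gps[4], gps.get(3, "E"))  # tags 3/4: longitude ref and value
    return latitude, longitude

# Usage (hypothetical file name):
# print(read_gps("photo.webp"))
```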

JavaScript with ExifReader

ExifReader is a client-side JavaScript library that supports JPEG, PNG, WebP, AVIF, HEIC, and TIFF. It parses EXIF, IPTC, XMP, ICC, and MPF metadata depending on the format.

import ExifReader from 'exifreader';

const tags = await ExifReader.load('photo.webp');
console.log(tags.DateTimeOriginal?.description);
console.log(tags.GPSLatitude?.description);

ExifReader works in both Node.js and browser environments. In the browser, pass a File object or ArrayBuffer instead of a file path. This makes it useful for client-side metadata preview before uploading images to a workspace or CDN.

Node.js with sharp

The sharp image processing library exposes metadata through its API:

import sharp from 'sharp';  // ES module import, so the top-level await below is valid

const metadata = await sharp('photo.avif').metadata();
console.log(metadata.width, metadata.height);
console.log(metadata.exif);  // raw EXIF buffer
console.log(metadata.icc);   // raw ICC buffer

sharp uses libvips under the hood, which has native WebP and AVIF support. The metadata.exif field returns a raw buffer that you can parse with ExifReader or a similar library for human-readable tag names.

For workflows where images arrive from multiple sources in mixed formats, these libraries normalize the extraction interface so your code handles JPEG, WebP, and AVIF identically.

Handling Metadata Loss During Format Conversion

Converting images to WebP or AVIF does not always preserve metadata. Whether metadata survives depends entirely on the conversion tool, not the target format. Both WebP and AVIF are capable of storing full EXIF, XMP, and ICC data, but many conversion tools strip metadata by default for smaller file sizes.

Tools That Preserve Metadata

ExifTool copy-on-convert: After converting with any tool, copy metadata from the original:

exiftool -TagsFromFile original.jpg converted.webp

This reads all transferable tags from the source file and writes them into the destination's container. It works across formats, including JPEG to WebP, PNG to AVIF, or TIFF to WebP.

cwebp with metadata flag: Google's WebP encoder preserves metadata when you request it:

cwebp -metadata all input.jpg -o output.webp

The -metadata flag accepts all, none (the default), exif, xmp, icc, or a comma-separated combination.

avifenc with metadata transfer: The reference AVIF encoder from libavif can preserve EXIF and XMP:

avifenc --exif exif_data.bin --xmp xmp_data.xml input.png output.avif

This requires extracting the metadata first, then embedding it during encoding. It is less convenient than ExifTool's copy approach but gives you full control over exactly which metadata blocks are included.

Common Pitfalls

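ImageMagick's handling of WebP metadata depends on the version and on how it was built, and its -strip option removes all profiles outright. Do not assume tags survived a convert run: check the output with ExifTool and, if anything is missing, copy the tags back from the original with -TagsFromFile.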

WordPress and CMS image processing pipelines often strip EXIF during upload optimization. If your workflow depends on metadata surviving upload, test with a file containing known GPS coordinates and verify they exist in the processed output.

CDN image optimization (Cloudflare Polish, Fastly Image Optimizer, Imgix) may strip or alter metadata during automatic format conversion. Check your CDN's documentation for metadata preservation settings.
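The verification step suggested above can be automated as a comparison between the tags of the original and the processed file. This is a minimal sketch: the function name missing_after_conversion is hypothetical, and both dictionaries are assumed to come from the same extraction tool (for example ExifTool's JSON output), so that tag names and value formatting are comparable.

```python
def missing_after_conversion(original: dict, converted: dict, required: list) -> list:
    """Return the required tags that the original had but the converted file lost or changed."""
    problems = []
    for tag in required:
        # Tags the original never had are not counted as losses
        if tag in original and converted.get(tag) != original[tag]:
            problems.append(tag)
    return problems

# Usage with two tag dictionaries loaded elsewhere:
# lost = missing_after_conversion(original_tags, converted_tags,
#                                 ["GPSLatitude", "GPSLongitude", "DateTimeOriginal"])
# if lost:
#     print("pipeline dropped:", ", ".join(lost))
```

Running this check in CI against a fixture image with known GPS coordinates catches a pipeline or CDN setting change before it silently strips metadata in production.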

For teams managing large image libraries where metadata preservation matters, storing originals in a workspace with versioning gives you a clean fallback. If a conversion pipeline strips metadata, you can always re-extract from the original. Fast.io workspaces keep every version of every file, so the metadata-rich original is always recoverable even after converting to WebP or AVIF for web delivery.

Fast.io's Metadata Views take this further. Instead of manually running ExifTool across hundreds of images, you can describe the fields you want extracted (camera model, GPS coordinates, creation date, color profile) in plain language, and Metadata Views populates a sortable, filterable spreadsheet from every image in the workspace. When new images arrive or you add columns, extraction runs incrementally without reprocessing files you have already handled.

Frequently Asked Questions

Does WebP support EXIF metadata?

Yes. WebP stores EXIF metadata in a dedicated RIFF chunk labeled 'EXIF' inside the file's container structure. It also supports XMP metadata (in an 'XMP ' chunk) and ICC color profiles (in an 'ICCP' chunk). ExifTool, Exiv2, webpmux, Pillow, and ExifReader all support reading EXIF from WebP files.

How do I read metadata from AVIF files?

Use ExifTool (version 11.79 or later), Exiv2 with BMFF support enabled, or a library like pillow-heif in Python. ExifTool is the simplest option because it requires no special configuration. Run "exiftool photo.avif" to dump all metadata, or use tag flags to extract specific fields like GPS coordinates or camera model.

Does converting to WebP remove EXIF data?

It depends on the conversion tool. Many tools strip metadata by default to reduce file size. Google's cwebp encoder preserves metadata when you add the "-metadata all" flag. ImageMagick's behavior varies with its version and build options, so check its output rather than assuming tags survived. You can also restore metadata after conversion by running "exiftool -TagsFromFile original.jpg converted.webp" to copy tags from the source file.

Which tools support AVIF metadata extraction?

ExifTool has full AVIF read/write support. Exiv2 supports AVIF when built with BMFF enabled (check with "exiv2 --version --verbose --grep bmff"). In Python, recent Pillow releases read AVIF EXIF natively (prebuilt wheels from 11.3), and older versions can do so through a plugin such as pillow-heif. In JavaScript, ExifReader handles AVIF in both Node.js and browser environments. The sharp library also exposes raw EXIF buffers from AVIF files through its metadata API.

Can WebP and AVIF store GPS location data?

Yes. Both formats support EXIF metadata, which includes GPS latitude, longitude, altitude, and timestamp tags. The GPS data is stored in the same structure as JPEG EXIF GPS data, just packaged inside a different container (RIFF for WebP, ISOBMFF for AVIF). Extract GPS data with "exiftool -GPSLatitude -GPSLongitude photo.webp" or the equivalent for AVIF files.

Why does my metadata extraction tool return empty results for AVIF files?

Your tool likely does not support AVIF's ISOBMFF container format. AVIF uses the same box-based structure as HEIF and MP4, which requires explicit parser support that many metadata tools only added after 2020. Update to the latest version of your tool, or switch to ExifTool, which has supported AVIF since version 11.79 (December 2019).
