MediaInfo Tutorial: How to Extract Video and Audio Metadata
MediaInfo is a free, open-source tool that reads technical and tag metadata from video, audio, and container files. This tutorial covers installation, CLI usage, output format options (JSON, XML, custom templates), batch processing, and integration into automated QC pipelines. You will also learn how MediaInfo compares to ffprobe and where each tool fits in a media workflow.
What MediaInfo Does and Why It Matters
MediaInfo displays technical and tag metadata for video, audio, and container formats: codec, bitrate, resolution, embedded subtitles, and more. It reads the internal structure of a media file and reports every property it finds, from frame rate and color primaries down to the language tags on individual audio tracks.
The tool supports over 100 container and codec formats: MP4, MKV, AVI, MOV, MXF, GXF, FLV, WebM, MPEG-TS, MPEG-PS, and dozens more on the container side, plus H.264, H.265/HEVC, AV1, ProRes, DNxHD, VP9, AAC, FLAC, Opus, DTS, Dolby E, and many others for video and audio streams. That breadth makes it useful whether you are working with consumer camera footage, broadcast mezzanine files, or archival masters.
MediaInfo is used by major broadcasters and archives for quality control workflows. The Library of Congress, the British Film Institute, and the European Broadcasting Union all reference it in their preservation toolchains. It is licensed under BSD-2-Clause, so you can use it in commercial projects as long as you retain the copyright notice and license text when you redistribute it.
Why does this matter in day-to-day work? A few common use cases:
- Pre-transcode validation: Confirm codec, bitrate, and resolution before sending files through an encoding pipeline
- Quality control: Verify that delivered files meet spec (correct frame rate, audio channel layout, color space)
- Cataloging: Extract metadata from hundreds of files to build a searchable database of your media library
- Debugging: Identify why a player chokes on a specific file by inspecting its container structure and stream properties
Installing the MediaInfo CLI
MediaInfo ships as both a GUI application and a CLI tool. For automation work, the CLI is what you want. Here is how to install it on each major platform.
macOS
The fastest route is Homebrew. The media-info formula gives you the CLI binary:
brew install media-info
If you also want the GUI app, install the cask instead:
brew install --cask mediainfo
Linux (Debian/Ubuntu)
On Debian-based distributions, MediaInfo is in the default repositories:
sudo apt install mediainfo
For the latest version, MediaArea maintains official repositories at mediaarea.net/en/Repos with packages for Ubuntu, Debian, Fedora, CentOS, and openSUSE. Adding the MediaArea repo ensures you get new releases (the current version as of early 2026 is 26.01) instead of whatever ships with your distro.
Windows
Download the CLI package from mediaarea.net/en/MediaInfo/Download/Windows. The ZIP contains a standalone mediainfo.exe binary with no installer required. Add it to your PATH for terminal access. Alternatively, if you use Chocolatey:
choco install mediainfo-cli
Verifying the Installation
Run a quick test to confirm everything works:
mediainfo --Version
You should see output like MediaInfo Command line, MediaInfoLib - v26.01.
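In a provisioning or CI script, the same check can be automated. The sketch below is one way to do it in Python, assuming the version banner has the shape shown above. The helper names parse_version and installed_version are illustrative and not part of MediaInfo.

```python
import re
import shutil
import subprocess


def parse_version(banner):
    """Pull the library version (e.g. '26.01') out of a --Version banner."""
    match = re.search(r"v(\d+(?:\.\d+)+)", banner)
    return match.group(1) if match else None


def installed_version():
    """Return the installed version string, or None if mediainfo is not on PATH."""
    exe = shutil.which("mediainfo")
    if exe is None:
        return None
    result = subprocess.run([exe, "--Version"], capture_output=True, text=True)
    # Check both streams, since the banner's destination is an assumption here
    return parse_version(result.stdout + result.stderr)


print(parse_version("MediaInfo Command line, MediaInfoLib - v26.01"))
```

Calling installed_version() on a machine without the CLI returns None, which a setup script can treat as "install required".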
Basic CLI Usage and Output Modes
Point MediaInfo at any media file and it returns a structured report:
mediainfo input.mp4
The default output is plain text, organized by stream type. You will see a General section (container format, file size, duration, overall bitrate), followed by Video, Audio, Text (subtitles), and Menu sections as applicable.
Reading Specific Stream Types
If you only care about one stream type, the --Inform parameter lets you target it:
mediainfo --Inform="Video;%Format% %Width%x%Height% %FrameRate%fps" input.mp4
This prints something like AVC 1920x1080 23.976fps with no extra text. The %Parameter% tokens map to MediaInfo's internal field names, and the full list is documented at mediaarea.net/en/MediaInfo/Support/Fields.
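When a script needs several fields at once, it can help to build the --Inform string from a list of field names and split the result on a known separator. This is a minimal sketch, assuming a file with a single video stream and field values that never contain the separator. inform_template and query are hypothetical helper names.

```python
import subprocess


def inform_template(stream, fields, sep="|"):
    """Build an inline --Inform string such as 'Video;%Format%|%Width%'."""
    return f"{stream};" + sep.join(f"%{name}%" for name in fields)


def query(path, stream, fields, sep="|"):
    """Run mediainfo once and return the requested fields as a dict of strings."""
    template = inform_template(stream, fields, sep)
    # Passing a list avoids shell quoting problems with the % and ; characters
    result = subprocess.run(
        ["mediainfo", f"--Inform={template}", path],
        capture_output=True,
        text=True,
    )
    values = result.stdout.strip().split(sep)
    return dict(zip(fields, values))


print(inform_template("Video", ["Format", "Width", "Height", "FrameRate"]))
```

With MediaInfo installed, query("input.mp4", "Video", ["Format", "Width"]) would return a dict keyed by field name.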
Full Output
The default mode omits some less common fields. To see everything MediaInfo knows about a file, add the --Full flag:
mediainfo --Full input.mp4
This can produce hundreds of lines for a complex MXF or Matroska file with multiple audio and subtitle tracks.
Quick Summary
For a condensed view, you can extract just the most useful fields. An inline --Inform string addresses a single stream type, so a summary of container format, video codec, resolution, frame rate, and audio codec takes one call per stream type:
mediainfo --Inform="General;%Format%" input.mp4
mediainfo --Inform="Video;%Format% %Width%x%Height% %FrameRate%fps" input.mp4
mediainfo --Inform="Audio;%Format% %Channel(s)%ch %SamplingRate%Hz" input.mp4
Each call prints one short line, which makes the output easy to pipe into other tools. To cover several stream types in a single call, use a template file (see Custom Templates).
Store and Search Your Media Metadata in One Place
Upload video files and metadata reports to Fast.io workspaces. Intelligence Mode indexes everything for semantic search, and Metadata Views extract structured data from your documents. 50 GB free, no credit card required.
Exporting JSON, XML, and Custom Templates
Plain text is fine for quick checks, but automated pipelines need structured data. MediaInfo supports JSON, XML, PBCore, and EBUCore output natively.
JSON Output
mediainfo --Output=JSON input.mp4
This produces a well-formed JSON document with nested objects for each stream. You can pipe it directly to jq for field extraction:
mediainfo --Output=JSON input.mp4 | jq '.media.track[] | select(."@type"=="Video") | {codec: .Format, width: .Width, height: .Height, bitrate: .BitRate}'
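If jq is not available, the same selection can be written in a few lines of Python. The sample report below is hand-written to mirror the media.track[] layout that the jq filter relies on. It is not output from a real file, and keeping the numeric values as strings is an assumption about the report layout.

```python
import json

# Hand-written sample in the shape the jq filter expects (illustrative values)
SAMPLE = """
{"media": {"@ref": "input.mp4", "track": [
  {"@type": "General", "Format": "MPEG-4"},
  {"@type": "Video", "Format": "AVC", "Width": "1920", "Height": "1080", "BitRate": "5000000"},
  {"@type": "Audio", "Format": "AAC"}
]}}
"""


def video_summary(report):
    """Return codec, width, height, and bitrate for every video track."""
    return [
        {
            "codec": track.get("Format"),
            "width": track.get("Width"),
            "height": track.get("Height"),
            "bitrate": track.get("BitRate"),
        }
        for track in report["media"]["track"]
        if track.get("@type") == "Video"
    ]


print(video_summary(json.loads(SAMPLE)))
```

Using .get() keeps the function from raising when a track lacks one of the fields. Missing values come back as None.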
XML Output
mediainfo --Output=XML input.mp4
Starting with version 26.01, MediaInfo ships an XSD schema for its XML output, which makes it straightforward to validate programmatically. The XML format is useful for integration with archival systems that expect PBCore or custom schemas.
Custom Templates
The --Inform parameter also accepts template files. Create a text file with section headers and field tokens:
General;Filename: %FileName%
Format: %Format%
Duration: %Duration/String3%
Size: %FileSize/String3%
Video;Codec: %Format%
Resolution: %Width%x%Height%
Frame Rate: %FrameRate% fps
Bit Rate: %BitRate/String%
Color Space: %ColorSpace%
Audio;Audio: %Format% %Channel(s)%ch %SamplingRate/String% %BitRate/String%
Save it as template.txt, then reference it:
mediainfo --Inform=file://template.txt input.mp4
The /String suffixes on numeric fields (like Duration/String3 or BitRate/String) tell MediaInfo to return formatted, human-readable values instead of raw milliseconds or bits per second. Duration/String3, for example, yields a timecode such as "01:23:45.000", and BitRate/String yields a value with units. Use the raw numeric versions when you need precise values for calculations.
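To make the relationship between raw and formatted values concrete, here is a rough sketch of the arithmetic that turns a raw millisecond duration into an HH:MM:SS.mmm timecode. ms_to_timecode is a hypothetical helper for illustration. It does not describe how MediaInfo itself is implemented.

```python
def ms_to_timecode(ms):
    """Format a duration in whole milliseconds as HH:MM:SS.mmm."""
    hours, rest = divmod(ms, 3_600_000)   # 3,600,000 ms per hour
    minutes, rest = divmod(rest, 60_000)  # 60,000 ms per minute
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


# 1 h 23 min 45 s = 5,025,000 ms
print(ms_to_timecode(5_025_000))  # prints 01:23:45.000
```

Storing the raw value and formatting it only for display keeps sorting and arithmetic simple.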
CSV-Style Output for Spreadsheets
For catalog-building, combine --Inform with comma separation:
mediainfo --Inform="General;%FileName%,%Format%,%Duration/String3%,%FileSize/String3%
" *.mp4 > catalog.csv
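This gives you one row per file, ready to open in a spreadsheet. The line break inside the quotes is intentional: it ends each row. Note that %FileName% omits the extension, and a filename containing a comma will spill into extra columns.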
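A comma-joined --Inform string does not quote its values, so a filename containing a comma would shift the columns. One possible alternative is to collect the fields first and let Python's csv module handle quoting. The sketch below uses hand-written rows rather than real MediaInfo output, and write_catalog is a hypothetical helper.

```python
import csv
import io

FIELDS = ["FileName", "Format", "Duration", "FileSize"]


def write_catalog(rows, out):
    """Write one CSV row per file, quoting any value that contains a comma."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Illustrative rows; in practice these would come from MediaInfo reports
rows = [
    {"FileName": "interview, take 2", "Format": "MPEG-4",
     "Duration": "00:01:30.000", "FileSize": "12.3 MiB"},
]
buffer = io.StringIO()
write_catalog(rows, buffer)
print(buffer.getvalue())
```

Passing an open file instead of the StringIO buffer writes the catalog to disk. Open it with newline="" as the csv documentation recommends.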
Batch Processing and Pipeline Integration
MediaInfo accepts multiple file arguments, so the simplest batch approach is passing a glob:
mediainfo --Output=JSON *.mp4 *.mkv > batch_report.json
For recursive processing across directories, combine it with find:
find /media/archive -name "*.mxf" -exec mediainfo --Output=JSON {} \; > archive_metadata.json
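Because -exec ... \; runs MediaInfo once per file, archive_metadata.json ends up holding one JSON document per file, written back to back, rather than a single array. jq reads such a stream as-is, but json.load does not. The sketch below shows one way to split the stream in Python. iter_json_documents is a hypothetical helper, and the two-document string stands in for the real file.

```python
import json


def iter_json_documents(text):
    """Yield each JSON document from a string of back-to-back documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it by hand
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        document, pos = decoder.raw_decode(text, pos)
        yield document


# Stand-in for the contents of archive_metadata.json (illustrative)
stream = '{"media": {"@ref": "a.mxf"}}\n{"media": {"@ref": "b.mxf"}}\n'
refs = [doc["media"]["@ref"] for doc in iter_json_documents(stream)]
print(refs)
```

For a real archive, read the file into a string and pass it to iter_json_documents. Very large archives may be better served by one report file per media file.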
Building a QC Script
A practical quality control script might check that all delivered files meet broadcast specs. Here is a Bash example that flags files where the video codec is not H.264 or the frame rate is not 29.97:
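#!/bin/bash
# Flag delivered files that are not H.264 (reported as "AVC") at 29.97 fps.
for file in /delivery/*.mp4; do
    [ -e "$file" ] || continue  # skip the literal pattern when nothing matches
    codec=$(mediainfo --Inform="Video;%Format%" "$file")
    fps=$(mediainfo --Inform="Video;%FrameRate%" "$file")
    # MediaInfo prints the frame rate with three decimals, hence "29.970"
    if [ "$codec" != "AVC" ] || [ "$fps" != "29.970" ]; then
        echo "FAIL: $file (codec=$codec, fps=$fps)"
    fi
done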
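The same rule can also be applied to track objects from MediaInfo's JSON output, which allows a numeric tolerance on the frame rate instead of an exact string match. This is a sketch under stated assumptions: check_video, its default spec, and the tolerance value are illustrative choices, and the track dicts are hand-written.

```python
import math


def check_video(track, codec="AVC", fps=29.97, tolerance=0.001):
    """Return a list of spec violations for one video track (empty if it passes)."""
    problems = []
    if track.get("Format") != codec:
        problems.append(f"codec={track.get('Format')} (expected {codec})")
    try:
        actual = float(track.get("FrameRate"))
    except (TypeError, ValueError):
        actual = math.nan  # a missing or malformed frame rate counts as a failure
    if not math.isclose(actual, fps, abs_tol=tolerance):
        problems.append(f"fps={track.get('FrameRate')} (expected {fps})")
    return problems


print(check_video({"Format": "AVC", "FrameRate": "29.970"}))
print(check_video({"Format": "HEVC", "FrameRate": "23.976"}))
```

Returning a list of problems rather than a boolean makes it easy to report every violation for a file in one pass.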
Python Integration with pymediainfo
If you are building a Python pipeline, the pymediainfo library wraps libmediainfo and returns parsed results as Python objects:
from pymediainfo import MediaInfo

# Parse the file once, then walk its tracks (General, Video, Audio, ...)
media_info = MediaInfo.parse("input.mp4")
for track in media_info.tracks:
    if track.track_type == "Video":
        print(f"{track.format} {track.width}x{track.height} {track.frame_rate}fps")
    elif track.track_type == "Audio":
        # MediaInfo's "Channel(s)" field is exposed as the attribute channel_s
        print(f"{track.format} {track.channel_s}ch {track.sampling_rate}Hz")
Install it with pip install pymediainfo. Note that pymediainfo requires libmediainfo to be installed on the system (the apt install mediainfo or brew install media-info step handles this).
Node.js Integration
For JavaScript pipelines, the mediainfo.js package provides a WebAssembly build of libmediainfo that runs without native dependencies:
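import MediaInfoFactory from 'mediainfo.js';
const mediainfo = await MediaInfoFactory();
// analyzeData reads the file through two callbacks you supply for your file source:
// getSize() returns the size in bytes; readChunk(chunkSize, offset) returns a Uint8Array.
const result = await mediainfo.analyzeData(getSize, readChunk);
console.log(result);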
This is useful for browser-based tools or serverless functions where installing native binaries is impractical.
MediaInfo vs ffprobe
Both tools extract metadata from media files, but they approach the problem differently and each has strengths the other lacks.
MediaInfo reads container-level metadata thoroughly. It parses atoms (MP4), elements (MKV), and headers across a wider range of container formats, including broadcast formats like MXF, GXF, and LXF that ffprobe handles less completely. MediaInfo also reads tag metadata (title, artist, copyright, ISRC codes) more reliably across containers. Its --Inform template system gives you fine-grained control over output formatting.
ffprobe is part of the FFmpeg suite, so it is already installed anywhere FFmpeg is present. It excels at stream-level analysis: precise packet timing, individual frame properties, and per-GOP statistics. It can be faster on large files because it reads only the sections it needs, though the difference depends on the container and on the options each tool is run with. For scripted pipelines that already use FFmpeg for transcoding, ffprobe keeps you in one ecosystem.
When to Use Which
- Container compliance checks (MXF AS-07, Matroska profiles): MediaInfo
- Tag metadata reading (title, copyright, embedded artwork): MediaInfo
- Frame-accurate packet analysis: ffprobe
- Integration with FFmpeg transcoding pipelines: ffprobe
- Archival QC workflows: MediaInfo (it generates PBCore and EBUCore XML natively)
- Quick codec/bitrate checks: either works, pick whichever is already installed
In practice, many professional workflows use both. MediaInfo handles the initial QC pass and cataloging, while ffprobe drives the transcoding decisions and per-frame analysis.
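When both tools feed the same pipeline, their reports need a common shape before they can be compared. The sketch below assumes MediaInfo JSON track fields (Width, Height, FrameRate as strings) and ffprobe stream fields as produced by ffprobe -print_format json -show_streams (width, height, and r_frame_rate as a fraction string). The sample dicts are hand-written, and the function names are hypothetical.

```python
from fractions import Fraction


def from_mediainfo(track):
    """Normalize a MediaInfo JSON video track."""
    return {
        "width": int(track["Width"]),
        "height": int(track["Height"]),
        "fps": round(float(track["FrameRate"]), 3),
    }


def from_ffprobe(stream):
    """Normalize an ffprobe JSON video stream; r_frame_rate looks like '30000/1001'."""
    return {
        "width": int(stream["width"]),
        "height": int(stream["height"]),
        "fps": round(float(Fraction(stream["r_frame_rate"])), 3),
    }


mi = from_mediainfo({"Width": "1920", "Height": "1080", "FrameRate": "29.970"})
ff = from_ffprobe({"width": 1920, "height": 1080, "r_frame_rate": "30000/1001"})
print(mi == ff)
```

Codec names are left out of the comparison on purpose, since the two tools label the same codec differently (for example AVC versus h264).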
Organizing Extracted Metadata at Scale
Once you are extracting metadata from hundreds or thousands of files, the next challenge is storing, searching, and acting on that data. JSON exports from MediaInfo give you structured records, but you still need somewhere to put them.
Local databases work well for single-workstation setups. Pipe MediaInfo JSON into SQLite or PostgreSQL and query across your entire catalog. For team workflows where multiple people need to access the same metadata, a shared platform saves the step of syncing database exports.
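As a starting point for the local-database route, the sketch below loads one report into an in-memory SQLite table and queries it. The table layout, the load_report helper, and the sample report are all illustrative assumptions. A real catalog would likely use a file-backed database and more columns.

```python
import json
import sqlite3


def load_report(conn, report):
    """Insert one row per track, keeping the full track as JSON for later queries."""
    media = report["media"]
    for track in media["track"]:
        conn.execute(
            "INSERT INTO tracks (file, type, format, raw) VALUES (?, ?, ?, ?)",
            (media.get("@ref"), track.get("@type"), track.get("Format"), json.dumps(track)),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (file TEXT, type TEXT, format TEXT, raw TEXT)")

# Hand-written report in the media.track[] shape (illustrative values)
load_report(conn, {"media": {"@ref": "input.mp4", "track": [
    {"@type": "General", "Format": "MPEG-4"},
    {"@type": "Video", "Format": "AVC", "Width": "1920"},
]}})

rows = conn.execute("SELECT file, format FROM tracks WHERE type = 'Video'").fetchall()
print(rows)
```

Keeping the raw track JSON alongside a few indexed columns means new fields can be queried later without reloading the source files.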
Fast.io's Metadata Views takes a different approach to the problem. Instead of writing extraction scripts, you describe the fields you want in natural language, and the AI designs a typed schema (Text, Integer, Decimal, Boolean, Date, and more). It then matches files in your workspace and populates a sortable, filterable spreadsheet. This works with PDFs, images, documents, and presentations, so if your pipeline handles mixed media alongside video, you can extract structured data from both using one tool.
For video-specific metadata, MediaInfo remains the right tool for the job since it understands codec details, stream properties, and container structures that general-purpose extractors miss. A practical workflow combines both: use MediaInfo for deep technical metadata on your media files, and use Metadata Views for everything else in the workspace. The video metadata catalog feeds your transcoding and QC pipeline, while the document metadata feeds your project tracking and compliance workflows.
Fast.io workspaces also support Intelligence Mode, which auto-indexes uploaded files for semantic search and citation-backed AI chat. Upload your MediaInfo JSON reports alongside the source files, and the entire collection becomes searchable by meaning, not just filename.
Frequently Asked Questions
How do I use MediaInfo from the command line?
Install MediaInfo via your system's package manager (brew install media-info on macOS, apt install mediainfo on Debian/Ubuntu, or choco install mediainfo-cli on Windows). Then run 'mediainfo input.mp4' to see a full metadata report. Use the --Inform parameter for custom output, --Output=JSON for structured data, and --Full for every available field.
What is the difference between MediaInfo and ffprobe?
MediaInfo excels at container-level metadata, tag data, and broadcast format support (MXF, GXF). It also generates PBCore and EBUCore XML for archival workflows. ffprobe is better at frame-level packet analysis and works alongside FFmpeg transcoding pipelines. Both are free and open source. Many professional workflows use them together.
How do I export MediaInfo output as JSON?
Run 'mediainfo --Output=JSON input.mp4' to get a complete JSON representation of all streams and metadata. You can pipe the output to jq for field extraction or redirect it to a file with '> output.json'. MediaInfo also supports XML (--Output=XML), HTML (--Output=HTML), and custom template formats.
Is MediaInfo free to use?
Yes. MediaInfo is open-source software released under the BSD-2-Clause license. You can use it for personal and commercial purposes, provided you retain the copyright notice and license text when you redistribute it. The GUI and CLI versions are both free, and the underlying libmediainfo library can be embedded in your own applications.
Can MediaInfo read metadata from MXF broadcast files?
Yes. MXF is one of MediaInfo's strongest format areas. It reads operational patterns, essence descriptors, timecode tracks, AS-07 metadata, and C2PA manifests (added in version 26.01). This is why many broadcast QC workflows rely on MediaInfo rather than general-purpose tools.
How do I batch process multiple files with MediaInfo?
Pass multiple filenames or a glob pattern directly: 'mediainfo --Output=JSON *.mp4 > report.json'. For recursive directory processing, combine with find: 'find /path -name "*.mxf" -exec mediainfo --Output=JSON {} \;'. You can also use the pymediainfo Python library to loop over files programmatically.