diffoscope - in-depth comparison of files, archives, and directories
diffoscope --help
diffoscope [OPTIONS] [--json output_diff] path1 path2
diffoscope [OPTIONS] diff
diffoscope [OPTIONS] < diff
diffoscope will try to get to the bottom of what makes files or directories
different. It will recursively unpack archives of many kinds and transform
various binary formats into more human-readable form to compare them. It can
compare two tarballs, ISO images, or PDFs just as easily.
It can be scripted through error codes, and a report can be produced with the
detected differences. The report can be text or HTML. When no type of report
has been selected, diffoscope defaults to writing a text report to standard
output.
diffoscope was initially started by the "reproducible builds" Debian
project and is now being developed as part of the (wider) "Reproducible
Builds" initiative. It is meant to help you quickly understand why
two builds of the same package produce different outputs. diffoscope was
previously named debbindiff.
See the COMMAND-LINE EXAMPLES section further below to get you started,
as well as more detailed explanations of all the command-line options. The
same information is also available in
/usr/share/doc/diffoscope/README.rst or similar.
- path1
- First file or directory to compare.
- path2
- Second file or directory to compare.
- --debug
- Display debug messages
- --pdb
- Open the Python pdb debugger in case of crashes
- --status-fd FD
- Send machine-readable status to file descriptor FD
- --progress, --no-progress
- Show an approximate progress bar. Default: yes if stdin is
a tty, otherwise no.
- --no-default-limits
- Disable most default output limits and diff calculation
limits.
- --load-existing-diff INPUT_FILE
- Load existing diff from file. Specify "-" to read
a diffoscope diff from stdin.
- --text OUTPUT_FILE
- Write plain text output to given file (use - for
stdout)
- --text-color WHEN
- When to output color diff. WHEN is one of {never, auto,
always}. Default: auto, meaning yes if the output is a terminal, otherwise
no.
- --output-empty
- If there was no difference, then output an empty diff for
each output type that was specified. In --text output, an empty
file is written.
- --html OUTPUT_FILE
- Write HTML report to given file (use - for stdout)
- --html-dir OUTPUT_DIR
- Write multi-file HTML report to given directory
- --css URL
- Link to an extra CSS for the HTML report
- --jquery URL
- URL link to jQuery, for --html and --html-dir
output. If this is a non-existent relative URL, diffoscope will create a
symlink to a system installation. (Paths searched:
/usr/share/javascript/jquery/jquery.js.) If not given,
--html output will not use JS but --html-dir will if it can
be found; give "disable" to disable JS on all outputs.
- --json OUTPUT_FILE
- Write JSON text output to given file (use - for
stdout)
- --markdown OUTPUT_FILE
- Write Markdown text output to given file (use - for
stdout)
- --restructured-text OUTPUT_FILE
- Write reStructuredText output to given file (use - for stdout)
- --difftool TOOL
- Compare differences one-by-one using the specified external
command similar to git-difftool(1)
- --profile [OUTPUT_FILE]
- Write profiling info to given file (use - for stdout)
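As a rough sketch of consuming a --json report from a script: the snippet below assumes the report is a tree of objects carrying source1, source2, an optional unified_diff, and a details list of child objects. Those key names are an assumption about the JSON layout and should be checked against a report produced by your diffoscope version; iter_differences and the sample data are made up for illustration.

```python
import json


def iter_differences(node, depth=0):
    """Yield (depth, source1, source2, has_diff) for every node in the tree.

    Assumes each node is a dict with optional 'unified_diff' and 'details'
    keys; this layout is an assumption, not a documented contract.
    """
    has_diff = bool(node.get("unified_diff"))
    yield depth, node.get("source1"), node.get("source2"), has_diff
    for child in node.get("details") or []:
        yield from iter_differences(child, depth + 1)


# Hand-written sample standing in for a real report file.
sample = json.loads("""
{
  "source1": "a.tar", "source2": "b.tar",
  "details": [
    {"source1": "file list", "source2": "file list",
     "unified_diff": "@@ -1 +1 @@\\n-x\\n+y\\n"},
    {"source1": "etc/motd", "source2": "etc/motd", "details": []}
  ]
}
""")

rows = list(iter_differences(sample))
for depth, s1, s2, has_diff in rows:
    # Indent by depth so nesting in the report stays visible.
    print("  " * depth + f"{s1} vs {s2}" + (" [diff]" if has_diff else ""))
```

For a real report, replace the sample with json.load() on the file written by --json.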
- --diff-context LINES
- Lines of unified diff context to show. (default: 7)
- --max-text-report-size BYTES
- Maximum bytes written in --text report. (0 to
disable, default: 0)
- --max-report-size BYTES
- Maximum bytes of a report in a given format, across all of
its pages. Note that some formats, such as --html, may be
restricted by even smaller limits such as --max-page-size. (0 to
disable, default: 41943040)
- --max-diff-block-lines LINES
- Maximum number of lines output per unified-diff block,
across all pages. (0 to disable, default: 1024)
- --max-page-size BYTES
- Maximum bytes of the top-level (--html-dir) or sole
(--html) page. (default: 41943040, remains in effect even with
--no-default-limits)
- --max-page-diff-block-lines LINES
- Maximum number of lines output per unified-diff block on
the top-level (--html-dir) or sole (--html) page, before
spilling it into a child page (--html-dir) or skipping the rest of
the diff block. (default: 128, remains in effect even with
--no-default-limits)
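The byte limits above are plain integers; the default of 41943040 works out to 40 MiB (40 x 1024 x 1024). The sketch below shows that arithmetic and how a script might assemble limit options. The helper names mib and limit_args are hypothetical and exist only for this illustration.

```python
def mib(n):
    """Convert mebibytes to bytes."""
    return n * 1024 * 1024


def limit_args(report_mib=None, block_lines=None):
    """Build option strings for --max-report-size and --max-diff-block-lines.

    Options left as None are omitted so diffoscope's defaults apply.
    """
    args = []
    if report_mib is not None:
        args += ["--max-report-size", str(mib(report_mib))]
    if block_lines is not None:
        args += ["--max-diff-block-lines", str(block_lines)]
    return args


print(mib(40))
print(limit_args(report_mib=100, block_lines=2048))
```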
- --new-file
- Treat absent files as empty
- --exclude GLOB_PATTERN
- Exclude files whose names (including any directory part)
match GLOB_PATTERN. Use this option to ignore files based on their
names.
- --exclude-command REGEX_PATTERN
- Exclude commands that match REGEX_PATTERN. For example,
'^readelf.*\s--debug-dump=info' and '^radare2.*' can take a long time, and
differences here are likely secondary differences caused by something
represented elsewhere. Use this option to disable commands that use a lot
of resources.
- --exclude-directory-metadata {auto,yes,no,recursive}
- Exclude directory metadata. Useful if comparing files whose
filesystem-level metadata is not intended to be distributed to other
systems. This is true for most distributions' package builders, but not
true for the output of commands such as `make install`. Metadata of
archive members remains un-excluded unless the "recursive" choice
is set. Use this option to ignore permissions, timestamps, xattrs etc.
Default: 'no' if comparing two directories, else 'yes'. Note that
"file" metadata is actually a property of its containing
directory and is not relevant when distributing the file across
systems.
- --extended-filesystem-attributes, --no-extended-filesystem-attributes
- Check potentially-expensive filesystem extended attributes
such as POSIX ACLs, lsattr(1)/chattr(1) attributes etc. (default:
False)
- --diff-mask REGEX_PATTERN
- Replace/unify substrings that match regular expression
REGEX_PATTERN from output strings before applying diff. For example, to
filter out a version number or changed path.
- --fuzzy-threshold FUZZY_THRESHOLD
- Threshold for fuzzy-matching (0 to disable, 110 is default,
400 is high fuzziness)
- --tool-prefix-binutils PREFIX
- Prefix for binutils program names, e.g.
"aarch64-linux-gnu-" for a foreign-arch binary or "g"
if you're on a non-GNU system.
- --max-diff-input-lines LINES
- Maximum number of lines fed to diff(1) (0 to disable,
default: 4194304)
- --max-container-depth DEPTH
- Maximum depth to recurse into containers. (Cannot be
disabled for security reasons, default: 50)
- --timeout SECONDS
- Best-effort attempt at a global timeout in seconds. If
enabled, diffoscope will not recurse into any further sub-archives after
SECONDS seconds of total execution time. (default: no timeout) [experimental]
- --max-diff-block-lines-saved LINES
- Maximum number of lines saved per diff block. Most users
should not need this, unless you run out of memory. This truncates diff(1)
output before emitting it in a report, and affects all types of output,
including --text and --json. (0 to disable, default: 0)
- --use-dbgsym WHEN
- When to automatically use corresponding -dbgsym
packages when comparing .deb files. WHEN is one of {no, auto, yes}.
Default: auto, meaning yes if two .changes or .buildinfo files are
specified, otherwise no.
- --force-details
- Force recursing into the depths of file formats even if
files have the same content, only really useful for debugging diffoscope.
Default: False
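Before passing patterns to --exclude, --exclude-command, or --diff-mask, it can help to try them on sample strings. The sketch below approximates the three options with Python's fnmatch and re modules. It is not diffoscope's own matching code, and details such as anchoring, case sensitivity, and the replacement text may differ. The function names, the sample strings, and the "[masked]" placeholder are made up for illustration.

```python
import fnmatch
import re


def is_excluded(path, globs):
    """True if the path matches any glob (approximating --exclude)."""
    return any(fnmatch.fnmatchcase(path, g) for g in globs)


def is_command_excluded(cmdline, patterns):
    """True if the command line matches any regex (approximating --exclude-command)."""
    return any(re.search(p, cmdline) is not None for p in patterns)


def mask(text, patterns, placeholder="[masked]"):
    """Replace every regex match with a placeholder (approximating --diff-mask).

    The placeholder string is invented for this sketch.
    """
    for p in patterns:
        text = re.sub(p, placeholder, text)
    return text


# Glob: '*' in fnmatch also matches '/' characters, so directory parts are covered.
print(is_excluded("usr/share/doc/foo/changelog.gz", ["*/changelog.gz"]))

# Regex taken from the --exclude-command description above.
print(is_command_excluded("readelf --wide --debug-dump=info libfoo.so",
                          [r"^readelf.*\s--debug-dump=info"]))

# Mask a build path that would otherwise differ between two builds.
print(mask("built in /tmp/build-1234/src", [r"/tmp/build-\d+"]))
```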
- --help, -h
- Show this help and exit
- --version
- Show program's version number and exit
- --list-tools [DISTRO]
- Show external tools required and exit. DISTRO can be one of
{arch, debian, FreeBSD, guix}. If specified, the output will list packages
in that distribution that satisfy these dependencies.
- --list-debian-substvars
- List packages needed for Debian in 'substvar' format.
- --list-missing-tools [DISTRO]
- Show missing external tools and exit. DISTRO can be one of
{arch, debian, FreeBSD, guix}. If specified, the output will list packages
in that distribution that satisfy these dependencies.
- Android APK files, Android boot images, Android
package resource table (ARSC), Apple Xcode mobile
provisioning files, ar(1) archives, ASM Function, Berkeley DB database
files, bzip2 archives, character/block devices, ColorSync colour profiles
(.icc), Coreboot CBFS filesystem images, cpio archives, Dalvik .dex files,
Debian .buildinfo files, Debian .changes files, Debian source packages
(.dsc), Device Tree Compiler blob files, directories, ELF binaries,
ext2/ext3/ext4/btrfs/fat filesystems, Flattened Image Tree blob files,
FreeDesktop Fontconfig cache files, FreePascal files (.ppu), Gettext
message catalogues, GHC Haskell .hi files, GIF image files, Git
repositories, GNU R database files (.rdb), GNU R Rscript files (.rds),
Gnumeric spreadsheets, GPG keybox databases, Gzipped files, Hierarchical
Data Format database, HTML files (.html), ISO 9660 CD images, Java .class
files, Java .jmod modules, JavaScript files, JPEG images, JSON files,
Linux kernel images, LLVM IR bitcode files, local (UNIX domain) sockets
and named pipes (FIFOs), LZ4 compressed files, lzip compressed files,
macOS binaries, Microsoft Windows icon files, Microsoft Word .docx files,
Mono 'Portable Executable' files, Mozilla-optimized .ZIP archives,
Multimedia metadata, OCaml interface files, Ogg Vorbis audio files,
OpenOffice .odt files, OpenSSH public keys, OpenWRT package archives
(.ipk), PDF documents, PE32 files, PGP signatures, PGP signed/encrypted
messages, PNG images, PostScript documents, Public Key Cryptography
Standards (PKCS) files (version #7), Python .pyc files, RPM archives, Rust
object files (.deflate), Sphinx inventory files, SQLite databases,
SquashFS filesystems, symlinks, tape archives (.tar), tcpdump capture
files (.pcap), text files, TrueType font files, U-Boot legacy image files,
WebAssembly binary module, XML binary schemas (.xsb), XML files, XMLB
files, XZ compressed files, ZIP archives and Zstandard compressed
files.
- <https://diffoscope.org/>
- <https://salsa.debian.org/reproducible-builds/diffoscope/issues>
Exit status is 0 if inputs are the same, 1 if different, 2 if trouble.
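A minimal sketch of interpreting these exit statuses in a script. The labels and the describe_exit name are illustrative only; the three status values come from the line above.

```python
# Status values as documented: 0 = same, 1 = different, 2 = trouble.
EXIT_MEANINGS = {0: "identical", 1: "different", 2: "trouble"}


def describe_exit(returncode):
    """Map a diffoscope exit status to a short label."""
    return EXIT_MEANINGS.get(returncode, f"unexpected status {returncode}")


for code in (0, 1, 2):
    print(code, describe_exit(code))
```

Because status 1 means "different" rather than "failed", wrappers that treat every non-zero status as an error need an explicit exception for it.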
To compare two files in-depth and produce an HTML report, run something like:
$ diffoscope --html output.html build1.changes build2.changes
diffoscope will exit with 0 if there are no differences and 1 if there are.
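The same invocation can be driven from Python. This is a sketch under the assumption that diffoscope is on PATH; html_report_command and inputs_differ are hypothetical helper names, and only the command line itself comes from the example above.

```python
import subprocess


def html_report_command(path1, path2, output="output.html"):
    """Return the argv for an HTML report, mirroring the example above."""
    return ["diffoscope", "--html", output, path1, path2]


def inputs_differ(path1, path2, output="output.html"):
    """Run diffoscope and return True if differences were found.

    Requires diffoscope on PATH. Any status other than 0 or 1 is
    treated as a failure and raised.
    """
    result = subprocess.run(html_report_command(path1, path2, output))
    if result.returncode not in (0, 1):
        raise RuntimeError(f"diffoscope failed with status {result.returncode}")
    return result.returncode == 1


# Only the command line is printed here; nothing is executed.
print(html_report_command("build1.changes", "build2.changes"))
```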
To get all possible options, run:
$ diffoscope --help
If you have enough RAM, you can improve performance by running:
$ TMPDIR=/run/shm diffoscope very-big-input-0/ very-big-input-1/
By default, this tmpfs is allowed to use up to half of RAM; for more, add
something like:
tmpfs /run/shm tmpfs size=80% 0 0
to your /etc/fstab; see man mount for details.
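When diffoscope is launched from a script, the TMPDIR setting can be passed through the child environment instead of the shell. A small sketch follows; env_with_tmpdir is a hypothetical helper, and /run/shm is simply the path used in the example above.

```python
import os


def env_with_tmpdir(tmpdir, base_env=None):
    """Return a copy of the environment with TMPDIR pointing at tmpdir.

    The caller's environment is never modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TMPDIR"] = tmpdir
    return env


env = env_with_tmpdir("/run/shm", base_env={"PATH": "/usr/bin"})
print(env)
```

The resulting mapping can be given as the env argument of subprocess.run when starting diffoscope.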
diffoscope requires Python 3 and the following modules available on PyPI:
libarchive-c,
python-magic.
The various comparators rely on external commands being available. To get a list
of them, please run:
$ diffoscope --list-tools
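For a quick pre-flight check from a script, a few tool names can be looked up on PATH directly. This is only a sketch: the authoritative list comes from --list-tools, and --list-missing-tools already reports what is absent. The missing_tools helper is hypothetical, and the three candidate names are ones mentioned elsewhere on this page.

```python
import shutil


def missing_tools(names, which=shutil.which):
    """Return the subset of names that cannot be found on PATH.

    The lookup function is injectable so the logic can be tested
    without depending on what is installed locally.
    """
    return [name for name in names if which(name) is None]


# Tool names taken from this page, not from diffoscope's full list.
candidates = ["diff", "readelf", "radare2"]
print(missing_tools(candidates))
```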
Lunar, Reiner Herrmann, Chris Lamb, Mattia Rizzolo, Ximin Luo, Helmut Grohne,
Holger Levsen, Daniel Kahn Gillmor, Paul Gevers, Peter De Wachter, Yasushi
SHOJI, Clemens Lang, Ed Maste, Joachim Breitner, Mike McQuaid, Baptiste
Daroussin, Levente Polyak.
The preferred way to report bugs about diffoscope, as well as to suggest
fixes and request improvements, is to submit reports to the issue tracker
at:
<https://salsa.debian.org/reproducible-builds/diffoscope/issues>
For more instructions, see CONTRIBUTING.rst in this directory.
Join the users and developers mailing-list:
<https://lists.reproducible-builds.org/listinfo/diffoscope>
The diffoscope website is at <https://diffoscope.org/>
diffoscope is free software: you can redistribute it and/or modify it under the
terms of the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option) any later
version.
diffoscope is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with
diffoscope. If not, see <https://www.gnu.org/licenses/>.