borg - deduplicating and encrypting backup tool
borg [common options] <command> [options] [arguments]
BorgBackup (short: Borg) is a deduplicating backup program. Optionally, it
supports compression and authenticated encryption.
The main goal of Borg is to provide an efficient and secure way to back data up.
The data deduplication technique used makes Borg suitable for daily backups
since only changes are stored. The authenticated encryption technique makes it
suitable for backups to targets not fully trusted.
Borg stores a set of files in an archive. A repository is a collection of
archives. The format of repositories is Borg-specific.
Borg does not distinguish archives from each other in any way other than their
name; it does not matter when or where archives were created (e.g. on different
hosts).
1. Before a backup can be made, a repository has to be initialized:
$ borg -r /path/to/repo rcreate --encryption=repokey-aes-ocb
2. Back up the ~/src and ~/Documents directories into an archive called Monday:
$ borg -r /path/to/repo create Monday ~/src ~/Documents
3. The next day, create a new archive called Tuesday:
$ borg -r /path/to/repo create --stats Tuesday ~/src ~/Documents
This backup will be a lot quicker and a lot smaller since only new, never before
seen data is stored. The --stats option causes Borg to output statistics about
the newly created archive such as the deduplicated size (the amount of unique
data not shared with other archives):
Repository: /path/to/repo
Archive name: Tuesday
Archive fingerprint: bcd1b53f9b4991b7afc2b339f851b7ffe3c6d030688936fe4552eccc1877718d
Time (start): Sat, 2022-06-25 20:21:43
Time (end): Sat, 2022-06-25 20:21:43
Duration: 0.07 seconds
Utilization of max. archive size: 0%
Number of files: 699
Original size: 31.14 MB
Deduplicated size: 502 B
4. List all archives in the repository:
$ borg -r /path/to/repo rlist
Monday Sat, 2022-06-25 20:21:14 [b80e24d2...b179f298]
Tuesday Sat, 2022-06-25 20:21:43 [bcd1b53f...1877718d]
5. List the contents of the Monday archive:
$ borg -r /path/to/repo list Monday
drwxr-xr-x user group 0 Mon, 2016-02-15 18:22:30 home/user/Documents
-rw-r--r-- user group 7961 Mon, 2016-02-15 18:22:30 home/user/Documents/Important.doc
...
6. Restore the Monday archive by extracting the files relative to the current
directory:
$ borg -r /path/to/repo extract Monday
7. Delete the Monday archive (please note that this does not free repo disk
space):
$ borg -r /path/to/repo delete -a Monday
Please note the -a option here (short for --glob-archives), which enables you to
give a globbing pattern to delete multiple archives, like -a 'oldcrap-*'. You can
also combine this with --first, --last and --sort-by. Be careful: always try it
with --dry-run and --list first!
8. Recover disk space by compacting the segment files in the repo:
$ borg -r /path/to/repo compact
NOTE:
Borg is quiet by default (it defaults to
WARNING log level). You can use options like --progress or
--list to get specific reports during command execution. You can also
add the -v (or --verbose or --info) option to adjust the
log level to INFO to get other informational messages.
Borg only supports taking options (-s and --progress in the example) to the left
or right of all positional arguments (archive and path in the example), but not
in between them:
borg create -s --progress archive path # good and preferred
borg create archive path -s --progress # also works
borg create -s archive path --progress # works, but ugly
borg create archive -s --progress path # BAD
This is due to a problem in the argparse module:
https://bugs.python.org/issue15112
Local filesystem (or locally mounted network filesystem):
/path/to/repo - filesystem path to repo directory, absolute path
path/to/repo - filesystem path to repo directory, relative path
Also, stuff like ~/path/to/repo or ~other/path/to/repo works (this is expanded
by your shell).
Note: you may also prepend a file:// to a filesystem path to get URL style.
Remote repositories accessed via ssh user@host:
ssh://user@host:port/path/to/repo - absolute path
ssh://user@host:port/./path/to/repo - path relative to current directory
ssh://user@host:port/~/path/to/repo - path relative to user's home directory
If you frequently need the same repo URL, it is a good idea to set the
BORG_REPO environment variable to set a default for the repo URL:
export BORG_REPO='ssh://user@host:port/path/to/repo'
Then simply omit the --repo option if you want to use the default; it will be
read from BORG_REPO.
Many commands need to know the repository location; give it via -r / --repo or
use the BORG_REPO environment variable.
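The lookup order described here can be sketched in a few lines of shell. This is only an illustration, not Borg's own code: resolve_repo is a hypothetical helper, and it assumes that an explicitly given location takes priority over the BORG_REPO default.

```shell
# resolve_repo is a hypothetical helper, not part of Borg.
# Assumption: an explicit -r/--repo value wins, otherwise BORG_REPO
# supplies the default repository location.
resolve_repo() {
    explicit_repo="$1"
    if [ -n "$explicit_repo" ]; then
        printf '%s\n' "$explicit_repo"
    elif [ -n "${BORG_REPO:-}" ]; then
        printf '%s\n' "$BORG_REPO"
    else
        echo "no repository given: use -r/--repo or set BORG_REPO" >&2
        return 1
    fi
}

BORG_REPO='ssh://user@host:port/path/to/repo'
export BORG_REPO
resolve_repo ""               # prints the BORG_REPO default
resolve_repo /path/to/repo    # prints the explicit location
```

A wrapper script could use such a helper to fail early with a clear message when neither source provides a repository.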
Commands needing one or two archive names usually get them as positional
arguments.
Commands working with an arbitrary number of archives usually take -a ARCH_GLOB.
Archive names must not contain the / (slash) character. For simplicity, also
avoid blanks and other characters that have special meaning on the shell or in a
filesystem (borg mount will use the archive name as a directory name).
Borg writes all log output to stderr by default. But please note that something
showing up on stderr does not indicate an error condition just because it is on
stderr. Please check the log levels of the messages and the return code of borg
to determine error, warning or success conditions.
If you want to capture the log output to a file, just redirect it:
borg -r /path/to/repo create archive myfiles 2>> logfile
Custom logging configurations can be implemented via BORG_LOGGING_CONF.
The log level of the builtin logging configuration defaults to WARNING. This is
because we want Borg to be mostly silent and only output warnings, errors and
critical messages, unless output has been requested by supplying an option
that implies output (e.g. --list or --progress).
Log levels: DEBUG < INFO < WARNING < ERROR < CRITICAL
Use --debug to set DEBUG log level - to get debug, info, warning, error and
critical level output.
Use --info (or -v or --verbose) to set INFO log level - to get info, warning,
error and critical level output.
Use --warning (default) to set WARNING log level - to get warning, error and
critical level output.
Use --error to set ERROR log level - to get error and critical level output.
Use --critical to set CRITICAL log level - to get critical level output.
While you can set misc. log levels, do not expect that every command will give
different output on different log levels - it's just a possibility.
WARNING:
Options --critical and --error are provided for completeness; their usage is not
recommended as you might miss important information.
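The ordering of the log levels can be illustrated with a small shell sketch. The helpers level_rank and is_logged are hypothetical and not part of Borg, and the numeric ranks are arbitrary values chosen only to reflect DEBUG < INFO < WARNING < ERROR < CRITICAL.

```shell
# level_rank and is_logged are hypothetical helpers, not part of Borg.
# The numbers are arbitrary; only their order matters:
# DEBUG < INFO < WARNING < ERROR < CRITICAL
level_rank() {
    case "$1" in
        DEBUG)    echo 10 ;;
        INFO)     echo 20 ;;
        WARNING)  echo 30 ;;
        ERROR)    echo 40 ;;
        CRITICAL) echo 50 ;;
        *)        echo "unknown level: $1" >&2; return 1 ;;
    esac
}

# is_logged THRESHOLD MESSAGE_LEVEL
# Succeeds if a message of MESSAGE_LEVEL is shown when the log level
# is set to THRESHOLD.
is_logged() {
    threshold=$(level_rank "$1") || return 2
    message=$(level_rank "$2") || return 2
    [ "$message" -ge "$threshold" ]
}

# Show what the default WARNING level lets through.
for level in DEBUG INFO WARNING ERROR CRITICAL; do
    if is_logged WARNING "$level"; then
        echo "$level: shown at the default WARNING level"
    else
        echo "$level: hidden at the default WARNING level"
    fi
done
```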
Borg can exit with the following return codes (rc):
Return code   Meaning
0             success (logged as INFO)
1             warning (operation reached its normal end, but there were
              warnings; you should check the log, logged as WARNING)
2             error (like a fatal error, a local or remote exception, the
              operation did not reach its normal end, logged as ERROR)
128+N         killed by signal N (e.g. 137 == kill -9)
If you use --show-rc, the return code is also logged at the indicated level as
the last log entry.
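A script that wraps Borg could translate the return codes listed above into messages. The following is a minimal sketch; describe_borg_rc is a hypothetical helper, not something Borg provides.

```shell
# describe_borg_rc is a hypothetical helper, not part of Borg.
# It maps a return code to the meaning given in the table above.
describe_borg_rc() {
    rc="$1"
    case "$rc" in
        ''|*[!0-9]*) echo "not a return code: $rc" >&2; return 1 ;;
        0) echo "success" ;;
        1) echo "warning" ;;
        2) echo "error" ;;
        *)
            if [ "$rc" -gt 128 ]; then
                # 128+N means the process was killed by signal N
                echo "killed by signal $((rc - 128))"
            else
                echo "unknown return code $rc"
            fi
            ;;
    esac
}

describe_borg_rc 0      # prints "success"
describe_borg_rc 137    # prints "killed by signal 9"
```

In a backup script, the helper could be called as describe_borg_rc "$?" directly after the borg command.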
Borg uses some environment variables for automation:
- General:
- BORG_REPO
- When set, use the value to give the default repository
location. Use this so you do not need to type --repo
/path/to/my/repo all the time.
- BORG_OTHER_REPO
- Similar to BORG_REPO, but gives the default for
--other-repo.
- BORG_PASSPHRASE
- When set, use the value to answer the passphrase question
for encrypted repositories. It is used when a passphrase is needed to
access an encrypted repo as well as when a new passphrase should be
initially set when initializing an encrypted repo. See also
BORG_NEW_PASSPHRASE.
- BORG_PASSCOMMAND
- When set, use the standard output of the command (trailing
newlines are stripped) to answer the passphrase question for encrypted
repositories. It is used when a passphrase is needed to access an
encrypted repo as well as when a new passphrase should be initially set
when initializing an encrypted repo. Note that the command is executed
without a shell. So variables, like $HOME will work, but ~
won't. If BORG_PASSPHRASE is also set, it takes precedence. See also
BORG_NEW_PASSPHRASE.
- BORG_PASSPHRASE_FD
- When set, specifies a file descriptor to read a passphrase
from. Programs starting borg may choose to open an anonymous pipe and use
it to pass a passphrase. This is safer than passing via BORG_PASSPHRASE,
because on some systems (e.g. Linux) environment can be examined by other
processes. If BORG_PASSPHRASE or BORG_PASSCOMMAND are also set, they take
precedence.
- BORG_NEW_PASSPHRASE
- When set, use the value to answer the passphrase question
when a new passphrase is asked for. This variable is checked first.
If it is not set, BORG_PASSPHRASE and BORG_PASSCOMMAND will also be
checked. The main use case for this is to fully automate borg
change-passphrase.
- BORG_DISPLAY_PASSPHRASE
- When set, use the value to answer the "display the
passphrase for verification" question when defining a new passphrase
for encrypted repositories.
- BORG_HOST_ID
- Borg usually computes a host id from the FQDN plus the
results of uuid.getnode(), which usually returns a unique id based
on the MAC address of the network interface. If that MAC happens to
be all-zero, however, it returns a random value, which is not what we
want (because it kills automatic stale lock removal). So, if you have an
all-zero MAC address or other reasons to control the host id
externally, just set this environment variable to a unique value. If all your
FQDNs are unique, you can just use the FQDN. If not, use
fqdn@uniqueid.
- BORG_LOCK_WAIT
- You can set the default value for the --lock-wait
option with this, so you do not need to give it as a commandline
option.
- BORG_LOGGING_CONF
- When set, use the given filename as INI-style
logging configuration. A basic example conf can be found at
docs/misc/logging.conf.
- BORG_RSH
- When set, use this command instead of ssh. This can
be used to specify ssh options, such as a custom identity file ssh -i
/path/to/private/key. See man ssh for other options. Using the
--rsh CMD commandline option overrides the environment
variable.
- BORG_REMOTE_PATH
- When set, use the given path as borg executable on the
remote (defaults to "borg" if unset). Using --remote-path
PATH commandline option overrides the environment variable.
- BORG_FILES_CACHE_SUFFIX
- When set to a value at least one character long, instructs
borg to use a specifically named (based on the suffix) alternative files
cache. This can be used to avoid loading and saving cache entries for
backup sources other than the current sources.
- BORG_FILES_CACHE_TTL
- When set to a numeric value, this determines the maximum
"time to live" for the files cache entries (default: 20). The
files cache is used to determine quickly whether a file is unchanged. The
FAQ explains this in more detail, see: always_chunking
- BORG_SHOW_SYSINFO
- When set to no (default: yes), system information (like OS,
Python version, ...) in exceptions is not shown. Please only use for good
reasons as it makes issues harder to analyze.
- BORG_FUSE_IMPL
- Choose the lowlevel FUSE implementation borg shall use for
borg mount. This is a comma-separated list of implementation names,
they are tried in the given order, e.g.:
• pyfuse3,llfuse: default, first try to load pyfuse3, then try to load llfuse.
• llfuse,pyfuse3: first try to load llfuse, then try to load pyfuse3.
• pyfuse3: only try to load pyfuse3
• llfuse: only try to load llfuse
• none: do not try to load an implementation
- BORG_SELFTEST
- This can be used to influence borg's builtin self-tests.
The default is to execute the tests at the beginning of each borg command
invocation.
BORG_SELFTEST=disabled can be used to switch off the tests and rather save
some time. Disabling is not recommended for normal borg users, but large
scale borg storage providers can use this to optimize production servers
after at least doing a one-time test borg (with selftests not disabled)
when installing or upgrading machines / OS / borg.
- BORG_WORKAROUNDS
- A list of comma separated strings that trigger workarounds
in borg, e.g. to work around bugs in other software.
Currently known strings are:
- basesyncfile
- Use the simpler BaseSyncFile code to avoid issues with
sync_file_range. You might need this to run borg on WSL (Windows Subsystem
for Linux) or in systemd.nspawn containers on some architectures (e.g.
ARM). Using this does not affect data safety, but might result in a more
bursty write to disk behaviour (not continuously streaming to disk).
- retry_erofs
- Retry opening a file without O_NOATIME if opening a file
with O_NOATIME caused EROFS. You will need this to make archives from
volume shadow copies in WSL1 (Windows Subsystem for Linux 1).
- Some automatic "answerers" (if set, they
automatically answer confirmation questions):
- BORG_UNKNOWN_UNENCRYPTED_REPO_ACCESS_IS_OK=no (or
=yes)
- For "Warning: Attempting to access a previously
unknown unencrypted repository"
- BORG_RELOCATED_REPO_ACCESS_IS_OK=no (or =yes)
- For "Warning: The repository at location ... was
previously located at ..."
- BORG_CHECK_I_KNOW_WHAT_I_AM_DOING=NO (or =YES)
- For "This is a potentially dangerous function..."
(check --repair)
- BORG_DELETE_I_KNOW_WHAT_I_AM_DOING=NO (or =YES)
- For "You requested to DELETE the repository completely
including all archives it contains:"
Note: answers are case sensitive. Setting an invalid answer value might either
give the default answer or ask you interactively, depending on whether retries
are allowed (they are allowed by default). So please test your scripts
interactively before making them non-interactive.
- Directories and files:
- BORG_BASE_DIR
- Defaults to $HOME or ~$USER or ~ (in
that order). If you want to move all borg-specific folders to a custom
path at once, all you need to do is to modify BORG_BASE_DIR: the
other paths for cache, config etc. will adapt accordingly (assuming you
didn't set them to a different custom value).
- BORG_CACHE_DIR
- Defaults to $BORG_BASE_DIR/.cache/borg. If
BORG_BASE_DIR is not explicitly set while XDG env var
XDG_CACHE_HOME is set, then $XDG_CACHE_HOME/borg is being
used instead. This directory contains the local cache and might need a lot
of space for dealing with big repositories. Make sure you're aware of the
associated security aspects of the cache location:
cache_security
- BORG_CONFIG_DIR
- Defaults to $BORG_BASE_DIR/.config/borg. If
BORG_BASE_DIR is not explicitly set while XDG env var
XDG_CONFIG_HOME is set, then $XDG_CONFIG_HOME/borg is being
used instead. This directory contains all borg configuration directories,
see the FAQ for a security advisory about the data in this directory:
home_config_borg
- BORG_SECURITY_DIR
- Defaults to $BORG_CONFIG_DIR/security. This
directory contains information borg uses to track its usage of NONCES
("numbers used once" - usually in encryption context) and other
security relevant data.
- BORG_KEYS_DIR
- Defaults to $BORG_CONFIG_DIR/keys. This directory
contains keys for encrypted repositories.
- BORG_KEY_FILE
- When set, use the given filename as repository key
file.
- TMPDIR
- This is where temporary files are stored (might need a lot
of temporary space for some operations), see tempfile for
details.
- Building:
- BORG_OPENSSL_PREFIX
- Adds given OpenSSL header file directory to the default
locations (setup.py).
- BORG_LIBLZ4_PREFIX
- Adds given prefix directory to the default locations. If a
'include/lz4.h' is found Borg will be linked against the system liblz4
instead of a bundled implementation. (setup.py)
- BORG_LIBZSTD_PREFIX
- Adds given prefix directory to the default locations. If a
'include/zstd.h' is found Borg will be linked against the system libzstd
instead of a bundled implementation. (setup.py)
Please note:
• Be very careful when using the "yes" sayers, the warnings with prompt exist
  for your / your data's security/safety.
• Also be very careful when putting your passphrase into a script, make sure it
  has appropriate file permissions (e.g. mode 600, root:root).
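The default lookup described above for BORG_CACHE_DIR can be sketched as follows. This is a simplified illustration, not Borg's implementation: effective_cache_dir is a hypothetical helper, and it only falls back to $HOME for the base directory, ignoring the ~$USER and ~ fallbacks.

```shell
# effective_cache_dir is a hypothetical helper, not part of Borg.
# Simplifying assumption: the base directory falls back to $HOME only.
effective_cache_dir() {
    if [ -n "${BORG_CACHE_DIR:-}" ]; then
        # an explicitly set cache directory always wins
        printf '%s\n' "$BORG_CACHE_DIR"
    elif [ -n "${BORG_BASE_DIR:-}" ]; then
        # an explicit base directory moves all borg-specific folders
        printf '%s\n' "$BORG_BASE_DIR/.cache/borg"
    elif [ -n "${XDG_CACHE_HOME:-}" ]; then
        # without an explicit base directory, XDG_CACHE_HOME is honoured
        printf '%s\n' "$XDG_CACHE_HOME/borg"
    else
        printf '%s\n' "$HOME/.cache/borg"
    fi
}

effective_cache_dir    # prints the directory for the current environment
```

The same pattern would apply to BORG_CONFIG_DIR with XDG_CONFIG_HOME and .config/borg.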
We strongly recommend against using Borg (or any other database-like software)
on non-journaling file systems like FAT, since it is not possible to assume
any consistency in case of power failures (or a sudden disconnect of an
external drive or similar failures).
While Borg uses a data store that is resilient against these failures when used
on journaling file systems, it is not possible to guarantee this with some
hardware -- independent of the software used. We don't know a list of affected
hardware.
If you are suspicious whether your Borg repository is still consistent and
readable after one of the failures mentioned above occurred, run borg check
--verify-data to make sure it is consistent.
Requirements for Borg repository file systems
• Long file names
• At least three directory levels with short names
• Typically, file sizes up to a few hundred MB. Large repositories may require
  large files (>2 GB).
• Up to 1000 files per directory (10000 for repositories initialized with
  Borg 1.0)
• rename(2) / MoveFile(Ex) should work as specified, i.e. on the same file
  system it should be a move (not a copy) operation, and in case of a directory
  it should fail if the destination exists and is not an empty directory, since
  this is used for locking.
• Hardlinks are also used for safer and more secure file updating (e.g. of the
  repo config file), but the code also tries to work if hardlinks are not
  supported.
To display quantities, Borg takes care of respecting the usual conventions of
scale. Disk sizes are displayed in decimal, using powers of ten (so kB means
1000 bytes). For memory usage, binary prefixes are used, and are indicated using
the IEC binary prefixes, using powers of two (so KiB means 1024 bytes).
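The difference between the two conventions is easy to see with a little shell arithmetic. The helpers to_kB and to_KiB below are hypothetical illustrations, not Borg functions, and they use integer division, so results are truncated.

```shell
# to_kB and to_KiB are hypothetical helpers, not part of Borg.
# Integer division: any fractional part is dropped.
to_kB()  { echo "$(( $1 / 1000 )) kB"; }    # decimal: 1 kB = 1000 bytes
to_KiB() { echo "$(( $1 / 1024 )) KiB"; }   # binary: 1 KiB = 1024 bytes

to_kB 2048000     # prints "2048 kB"
to_KiB 2048000    # prints "2000 KiB"
```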
We format date and time conforming to ISO-8601, that is: YYYY-MM-DD and HH:MM:SS
(24h clock).
For more information about that, see:
https://xkcd.com/1179/
Unless otherwise noted, we display local date and time. Internally, we store and
process date and time as UTC.
Borg might use a lot of resources depending on the size of the data set it is
dealing with.
If one uses Borg in a client/server way (with an ssh: repository), the resource
usage occurs in part on the client and in another part on the server.
If one uses Borg as a single process (with a filesystem repo), all the resource
usage occurs in that one process, so just add up client + server to get the
approximate resource usage.
- CPU client:
• borg create: does chunking, hashing, compression, crypto (high CPU usage)
• chunks cache sync: quite heavy on CPU, doing lots of hashtable operations.
• borg extract: crypto, decompression (medium to high CPU usage)
• borg check: similar to extract, but depends on options given.
• borg prune / borg delete archive: low to medium CPU usage
• borg delete repo: done on the server
It won't go beyond 100% of 1 core as the code is currently single-threaded.
Especially higher zlib and lzma compression levels use significant amounts of
CPU cycles. Crypto might be cheap on the CPU (if hardware accelerated) or
expensive (if not).
- CPU server:
- It usually doesn't need much CPU, it just deals with the
key/value store (repository) and uses the repository index for that.
borg check: the repository check computes the checksums of all chunks
(medium CPU usage)
borg delete repo: low CPU usage
- CPU (only for client/server operation):
- When using borg in a client/server way with a
ssh:-type repo, the ssh processes used for the transport layer will
need some CPU on the client and on the server due to the crypto they are
doing - esp. if you are pumping big amounts of data.
- Memory (RAM) client:
- The chunks index and the files index are read into memory
for performance reasons. Might need big amounts of memory (see below).
Compression, esp. lzma compression with high levels might need substantial
amounts of memory.
- Memory (RAM) server:
- The server process will load the repository index into
memory. Might need considerable amounts of memory, but less than on the
client (see below).
- Chunks index (client only):
- Proportional to the number of data chunks in your repo.
Lots of chunks in your repo imply a big chunks index. It is possible to
tweak the chunker params (see create options).
- Files index (client only):
- Proportional to the number of files in your last backups.
Can be switched off (see create options), but next backup might be much
slower if you do. The speed benefit of using the files cache is
proportional to file size.
- Repository index (server only):
- Proportional to the number of data chunks in your repo.
Lots of chunks in your repo imply a big repository index. It is possible
to tweak the chunker params (see create options) to influence the number
of chunks being created.
- Temporary files (client):
- Reading data and metadata from a FUSE mounted repository
will consume up to the size of all deduplicated, small chunks in the
repository. Big chunks won't be locally cached.
- Temporary files (server):
- A non-trivial amount of data will be stored on the remote
temp directory for each client that connects to it. For some remotes, this
can fill the default temporary directory at /tmp. This can be remediated
by ensuring the $TMPDIR, $TEMP, or $TMP environment variable is properly
set for the sshd process. For some OSes, this can be done just by setting
the correct value in the .bashrc (or equivalent login config file for
other shells), however in other cases it may be necessary to first enable
PermitUserEnvironment yes in your sshd_config file, then add
environment="TMPDIR=/my/big/tmpdir" at the start of the
public key to be used in the authorized_keys file.
- Cache files (client only):
- Contains the chunks index and files index (plus a
collection of single-archive chunk indexes which might need huge amounts
of disk space, depending on archive count and size - see FAQ about how to
reduce).
- Network (only for client/server operation):
- If your repository is remote, all deduplicated (and
optionally compressed/encrypted) data of course has to go over the
connection (ssh:// repo url). If you use a locally mounted network
filesystem, additionally some copy operations used for transaction support
also go over the connection. If you back up multiple sources to one target
repository, additional traffic happens for cache resynchronization.
Besides regular file and directory structures, Borg can preserve
• symlinks (stored as symlink, the symlink is not followed)
• special files:
  • character and block device files (restored via mknod)
  • FIFOs ("named pipes")
  • special file contents can be backed up in --read-special mode. By default
    the metadata to create them with mknod(2), mkfifo(2) etc. is stored.
• hardlinked regular files, devices, symlinks, FIFOs (considering all items in
  the same archive)
• timestamps in nanosecond precision: mtime, atime, ctime
• other timestamps: birthtime (on platforms supporting it)
• permissions:
  • IDs of owning user and owning group
  • names of owning user and owning group (if the IDs can be resolved)
  • Unix Mode/Permissions (u/g/o permissions, suid, sgid, sticky)
On some platforms additional features are supported:
Platform                  ACLs [5]   xattr [6]   Flags [7]
Linux                     Yes        Yes         Yes [1]
Mac OS X                  Yes        Yes         Yes (all)
FreeBSD                   Yes        Yes         Yes (all)
OpenBSD                   n/a        n/a         Yes (all)
NetBSD                    n/a        No [2]      Yes (all)
Solaris and derivatives   No [3]     No [3]      n/a
Windows (cygwin)          No [4]     No          No
Other Unix-like operating systems may work as well, but have not been tested at
all.
Note that most of the platform-dependent features also depend on the file
system. For example, ntfs-3g on Linux isn't able to convey NTFS ACLs.
[1] Only "nodump", "immutable", "compressed" and "append" are supported. Feature
    request #618 for more flags.
[2] Feature request #1332
[3] Feature request #1337
[4] Cygwin tries to map NTFS ACLs to permissions with varying degrees of
    success.
[5] The native access control list mechanism of the OS. This normally limits
    access to non-native ACLs. For example, NTFS ACLs aren't completely
    accessible on Linux with ntfs-3g.
[6] extended attributes; key-value pairs attached to a file, mainly used by the
    OS. This includes resource forks on Mac OS X.
[7] aka BSD flags. The Linux set of flags [1] is portable across platforms. The
    BSDs define additional flags.
borg-common(1) for common command line options
borg-rcreate(1),
borg-rdelete(1),
borg-rlist(1),
borg-rinfo(1),
borg-create(1),
borg-mount(1),
borg-extract(1),
borg-list(1),
borg-info(1),
borg-delete(1),
borg-prune(1),
borg-compact(1),
borg-recreate(1)
borg-compression(1),
borg-patterns(1),
borg-placeholders(1)
The Borg Collective