Originally published on 17 May 2020
I currently back up my main laptop to an external USB drive using Ubuntu's built-in "Backups" application. It works reasonably well but has a few drawbacks:
It can take several hours to back up less than 500 GB worth of data. The application spends several hours scanning the files and then several more to back them up. This is true even when performing incremental backups.
Every few backup cycles, it has to recreate the entire archive which takes even longer.
The backups are not automated. They only kick off when I remember to plug in the USB drive.
I used to keep the USB drive at the office, where it served as my "off site" backup. Now that I no longer go into the office, the USB drive typically sits no more than a few feet from my laptop, which leaves it just as exposed as the laptop to a local disaster such as theft or fire.
Therefore, I thought it would be a good time to look into something that would offer a few advantages over my current backup regimen, specifically something that would allow me to run automated, encrypted off site backups as an added layer of protection. I've heard good things about BorgBackup in a number of forums and sites and so, without conducting any more research or due diligence, decided to jump right in and see if I could configure it to safely back up my laptop to an Azure-based cloud VPS. Here's how I went about it.
You can install BorgBackup (or Borg, as I'll refer to it going forward) from your distribution's packages, but chances are it will be a slightly older version. I went ahead and installed the latest release (version 1.1.11 at the time of writing) via the standalone binary available on the project's GitHub site, on both my local machine (the laptop I wanted to back up) and the Ubuntu VPS which will store the backups:
$ curl -L -O https://github.com/borgbackup/borg/releases/download/1.1.11/borg-linux64
$ sudo mv borg-linux64 /usr/local/bin/borg
$ sudo chmod +rx /usr/local/bin/borg
$ borg --version
borg 1.1.11
I don't think it matters but I tried to ensure that I had the same version of Borg installed on both the local and remote machines.
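If you would rather check than assume, a small sketch along these lines compares the two version strings. The helper names `borg_version` and `same_borg_version` are hypothetical, and the example feeds in literal strings so that it runs without Borg or SSH access.

```shell
#!/bin/sh
# Hypothetical helper: strip the leading "borg " from `borg --version` output.
borg_version() {
    # "borg 1.1.11" -> "1.1.11"
    printf '%s\n' "${1#borg }"
}

# Hypothetical helper: report whether two version strings match.
same_borg_version() {
    if [ "$(borg_version "$1")" = "$(borg_version "$2")" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}

# Example with literal strings; in practice you would pass in something like
#   "$(borg --version)"  and  "$(ssh -p PORT user@host borg --version)"
same_borg_version "borg 1.1.11" "borg 1.1.11"
```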
Before we can create any backups, we first have to create, or initialize, a repository on our remote machine. According to the excellent Borg documentation, repositories (or repos for short), are
filesystem directories acting as self-contained stores of archives. Repositories can be accessed locally via path or remotely via ssh. Under the hood, repositories contain data blocks and a manifest tracking which blocks are in each archive. If some data hasn’t changed from one backup to another, Borg can simply reference an already uploaded data chunk (deduplication).
In my initial testing and experimentation, I found that the directory which will contain our repo must be completely empty. In my earlier post describing how to add disks to an Azure virtual machine for more storage, we created a new mountpoint creatively called mountpoint to which the new disk was attached. If you look closely at the contents of that directory, you'll see that there is a lost+found directory that already exists:
$ ls -all ~/mountpoint/
total 24
drwxr-xr-x 3 root root  4096 May 8 18:43 .
drwx------ 2 root root 16384 May 8 18:43 lost+found
Therefore, you'll need to store your new Borg repo in its own directory. You can handily specify this by using the borg init command which takes the form of:
# Local repository, repokey encryption, BLAKE2b (often faster, since Borg 1.1)
$ borg init --encryption=repokey-blake2 /path/to/repo
If your repo is stored on a remote machine, you can initialize it via:
# Remote repository (accesses a remote borg via ssh)
$ borg init --encryption=repokey-blake2 user@ip.address:/path/to/repo
If the remote machine in which your repo is hosted uses a non-standard port (i.e. something other than 22) for SSH connections, you can use the following syntax to initialize a new repo:
$ borg init --encryption=repokey-blake2 ssh://user@ip.address:port/path/to/repo
For example, to create a new repository in the new directory newrepo on a remote host that uses the non-standard SSH port 2222, you would run the following command:
$ borg init --encryption=repokey-blake2 ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo
Upon successful execution of the borg init command, you should see something like the following message:
Enter new passphrase:
Enter same passphrase again:
Do you want your passphrase to be displayed for verification? [yN]: n
By default repositories initialized with this version will produce security
errors if written to with an older version (up to and including Borg 1.0.8).
If you want to use these older versions, you can disable the check by running:
borg upgrade --disable-tam ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo
See https://borgbackup.readthedocs.io/en/stable/changes.html#pre-1-0-9-manifest-spoofing-vulnerability for details about the security implications.
IMPORTANT: you will need both KEY AND PASSPHRASE to access this repo!
Use "borg key export" to export the key, optionally in printable format.
Write down the passphrase. Store both at safe place(s).
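The repository location forms used with borg init above (scp-style for the default SSH port, ssh:// when a port is given) can be captured in a small helper. This is only a sketch: the function name `repo_url` and its argument order are my own invention, not something Borg provides.

```shell
#!/bin/sh
# Hypothetical helper: build a Borg repository location string.
# Usage: repo_url USER HOST PATH [PORT]
repo_url() {
    repo_user=$1
    repo_host=$2
    repo_path=$3
    repo_port=${4:-}
    if [ -n "$repo_port" ]; then
        # ssh:// form, used when a non-standard port is given
        printf 'ssh://%s@%s:%s%s\n' "$repo_user" "$repo_host" "$repo_port" "$repo_path"
    else
        # scp-style form for the default SSH port
        printf '%s@%s:%s\n' "$repo_user" "$repo_host" "$repo_path"
    fi
}

# Prints the location used in the example above
repo_url borg 123.45.678.90 /home/borg/mountpoint/newrepo 2222
```

The result could then be passed straight to a command such as borg init, or stored in a variable for reuse.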
We can now create our first backup, or archive.
Archives are "snapshot[s] of the data of the files 'inside' it. One can later extract or mount an archive to restore from a backup." An archive is the result of a single backup, or an "instance" of a backup. Every time you back up a folder or filesystem on a local machine, you create an archive. The basic syntax to create an archive is based on the borg create command which looks like this:
# Backup ~/Documents into an archive named "my-documents"
$ borg create /path/to/repo::my-documents ~/Documents
Again, if you are creating an archive on a remote machine accessed with a non-standard SSH port, you would slightly modify the syntax as follows:
# Backup ~/Documents into an archive named "my-documents"
$ borg create ssh://user@ip.address:port/path/to/repo::my-documents ~/Documents
Following along from our earlier example, we would enter:
$ borg create ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo::archive1 /home/borg/Documents
Enter passphrase for key ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo:
The next hour or day, we can create another archive called archive2:
$ borg create ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo::archive2 /home/borg/Documents
Enter passphrase for key ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo:
Because Borg uses deduplication to reduce the amount of data that must be encrypted, transmitted and stored in your remote repository, this backup should finish faster than the original one.
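To get a feel for why the second backup is cheaper, here is a toy illustration of chunk-level deduplication. It is not how Borg works internally (Borg uses content-defined chunking, whereas this sketch cuts files into fixed 1024-byte pieces), and the file names and chunk size are arbitrary choices of mine. It assumes the sha256sum tool found on most Linux systems.

```shell
#!/bin/sh
# Toy illustration of chunk-level deduplication (NOT Borg's actual algorithm:
# Borg uses content-defined chunking; this uses fixed-size chunks).
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Emit a 1024-byte block filled with the given character.
block() { head -c 1024 /dev/zero | tr '\0' "$1"; }

# Two "backups" of the same file: only the last block differs.
{ block a; block b; block c; } > "$work/v1"
{ block a; block b; block d; } > "$work/v2"

# Cut both versions into 1024-byte chunks.
split -b 1024 "$work/v1" "$work/v1.chunk."
split -b 1024 "$work/v2" "$work/v2.chunk."

# Count all chunks, then count distinct chunk hashes.
total=$(ls "$work"/*.chunk.* | wc -l | tr -d ' ')
unique=$(sha256sum "$work"/*.chunk.* | awk '{ print $1 }' | sort -u | wc -l | tr -d ' ')

echo "total chunks: $total, unique chunks: $unique"
```

Only the unique chunks would need to be stored, which mirrors the gap between the "Total chunks" and "Unique chunks" figures that borg info reports.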
Now that we have created a couple of archives in our repository, we can view a list of them using the borg list command:
$ borg list /path/to/repo
For remote repos such as the one we created above, the command would be:
$ borg list ssh://user@ip.address:port/path/to/repo
Or more specifically,
$ borg list ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo
Enter passphrase for key ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo:
archive1 Tue, 2020-05-12 21:35:49 [9b299a10e72f2f848827e25a8b0898e2609976b8404f54fe2245017cdfb74ed7]
archive2 Tue, 2020-05-12 21:44:55 [23e522ffaae1fce50fde17aed8641d2860befb115fb65ffabd460e8a8a50a8c8]
To see which files and directories are contained within a specific archive, just add the archive's name at the end. For example:
$ borg list ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo::archive1
To display metadata about a repository or an archive, such as its size, creation date/time and number of files, use the borg info command:
$ borg info /path/to/repo
For remote repos such as the one we created above, the command would be:
$ borg info ssh://user@ip.address:port/path/to/repo
Or more specifically,
$ borg info ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo
Enter passphrase for key ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo:
Repository ID: f977dd383c62882b32f679d809fd65b7a4fdfaf0e4f8d70e06833158ae24522c
Location: ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo
Encrypted: Yes (repokey BLAKE2b)
Cache: /home/user/.cache/borg/f977dd383c62882b32f679d809fd65b7a4fdfaf0e4f8d70e06833158ae24522c
Security dir: /home/user/.config/borg/security/f977dd383c62882b32f679d809fd65b7a4fdfaf0e4f8d70e06833158ae24522c
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
All archives:                4.68 GB              4.58 GB              1.63 GB

                       Unique chunks         Total chunks
Chunk index:                    1643                 4071
As with the borg list command, if you want metadata about a particular archive, just append the archive's name to the end of the borg info command:
$ borg info ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo::archive1
Enter passphrase for key ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo:
Archive name: archive1
Archive fingerprint: 9b299a10e72f2f848827e25a8b0898e2609976b8404f54fe2245017cdfb74ed7
Comment:
Hostname: localhost
Username: user
Time (start): Tue, 2020-05-12 21:35:49
Time (end): Tue, 2020-05-12 21:37:15
Duration: 1 minutes 26.47 seconds
Number of files: 1168
Command line: borg create ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo::archive1 /home/user/tmp
Utilization of maximum supported archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                2.34 GB              2.29 GB            141.12 kB
All archives:                4.68 GB              4.58 GB              1.63 GB

                       Unique chunks         Total chunks
Chunk index:                    1643                 4071
"Your backup is only as good as your last restore," a wise sysadmin once said, so it's good to regularly test your ability to restore. It's cleaner to restore your files to a new directory so that's what we'll do here:
$ mkdir ~/restore
$ cd ~/restore
The actual restore is executed using the borg extract command. In its simplest form, you just need to enter the following to extract all files and directories from an entire archive:
$ borg extract --verbose --list /path/to/repo::archive
Note that the --verbose and --list options are not required but can help confirm whether you're restoring the proper files.
To extract a specific directory (e.g. src), run:
$ borg extract --verbose --list /path/to/repo::archive home/USERNAME/src
To continue along with our example from above, here is how we would restore all of the contents of archive1 from our remote repository onto our local machine:
$ borg extract --verbose --list ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo::archive1
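Borg stores paths without the leading slash and extracts them relative to the current directory, so the restored files end up under ~/restore/home/borg/Documents in this example. One way to sanity-check a restore is to compare that tree with the live directory. The sketch below uses a hypothetical helper called `compare_trees` built on diff -r, and demonstrates it on throwaway directories so that it runs without a real backup. Keep in mind that files changed since the backup was taken will legitimately show up as different.

```shell
#!/bin/sh
# Hypothetical helper: compare an original directory with its restored copy.
compare_trees() {
    if diff -r "$1" "$2" > /dev/null; then
        echo "identical"
    else
        echo "different"
    fi
}

# Self-contained demo using throwaway directories in place of real data.
demo=$(mktemp -d)
mkdir -p "$demo/original" "$demo/restore/original"
echo "hello" > "$demo/original/note.txt"
cp "$demo/original/note.txt" "$demo/restore/original/note.txt"
compare_trees "$demo/original" "$demo/restore/original"
rm -rf "$demo"

# With the example from this post, the call would look something like:
#   compare_trees /home/borg/Documents ~/restore/home/borg/Documents
```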
To manually delete an archive you no longer need from a repository, you can use the borg delete command:
# delete a single backup archive:
$ borg delete /path/to/repo::archive1
To delete the whole repository and all of the archives within it, run the same command without specifying the archive:
# delete the whole repository and the related local cache:
$ borg delete /path/to/repo
For example, to completely delete our example repository and all of the archives within it, we would run the following command:
$ borg delete --dry-run ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo
Enter passphrase for key ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo:
You requested to completely DELETE the repository *including* all archives it contains:
archive1 Tue, 2020-05-12 21:35:49 [9b299a10e72f2f848827e25a8b0898e2609976b8404f54fe2245017cdfb74ed7]
archive2 Tue, 2020-05-12 21:44:55 [23e522ffaae1fce50fde17aed8641d2860befb115fb65ffabd460e8a8a50a8c8]
Type 'YES' if you understand this and want to continue: YES
Note that in the example above, we used the --dry-run option flag to simulate what would really happen. To permanently delete the repository, run the borg delete command without the --dry-run option.
The borg delete command is fine to use if you want to manually cull your archives to free up space. If you plan on automating your backups, which you should and which we'll cover below, you should also get familiar with the borg prune command. This keeps the archives that match a defined set of retention criteria and deletes the rest.
For example, to retain 7 daily, 4 weekly and 12 monthly archives in a given repository, you would run the following command:
$ borg prune \
    --verbose \
    --list \
    --dry-run \
    --show-rc \
    --keep-daily 7 \
    --keep-weekly 4 \
    --keep-monthly 12 \
    /path/to/repo
To prune the example repository we have been using throughout this post, the borg prune command we'd use is:
$ borg prune \
    --verbose \
    --list \
    --dry-run \
    --show-rc \
    --keep-daily 7 \
    --keep-weekly 4 \
    --keep-monthly 12 \
    ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo
Enter passphrase for key ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo:
Keeping archive: archive2 Tue, 2020-05-12 21:44:55 [23e522ffaae1fce50fde17aed8641d2860befb115fb65ffabd460e8a8a50a8c8]
Would prune: archive1 Tue, 2020-05-12 21:35:49 [9b299a10e72f2f848827e25a8b0898e2609976b8404f54fe2245017cdfb74ed7]
terminating with success status, rc 0
Note that in both examples above, we have included the --dry-run option to first see what Borg will prune before actually committing anything. If these commands are to be included in a script for automation, remove the --dry-run flag.
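The dry-run output above keeps archive2 and prunes archive1 because a daily rule keeps only the most recent archive of each day. The following is a simplified sketch of that single rule, not Borg's implementation: the function name `keep_daily` and the input format are assumptions of mine, and the real command also combines the weekly and monthly rules.

```shell
#!/bin/sh
# Simplified sketch of a "--keep-daily N" style rule (not Borg's implementation).
# Input: one archive per line as "NAME YYYY-MM-DD HH:MM:SS".
keep_daily() {
    n=$1
    # Newest first (ISO timestamps sort correctly as text), then keep the
    # first archive seen for each distinct day until N days are covered.
    LC_ALL=C sort -k2,3 -r | awk -v n="$n" '
        !($2 in seen) && count < n { seen[$2] = 1; count++; print "keep  " $1; next }
        { print "prune " $1 }'
}

printf '%s\n' \
    "archive1 2020-05-12 21:35:49" \
    "archive2 2020-05-12 21:44:55" \
    "archive3 2020-05-13 01:00:02" | keep_daily 7
```

Here archive3 is a made-up third archive from the following night, added to show that one archive per day survives.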
Borg handily provides a set of environment variables that you can define to help simplify the various borg commands you use to manage your backups. You can set these each time you open up a shell or place them in your .bashrc file located in your $HOME directory. You should definitely include them in any scripts you create to automatically create and manage your backups. The environment variables I tend to use the most are:
BORG_RSH, which sets the SSH command Borg uses and so lets you specify a custom SSH private key instead of the default id_rsa file. This is handy for when you have multiple SSH keys on your machine, some with a passphrase set and others without it. Use an SSH key without a passphrase if you are using a script to automate your backups to a remote machine accessible via SSH. An example command would be:
export BORG_RSH='ssh -i /home/user/.ssh/id2_rsa'
BORG_REPO. This is probably the most useful of all environment variables. It saves you from having to provide the full path to a repository, especially when you are backing up to a remote host via SSH on a non-standard port (i.e. you no longer need to type ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo for every command, which can get quite tedious). You can declare this environment variable via:
export BORG_REPO='ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo'
Then, going forward, your borg commands would look like:
borg init --encryption=repokey-blake2
borg create ::archiveName /folder/to/back/up
borg list
borg delete
BORG_PASSPHRASE, which stores your repository passphrase so that Borg does not prompt for it. For example, you can set this variable like this:
export BORG_PASSPHRASE='superSecretAndLongPassphrase'
If you are not comfortable including your backup password in plain text in either your .bashrc file or automation script, you can set the BORG_PASSCOMMAND environment variable to use GPG to obtain the repository password from a gpg-encrypted file.
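BORG_PASSCOMMAND holds a command whose output Borg uses as the passphrase. As a rough sketch, the snippet below shows the simplest variant (reading from an owner-only file) so that it can run anywhere, with the GPG variant as a comment. All file locations are placeholders, and the GPG line assumes you have already encrypted the passphrase file and that gpg can decrypt it without prompting.

```shell
#!/bin/sh
# Sketch: keep the passphrase out of the script by pointing BORG_PASSCOMMAND
# at a command that prints it. File locations here are placeholders.
secret_dir=$(mktemp -d)             # stand-in for e.g. a directory under $HOME
trap 'rm -rf "$secret_dir"' EXIT
passfile="$secret_dir/passphrase"

umask 077                           # new files readable by the owner only
printf '%s\n' 'superSecretAndLongPassphrase' > "$passfile"

# Simplest variant: read the passphrase from the owner-only file.
export BORG_PASSCOMMAND="cat $passfile"

# GPG variant (assumes the file was encrypted beforehand, e.g. with
# `gpg --symmetric`, and that gpg can decrypt it non-interactively):
#   export BORG_PASSCOMMAND="gpg --quiet --decrypt $HOME/.borg-passphrase.gpg"

# Borg runs the command and reads its output; this line mimics that step.
$BORG_PASSCOMMAND
```

In a real script you would drop the last line, since printing the passphrase is only there to show what Borg would receive.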
Now that we have covered how to initialize a repository, create and prune archives, and set environment variables, we can put all of this together in a script which we can use to automate our backups. I first created the following directories in my $HOME directory:
borgBackup/
├── log
└── scripts
In the $HOME/borgBackup/scripts directory, create a script called borgBackup.sh with the following:
#!/bin/bash
# Backup a folder or folders to a remote address using borg.
# Usage: ./borgBackup.sh
# To restore: borg extract $BORG_REPO::computer-and-date
#-------------------------------------------------------
# MAKE SURE YOU HAVE INITIALIZED A REPOSITORY BEFORE
# RUNNING THIS SCRIPT
#-------------------------------------------------------
NOW=$(date +"%Y%m%d%H%M")
LOGDIR=/home/user/borgBackup/log
set -eu
export BORG_RSH='ssh -i /home/user/.ssh/id2_rsa'
export BORG_REPO='ssh://borg@123.45.678.90:2222/home/borg/mountpoint/newrepo'
export BORG_PASSPHRASE='superSecretAndLongPassphrase'
export BORG_REMOTE_PATH=/usr/local/bin/borg
# backup miscellaneous files and directories in $HOME
/usr/local/bin/borg create \
--verbose \
--list \
--progress \
--stats \
::$NOW-$(hostname)-home-other \
/home/user \
--exclude '/home/user/borgBackup/log' \
--exclude '/home/user/tmp' \
--exclude '/home/user/.cache' \
--exclude '/home/user/.config' \
--exclude '/home/user/Documents' \
--exclude '/home/user/Downloads' \
--exclude '/home/user/.local' \
--exclude '/home/user/.mozilla' \
--exclude '/home/user/snap' \
--exclude '/home/user/Videos' \
2>> "$LOGDIR/$NOW-log-home-other"
# update backup log
echo "$(date -Is) $NOW-home-other backup finished" >> $LOGDIR/backup.log
# backup Documents directory
/usr/local/bin/borg create \
--verbose \
--list \
--progress \
--stats \
::$NOW-$(hostname)-home-Documents \
/home/user/Documents \
2>> "$LOGDIR/$NOW-log-home-Documents"
# update backup log
echo "$(date -Is) $NOW-home-Documents backup finished" >> $LOGDIR/backup.log
# backup Videos directory
/usr/local/bin/borg create \
--verbose \
--list \
--progress \
--stats \
::$NOW-$(hostname)-home-Videos \
/home/user/Videos \
2>> "$LOGDIR/$NOW-log-home-Videos"
# update backup log
echo "$(date -Is) $NOW-home-Videos backup finished" >> $LOGDIR/backup.log
# backup Downloads directory
/usr/local/bin/borg create \
--verbose \
--list \
--progress \
--stats \
::$NOW-$(hostname)-home-Downloads \
/home/user/Downloads \
2>> "$LOGDIR/$NOW-log-home-Downloads"
# update backup log
echo "$(date -Is) $NOW-home-Downloads backup finished" >> $LOGDIR/backup.log
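# prune each backup set separately to keep 30 daily, 8 weekly and 12 monthly
# snapshots per set; without --glob-archives the rules would apply to all
# archives together and keep only the latest archive of each day, regardless
# of which set it belongs to
for SET in home-other home-Documents home-Videos home-Downloads; do
/usr/local/bin/borg prune \
--verbose \
--list \
--stats \
--show-rc \
--glob-archives "*-$SET" \
--keep-daily 30 \
--keep-weekly 8 \
--keep-monthly 12 \
2>> "$LOGDIR/$NOW-log-prune"
done
# update backup log
echo "$(date -Is) $NOW-pruning finished" >> "$LOGDIR/backup.log"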
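The borg create stanzas for Documents, Videos and Downloads differ only in a label and a path, so one option is to fold them into a function. The sketch below is a possible refactoring rather than the script I actually run. The helper names `archive_name` and `backup_set` and the `DRY_RUN` switch are hypothetical, it uses uname -n in place of hostname, and it assumes BORG_REPO and the other variables have been exported as in the script above. With DRY_RUN left at 1 it only prints the commands, so it runs without Borg installed.

```shell
#!/bin/sh
# Sketch only: folds the repeated `borg create` stanzas into one function.
# DRY_RUN=1 (the default here) prints each command instead of running it.
NOW=$(date +"%Y%m%d%H%M")
HOST=$(uname -n)
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: archive_name LABEL
# e.g. 202005130100-myhost-home-Documents
archive_name() {
    printf '%s-%s-home-%s\n' "$NOW" "$HOST" "$1"
}

# Hypothetical helper: backup_set LABEL PATH
backup_set() {
    # Build the command line once; $1 and $2 are expanded before `set` runs.
    set -- borg create --verbose --list --stats "::$(archive_name "$1")" "$2"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

backup_set Documents /home/user/Documents
backup_set Videos    /home/user/Videos
backup_set Downloads /home/user/Downloads
```

The home-other backup would need an extra way to pass its --exclude options, which I have left out to keep the sketch short.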
To make it executable, run:
$ chmod +x borgBackup.sh
To have your system run this backup script automatically, update your crontab via crontab -e and add a line similar to the one below, adjusting the time and frequency to suit:
# run Borg backup script every day at 01:00
0 1 * * * /home/user/borgBackup/scripts/borgBackup.sh
I hope you have found this introduction to Borg useful. If you have any questions, comments or find any errors, please be sure to reach out to me!