Native Backup

Backing up a native (bare-metal) machine is much harder than backing up a virtual machine. With VMs, your hypervisor should have a built-in backup feature that can suspend (pause) the entire machine, snapshot the disk and the entire RAM contents, resume the machine, and continue backing up the snapshot in the background. If you have the luxury of running your server as a VM (e.g., Proxmox PVE, or any cloud provider), backups become much easier.

When running Docker natively on a Raspberry Pi, you do not have that option. This chapter deals with the more intricate procedure of backing up a non-virtual machine.

Restic

Restic is a modern open-source backup tool with many great features, including incremental backups, encryption, and offsite upload. It can maintain very large backup sets efficiently, because it only needs to upload the changes since the last backup.

One of the best ways to install Restic is with the script found on this blog post:

Daily backups to S3 with Restic and systemd timers
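Once installed, a typical Restic workflow looks something like the following sketch. The endpoint, bucket, credentials, and paths here are placeholders for illustration; adapt them to your own S3 provider:

```shell
## Hypothetical values -- substitute your own endpoint, bucket, and keys:
export AWS_ACCESS_KEY_ID="test"
export AWS_SECRET_ACCESS_KEY="xxxxxxxx"
export RESTIC_REPOSITORY="s3:https://s3.example.com/test"
export RESTIC_PASSWORD_FILE="$HOME/.restic-password"

if command -v restic >/dev/null; then
    restic init                      ## one-time repository creation
    restic backup /home/pi           ## incremental: uploads only changes
    restic forget --keep-daily 7 --keep-weekly 4 --prune   ## retention
fi
```

Subsequent runs of `restic backup` reuse the repository and only transfer new or changed data, which is what makes it suitable for large media folders.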

Warning

The only problem with backing up files this way is that, for containers that are always running, you need to make sure the files are flushed to disk before the backup starts; otherwise your backup could be corrupted. For files that are written only once (photos, videos, etc.) this may not be a big deal, but for files that are constantly changing (e.g., databases) it is a real problem.

Tip

Evaluating this Restic script:

Pros:

  • It’s a self-contained script, not dependent on Docker.
  • It can back up several directories and upload them to an offsite S3 bucket. Point the script at any directory and it will make a backup of it (e.g., /home/pi/, /var/lib/docker/volumes; see Cons below).
  • It supports a space-efficient incremental backup strategy.
  • It’s a good option for backing up home directories and large media folders.

Cons:

  • No integration with Docker; it cannot shut down containers before a backup. Backing up /var/lib/docker/volumes is not 100% safe: files that are modified during the backup may become corrupted.
  • Restoration requires the original script and a re-install of Restic.
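On the restoration point: after reinstalling Restic on a fresh machine, recovery is a matter of pointing it at the same repository. A minimal sketch, with placeholder values:

```shell
## Same placeholder repository settings as used for the backup:
export RESTIC_REPOSITORY="s3:https://s3.example.com/test"
export RESTIC_PASSWORD_FILE="$HOME/.restic-password"

if command -v restic >/dev/null; then
    restic snapshots                             ## list available snapshots
    restic restore latest --target /tmp/restore  ## restore newest snapshot
fi
```

The `--target` directory receives the restored tree; you can then move files back into place manually.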

Backup-Volume

Backup-Volume is another backup tool, specifically designed to back up Docker volumes and upload archives to offsite storage (S3, SSH, Dropbox).

Backup-Volume only performs complete backups (no incremental storage). For small datasets this is ideal: each backup is stored in a separate backup-XXXX.tar.gz, and it’s easy to restore from a single file. For larger datasets, the duplication of backup files would be prohibitively expensive and wasteful (you can tune the retention and pruning parameters to save some space, but it won’t compare to the efficiency of Restic).

Backup-Volume has one trick in its favor: it can automatically stop containers before the backup runs and start them again afterward. This makes this style of backup much safer for write-intensive volumes (e.g., databases), and ensures that the data is flushed to disk before the backup starts.
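The stop/start behavior is opt-in per container, driven by a label (the label name matches the one visible in this chapter’s example log output). A hedged compose-file sketch, with a hypothetical database service:

```yaml
services:
  db:
    image: postgres:16
    volumes:
      - db_data:/var/lib/postgresql/data
    labels:
      # Backup-Volume stops this container for the duration of the backup:
      - backup-volume.stop-during-backup=true

volumes:
  db_data:
```

Only labeled containers are stopped; everything else keeps running during the backup.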

Tip

Evaluating Backup-Volume:

Pros:

  • Specifically designed to back up Docker volumes on a cron-like schedule.
  • Manages the container lifecycle: it shuts containers down before a backup starts and restarts them afterward.
  • Each backup is contained in a single file (backup-XXXX.tar.gz), which is uploaded to your S3 provider. Restoration is easy: just download the latest tarball and extract it.
  • Automatic pruning of old archives helps save space.

Cons:

  • No incremental backup support: each backup duplicates the entire dataset, which makes it ill-suited for large datasets.

Setup Backup-Volume

Prepare an S3 bucket offsite

You may want to use your own MinIO S3 service (preferably installed on a separate, offsite server), or a third-party provider (AWS S3, DigitalOcean Spaces, Wasabi, etc.).

You will need to provide the S3 bucket and credentials that the backup process will use when uploading archives:

  • S3 endpoint domain, e.g., s3.example.com.
  • S3 bucket name, e.g., test.
  • S3 access key id, e.g., test.
  • S3 secret key, e.g., xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.
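If you run your own MinIO server, one way to prepare the bucket is with the MinIO client (mc). The alias name, endpoint, and credentials below are placeholders:

```shell
## Placeholder values for illustration:
S3_ENDPOINT="https://s3.example.com"
S3_BUCKET="test"

if command -v mc >/dev/null; then
    ## Register the server under a local alias, then create the bucket:
    mc alias set offsite "$S3_ENDPOINT" test "xxxxxxxx"
    mc mb "offsite/$S3_BUCKET"
fi
```

Third-party providers have their own bucket-creation UIs; the important thing is that the bucket and credentials exist before you configure Backup-Volume.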

Configure Backup-Volume

Run this on your Raspberry Pi
## Configures the default backup-volume instance:
pi make backup-volume config

Select multiple existing volumes to backup together as one archive:

(stdout)
? Select all the volumes to backup
> [x] test1_data
  [ ] forgejo_data
  [x] icecast_config
  [ ] icecast_logs
  [ ] mosquitto_mosquitto
  [ ] traefik_geoip_database
v [ ] traefik_traefik

Choose the backup schedule in cron format:

(stdout)
BACKUP_CRON_EXPRESSION: Enter the cron expression (eg. @daily)

: @every 24h
Tip

Other example schedules:

  • @daily — once a day, at midnight.
  • @hourly — once an hour.
  • @every 12h — every 12 hours.
  • 0 4 * * * — every day at 4:00 AM.

Choose the retention length (number of days) to keep backup archives before automatic pruning happens:

(stdout)
BACKUP_RETENTION_DAYS: Rotate backups older than how many days? (eg. 30)

: 30

You can choose any of the supported storage mechanisms. For demo purposes, choose S3:

(stdout)
> Which remote storage do you want to use? s3

BACKUP_AWS_ENDPOINT: Enter the S3 endpoint (e.g., s3.example.com)

: s3.d.example.com

BACKUP_AWS_S3_BUCKET_NAME: Enter the S3 bucket name (e.g., my-bucket)

: backup-test-1

BACKUP_AWS_ACCESS_KEY_ID: Enter the S3 access key id (e.g., my-access-key)

: backup-test-1

BACKUP_AWS_SECRET_ACCESS_KEY: Enter the S3 secret access key

: OEuL3lMSdvdoFyVjEQTM4Trj/7VhHq7Q7cOFEpQPuxMHxsTVK3Hxne7st6Ty

BACKUP_AWS_S3_PATH: Choose a directory inside the bucket (blank for root)

:
Tip

You should use a dedicated bucket for each backup instance. Alternatively, you can share the same bucket between several instances, as long as you configure a unique BACKUP_AWS_S3_PATH sub-directory for each instance.

You may optionally preserve an additional copy of the archive in a local volume:

(stdout)
> Do you want to keep a local backup in addition to the remote one? No

You can choose to turn on notifications (see separate instructions below).

(stdout)
? Do you want to receive notifications for backup failure?
> No.
  Yes, via email.
  Yes, via webhook.

Install

Run this on your Raspberry Pi
## installs the default backup instance:
pi make backup-volume install

Instances

All volume selections are backed up to the same archive on the same schedule. To back up different volumes on different schedules, create additional instances of Backup-Volume, each with its own config:

Run this on your Raspberry Pi
## Creates a new backup instance named test:
pi make backup-volume instance instance=test
pi make backup-volume install instance=test

Verify backup schedule

Run this on your Raspberry Pi
pi make backup-volume logs
(stdout)
backup-1  | 2024-10-16T02:37:00.263838944Z time=2024-10-16T02:37:00.262Z level=INFO msg="Successfully scheduled backup from environment with expression @daily"
backup-1  | 2024-10-16T02:37:00.266773318Z time=2024-10-16T02:37:00.266Z level=INFO msg="The backup will start at 12:00 AM"
Tip

You should see a plain-text log message describing when the backup will occur (The backup will start at 12:00 AM); however, this message is omitted if you use the @every syntax.

Restore

To restore a volume from a backup, simply untar the archive into the appropriate directory under /var/lib/docker/volumes.
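The mechanics of a restore can be sketched as follows. The paths here are illustrative stand-ins (demonstrated on a temporary directory); on a real system the target would be /var/lib/docker/volumes/&lt;volume&gt;/_data, and the affected containers should be stopped first:

```shell
set -eu

workdir=$(mktemp -d)

## Simulate a volume with data and archive it, tarball-style:
mkdir -p "$workdir/volume/_data"
echo "hello" > "$workdir/volume/_data/file.txt"
tar -czf "$workdir/backup-demo.tar.gz" -C "$workdir/volume" _data

## Restore: wipe the volume's data directory, then extract the archive:
rm -rf "$workdir/volume/_data"
tar -xzf "$workdir/backup-demo.tar.gz" -C "$workdir/volume"

cat "$workdir/volume/_data/file.txt"   ## prints "hello"
```

After extracting into the real volume directory, restart the containers that use the volume.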

Notifications

In the event of a failed backup job, a notification can be sent to a configurable receiver via Shoutrrr (including email, Matrix, Discord, Ntfy, IFTTT, etc.).

Email notifications

Info

To enable email notifications, you must set up postfix-relay separately and configure it to allow clients from the backup-volume network to send mail.

(stdout)
? Do you want to receive notifications for backup failure?
  No.
> Yes, via email.
  Yes, via webhook.

Enter the sender email address

: backup-volume-default@pi.example.com

Enter the recipient email address

: test@example.com

Matrix notifications via webhook

You will need to set up the Matrix Hookshot bot in order to receive a generic webhook from Backup-Volume that notifies the Matrix room you share with the bot.

  • Create a new room and invite the hookshot bot to it.
  • Message the room: !hookshot setup-widget.
  • Open the room Extensions tab and click on the Hookshot extension.
  • Create a new Inbound (Generic) Webhook.
  • Name the webhook backup-volume.example.com after your backup-volume instance.
  • Copy the long URL it gives you, e.g., https://matrix.example.com/hookshot/webhooks/webhook/xxxxxx.
  • Click Save.

Reconfigure the .env_{CONTEXT}_{INSTANCE} file of the backup-volume instance, setting the notification URL as a Generic webhook:

Run this on your Raspberry Pi
pi make backup-volume reconfigure var=BACKUP_NOTIFICATION_URLS="generic://matrix.example.com/hookshot/webhooks/webhook/xxxxx?template=json"
Tip

Important: the Shoutrrr webhook notification URL must start with generic:// (not https://), and it must end with ?template=json.

Re-install to load the new config:

Run this on your Raspberry Pi
pi make backup-volume install

To test the notification, set BACKUP_CRON_EXPRESSION=@every 1m and change the S3 credentials so that they are incorrect, thus triggering a failure notification on the next run.

You should see the JSON notification in the channel, which unfortunately is not formatted very nicely:

Example matrix message:

{ "message": "Running backup-volume failed with error: main.(*script).copyArchive: error copying archive: s3.(*s3Storage).Copy: error uploading backup to remote storage: [Message]: 'The specified bucket does not exist.', [Code]: NoSuchBucket, [StatusCode]: 404\n\nLog output of the failed run was:\n\ntime=2024-10-16T19:45:27.092Z level=INFO msg=\"Stopping 1 out of 26 running container(s) as they were labeled backup-volume.stop-during-backup=true.\"\ntime=2024-10-16T19:45:27.713Z level=INFO msg=\"Created backup of `/backup` at `/tmp/backup-default-2024-10-16T19-45-27.tar.gz`.\"\ntime=2024-10-16T19:45:27.972Z level=INFO msg=\"Restarted 1 container(s).\"\ntime=2024-10-16T19:45:28.259Z level=INFO msg=\"Removed tar file `/tmp/backup-default-2024-10-16T19-45-27.tar.gz`.\"\n", "title": "Failure running backup-volume at 2024-10-16T19:45:27Z" }