smoke me a kipper

i’ll be back(up) for breakfast

Earlier this year I wrote about my backup setup and recently I had to put it to the test.

My PC is a tower that I have on a small stand next to my desk. In the past I had kept the case on my desk but it is rather large and dominates the space a bit too much. The other day my one-year-old toddled into the study and started pushing the power button on my PC, power cycling the machine a few times in quick succession. I knew nothing about this until the next morning, when I booted up my PC and noticed it was very sluggish and it crashed trying to open my browser. After it happened again I started digging through the logs and noticed some filesystem corruption.

As I described in my backup setup post, I have a three-disk RAID 5 array as my $HOME. Because of its size, the nightly backup only covers important documents, etc. A full backup is done periodically to an external drive I keep in my bug out bag. Unfortunately I had not done a full backup in a while, but I knew my nightly backups were good so nothing too important was lost.

I had used xfs on my $HOME, so I unmounted the device and started an xfs_repair. The repair tool very quickly got to Phase 3, showing the output

Phase 3 - for each AG...
        - scan and clear agi unlinked lists
        - 09:50:01: scanning agi unlinked lists - 0 of 32 allocation groups done

The last line was repeated every 15 minutes, for over 36 hours, never changing from “0 allocation groups done”. I don’t think it was doing anything. Eventually I stopped it and ran the repair in check mode. This caused a segmentation fault at Phase 3. I tried again but got the same segfault.
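For reference, a check-only run looks roughly like the sketch below. The device name in the usage line is a placeholder, and `DRY_RUN` is a made-up switch that prints the commands instead of running them.

```shell
# Sketch only: check an XFS filesystem without modifying it.
check_xfs() {
    dev="$1"
    # The filesystem must not be mounted while xfs_repair inspects it.
    ${DRY_RUN:+echo} umount "$dev"
    # -n is no-modify mode: report problems but never write to the device.
    ${DRY_RUN:+echo} xfs_repair -n "$dev"
}
```

Something like `DRY_RUN=1 check_xfs /dev/md0` shows what would be run; dropping `DRY_RUN` (and running as root) performs the actual check.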

After a few days of digging around and trying different things I decided the effort wasn’t worth it. Reluctantly I accepted my losses and started the recovery.

Once the RAID array was reformatted I began the data copy from my external drive. This put me back to when it was last backed up. Then I could rclone my nightly backups from the last run before the corruption and bring that data up to date.

This got me to a relatively good position. Okay, I had lost some random downloads and a little bit of code that hadn’t been pushed to my git server, but nothing serious. It is a little disappointing though: my backup setup is not good enough.

I would like to give zfs a try, or even attempt a mini ceph setup, but that would need some planning and some equipment purchases. I need something in the interim.

I bought an external drive, which now sits permanently plugged into my PC. Instead of using dedup again I opted for an alternative tool, deciding on BorgBackup.

After installing borg I initialised a new repo and kicked off a full backup

 ──── ─ borg init -e repokey /media/backup/borg-kinakuta
 ──── ─ borg create -v --stats /media/backup/borg-kinakuta::$(date +%Y%m%d) $HOME

The first time this ran it froze my system after about an hour and a half.

On the second attempt it did the same thing. This was frustrating, and from what I could find online the symptoms pointed to a bad disk.

When I rebooted after the second freeze my system dropped into maintenance mode, unable to mount /home. I checked the RAID array and sure enough one of the disks was missing. I reassembled the array ignoring the faulty disk and got back up and running, then ordered another drive.
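Bringing an array up without its failed member can be sketched as below. This assumes Linux software RAID managed by mdadm, which the post does not confirm, and every device name is a placeholder. `DRY_RUN` is a made-up switch that prints the commands instead of running them.

```shell
# Sketch only: assumes mdadm; /dev/md0 is a hypothetical array name.
ARRAY=/dev/md0

assemble_degraded() {
    # Arguments: the surviving member devices.
    # Stop any half-assembled array first; a failure here is not fatal.
    ${DRY_RUN:+echo} mdadm --stop "$ARRAY"
    # --run starts the array even though one member is missing.
    ${DRY_RUN:+echo} mdadm --assemble --run "$ARRAY" "$@"
}

add_replacement() {
    # $1 = partition on the replacement disk; the array rebuilds onto it.
    ${DRY_RUN:+echo} mdadm --manage "$ARRAY" --add "$1"
}
```

For example, `DRY_RUN=1 assemble_degraded /dev/sda1 /dev/sdb1` prints the two mdadm commands. Once the array is running degraded, `cat /proc/mdstat` shows its state and, later, the rebuild progress.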

I decided to give borg another shot, so I kicked off the backup again. This time it succeeded.

 ──── ─ borg create -v --stats /media/backup/borg-kinakuta::$(date +%Y%m%d) $HOME
Enter passphrase for key /media/backup/borg-kinakuta:
Creating archive at "/media/backup/borg-kinakuta::20221023"
------------------------------------------------------------------------------
Repository: /media/backup/borg-kinakuta
Archive name: 20221023
Archive fingerprint: 125d8f26a952dadb0053e17c8c73bb70852f509c3db4b340021c99f5f8daa8ff
Time (start): Sun, 2022-10-23 10:58:05
Time (end):   Sun, 2022-10-23 20:34:51
Duration: 9 hours 36 minutes 46.09 seconds
Number of files: 1343775
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                2.36 TB              2.18 TB              1.98 TB
All archives:                2.36 TB              2.18 TB              1.98 TB

                       Unique chunks         Total chunks
Chunk index:                 1620226              2214299
------------------------------------------------------------------------------

Nine and a half hours was quicker than I was expecting. Over the next couple of days I ran backups each evening after I finished work.
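To put that duration in context, the average rate can be worked out from the stats above. This is a quick sketch that assumes borg's default decimal units (1 TB = 10^12 bytes) and ignores the fractional seconds; the function name is made up.

```shell
# Average backup rate in MB/s from a size in TB and a duration.
backup_rate_mb_s() {
    # $1 = size in TB, $2 = hours, $3 = minutes, $4 = seconds
    awk -v tb="$1" -v h="$2" -v m="$3" -v s="$4" \
        'BEGIN { printf "%.1f\n", (tb * 1e12) / (h * 3600 + m * 60 + s) / 1e6 }'
}

# First full run: 2.36 TB in 9 h 36 min 46 s.
backup_rate_mb_s 2.36 9 36 46   # prints 68.2
```

That works out to roughly 68 MB/s sustained over the whole run, reading from the array and writing to the external drive.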

 ──── ─ borg create -v --stats /media/backup/borg-kinakuta::$(date +%Y%m%d) $HOME
Enter passphrase for key /media/backup/borg-kinakuta:
Creating archive at "/media/backup/borg-kinakuta::20221024"
------------------------------------------------------------------------------
Repository: /media/backup/borg-kinakuta
Archive name: 20221024
Archive fingerprint: ec9b9744bcc1baf896eb2b68c14278128688b2e4100172051bc839e57d02fd03
Time (start): Mon, 2022-10-24 17:35:58
Time (end):   Mon, 2022-10-24 17:50:05
Duration: 14 minutes 7.59 seconds
Number of files: 1348078
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                2.36 TB              2.18 TB            325.62 MB
All archives:                4.72 TB              4.37 TB              1.98 TB

                       Unique chunks         Total chunks
Chunk index:                 1627368              4419526
------------------------------------------------------------------------------

 ──── ─ borg create -x -v --stats /media/backup/borg-kinakuta::$(date +%Y%m%d) $HOME
Enter passphrase for key /media/backup/borg-kinakuta:
Creating archive at "/media/backup/borg-kinakuta::20221025"
------------------------------------------------------------------------------
Repository: /media/backup/borg-kinakuta
Archive name: 20221025
Archive fingerprint: 2ff6f558ca5e93f1c0dbfce317deada9d522e5a43538c05f38dea560e110bf9f
Time (start): Tue, 2022-10-25 17:32:16
Time (end):   Tue, 2022-10-25 17:44:37
Duration: 12 minutes 20.92 seconds
Number of files: 1202775
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                2.35 TB              2.18 TB            277.59 MB
All archives:                7.08 TB              6.55 TB              1.98 TB

                       Unique chunks         Total chunks
Chunk index:                 1634630              6477362
------------------------------------------------------------------------------

Fourteen and twelve minutes to back up changes is great. I decided to leave it for two days and observe the time again.

------------------------------------------------------------------------------
Repository: /media/backup/borg-kinakuta
Archive name: 20221027
Archive fingerprint: a3cdca656ee4ee62a31cc135f794ef7de71b62ce7246496908ae529356092b97
Time (start): Thu, 2022-10-27 17:41:28
Time (end):   Thu, 2022-10-27 17:54:23
Duration: 12 minutes 54.20 seconds
Number of files: 1208594
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                2.35 TB              2.18 TB            643.01 MB
All archives:                9.43 TB              8.72 TB              1.99 TB

                       Unique chunks         Total chunks
Chunk index:                 1647966              8540968
------------------------------------------------------------------------------

Almost thirteen minutes for two days of changes, pretty good.
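The "All archives" line of that last run also shows how much the deduplication is saving. A small sketch of the arithmetic, with a made-up function name:

```shell
# Ratio of logical (original) size to what is actually stored on disk.
dedup_ratio() {
    # $1 = original size, $2 = deduplicated size (same unit for both)
    awk -v orig="$1" -v dedup="$2" 'BEGIN { printf "%.2f\n", orig / dedup }'
}

# Four archives: 9.43 TB of original data stored in 1.99 TB.
dedup_ratio 9.43 1.99   # prints 4.74
```

In other words, four full snapshots of $HOME take up about the space of one.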

I am really happy with these results from borg. When I get the chance, the next step is to play around with borgmatic to automate the backups.
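Until borgmatic is in place, the evening run could be wrapped in a small script. The sketch below is only an illustration: the repo path is the one used above, but the retention counts, the function name and the `DRY_RUN` switch (which prints the commands instead of running them) are all assumptions.

```shell
# Sketch of a nightly borg wrapper; retention values are placeholders.
REPO=/media/backup/borg-kinakuta

nightly_backup() {
    # Name the archive after today's date, as in the manual runs.
    archive="$(date +%Y%m%d)"
    # -x keeps borg on a single filesystem, as in the later runs above.
    ${DRY_RUN:+echo} borg create -x -v --stats "$REPO::$archive" "$HOME" &&
    # Thin out old archives: keep 7 dailies and 4 weeklies (assumed policy).
    ${DRY_RUN:+echo} borg prune -v --list --keep-daily 7 --keep-weekly 4 "$REPO"
}
```

For an unattended run borg needs the passphrase from somewhere other than the prompt, for example the `BORG_PASSPHRASE` or `BORG_PASSCOMMAND` environment variables. On borg 1.2 and later, pruned space is only freed after a separate `borg compact`.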

Another full backup will still be done to the drive in my bug out bag, I just have to be better at doing it more regularly. At least now if I need to restore I will be able to recover all of $HOME and not only the important things.
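A restore from the borg repo could look something like the sketch below. The function name and the `DRY_RUN` switch (which prints the command instead of running it) are made up for illustration, and the path in the usage line is a placeholder.

```shell
# Sketch only: restore an archive, or part of one, into a target directory.
REPO=/media/backup/borg-kinakuta

restore_from_borg() {
    # $1 = archive name, $2 = existing directory to restore into,
    # remaining arguments = optional paths inside the archive
    archive="$1"
    target="$2"
    shift 2
    # borg extract writes into the current directory, so change into the
    # target first; the subshell leaves the caller's directory untouched.
    ( cd "$target" && ${DRY_RUN:+echo} borg extract --list "$REPO::$archive" "$@" )
}
```

Paths inside an archive are stored without the leading slash, so a call such as `restore_from_borg 20221027 /tmp/restore home/me/Documents` recreates `home/me/Documents` underneath `/tmp/restore`. `borg list "$REPO"` shows which archives are available.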

My PC has also been moved back onto the desk, away from little button pushers.