1. 22

I am setting up a personal backup system. My idea is to produce a full snapshot of my directory tree regularly. In between snapshots, I want to upload incremental files, i.e. binary diffs of the directory tree.

I’m using pax as my snapshot format, created by GNU tar. For the incremental format, I’m thinking of using rsync “batch mode”, but I wanted to ping lobsters in case there is a better, more standardized alternative. Is there?
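
Roughly what I have in mind, to make the question concrete (paths and names are made up, and I haven’t settled on the exact flags):

  # full snapshot in pax format, created by GNU tar
  tar --format=pax -cf snapshot-2019-06-01.tar /home/me

  # keep a local reference copy in sync and record the changes since the
  # last run as a self-contained rsync batch file
  rsync -a --write-batch=incr-2019-06-15 /home/me/ /srv/reference-copy/

  # the batch can later be replayed against any copy of the reference tree
  rsync -a --read-batch=incr-2019-06-15 /restore/target/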

  1.  

  2. 9

    I’m using Restic on a couple of servers and it’s working very nicely.

    It’s really easy to use, all backups are encrypted, and it can connect directly to many storage providers (Amazon S3, Backblaze B2, Azure Storage, Google Cloud…) or use rclone to reach even more alternatives.
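
    For reference, basic usage looks something like this (repository location and paths are placeholders; B2 credentials come from environment variables):

      # one-time: create an encrypted repository, here on Backblaze B2
      restic -r b2:my-bucket:backups init

      # each run uploads only new/changed chunks (deduplicated, incremental)
      restic -r b2:my-bucket:backups backup /home/me

      # inspect and restore
      restic -r b2:my-bucket:backups snapshots
      restic -r b2:my-bucket:backups restore latest --target /tmp/restore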

    1. 3

      +1 to restic. The Backblaze B2 backend is also easy to configure, cheap, fast, and basically perfect for personal backups.

    2. 8

      borg + borgmatic (rough sketch below):

      • incremental
      • client-side encrypted
      • deduplication
      • and so much more :-)
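
      A rough sketch of day-to-day usage, with placeholder paths (borgmatic wraps the same steps in a config file):

        # one-time: create a client-side-encrypted repo (local path or ssh remote)
        borg init --encryption=repokey /mnt/backup/borg-repo

        # each run stores a new deduplicated archive
        borg create --stats /mnt/backup/borg-repo::home-{now} ~/

        # thin out old archives
        borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 /mnt/backup/borg-repo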
      1. 3

        Borg looks good, similar to Duplicity, but I’m looking for a backup format, not a backup system. I am under some unique constraints, which preclude the use of comprehensive solutions like that.

        1. 1

          Are you able to clarify your constraints any further? Reading down, it looks like:

          • You’re looking for long-term storage
          • You have your own storage backend
          • You want a format that will let you upload incremental changes to [any|your] storage backend and is reasonably platform-agnostic, for future-proofing (eg might be file-based, but not tied to a specific filesystem)

          Are these correct/did I miss anything?

          1. 2

            Pretty spot on!

        2. 2

          I have a shitty bash script I run. Borgmatic seems great, thanks. I have mine pointed at rsync.net, which gives you a discount on storage if you use Borg.

        3. 6

          rsync has a --link-dest flag that you can use for this purpose. It copies files that changed and hardlinks files that didn’t. The links make it look like all of the trees (the full/original and the incrementals) are full backups, which makes finding a specific version of any file easy.
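
          Something along these lines (directory layout is just an example):

            # day 1: full copy
            rsync -a /home/me/ /backups/2019-06-01/

            # day 2: only changed files are copied; unchanged files are
            # hard-linked against the previous tree, so both directories
            # look like complete backups
            rsync -a --link-dest=/backups/2019-06-01/ /home/me/ /backups/2019-06-02/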

          1. 5

            We use Tarsnap to back up https://www.laarc.io/ every few hours.

            They have a tarsnap-gui program that can be configured to run each day and do incremental backups of specific directories.
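
            The command-line side is plain tar-style syntax, roughly (archive names and paths here are placeholders):

              # each archive is stored incrementally/deduplicated server-side
              tarsnap -c -f "site-$(date +%Y%m%d-%H%M)" /var/www/site

              # list and restore
              tarsnap --list-archives
              tarsnap -x -f site-20190615-0300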

            1. 2

              I don’t believe tarsnap is an option for me; I have my own storage service. Can tarsnap easily be configured to output an archive file instead of backing up to their service?

              1. 2

                Hmm, no, but the person running it (Colin Percival aka cperciva) is very reliable. I’d be skeptical that my own storage service would outlast his, given the amount of thought and effort he’s put into it.

                One way to roll your own might be to use bsdiff (also by cperciva) in some fashion on the snapshots you produce.
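
                Something like this on successive snapshots, for example (file names are placeholders; note that bsdiff can be quite memory-hungry on very large inputs):

                  # small binary delta between two full snapshots
                  bsdiff snapshot-june.pax snapshot-july.pax june-to-july.bsdiff

                  # later: rebuild the newer snapshot from the older one plus the delta
                  bspatch snapshot-june.pax snapshot-july.rebuilt.pax june-to-july.bsdiff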

            2. 3

              Personally, I just do a daily incremental btrfs send and then, every week, a full new backup to a non-btrfs filesystem. I encrypt the btrfs send output using a gpg container. Obviously this is only useful if the source filesystem you use is btrfs.
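
              Roughly like this (snapshot names are made up, and the details of the gpg setup vary):

                # read-only snapshot for today
                btrfs subvolume snapshot -r /data /snapshots/2019-06-15

                # incremental stream relative to yesterday's snapshot, gpg-encrypted
                btrfs send -p /snapshots/2019-06-14 /snapshots/2019-06-15 \
                  | gpg --encrypt --recipient backup@example.org \
                  > /mnt/offsite/2019-06-15.btrfs.gpg

                # restore: gpg --decrypt ... | btrfs receive /restore/target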

              1. 3

                Yes, my source data is on zfs and similarly I do incremental zfs send for on-site backups. This is for long-term off-site backups and I want to use something more platform-neutral.

                1. 1

                  Unless zfs receive doesn’t work like btrfs receive (userspace), you should be able to dump the zfs send output to a file as well.
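
                  i.e. something like this (pool/dataset names made up):

                    zfs snapshot tank/home@2019-06-15
                    # incremental stream since the previous snapshot, written to a plain file
                    zfs send -i tank/home@2019-06-14 tank/home@2019-06-15 > /mnt/offsite/home-2019-06-15.zfs
                    # restore later with: zfs receive tank/home < home-2019-06-15.zfs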

                  1. 1

                    Yes, I’m aware, but for my long-term off-site backup I want something platform-neutral. I may not always use zfs in the future.

                    1. 0

                      ok and? you’d just have to rebuild it locally and then cp --archive, tar, or whatever to your new storage. this seems like a pretty trivial problem.

                      1. 2

                        Yes, in the distant future, I could always hunt down the last version of Linux to support ZoL, find a machine that’s able to run it, rebuild the kernel module with the last version of gcc to support that codebase, find an adapter to connect it to my future storage technology, and then re-instantiate the snapshot… but it doesn’t seem wise to rely on any of that being possible. This is meant to be around for 20+ years. ZFS is not upstreamed.

                        1. 1

                          Again, these are all trivial and self-imposed problems. You aren’t going to find some magical file “format” that will solve world hunger. Your filesystem and its dump format are sufficiently advanced and do exactly what you want.

                          1. -1

                            Wow, I won’t find a magical file “format” that will solve world hunger. Damn, that sucks. I was genuinely expecting to solve world hunger with a magical file “format.” Thanks for the reality check.

              2. 2

                duplicity[1] supports quite a few different backends for storing backups, and it performs incremental backups using librsync. I personally use it on all of my personal systems, using S3 (backups encrypted and signed with pgp). Being able to choose the backend storage for backups, and change it easily if necessary, was a major bonus for me, so that I am not reliant upon a single provider (e.g. tarsnap).

                1. http://duplicity.nongnu.org/
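
                A typical invocation looks something like this (bucket, paths, and key ID are placeholders; the exact S3 URL scheme varies by duplicity version):

                  # full backup, encrypted and signed with a gpg key
                  duplicity full --encrypt-key ABCD1234 --sign-key ABCD1234 /home/me s3://my-bucket/backups/home

                  # later runs upload only rsync-style deltas of changed files
                  duplicity incremental --encrypt-key ABCD1234 --sign-key ABCD1234 /home/me s3://my-bucket/backups/home

                  # restore
                  duplicity restore s3://my-bucket/backups/home /tmp/restore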
                1. 1

                  I would use Duplicity; unfortunately, my backups are very large and I don’t have the temp space for it.

                  1. 1

                    Your incremental backups are that large?

                    1. 1

                      Not my incrementals, but my snapshots are. I considered patching Duplicity to allow for streaming backups, but it seems like it would require a deep re-architecture.

                2. 2

                  Found bup (https://bup.github.io/) recently but haven’t tried it yet:

                  Very efficient backup system based on the git packfile format, providing fast incremental saves and global deduplication (among and within files, including virtual machine images).
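
                  From the docs, usage seems to be along these lines (untested on my end; paths and names are placeholders):

                    bup init                          # creates the repository (git packfile based)
                    bup index /home/me                # scan the tree and note what changed
                    bup save -n home-backup /home/me  # store only new/changed chunks
                    bup ls home-backup                # browse what was saved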

                  1. 2

                    GNU tar has an incremental format also, which should be long-term supported. It does things on a per-file basis though, rather than diffing files, so it wouldn’t be suitable if you regularly have large binary files that undergo small changes.
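
                    For reference, it hinges on a metadata file passed via --listed-incremental (file names here are placeholders):

                      # level-0 (full) archive; tar records file metadata in backup.snar
                      tar --create --listed-incremental=backup.snar --file=full.tar /data

                      # later runs with the same .snar only archive new or changed files
                      tar --create --listed-incremental=backup.snar --file=incr-1.tar /data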

                    1. 2

                      Yeah I looked at that. Avoiding it for two reasons:

                      • I don’t want to keep around the “listed-incremental” file
                      • It’s only supported by GNU tar
                    2. 1

                      I really like the simplicity of using rsync for full tree mirrors, and then rsync with hard links for incremental updates. No other tool needed. Of course, you may have reason to avoid this.

                      1. 1

                        If I understand your goals correctly, this is exactly what rsnapshot does: https://rsnapshot.org

                        1. 1

                          Not looking for a backup system as much as I’m looking for a backup format.

                          1. 2

                            Perkeep might do what you’re after; the format is well documented.

                        2. 1

                          I’ve been using Duplicati, backing up to a Minio instance via Duplicati’s S3 mode, when I’m on a machine on which I cannot use glorious Time Machine.