1. 36
  1. 6

    If you want a similar tool that also supports asymmetric encryption (air-gapped decryption keys), then please also give mine a try:

    https://github.com/andrewchambers/bupstash

    1. 1

      Nice! A state of the art feature, as I understand it, is to permit multiple concurrent writers to the same archive. Does bupstash support this?

      Duplicacy supports this via lock free techniques: https://github.com/gilbertchen/duplicacy/wiki/Lock-Free-Deduplication

      1. 1

        It supports multiple concurrent writers with no problem. The lock-free removal is interesting though - maybe something I can add.

        1. 1

          Out of curiosity, would such a change sidestep the problem described in this issue from a while back?

          1. 1

            Probably not, however I am investigating ways to support filesystems that don’t support file locks. One example is how sqlite3 does it, with an option to use a fallback instead of a filesystem lock (like attempting to create a directory).
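
            For filesystems without working file locks (some network filesystems, for example), a minimal sketch of that "create a directory as the lock" fallback might look like this in Go - the names, timeout and polling interval are made up for illustration, and since bupstash itself is written in Rust this is only the idea, not its code:

              // dirlock: os.Mkdir is atomic, so whichever writer manages to create
              // the directory owns the lock; everyone else polls until it goes away
              // or they give up.
              package dirlock

              import (
                  "errors"
                  "os"
                  "time"
              )

              // Acquire takes the lock by creating lockDir, retrying until timeout.
              func Acquire(lockDir string, timeout time.Duration) error {
                  deadline := time.Now().Add(timeout)
                  for {
                      err := os.Mkdir(lockDir, 0o700)
                      if err == nil {
                          return nil // we created the directory, lock acquired
                      }
                      if !errors.Is(err, os.ErrExist) {
                          return err // permissions, missing parent directory, ...
                      }
                      if time.Now().After(deadline) {
                          return errors.New("timed out waiting for lock " + lockDir)
                      }
                      time.Sleep(100 * time.Millisecond)
                  }
              }

              // Release drops the lock by removing the directory.
              func Release(lockDir string) error {
                  return os.Remove(lockDir)
              }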

            1. 1

              Ok thanks for the detail. I am really excited to see what you come up with. Bupstash looks absolutely amazing to me.

      2. 1

        Another alternative for asymmetric encryption is rdedup, but I haven’t found (or bothered writing myself) the tooling around it to make it a worthwhile backup solution for me.

        I’m currently using restic on my file server, backing up to Wasabi S3-compatible storage. Works great.

      3. 7

        restic is one of the current generation of backup tools that use rolling hashes to get snapshot-oriented, block-based deduplication. (See also arqbackup.)

        If you’re on a previous-gen tool such as duplicity, rsnapshot, Apple Time Machine, rdiff-backup, then it’s worth a look.

        1. 5

          Not only rolling hashes, but also content-defined chunking (https://github.com/restic/chunker), which is just magic really: it deduplicates segments that don’t fall on block boundaries.

          Compression was what kept me on Borg, but I’m happy to give restic a try now. I hope they’ve improved on their performance issues…

          1. 3

            What sorts of datasets did you miss compression for? In my experience anything significant already has file-level compression (e.g. jpeg, mpeg, git packfiles, …)

            1. 2

              I haven’t looked into the details, but my laptop backup gets 20% smaller with compression.

              1. 1

                Plenty of things don’t: think SQLite databases (or any database really), most configuration files, some PDFs.

                In my experience even standard lz4 can squeeze 5% out of almost pure media-file datasets, and zstd, which restic uses, does a bit better. In more realistic applications the savings tend to be 20-50%, and I’ve even got a single ZFS dataset that sits at 800% compression. An upside of zstd is also that decompression speed is not a function of the compression settings, so running zstd at its maximum (bearable) setting (I’ve already upgraded my restic repo) is well worth it for backups.
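
                If you want a rough idea of what zstd would buy you on your own data before converting a repository, a quick check is easy to write. This sketch uses the pure-Go github.com/klauspost/compress/zstd package and a file path from the command line - both are just choices for the example, not anything restic-specific:

                  // zstdratio: print the compressed size of one file at several zstd levels.
                  package main

                  import (
                      "fmt"
                      "log"
                      "os"

                      "github.com/klauspost/compress/zstd"
                  )

                  func main() {
                      if len(os.Args) != 2 {
                          log.Fatal("usage: zstdratio <file>")
                      }
                      data, err := os.ReadFile(os.Args[1]) // e.g. an SQLite DB or a tar of /etc
                      if err != nil {
                          log.Fatal(err)
                      }
                      levels := []zstd.EncoderLevel{
                          zstd.SpeedFastest, zstd.SpeedDefault,
                          zstd.SpeedBetterCompression, zstd.SpeedBestCompression,
                      }
                      for _, lvl := range levels {
                          enc, err := zstd.NewWriter(nil, zstd.WithEncoderLevel(lvl))
                          if err != nil {
                              log.Fatal(err)
                          }
                          out := enc.EncodeAll(data, nil) // one-shot compression, no output writer needed
                          enc.Close()
                          fmt.Printf("%-8s %9d -> %9d bytes (%.1f%% of original)\n",
                              lvl, len(data), len(out), 100*float64(len(out))/float64(len(data)))
                      }
                  }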

              2. 1

                I just read the article linked from that github (https://restic.net/blog/2015-09-12/restic-foundation1-cdc/) and it looks like “content-defined chunking” is just a new term for the same rolling hash concept used by borg (and bitbottle). Is there some reference that can explain the difference?

                1. 4

                  I think that linked article already does, but doesn’t call it out explicitly. They’re completely different concepts that just happen to be used together here. You could have CDC without a rolling hash (for example by recomputing a SHA hash at every offset), and you could use a rolling hash over whole fixed-size blocks without doing any chunking. Restic and borg both use the two together to achieve the same thing.
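
                  A toy example might make the split clearer: the rolling hash is just a cheap way to evaluate a "should I cut here?" test at every byte offset, while content-defined chunking is the decision to cut where the content says so rather than every N bytes. This uses a deliberately simple additive rolling sum - real tools use Rabin fingerprints or buzhash plus minimum/maximum chunk sizes - so treat it as an illustration of the mechanics only:

                    package main

                    import (
                        "crypto/sha256"
                        "fmt"
                        "math/rand"
                    )

                    const (
                        window   = 64        // bytes covered by the rolling hash
                        boundary = 1<<11 - 1 // cut when the low 11 bits are all ones (~2 KiB average)
                    )

                    // chunk cuts data wherever the rolling hash matches the boundary pattern,
                    // so inserting a byte early in the stream only moves nearby cut points
                    // instead of shifting every later fixed-size block.
                    func chunk(data []byte) [][]byte {
                        var chunks [][]byte
                        var sum uint32
                        start := 0
                        for i, b := range data {
                            sum += uint32(b) // byte entering the window
                            if i >= window {
                                sum -= uint32(data[i-window]) // byte leaving: the "rolling" part
                            }
                            if i >= window && sum&boundary == boundary {
                                chunks = append(chunks, data[start:i+1])
                                start = i + 1
                            }
                        }
                        if start < len(data) {
                            chunks = append(chunks, data[start:])
                        }
                        return chunks
                    }

                    func main() {
                        data := make([]byte, 1<<20)
                        rand.New(rand.NewSource(1)).Read(data)

                        chunks := chunk(data)
                        fmt.Printf("%d chunks; first few sizes:", len(chunks))
                        for _, c := range chunks[:min(5, len(chunks))] {
                            // A real tool would now hash each chunk with a strong, non-rolling
                            // hash and deduplicate by that digest.
                            digest := sha256.Sum256(c)
                            fmt.Printf(" %d (%x…)", len(c), digest[:4])
                        }
                        fmt.Println()
                    }

                  With fixed-size blocks, the same data shifted by one byte would share nothing with the previous backup; with content-defined cuts only the chunks around the edit change.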

                2. 1

                  Do you know by chance if kopia also has “content defined chunking”?

              3. 3

                If you happen to only have one file to back up (such as a database), you can stick zstd --rsyncable - in your pipeline. I save around two thirds of the size via deduplication (on top of the compression) using this method.

                1. 1

                  I have switched between a few backup tools over the years; I’ve probably used restic the most. I need to set that up again - I haven’t since reinstalling my machine. No critical data, but definitely some data worth backing up.

                  1. 1

                    I use a different tool. Also written in Go. It suffers from bad memory usage and an occasional OOM. A quick glance at restic’s issues reveals that it suffers from the same issue.

                    Do you have any recommendations for a backup tool that supports deduplication and provides solid encryption? Preferably with low disk space requirements for index on the source machine.

                    1. 4

                      Borg/borgmatic (cw: github) satisfies those requirements for me.

                      1. 2

                        Not sure if this is the other tool you were referring to, but I’ve been a Kopia user for the better part of a year and it’s worked great for me so far!

                        https://github.com/kopia/kopia

                        Supports deduplication and encryption out of the box, along with sinks to popular cloud providers (I use Backblaze B2).

                        1. 1

                          Yes, I use kopia. As I said, it does OOM for me once in a while. It also uses 10GB of cache while backing up about 50GB of files. Seems a bit excessive.

                          1. 2

                            I’ve personally never had an OOM with a ~600GB backup working set.

                            That said, I have 64GB RAM on both my desktop and laptop (each of which is backed up using Kopia).

                      2. 1

                        Does anybody back up SQLite files with restic? I noticed that it’s not as simple as pointing restic at the folder that contains the DB, as the content might get corrupted.

                        1. 3

                          it’s not as simple as pointing restic at the folder that contains the DB

                          Definitely don’t do that.

                          https://www.sqlite.org/backup.html
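
                          For completeness, one safe pattern is to take a consistent copy of the live database first and point restic at the copy. A minimal sketch in Go using VACUUM INTO (SQLite >= 3.27) - the paths and the mattn/go-sqlite3 driver are just assumptions for the example, and the sqlite3 CLI can do the same with .backup:

                            package main

                            import (
                                "database/sql"
                                "log"

                                _ "github.com/mattn/go-sqlite3" // any SQLite driver works here
                            )

                            func main() {
                                db, err := sql.Open("sqlite3", "live.db")
                                if err != nil {
                                    log.Fatal(err)
                                }
                                defer db.Close()

                                // VACUUM INTO writes a transactionally consistent copy to a new file
                                // even while other connections keep writing to live.db. It refuses to
                                // overwrite an existing file, so remove any previous snapshot first.
                                if _, err := db.Exec(`VACUUM INTO 'snapshot.db'`); err != nil {
                                    log.Fatal(err)
                                }
                                log.Println("snapshot.db is safe to hand to restic")
                            }

                          Then back up snapshot.db rather than live.db (and skip its -wal/-shm companions).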

                          1. 3

                            You should be backing up filesystem snapshots if you want to avoid backup smearing; the backup tool can’t coordinate with modifications in progress.

                            1. 1

                              That’s in general a very bad idea. There are situations where you can back up a running database: when the database makes sure that a finished write won’t leave it in an inconsistent state (most serious databases do) and the filesystem can take a snapshot at a single point in time, not in the middle of a write (ZFS can do that, for example).

                              And good software tends to have its recommended backup procedures documented. I’d strongly recommend reading those docs for SQLite, but also for Postgres (it’s way too common to just go with an SQL dump, which has a great number of downsides).

                              Don’t blindly back up anything that might write. It could mean that your backup is worthless.