I’m looking for a backup utility that does incremental backups, but with a twist: every run should produce its own archive containing only the files that changed since the previous run.
I tried rsync, but it doesn’t seem to be able to do what I want, or more likely, I don’t know how to make it do that.
Here is an example of what I want to achieve.
I have the following files:
testdir
├── picture1
├── randomfile1
├── randomfile2
└── textfile1
I want to run the backup utility and basically create an archive (or a tarball) of all of these files in a different directory:
$ mystery-command testdir/ testbak
testbak
└── 2020-02-16--05-10-45--testdir.tar
Now, let’s say the following day, I add a file, such that my structure looks like:
testdir
├── picture1
├── randomfile1
├── randomfile2
├── randomfile3
└── textfile1
Now when I run the mystery command, I will get another tarball for that day:
$ mystery-command testdir/ testbak
testbak
├── 2020-02-16--05-10-45--testdir.tar
└── 2020-02-17--03-24-16--testdir.tar
Here’s the kicker: I want the backup utility to detect that picture1, randomfile1, randomfile2 and textfile1 have not changed since the last backup, and to back up only the new or changed files, which in this case is just randomfile3, such that:
tester@raspberrypi:~ $ tar -tf testbak/2020-02-16--05-10-45--testdir.tar
testdir/
testdir/randomfile1
testdir/textfile1
testdir/randomfile2
testdir/picture1
tester@raspberrypi:~ $ tar -tf testbak/2020-02-17--03-24-16--testdir.tar
testdir/randomfile3
So as a last example, let’s say the next day I changed textfile1, and added picture2 and picture3:
$ mystery-command testdir/ testbak
testbak/
├── 2020-02-16--05-10-45--testdir.tar
├── 2020-02-17--03-24-16--testdir.tar
└── 2020-02-18--01-54-41--testdir.tar
tester@raspberrypi:~ $ tar -tf testbak/2020-02-16--05-10-45--testdir.tar
testdir/
testdir/randomfile1
testdir/textfile1
testdir/randomfile2
testdir/picture1
tester@raspberrypi:~ $ tar -tf testbak/2020-02-17--03-24-16--testdir.tar
testdir/randomfile3
tester@raspberrypi:~ $ tar -tf testbak/2020-02-18--01-54-41--testdir.tar
testdir/textfile1
testdir/picture2
testdir/picture3
With this system, I would save space by storing only the incremental changes in each backup (plus, obviously, the initial master backup that contains all the files). I would also keep a history of those changes: for example, if I changed a file on day 2 and changed the same file again on day 3, I could still retrieve the day 2 version, from before the day 3 change.
I think it’s kinda like how Git works 🙂
I know I could probably write a script that runs a diff and then selects the files to back up based on the result (or, more efficiently, computes a checksum for each file and compares it with the previous run), but I want to know if there’s any utility that can do this a tad easier 🙂
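For illustration, the checksum-based script described above could look roughly like the following Python sketch. It is not an existing utility, just one possible way to get the behaviour in the examples. The function name incremental_backup, the manifest.json file kept in the backup directory, and the optional stamp parameter are all assumptions made up for this sketch.

```python
import hashlib
import json
import tarfile
import time
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def incremental_backup(src, dest, stamp=None, manifest_name="manifest.json"):
    """Archive only the files whose checksum differs from the previous run.

    Hypothetical helper: returns (archive_path, changed_files), or
    (None, []) when nothing is new or changed.
    """
    src = Path(src).resolve()
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)

    # Checksums recorded by the previous run (empty on the first run).
    manifest_path = dest / manifest_name
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}

    # Checksums of every regular file currently under src.
    current = {}
    for p in sorted(src.rglob("*")):
        if p.is_file():
            current[p.relative_to(src).as_posix()] = file_digest(p)

    # A file is backed up if it is new or its checksum changed.
    changed = [rel for rel, digest in current.items() if old.get(rel) != digest]
    if not changed:
        return None, []

    stamp = stamp or time.strftime("%Y-%m-%d--%H-%M-%S")
    archive = dest / f"{stamp}--{src.name}.tar"
    # Mode "x" refuses to overwrite an existing archive with the same name.
    with tarfile.open(archive, "x") as tar:
        for rel in changed:
            tar.add(src / rel, arcname=f"{src.name}/{rel}")

    # Remember the current state for the next run.
    manifest_path.write_text(json.dumps(current, indent=2))
    return archive, changed
```

Calling incremental_backup("testdir/", "testbak") once per day would give one timestamped tarball per run, as in the listings above. The sketch has clear limits: it assumes the backup directory is not inside the source directory, it does not record deleted files, and restoring a given day means extracting the archives in order from oldest to newest up to that day.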