Incremental backups and efficiency (continued)

Continuing the discussion from Incremental backups and efficiency:

Summary of that previous thread: using a filter to achieve an incremental backup (e.g., rclone copy src Backup:src --max-age 10d) was unexpectedly slow, even when the filter ended up matching no files at all.

Various options were suggested, with mixed results. The conclusion was to first identify the changed files locally and then feed that list into the copy, like so:

rclone lsf --max-age 10d src > files-to-copy
rclone copy --files-from files-to-copy src Backup:src

Then yesterday this comment came in from @Ciantic:

I think --no-traverse was put back, since it's in the docs. So I think the command is just copy with the --no-traverse flag.

So... my reply (which I hope @Ciantic sees):

Thanks much for the idea!

Yes, it looks like that flag has returned as of v1.46, and it is promising. It was restored in response to the thread "Using --files-from with Drive hammers the API".

There's a lot in that thread, so I'm not sure what applies and what doesn't. (It looks like the problem there was that the --files-from flag was unexpectedly slow, whereas in my case it was the solution to the slowdown!)

So...

  • Can anyone say whether copy --no-traverse is appropriate for my situation (example command below)?
  • Are there any downsides to using this flag?
  • [Bonus question] Is this the default behavior for copy? If not, why not?
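
For reference, here is roughly what I would run if the flag applies; I haven't tested this combination yet, so treat it as a guess rather than a known-good recipe:

rclone lsf --max-age 10d src > files-to-copy
rclone copy --files-from files-to-copy --no-traverse src Backup:src

That is, the same two-step approach as before, just with --no-traverse added to the copy.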

Thanks.

There is a problem with this strategy: if you delete a folder or file from the source that is older than 10 days, the deletion will not be reflected in the destination.

There is also the problem of folders or files that have just been moved (a move does not change the file timestamp).

This, combined with the use of "copying" (rather than "syncing"), can cause the backup to become "bloated" over time with old files that no longer exist in the source.

I've been through this in the past:

The solution (if you want to keep using this strategy) is to perform a "full sync" from time to time.
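
Something like this, run weekly or monthly (note that sync mirrors the source, so it will also delete destination files that no longer exist locally):

rclone sync src Backup:src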

Yes, I concur.

To write a proper incremental backup, the best choice would be to use some kind of filesystem log that explicitly lists all the changes since a given time. But for a quick-and-dirty backup, a frequent copy combined with a periodic sync is sufficient.
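
For instance, a crontab along these lines would cover it (the paths and schedule here are just placeholders):

# Daily at 02:00: incremental copy of anything changed in the last 10 days
0 2 * * * rclone lsf --max-age 10d /data/src > /tmp/files-to-copy && rclone copy --files-from /tmp/files-to-copy --no-traverse /data/src Backup:src
# Sunday at 03:00: full sync to clean up deleted or moved files
0 3 * * 0 rclone sync /data/src Backup:src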

Thanks.

The goal would be to use the right tool for the job. It seems you are trying to use rclone like a hammer when you really want a screwdriver.

If you want to do incremental backups, there are other options that work much better.

https://www.duplicati.com/ is a free backup tool that works with Google Drive out of the box and supports encryption.

https://restic.net/ is another option, but I find it less easy to use out of the box; it's nice, but a little complex at first.

Completely agree.

I'm a very satisfied user of duplicacy.com.

Yeah, you have a good point. I use rclone because I like it and it is really easy to set up and maintain.

I'll look into these alternatives and give them a spin.

Thanks.