Summary of that previous thread: Using a filter to achieve an incremental backup (e.g., rclone copy src Backup:src --max-age 10d) was unexpectedly slow, even when the filter ended up matching no files at all.
Various options were suggested, with various results. The conclusion was to run a first step of identifying the files locally and then feed that list into the copy, like so:
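A sketch of what that two-step approach might look like (the directory, file name, and list path here are hypothetical stand-ins, not the exact commands from that thread):

```shell
# Sketch of the two-step approach (SRC and file names are hypothetical):
# 1. Identify recently modified files locally with find (fast, no remote calls).
# 2. Hand that list to rclone via --files-from.
SRC=$(mktemp -d)                                   # stand-in for the real source
touch "$SRC/new.txt"                               # a file modified "recently"
find "$SRC" -type f -mtime -10 -printf '%P\n' > /tmp/recent-files.txt
# rclone copy "$SRC" Backup:src --files-from /tmp/recent-files.txt
```

Because the file list is built locally, rclone only has to deal with the files named in it rather than scanning the whole source tree against the filter.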
There's a lot in that thread so I'm not sure what applies and what doesn't. (It looks like the problem there was that the --files-from flag was unexpectedly slow, where in my case it was the solution to the slowdown!)
So...
Can anyone say if copy --no-traverse is appropriate for my situation?
Are there any minuses to using this flag?
[Bonus question] Is this the default behavior for copy? If not, why not?
There is a problem with this strategy: if you delete a folder or file older than 10 days from the source, the deletion will not be reflected in the destination (copy never deletes anything there).
And there is also the problem of folders or files that were just moved (moving does not update the modification time, so a --max-age filter won't pick them up).
This, combined with the use of "copying" (rather than "syncing"), can cause the backup to become "bloated" over time with multiple old files.
I've been through this in the past:
The solution, if you want to keep using this strategy, is to perform a "full sync" from time to time.
A proper incremental backup would use some kind of filesystem change log that explicitly lists everything changed since a given time. But for a quick-and-dirty backup, frequent copies combined with a periodic sync are sufficient.
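A sketch of that schedule, assuming a hypothetical source path and a remote named Backup (not commands from the thread itself):

```shell
# Daily: cheap incremental copy of recent changes (fast, but never deletes).
rclone copy /path/to/src Backup:src --max-age 10d

# Weekly: full sync to remove deleted/moved files and catch anything missed.
# WARNING: sync deletes destination files that no longer exist in the source.
rclone sync /path/to/src Backup:src
```

The weekly sync is what cleans up the "bloat" the daily copies leave behind.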
Since nobody has yet answered these questions I decided to do a test myself locally. My tests showed that the --no-traverse flag did what I wanted (and was not the default behavior), and the performance was excellent. So thanks again, @Ciantic, for bringing this to my attention.
In some cases it will lead to slower checking of files, while in others it will be much faster.
I could give an explanation here based on my (probably flawed) understanding but I know NCW has directly answered this question (and probably more than once). I would suggest you search for his answer(s) to understand how it really works rather than take the info second-hand.
I think the TL;DR is that --no-traverse is faster when you are uploading just a few files to a large folder, while it is potentially detrimental to performance when you have to do work on many or most of the files inside a folder. --no-traverse basically skips listing everything in the folder and only checks the one object it needs right now. That is faster for a single operation, but if you end up checking lots of individual files this way inside the same folder, it would have been more efficient for rclone to just list them all at the start.
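To illustrate the two regimes described above (paths and remote names are hypothetical, and the performance claims are per that explanation, not benchmarks):

```shell
# A few files into a large remote folder: per-file existence checks
# beat listing the whole destination, so --no-traverse helps.
rclone copy /local/one-file.txt Backup:huge-folder --no-traverse

# Many/most files need checking: one destination listing is cheaper
# overall, so leave --no-traverse off and let rclone list up front.
rclone copy /local/many-files Backup:huge-folder
```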