Hello, I'm working on optimizing a rather large daily one-way backup of a file server to box.com: roughly 20 TB and 5-6 million files. I'm using copy instead of sync due to the quirks of how Box deals with versions / deleted items.
What has helped immensely, since this is a one-way backup, is caching the Box side of things with a very long age time (I think I set it to 6 months) and comparing on size only. I was wondering: if everything is running on one box, would there be any benefit to caching the local FS as well (with an obviously somewhat shorter max age)? The daily delta isn't too bad, but scanning the entire data set for changes takes a very long time. I have played around with doing an rclone check and generating a specific list of files to copy, but I guess in practice it would be a wash?
EDIT: perhaps a check against the time since last change locally, so files that haven't been touched in ages are skipped wholesale?
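A sketch of the kind of copy described above, assuming a cache remote named `boxcache:` wrapping the real Box remote in rclone.conf (the remote names, paths, and info-age value are placeholders, not taken from the original post):

```shell
# Daily one-way backup, comparing on size only so no checksums are read.
# "boxcache:" is assumed to be a cache remote wrapping "box:" with a long
# directory/info age (e.g. info_age = 4320h, about 6 months) in rclone.conf.
rclone copy /srv/fileserver boxcache:backup \
  --size-only \
  --log-file rclone-backup.log -v
```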
It is pretty quick scanning the local FS. The only thing you could speed up significantly would be calculating checksums if you were doing a --checksum sync. Since you are doing a --size-only sync I wouldn't have thought it would help much, but it would be interesting to measure!
Some people use rclone lsf -R --max-age 1d /path/to/local > files.txt to get a list of candidates (files modified within one day), then feed that into rclone copy --files-from files.txt /path/to/local remote: - that doesn't do any remote directory scanning (which has been improved further in the latest beta), so it is quite a quick way of getting incremental updates.
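Written out as the two-step workflow just described (paths and the remote name are placeholders; adding --files-only so directories don't end up in the list is my assumption about the intent):

```shell
# Step 1: recursively list local files modified in the last day.
# Only the local filesystem is scanned here.
rclone lsf -R --max-age 1d --files-only /path/to/local > files.txt

# Step 2: copy exactly those files; rclone does not have to walk
# the remote directory tree to work out what to transfer.
rclone copy --files-from files.txt /path/to/local remote:backup
```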
Rclone should really have a sync mode which does this (it used to, with --no-traverse, but I had to take that out after re-arranging the internals!).
As a quick follow-up: it's been a while since I used --files-from, and I know I am just missing something incredibly simple even after repeatedly re-reading the various posts about this.
Will test. I didn't have a chance to figure out what I was doing wrong, but lsf was not seeing a brand-new folder, and lsf with max age didn't find it either. I'll re-test from scratch with the latest beta and advise.
OK, a quick update: I tested v1.44-099-g26e2f1a9-beta and it seems to work fine. However, going back, it also seems to work fine with 1.44.
I think where I went wrong was inconsistently using CMD or PowerShell without thinking. For example, lsf "M:\PATH" does not work in CMD at all but works in PowerShell, and I think the > delta.txt redirect comes out with a different encoding under PowerShell as well - will do a full run and advise.
Basically it appears that double quotes are what really screws up CMD, while they are generally helpful in PowerShell. I don't have the error in front of me, but with quotes on the beta it explicitly said that the internal UNC translation it does was broken - I'll try to capture it at some point. My two cents, probably a feature request: maybe add piping internally to the app? On a related note, wouldn't just doing a copy with --max-age produce the same results and performance as lsf / --files-from?
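One way to sidestep the redirect encoding difference mentioned above: in Windows PowerShell 5.1 the plain `>` redirect writes UTF-16LE, which --files-from can't read as a plain text list. Forcing the encoding explicitly avoids that; a sketch, with a hypothetical path:

```shell
# PowerShell: pipe through Out-File instead of using ">" so the
# file list is written as ASCII/UTF-8 rather than UTF-16LE.
rclone lsf -R --max-age 1d "M:\PATH" | Out-File -Encoding ascii delta.txt
```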
I've written a bit of code to bring back the --no-traverse flag - that would be the equivalent. So just make the copy you want and use the --no-traverse flag.
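Assuming the flag works as described, the lsf / --files-from two-step would then collapse into a single command (paths and remote name are placeholders):

```shell
# Copy only files changed in the last day, without listing the
# remote directory tree at all thanks to --no-traverse.
rclone copy --no-traverse --max-age 1d /path/to/local remote:backup
```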