When debugging my backup process, I spent 90% of the time waiting for rclone to skip over (check) existing files. I am syncing millions of small files to a remote destination: sometimes an online one (B2), other times a network share.
In order to speed this up, we can create a cache file containing key-value pairs where:
key = hash of the command-line that triggered checking of the directory
value = timestamp when the check completed
Then, when the user runs a command, we’d skip any directories that were already processed in the past X milliseconds (with some reasonable default value). When running incremental backups, I would happily skip any directory processed in the past hour, since I only run backups once a day in production (shorter time periods indicate I am debugging).
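A minimal sketch of what I have in mind, as a wrapper script rather than an rclone feature (the cache directory, the hash choice, the one-hour window, and the rclone command line are all placeholder assumptions):

```shell
# Hypothetical cache: key = hash of the command line, value = timestamp
# of the last completed check. Skip the run if the last check is recent.
CACHE=/tmp/check-cache
mkdir -p "$CACHE"
MAX_AGE=3600                                    # seconds; "past hour"

CMD='rclone sync /data remote:backup'           # placeholder command line
KEY=$(printf '%s' "$CMD" | sha256sum | cut -d' ' -f1)
NOW=$(date +%s)

LAST=$(cat "$CACHE/$KEY" 2>/dev/null || echo 0)
if [ $((NOW - LAST)) -lt "$MAX_AGE" ]; then
    echo "skip: last check was $((NOW - LAST))s ago"
else
    echo "run: $CMD"                            # the real script would execute it here
    printf '%s' "$NOW" > "$CACHE/$KEY"          # record completion time
fi
```

On a second invocation within the hour the hash matches and the run is skipped, which is the behaviour I want while debugging.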
The file checks are pretty quick depending on what you’re syncing. Provide your command and which remote type you are having issues with. You can also use --fast-list if you have plenty of RAM as that will make the directory listings more efficient.
Alternatively, you can work around your desire to sync files changed in the past hour by generating a list of files to sync with a command-line tool that looks at modification times, and then using the --files-from parameter with that generated list to sync only those files.
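That workaround might look like this (the source directory, file names, and the remote name `remote:backup` are placeholders; GNU `find` and `touch` are assumed):

```shell
# Build a list of files modified in the last 60 minutes, then hand it
# to rclone via --files-from so only those files are synced.
SRC=$(mktemp -d)
mkdir -p "$SRC/sub"
touch "$SRC/sub/recent.txt"                # modified just now
touch -d '2 hours ago' "$SRC/sub/old.txt"  # outside the one-hour window

# -mmin -60: modified less than 60 minutes ago; %P strips the $SRC prefix
# so the list contains paths relative to the source, as rclone expects.
find "$SRC" -type f -mmin -60 -printf '%P\n' > /tmp/recent-files.txt
cat /tmp/recent-files.txt                  # sub/recent.txt

# The actual sync would then be, e.g.:
#   rclone copy "$SRC" remote:backup --files-from /tmp/recent-files.txt
```

This avoids checking the untouched files entirely, at the cost of trusting modification times on the source side.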
where [source] is an entry point I wish to back up recursively and [target] is configured as follows:
type = local
nounc = false
The machines are linked over an 802.11ac Wi-Fi connection; the target is a Windows share on an external drive attached over a USB 3 connection. When backing up large files I get high speeds (upwards of 60 Mbps), but when running file checks I get only about 3 Mbps.
I didn’t know about this feature before now. Yes, it does what I’m looking for with one big caveat: you cannot hit CTRL+C or break out of a recursive command. The entire reason I am trying to cache is because I am debugging my backup script. I need to abort it as soon as I see something wrong.