I am using AutoRclone, which uses Rclone to copy files from a source folder to my Team Drive. There are more than 390,000 small files (most of them .jpg and .txt) in the source folder. For now I have more than 1K service accounts to help me do this automatically (once the quota limit for the current service account is reached, i.e., 750 GB every 24 hours, it switches to the next service account automatically).
But the problem is that after switching to a new service account, it takes a very long time for Rclone to read the source and destination folders in order to skip the files that were already copied.
So I am wondering whether it is possible to save some status for the current service account in one task (for example, the list of files the service account has copied, or the list of files it has not copied yet). Then the next service account, which continues copying in a new task, could resume from the saved status and skip a lot of redundant reading and comparing operations. I mean, is it possible to add some flags like --save_status and --resume?
Thanks. I have not used that. Will try. The description of rclone copy is
Copy files from source to dest, skipping already copied
Does it compare checksum, size and mod-time all at the same time to skip files that were already copied? I have looked up the global flags,
--checksum Skip based on checksum (if available) & size, not mod-time & size
--ignore-checksum Skip post copy check of checksums.
--ignore-size Ignore size when skipping use mod-time or checksum.
--ignore-existing Skip all files that exist on destination
--ignore-times Don't skip files that match size and time - transfer all files
--size-only Skip based on size only, not mod-time or checksum
Can it cache only the metadata? Does this happen if you just set chunk size to 0?
By the way, does the cache also cache the non-standard attributes that aren't normally included in a listing? I never thought about that, but I suppose it would be plausible, and it could be useful to know.
I'm not sure that fixes the main problem though, which is having to re-list several times if you run rclone several times.
What would be really nice is if there was a function to dump the remaining transfer list to a file when the operation ends (for whatever reason). That would allow pretty seamless handovers - and also ultra-fast resumes. The user would have to be careful about not using very old listings of course - but in the short term it would be a great tool.
That's just another random idea though.
@xyou365 I think the closest you can get to this currently is to run rclone lsf source:, save that to a file, and then use --files-from during the transfer. Then you will not have to re-list the source again. The problem is that the destination will still have to be re-listed each time, so it probably does not end up saving very much time in a full sync. It may save a decent amount of time in a limited copy however...
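To make that workflow concrete, here is a rough Python sketch that only builds the two command lines and (optionally) saves the listing. The remote names source: and teamdrive: and the file name files.txt are placeholders I made up, and I am assuming you want a recursive, files-only listing (-R and --files-only on lsf); adjust to your setup.

```python
import subprocess
from typing import List


def listing_command(source: str) -> List[str]:
    # Recursive listing of files only, one path per line,
    # which is the shape --files-from expects.
    return ["rclone", "lsf", "-R", "--files-only", source]


def copy_command(source: str, dest: str, list_file: str) -> List[str]:
    # Copy only the paths named in list_file, so the source
    # does not need to be listed again.
    return ["rclone", "copy", "--files-from", list_file, source, dest]


def save_listing(source: str, list_file: str) -> None:
    # Runs rclone (must be installed and configured) and writes
    # the listing to list_file. Not called automatically here.
    with open(list_file, "wb") as out:
        subprocess.run(listing_command(source), stdout=out, check=True)


# Hypothetical usage:
#   save_listing("source:", "files.txt")
#   subprocess.run(copy_command("source:", "teamdrive:", "files.txt"), check=True)
```

The listing only needs to be made once; every later service account can reuse the same file, as long as the source has not changed in the meantime.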
Yes, it runs the same comparison again each time.
By default, if the size is identical and the modtime is "identical" (i.e. within the margin of error) then it just doesn't copy the file because it would be redundant. Checks should normally be very fast to request - and the comparison on the CPU is trivial (seconds for tens of thousands of files). On backends that support it, --fast-list will make listing even faster (as much as 15x). However, full syncs on very large collections of data can still take a couple of minutes.
It is also possible to force a --checksum comparison, but this typically happens automatically if it is easy to do (if both sides already have precalculated checksums).
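As a rough illustration of that default check, here is a simplified model in Python. This is not rclone's actual code; the function name and the 1 second tolerance are my own assumptions for the example (in rclone the tolerance depends on the backend's time precision and --modify-window).

```python
from datetime import datetime, timedelta
from typing import Optional


def needs_copy(
    src_size: int,
    src_mtime: datetime,
    dst_size: Optional[int] = None,
    dst_mtime: Optional[datetime] = None,
    window: timedelta = timedelta(seconds=1),  # assumed tolerance
) -> bool:
    # File is missing on the destination: it has to be copied.
    if dst_size is None or dst_mtime is None:
        return True
    # Different size: the files cannot be the same.
    if src_size != dst_size:
        return True
    # Same size: copy only if the modtimes differ by more than the window.
    return abs(src_mtime - dst_mtime) > window


t = datetime(2020, 1, 1, 12, 0, 0)
# Same size, modtime half a second apart: treated as identical, so skipped.
skip_example = needs_copy(100, t, 100, t + timedelta(milliseconds=500))
```

The comparison itself is this cheap per file; the slow part is getting the two listings in the first place.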
But it does need to re-list each time to compare because rclone does not remember doing that already last time, so the listings are technically redundant (after doing it the first time). As I mentioned above, if there was a way to save the transfer list then we would not have to list and could just start copying the files we already know we checked and compared.
@ncw Or does there exist some way to use rclone check/cryptcheck for this and dump a pre-compared list to a --files-from compatible format? maybe? ...
The way to import such a list already exists (--files-from), but there is currently no way to dump a transfer list in progress. I only know of it being possible to make a list by using rclone lsf, but this will not be a compared list, it will just be a list of all files.
If you wanted to make a compared list (i.e. the same as the transfer list would be) you could do that in scripting: list both locations, then compare them yourself. It is very possible - and some people have shown scripts that do this for you already - but it would certainly be a nice feature to have built in. I do agree with that.
What you want is an output like the comm command, where you can say whether you want unique files in source, common files, or unique files in dest. I guess there is another column too, which is common files that differ.
I'm sure there is an issue with that idea in it too but I couldn't find it!
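A comm-style comparison like that can be sketched in a few lines of Python, done outside rclone. I am assuming here that both listings were saved with something like rclone lsf -R --files-only --format ps, which gives one path;size per line; the function names are made up for the example, and it compares on size only (no modtime or checksum).

```python
from typing import Dict, Iterable, List


def parse_listing(lines: Iterable[str], sep: str = ";") -> Dict[str, int]:
    # Each line is assumed to look like "path;size".
    # Splitting from the right keeps any ";" inside the path intact.
    entries: Dict[str, int] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        path, size = line.rsplit(sep, 1)
        entries[path] = int(size)
    return entries


def compare(src: Dict[str, int], dst: Dict[str, int]) -> Dict[str, List[str]]:
    # The four "columns": unique to source, unique to dest,
    # common and identical (by size), common but different.
    return {
        "src_only": sorted(p for p in src if p not in dst),
        "dst_only": sorted(p for p in dst if p not in src),
        "same": sorted(p for p in src if p in dst and src[p] == dst[p]),
        "differ": sorted(p for p in src if p in dst and src[p] != dst[p]),
    }


def transfer_list(src: Dict[str, int], dst: Dict[str, int]) -> List[str]:
    # What a copy would need to send: missing files plus changed files.
    result = compare(src, dst)
    return sorted(result["src_only"] + result["differ"])
```

Writing the output of transfer_list to a file, one path per line, gives something in the shape that --files-from reads, so the next run could start copying without comparing again. The usual caveat applies: the list is only as fresh as the listings it was built from.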