Rclone sync while files change


#1

question:
I am running a task with RCLONE SYNC to sync my whole NAS to GDrive.
I have just 20 Mbps upload speed and 4 TB to be synced, so it takes… a lot of days.

What happens if I work in write mode on my NAS (adding and modifying files) during the SYNC?
Does Rclone take some kind of “snapshot” when it starts?
Does Rclone call VSS on Windows?


#2

rclone will upload the files as it finds them

If it detects that the file changed while it was uploading it will abort the transfer and retry (from v1.40)

No

Not yet. It would be a nice feature, however I’d need help implementing it as I’m not a Windows developer!


#3

It seems rclone does a full directory walk of the destination folder as part of doing a sync. It may be better to store a digest of the folder after the previous copy/sync is done, read the digest from the destination, and then determine the delta to kick-start the next sync.
Comments?


#4

You’ll see lots of similar ideas on the issue tracker :smile:

There are workarounds though

You can do a partial sync using --max-age and --no-traverse (you’ll need the latest beta for that) which is great for fast top-up syncs.
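As a concrete sketch of that top-up run (the remote name gdrive: and the source path are placeholder assumptions; the flags are the ones described above):

```
rclone copy /srv/nas gdrive:backup --max-age 24h --no-traverse -P
```

Here --max-age 24h restricts the run to files modified in the last day, and --no-traverse avoids listing the entire destination first, which is what makes the top-up fast.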

You can also use the cache backend which implements a full local copy of the metadata.
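As a sketch, the cache backend is configured in rclone.conf as a wrapper around an existing remote; the remote names here are assumptions:

```
[gdrive-cache]
type = cache
remote = gdrive:backup
```

You would then sync to gdrive-cache: instead of gdrive:backup, and the cache backend keeps the metadata locally between runs.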


#5

Is the cache persistent, and is it kept in the same location as the config file? I will try out the cache backend and see the performance improvement.

One more quick question on the initial copy. I tried copying ~1M files (totalling 1 TB) to Google buckets. I see reasonably good throughput (~29 MB/s) for almost 970 GB of the transfer. For the remaining 30 GB the throughput drops down to single-digit MB/s. How can I get more insight into the reason for the drop? Do we do another walk to compare the hashes on either side to make sure that there is no error during transmission?


#6

Check out --cache-dir and rclone help flags cache

How many transfers/checkers are you using?

What may have happened is that you have lots of files on one directory (which will be scanned in one thread only).
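For illustration, a run with more concurrency than the defaults (the remote and path are placeholders; --transfers and --checkers are the real flags):

```
rclone sync /srv/nas gdrive:backup --transfers 8 --checkers 16 -P
```

The defaults are 4 transfers and 8 checkers, so this doubles both.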


#7

I will try out the --cache-dir option.

I am not specifying any transfers/checkers, just using the default values. You are right, I do have a lot of files in one directory. I will try increasing the number of transfer/checker threads, assuming there is a help option that describes them. When I use the -P option I can see 4 streams being displayed simultaneously. Do these correspond to 4 threads?

Does increasing the transfers/checkers help in increasing the threads for one directory?


#8

That corresponds to 4 --transfers. Increasing that will help a lot I would have thought.


#9

From the documentation I can see that rclone metadata comprises path, size, and mtime for a file. Do we also store permissions at the file or directory (folder) level in the metadata that goes over to the destination?


#10

I do not believe that gets stored directly. You could take a snapshot of that to a file and upload it for later. The cloud providers don’t understand those permissions.
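As a rough sketch of that snapshot idea (GNU find assumed; the source path and remote name are placeholders), you could dump mode, owner, and group for every path to a manifest and upload the manifest alongside the data:

```shell
#!/bin/sh
# Record POSIX mode, owner, group and path for everything under SRC,
# since the cloud side will not preserve them. SRC defaults to the
# current directory for this sketch.
SRC=${SRC:-.}
find "$SRC" -printf '%m %u %g %p\n' > permissions.txt
# Upload the manifest next to the data (placeholder remote):
#   rclone copy permissions.txt gdrive:backup/
```

Restoring later is then a matter of replaying the manifest with chmod/chown.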


#11

Thanks for the input, Calisro. I will add those details to my work.


#12

In the case of copy or sync, does a restart (due to any interruption) require a full destination folder walk, or does rclone save some state? If not, do we have a config parameter, or can we enable the cache backend, for a quick resume?