I am syncing directly to a remote and experimenting with creating a real-time /home folder sync. Currently it takes 2m15s for an rclone sync command to run and check all of my files.
What are some flag strategies to safely speed this up? I assume that I do need to check the mod times and checksums every time to know whether a file needs to be updated.
For example, if I increase --checkers from 8 to 50, the sync time decreases by half. Is this a safe move with pCloud, or could it lead to getting throttled?
For others who are using pCloud, what is the safest number of checkers and other performance-based flags that you use which do not lead to throttling? Is any of this information published anywhere that I can reference?
Also, is it dangerous to increase checkers and other flags that could improve my performance? For example, could I set --checkers=1000, or would I risk getting throttled by pCloud?
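For concreteness, here is the sort of command I have been testing. The paths, remote name, and values are just my experiment, not a recommendation; --tpslimit is something I added myself as a safety valve:

```shell
# Experimental sync: more parallel checkers for the check phase, plus a
# transactions-per-second cap as a safety valve against throttling.
# --checkers   number of parallel checkers (rclone's default is 8)
# --tpslimit   limit HTTP transactions per second (disabled by default)
rclone sync /home/user pcloud:home-backup \
  --checkers 50 \
  --transfers 4 \
  --tpslimit 10 \
  --log-level DEBUG
```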
Does setting these flags allow me to exceed the API transaction limit, or does rclone's pCloud remote implementation already limit the number of transactions I am allowed?
I am not getting throttled yet, but I want to understand more about how it happens and what is safe to do to prevent it. I will eventually change the log level to NOTICE when I am finished.
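To be explicit about the logging change I mean (NOTICE is rclone's default level, and -vv is shorthand for debug output):

```shell
# While experimenting: full debug output
rclone sync /home/user pcloud:home-backup -vv

# When finished: quieter routine runs
rclone sync /home/user pcloud:home-backup --log-level NOTICE
```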
rclone cannot exceed hard limits set and enforced by the cloud provider.
if rclone hit a limit, the server would tell rclone.
rclone, on some backends, will throttle itself.
the debug log would tell you all you need to know.
Okay, this is good to hear, and I suspected that in order to implement a particular cloud API, rclone would need to respect the defined limits. However, if rclone doesn't allow a combination of flags to exceed the number of transactions allowed by the cloud provider's API, then how are other users reporting that setting flags got them throttled or banned?
Also, if it is safe to increase checkers and it clearly increases performance, then why is the default for checkers set to a low value?
I haven't seen any throttling messages in my debug log. What does such a message look like, so I can search for it?
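In the meantime I have been grepping for likely keywords. The sample log line below is illustrative only (my guess at what a pacer/rate-limit message might look like, not copied from rclone):

```shell
# Make a small sample log to demonstrate the search. The pacer line is
# illustrative - my guess at what a rate-limit message might look like.
cat > /tmp/rclone-sample.log <<'EOF'
2024/01/01 12:00:00 DEBUG : pacer: Rate limited, increasing sleep to 500ms
2024/01/01 12:00:01 DEBUG : notes.txt: Size and modification time the same
EOF

# Scan a debug-level log for signs of the server pushing back.
grep -iE 'pacer|retry|429|throttl' /tmp/rclone-sample.log
```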
So I am clearly getting better performance. I assume that if I set checkers to some arbitrarily large value like 10,000 I would get diminishing returns and eventually hit some sort of bottleneck, as long as it is safe to do so.
I have been experimenting with this. Because I am syncing so frequently, most of the time the sync command spends is on checking the existing files. Usually there are only 1 or 2 files to actually transfer.
Setting the checkers to a very high value seems to have made the biggest difference. Can you confirm the information that asdffdsa gave above from a code point of view? Specifically, is it safe to set checkers to a very high value? Is there any risk of rate limiting or throttling in this case?
--checkers is used in rclone as a general measure of concurrency. In this case, setting it higher means doing multiple directory traversals at once, which is probably what is helping.
--checkers can cause network calls - for example, if you are listing pCloud, it controls how many directories are listed at once.
However, for the top-up sync, if there are only a few files to transfer each time then it won't cause you to be rate limited, as you'll only be doing a few API requests each time.
There are only a few files to transfer each time; however, every time the sync command runs it checks thousands of files to determine that it only needs to transfer a few.
So every time any change is made in the filesystem, the sync will check all files in the hierarchy and then only transfer the one or two that have changed. This approach seems inefficient; however, setting --checkers to a very high value really does seem to help: it takes less than a minute to check those thousands of files, as long as this is a safe operation.
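For context, my "real time" setup is essentially a polling loop like the sketch below (the paths, remote name, interval, and checker count are placeholders for my setup, not recommendations):

```shell
#!/bin/sh
# Simplified sketch of my top-up sync loop: re-check the whole tree on
# an interval and let rclone transfer only what has changed.
while true; do
    rclone sync /home/user pcloud:home-backup \
        --checkers 64 \
        --log-level DEBUG \
        --log-file /var/log/rclone-sync.log
    sleep 60  # wait a minute before the next full check
done
```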
What defines an API call? Is each file checked a separate API call in this case, or is the entire operation one API call?