In the past rclone always did a complete check first; only after the checking finished would it start transferring.
These days rclone starts checking and, as soon as it finds files that need transferring, starts those transfers, performing checking and transferring in parallel.
Is it possible to force it to finish checking before it starts transferring?
The reason for this is that rclone seems to use the equivalent of only one full CPU core (Windows x64 build of rclone). The work is spread out across the cores, but total usage still never exceeds one core's worth (12.5% for me).
This results in both the transfer (using crypt) and the checking slowing down over time. Once checking finishes, the transfer speeds up again. Crypt does not seem to need much CPU time, so most of it goes to checking.
If there are other ways to optimize this please let me know.
With --dry-run it takes around 2 minutes 13 seconds
Without --dry-run, transfers start at some point and checking takes around 9-25 minutes (depending on how soon the transfers started), getting slower and slower the longer it runs and dragging the transfer speeds far below 3 MB/s. The transfer speed drop only happens when a single transfer is active; if 4 transfers are busy, transfer speed does not drop.
As you can see, my bandwidth is limited to 3 MB/s, so disk access for transfers should be limited too.
CPU usage is only 0.5-1% for the rclone process while it is transferring but no longer checking. While it is still checking, CPU usage for the rclone process is (as mentioned earlier) 12.5%-13%.
Disks are hardware RAID 5 using 4 × 8 TB enterprise-grade HDDs.
22 TB of used data spread over ~183,000 files.
I did some testing. The disks are 100% active in the beginning, until the transfers start. While 3 transfers were active, disk activity dropped to 20% while CPU usage stayed at 12.5%-13% for as long as checking continued. CPU usage dropped after checking finished.
Does Rclone slow down disk activity for checking while transferring?
Does --bwlimit affect the speed of disk reading?
With --dry-run, the red rectangle shows the runtime of the rclone process: around 2 minutes 13 seconds in this case.
First rclone does checking until it fills up the transfer queue. At that point the checking slows down to the speed of the transfers, since one file needs to finish transferring before another can be added to the transfer queue.
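The effect described above is essentially a bounded producer/consumer pipeline. Here is a minimal Python sketch of the idea (not rclone's actual code; BACKLOG stands in for --max-backlog): once the queue is full, the checker can only proceed as fast as the transferrer drains it.

```python
import queue
import threading
import time

BACKLOG = 5          # stands in for rclone's --max-backlog (default 10000)
FILES_TO_CHECK = 20

q = queue.Queue(maxsize=BACKLOG)  # the bounded transfer queue
checked = 0

def checker():
    """Producer: checking runs freely until the queue fills up."""
    global checked
    for i in range(FILES_TO_CHECK):
        q.put(i)      # blocks once the backlog is full
        checked += 1

def transferrer():
    """Consumer: slow transfers gate the checker via the full queue."""
    for _ in range(FILES_TO_CHECK):
        q.get()
        time.sleep(0.01)   # simulated slow transfer
        q.task_done()

t1 = threading.Thread(target=checker)
t2 = threading.Thread(target=transferrer)
t1.start(); t2.start()
t1.join(); t2.join()
print(checked)  # prints 20
```

With a small backlog, the checker spends most of its time blocked on `q.put()`, which is the same throttling you see in --progress once the backlog limit is hit.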
You can see this in the --progress display - the checking will max out at 10,000 if this theory is correct.
So if you add --max-backlog 200000 rclone will do the checking as quickly as it can. It will still start transfers as soon as it can but the speed of the transfers won't delay the checking.
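For example, a sketch of such an invocation (the local path and crypt remote name here are placeholders for your own; the log below suggests the remote is called cryptGDrive:):

```
rclone sync D:\data cryptGDrive:D --max-backlog 200000 --bwlimit 3M --progress
```

The larger backlog lets the checker run ahead of the transfers instead of being gated by them.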
Rclone doesn't, but your OS may, depending on IO scheduling. Rclone is doing more IO when it is transferring and checking at the same time.
Yes it does - the disk will be read at that speed on average.
I did have a request for rclone to do the checking phase before the transfer phase - it would be relatively easy to do...
Does this mean that it will slow down once 4 transfers are active, or once 10,000 transfers (the backlog) are scheduled?
In my example the totals for the full task were 184005 checks, 7 transferred, 6 deleted.
The weird thing is that while rclone does slow down at some point, as you said, in my example the CPU usage was not affected. You would assume that slower checking would mean lower CPU usage.
Of course it only reads as much as it can send, but does --bwlimit actually limit the disk reads, or could rclone read faster (above the bwlimit) if that were necessary?
Transfers take very little CPU time. After checking has finished, CPU usage drops to 0.5-1%.
What do these numbers mean?
Is the left number the Google Drive check and the right one the local drive check? It's hammering against the 10,000 backlog limit while the HDDs are barely doing anything.
Tried adding --max-backlog 200000. It made the right number shoot up like a rocket; the left stayed the same.
Removed --bwlimit: it made no change to the speed of the left number. The moment the transfers start, checking starts slowing down.
The left number goes really fast (2 seconds) from 0 to ~35,000. Then it encounters 2 transfers (out of the 4 available transfer slots).
The transfers start and the left checking number increases at ~2000 p/s, but keeps getting slower.
After a minute it only increases by ~200 p/s
After a couple minutes of transferring the transfers start getting slower.
After 5 minutes it only increases by ~100 p/s
After 8 minutes it only increases by ~80 p/s and transfers are at 2/3 of available bandwidth (still only 2 transfers)
After 14 minutes it only increases by ~70 p/s and transfers are at 1/2 of available bandwidth (still only 2 transfers)
The CPU usage stays at 12.5% (one full core) all through this. Disk usage never went higher than 10%.
Side note: I doubled all the left-hand checking numbers from what I saw in --progress, since --progress updates at half-second intervals.
How long the checking takes really depends on when rclone encounters something to transfer: quite early, as above, or nearer the end, as in the screenshots in earlier posts.
Well, I guess you get the point of this behavior.
You can slow all this down with the --tps-limit flag, which sets the number of transactions per second.
These are the number of checks completed over the number of checks rclone knows need to happen. Rclone will try to keep the backlog filled up to the number you've specified (10,000 by default). The right number increases over time as files are found, and the left as rclone actually checks them.
For your issue, I think you'd likely benefit from --tps-limit and a lower backlog.
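A sketch of that suggestion (the local path is a placeholder; the remote name is taken from the log further down; the exact values are something to experiment with):

```
rclone sync D:\data cryptGDrive:D --tps-limit 5 --max-backlog 1000 --progress
```

Lowering the backlog keeps less queued at once, while --tps-limit paces the API transactions instead of letting them run flat out.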
Yeah, I know that much; that's logical. Rclone needs to find out how many checks it needs to do and how many it has done. But what is its basis: is the right number based on files it found locally, and the left number how many files it has checked on GDrive?
The idea is that if I run into the backlog limit, the left number is not increasing fast enough to keep up with the right. That would mean rclone and GDrive are not communicating fast enough, in which case increasing --tps-limit and various other pacer limits might be necessary. On the other hand, because CPU usage stays the same while overall performance drops, it might just be the CPU equivalent of a memory leak, if that makes any sense.
So my question is: why are the INFO and NOTICE log levels not running at full speed while DEBUG is?
I would have thought DEBUG would affect performance more than NOTICE and INFO, so I'm really surprised by this result.
I also tested the --check-first flag. It fixes the problem as well.
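For reference, a sketch of that kind of invocation (the local path is a placeholder; the remote name matches the log line below):

```
rclone sync D:\data cryptGDrive:D --check-first --progress
```

With --check-first, rclone is supposed to run the whole checking phase before any transfers begin.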
So I did a quick test with DEBUG and INFO:
--check-first does not work as intended with DEBUG: transfers start before checking is finished.
It does work as intended with INFO: transfers do not start until checking has fully finished.
This is based on what --progress shows me.
I'm surprised here as well, since the log level should only affect logging. Why does it affect --check-first?
Both DEBUG and INFO showed in the logs
2020/05/16 18:55:30 INFO : Encrypted drive 'cryptGDrive:D': Running all checks before starting transfers