Rclone copy Gdrive to local (multi threaded download) slow

Rclone Version

rclone v1.53.3
- os/arch: linux/amd64
- go version: go1.15.5

By default, rclone uses multi-thread copy when downloading from Google Drive to a local disk.
With small files I don't notice any difference on any system, but with larger files (around 10 GB or more) the download gets stuck at 100%.

So I ran it with the -vvv flag, and it looks like it comes to a halt after it finishes all the downloads.

2020-12-16 00:52:35 DEBUG : example_file.bin: multi-thread copy: stream 3/4 (8260288512-12390432768) size 3.846G finished
2020-12-16 00:52:36 DEBUG : example_file.bin: multi-thread copy: stream 4/4 (12390432768-16520465867) size 3.846G finished
2020-12-16 00:52:36 DEBUG : example_file.bin: multi-thread copy: stream 2/4 (4150853632-8301707264) size 3.866G finished
2020-12-16 00:52:37 DEBUG : example_file.bin: multi-thread copy: stream 4/4 (12452560896-16603268429) size 3.866G finished
2020-12-16 00:52:38 DEBUG : example_file.bin: multi-thread copy: stream 4/4 (12785614848-17047406003) size 3.969G finished
2020-12-16 00:52:39 DEBUG : example_file.bin: multi-thread copy: stream 2/4 (3986030592-7972061184) size 3.712G finished
2020-12-16 00:52:39 DEBUG : example_file.bin: Finished multi-thread copy with 4 parts of size 3.712G
2020-12-16 00:52:50 DEBUG : example_file.bin: multi-thread copy: stream 3/4 (8301707264-12452560896) size 3.866G finished
2020-12-16 00:52:50 DEBUG : example_file.bin: multi-thread copy: stream 1/4 (0-4150853632) size 3.866G finished
2020-12-16 00:52:50 DEBUG : example_file.bin: Finished multi-thread copy with 4 parts of size 3.866G
2020-12-16 00:53:43 DEBUG : example_file.bin: multi-thread copy: stream 2/4 (4261871616-8523743232) size 3.969G finished
2020-12-16 00:55:40 DEBUG : example_file.bin: multi-thread copy: stream 3/4 (8523743232-12785614848) size 3.969G finished
2020-12-16 00:57:47 DEBUG : example_file.bin: multi-thread copy: stream 1/4 (0-4130144256) size 3.846G finished
2020-12-16 00:57:47 DEBUG : example_file.bin: Finished multi-thread copy with 4 parts of size 3.846G
2020-12-16 00:57:54 DEBUG : example_file.bin: multi-thread copy: stream 1/4 (0-4261871616) size 3.969G finished
2020-12-16 00:57:54 DEBUG : example_file.bin: Finished multi-thread copy with 4 parts of size 3.969G
Transferred:      118.866G / 196.931 GBytes, 60%, 115.902 MBytes/s, ETA 11m29s
Checks:                17 / 17, 100%  
Transferred:            0 / 13, 0%
Elapsed time:     17m33.1s
Transferring:
 * example_file.bin:100% /15.463G, 0/s, 0s
 * example_file.bin:100% /14.028G, 0/s, 0s
 * example_file.bin:100% /14.849G, 0/s, 0s
 * example_file.bin:100% /15.877G, 0/s, 0s
 * example_file.bin:100% /14.155G, 0/s, 0s
 * example_file.bin:100% /14.279G, 0/s, 0s
 * example_file.bin:100% /15.386G, 0/s, 0s
 * example_file.bin:100% /14.831G, 0/s, 0s

I'm not sure whether it's stitching together the multi-part file or doing something else, but it entirely defeats the purpose of multi-threaded download for me: the time rclone spends stitching, or whatever it is doing, is several times greater than the time saved by downloading with multi-thread copy.

Example:
I have a 10gig server with an HDD as the main disk and an SSD cache.
Running rclone copy gdrive:folder_with_test_files local_folder_path -P --transfers=8 -vvv downloads at around 320 MBps but takes over an hour to finish, since it's stuck at 100%.

But running rclone copy gdrive:folder_with_test_files local_folder_path -P --transfers=8 -vvv --multi-thread-streams 0 downloads at an average of 160 MBps, about half the speed of multi-thread copy, yet it spends no time stuck at 100%: as soon as the download finishes, the program quits.

So could anyone explain what exactly rclone is doing when it's stuck at 100%?
It doesn't look like it's stitching together the parts (similar to what IDM does). Is it calculating the MD5 hash of the file?

Also, is using --multi-thread-streams 0 the best choice I have to avoid this, or is there a better solution that I haven't come across?

I haven't tried the --ignore-checksum option; maybe I'll try it next time I have to download a huge set of files.
If that works, it would confirm my suspicion that it is calculating the MD5 hash of the file.

At 100%, it's calculating the checksum. You can see that from the disk I/O on the system as well; I'm guessing you are running on a slower disk.

Without a rclone.conf, I'm guessing you aren't using an encrypted remote?

Yes, I'm not using an encrypted remote, and the disk speeds are not bad: 160 MBps average for a 100 GB sequential read.
But even just calculating the checksum after a single multi-thread download takes an unreasonably long time, so I guess I'll just use --ignore-checksum for now.
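
For reference, the two workarounds mentioned so far could be written out roughly like this. It is only a sketch: the remote name and paths are the ones from the first post, and the commands are wrapped in functions (the function names are made up for illustration) so nothing runs until one is called. Note that --ignore-checksum skips the post-transfer hash check, so a file corrupted in transit would no longer be caught by its hash.

```shell
#!/bin/sh
# Remote name and paths are taken from the first post; adjust for your setup.
SRC="gdrive:folder_with_test_files"
DST="local_folder_path"

# Option A: disable multi-thread downloads, so each file uses one stream.
copy_single_stream() {
    rclone copy "$SRC" "$DST" -P --transfers=8 --multi-thread-streams 0
}

# Option B: keep multi-thread downloads but skip the post-transfer hash check.
# Trade-off: corruption in transit would not be detected via the checksum.
copy_no_checksum() {
    rclone copy "$SRC" "$DST" -P --transfers=8 --ignore-checksum
}
```

Calling copy_single_stream or copy_no_checksum then starts the corresponding transfer.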

Maybe I will compare how long md5sum takes against how long rclone takes right after it finishes the download, to see if there is any major difference (since AFAIK rclone uses MD5 checksums).

Spinning disk or SSD?

You can test by running md5sum on a file and seeing how long it takes.

Spinning disk on a 10GB file for me takes:

Which seems reasonable.
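
A minimal way to run that timing test might look like the sketch below. It assumes GNU coreutils (md5sum, dd, mktemp) and a date that supports +%s. Pass the path of a large downloaded file as the first argument; if none is given, it creates a small throwaway file, which only demonstrates the mechanics and says nothing about real disk speed. Keep in mind that a file that was just written may still be in the page cache, so the first timing can look better than the disk really is.

```shell
#!/bin/sh
# Time how long md5sum needs to read and hash a single file.
FILE="${1:-}"
CLEANUP=0
if [ -z "$FILE" ]; then
    # No file given: create a 64 MiB demo file (not representative of a 10 GB file).
    FILE="$(mktemp)"
    dd if=/dev/zero of="$FILE" bs=1048576 count=64 2>/dev/null
    CLEANUP=1
fi

START=$(date +%s)
SUM=$(md5sum "$FILE" | cut -d ' ' -f 1)
END=$(date +%s)

echo "md5:     $SUM"
echo "seconds: $((END - START))"

# Remove the demo file if we created one.
if [ "$CLEANUP" = 1 ]; then
    rm -f "$FILE"
fi
```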

Spinning disk. I think the problem here is that it finishes downloading 8 big files at around the same time and then tries to calculate the checksum for all of them at once, so the hard disk is probably having a tough time keeping up, which slows the process down even more.
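
One way to check that theory is to hash a set of files one at a time and then all at once, and compare the wall-clock times. The sketch below does this with small generated files, so on its own it will show little or no difference (they fit in the page cache). To see the effect described above, point the two loops at real multi-gigabyte files on the spinning disk. The file names and sizes here are made up for illustration.

```shell
#!/bin/sh
# Compare hashing several files sequentially versus concurrently.
DIR="$(mktemp -d)"
for i in 1 2 3 4 5 6 7 8; do
    # 8 MiB demo files; far smaller than the files in the post.
    dd if=/dev/zero of="$DIR/file$i.bin" bs=1048576 count=8 2>/dev/null
done

# One file at a time: the disk sees a single sequential read.
T0=$(date +%s)
for f in "$DIR"/*.bin; do
    md5sum "$f" >/dev/null
done
T1=$(date +%s)

# All eight at once: on a spinning disk the head must seek between files.
for f in "$DIR"/*.bin; do
    md5sum "$f" >/dev/null &
done
wait
T2=$(date +%s)

echo "sequential: $((T1 - T0))s"
echo "concurrent: $((T2 - T1))s"

rm -rf "$DIR"
```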

That would do it as well. You can reduce the number of transfers (--transfers) to help with that.
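
As a rough sketch, lowering the transfer count could look like this. The value 2 is only an illustrative starting point, not a recommendation from this thread; the remote name and paths are those from the first post, and the function name is made up.

```shell
#!/bin/sh
# Remote name and paths are taken from the first post; adjust for your setup.
SRC="gdrive:folder_with_test_files"
DST="local_folder_path"
# Illustrative value: fewer files finish (and get checksummed) at the same time.
TRANSFERS=2

copy_fewer_transfers() {
    rclone copy "$SRC" "$DST" -P --transfers="$TRANSFERS"
}
```

Calling copy_fewer_transfers starts the copy; raise or lower TRANSFERS depending on how well the disk keeps up.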
