Bad Windows 10 download behavior

When I download files from Google Drive to local storage (Windows 10), it will "cache" the full file size and only then start downloading. This is really bad.
E.g. a full 1GB file will be created before the download starts, and only then will the download begin.

For me, my drives hit 100% as soon as I enter the copy command.

HOWEVER, when I use Ubuntu to download, my drives never go above 50%. It's not writing out the full file as temporary storage first. It's downloading and saving the data as it comes in.

E.g. the same 1GB file will grow like this: 0kB > 1kB > 10kB > 1MB > 100MB > 500MB > 1GB.

You get the idea.

Is there a workaround for Windows? Or is it a limitation that makes this impossible? It's extremely inefficient, speed-wise.

What is your rclone version (output from rclone version)

Which cloud storage system are you using? (eg Google Drive)

The command you were trying to run (eg rclone copy /tmp remote:tmp)

A log from the command with the -vv flag (eg output from rclone -vv copy /tmp remote:tmp)


the copy command will use 100% as soon as the copy starts.
you can limit that by tweaking the --multi-thread-streams flag

as for the file size behavior,
i am finding that as soon as i start the copy command to download a 40GB file, the free space on my hard drive drops by 40GB immediately.

i am on w10pro.64, rclone 1.51. the local drive is an ssd.

rclone.exe copy wasabieast2:vserver03/en07.veaam/rclone/backup/EN07/EN072019-11-09T113532.vbk .\ --log-level=DEBUG --log-file=log.txt --progress --multi-thread-streams=1

and what is worse, i am not able to stop the rclone process.
if i try to cancel rclone via ctrl+c, it does not respond. i cannot stop rclone.
if i type ctrl+c over and over, and wait, after a few minutes rclone will finally exit on its own.
the fact that i cannot stop rclone even via task manager is not a good situation.
and this is the log file

2020/03/02 17:18:00 DEBUG : rclone: Version "v1.51.0" starting with parameters ["c:\\data\\rclone\\scripts\\rclone.exe" "copy" "wasabieast2:vserver03/en07.veaam/rclone/backup/EN07/EN072019-11-09T113532.vbk" ".\\" "--log-level=DEBUG" "--log-file=log.txt" "--progress"]
2020/03/02 17:18:00 DEBUG : Using RCLONE_CONFIG_PASS password.
2020/03/02 17:18:00 DEBUG : Using config file from "c:\\data\\rclone\\scripts\\rclone.conf"
2020/03/02 17:18:00 DEBUG : EN072019-11-09T113532.vbk: Sizes differ (src 40287141888 vs dst 14675968)
2020/03/02 17:18:00 DEBUG : EN072019-11-09T113532.vbk: Starting multi-thread copy with 4 parts of size 9.380G
2020/03/02 17:18:00 DEBUG : EN072019-11-09T113532.vbk: multi-thread copy: stream 4/4 (30215503872-40287141888) size 9.380G starting
2020/03/02 17:18:00 DEBUG : EN072019-11-09T113532.vbk: multi-thread copy: stream 2/4 (10071834624-20143669248) size 9.380G starting
2020/03/02 17:18:00 DEBUG : EN072019-11-09T113532.vbk: multi-thread copy: stream 1/4 (0-10071834624) size 9.380G starting
2020/03/02 17:18:00 DEBUG : EN072019-11-09T113532.vbk: multi-thread copy: stream 3/4 (20143669248-30215503872) size 9.380G starting
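
The byte ranges in the log above are consistent with splitting the file into equal parts and rounding each part up to a multiple of 64 KiB. That rounding rule is an inference from the numbers in this log, not something confirmed in this thread, and the names `splitRanges`, `byteRange` and `chunkAlign` are hypothetical. A minimal Go sketch under that assumption:

```go
package main

import "fmt"

// chunkAlign is an ASSUMED alignment (64 KiB), inferred from the log output.
const chunkAlign = 64 * 1024

// byteRange is a half-open range [Start, End) within the file.
type byteRange struct{ Start, End int64 }

// splitRanges divides size bytes into ranges whose length is size/streams
// rounded up to a multiple of chunkAlign. The last range is clamped to size.
// streams must be at least 1.
func splitRanges(size int64, streams int) []byteRange {
	part := size / int64(streams)
	part = (part + chunkAlign - 1) / chunkAlign * chunkAlign
	if part == 0 {
		part = chunkAlign
	}
	var out []byteRange
	for start := int64(0); start < size; start += part {
		end := start + part
		if end > size {
			end = size
		}
		out = append(out, byteRange{start, end})
	}
	return out
}

func main() {
	// File size taken from the "Sizes differ" line in the log.
	for i, r := range splitRanges(40287141888, 4) {
		fmt.Printf("stream %d/4 (%d-%d)\n", i+1, r.Start, r.End)
	}
}
```

With the 40287141888-byte file and 4 streams, this yields the same four ranges as the log, each part being 10071834624 bytes (about 9.380 GiB).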


if i limit the number of threads from 4 to 1, then i am able to get ctrl+c to work again.

also, rclone will not update the progress correctly during the copy when four threads are in use

https://rclone.org/docs/#multi-thread-cutoff-size

I think this documents it. I haven't tried it but if it doesn't use multi-threads I'd guess it may not preallocate?

So --multi-thread-streams=0 might disable it.

@calisro --multi-thread-streams=1 solved it. thanks! Although I can still use multi thread on linux. ¯_(ツ)_/¯

thanks but i just tested --multi-thread-streams=0 and rclone still preallocates the total space on windows.
not that this is a problem.

In case you want to read the issue that introduced this.


thanks, good read.

so i did another test and there were two fragments, not one.
but again, this is not a problem.


This, as noted above, is rclone pre-allocating the space. This speeds up writing, makes for less fragmented disks and makes sure that the file can be written - you don't have to download 1GB of data to get a disk full error at the end.

I'm just wondering why this is causing you a problem - is it taking ages to create the 1GB file initially? It should be instant, so maybe this isn't working as designed.

What file system are you using? NTFS?

@ncw,
the problem i am experiencing, as i noted above,

  1. the inability to kill rclone, when downloading. even task manager cannot kill it.
  2. the progress info does not reflect what rclone is doing.

if i reduce the number of threads to two, then the above problems do not occur.
thanks much,

the file system is ntfs and hard drive is ssd

It sounds like the system is overloaded with IO if reducing the number of threads helps. Can you check that?

hi,
not sure what you want me to check? let me know and i will test.

in the post above, i mentioned that if i reduce the number of threads, ctrl+c and progress works again.

i have verizon fios 1Gbps, fast machine and ssd.

thanks
never saw any other application behave as rclone is.

What happens if you put a low --bwlimit on it and keep the multi-thread streams?

i ran rclone copy command 6 times,

3 different file sizes were downloaded.

ran 3 times with bwlimit=10M
ran 3 times with no bwlimit

it seems that

  1. the larger the file, the longer the delay until ctrl+c responds
  2. with a bandwidth limit in place (i.e. a slower transfer), the delay until ctrl+c responds is longer than with no limit

bwlimit (MB/s)   size (GB)   delay (sec)
10               1.7         2
10               6.8         10
10               46          105
0 (no limit)     1.7         0
0 (no limit)     6.8         6
0 (no limit)     46          32

and the progress is not updated accurately.

conclusion:
perhaps each thread downloads a chunk at a time, and each thread is locked until that chunk is downloaded.
once another chunk is about to be downloaded, the thread yields and can respond to ctrl+c

curious what you think?

Look at IO load in the task manager? I'm not a Windows expert so I don't know what you can measure easily! I'd use iostat or vmstat on linux.

It certainly looks like that.

As far as I'm aware the go interrupt handler runs in a different thread though so I don't understand this behaviour!

Does rclone do anything when you press CTRL-C - any logs with -vv?

Are these copies with or without --multi-thread-streams 0? Can you try with and without?

@ncw it's not instant. This makes the drive write at maximum speed until the 1GB is reached: for example, around 400MB/s for an SSD and 150MB/s for an HDD. So downloading to an HDD will be much slower with this. It's very bad for an HDD, as it will sit at 100% utilisation for a while until the 1GB is reached.
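
As a rough back-of-the-envelope check of the figures above, here is a small Go sketch that estimates how long writing a full file's worth of zeros takes at a given sustained write speed. The speeds are the ones quoted in this post, the sizes are the 1GB and 46GB files mentioned in the thread, and `zeroFillSeconds` is a hypothetical helper:

```go
package main

import "fmt"

// zeroFillSeconds estimates how long writing sizeMB of zeros takes at a
// sustained write speed of speedMBps (both in decimal megabytes).
func zeroFillSeconds(sizeMB, speedMBps float64) float64 {
	return sizeMB / speedMBps
}

func main() {
	fmt.Printf("1GB on HDD at 150MB/s:  %.1fs\n", zeroFillSeconds(1000, 150))
	fmt.Printf("1GB on SSD at 400MB/s:  %.1fs\n", zeroFillSeconds(1000, 400))
	fmt.Printf("46GB on SSD at 400MB/s: %.0fs\n", zeroFillSeconds(46000, 400))
}
```

This is only an estimate: it assumes the disk sustains the quoted speed and that the whole file really is written out before the download starts.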

But as per the solutions above, --multi-thread-streams=1 fixed it, albeit with a slower overall download speed (network speed, not local disk speed).

That makes me think that maybe rclone should be calling pre-allocate for multi-thread downloads too....

I just checked - it does.

So both --multi-thread-streams > 1 and <= 1 are using pre-allocate, so that probably isn't the problem.

When rclone is doing multi-thread downloads it is writing to multiple parts of the same file, effectively making a sparse file. On Linux, a sparse file is only allocated on demand; however, that may be different on Windows.

I just had a quick test and it looks like if you seek to the end of a large file and start writing, then Windows writes 0s from the start of the file. I can tell this because the larger the seek you put in, the longer it takes.

Test program: https://play.golang.org/p/es9ogbTtrQe

So it looks like Windows doesn't create sparse files unless you set a specific flag or use an ioctl: https://stackoverflow.com/questions/4011508/how-to-create-a-sparse-file-on-ntfs

Try this with --multi-thread-streams > 1 - it marks the file as sparse, so it should stop Windows writing the whole file out at the start.

https://beta.rclone.org/branch/v1.51.0-087-geed61a29-fix-windows-sparse-beta/ (uploaded in 15-30 mins)


I take it that solution worked? If so I'll merge it to master?

excellent!

i ran
%rcmd% copy %remote%/%testfile% .\ --log-level=DEBUG --log-file=%logfile% --progress --multi-thread-streams=10

  1. the file created was pre-allocated, but not the full amount.
    the total file size is 46GB
    just after i ran the command, the file size was immediately 41.7GB and crept up from there.

  2. ctrl+c works with no delay

  3. the progress output was good

  4. hard drive saturation was low and my downloads were very fast, over 100MB/s. basically my internet connection, verizon fios 1Gbps, is saturated

  5. after the download was completed, rclone did not exit in a timely manner.
    the progress continued to be updated, with the download rate falling for a couple of minutes,
    but the rclone log showed that the download was completed.

I don't know about this; I couldn't get --multi-thread-downloads to work.

However, --multi-thread-streams=1 together with --drive-chunk-size=256M is the solution for Windows. I had no idea about this; you should put it under a Windows section in the docs as a PSA.