Multi-threaded uploads


I've researched how/if multi-threaded uploads are done, but haven't been able to fully understand what the current state is.
Specifically, I'm using Google Drive and getting a max of 100-150 Mbps upload for a single file.
I'm wondering how I could achieve more, perhaps by uploading multiple chunks simultaneously and combining them together at the remote somehow (what I mean by "multi-threaded uploads").
I've seen discussions about flags that could possibly control this, but none seem to have made it to production as of yet.

Thanks in advance for information.


I can usually max out my gigabit with a few transfers.

I use:

/usr/bin/rclone move /local/ gcrypt: --log-file /opt/rclone/logs/upload.log -v --exclude-from /opt/rclone/scripts/excludes --delete-empty-src-dirs --fast-list --max-transfer 700G --drive-chunk-size=128M

I use the default checkers/transfers and the chunk size of 128M helps quite a bit.
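The memory trade-off is easy to estimate if you assume (as the rclone docs suggest for chunked Drive uploads) that each active transfer buffers roughly one chunk; the numbers below match the command above with rclone's default of 4 transfers:

```shell
# Rough worst-case upload buffer estimate (assumption: one chunk
# buffered in memory per active transfer).
transfers=4      # rclone's default --transfers
chunk_mib=128    # from --drive-chunk-size=128M
echo "$(( transfers * chunk_mib )) MiB buffered at peak"
```

So bumping the chunk size to 1G with 4 transfers would mean roughly 4 GiB of peak buffer, which is why it only makes sense "if you have the memory".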

Here are my last few days of uploads:

Transferred:   	  130.327G / 130.327 GBytes, 100%, 63.916 MBytes/s, ETA 0s
Transferred:   	  149.769G / 149.769 GBytes, 100%, 92.211 MBytes/s, ETA 0s
Transferred:   	  207.164G / 207.769 GBytes, 100%, 65.285 MBytes/s, ETA 9s
Transferred:   	  207.783G / 207.783 GBytes, 100%, 65.107 MBytes/s, ETA 0s
Transferred:   	   66.877G / 67.073 GBytes, 100%, 80.328 MBytes/s, ETA 2s
Transferred:   	   67.075G / 67.075 GBytes, 100%, 79.994 MBytes/s, ETA 0s
Transferred:   	  134.878G / 134.878 GBytes, 100%, 85.297 MBytes/s, ETA 0s
Transferred:   	   93.174G / 93.174 GBytes, 100%, 80.573 MBytes/s, ETA 0s

Sorry, I just updated my post to clarify that I'm referring to a single large file

I would also like to know about it. Sometimes I have to upload 80-90 GB single large file to Gdrive.

Multi-threaded uploads depend on the cloud backend.

Google Drive doesn't support them, unfortunately - you can only upload one chunk at a time as far as I know.

Note the --drive-chunk-size 128M in Animosity's command line - that speeds things up a lot at the expense of memory.

I see, that's unfortunate.
I've been using --drive-chunk-size 1024M and get "only" 100-150 Mbps.

What about wrapping it with a chunker remote?
Does that automatically allow multi-thread uploads?

A single file, with a 1G chunk size if you have the memory, gives me:

So about 380Mb/s.

Interesting, what's the exact command there?

You can see it all in the post above.

I left everything at default and just added -P --drive-chunk-size 1G

I used a copyto to test a single file.
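The single-file test described above might look like this (the remote name, file name, and destination path are assumptions for illustration, not the exact command used):

```shell
# Copy one large file with a 1G upload chunk, showing progress.
rclone copyto -P --drive-chunk-size 1G bigfile.bin gcrypt:test/bigfile.bin
```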

After playing around for a bit it looks like encryption adds some overhead when it comes to upload speed.

While I haven't tested this extensively, I suspect there may be some kind of effective maximum for chunk-size around 128M or 256M. At least judging by my bandwidth graph, I can't seem to get the spacing between actual chunks transferred to grow any wider beyond those levels.

I would suggest doing some testing on this yourself. Too high a chunk size may have no effect, or even a counterproductive one. As I said - I'm not certain about this due to lack of true in-depth testing - more research needed.

Based on what I have seen from GCP VMs and feedback from others with much higher bandwidth than myself, there seems to be some sort of cap on the backend for individual transfers, but this should be closer to 40 MB/sec per transfer. And of course, since you can use many parallel transfers, this is rarely much of an issue. Since I can't even cap out a single transfer with my 160 Mbit/sec at home, I've not taken the time to dig into this in depth.

A related curiosity, however, is that if you set the upload threshold very high (to avoid chunking and thus spend no memory), the average per-transfer speed goes way down - more in the range of 40-80 Mbit/sec. It is still something you can work around with parallel transfers, of course, but I find it very curious that non-chunked uploads behave so drastically differently. Do you have any idea what might cause this, NCW? This is more a point of curiosity than an rclone complaint.
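For reference, the "upload threshold" being described corresponds to rclone's --drive-upload-cutoff flag: files above the cutoff use the chunked, resumable upload, while files below it are streamed in a single request with no chunk buffering. A sketch of forcing the non-chunked path (the value and paths are illustrative):

```shell
# Raise the cutoff far above any real file size so even large files
# take the non-chunked, single-stream upload path (no chunk buffers).
rclone copyto -P --drive-upload-cutoff 1000G bigfile.bin gdrive:bigfile.bin
```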

It shouldn't - not unless you have a very slow CPU like a Pi or something. If so, then just check the hardware status and see if it gets near to maxing out or temperature-throttling during upload.
On a half-decent desktop CPU from the last decade it should only take a few % of your CPU cycles to encrypt-as-you-go. (Not remotely a problem even on my ancient 2500K.)

Once the CPU has encrypted the bits rclone or google doesn't care if the data is encrypted or not, so there should be no reason why encrypted data should be transferred any slower than anything else.

That's really on CPU overhead and very minimal.

I tested copying to an encrypted mount.

So having done testing on the chain of my setup (gdrive -> cache -> encryption -> mount), it turns out that it basically all rests upon the vfs-cache-mode flag of the mount.
Having it set to writes and above yields expected performance, but setting it to minimal or off causes almost exactly a 2x time increase for copying to the mount.

My test is as follows:
rclone mount --config ~/.config/rclone/rclone.conf --drive-chunk-size 256M --vfs-cache-mode writes --cache-chunk-no-memory gdrive_encrypted:/ mount_dir/
(Although gdrive_encrypted:/ points to a cached remote as mentioned, I suspect it doesn't matter in this case)
head -c 1GB /dev/zero > test.bin
time cp test.bin mount_dir/

This takes around 20-25 seconds, which corresponds to the ~40 MB/s (~350 Mbps) figure you guys are getting.
On the other hand, running the above mount command with --vfs-cache-mode off results in the copy taking 45-55 seconds.
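As a quick sanity check on those figures (head -c 1GB writes 10^9 bytes, roughly 1000 MB; the timings are the midpoints quoted above):

```shell
size_mb=1000   # head -c 1GB produces ~1000 MB
echo "$(( size_mb / 25 )) MB/s with --vfs-cache-mode writes"
echo "$(( size_mb / 50 )) MB/s with --vfs-cache-mode off"
```

So the cache-off case really is almost exactly half the throughput.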

Can you reproduce this? Is there a way to fix this? I wouldn't want to incur the overhead of essentially doubling the amount of storage I need to account for large files at peak, as I'm working with limited storage.

I don't use my mount for writing as I use rclone move instead.

I don't use the mount to write as it's just unneeded overhead in my setup.

Does this command come into effect if I am uploading to Google Drive via a crypt remote?

I don't think it is anything rclone is doing - I suspect it must be rate limiting at the google end.

Oh I expect it is something on Google's end too. I just wanted to check if you knew anything about the "why" of this because it is such a curious result. I don't see any obvious reason for why chunks would be treated differently than a continuous stream.

My best (totally baseless) speculation is that maybe direct streams were just not intended for very large files, and thus maybe they go directly into final storage while chunks are cached somewhere server-side on faster disks. Who knows 🙂 It is nice to be aware of, though - especially when dealing with low-RAM systems.
