I've researched how/if multi-threaded uploads are done, but haven't been able to fully understand what the current state is.
Specifically, I'm using Google Drive and getting a max of 100-150mbps upload for a single file.
I'm wondering how I could achieve more, perhaps by uploading multiple chunks simultaneously and combining them at the remote somehow (that's what I mean by "multi-threaded uploads").
I've seen discussions about flags that could possibly control this, but none seem to have made it into production yet.
While I haven't tested this extensively, I suspect there might be some kind of maximum limit on chunk size at 128M or 256M. At least judging by my bandwidth graph, I can't seem to get the spacing between the chunks actually transferred to be any wider at these higher values.
I would suggest doing some testing on this yourself. Setting the chunk size too high may have no effect, or even a counterproductive one. As I said, I'm not certain about this due to a lack of true in-depth testing - more research is needed.
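If you want to test it, something along these lines should do (illustrative only - gdrive: stands for whatever your Drive remote is called and test.bin is any large local file; keep in mind each transfer buffers one chunk of this size in memory):

for size in 64M 128M 256M 512M; do
  echo "testing --drive-chunk-size $size"
  time rclone copyto test.bin gdrive:chunk-test/test-$size.bin --drive-chunk-size $size
done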
Based on what I have seen from GCP VMs and feedback from others with much higher bandwidth than myself, there seems to be some sort of cap on the backend for individual transfers, but it should be closer to 40MB/sec per transfer. And of course, since you can use many parallel transfers, this is rarely much of an issue. Since I can't even max out a single transfer with my 160Mbit/sec at home, I've not taken the time to dig into this in depth.
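For what it's worth, the usual way around a per-transfer cap is simply to run more transfers in parallel, e.g. (paths and remote name are just placeholders):

rclone copy /path/to/local gdrive:backup --transfers 8 --drive-chunk-size 128M --progress

Just remember that memory use is roughly transfers × chunk size, so 8 × 128M is about 1GB of RAM while uploading.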
@ncw
A related curiosity, however, is that if you set the upload threshold very high (to avoid chunking and thus spend no memory), the average per-transfer speed goes way down - more in the range of 40-80Mbit/sec. It is still something you can solve with parallel transfers of course, but I find it very curious that non-chunked uploads behave so drastically differently. Do you have any idea what might cause this, NCW? This is more a point of curiosity than an rclone complaint.
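For reference, the flag I mean - assuming I have the name right - is --drive-upload-cutoff; setting it above the size of the file forces the non-chunked upload path:

rclone copy test.bin gdrive_encrypted: --drive-upload-cutoff 10G --progress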
It shouldn't - not unless you have a very slow CPU, like a Pi or something. If so, just check the hardware status and see if it gets near to maxing out or temperature-throttling during upload.
On a half-decent desktop CPU from the last decade it should only take a few % of your CPU cycles to encrypt as you go (it's not remotely a problem even on my ancient 2500K).
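If you want to rule it out, it's enough to watch CPU load while a test upload runs, for example (remote name is just an example):

rclone copy test.bin gdrive_encrypted: --progress &
top    # watch whether rclone pegs the cores or the CPU thermal-throttles during the upload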
Once the CPU has encrypted the bits, neither rclone nor Google cares whether the data is encrypted or not, so there should be no reason why encrypted data should transfer any slower than anything else.
So, having done some testing on the chain of my setup (gdrive -> cache -> encryption -> mount), it turns out that it basically all rests on the vfs-cache-mode flag of the mount.
Having it set to writes or above yields the expected performance, but setting it to minimal or off causes almost exactly a 2x increase in the time taken to copy to the mount.
My test is as follows:

rclone mount --config ~/.config/rclone/rclone.conf --drive-chunk-size 256M --vfs-cache-mode writes --cache-chunk-no-memory gdrive_encrypted:/ mount_dir/

(Although gdrive_encrypted:/ points to a cached remote as mentioned, I suspect it doesn't matter in this case.)

head -c 1GB /dev/zero > test.bin
time cp test.bin mount_dir/
This takes around 20-25 seconds, which corresponds to the ~40MB/s (~350mbps) figure you guys are getting.
On the other hand, running the above mount command with --vfs-cache-mode off results in the copy taking 45-55 seconds.
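For clarity, the slower run is identical apart from the cache mode:

rclone mount --config ~/.config/rclone/rclone.conf --drive-chunk-size 256M --vfs-cache-mode off --cache-chunk-no-memory gdrive_encrypted:/ mount_dir/
time cp test.bin mount_dir/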
Can you reproduce this? Is there a way to fix it? I wouldn't want to incur the overhead of essentially doubling the storage I need to account for large files at peak, as I'm working with limited storage.
Oh, I expect it is something on Google's end too. I just wanted to check whether you knew anything about the "why" of this, because it is such a curious result. I don't see any obvious reason why chunks would be treated differently from a continuous stream.
My best (totally baseless) speculation is that maybe direct streams were just not intended for very large files, and thus perhaps they go directly into final storage while chunks are cached somewhere server-side on faster disks. Who knows. It is nice to be aware of, though - especially when dealing with low-RAM systems.