Rclone download

Hi,

First of all, Nick, if you happen to read this, congratulations on this amazing software you made! I'm so impressed by the speed and the many options rclone offers. It's crazy fast!

I'm playing with "rclone mount" in Windows (with WinFSP and NSSM). It works great and it's rock stable.

Thanks to S3 multipart uploads, copying from local disk to S3 runs at an impressive 350 MB/s. But downloading from S3 to local disk barely reaches 60 MB/s. During the download, the S3 log shows at most 2 GET calls. I tried adding --transfers=64 and --cache-workers=64 to the 'mount' command but it didn't help. I also couldn't figure out how to speed up the download when using the 'copy' command.

Can rclone do multipart downloads and, if not, do you plan to implement this feature in the future?

Best regards,
Frédéric.

Try with
Multi threaded downloads to local storage (Nick Craig-Wood)

  • controlled with --multi-thread-cutoff and --multi-thread-streams
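
As a rough illustration of what these two options control (not rclone's actual code), the sketch below decides whether a file gets one stream or several and splits it into byte ranges. The function name `plan_ranges` and the even split are assumptions made for the example.

```python
def plan_ranges(size, cutoff, streams):
    """Return (offset, length) byte ranges for downloading one file.

    Hypothetical helper: files no larger than the cutoff get a single
    range; larger files are split evenly across the streams.
    """
    if size <= cutoff or streams <= 1:
        return [(0, size)]
    part = -(-size // streams)  # ceiling division
    ranges = []
    offset = 0
    while offset < size:
        length = min(part, size - offset)
        ranges.append((offset, length))
        offset += length
    return ranges


MiB = 1024 * 1024
small = plan_ranges(100 * MiB, cutoff=250 * MiB, streams=8)
large = plan_ranges(1000 * MiB, cutoff=250 * MiB, streams=4)
print(len(small), len(large))
```

With these values the 100 MiB file stays on a single stream, while the 1000 MiB file is split into four ranges that could be fetched in parallel.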

Hello,

Thank you Alfred. This works great with rclone copy (download speed is now 350 MB/s) but it does not seem to help with rclone mount (download speed is still 60 MB/s). Here's a debug log of rclone mount with --multi-thread-streams=8: https://pastebin.com/FtyEM2eZ

Mind having a look?

Regards
Frédéric.

How are you copying out of the mount? Unless it does multithread downloading you won't get that speed...

You can use --vfs-cache-mode full which will

  • copy to local disk first
  • then stream to the caller of the mount

This may be useful, I'm not sure!

Hello Nick,

EDIT: Sorry, I got confused in my previous post. I meant download speed, not upload speed. That is, when downloading a file from the S3 mount to local storage.

What I'm doing is copying (CTRL+C) the file from the S3-mounted network drive and pasting (CTRL+V) it into the local directory. I'm using --vfs-cache-mode writes, if that matters.

My concern is not having multiple files downloading at the same time (which rclone mount probably handles well, I'm not sure) but having a single huge file downloaded over multiple streams / threads, as described in the documentation for --multi-thread-cutoff and --multi-thread-streams.

Are these 2 options ignored by rclone in mount mode, or is it WinFSP that prevents downloading a single file in a multi-threaded way?

I think I figured it out. With --vfs-cache-mode full the file is downloaded by the rclone cache using multithreading (this happens, for example, when right-clicking the file). But with --vfs-cache-mode writes the file is not downloaded by the rclone cache for reading, and manually downloading the file via copy/paste apparently does not use multithreading. I'm not sure I'm making myself clear here. Hope you get it. 🙂

Yes that is 100% correct!

You can use rclone copy from the mount and if you set the multithread parameters it will do multithread from the mount which will in turn do it from your upstream... That may or may not be helpful!

Hi Nick,

The thing is, our researchers (I work at a university) mostly wouldn't use the Windows console to rclone copy files to or from the remote S3 storage; they'd rather copy/paste in Windows Explorer 😊

What seems odd is that with --vfs-cache-mode off (cache disabled) I get 6 times better upload speed than download speed. Would it be possible to have rclone mount use multithreaded downloads, like it does with --vfs-cache-mode full, when the cache is not used for reads, that is, in any other cache mode?

Bests,
Frédéric.

Understood!

That is quite odd! With VFS cache mode off it will be streaming files single threaded both up and down.

Multithreaded downloads will only work to a temporary file on disk. That is what cache mode full lets you do.

I can't think of a way of making them work without.

Did you want to avoid the files on disk?

Hi Nick,

I think this is specific to the S3 storage type and S3 multipart uploads (the --s3-chunk-size and --s3-upload-concurrency options). I doubt that the per-file "super cali fragilistic expialidocious" amazing multi-threading code you've recently added to rclone comes into play here. 🙂

I do. Well, for big files at least. I haven't made up my mind yet on which cache mode would best suit our researchers' needs. It might depend on the size of their data, but I know some users have many big files that came from research experiments. Maybe a new option like --vfs-cache-max-filesize-read would allow using --vfs-cache-mode full with multi-threaded downloads enabled without caching too many files.

I'm not sure I understand "Multithreaded downloads will only work to a temporary file on disk" because of this from the documentation:

https://github.com/rclone/rclone/blob/master/docs/content/docs.md#--multi-thread-cutoffsize "Rclone preallocates the file (using fallocate(FALLOC_FL_KEEP_SIZE) on unix or NTSetInformationFile on Windows both of which takes no time) then each thread writes directly into the file at the correct place. This means that rclone won't create fragmented or sparse files and there won't be any assembly time at the end of the transfer."
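
To picture what the quoted passage describes, here is a toy sketch, not rclone's implementation. The remote is faked with an in-memory byte string, `fetch_range` is a hypothetical stand-in for a ranged GET, and `truncate()` is used as a portable substitute for the preallocation calls the documentation names. The point is that every thread needs a seekable destination file it can write into at its own offset.

```python
import os
import tempfile
import threading


def fetch_range(source, offset, length):
    """Hypothetical stand-in for a ranged GET against the remote."""
    return source[offset:offset + length]


def multi_thread_download(source, dest_path, streams):
    size = len(source)
    # Give the destination its final size up front (stand-in for the
    # fallocate / NTSetInformationFile preallocation in the quote).
    with open(dest_path, "wb") as f:
        f.truncate(size)

    part = max(1, -(-size // streams))  # ceiling division

    def worker(offset):
        data = fetch_range(source, offset, part)
        # Each thread opens its own handle and writes at its own offset,
        # so nothing has to be reassembled at the end.
        with open(dest_path, "r+b") as f:
            f.seek(offset)
            f.write(data)

    threads = [threading.Thread(target=worker, args=(off,))
               for off in range(0, size, part)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


source = os.urandom(1_000_003)  # fake remote object
with tempfile.TemporaryDirectory() as tmp:
    dest = os.path.join(tmp, "big.bin")
    multi_thread_download(source, dest, streams=8)
    with open(dest, "rb") as f:
        result = f.read()
print(result == source)
```

The sketch only works because the downloader owns `dest_path` and can seek inside it.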

Tell me if I'm wrong:

When copying a file from S3 rclone mounted storage to local directory from Windows Explorer rclone is called by the operating system to download the file, right? If rclone can download the file using the new multi-threading code when using --vfs-cache-mode full, why could it not use the same code to download the file when --vfs-cache-mode <anything_else>?

And also... If rclone copy, which doesn't use the cache, is able to use the multi-threading code, why would rclone mount not be able to use it when the read cache is off?

Hope you don't mind my questions. Love your code.

Regards,
Frédéric.

Hmm, not a bad idea!

It is all to do with the file system interface... When copying using Windows Explorer the copy goes

Explorer -> Windows Kernel -> WinFSP -> rclone

The API from Windows kernel -> WinFSP does not supply any info about the local file. All rclone can do is stream data sequentially.

I hope that makes sense!
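
A toy model of that call chain may help; it is not WinFSP's or rclone's real API, and the names `MountedFile` and `explorer_copy` are made up for the example. The mounted side only ever receives "give me `length` bytes at `offset`" requests, while the destination file exists only on the caller's side.

```python
import io


class MountedFile:
    """Toy stand-in for a file served through a mount."""

    def __init__(self, remote_data):
        self.remote_data = remote_data
        self.offsets_seen = []  # record what the caller asked for

    def read(self, offset, length):
        # The only inputs are an offset and a length. Nothing says where
        # the caller will put the bytes, so they can only be returned.
        self.offsets_seen.append(offset)
        return self.remote_data[offset:offset + length]


def explorer_copy(mounted, size, dest, block_size=4):
    """Toy stand-in for Explorer and the kernel copying out of a mount."""
    offset = 0
    while offset < size:
        chunk = mounted.read(offset, block_size)
        dest.write(chunk)  # the destination is only known on this side
        offset += len(chunk)


data = b"0123456789abcdef"
mounted = MountedFile(data)
dest = io.BytesIO()
explorer_copy(mounted, len(data), dest)
print(mounted.offsets_seen)     # [0, 4, 8, 12]
print(dest.getvalue() == data)  # True
```

In this model the copier sets the pace and asks for blocks in order, so the mounted side has no destination to write parallel chunks into.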

Oh, now I get it! Makes sense 😊.

So with --vfs-cache-mode <anything_else> rclone streams the file directly to WinFSP sequentially, and with --vfs-cache-mode full rclone first downloads the file to the cache and then streams the cached file back to WinFSP, right?

Could this get any better with changes to WinFSP code? Or is the issue on the kernel side?

Regarding the cache, maybe what I want is already covered by --vfs-cache-max-size and --vfs-cache-max-age. It's no big deal if huge files go through the cache, as they will be removed from it once --vfs-cache-max-age expires. So I will probably use --vfs-cache-mode full in the end to work around the download speed "issue".
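
For illustration only, the way these two limits could interact can be sketched as below. This is an assumption-laden model, not rclone's code: `files_to_evict` is a hypothetical helper, times are plain numbers, and the policy (drop entries past the maximum age, then drop the least recently accessed ones until the total fits) is a simplification.

```python
def files_to_evict(entries, now, max_age, max_size):
    """Pick cache entries to remove.

    entries: list of (name, size_bytes, last_access_time) tuples.
    Hypothetical policy: expire by age first, then trim by total size,
    starting with the least recently accessed file.
    """
    evict = [e for e in entries if now - e[2] > max_age]
    keep = sorted((e for e in entries if now - e[2] <= max_age),
                  key=lambda e: e[2])
    total = sum(e[1] for e in keep)
    while keep and total > max_size:
        oldest = keep.pop(0)
        evict.append(oldest)
        total -= oldest[1]
    return [e[0] for e in evict]


cache = [("a.bin", 40, 0), ("b.bin", 30, 50), ("c.bin", 50, 90)]
evicted = files_to_evict(cache, now=100, max_age=70, max_size=60)
print(evicted)  # ['a.bin', 'b.bin']
```

Here a.bin is dropped for age, and b.bin is dropped because the two remaining files would still exceed the size limit.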

Yes, that is right.

It is how filesystems work and I don't think there will ever be a workaround. It's the same on Linux.

That should work, with the one slight annoyance that rclone will download the entire file before giving any of it to the user.

Nick,

Thank you for this discussion. I'll go with --vfs-cache-mode full and will let you know if caching huge files for reads turns out to be an issue.

Bests,
Frédéric.
