Why is rclone mount faster than the native S3 interface?

I am truly amazed to discover that reading data from multiple files using rclone mount, which essentially mounts an S3 bucket as a local directory, significantly outperforms accessing the same files directly through S3's native interface.
I'm curious to know the magic behind this. While I suspect that multi-threading plays a crucial role in enabling this performance boost, I'm wondering if there are other specific techniques or optimizations that rclone employs to achieve such superior performance.
If you have any insights or could point me in the right direction, I would be genuinely grateful.

welcome to the forum,

When rclone reads files from a remote, it reads them in chunks.
This means that rather than requesting the whole file, rclone requests only the chunk that is actually needed.

I'm truly grateful for your response. Would it be alright if I ask another perhaps naive question? Why is reading data in chunks faster than reading the entire file at once?

both rclone copy and rclone mount do chunked reading.

the primary difference,
rclone copy transfers the entire file, chunk by chunk.
rclone mount transfers just the individual chunks as requested by the application, such as plex/notepad/vlc
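
To make the difference concrete, here is a minimal sketch in Python. It is not rclone's code: `FakeRemote`, `copy_whole_file` and `read_on_demand` are hypothetical names, the "remote" is just bytes in memory, and the 64 KiB chunk size is picked only for illustration. It shows how many bytes get transferred when a whole file is copied chunk by chunk, compared with serving one small read on demand.

```python
class FakeRemote:
    """Hypothetical in-memory stand-in for a remote object; counts bytes served."""

    def __init__(self, data: bytes):
        self.data = data
        self.bytes_served = 0

    def read_range(self, offset: int, length: int) -> bytes:
        """Return up to `length` bytes starting at `offset`, like a ranged request."""
        chunk = self.data[offset:offset + length]
        self.bytes_served += len(chunk)
        return chunk


def copy_whole_file(remote: FakeRemote, chunk_size: int) -> bytes:
    """Copy-style access: walk the file chunk by chunk until the end."""
    parts = []
    offset = 0
    while offset < len(remote.data):
        parts.append(remote.read_range(offset, chunk_size))
        offset += chunk_size
    return b"".join(parts)


def read_on_demand(remote: FakeRemote, offset: int, length: int, chunk_size: int) -> bytes:
    """Mount-style access: fetch only the chunks covering the requested bytes."""
    first = (offset // chunk_size) * chunk_size  # start of the first chunk needed
    end = min(offset + length, len(remote.data))
    buf = bytearray()
    pos = first
    while pos < end:
        buf += remote.read_range(pos, chunk_size)
        pos += chunk_size
    start = offset - first
    return bytes(buf[start:start + length])


data = bytes(range(256)) * 4096  # 1 MiB of sample data
chunk = 64 * 1024                # illustrative chunk size, not an rclone default

copy_remote = FakeRemote(data)
copy_whole_file(copy_remote, chunk)

mount_remote = FakeRemote(data)
read_on_demand(mount_remote, 100_000, 10, chunk)

print(copy_remote.bytes_served, mount_remote.bytes_served)  # prints: 1048576 65536
```

So an application that only touches part of each file moves far less data through a mount than a full copy would.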

what are you using rclone mount for?
downloading files, stream media or what?

My real scenario is that there are hundreds to thousands of log files that need to be analyzed centrally on a server. The log files are stored in an S3 bucket. I was surprised to find that downloading them through the S3 interface and analyzing them locally is far slower than using rclone mount on the same server and analyzing them directly.
I'm curious how rclone mount achieves this effect.

rclone does not do anything magical :) I would rather ask why whatever you call the S3 interface is so slow. It can be as simple as default parameters: maybe rclone's defaults are a much better match for what you are doing. In the end, both tools use the same S3 provider API.

It can also be that your log analyser is poorly designed to operate on online data; in that case rclone mount caching can come into play and speed up the whole process.
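
As an illustration of how much a single default can matter with thousands of small files, here is a rough Python sketch. Everything in it is an assumption: `fetch` only sleeps to imitate a 20 ms round trip, the file names are made up, and 8 workers is an arbitrary number. It does not measure rclone or any S3 client; it only shows that issuing requests concurrently hides per-request latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.02  # assumed per-request round trip in seconds


def fetch(name: str) -> str:
    """Pretend to download one small object; the sleep models network latency."""
    time.sleep(LATENCY)
    return f"contents of {name}"


def fetch_sequential(names):
    """One request at a time: total time grows with the number of files."""
    return [fetch(n) for n in names]


def fetch_parallel(names, workers: int = 8):
    """Several requests in flight at once: latency overlaps across files."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, names))


names = [f"log-{i:04d}.txt" for i in range(40)]  # hypothetical object names

t0 = time.perf_counter()
seq = fetch_sequential(names)
t1 = time.perf_counter()
par = fetch_parallel(names)
t2 = time.perf_counter()

print(f"sequential: {t1 - t0:.2f}s, parallel: {t2 - t1:.2f}s")
```

If the tool you call the S3 interface downloads one file at a time, this kind of difference alone could account for a large gap, but that is a guess until you check its settings.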

I have also done some timing comparisons. I am analyzing 20 GB of log files (the total size of thousands of files). With the native interface, the download takes 4 minutes and the analysis program then takes about 15 seconds. With rclone mount, the whole process takes about a minute and a half.
I also think that rclone does not have a special configuration, but the fact is that it is much faster.
So I would like to ask for help understanding the reason.
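
For reference, the figures quoted above work out as follows. This is just arithmetic on those numbers, assuming 1 GB = 1000 MB and, for the mount figure, that the analysis reads every byte of the 20 GB through the mount (which is not confirmed).

```python
TOTAL_GB = 20        # total size of the log files
download_s = 4 * 60  # native interface: download time
analysis_s = 15      # native interface: analysis time after download
mount_s = 90         # rclone mount: whole process

native_total_s = download_s + analysis_s
native_mb_s = TOTAL_GB * 1000 / download_s
# Only meaningful if all 20 GB are actually read through the mount (assumption).
mount_mb_s = TOTAL_GB * 1000 / mount_s
speedup = native_total_s / mount_s

print(f"native: {native_total_s}s total, {native_mb_s:.0f} MB/s download")
print(f"mount:  {mount_s}s total, {mount_mb_s:.0f} MB/s if everything is read")
print(f"speedup: {speedup:.1f}x")
```

That is 255 s against 90 s, roughly 2.8 times faster. If the analysis does not read every byte, the mount transfers less than 20 GB, which on its own could explain part of the gap.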
