Limit write threads on copy, sync with dedicated write queue

Hello,

Use case:
The setup containing pretty fast network connection, amount of RAM and comparably low IOPS disk system on rotational HDDs (but fine in bandwidth terms).

To utilize network I have to increase amount of transfers or enable multi-threaded downloads while both of them causes slower IO on rotational HDDs which causes slower total transfer speed.

Proposal:
Dedicate file write into the queue with specified amount of writers and configurable write block size.
This will allow to utilize rotational HDD bandwidth with lower IOPS rate such as network connection with RAM for that queue.

I'm not sure I understand your proposal - can you elaborate?

Note that --check-first is useful for HDDs also.

I'm not sure I understand your proposal - can you elaborate?

Note that --check-first is useful for HDDs also.

Sure,

--check-first allows to separate reading HDD access pattern (on verifying checksums) from writing HDD access pattern (on actual data copying) which is a good way to increase performance and it works perfect thanks for that! :slight_smile:

My proposal is to add a "writer task queue" between network transfer and actual disk writer to make them asynchronous, allow sequential write pattern (with multiple network transfers) which the fastest possible on rotational HDDs and make them configurable for concurrency independently.

As a side effect this construction will utilize more memory for storing data in such queue but it could be limited in terms of data size and chan length to avoid out of memory scenario.

Here how it could look like in diagram:

I see what you mean.

There was some discussion of this problem here: Prevent file fragmentation on Windows by pre-allocating files · Issue #2469 · rclone/rclone · GitHub

Rclone already has the buffering with its --buffer parameter.

It would be possible quite easily to serialize the writes in the local backend by adding a mutex here: rclone/local.go at 31a8211afafeb5d5c189e909a7ec2e5a91a7a47b · rclone/rclone · GitHub

This would need to break the copy down into read and write chunks of a given size and take the mutex before writing a chunk.

You could experiment with that if you wanted to?

I'm not quite sure it will help since the issue is not only serialising writes but also limit their concurrency, AFAICS --buffer helps on huge files in the scenario above but on small files for better network utilisation I have to set higher concurrency for transfers with --transfers=64 (actual value for 128M files I'm using with macOS sparsebundle images, btw) and since transfer is a synchronous operation it could cause up to 64 simultaneous writes to local disk. While on SSDs it works fine (I've already checked) on rotational drives it's dramatically slow.

Btw, I have a kinda workaround for that which helps a bit: to set --buffer-size=<higher value for file size> --multi-thread-cutoff=<lower value than file size> --multi-thread-streams=<something about 4-6-8> to increase RAM usage for file construction before write.

I could play a bit with local backend but I'm almost sure it's not about implementation only it's mainly about architecture and separating writes from reads.

I'm not quite understand how mutex could help here but please give me some time to look at the code of rclone, since I haven't checked out how it's implemented yet starting from architecture proposal based on observable rclone behaviour.

Thanks.

The bit of code I linked above is where there writes to disk for the local backend happen, so that is where the change would need to go.

I'm suggesting that you don't need extra buffering sync rclone already has the --buffer

So all you need is to serialise the writes there.