Limit IO Wait - rclone copy

agneev · June 10, 2020, 12:42pm

What is the problem you are having with rclone?

High IO_Wait with rclone copy

This occurs once more than one file has finished downloading (reached 100%).

I was under the impression that checkers is used to control the number of files being read...

What is your rclone version (output from `rclone version`)

rclone v1.52.0
- os/arch: linux/arm
- go version: go1.14.3

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Raspberry Pi OS - armhf

Which cloud storage system are you using? (eg Google Drive)

<Put.io>

The command you were trying to run (eg `rclone copy /tmp remote:tmp`)

rclone copy source:path dest:path --checkers=1

Animosity022 · June 10, 2020, 12:53pm

`--checkers` controls how many files are being checked in parallel.
`--transfers` still defaults to a higher number, so several transfers are running at the same time.

There is no log so it's a bit hard to tell.

IOWait is generally going to happen with a cloud remote, as you are waiting for the cloud remote to do its thing.

agneev · June 10, 2020, 12:56pm

--transfers=2.

IOWait is happening because multiple files are being read from the disk at the same time, which is reducing throughput (19MB/s instead of 110-120MB/s) and causing delays.
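To put a number on the IOWait rather than eyeballing it, one option is to sample the aggregate `cpu` line of `/proc/stat` twice and compare the deltas. The following is a minimal sketch, assuming a Linux system and the usual field order (user, nice, system, idle, iowait, irq, softirq, steal, ...). The function names are made up for illustration and are not part of rclone.

```python
import time


def parse_cpu_line(line):
    """Return the numeric fields of the aggregate 'cpu' line from /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in parts[1:]]


def iowait_percent(before_line, after_line):
    """Percentage of CPU time spent in iowait between two samples."""
    before = parse_cpu_line(before_line)
    after = parse_cpu_line(after_line)
    # Only the first eight fields (user..steal) are summed, because guest
    # time is already included in user/nice on Linux.
    deltas = [a - b for a, b in zip(after[:8], before[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def sample_iowait(interval=1.0):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open("/proc/stat") as stat:
        before = stat.readline()
    time.sleep(interval)
    with open("/proc/stat") as stat:
        after = stat.readline()
    return iowait_percent(before, after)
```

Calling `sample_iowait(5.0)` while the copy is running, once with a single transfer and once with several, would show whether the wait really grows with the number of files being read at the same time.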

Animosity022 · June 10, 2020, 12:59pm

I'm not sure what that means, the default is 4:

     --transfers int                        Number of file transfers to run in parallel. (default 4)

You need to run with a debug log and share the log.
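As a rough illustration, the command from the first post could be extended with an explicit transfer limit and a debug log. The sketch below only builds the argument list; `source:path`, `dest:path` and the log file name `rclone.log` are placeholders, and the helper name is hypothetical. The flags themselves (`--transfers`, `--checkers`, `-vv`, `--log-file`) are standard rclone options.

```python
import shlex


def build_rclone_copy(source, dest, transfers=1, checkers=1, log_file="rclone.log"):
    """Build an `rclone copy` command line with a debug log (hypothetical helper)."""
    return [
        "rclone", "copy", source, dest,
        f"--transfers={transfers}",   # number of files copied in parallel
        f"--checkers={checkers}",     # number of files checked in parallel
        "-vv",                        # debug-level logging
        f"--log-file={log_file}",     # write the log to a file that can be shared
    ]


cmd = build_rclone_copy("source:path", "dest:path")
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run` would start the copy. With `transfers=1` only one file is read from disk at a time, which is the simplest way to check whether parallel reads are what drives the IOWait.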

agneev · June 10, 2020, 1:03pm

Should async_reads on mergerFS prevent this?

Animosity022 · June 10, 2020, 1:10pm

I'm not sure what you are trying to prevent.

Earlier versions of rclone did sync_reads, which means the OS asked for, say, parts 1 2 3 4 in order and got them back in the same order: 1 2 3 4.

Newer kernel versions have async reads available along with fuse, which means you can ask for 1 2 3 4 and you may get back 1 3 2 4, and rclone puts it back together.
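The idea of asking for parts in order, receiving them in any order, and putting them back together can be sketched as follows. This is a conceptual illustration only, not rclone's or the kernel's actual code. It assumes a Unix system (for `os.pread`), and the function names and the tiny chunk size are made up for the demo.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed


def read_chunk(fd, offset, size):
    """Read `size` bytes at `offset` without moving the shared file position."""
    return offset, os.pread(fd, size, offset)


def read_out_of_order(path, chunk_size=4, workers=4):
    """Request all chunks at once and reassemble them by offset as they arrive."""
    size = os.path.getsize(path)
    buf = bytearray(size)
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [
                pool.submit(read_chunk, fd, offset, chunk_size)
                for offset in range(0, size, chunk_size)
            ]
            # Chunks may complete in any order; the offset says where each one goes.
            for future in as_completed(futures):
                offset, data = future.result()
                buf[offset:offset + len(data)] = data
    finally:
        os.close(fd)
    return bytes(buf)


# Small demo on a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"part1part2part3part4")
    demo_path = tmp.name
result = read_out_of_order(demo_path, chunk_size=5)
os.remove(demo_path)
print(result)
```

Because every chunk carries its offset, the order of arrival does not matter for correctness. It does matter for the disk, though, since out-of-order requests turn a sequential read into a more random one.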

There isn't a yes/no answer to IOWait, as it depends on what is happening on the system and what the incoming reads are doing.

A database, for example, uses async reads as it does many, many requests at the same time so it has to service things as quickly as possible.

If I copy a file from a to b, it's "probably" better to do that sync and read in order, as I'm sequentially reading a file. If I expand that to reading 10 files at a time with a program, maybe async is better or maybe it isn't; it has to be tested.
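One way to test this is to time the same set of files read one at a time and then several at once. The sketch below is a minimal harness under stated assumptions: the file sizes, file count and function names are invented for the demo, and the tiny temporary files will be served from the page cache, so meaningful numbers need real files larger than RAM (or dropped caches) on the disk in question.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def read_file(path, block_size=1 << 20):
    """Read a file sequentially in blocks and return the number of bytes read."""
    total = 0
    with open(path, "rb") as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                break
            total += len(block)
    return total


def timed_read(paths, parallel):
    """Read all files with at most `parallel` files open at once; return (bytes, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        total = sum(pool.map(read_file, paths))
    return total, time.perf_counter() - start


# Demo on small throwaway files; replace `paths` with real files for a real test.
with tempfile.TemporaryDirectory() as workdir:
    paths = []
    for index in range(4):
        path = os.path.join(workdir, f"file{index}.bin")
        with open(path, "wb") as handle:
            handle.write(os.urandom(256 * 1024))
        paths.append(path)
    seq_total, seq_secs = timed_read(paths, parallel=1)
    par_total, par_secs = timed_read(paths, parallel=4)

print(f"sequential: {seq_total} bytes in {seq_secs:.4f}s")
print(f"parallel:   {par_total} bytes in {par_secs:.4f}s")
```

On a single spinning disk the parallel run may well be slower, which would match the throughput drop described earlier in the thread. On other storage it can go the other way, which is why it has to be measured.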

My use case is normally streaming media and that works pretty well for my use so I currently use sync_reads. At some point, I'll go back and test out async and it really should be better but I personally haven't done that yet.

If you can describe your actual use case and setup and share a log, we can take a look and provide feedback. These questions are not simple yes/no type things, and more info is needed.
