Constantly high IOWAIT (add log)

omg, I experienced a kinda similar issue, first with my dedicated server and the rtorrent MergerFS mount with an underlying Rclone mount, and now also on my home server running Plex directly on the Rclone mount. I just debugged it down to the same cause: unnecessarily high IOWait. I had already given up until I thought of checking here.

Linux deskmini 5.4.0-0.bpo.4-amd64 #1 SMP Debian 5.4.19-1~bpo10+1 (2020-03-09) x86_64 GNU/Linux

ExecStart=/usr/bin/rclone mount "x-gd:/" /mnt/google/x-gd \
   --allow-other \
   --attr-timeout 1000h \
   --buffer-size 32M \
   --dir-cache-time 1000h \
   --drive-chunk-size 32M \
   --log-level INFO \
   --log-file /home/scripts/logs/mount-x.log \
   --poll-interval 15s \
   --rc \
   --rc-addr 127.0.0.1:5573 \
   --stats 0 \
   --timeout 1h \
   --use-mmap

What would be the smartest thing to do here? Revert to the 1.50 settings or adjust --vfs-read-wait?

Update after raising --vfs-read-wait from 40ms to 50ms:

looks bad. I think...

--async-read=false resolves it for now. 0 IOWait.
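In the mount unit above that just means appending the flag to the end of the ExecStart line (sketch only, keeping the line-continuation backslashes; as discussed further down, the flag may need a beta build):

   --timeout 1h \
   --use-mmap \
   --async-read=false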

Will wait for now I think...

IOWait isn't a good measure in my opinion! If you use --async-read=true (the default) you are going to get IOWait, but you will get faster performance, provided you don't see those "failed to wait for in-sequence read" messages. Those are what really kill performance. If you don't want to see IOWait then set --async-read=false and it will all disappear, along with some performance.

Is it possible to find a setting with --async-read=true that avoids the "failed to wait for in-sequence read" messages? Should I raise --vfs-read-wait higher than 50ms? Any ideas?

Yes, keep raising it until you don't get those messages. It will help a bit with the IOWait but not a lot for the reasons above.

I raised it all the way to --vfs-read-wait 1000ms and still got the "failed to wait for in-sequence read" message in the debug log (40 times in 1 minute under high load).
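(Counted roughly with something like this against the mount log from the unit above; the exact message wording may vary between versions:)

grep -c "failed to wait for in-sequence read" /home/scripts/logs/mount-x.log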

:frowning:

I think it is time for plan B.

Using --async-read=false will fix the problems at the cost of some performance. Meanwhile I'll warm up the proper fix I did and post a beta here.


Hello everyone.
Running 1.51 on an Ubuntu server and have the exact same issue.
A few stupid questions:

  • Why do I see IO Wait in Netdata but nothing in iotop -o?
  • Is rclone constantly writing to the disk with this bug? I have an expensive NVMe drive, so should I downgrade, or is it safe to keep running this version until there is a patch? I don't want this bug causing endless writes to my NVMe until the patched version arrives.

Thanks :slight_smile:

pass!

No, it is not writing to the disk; it is waiting for the network.

iotop is showing active disk utilization and what is consuming active disk IO.

netdata is showing IO Wait, which is a separate measure: a process waiting for IO to complete. Anything waiting on IO would eventually show up in iotop as doing IO, if that makes sense.
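In other words the two tools measure different things, so seeing one without the other is expected. A quick way to look at both on the box (assuming standard procps and iotop are installed):

top -bn1 | grep -i '%cpu'    # the "wa" value here is the IO Wait figure Netdata graphs
sudo iotop -o -b -n1         # only lists processes doing real disk reads/writes right now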

Thanks for the clarification!
I will wait for the next version then, nothing to worry about on my side :slight_smile:
Have a nice day


I have posted the latest beta with fixes for this:

https://beta.rclone.org/v1.51.0-336-g951099db-beta/

I've raised the read timeout to 20ms, and along with another fix I think this should be much better.
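If you want to try it on an existing systemd mount, something along these lines should work (the archive name is assumed from the usual beta directory layout, so check the listing at the link above first; replace the unit name with your own):

cd /tmp
curl -LO https://beta.rclone.org/v1.51.0-336-g951099db-beta/rclone-v1.51.0-336-g951099db-beta-linux-amd64.zip
unzip rclone-v1.51.0-336-g951099db-beta-linux-amd64.zip
sudo systemctl stop your-mount.service
sudo cp rclone-v1.51.0-336-g951099db-beta-linux-amd64/rclone /usr/bin/rclone
rclone version    # should report the beta version
sudo systemctl start your-mount.service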

Comments appreciated!

If I'm using non-beta 1.51, do I have to use --async-read=false in the rclone mount options to avoid this? And also in mergerfs?

Yes that is correct

Probably wouldn't hurt but not 100% sure.
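For reference, mergerfs has its own async_read option, so the equivalent there would look something like the fstab line below (the option name is from the mergerfs docs rather than this thread, the branch paths are placeholders, and I haven't verified it makes a difference for this issue):

/mnt/local:/mnt/google/x-gd  /mnt/merged  fuse.mergerfs  allow_other,async_read=false  0 0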

I thought that option was not in 1.51 and you had to use a beta?

You are quite correct! You'll need the beta for --async-read=false.

Why don't I have issues without messing with this async stuff? Sometimes I have 200 or more files open on the mount, and I haven't noticed any issues at all.

And I'm using rclone betas, and kernel 5.6.4

The issue was with stock 1.51, as that defaulted to turning on async reads, and there were fixes that went into the beta to 'smooth' it out. It could also be that with your settings reads happen in such small increments that you are not seeing the issue.

You'd probably want to compare IOWAIT on 1.50.2 and 1.51 and see if you have any changes in it.
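A rough way to do that comparison (assumes a procps vmstat that prints the "st" column last; run it once on each rclone version under similar load):

vmstat 1 60 | awk 'NR > 2 { wa += $(NF-1); n++ } END { printf "avg iowait: %.1f%%\n", wa/n }'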

Oh I see. So if I'm on rclone 1.51, besides moving to the beta, are there any other flags I can add to the rclone and mergerfs mounts to mitigate this issue, or is rolling back to 1.50 the only way?