Rclone mount performance issue

What is the problem you are having with rclone?

I'm hosting a Java server to serve images via VPS, and use rclone mount to attach OneDrive to my hard drive, when in server's peak hour (every day), rclone mount's performance cannot afford this pressure. I have tested if only use hard drive, there won't be any overload issue. Below is some metrics I collected from my machine:

(Cause new user can only embed one image in post, so I put them in HackMD)
HackMD Metrics images

lshw -class disk -class storage Output

*-scsi:2
       description: SCSI storage controller
       product: Virtio block device
       vendor: Red Hat, Inc.
       physical id: 6
       bus info: pci@0000:00:06.0
       version: 00
       width: 64 bits
       clock: 33MHz
       capabilities: scsi msix bus_master cap_list
       configuration: driver=virtio-pci latency=0
       resources: irq:11 ioport:c0c0(size=64) memory:febd4000-febd4fff memory:fe00c000-fe00ffff
     *-virtio3
          description: Virtual I/O device
          physical id: 0
          bus info: virtio@3
          logical name: /dev/vdb
          size: 200GiB (214GB)
          capabilities: partitioned partitioned:dos
          configuration: driver=virtio_blk logicalsectorsize=512 sectorsize=512 signature=7b19055e

Run the command 'rclone version' and share the full output of the command.

rclone --version
rclone v1.64.2

  • os/version: ubuntu 20.04 (64 bit)
  • os/kernel: 5.4.0-166-generic (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.21.3
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Onedrive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone mount odrive: /onedrive --allow-other --transfers 4 --buffer-size 64M --cache-dir /mnt/onedrive_cache --vfs-read-chunk-size 2M --vfs-read-chunk-size-limit 0M --vfs-cache-max-size 150G --vfs-cache-max-age 3360h --allow-non-empty --dir-cache-time 3360h --vfs-cache-mode full --umask 000 --vfs-cache-mode full --vfs-cache-poll-interval 10d --vfs-write-back 8h

The rclone config contents with secrets removed.

[odrive]
type = onedrive
client_id = ffeeed50-ea2e-4a3c-8713-6ec66fe0c696
client_secret = ...
token = {"access_token":"...","token_type":"Bearer","refresh_token":"..."}
drive_id = b!QT-vBfWDykuEFt9u5LUVNJ3gNgu-9H5JtcpAQf5duVOunIoMaRPKRJ_ekSnZctYl
drive_type = business

A log from the command with the -vv flag

Cause the log file is too big (180 MB), can't upload to gist, so I upload it to OneDrive. Want to mention that the log file time zone is UTC±00:00, and the metrics time zone is UTC+08:00

So, the I/O issue might happen between 11/29 16:00 ~ 17:30 (UTC±00:00)

[log file](https://1drv.ms/t/s!AqLRwqw4zL1sjZgX7bKLa0O7zQMYKg?e=atbZXx)

welcome to the forum,

sorry, not sure what you mean?
what makes you think this is a hard drive issue.

i took a quick look at the huge log,

there are a few network errors with onedrive, looks like nothing fatal.
all the errors seem to be http2: client connection force closed via ClientConn.Close
that message does not come from rclone, instead the message comes from the go language http2 libary.
https://go.googlesource.com/net/+/master/http2/transport.go#1130

but let's see what other forum members think?


--vfs-cache-mode full
you used that flag twice in the same command?

Thanks for reply!

I mean if I not using rclone mount, and serve file directly from hard drive, there won't have any overload problem, so it's not hard drive R/W speed issue, it's about rclone mount.

From the metric I post above, there is dramatically drop of I/O Utilization, and PSI shows that most of processes are in I/O stall.

I forgot to remove the duplicate one XD, but I thought there won't make any difference.

Again, thanks for your help!

sure.

yes, that is correct

sorry, i missed that until now...
i lack the skill set to interpet that.
tho i am most curious how other forum members will interpet...

rclone mount odrive: /onedrive \
--allow-other \
--allow-non-empty \
--umask 000 \
--transfers 4 \

--cache-dir /mnt/onedrive_cache \
--vfs-cache-max-size 150G \
--vfs-cache-mode full \
--buffer-size 64M \

--vfs-read-chunk-size 2M \
--vfs-read-chunk-size-limit 0M \

--dir-cache-time 3360h \
--vfs-cache-max-age 3360h \
--vfs-cache-poll-interval 10d \
--vfs-write-back 8h

Is it right to assume that your mount is pretty much used for reads only? And you use it as sort of buffer to much larger underlying cloud storage?

I would try to speed it up. Install rclone v1.65 and add --vfs-refresh flag - it will load all directories cache to memory.

What are typical file sizes this mount serves? How many files are there? All in one folder or not?
Maybe we can try to optimise it a bit more.

Does not mean that all these optimisations help enough. But they are low hanging fruit.

Using fuse comes at the cost - it is user space software and both context switches and sys calls are expensive operations. It creates overhead and slow down which might tip the scale in your use case. Fuse was developed for convenience not for speed.

Now I would run test to determine if problem is with cloud or local part.

Run your load test with all files already in cache - simply run twice the test using the same files. Second run with your mount setting will have all data already cached.

Then maybe rclone mount can be optimised to meet your requirements. Or it is possible that you have to find another solution. Impossible to say right now.

Yes, most of the use is reads, but there is also 6~10 writes per minute.

Here is my storage structure:

hard drive  <---> OneDrive
(20G/200G)        (20G/5T)

So OneDrive is like the expansion of my hard drive, and sure, everything is on the cache now (vfs-cache-max-size has 150G)

This is a good idea! I will try to use this flag to run test, currently I'm trying to write a script that can reproduced the issue.

Every image size is between 300KB ~ 800KB, and there are about 60000 files evenly distributed in the folder, the folder structure is nested folder of 2 depth, like onedrive/data/1a/1c/example.jpg

I see, but there doesn't have any better option for my use case I think : (

1 Like

After days of research, I finally found the reason of server overload, below is my java pseudo code:

if (file.exist()){
    while(bytes need to read > current read bytes){
        read file bytes
    }
}

And here is some logs that use grep filtered:

2023/12/08 07:50:25 DEBUG : cache/66/e0/: Lookup: name="66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg"
2023/12/08 07:50:25 DEBUG : cache/66/e0/: >Lookup: node=cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg, err=<nil>
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=86103 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Open: flags=OpenReadOnly
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Open: flags=O_RDONLY
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: newRWFileHandle: 
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >newRWFileHandle: err=<nil>
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Open: fd=cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg (rw), err=<nil>
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Open: fh=&{cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg (rw)}, err=<nil>
2023/12/08 07:50:25 DEBUG : &{cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg (rw)}: Read: len=90112, offset=0
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg(0xc00ab9bbc0): _readAt: size=90112, off=0
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg(0xc00ab9bbc0): openPending: 
2023/12/08 07:50:25 INFO  : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: vfs cache: removed cache file as stale (remote deleted)
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: vfs cache: removed metadata from cache as stale (remote deleted)
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: vfs cache: truncate to size=0 (not needed as size correct)
2023/12/08 07:50:25 DEBUG : cache/66/e0: Added virtual directory entry vAddFile: "66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg"
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg(0xc00ab9bbc0): >openPending: err=<nil>
2023/12/08 07:50:25 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg(0xc00ab9bbc0): >_readAt: n=0, err=EOF
2023/12/08 07:50:25 DEBUG : &{cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg (rw)}: >Read: read=0, err=<nil>
2023/12/08 07:50:26 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:26 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:27 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:27 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:28 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:28 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:29 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:29 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:30 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:30 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:31 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:31 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:32 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:32 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:33 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:33 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:34 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:34 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:35 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:35 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:36 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:36 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:37 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:37 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:38 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:38 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:39 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:39 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:40 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:40 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:41 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:41 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:42 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:42 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:43 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:43 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:44 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:44 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:45 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:45 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:46 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:46 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:47 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:47 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:48 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:48 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:49 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:49 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:50 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:50 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:51 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:51 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:52 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:52 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:53 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:53 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:54 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:54 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:55 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:55 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:56 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:56 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:57 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:57 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:58 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:58 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:50:59 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:50:59 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:51:00 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:51:00 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:51:01 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:51:01 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:51:02 DEBUG : cache/66/e0/: Lookup: name="66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg"
2023/12/08 07:51:02 DEBUG : cache/66/e0/: >Lookup: node=cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg, err=<nil>
2023/12/08 07:51:02 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:51:02 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:51:02 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Open: flags=OpenReadOnly
2023/12/08 07:51:02 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Open: flags=O_RDONLY
2023/12/08 07:51:02 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: newRWFileHandle: 
2023/12/08 07:51:02 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >newRWFileHandle: err=<nil>
2023/12/08 07:51:02 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Open: fd=cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg (rw), err=<nil>
2023/12/08 07:51:02 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Open: fh=&{cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg (rw)}, err=<nil>
2023/12/08 07:51:02 DEBUG : &{cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg (rw)}: Read: len=4096, offset=0
2023/12/08 07:51:02 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg(0xc00f76be40): _readAt: size=4096, off=0
2023/12/08 07:51:02 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg(0xc00f76be40): >_readAt: n=0, err=EOF
2023/12/08 07:51:02 DEBUG : &{cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg (rw)}: >Read: read=0, err=<nil>
2023/12/08 07:51:03 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:51:03 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:51:03 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:51:03 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:51:04 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:51:04 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:51:05 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:51:05 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:51:05 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:51:05 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>
2023/12/08 07:51:06 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: Attr: 
2023/12/08 07:51:06 DEBUG : cache/66/e0/66e05470f23f0a32c7936b7774f5ddd69be7fea0-86103-1280-853-jpg: >Attr: a=valid=1s ino=0 size=0 mode=-rw-rw-rw-, err=<nil>

As noted above, when check file exist, rclone return ture, then when read bytes from rclone, rclone delete file as stale cache? And file read thread will stuck at while loop, And because rclone added virtual directory entry that file size was 0, the following file exist check still return true. As result, more and more thread stall:

image

Firstly thank you for sharing your findings. It is always appreciated.

But secondly I am not 100% sure I get it... Do you suspect it is rclone bug?

Yes, cause if I move all my files from rclone to disk, and serve files directly from disk, this issue will never happen.

I found another weird stuff that different from the issue mentioned above, here is the first rclone log:
first_rclone_log (github.com)

here is the second:
second_rclone_log (github.com)

And this is my sever log for second rclone log:

2023-12-10T23:57:46Z [info] Started to checked file id complete: cd637860967369dab018986e8595068ef7497cf4-144972-966-1046-jpg
 2023-12-10T23:57:49Z [info] Started to checked file id complete: d6e878f252dec2d96cf4810e6fea6017ee8d5292-200849-800-1130-jpg
 2023-12-10T23:57:49Z [info] Started to checked file id complete: 76cda65be5c86ebbbeac367255b966a994ed5817-337159-1280-1811-jpg
 2023-12-10T23:57:50Z [info] Started to checked file id complete: 7f5d3b044cc4a6b67436fb76d30cc6c5d9f09cc4-372681-1200-1691-jpg
 
...

 2023-12-11T00:01:58Z [info] Response checked file id complete: b0e1c1f229e5a148c788feabee51771a3660824a-212195-1280-720-jpg, timed used: 80.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: 1d6bace0bbf82b44b841b8609bade18947738510-507390-1280-1859-jpg, timed used: 70.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: ae01b11611703e4f6d287ec70b6fbcc5e00c8ac4-224525-1280-1066-jpg, timed used: 112.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: 9d0e904d6f8e1efe7ef54e52d9ddebeaf1f1dab1-60501-780-1038-jpg, timed used: 179.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: f7606ad04448cc223246ffb2bd8c92aa8ff4e3dc-410370-1280-1811-jpg, timed used: 213.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: a1ca8f99e74cf990714a58ddb2857beb835e6132-245542-1280-775-jpg, timed used: 180.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: 76cdec882146c3885a3a1623f94018bc6a8c740d-66318-780-552-jpg, timed used: 197.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: 41916ed599e78cfafa176c19908c8645499c8403-294119-1280-1707-jpg, timed used: 193.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: f7606ad04448cc223246ffb2bd8c92aa8ff4e3dc-410370-1280-1811-jpg, timed used: 200.00
 2023-12-11T00:01:58Z [info] {10075/122.206.190.4}   The connection was interrupted or closed by the remote host: java.net.SocketException: Broken pipe
 2023-12-11T00:01:58Z [info] Response checked file id complete: 76cd375802cf5cf4e44428568cbba7c26b43e916-101267-1280-720-jpg, timed used: 169.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: f7606ad04448cc223246ffb2bd8c92aa8ff4e3dc-410370-1280-1811-jpg, timed used: 189.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: ce56234a929c3fc3a0e3ec66103423fb98e4dc3d-34725-780-1024-jpg, timed used: 14.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: f7606ad04448cc223246ffb2bd8c92aa8ff4e3dc-410370-1280-1811-jpg, timed used: 188.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: ce56234a929c3fc3a0e3ec66103423fb98e4dc3d-34725-780-1024-jpg, timed used: 1.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: f7606ad04448cc223246ffb2bd8c92aa8ff4e3dc-410370-1280-1811-jpg, timed used: 0.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: f7606ad04448cc223246ffb2bd8c92aa8ff4e3dc-410370-1280-1811-jpg, timed used: 138.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: b1d55488b6d12b25a6d02e41bce28ef758979bc9-481723-1280-1817-jpg, timed used: 145.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: b1d55488b6d12b25a6d02e41bce28ef758979bc9-481723-1280-1817-jpg, timed used: 204.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: b1d55488b6d12b25a6d02e41bce28ef758979bc9-481723-1280-1817-jpg, timed used: 182.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: b0e1c1f229e5a148c788feabee51771a3660824a-212195-1280-720-jpg, timed used: 58.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: b1d55488b6d12b25a6d02e41bce28ef758979bc9-481723-1280-1817-jpg, timed used: 8.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: 13eb7753dc5298ff38a0ec6afecd15beb92c2db6-301592-1280-1707-jpg, timed used: 94.00

When server is checking file exist, like;

2023-12-10T23:57:46Z [info] Started to checked file id complete: cd637860967369dab018986e8595068ef7497cf4-144972-966-1046-jpg

The file is at cache/cd/63/cd637860967369dab018986e8595068ef7497cf4-144972-966-1046-jpg
rclone started to lookup for the file, then stall:

2023/12/10 23:57:46 DEBUG : cache/: Lookup: name="cd"
2023/12/10 23:57:48 DEBUG : : Statfs: 
2023/12/10 23:57:48 DEBUG : : >Statfs: stat={Blocks:1342177280 Bfree:1336293151 Bavail:1336293151 Files:1000000000 Ffree:1000000000 Bsize:4096 Namelen:255 Frsize:4096}, err=<nil>
2023/12/10 23:57:53 DEBUG : : Statfs: 
2023/12/10 23:57:53 DEBUG : : >Statfs: stat={Blocks:1342177280 Bfree:1336293151 Bavail:1336293151 Files:1000000000 Ffree:1000000000 Bsize:4096 Namelen:255 Frsize:4096}, err=<nil>
2023/12/10 23:58:01 DEBUG : : Statfs: 
2023/12/10 23:58:01 DEBUG : : >Statfs: stat={Blocks:1342177280 Bfree:1336293151 Bavail:1336293151 Files:1000000000 Ffree:1000000000 Bsize:4096 Namelen:255 Frsize:4096}, err=<nil>
2023/12/10 23:58:09 DEBUG : : Statfs: 
2023/12/10 23:58:09 DEBUG : : >Statfs: stat={Blocks:1342177280 Bfree:1336293151 Bavail:1336293151 Files:1000000000 Ffree:1000000000 Bsize:4096 Namelen:255 Frsize:4096}, err=<nil>
2023/12/10 23:58:17 DEBUG : : Statfs: 
2023/12/10 23:58:17 DEBUG : : >Statfs: stat={Blocks:1342177280 Bfree:1336293151 Bavail:1336293151 Files:1000000000 Ffree:1000000000 Bsize:4096 Namelen:255 Frsize:4096}, err=<nil>
2023/12/10 23:58:25 DEBUG : : Statfs: 
2023/12/10 23:58:25 DEBUG : : >Statfs: stat={Blocks:1342177280 Bfree:1336293151 Bavail:1336293151 Files:1000000000 Ffree:1000000000 Bsize:4096 Namelen:255 Frsize:4096}, err=<nil>
2023/12/10 23:58:33 DEBUG : : Statfs: 
2023/12/10 23:58:33 DEBUG : : >Statfs: stat={Blocks:1342177280 Bfree:1336293151 Bavail:1336293151 Files:1000000000 Ffree:1000000000 Bsize:4096 Namelen:255 Frsize:4096}, err=<nil>
2023/12/10 23:58:41 DEBUG : : Statfs: 
2023/12/10 23:58:41 DEBUG : : >Statfs: stat={Blocks:1342177280 Bfree:1336293151 Bavail:1336293151 Files:1000000000 Ffree:1000000000 Bsize:4096 Namelen:255 Frsize:4096}, err=<nil>
2023/12/10 23:58:49 DEBUG : : Statfs: 
2023/12/10 23:58:49 DEBUG : : >Statfs: stat={Blocks:1342177280 Bfree:1336293151 Bavail:1336293151 Files:1000000000 Ffree:1000000000 Bsize:4096 Namelen:255 Frsize:4096}, err=<nil>
2023/12/10 23:58:57 DEBUG : : Statfs: 
2023/12/10 23:58:57 DEBUG : : >Statfs: stat={Blocks:1342177280 Bfree:1336293151 Bavail:1336293151 Files:1000000000 Ffree:1000000000 Bsize:4096 Namelen:255 Frsize:4096}, err=<nil>

You can see that server checking file for about 3 minutes long:

 2023-12-11T00:01:58Z [info] Response checked file id complete: b0e1c1f229e5a148c788feabee51771a3660824a-212195-1280-720-jpg, timed used: 80.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: 1d6bace0bbf82b44b841b8609bade18947738510-507390-1280-1859-jpg, timed used: 70.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: ae01b11611703e4f6d287ec70b6fbcc5e00c8ac4-224525-1280-1066-jpg, timed used: 112.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: 9d0e904d6f8e1efe7ef54e52d9ddebeaf1f1dab1-60501-780-1038-jpg, timed used: 179.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: f7606ad04448cc223246ffb2bd8c92aa8ff4e3dc-410370-1280-1811-jpg, timed used: 213.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: a1ca8f99e74cf990714a58ddb2857beb835e6132-245542-1280-775-jpg, timed used: 180.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: 76cdec882146c3885a3a1623f94018bc6a8c740d-66318-780-552-jpg, timed used: 197.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: 41916ed599e78cfafa176c19908c8645499c8403-294119-1280-1707-jpg, timed used: 193.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: f7606ad04448cc223246ffb2bd8c92aa8ff4e3dc-410370-1280-1811-jpg, timed used: 200.00
 2023-12-11T00:01:58Z [info] {10075/122.206.190.4}   The connection was interrupted or closed by the remote host: java.net.SocketException: Broken pipe
 2023-12-11T00:01:58Z [info] Response checked file id complete: 76cd375802cf5cf4e44428568cbba7c26b43e916-101267-1280-720-jpg, timed used: 169.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: f7606ad04448cc223246ffb2bd8c92aa8ff4e3dc-410370-1280-1811-jpg, timed used: 189.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: ce56234a929c3fc3a0e3ec66103423fb98e4dc3d-34725-780-1024-jpg, timed used: 14.00
 2023-12-11T00:01:58Z [info] Response checked file id complete: f7606ad04448cc223246ffb2bd8c92aa8ff4e3dc-410370-1280-1811-jpg, timed used: 188.00

There has some same situations when this issue happened:

  1. rclone is invalidating directory cache at the same time:
2023/12/10 14:19:43 DEBUG : : changeNotify: relativePath="cache", type=0
2023/12/10 14:19:43 DEBUG : cache: invalidating directory cache

2023/12/10 23:57:42 DEBUG : : changeNotify: relativePath="cache", type=0
2023/12/10 23:57:42 DEBUG : cache: invalidating directory cache
  1. Only when rclone finished retrying Onedrive API the lookup started to process:
2023/12/10 14:23:52 DEBUG : pacer: low level retry 1/10 (error Get "https://graph.microsoft.com/v1.0/drives/b!QT-vBfWDykuEFt9u5LUVNJ3gNgu-9H5JtcpAQf5duVOunIoMaRPKRJ_ekSnZctYl/items/01QPGQHXLCYNEU64CERBAJ4K4DYE32FIFE/children?$top=1000": net/http: timeout awaiting response headers)
2023/12/10 14:23:52 DEBUG : pacer: Rate limited, increasing sleep to 20ms
2023/12/10 14:23:56 DEBUG : : Statfs: 
2023/12/10 14:23:56 DEBUG : : >Statfs: stat={Blocks:1342177280 Bfree:1336293151 Bavail:1336293151 Files:1000000000 Ffree:1000000000 Bsize:4096 Namelen:255 Frsize:4096}, err=<nil>
2023/12/10 14:23:57 DEBUG : pacer: Reducing sleep to 15ms
2023/12/10 14:23:57 DEBUG : cache/d6/: >Lookup: node=cache/d6/2c/, err=<nil>
2023/12/10 14:23:57 DEBUG : cache/d6/2c/: Attr: 
2023/12/10 14:23:57 DEBUG : cache/d6/2c/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxrwxrwx, err=<nil>
2023/12/10 14:23:57 DEBUG : cache/d6: invalidating directory cache
2023/12/10 14:23:57 DEBUG : : >changeNotify: 
2023/12/10 14:23:57 DEBUG : : changeNotify: relativePath="cache/d6/e8", type=0
2023/12/10 14:23:57 DEBUG : cache/: Lookup: name="39"
2023/12/10 14:23:57 DEBUG : cache/: Lookup: name="1d"
2023/12/10 14:23:57 DEBUG : cache/: Lookup: name="6b"
2023/12/10 14:23:57 DEBUG : cache/: Lookup: name="d1"
2023/12/10 14:23:57 DEBUG : cache/: Lookup: name="38"

2023/12/11 00:01:52 DEBUG : pacer: low level retry 1/10 (error Get "https://graph.microsoft.com/v1.0/drives/b!QT-vBfWDykuEFt9u5LUVNJ3gNgu-9H5JtcpAQf5duVOunIoMaRPKRJ_ekSnZctYl/items/01QPGQHXK7LQERLIIJSVBKK7TUYAVAVCGT/children?$top=1000": net/http: timeout awaiting response headers)
2023/12/11 00:01:52 DEBUG : pacer: Rate limited, increasing sleep to 20ms
2023/12/11 00:01:53 DEBUG : : Statfs: 
2023/12/11 00:01:53 DEBUG : : >Statfs: stat={Blocks:1342177280 Bfree:1336293151 Bavail:1336293151 Files:1000000000 Ffree:1000000000 Bsize:4096 Namelen:255 Frsize:4096}, err=<nil>
2023/12/11 00:01:58 DEBUG : pacer: Reducing sleep to 15ms
2023/12/11 00:01:58 DEBUG : cache/cf/: >Lookup: node=cache/cf/4d/, err=<nil>
2023/12/11 00:01:58 DEBUG : cache/cf: invalidating directory cache
2023/12/11 00:01:58 DEBUG : : >changeNotify: 
2023/12/11 00:01:58 DEBUG : : changeNotify: relativePath="cache/cf/4d", type=0
2023/12/11 00:01:58 DEBUG : cache/cf/4d/: Attr: 
2023/12/11 00:01:58 DEBUG : cache/cf/4d/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxrwxrwx, err=<nil>
2023/12/11 00:01:58 DEBUG : cache/: Lookup: name="f7"
2023/12/11 00:01:58 DEBUG : cache/: Lookup: name="7d"
2023/12/11 00:01:58 DEBUG : cache/: Lookup: name="1d"
2023/12/11 00:01:58 DEBUG : cache/: Lookup: name="b1"
2023/12/11 00:01:58 DEBUG : cache/: Lookup: name="cc"

I have found a way to reproduce this issue, I will start another thread, and this thread can be closed.

1 Like

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.