Fuse hang on copyfile

What is the problem you are having with rclone?

Fuse seems to hang when I use rclone rc operations/copyfile with a srcFs set to the path of the mounted crypt drive. Works fine if I use the drive name instead of the path.

I am using systemctl to mount a crypt drive on Ubuntu 20.04 LTS at /Plex.
Here is my rclone.service that I am using on my test server. I have stripped it down from my production server trying to determine where the problem is.

[Unit]
Description=rclone GDrive Mount Service
AssertPathIsDirectory=/Plex
Wants=network-online.target
After=network-online.target

[Service]
Type=notify
Environment=RCLONE_CONFIG=/root/.config/rclone/rclone.conf
KillMode=none
RestartSec=5
ExecStart=/usr/bin/rclone mount GD-crypt: /Plex \
--rc --rc-addr=:5572 --rc-user=XXXX --rc-pass=XXXX \
--debug-fuse \
--log-level=DEBUG \
--log-file=/home/rclonelog.txt
#
ExecStop=/bin/fusermount -uz /Plex
Restart=on-failure

[Install]
WantedBy=multi-user.target

When I run this command, it works fine.

curl -X POST -u XXXX:XXXX http://testrclone.tld:5572/operations/copyfile -d srcFs=GD-crypt: -d srcRemote=testfile.txt -d dstFs=/home -d dstRemote=testfile1.txt

When I run this command, it never completes and the file system seems to be hung up. When I do a df, it never responds. I end up having to reboot my server.

curl -X POST -u XXXX:XXXX http://testrclone.tld:5572/operations/copyfile -d srcFs=/Plex -d srcRemote=testfile.txt -d dstFs=/home -d dstRemote=testfile2.txt

This used to work. I'm guessing that some updates to Ubuntu changed something.
Should it work?
The reason I am interested in using the path instead of the drive is that is how I wrote the rclone monitor in my multi-server fork of Tautulli to verify that the rclone mount is up.
If it should not ever be done this way, I can rethink my code. But since it used to work, I am wondering if there could be a bug in rclone. Trying to figure out where the issue is and what to do with my code.

Run the command 'rclone version' and share the full output of the command.

rclone v1.58.0

  • os/version: ubuntu 20.04 (64 bit)
  • os/kernel: 5.4.0-29-generic (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.17.8
  • go/linking: static
  • go/tags: none

Are you on the latest version of rclone? You can validate by checking the version listed here: -->
Yes.

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

curl -X POST -u XXXX:XXXX http://myhost.tld:5572/operations/copyfile -d srcFs=/Plex -d srcRemote=testfile.txt -d dstFs=/home -d dstRemote=testfile2.txt

The rclone config contents with secrets removed.

root@testRclone:~# rclone config show
[GD]
type = drive
client_id = XXXX.apps.googleusercontent.com
client_secret = XXXX
token = XXXX  
root_folder_id = XXXX

[GD-crypt]
type = crypt
remote = GD:PlexCrypt
filename_encryption = standard
password = XXXX
password2 = XXXX

A log from the command with the -vv flag

2022/03/21 13:08:53 DEBUG : rclone: Version "v1.58.0" starting with parameters ["/usr/bin/rclone" "mount" "GD-crypt:" "/Plex" "--rc" "--rc-addr=:5572" "--rc-user=XXXX" "--rc-pass=XXXX" "--debug-fuse" "--log-level=DEBUG" "--log-file=/home/rclonelog.txt"]
2022/03/21 13:08:53 INFO  : Using --user XXXX --pass XXXX as authenticated user
2022/03/21 13:08:53 NOTICE: Serving remote control on http://[::]:5572/
2022/03/21 13:08:53 DEBUG : Creating backend with remote "GD-crypt:"
2022/03/21 13:08:53 DEBUG : Using config file from "/root/.config/rclone/rclone.conf"
2022/03/21 13:08:53 DEBUG : Creating backend with remote "GD:PlexCrypt"
2022/03/21 13:08:54 DEBUG : Encrypted drive 'GD-crypt:': Mounting on "/Plex"
2022/03/21 13:08:54 DEBUG : : Root: 
2022/03/21 13:08:54 DEBUG : : >Root: node=/, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: <- Getattr [ID=0x4 Node=0x1 Uid=0 Gid=0 Pid=1094] 0x0 fl=0
2022/03/21 13:08:57 DEBUG : /: Attr: 
2022/03/21 13:08:57 DEBUG : /: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x4] Getattr valid=1s ino=1 size=0 mode=drwxr-xr-x
2022/03/21 13:08:57 DEBUG : fuse: <- Getxattr [ID=0x6 Node=0x1 Uid=0 Gid=0 Pid=1094] "security.selinux" 255
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x6] Getxattr error=operation not supported
2022/03/21 13:08:57 DEBUG : fuse: <- Getxattr [ID=0x8 Node=0x1 Uid=0 Gid=0 Pid=1094] "system.posix_acl_access" 0
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x8] Getxattr error=operation not supported
2022/03/21 13:08:57 DEBUG : fuse: <- Open [ID=0xa Node=0x1 Uid=0 Gid=0 Pid=1094] dir=true fl=OpenReadOnly+OpenDirectory+OpenNonblock
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0xa] Open 0x1 fl=0
2022/03/21 13:08:57 DEBUG : fuse: <- Read [ID=0xc Node=0x1 Uid=0 Gid=0 Pid=1094] 0x1 4096 @0x0 dir=true fl=0 owner=0 ffl=OpenReadOnly+OpenDirectory+OpenNonblock
2022/03/21 13:08:57 DEBUG : /: ReadDirAll: 
2022/03/21 13:08:57 DEBUG : /: >ReadDirAll: item=14, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0xc] Read 496
2022/03/21 13:08:57 DEBUG : fuse: <- Lookup [ID=0xe Node=0x1 Uid=0 Gid=0 Pid=1094] "FredFlix"
2022/03/21 13:08:57 DEBUG : /: Lookup: name="FredFlix"
2022/03/21 13:08:57 DEBUG : /: >Lookup: node=FredFlix/, err=<nil>
2022/03/21 13:08:57 DEBUG : FredFlix/: Attr: 
2022/03/21 13:08:57 DEBUG : FredFlix/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0xe] Lookup 0x2 gen=0 valid=1s attr={valid=1s ino=15717829353103123350 size=0 mode=drwxr-xr-x}
2022/03/21 13:08:57 DEBUG : fuse: <- Lookup [ID=0x10 Node=0x1 Uid=0 Gid=0 Pid=1094] "FredFlix2"
2022/03/21 13:08:57 DEBUG : /: Lookup: name="FredFlix2"
2022/03/21 13:08:57 DEBUG : /: >Lookup: node=FredFlix2/, err=<nil>
2022/03/21 13:08:57 DEBUG : FredFlix2/: Attr: 
2022/03/21 13:08:57 DEBUG : FredFlix2/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x10] Lookup 0x3 gen=0 valid=1s attr={valid=1s ino=10475298195129388972 size=0 mode=drwxr-xr-x}
2022/03/21 13:08:57 DEBUG : fuse: <- Lookup [ID=0x12 Node=0x1 Uid=0 Gid=0 Pid=1094] "HeathFlix"
2022/03/21 13:08:57 DEBUG : /: Lookup: name="HeathFlix"
2022/03/21 13:08:57 DEBUG : /: >Lookup: node=HeathFlix/, err=<nil>
2022/03/21 13:08:57 DEBUG : HeathFlix/: Attr: 
2022/03/21 13:08:57 DEBUG : HeathFlix/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x12] Lookup 0x4 gen=0 valid=1s attr={valid=1s ino=2004382313078097665 size=0 mode=drwxr-xr-x}
2022/03/21 13:08:57 DEBUG : fuse: <- Lookup [ID=0x14 Node=0x1 Uid=0 Gid=0 Pid=1094] "JerryFlix"
2022/03/21 13:08:57 DEBUG : /: Lookup: name="JerryFlix"
2022/03/21 13:08:57 DEBUG : /: >Lookup: node=JerryFlix/, err=<nil>
2022/03/21 13:08:57 DEBUG : JerryFlix/: Attr: 
2022/03/21 13:08:57 DEBUG : JerryFlix/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x14] Lookup 0x5 gen=0 valid=1s attr={valid=1s ino=2541705083383385871 size=0 mode=drwxr-xr-x}
2022/03/21 13:08:57 DEBUG : fuse: <- Lookup [ID=0x16 Node=0x1 Uid=0 Gid=0 Pid=1094] "JoeFlix"
2022/03/21 13:08:57 DEBUG : /: Lookup: name="JoeFlix"
2022/03/21 13:08:57 DEBUG : /: >Lookup: node=JoeFlix/, err=<nil>
2022/03/21 13:08:57 DEBUG : JoeFlix/: Attr: 
2022/03/21 13:08:57 DEBUG : JoeFlix/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x16] Lookup 0x6 gen=0 valid=1s attr={valid=1s ino=4609149152991018005 size=0 mode=drwxr-xr-x}
2022/03/21 13:08:57 DEBUG : fuse: <- Lookup [ID=0x18 Node=0x1 Uid=0 Gid=0 Pid=1094] "JohnFlix"
2022/03/21 13:08:57 DEBUG : /: Lookup: name="JohnFlix"
2022/03/21 13:08:57 DEBUG : /: >Lookup: node=JohnFlix/, err=<nil>
2022/03/21 13:08:57 DEBUG : JohnFlix/: Attr: 
2022/03/21 13:08:57 DEBUG : JohnFlix/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x18] Lookup 0x7 gen=0 valid=1s attr={valid=1s ino=13561213146864862234 size=0 mode=drwxr-xr-x}
2022/03/21 13:08:57 DEBUG : fuse: <- Lookup [ID=0x1a Node=0x1 Uid=0 Gid=0 Pid=1094] "NancePictures"
2022/03/21 13:08:57 DEBUG : /: Lookup: name="NancePictures"
2022/03/21 13:08:57 DEBUG : /: >Lookup: node=NancePictures/, err=<nil>
2022/03/21 13:08:57 DEBUG : NancePictures/: Attr: 
2022/03/21 13:08:57 DEBUG : NancePictures/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x1a] Lookup 0x8 gen=0 valid=1s attr={valid=1s ino=3128445353867814786 size=0 mode=drwxr-xr-x}
2022/03/21 13:08:57 DEBUG : fuse: <- Lookup [ID=0x1c Node=0x1 Uid=0 Gid=0 Pid=1094] "NeilFlix"
2022/03/21 13:08:57 DEBUG : /: Lookup: name="NeilFlix"
2022/03/21 13:08:57 DEBUG : /: >Lookup: node=NeilFlix/, err=<nil>
2022/03/21 13:08:57 DEBUG : NeilFlix/: Attr: 
2022/03/21 13:08:57 DEBUG : NeilFlix/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x1c] Lookup 0x9 gen=0 valid=1s attr={valid=1s ino=14237185541713144927 size=0 mode=drwxr-xr-x}
2022/03/21 13:08:57 DEBUG : fuse: <- Lookup [ID=0x1e Node=0x1 Uid=0 Gid=0 Pid=1094] "PatFlix"
2022/03/21 13:08:57 DEBUG : /: Lookup: name="PatFlix"
2022/03/21 13:08:57 DEBUG : /: >Lookup: node=PatFlix/, err=<nil>
2022/03/21 13:08:57 DEBUG : PatFlix/: Attr: 
2022/03/21 13:08:57 DEBUG : PatFlix/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x1e] Lookup 0xa gen=0 valid=1s attr={valid=1s ino=4229336788032598706 size=0 mode=drwxr-xr-x}
2022/03/21 13:08:57 DEBUG : fuse: <- Lookup [ID=0x20 Node=0x1 Uid=0 Gid=0 Pid=1094] "PatrickFlix"
2022/03/21 13:08:57 DEBUG : /: Lookup: name="PatrickFlix"
2022/03/21 13:08:57 DEBUG : /: >Lookup: node=PatrickFlix/, err=<nil>
2022/03/21 13:08:57 DEBUG : PatrickFlix/: Attr: 
2022/03/21 13:08:57 DEBUG : PatrickFlix/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x20] Lookup 0xb gen=0 valid=1s attr={valid=1s ino=11662927799686620677 size=0 mode=drwxr-xr-x}
2022/03/21 13:08:57 DEBUG : fuse: <- Lookup [ID=0x22 Node=0x1 Uid=0 Gid=0 Pid=1094] "TBFlix"
2022/03/21 13:08:57 DEBUG : /: Lookup: name="TBFlix"
2022/03/21 13:08:57 DEBUG : /: >Lookup: node=TBFlix/, err=<nil>
2022/03/21 13:08:57 DEBUG : TBFlix/: Attr: 
2022/03/21 13:08:57 DEBUG : TBFlix/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x22] Lookup 0xc gen=0 valid=1s attr={valid=1s ino=6872414671183487661 size=0 mode=drwxr-xr-x}
2022/03/21 13:08:57 DEBUG : fuse: <- Lookup [ID=0x24 Node=0x1 Uid=0 Gid=0 Pid=1094] "testfile.txt"
2022/03/21 13:08:57 DEBUG : /: Lookup: name="testfile.txt"
2022/03/21 13:08:57 DEBUG : /: >Lookup: node=testfile.txt, err=<nil>
2022/03/21 13:08:57 DEBUG : testfile.txt: Attr: 
2022/03/21 13:08:57 DEBUG : testfile.txt: >Attr: a=valid=1s ino=0 size=20 mode=-rw-r--r--, err=<nil>
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x24] Lookup 0xd gen=0 valid=1s attr={valid=1s ino=4456035514278291952 size=20 mode=-rw-r--r--}
2022/03/21 13:08:57 DEBUG : fuse: <- Read [ID=0x26 Node=0x1 Uid=0 Gid=0 Pid=1094] 0x1 4096 @0x1f0 dir=true fl=0 owner=0 ffl=OpenReadOnly+OpenDirectory+OpenNonblock
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x26] Read 0
2022/03/21 13:08:57 DEBUG : fuse: <- Release [ID=0x28 Node=0x1 Uid=0 Gid=0 Pid=0] 0x1 fl=OpenReadOnly+OpenDirectory+OpenNonblock rfl=0 owner=0
2022/03/21 13:08:57 DEBUG : fuse: -> [ID=0x28] Release

2022/03/21 13:09:28 DEBUG : rc: "operations/copyfile": with parameters map[dstFs:/home dstRemote:testfile1.txt srcFs:GD-crypt: srcRemote:testfile.txt]
2022/03/21 13:09:28 DEBUG : Creating backend with remote "/home"
2022/03/21 13:09:28 DEBUG : testfile.txt: Need to transfer - File not found at Destination
2022/03/21 13:09:28 INFO  : testfile.txt: Copied (new) to: testfile1.txt
2022/03/21 13:09:28 DEBUG : rc: "operations/copyfile": reply map[]: <nil>

2022/03/21 13:09:50 DEBUG : rc: "operations/copyfile": with parameters map[dstFs:/home dstRemote:testfile2.txt srcFs:/Plex srcRemote:testfile.txt]
2022/03/21 13:09:50 DEBUG : Creating backend with remote "/Plex"
2022/03/21 13:09:50 DEBUG : fuse: <- Getattr [ID=0x2a Node=0x1 Uid=0 Gid=0 Pid=1078] 0x0 fl=0
2022/03/21 13:09:50 DEBUG : /: Attr: 
2022/03/21 13:09:50 DEBUG : /: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/21 13:09:50 DEBUG : fuse: -> [ID=0x2a] Getattr valid=1s ino=1 size=0 mode=drwxr-xr-x
2022/03/21 13:09:50 DEBUG : fuse: <- Lookup [ID=0x2c Node=0x1 Uid=0 Gid=0 Pid=1078] "testfile.txt"
2022/03/21 13:09:50 DEBUG : /: Lookup: name="testfile.txt"
2022/03/21 13:09:50 DEBUG : /: >Lookup: node=testfile.txt, err=<nil>
2022/03/21 13:09:50 DEBUG : testfile.txt: Attr: 
2022/03/21 13:09:50 DEBUG : testfile.txt: >Attr: a=valid=1s ino=0 size=20 mode=-rw-r--r--, err=<nil>
2022/03/21 13:09:50 DEBUG : fuse: -> [ID=0x2c] Lookup 0xd gen=0 valid=1s attr={valid=1s ino=4456035514278291952 size=20 mode=-rw-r--r--}
2022/03/21 13:09:50 DEBUG : testfile.txt: Need to transfer - File not found at Destination
2022/03/21 13:09:50 DEBUG : fuse: <- Open [ID=0x2e Node=0xd Uid=0 Gid=0 Pid=1078] dir=false fl=OpenReadOnly
2022/03/21 13:09:50 DEBUG : testfile.txt: Open: flags=OpenReadOnly
2022/03/21 13:09:50 DEBUG : testfile.txt: Open: flags=O_RDONLY
2022/03/21 13:09:50 DEBUG : testfile.txt: >Open: fd=testfile.txt (r), err=<nil>
2022/03/21 13:09:50 DEBUG : testfile.txt: >Open: fh=&{testfile.txt (r)}, err=<nil>
2022/03/21 13:09:50 DEBUG : fuse: -> [ID=0x2e] Open 0x1 fl=0

Just ran another test taking crypt out of the picture. I mounted the unencrypted Google Drive (GD:) at /Plex and it still locks up.

This sounds like it has deadlocked.

Can you post the result of this when it has locked up

curl http://localhost:5572/debug/pprof/goroutine?debug=1

If that doesn't work then lock it up then kill -QUIT rclone and collect the (very long) backtrace and post it here.

So are you saying it should work doing the copyfile against the mount point path instead of the drive?

The curl never responds. The kill doesn't work either. the rclone mount appears to be in a zombie state. I could not kill it.

I am a relative newbie when it comes to linux. Can you point me to instructions to collect the backtrace? I can google it. But I just want to make sure I'm collecting what you are asking for.

If you kill -QUIT , it'll spit something like this.

I found the process ID from a ps -ef | grep rclone

ps -efelix@gemini:~$ ps -ef | grep rclone
felix     146785    9567  0 15:59 pts/0    00:00:00 rclone mount GD: /home/felix/t
felix     146905  146886  0 15:59 pts/1    00:00:00 grep --color=auto rclone
felix@gemini:~$ kill -QUIT 146785

and if you kill -QUIT it, you'll get something like.

and on my other window, it spits out all the output:

felix@gemini:~$ rclone mount GD: /home/felix/t
SIGQUIT: quit
PC=0x46af41 m=0 sigcode=0

goroutine 0 [idle]:
runtime.futex()
	runtime/sys_linux_amd64.s:519 +0x21
runtime.futexsleep(0xc000064800, 0x0, 0x0)
	runtime/os_linux.go:44 +0x36
runtime.notesleep(0x2d06ab0)
	runtime/lock_futex.go:160 +0x87
runtime.mPark()
	runtime/proc.go:1441 +0x2a
runtime.stopm()
	runtime/proc.go:2408 +0x78
runtime.findrunnable()
	runtime/proc.go:2984 +0x865
runtime.schedule()
	runtime/proc.go:3367 +0x239
runtime.park_m(0xc0000001a0)
	runtime/proc.go:3516 +0x14d
runtime.mcall()
	runtime/asm_amd64.s:307 +0x43

goroutine 1 [select]:
github.com/rclone/rclone/cmd/mountlib.(*MountPoint).Wait(0xc0000db680)
	github.com/rclone/rclone/cmd/mountlib/mount.go:293 +0x285
github.com/rclone/rclone/cmd/mountlib.NewMountCommand.func1(0xc000376000, {0xc0002fbec0, 0x2, 0x2})
	github.com/rclone/rclone/cmd/mountlib/mount.go:183 +0x298
github.com/spf13/cobra.(*Command).execute(0xc000376000, {0xc0002fbe60, 0x2, 0x2})
	github.com/spf13/cobra@v1.2.1/command.go:860 +0x5f8
github.com/spf13/cobra.(*Command).ExecuteC(0x2ce6e20)
	github.com/spf13/cobra@v1.2.1/command.go:974 +0x3bc
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/cobra@v1.2.1/command.go:902
github.com/rclone/rclone/cmd.Main()
	github.com/rclone/rclone/cmd/cmd.go:553 +0x76
main.main()
	github.com/rclone/rclone/rclone.go:14 +0x17

goroutine 6 [select]:
go.opencensus.io/stats/view.(*worker).start(0xc000187900)
	go.opencensus.io@v0.23.0/stats/view/worker.go:276 +0xb9
created by go.opencensus.io/stats/view.init.0
	go.opencensus.io@v0.23.0/stats/view/worker.go:34 +0x92

goroutine 26 [syscall]:
os/signal.signal_recv()
	runtime/sigqueue.go:169 +0x98
os/signal.loop()
	os/signal/signal_unix.go:24 +0x19
created by os/signal.Notify.func1.1
	os/signal/signal.go:151 +0x2c

goroutine 16 [IO wait]:
internal/poll.runtime_pollWait(0x7f4a56470608, 0x72)
	runtime/netpoll.go:234 +0x89
internal/poll.(*pollDesc).wait(0xc000ad8500, 0xc000b06000, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000ad8500, {0xc000b06000, 0x1645, 0x1645})
	internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc000ad8500, {0xc000b06000, 0xc000d56640, 0xc000b06005})
	net/fd_posix.go:56 +0x29
net.(*conn).Read(0xc0001480b0, {0xc000b06000, 0xc000b06005, 0x381})
	net/net.go:183 +0x45
github.com/rclone/rclone/fs/fshttp.(*timeoutConn).Read(0xc00000c5d0, {0xc000b06000, 0x0, 0x100000000000000})
	github.com/rclone/rclone/fs/fshttp/dialer.go:104 +0x32
crypto/tls.(*atLeastReader).Read(0xc000d47500, {0xc000b06000, 0x0, 0x409dcd})
	crypto/tls/conn.go:777 +0x3d
bytes.(*Buffer).ReadFrom(0xc000178278, {0x1edd960, 0xc000d47500})
	bytes/buffer.go:204 +0x98
crypto/tls.(*Conn).readFromUntil(0xc000178000, {0x7f4a5642e218, 0xc00000c5d0}, 0x0)
	crypto/tls/conn.go:799 +0xe5
crypto/tls.(*Conn).readRecordOrCCS(0xc000178000, 0x0)
	crypto/tls/conn.go:606 +0x112
crypto/tls.(*Conn).readRecord(...)
	crypto/tls/conn.go:574
crypto/tls.(*Conn).Read(0xc000178000, {0xc000151000, 0x1000, 0x0})
	crypto/tls/conn.go:1277 +0x16f
net/http.(*persistConn).Read(0xc000a88900, {0xc000151000, 0xc000d36cc0, 0xc000d8ed30})
	net/http/transport.go:1926 +0x4e
bufio.(*Reader).fill(0xc000d6b1a0)
	bufio/bufio.go:101 +0x103
bufio.(*Reader).Peek(0xc000d6b1a0, 0x1)
	bufio/bufio.go:139 +0x5d
net/http.(*persistConn).readLoop(0xc000a88900)
	net/http/transport.go:2087 +0x1ac
created by net/http.(*Transport).dialConn
	net/http/transport.go:1747 +0x1e05

goroutine 65 [select]:
net/http.(*persistConn).writeLoop(0xc000a88900)
	net/http/transport.go:2386 +0xfb
created by net/http.(*Transport).dialConn
	net/http/transport.go:1748 +0x1e65

goroutine 81 [select]:
github.com/rclone/rclone/backend/drive.(*Fs).ChangeNotify.func1()
	github.com/rclone/rclone/backend/drive/drive.go:2755 +0x158
created by github.com/rclone/rclone/backend/drive.(*Fs).ChangeNotify
	github.com/rclone/rclone/backend/drive/drive.go:2746 +0xd6

goroutine 83 [chan receive]:
github.com/rclone/rclone/lib/atexit.Register.func1.1()
	github.com/rclone/rclone/lib/atexit/atexit.go:45 +0x2f
created by github.com/rclone/rclone/lib/atexit.Register.func1
	github.com/rclone/rclone/lib/atexit/atexit.go:44 +0x6d

goroutine 24 [syscall]:
syscall.Syscall(0x0, 0xc, 0xc000e80000, 0x21000)
	syscall/asm_linux_amd64.s:20 +0x5
syscall.read(0x2cd7da0, {0xc000e80000, 0x0, 0x0})
	syscall/zsyscall_linux_amd64.go:687 +0x4d
syscall.Read(...)
	syscall/syscall_unix.go:189
bazil.org/fuse.(*Conn).ReadRequest(0xc000d6b560)
	bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fuse.go:580 +0xc6
bazil.org/fuse/fs.(*Server).Serve(0xc000b54000, {0x1ee0b60, 0xc000b4c060})
	bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fs/serve.go:501 +0x3d5
github.com/rclone/rclone/cmd/mount.mount.func2()
	github.com/rclone/rclone/cmd/mount/mount.go:100 +0x3e
created by github.com/rclone/rclone/cmd/mount.mount
	github.com/rclone/rclone/cmd/mount/mount.go:99 +0x373

rax    0xca
rbx    0x0
rcx    0x46af43
rdx    0x0
rdi    0x2d06ab0
rsi    0x80
rbp    0x7ffedf0476e8
rsp    0x7ffedf0476a0
r8     0x0
r9     0x0
r10    0x0
r11    0x286
r12    0xc000062000
r13    0x0
r14    0x2d06240
r15    0x0
rip    0x46af41
rflags 0x286
cs     0x33
fs     0x0
gs     0x0
felix@gemini:~$

Yes it should I think.

Is it in state D?

That means it is locked up in the kernel probably :frowning: If you fusermount -z -u /path/to/mountpoint you may be able to release it.

Probably what happened is something like this

  • rclone starts the copyfile
  • the copyfile takes a lock
  • copyfile starts using the mount
  • the kernel calls rclone back via FUSE
  • rclone then tries to do something which involves taking the lock
  • however the lock is already held so it blocks
  • DEADLOCK!

Can you build rclone yourself? It would be worth trying to build rclone with go build -tags cmount which will make the rclone cmount command available. This is an alternative rclone mount command which uses libfuse directly, so you use rclone cmount instead of rclone mount.

I wonder if there is some FUSE reason why this doesn't work that I don't know about - maybe fuse mounts can't read their own filing systems for some kernel reason. That might make more sense copyfile taking a lock because I don't think it takes any locks...

When I issue the rc operations/copyfile, the status does not change. But I never get a completion of the command.
When I issue the fusermount to unmount it, nothing happens.
When I issue the kill -QUIT, the status goes to Z. But the rclone mount never terminates. I do not get the backtrace either. I don't get any output from the kill command.

I can probably figure out how to build rclone. But I would rather continue to use your standard builds.

I have two VPS servers at different host sites. One is a KVM. So I am able to restart my server when it gets hung up like this. But the other one is on a cheaper VPS that is not KVM. They have gotten upset with me in the past when this hang would occur because they have to reboot the entire node in order to free it up thus affecting other customers.

On my non-KVM VPS, this problem started after I upgraded rclone beyond 1.53.3. So I have left it there for quite a while.
On the KVM one, it used to work with 1.53.3. But now it doesn't.

Have you been able to recreate this scenario on your test server? I'm curious if it is the level of Ubuntu. Although I have not installed any updates to Ubuntu.

OK. It is definitely locked up in the kernel.

It would be useful for a test to try to narrow the problem down.

I've tried this with rclone latest on my laptop

rclone v1.59.0-beta.6039.7df9e3f70.fix-vfs-fast-fingerprint

  • os/version: ubuntu 21.10 (64 bit)
  • os/kernel: 5.13.0-30-generic (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.17.2
  • go/linking: dynamic
  • go/tags: none

And I haven't managed to get it to lock up.

It could be a kernel problem which would explain the lockup. Did you see if there was anything in dmesg after the lockup? Maybe there is a kernel oops in there?

I did not find anything in dmesg.

I found the kill -QUIT output. Didn't think to look in the rclone log. :frowning:

2022/03/23 09:35:37 DEBUG : rclone: Version "v1.58.0" starting with parameters ["/usr/bin/rclone" "mount" "GD:" "/Plex" "--rc" "--rc-addr=:5572" "--rc-user=XXXX" "--rc-pass=XXXX" "--debug-fuse" "--log-level=DEBUG" "--log-file=/home/rclonelog.txt"]
2022/03/23 09:35:37 INFO  : Using --user XXXX --pass XXXX as authenticated user
2022/03/23 09:35:37 NOTICE: Serving remote control on http://[::]:5572/
2022/03/23 09:35:37 DEBUG : Creating backend with remote "GD:"
2022/03/23 09:35:37 DEBUG : Using config file from "/root/.config/rclone/rclone.conf"
2022/03/23 09:35:38 DEBUG : Google drive root '': Mounting on "/Plex"
2022/03/23 09:35:38 DEBUG : : Root: 
2022/03/23 09:35:38 DEBUG : : >Root: node=/, err=<nil>
2022/03/23 09:36:25 DEBUG : rc: "operations/copyfile": with parameters map[dstFs:/home dstRemote:testfile2.txt srcFs:/Plex srcRemote:testfile.txt]
2022/03/23 09:36:25 DEBUG : Creating backend with remote "/Plex"
2022/03/23 09:36:25 DEBUG : fuse: <- Getattr [ID=0x4 Node=0x1 Uid=0 Gid=0 Pid=911] 0x0 fl=0
2022/03/23 09:36:25 DEBUG : /: Attr: 
2022/03/23 09:36:25 DEBUG : /: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/03/23 09:36:25 DEBUG : fuse: -> [ID=0x4] Getattr valid=1s ino=1 size=0 mode=drwxr-xr-x
2022/03/23 09:36:25 DEBUG : Creating backend with remote "/home"
2022/03/23 09:36:25 DEBUG : fuse: <- Lookup [ID=0x6 Node=0x1 Uid=0 Gid=0 Pid=911] "testfile.txt"
2022/03/23 09:36:25 DEBUG : /: Lookup: name="testfile.txt"
2022/03/23 09:36:26 DEBUG : /: >Lookup: node=testfile.txt, err=<nil>
2022/03/23 09:36:26 DEBUG : testfile.txt: Attr: 
2022/03/23 09:36:26 DEBUG : testfile.txt: >Attr: a=valid=1s ino=0 size=20 mode=-rw-r--r--, err=<nil>
2022/03/23 09:36:26 DEBUG : fuse: -> [ID=0x6] Lookup 0x2 gen=0 valid=1s attr={valid=1s ino=4456035514278291952 size=20 mode=-rw-r--r--}
2022/03/23 09:36:26 DEBUG : testfile.txt: Need to transfer - File not found at Destination
2022/03/23 09:36:26 DEBUG : fuse: <- Open [ID=0x8 Node=0x2 Uid=0 Gid=0 Pid=911] dir=false fl=OpenReadOnly
2022/03/23 09:36:26 DEBUG : testfile.txt: Open: flags=OpenReadOnly
2022/03/23 09:36:26 DEBUG : testfile.txt: Open: flags=O_RDONLY
2022/03/23 09:36:26 DEBUG : testfile.txt: >Open: fd=testfile.txt (r), err=<nil>
2022/03/23 09:36:26 DEBUG : testfile.txt: >Open: fh=&{testfile.txt (r)}, err=<nil>
2022/03/23 09:36:26 DEBUG : fuse: -> [ID=0x8] Open 0x1 fl=0
SIGQUIT: quit
PC=0x46af41 m=0 sigcode=0

goroutine 34 [syscall]:
runtime.notetsleepg(0x43f265, 0x790de5)
	runtime/lock_futex.go:236 +0x34 fp=0xc0000637a0 sp=0xc000063768 pc=0x40bb54
os/signal.signal_recv()
	runtime/sigqueue.go:169 +0x98 fp=0xc0000637c0 sp=0xc0000637a0 pc=0x4657b8
os/signal.loop()
	os/signal/signal_unix.go:24 +0x19 fp=0xc0000637e0 sp=0xc0000637c0 pc=0x75dfb9
runtime.goexit()
	runtime/asm_amd64.s:1581 +0x1 fp=0xc0000637e8 sp=0xc0000637e0 pc=0x469141
created by os/signal.Notify.func1.1
	os/signal/signal.go:151 +0x2c

goroutine 1 [select]:
github.com/rclone/rclone/cmd/mountlib.(*MountPoint).Wait(0xc000393c80)
	github.com/rclone/rclone/cmd/mountlib/mount.go:293 +0x285
github.com/rclone/rclone/cmd/mountlib.NewMountCommand.func1(0xc0002e6500, {0xc000567b90, 0x9, 0x9})
	github.com/rclone/rclone/cmd/mountlib/mount.go:183 +0x298
github.com/spf13/cobra.(*Command).execute(0xc0002e6500, {0xc000567b00, 0x9, 0x9})
	github.com/spf13/cobra@v1.2.1/command.go:860 +0x5f8
github.com/spf13/cobra.(*Command).ExecuteC(0x2ce6e20)
	github.com/spf13/cobra@v1.2.1/command.go:974 +0x3bc
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/cobra@v1.2.1/command.go:902
github.com/rclone/rclone/cmd.Main()
	github.com/rclone/rclone/cmd/cmd.go:553 +0x76
main.main()
	github.com/rclone/rclone/rclone.go:14 +0x17

goroutine 9 [select]:
go.opencensus.io/stats/view.(*worker).start(0xc0000f5a00)
	go.opencensus.io@v0.23.0/stats/view/worker.go:276 +0xb9
created by go.opencensus.io/stats/view.init.0
	go.opencensus.io@v0.23.0/stats/view/worker.go:34 +0x92

goroutine 19 [select]:
github.com/rclone/rclone/backend/drive.(*Fs).ChangeNotify.func1()
	github.com/rclone/rclone/backend/drive/drive.go:2755 +0x158
created by github.com/rclone/rclone/backend/drive.(*Fs).ChangeNotify
	github.com/rclone/rclone/backend/drive/drive.go:2746 +0xd6

goroutine 18 [IO wait]:
internal/poll.runtime_pollWait(0x7f07d2a49608, 0x72)
	runtime/netpoll.go:234 +0x89
internal/poll.(*pollDesc).wait(0xc0000f5880, 0x203000, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc0000f5880)
	internal/poll/fd_unix.go:402 +0x22c
net.(*netFD).accept(0xc0000f5880)
	net/fd_unix.go:173 +0x35
net.(*TCPListener).accept(0xc0002a4ee8)
	net/tcpsock_posix.go:140 +0x28
net.(*TCPListener).Accept(0xc0002a4ee8)
	net/tcpsock.go:262 +0x3d
net/http.(*Server).Serve(0xc0002da380, {0x1f05410, 0xc0002a4ee8})
	net/http/server.go:3002 +0x394
github.com/rclone/rclone/cmd/serve/httplib.(*Server).Serve.func1()
	github.com/rclone/rclone/cmd/serve/httplib/httplib.go:352 +0x318
created by github.com/rclone/rclone/cmd/serve/httplib.(*Server).Serve
	github.com/rclone/rclone/cmd/serve/httplib/httplib.go:328 +0x127

goroutine 32 [runnable]:
syscall.Syscall(0x0, 0xf, 0xc000a00000, 0x21000)
	syscall/asm_linux_amd64.s:20 +0x5
syscall.read(0x2cd7da0, {0xc000a00000, 0xc, 0x0})
	syscall/zsyscall_linux_amd64.go:687 +0x4d
syscall.Read(...)
	syscall/syscall_unix.go:189
bazil.org/fuse.(*Conn).ReadRequest(0xc0009b2c00)
	bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fuse.go:580 +0xc6
bazil.org/fuse/fs.(*Server).Serve(0xc00098b0a0, {0x1ee0b60, 0xc000999cb0})
	bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fs/serve.go:501 +0x3d5
github.com/rclone/rclone/cmd/mount.mount.func2()
	github.com/rclone/rclone/cmd/mount/mount.go:100 +0x3e
created by github.com/rclone/rclone/cmd/mount.mount
	github.com/rclone/rclone/cmd/mount/mount.go:99 +0x373

goroutine 28 [IO wait]:
internal/poll.runtime_pollWait(0x7f07d2a496f0, 0x72)
	runtime/netpoll.go:234 +0x89
internal/poll.(*pollDesc).wait(0xc0004d4700, 0xc0001ea800, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc0004d4700, {0xc0001ea800, 0x1644, 0x1644})
	internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc0004d4700, {0xc0001ea800, 0xc0007a7400, 0xc0001ea805})
	net/fd_posix.go:56 +0x29
net.(*conn).Read(0xc0006cc210, {0xc0001ea800, 0xc0001ea805, 0x28d})
	net/net.go:183 +0x45
github.com/rclone/rclone/fs/fshttp.(*timeoutConn).Read(0xc00000c9f0, {0xc0001ea800, 0x0, 0x100000000000000})
	github.com/rclone/rclone/fs/fshttp/dialer.go:104 +0x32
crypto/tls.(*atLeastReader).Read(0xc00098d890, {0xc0001ea800, 0x0, 0x409dcd})
	crypto/tls/conn.go:777 +0x3d
bytes.(*Buffer).ReadFrom(0xc00074a978, {0x1edd960, 0xc00098d890})
	bytes/buffer.go:204 +0x98
crypto/tls.(*Conn).readFromUntil(0xc00074a700, {0x7f07d287da98, 0xc00000c9f0}, 0x1644)
	crypto/tls/conn.go:799 +0xe5
crypto/tls.(*Conn).readRecordOrCCS(0xc00074a700, 0x0)
	crypto/tls/conn.go:606 +0x112
crypto/tls.(*Conn).readRecord(...)
	crypto/tls/conn.go:574
crypto/tls.(*Conn).Read(0xc00074a700, {0xc000570000, 0x1000, 0x0})
	crypto/tls/conn.go:1277 +0x16f
net/http.(*persistConn).Read(0xc000738ea0, {0xc000570000, 0xc0007e6960, 0xc00076dd30})
	net/http/transport.go:1926 +0x4e
bufio.(*Reader).fill(0xc0009b2900)
	bufio/bufio.go:101 +0x103
bufio.(*Reader).Peek(0xc0009b2900, 0x1)
	bufio/bufio.go:139 +0x5d
net/http.(*persistConn).readLoop(0xc000738ea0)
	net/http/transport.go:2087 +0x1ac
created by net/http.(*Transport).dialConn
	net/http/transport.go:1747 +0x1e05

goroutine 29 [select]:
net/http.(*persistConn).writeLoop(0xc000738ea0)
	net/http/transport.go:2386 +0xfb
created by net/http.(*Transport).dialConn
	net/http/transport.go:1748 +0x1e65

goroutine 35 [chan receive]:
github.com/rclone/rclone/lib/atexit.Register.func1.1()
	github.com/rclone/rclone/lib/atexit/atexit.go:45 +0x2f
created by github.com/rclone/rclone/lib/atexit.Register.func1
	github.com/rclone/rclone/lib/atexit/atexit.go:44 +0x6d

goroutine 36 [running]:
	goroutine running on other thread; stack unavailable
created by net/http.(*Server).Serve
	net/http/server.go:3034 +0x4e8

goroutine 37 [IO wait]:
internal/poll.runtime_pollWait(0x7f07d2a49350, 0x72)
	runtime/netpoll.go:234 +0x89
internal/poll.(*pollDesc).wait(0xc0004d4900, 0xc000999ea1, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc0004d4900, {0xc000999ea1, 0x1, 0x1})
	internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc0004d4900, {0xc000999ea1, 0x6548c0, 0xc00074a724})
	net/fd_posix.go:56 +0x29
net.(*conn).Read(0xc0006cc620, {0xc000999ea1, 0xc000177310, 0xc000063fd0})
	net/net.go:183 +0x45
net/http.(*connReader).backgroundRead(0xc000999e90)
	net/http/server.go:672 +0x3f
created by net/http.(*connReader).startBackgroundRead
	net/http/server.go:668 +0xcf

goroutine 42 [runnable]:
github.com/rclone/rclone/fs/accounting.(*StatsInfo).startAverageLoop.func1·dwrap·21()
	github.com/rclone/rclone/fs/accounting/stats.go:341
runtime.goexit()
	runtime/asm_amd64.s:1581 +0x1
created by github.com/rclone/rclone/fs/accounting.(*StatsInfo).startAverageLoop.func1
	github.com/rclone/rclone/fs/accounting/stats.go:341 +0x72

goroutine 43 [runnable]:
syscall.Syscall(0x1, 0xf, 0xc0007f8500, 0x20)
	syscall/asm_linux_amd64.s:20 +0x5
syscall.write(0xc0009ef980, {0xc0007f8500, 0x1, 0x50})
	syscall/zsyscall_linux_amd64.go:915 +0x4d
syscall.Write(...)
	syscall/syscall_unix.go:214
bazil.org/fuse.(*Conn).writeToKernel(0x10, {0xc0007f8500, 0x20, 0x20})
	bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fuse.go:1165 +0xea
bazil.org/fuse.(*Conn).respond(0x44ce92, {0xc0007f8500, 0x173b200, 0x1})
	bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fuse.go:1178 +0x25
bazil.org/fuse.(*Header).respond(0xc000a7c180, {0xc0007f8500, 0xc00071dea0, 0x0})
	bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fuse.go:310 +0x45
bazil.org/fuse.(*OpenRequest).Respond(0xc000a7c180, 0xc00071dea0)
	bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fuse.go:1808 +0x97
bazil.org/fuse/fs.(*Server).handleRequest(0xc00098b0a0, {0x1f1d250, 0xc000a7c1c0}, {0x1ee0b80, 0xc0009e70f0}, 0xc000a7c140, {0x1efed80, 0xc000a7c180}, 0xc0009eff08)
	bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fs/serve.go:1233 +0x166c
bazil.org/fuse/fs.(*Server).serve(0xc00098b0a0, {0x1efed80, 0xc000a7c180})
	bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fs/serve.go:1015 +0x62c
bazil.org/fuse/fs.(*Server).Serve.func1()
	bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fs/serve.go:512 +0x69
created by bazil.org/fuse/fs.(*Server).Serve
	bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fs/serve.go:510 +0x3c5

rax    0xca
rbx    0x0
rcx    0x46af43
rdx    0x0
rdi    0x2d37180
rsi    0x80
rbp    0xc000063718
rsp    0xc0000636d0
r8     0x0
r9     0x0
r10    0x0
r11    0x286
r12    0x7ffd3ff2ced0
r13    0x0
r14    0xc0005511e0
r15    0xffffffffffffffff
rip    0x46af41
rflags 0x286
cs     0x33
fs     0x0
gs     0x0

That looks like it is deadlocked in the kernel or in FUSE. Everything looks normal, but the read from the fuse mount and the write of the data to the kernel has deadlocked.

It would be worth trying the rclone cmount if you can to see if that helps.

Also can you try a newer Ubuntu VM to see if that makes a difference?

I'm somewhat confused on the workflow here.

I've been using vfs-cache-mode full now for weeks removing mergerfs from my setup to simplify things and have been writing files after files with no issue.

I have a local disk and I copy from local disk to rclone mounts probably 50-100 times per day without every having an issues.

Is that the same workflow as above or something different? I'm on Ubuntu 20.04 LTS:

rclone version
rclone v1.58.0
- os/version: ubuntu 20.04 (64 bit)
- os/kernel: 5.13.0-37-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.17.8
- go/linking: static
- go/tags: none

I am trying to copy from the rclone mount to local disk.

I upgraded to 21.10, pretty much same level that Nick tested on. It still hangs.

rclone v1.58.0
- os/version: ubuntu 21.10 (64 bit)
- os/kernel: 5.13.0-37-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.17.8
- go/linking: static
- go/tags: none

So the difference then is local machine versus the VPS.

I will try the cmount later today.

1 Like

That's even more confusing as it's reading a file and not writing.

Not that it shouldn't be fixed, but why read from the mount and not directly from the remote? It adds a layer there that isn't needed (apologies if that's answered as I did not read through the whole thread).

When I wrote the multi-server version of Tautulli, I wrote in an rclone monitor by periodically copying a test file from the mount point to the local disk to verify that the rclone mount is up for Plex to use.

I've never had a problem with my rclone mounts to have the need for something like that.

I integrate all my stuff with systemd/services. If the service fails, all the services requiring it would be shutdown.

So for plex, I use an override like:

felix@gemini:/etc/systemd/system/plexmediaserver.service.d$ cat override.conf
[Unit]
After=rclone-movies.service rclone-tv.service
Requires=rclone-movies.service rclone-tv.service

[Service]
User=felix
Group=felix

I have a single server setup that only I use so I run everything as the same user(simple) and if either rclone service goes down, plex would get shutdown.

So all my services (Sonarr/Radarr/etc) have that require as it makes life much easier.

That being said, there should be no problem with copying a file from the mount to local as I don't want to obfuscate the issue.

Only thing I may suggest is using the HWE kernel rather than the older one but still, that shouldn't break it that badly..

cmount works with no deadlocks.

May be this bug: Process hangs when writing to mount point served by the same process · Issue #264 · bazil/fuse · GitHub

Which is caused by this: os, runtime: Go 1.9 assumes epoll is non-blocking · Issue #21014 · golang/go · GitHub

That sounds very promising. Tomorrow, I will try setting up my rclone mount and rclone rc on separate processes and see if that makes any difference.
Thanks for the find.

Very interesting

That describes almost perfectly the problems we have with running the mount tests which lock up on the continuous integration server locking up with GOMAXPROCS=1

To fix this the bazil/fuse project has invented a framework for running the tests in a subprocess - this idea would be relatively easy to adopt to run the mount in a subprocess for the tests...

Anyway back to the topic.

@zSeriesGuy this gives the definitive answer - no you can't access the contents of the rclone mount from the rclone process without risking a deadlock. However you can use rclone cmount since that doesn't use the same mechanism (it uses CGO) so doesn't suffer from the same deadlock problems.

I would expect that to work :slight_smile:

Should the fix from go-fuse also be added to rclone to prevent other users from running into the same issue?

References: