Can't gracefully close rclone mount

Summertime · June 26, 2022, 12:16am

What is the problem you are having with rclone?

When sending sigterm to rclone mount:

2022/06/26 09:30:57 INFO  : Signal received: interrupt
2022/06/26 09:30:57 ERROR : B2: Unmounted rclone mount
2022/06/26 09:30:57 INFO  : Exiting...

and rclone exits with code 143.

Run the command 'rclone version' and share the full output of the command.

$ rclone version
rclone v1.57.0-DEV
- os/version: fedora 36 (64 bit)
- os/kernel: 5.17.13-300.fc36.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.18beta1
- go/linking: dynamic
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

This occurs for both B2 and Dropbox

The command you were trying to run (eg `rclone copy /tmp remote:tmp`)

rclone --log-level DEBUG mount --allow-non-empty B2: B2

The rclone config contents with secrets removed.

[B2]
type = crypt
remote = :b2,account=xxx,key=xxx:xxx
password = xxx

[Dropbox]
type = dropbox
token = {"access_token":"xxx","token_type":"bearer","expiry":"xxx"}

A log from the command with the `-vv` flag

2022/06/26 09:29:23 DEBUG : rclone: Version "v1.57.0-DEV" starting with parameters ["rclone" "--log-level" "DEBUG" "mount" "B2:" "B2"]
2022/06/26 09:29:23 DEBUG : Creating backend with remote "B2:"
2022/06/26 09:29:23 DEBUG : Using config file from "/var/home/xxx/.config/rclone/rclone.conf"
2022/06/26 09:29:23 DEBUG : Creating backend with remote ":b2,account=xxx,key=xxx:xxx"
2022/06/26 09:29:23 DEBUG : :b2: detected overridden config - adding "{sNRZI}" suffix to name
2022/06/26 09:29:24 DEBUG : Couldn't decode error response: EOF
2022/06/26 09:29:24 DEBUG : fs cache: renaming cache item ":b2,account=xxx,key=xxx:xxx" to be canonical ":b2{sNRZI}:xxx"
2022/06/26 09:29:24 INFO  : Encrypted drive 'B2:': poll-interval is not supported by this remote
2022/06/26 09:29:24 DEBUG : Encrypted drive 'B2:': Mounting on "B2"
2022/06/26 09:29:24 DEBUG : : Root: 
2022/06/26 09:29:24 DEBUG : : >Root: node=/, err=<nil>
2022/06/26 09:29:24 DEBUG : /: Lookup: name=".Trash"
2022/06/26 09:29:27 DEBUG : /: >Lookup: node=<nil>, err=no such file or directory
2022/06/26 09:29:27 DEBUG : /: Attr: 
2022/06/26 09:29:27 DEBUG : /: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2022/06/26 09:29:27 DEBUG : /: Lookup: name="BDMV"
2022/06/26 09:29:27 DEBUG : /: >Lookup: node=<nil>, err=no such file or directory
2022/06/26 09:29:27 DEBUG : /: Lookup: name=".xdg-volume-info"
2022/06/26 09:29:27 DEBUG : /: >Lookup: node=<nil>, err=no such file or directory
2022/06/26 09:29:27 DEBUG : /: Lookup: name="autorun.inf"
2022/06/26 09:29:27 DEBUG : /: >Lookup: node=<nil>, err=no such file or directory
2022/06/26 09:29:27 DEBUG : /: Lookup: name=".Trash-1000"
2022/06/26 09:29:27 DEBUG : /: >Lookup: node=<nil>, err=no such file or directory
2022/06/26 09:29:27 DEBUG : /: ReadDirAll: 
2022/06/26 09:29:27 DEBUG : /: >ReadDirAll: item=15, err=<nil>
2022/06/26 09:29:27 DEBUG : /: Attr: 
2022/06/26 09:29:27 DEBUG : /: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
[ last 4 lines repeat 24 more times ]
2022/06/26 09:29:27 DEBUG : /: ReadDirAll: 
2022/06/26 09:29:27 DEBUG : /: >ReadDirAll: item=15, err=<nil>
2022/06/26 09:29:27 DEBUG : /: Lookup: name="autorun.inf"
2022/06/26 09:29:27 DEBUG : /: >Lookup: node=<nil>, err=no such file or directory
^C2022/06/26 09:29:27 INFO  : Signal received: interrupt
2022/06/26 09:29:27 ERROR : B2: Unmounted rclone mount
2022/06/26 09:29:27 INFO  : Exiting...

asdffdsa · June 26, 2022, 12:21am

hello and welcome to the forum,

--- rclone v1.57.0-DEV
that is old, dev, custom compiled, and using a beta version of go.

--- the only way to get the latest stable rclone, v1.58.1 is
https://rclone.org/downloads/#script-download-and-install

Animosity022 · June 26, 2022, 11:01am

With any FUSE mount, you want to make sure you stop all IO/processes using the mount before you stop it.

In my service file, I do something like:

ExecStopPre=/bin/fusermount -uz /media/TV
ExecStopPre=/usr/bin/fuser -kMm /media/TV
ExecStop=/bin/fusermount -uz /media/TV

And I make sure any service that depends on rclone stops as well. I'd imagine it gives you a bad exit code because it could not stop cleanly while there was still access going on.

ncw · June 26, 2022, 8:37pm

Is it just the exit code you are worried about?

You could try sending a SIGINT instead.

Summertime · July 11, 2022, 8:38am

killing after unmounting didn't work, since the mount is unmounted so fuser doesn't know what mount is being talked about (even if the mount still has writes happening to it)

And then, if I'm killing writing processes off, then I'm losing data, so not great

Summertime · July 11, 2022, 8:54am

exit code and the fact that it causes an error

SIGINT gives "ERROR : zz: vfs cache: restart download failed: failed to start downloader: failed to open downloader: vfs reader: failed to open source file: invalid seek position" on the next run, if I'm actively writing

SIGTERM doesn't cause cache issues as far as I can tell, so it seems to be graceful to a degree? If it's a graceful shutdown, then it shouldn't show an error and a non-zero exit code unless the shutdown actually had an error. I might be a bit obtuse with this, but it does result in systemd/journalctl telling me everything is on fire, which is a bit ugh

(and if I did tell systemd to ignore that exit-code, would that then hide actual errors in the future?)
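
On the question of whether ignoring the exit code would hide real errors: systemd's SuccessExitStatus= setting takes a list of specific exit codes or signal names, so it can be limited to 143, and any other non-zero exit code should still mark the unit as failed. Below is a minimal sketch of a drop-in, assuming the unit is named rclone@Dropbox.service (the name in the journal line quoted later in this thread). DROPIN_ROOT is a hypothetical variable that defaults to a scratch directory so the snippet can be tried without root; on a real system the root would be /etc/systemd/system, followed by `systemctl daemon-reload`.

```shell
#!/bin/sh
# Sketch only: writes a systemd drop-in that treats exit code 143 as success.
# The unit name is taken from the journal line in this thread; adjust as needed.
unit="rclone@Dropbox.service"

# DROPIN_ROOT is an assumption for illustration; it defaults to a temp dir.
dropin_root="${DROPIN_ROOT:-$(mktemp -d)}"
dropin_dir="$dropin_root/$unit.d"
mkdir -p "$dropin_dir"

# Only 143 is whitelisted; other non-zero exit codes are still failures.
cat > "$dropin_dir/exit-status.conf" <<'EOF'
[Service]
SuccessExitStatus=143
EOF

cat "$dropin_dir/exit-status.conf"
```

This does not change what rclone logs or returns; it only changes how systemd classifies that one exit code.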

Animosity022 · July 11, 2022, 11:06am

I'm not quite sure what you are expecting to happen.

For my fuse/rclone mounts, I have requirements for all my services so I specifically do not have to kill the mount, as otherwise you'll get into situations like the one you are describing.

You have to stop all the other services before stopping rclone so that the IO is clean.

It's much like having a regular mount as the OS won't let you unmount it if there is an active process against it, but with a normal mount, you can't kill it.

You have to quiet it by stopping all IO and then stop it. As I shared, I use require in my systemd.

Summertime · July 11, 2022, 12:00pm

I'm wanting a graceful shutdown, not a halt-the-planet shutdown

do not allow new opens, wait until all currently open files are closed, then cleanup and exit.

Note that the lines you provided didn't solve it: fuser -Mm returned nothing, so there were no procs interacting with the mount, and the error and exit code still occurred.

fusermount -uz prevented new writes, but doesn't stop pre-existing writes. The systemd docs say "After the commands configured in this option are run, it is implied that the service is stopped", which is not the case here: fusermount -uz will almost always terminate before rclone does, so I'd have to maintain a separate rclone-mount-stop script to actually get it to close gracefully, which... I just wanna use sigterm, please

What would be expected:

sigterm/sigint once: stops new opens, waits for currently opened files to be closed (if under systemd, requests additional closing time from systemd during this period, see EXTEND_TIMEOUT_USEC), cleans up and closes
sigterm/sigint twice: grabs what is given from any open files, cleans up and closes

tl;dr: i want the close button to not be violent
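
The two-stage behaviour requested above can be sketched in shell. This illustrates the request and is not how rclone handles signals: the handler only counts signals and prints what each stage would do. The demo script signals itself with kill so that the run is deterministic; in practice the signals would come from systemd or the user.

```shell
#!/bin/sh
# Write the demo to a temp file and run it in its own shell,
# so that $$ inside the demo refers to that shell's PID.
demo=$(mktemp)
cat > "$demo" <<'EOF'
signals=0
on_signal() {
    signals=$((signals + 1))
    if [ "$signals" -eq 1 ]; then
        # Stage 1 (hypothetical): drain instead of exiting.
        echo "signal 1: refuse new opens, wait for open files to close"
    else
        # Stage 2 (hypothetical): stop waiting and leave.
        echo "signal 2: stop waiting, clean up and exit"
        exit 0
    fi
}
trap on_signal TERM INT

kill -TERM $$      # first signal: handler runs, script keeps going
echo "draining"
kill -TERM $$      # second signal: handler exits the script
echo "never printed"
EOF

output=$(sh "$demo")
status=$?
rm -f "$demo"
printf '%s\n' "$output"
```

A graceful stop in this sketch ends with exit status 0, which is what the request above asks for.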

Animosity022 · July 11, 2022, 12:07pm

If you are using systemd, you'd have requirements to ensure things do not stop out of order and you quiesce the mount point before stopping it.

The only time I'm using a SIGTERM/SIGINT is when something bad has happened, and you still have to clean up the IO at some point, be it a reboot or some other process cleanup.

If there isn't any IO going on, the fusermount will unmount cleanly and nothing else is required.

SIGTERM really isn't the answer imo, as you are trying to take a hammer to solve a problem in a clean fashion.

Summertime · July 11, 2022, 12:19pm

If you are using systemd, you'd have requirements to ensure things do not stop out of order and you quiesce the mount point before stopping it.

There is nothing depending on it, nothing writing on it, nothing reading on it, nothing. I already said this. Error and exit code appears regardless.

The only time I'm using a SIGTERM

SIGTERM is the process equivalent of pressing the close button in the top right of a GUI program; SIGKILL is the hammer

https://www.freedesktop.org/software/systemd/man/daemon.html#New-Style%20Daemons

If SIGTERM is received, shut down the daemon and exit cleanly.

Doing this your way, but gracefully, would require opening a pidfd (to avoid any PID reuse issues), calling fusermount -uz, then polling the pidfd.

I just want to use the gently-close-the-program-button to gently close the program
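
A rough sketch of the stop script described above, assuming it runs as ExecStop= so that systemd provides $MAINPID. It polls with `kill -0` instead of a pidfd (plain shell has no pidfd interface), so unlike the pidfd approach it is not protected against PID reuse. wait_for_exit is a hypothetical helper name, and the mount point and 60-second budget in the commented usage lines are placeholders.

```shell
#!/bin/sh
# wait_for_exit PID SECONDS
# Returns 0 once PID no longer exists, 1 if it is still there after SECONDS.
# Caveat: kill -0 also fails for processes owned by another user, so this
# should run as the same user as the target process (or as root).
wait_for_exit() {
    pid=$1
    remaining=$2
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Intended use inside a stop script (placeholders, not executed here):
#   fusermount -uz /path/to/mountpoint
#   wait_for_exit "$MAINPID" 60
```

The point of waiting is that the stop command does not return while rclone is still running, so systemd has no reason to follow up with SIGTERM unless the time budget runs out.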

Animosity022 · July 11, 2022, 12:34pm

If you send a kill, you generally get a 143.

felix@gemini:~$ rclone mount GD: /home/felix/test -vvv
2022/07/11 08:28:08 DEBUG : Setting --config "/opt/rclone/rclone.conf" from environment variable RCLONE_CONFIG="/opt/rclone/rclone.conf"
2022/07/11 08:28:08 DEBUG : rclone: Version "v1.59.0" starting with parameters ["rclone" "mount" "GD:" "/home/felix/test" "-vvv"]
2022/07/11 08:28:08 DEBUG : Creating backend with remote "GD:"
2022/07/11 08:28:08 DEBUG : Using config file from "/opt/rclone/rclone.conf"
2022/07/11 08:28:08 DEBUG : GD: Loaded invalid token from config file - ignoring
2022/07/11 08:28:08 DEBUG : Saving config "token" in section "GD" of the config file
2022/07/11 08:28:08 DEBUG : GD: Saved new token in config file
2022/07/11 08:28:08 DEBUG : Google drive root '': Mounting on "/home/felix/test"
2022/07/11 08:28:08 DEBUG : : Root:
2022/07/11 08:28:08 DEBUG : : >Root: node=/, err=<nil>
2022/07/11 08:28:32 INFO  : Signal received: terminated
2022/07/11 08:28:32 ERROR : /home/felix/test: Unmounted rclone mount
2022/07/11 08:28:32 INFO  : Exiting...
felix@gemini:~$ echo $?
143

That's a normal exit code for the OS killing something.

If you use fusermount:

felix@gemini:~$ rclone mount GD: /home/felix/test -vvv
2022/07/11 08:31:59 DEBUG : Setting --config "/opt/rclone/rclone.conf" from environment variable RCLONE_CONFIG="/opt/rclone/rclone.conf"
2022/07/11 08:31:59 DEBUG : rclone: Version "v1.59.0" starting with parameters ["rclone" "mount" "GD:" "/home/felix/test" "-vvv"]
2022/07/11 08:31:59 DEBUG : Creating backend with remote "GD:"
2022/07/11 08:31:59 DEBUG : Using config file from "/opt/rclone/rclone.conf"
2022/07/11 08:32:00 DEBUG : Google drive root '': Mounting on "/home/felix/test"
2022/07/11 08:32:00 DEBUG : : Root:
2022/07/11 08:32:00 DEBUG : : >Root: node=/, err=<nil>
2022/07/11 08:32:05 DEBUG : /home/felix/test: Unmounted externally. Just exit now.
2022/07/11 08:32:05 DEBUG : rclone: Version "v1.59.0" finishing with parameters ["rclone" "mount" "GD:" "/home/felix/test" "-vvv"]
felix@gemini:~$ echo $?
0
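
For reference, 143 is 128 + 15, where 15 is SIGTERM's signal number; shells report a child that died from signal N as status 128 + N. The sketch below shows two ways a shell can end up seeing 143: a process killed outright by SIGTERM, and a process that traps SIGTERM and then calls exit 143 itself. A shell cannot tell the two apart, but systemd can: it logs code=killed for the first and code=exited, status=143 for the second, and according to the systemd.service documentation only termination by the signal counts as a clean stop by default. The sleep commands are stand-ins, not rclone.

```shell
#!/bin/sh
# Case 1: the process has no SIGTERM handler and is killed by the signal.
sleep 30 &
pid=$!
kill -TERM "$pid"
wait "$pid"
killed_status=$?

# Case 2: the process traps SIGTERM, cleans up, and calls exit 143 itself.
sh -c 'sleep 30 & child=$!; trap "kill $child; exit 143" TERM; wait $child' &
pid=$!
sleep 1                 # give the child shell time to install its trap
kill -TERM "$pid"
wait "$pid"
trapped_status=$?

echo "killed by SIGTERM: $killed_status"
echo "trapped, exit 143: $trapped_status"
echo "signal for 143:    $(kill -l 143)"
```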

Summertime · July 11, 2022, 12:43pm

So is this a documentation issue then?

When the program ends while in foreground mode, either via Ctrl+C or receiving a SIGINT or SIGTERM signal, the mount should be automatically stopped.

should be removed from the documentation, as it is not a clean way of closing the mount.

Pardon my sarcasm.

And pardon my rudeness: can responses from here on out only discuss whether this is fixable or not fixable?

Animosity022 · July 11, 2022, 12:52pm

I'm trying to assist here so the sarcasm is not called for.

It's always amazing to me when I volunteer time and try to assist someone and they are rude/sarcastic in their responses, as my goal here is to help you by giving up my personal time to go through your question/problem.

That's the goal right? We have to understand the problem by having a conversation and teaching each other things and learning from each other. That's part of the process.

I'm not sure, as that probably requires a consensus on what 'clean' means there: the mount is stopped properly, and most documentation I've seen shows exit code 143 for an OS killing a process, which is what we've done. The mount can be remounted and it does seem to be clean, but not with an exit code 0. That could be a documentation update and/or stated more clearly in the documentation.

If you'd like to submit a PR to update it to be more clear, please feel free. The PRs are another spot that discussion can happen as well.

@ncw - I did read a few other things and I'm not sure if that's a fuse item or rclone item causing the 143 as I found a few examples and that was adjusted via code changes.

exit code 143 on graceful shutdown · Issue #30 · neo4j/docker-neo4j (github.com)

Summertime · July 11, 2022, 1:17pm

It felt like you were telling me repeatedly that you didn't know what I wanted. I just wanted to make a bug report that "it shows an error when you close the program according to the documentation". Being told repeatedly to do something else, as if what I brought up was not valid, felt insulting and demeaning.

Perhaps I could have titled the thread better or something.

Generally for services under Linux, start with the LSB:

https://refspecs.linuxbase.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/iniscrptact.html

For all other init-script actions, the init script shall return an exit status of zero if the action was successful.

hence why systemd expects 0 on SIGTERM, as that is the equivalent of LSB's service stop

(and then after LSB, daemon for info specific to systemd)

SIGTERM does seem to do what is needed; however, the fact that it produces an ERROR line worries me. If that error is just mis-leveled logging, then IMO the fix is changing it to INFO or DEBUG and having the exit code be 0.

If the ERROR is an error though, then that is a bit of a worry

SIGINT seems to not leave the cache in a safe state with my current binary. It could be a problem for both signals and by chance it's only happened when I used SIGINT, dunno. I'll swap to a new binary in the future and re-test

Animosity022 · July 11, 2022, 2:31pm

I have to ask questions to get to my own understanding of the issue. I always assume good intent unless someone gives me a reason not to (i.e. pardon my sarcasm / pardon my annoyance), as my goal still was/is to help.

So by asking those questions, I think we have a good understanding of the specifics and clear instructions for reproducing it, and if ncw gets a chance, he can chime in and see if it is something to fix.

Summertime · July 11, 2022, 2:41pm

none of the questions seemed relevant other than "What would you expect to happen in response to sigterm and why?" which was phrased as "I'm not quite sure what you are expecting to happen." Which is not a question.

Actually, reading through, you asked zero questions.

Animosity022 · July 11, 2022, 2:45pm

"Thanks for helping me out and collecting a good bug report"

Much appreciated man. I'm happy to have helped you.

Summertime · July 11, 2022, 2:52pm

Please do not falsely quote me.

Animosity022 · July 11, 2022, 3:00pm

I didn't quote you as you can see there aren't any quote tags around it.

It was a suggestion on how to take the conversation and thank the guy for helping you.

Summertime · July 11, 2022, 3:31pm

you only began to help after I ~~requested you~~ (eh I probably didn't request it whatever) to stop harassing me with a repeated non-solution.

I didn't even know if it was reproducible until you posted a log to continue to tell me off for not using fusermount

and out of genuinely trying to be helpful, you might want to check your journal while stopping your service:

ExecStopPre=/bin/fusermount -uz /media/TV
ExecStopPre=/usr/bin/fuser -kMm /media/TV
ExecStop=/bin/fusermount -uz /media/TV

a) there is no such thing as ExecStopPre for service files (it exists for socket files)
b) ExecStop runs and finishes, but rclone is still running because fusermount -uz is lazy, hence we get

rclone@Dropbox.service: Main process exited, code=exited, status=143/n/a

showing that, even with you doing the fusermount -uz, rclone still gets SIGTERM'd

so, ALL of everything, you are quite probably having the exact same closing condition as I am, in your service files, which is why I can not use any of your suggestions: They don't do anything different.

You ignored me saying it wasn't a solution, treating me as.. I don't know what. Not worth hearing. I think the overall term for it is *splaining