Left over rclone processes preventing mount from starting correctly

What is the problem you are having with rclone?

Afternoon folks,

So I've just spent the last few days moving most of my apps into docker containers (Plex, Sonarr, Docker, Rutorrent etc) and for the most part everything is working well.

The problem I'm having occurs every time I reboot my server; When it comes back up my rclone mount shows up in SFTP software, however opening folders seems to have a slight delay. None of my docker containers can see the mount at all!

If I check the status of my mount's systemd .service file, it says the service failed to start with an error and if I run journalctl -xe it gives me the following read out:

>  root@KingKong ~ # journalctl -xe
>     Sep 27 12:08:46 KingKong systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
>     Sep 27 12:08:46 KingKong systemd[1]: PlexMount.service: Found left-over process 7185 (rclone) in control group while starting unit. Ignoring.
>     Sep 27 12:08:46 KingKong systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
>     Sep 27 12:08:46 KingKong systemd[1]: Starting RClone Service...
>     -- Subject: A start job for unit PlexMount.service has begun execution
>     -- Defined-By: systemd
>     -- Support: http://www.ubuntu.com/support
>     --
>     -- A start job for unit PlexMount.service has begun execution.
>     --
>     -- The job identifier is 20475.
>     Sep 27 12:08:47 KingKong kernel: Firewall: *UDP_IN Blocked* IN=enp2s0 OUT= MAC=44:8a:5b:a0:2b:e6:64:64:9b:6a:7a:7f:08:00 SRC=141.0.93.12 DST=148.>
>     Sep 27 12:08:48 KingKong kernel: Firewall: *UDP_IN Blocked* IN=docker0 OUT= PHYSIN=vetheae91aa MAC=ff:ff:ff:ff:ff:ff:02:42:ac:11:00:04:08:00 SRC=>
>     Sep 27 12:08:48 KingKong kernel: Firewall: *UDP_IN Blocked* IN=docker0 OUT= PHYSIN=vetheae91aa MAC=ff:ff:ff:ff:ff:ff:02:42:ac:11:00:04:08:00 SRC=>
>     Sep 27 12:08:48 KingKong systemd[1]: PlexMount.service: Main process exited, code=exited, status=1/FAILURE
>     -- Subject: Unit process exited
>     -- Defined-By: systemd
>     -- Support: http://www.ubuntu.com/support
>     --
>     -- An ExecStart= process belonging to unit PlexMount.service has exited.
>     --
>     -- The process' exit code is 'exited' and its exit status is 1.
>     Sep 27 12:08:48 KingKong systemd[1]: PlexMount.service: Failed with result 'exit-code'.
>     -- Subject: Unit failed
>     -- Defined-By: systemd
>     -- Support: http://www.ubuntu.com/support
>     --
>     -- The unit PlexMount.service has entered the 'failed' state with result 'exit-code'.
>     Sep 27 12:08:48 KingKong systemd[1]: Failed to start RClone Service.
>     -- Subject: A start job for unit PlexMount.service has failed
>     -- Defined-By: systemd
>     -- Support: http://www.ubuntu.com/support
>     --
>     -- A start job for unit PlexMount.service has finished with a failure.
>     --
>     -- The job identifier is 20475 and the job result is failed.
>     Sep 27 12:08:48 KingKong kernel: Firewall: *UDP_IN Blocked* IN=enp2s0 OUT= MAC=44:8a:5b:a0:2b:e6:64:64:9b:6a:7a:7f:08:00 SRC=141.0.93.12 DST=148.>

There are references to left over rclone processes, which I need to kill before the mount will load properly. I then need to restart all of my docker containers before they will see the mount.

Can anyone suggest what the problem might be and how I can go about troubleshooting it?

What is your rclone version (output from rclone version)

rclone v1.54.0-beta.4788.d8239e019

  • os/arch: linux/amd64
  • go version: go1.15.2

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Ubuntu 20.04.1 LTS (GNU/Linux 5.4.0-48-generic x86_64)

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)


[Unit]
Description=RClone Service

[Service]
Type=notify
Environment=RCLONE_CONFIG=//home/craftyclown/.config/rclone/rclone.conf
KillMode=none
RestartSec=5
ExecStart=/usr/bin/rclone mount "RichFlixCrypt:/Encrypted/" /home/craftyclown/Skull \
   --allow-other \
   --buffer-size 256M \
   --dir-cache-time 72h \
   --log-level INFO \
   --log-file /home/craftyclown/logs/rcloneencryptedmount.log \
   --poll-interval 15s \
   --timeout 1h \
   --umask 002 \
   --cache-dir=/home/craftyclown/rclone_cache \
   --vfs-cache-mode full \
   --vfs-cache-max-size 100G \
   --vfs-cache-max-age 24h \
   --tpslimit 2
ExecStop=/bin/fusermount -uz /home/craftyclown/Skull
Restart=on-failure
User=craftyclown
Group=craftyclown

[Install]
WantedBy=multi-user.target

Use journalctl to examine what is in the rclone log - that should tell you why it failed.

1 Like

Sorry, I should have posted the entries from the log as well...

2020/09/27 11:57:55 Fatal error: Directory is not empty: /home/craftyclown/Skull If you want to mount it anyway use: --allow-non-empty option

So this is the strange part; rclone is throwing a fatal error as the directory is not empty, however when I check using FSTP software I can see the mount and not just the folder structure, but the files inside. This alongside the journalctl error about left over rclone processes, makes it seem like the mount is trying to run repeatedly over itself. As mentioned before, deleting the rclone processes flagged up by journalctl fixes the issue.

Sorry I'm probably confusing myself here, but everything was working fine until I started containerising my apps. I know it's not usually advised, but should I just use the --allow-non-empty option?

Why container them then? It just adds another layer of complexity unless you have a reason to do it.

If you add non empty, that would be the exact reason why not to add it as you have an application writing to the file system underneath and you'd overmount hiding whatever process is writing to it before you mounted it.

You'd want to figure out what's writing to it and fix your order of operations so that the rclone mount happens before.

Well other than this issue, using docker has made my life a lot easier, so I'm happy to push forward with it, providing I can work this out that is :wink:

So as I mentioned before, the strange thing is that it appears to be the mount trying to write over itself. I know that sounds odd, but when the issue occurs I can see the structure of the mount showing up and it's only through killing the leftover rclone processes as mentioned by journalctl that fixes things

How so? What problem did it solve for you?

Not sure what you mean by that.

Keeping 12 pieces of software and their dependencies in one location, with the ability to turn them on and off at will has proven to be incredibly convenient , especially foe someone like myself who only has basic linux skills. Other than this issue with my mount, my setup is working better than ever.

So there is definitely something odd going on here. I restarted the server without docker and I'm still having issues with my mount. I'm a bit confused about the error message though, as it seems to indicate there is an error in my script, although I haven't made any changes.

Here's the error:
Sep 27 15:23:42 KingKong rclone[11803]: Command mount needs 2 arguments maximum: you provided 3 non flag arguments: ["RichFlixCrypt:/Encrypted/" "/home/craftyclown/Skull" "\\"]
and here's the script:
[Unit]
Description=RClone Service

[Service]
Type=notify
Environment=RCLONE_CONFIG=//home/craftyclown/.config/rclone/rclone.conf
KillMode=none
RestartSec=5
ExecStart=/usr/bin/rclone mount "RichFlixCrypt:/Encrypted/" /home/craftyclown/Skull \
   --allow-other \
   --buffer-size 256M \
   --dir-cache-time 72h \
   --log-level INFO \
   --log-file /home/craftyclown/logs/rcloneencryptedmount.log \
   --poll-interval 15s \
   --timeout 1h \
   --umask 002 \
   --cache-dir=/home/craftyclown/rclone_cache \
   --vfs-cache-mode full \
   --vfs-cache-max-size 100G \
   --vfs-cache-max-age 24h \
   --tpslimit 2
ExecStop=/bin/fusermount -uz /home/craftyclown/Skull
Restart=on-failure
User=craftyclown
Group=craftyclown

[Install]
WantedBy=multi-user.target

I really can't see what the problem is here?

I rebooted the server again (docker disabled) and initially the mount wasn't showing, then after a few minutes it suddenly loaded.

Running systemctl status showed it was loaded, but once again there were left over processes causing issues

 PlexMount.service - RClone Service
     Loaded: loaded (/etc/systemd/system/PlexMount.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2020-09-27 16:25:35 BST; 4min 36s ago
   Main PID: 1467 (rclone)
      Tasks: 28 (limit: 38303)
     Memory: 172.4M
     CGroup: /system.slice/PlexMount.service
             β”œβ”€ 595 /usr/bin/rclone mount RichFlixCrypt:/Encrypted/ /home/craftyclown/Skull --allow-other --buffer-size 256M --dir-cache-time 72h --log-level DEBUG --log-file /home/craftyclown/logs/PlexMount.log --poll->
             └─1467 /usr/bin/rclone mount RichFlixCrypt:/Encrypted/ /home/craftyclown/Skull --allow-other --buffer-size 256M --dir-cache-time 72h --log-level DEBUG --log-file /home/craftyclown/logs/PlexMount.log --poll->

Sep 27 16:25:13 KingKong systemd[1]: Stopped RClone Service.
Sep 27 16:25:13 KingKong systemd[1]: PlexMount.service: Found left-over process 595 (rclone) in control group while starting unit. Ignoring.
Sep 27 16:25:13 KingKong systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Sep 27 16:25:13 KingKong systemd[1]: Starting RClone Service...
Sep 27 16:25:35 KingKong systemd[1]: PlexMount.service: Got notification message from PID 595, but reception only permitted for main PID 1467
Sep 27 16:25:35 KingKong systemd[1]: Started RClone Service.

I'm running the latest beta of rclone, is it possible this is a bug?

As you can see from the previous .service status, the script is running multiple times and they seem to be conflicting with each other. If I kill the processes and then manually start the mount everything works perfectly.

I just don't understand why it is happening. I installed the non-beta version of rclone to see if that made a difference, but it doesn't. Real head scratcher!

If it fails to load, it restarts so that's the reason as your previous posts, you have file in there so it's not empty so it continues to restart based on your service file setup.

Fuse mounts are a bit tricky as you have some IO on the mountpoint most likely as well so it can't unmount and never actually stops I would surmise.

You'd want to stop everything and figure out what else is hitting your mount point and put in the proper items in your service files to make sure things are running before progressing or you'll get some funkiness like you are experiencing.

You have all the software in one docker? Just trying to follow and get what you are doing. I find docker to be overly complex if you run a single server and even more so if you have less Linux skills as it's another set of permission/complexities and such.

You have all the software in one docker? Just trying to follow and get what you are doing. I find docker to be overly complex if you run a single server and even more so if you have less Linux skills as it's another set of permission/complexities and such.

I have 12 containers running so I can independently control any one piece of software from within it's own container. It's great if I am having issues with anything as it's simple to troubleshoot by taking things out of the equation. As all the dependencies are stored within the container, it also reduces the issues of things not playing nice. It also makes things incredibly easy to back up as well. But honestly, it's horses for courses and I appreciate it's not to everyone's liking.

If it fails to load, it restarts so that's the reason as your previous posts, you have file in there so it's not empty so it continues to restart based on your service file setup.

Fuse mounts are a bit tricky as you have some IO on the mountpoint most likely as well so it can't unmount and never actually stops I would surmise.

You'd want to stop everything and figure out what else is hitting your mount point and put in the proper items in your service files to make sure things are running before progressing or you'll get some funkiness like you are experiencing.

Ok, yes this makes sense. So there is probably something getting in the way of the Mount and then the rclone processes are tripping over each other and that's what the errors are about. That said, why does it only start working when I kill the rclone processes? Surely whatever was stopping them in the first place should still be an issue?

So I can count out docker and any of the software that runs in it's containers, as I have disabled the docker services and the problem remains, but that is also odd, as most of my software is now containered, so I can't understand what could be getting in the way.

Next test is disabling the mount and then seeing if there is anything in the space that the mount used to occupy once I reboot.

With 12 containers, you are managing 12 environments rather me just doing 1 since a lot of the software ha shared dependencies as well.

I tend to see dockers more for moving software around and scaling so I was curious on your setup and reasoning on your use case.

Not judging by any means as if it works, definitely do it!

1 Like

So I tried disabling the mount, but following a reboot there didn't appear to be anything populating the space the mount usually would.

Next I disabled the mount once more and then re-located outside of my home folder where it usually lives, hoping that might help. Sadly the issue remains and there are rclone processes still fighting with each other and preventing the mount from starting correctly :frowning:

 PlexMount.service - RClone Service
     Loaded: loaded (/etc/systemd/system/PlexMount.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2020-09-27 22:19:11 BST; 1min 19s ago
   Main PID: 1313 (rclone)
      Tasks: 25 (limit: 38303)
     Memory: 107.6M
     CGroup: /system.slice/PlexMount.service
             β”œβ”€ 596 /usr/bin/rclone mount RichFlixCrypt:/Encrypted/ /Skull --allow-other --buffer-size 256M --dir-cache-time 72h --log-level INFO --log-file /home/craftyclown/logs/PlexMount.log --poll-interval 15s --ti>
             └─1313 /usr/bin/rclone mount RichFlixCrypt:/Encrypted/ /Skull --allow-other --buffer-size 256M --dir-cache-time 72h --log-level INFO --log-file /home/craftyclown/logs/PlexMount.log --poll-interval 15s --ti>

Sep 27 22:19:08 KingKong systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Sep 27 22:19:08 KingKong systemd[1]: Starting RClone Service...
Sep 27 22:19:11 KingKong systemd[1]: Started RClone Service.

So here's the really weird bit... Right now the rclone mount's service status says it isn't running and it's throwing the following error...

craftyclown@KingKong:~$ sudo systemctl status PlexMount
● PlexMount.service - RClone Service
     Loaded: loaded (/etc/systemd/system/PlexMount.service; enabled; vendor preset: enabled)
     Active: activating (auto-restart) (Result: exit-code) since Sun 2020-09-27 22:41:59 BST; 2s ago
    Process: 2835 ExecStart=/usr/bin/rclone mount RichFlixCrypt:/Encrypted/ /Skull --allow-other --buffer-size 256M --dir-cache-time 72h --log-level INFO --log-file /home/craftyclown/logs/PlexMount.log --poll-interval >
   Main PID: 2835 (code=exited, status=1/FAILURE)
      Tasks: 14 (limit: 38303)
     Memory: 79.8M
     CGroup: /system.slice/PlexMount.service
             └─1313 /usr/bin/rclone mount RichFlixCrypt:/Encrypted/ /Skull --allow-other --buffer-size 256M --dir-cache-time 72h --log-level INFO --log-file /home/craftyclown/logs/PlexMount.log --poll-interval 15s --ti>

Sep 27 22:41:59 KingKong systemd[1]: PlexMount.service: Failed with result 'exit-code'.
Sep 27 22:41:59 KingKong systemd[1]: Failed to start RClone Service.

However if I open up my sftp software I can see the mount and everything within it!! What is going on?

Just to make sure I'm not going crazy, I downloaded something from the mount via sftp and it worked fine.

perhaps if you use debug logs, you might find something there.

Sadly not. Just the same message as previous;

2020/09/27 22:47:01 Fatal error: Directory is not empty: /Skull If you want to mount it anyway use: --allow-non-empty option

It's almost as if the mount gets created and the errors come as it repeatedly tries to build the mount on top of itself.

So strange!

Does another systemd service have a required for the rclone mount? That would cause it to start too.

You do want to figure out what keeps putting files on before the mount happens though if you clean it out and are sure it’s empty.

No, nothing else is calling the mount. That's what is so confusing!

So I did another reboot and this time it says the mount has loaded properly, however in the status it still mentions two identical processes under CGroup. Is that normal behaviour?

Last login: Sun Sep 27 22:18:53 2020 from 77.103.9.42
craftyclown@KingKong:~$ sudo systemctl status PlexMount
[sudo] password for craftyclown:
● PlexMount.service - RClone Service
     Loaded: loaded (/etc/systemd/system/PlexMount.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2020-09-27 23:26:13 BST; 8min ago
   Main PID: 5104 (rclone)
      Tasks: 24 (limit: 38303)
     Memory: 98.4M
     CGroup: /system.slice/PlexMount.service
             β”œβ”€ 596 /usr/bin/rclone mount RichFlixCrypt:/Encrypted/ /Skull --allow-other --buffer-size 256M --dir-cache-time 72h --log-level INFO --log-file /home/craftyclown/logs/PlexMount.log --poll-interval 15s --ti>
             └─5104 /usr/bin/rclone mount RichFlixCrypt:/Encrypted/ /Skull --allow-other --buffer-size 256M --dir-cache-time 72h --log-level INFO --log-file /home/craftyclown/logs/PlexMount.log --poll-interval 15s --ti>

Sep 27 23:26:10 KingKong systemd[1]: Starting RClone Service...
Sep 27 23:26:13 KingKong systemd[1]: Started RClone Service.

Nope as that means you have two separate processes going on.

In that screenshot, 5104 is the main one, but 596 is there too.

Just add this to your systemd:

ExecStartPre=-/bin/fusermount -uz /YOUR/FOLDER

1 Like