Sonarr / Radarr Hangs when using cache temp upload path

After further command line testing and @Animosity022’s investigation above, I think we can break this into 2 different issues (no encryption, to remove that factor):

1) Without the backend cache
Whether using “--vfs-cache-mode writes” or not:

  • A cp or mv from local to mount hangs until the file has finished uploading to Google drive
  • Whilst it is uploading:
    – Any subsequent modifications to this file in the mount are held, awaiting their turn
    – Despite this, the file is shown as available in the mount immediately (i.e. whilst still uploading)
  • After the file finishes uploading, the queued modifications are processed.

So it seems that what @Animosity022 is experiencing is a specific reaction to this by Sonarr. Something along the lines of:

  • Sonarr starts copying “test.mkv.partial~” from local to the mount
  • The cp hangs until test.mkv.partial~ has finished uploading to Google drive
  • But test.mkv.partial~ is shown available in the mount immediately so Sonarr issues its “mv test.mkv.partial~ test.mkv” in the mount
  • This mv queues up and makes test.mkv.partial~ AND test.mkv disappear from the mount until test.mkv.partial~ has finished uploading (I actually see this behaviour in my mount)
  • Sonarr either times out or notices the now-empty mount so decides that the original cp has failed and retries with another cp of test.mkv.partial~
  • The first cp upload completes and the queued mv is performed on it. Sonarr registers success and stops processing.
  • The second test.mkv.partial~ upload completes (but with no mv issued by Sonarr).

Not sure how to progress this one further?

2) With the backend Cache & --cache-tmp-upload-path
I didn’t test the backend cache without --cache-tmp-upload-path as I expect uploads would behave the same as scenario (1) above.

What we’re seeing with the backend cache and --cache-tmp-upload-path is:

  • Using Sonarr or the command line, cp from local to the mount results in:
    – The file being copied into the local --cache-tmp-upload-path
    – After this completes (fairly quickly as it’s all a local operation), the mount shows the file as available.
  • Whilst there are files in --cache-tmp-upload-path, and before the --cache-tmp-wait-time offset expires for any file in this upload queue, all files in --cache-tmp-upload-path can be modified by Sonarr or the command line without issue.
  • However, as soon as any file in the queue commences its upload to Google drive, ALL files in --cache-tmp-upload-path are locked for modification until that file completes uploading. “Locked” might be the wrong word here, as modifications to these files queue up rather than getting rejected.
  • Once the currently uploading file completes uploading, all queued modifications to all files in --cache-tmp-upload-path are processed before the next file in the upload queue commences uploading.

In the Offline Caching area of the docs the intention is:

  1. Reading the file still works during the upload but most modifications on it will be prohibited

However it seems that this modification prohibition is being applied to the whole queue, not just the currently uploading file in the queue.

Anyone know where in the code this modification prohibition is applied?

Maybe somewhere around here?

Seems like a locked mutex is set up on the BoltDB that contains the whole queue, but somehow the next line’s “defer” to unlock the mutex is not triggered until the file has finished uploading.

Sorry, I forgot to press send on this about two weeks ago! --daemon-timeout is merged and in the latest beta.

Well spotted. This is a consequence of the config re-organisation. Unfortunately the command line flag and the config variable had different names so I renamed the command line flag to be --cache-chunk-total-size - sorry!

Great! I’ll write some docs and merge that.

That is the expected semantics - cp or mv shouldn’t return until the file is in place.

That is because the lock on that file is held while the file is uploading.

I think arguably cp on a normal file system would do the same thing - the file would appear at the destination immediately with 0 size and grow until the full size, then cp would return.

Can you try to replicate the problem with a sequence of mv and cp commands - and then make a new issue on github then we can work out what to fix. I think there will be a small tweak we can do somewhere!

I think this scenario deserves an issue as well so we don’t forget about it.

I’m less sure about the solution since I don’t know the cache backend as well as the rest of rclone but your mutex looks to be a possibility!

Defers are only run when the function returns.

Thanks for that detailed reply @ncw, apologies for my belated response. I am travelling at the moment but will follow up on those tests and GitHub issues when I get back. Thanks again!


Quick update - I upgraded to the latest rclone 1.43.1 and could not replicate this issue anymore - which is great news. So I have not raised any GitHub issues, but will do so if I start seeing this again.


What settings do you currently use? I still have the problem on the latest stable version of Rclone.

I am using:

sudo /usr/local/bin/rclone mount Media: ~/rclone_data/media_mount_point \
  --allow-other --dir-cache-time=160h \
  --cache-chunk-size=10M --cache-chunk-total-size=10G --cache-info-age=168h --cache-workers=6 \
  --attr-timeout=1s --cache-tmp-upload-path ~/rclone_data/rclone_upload --cache-tmp-wait-time 15m \
  --modify-window 1s --umask 002 --drive-use-trash=false \
  --cache-chunk-path ~/rclone_data/cache_dir/cache-backend --cache-db-path ~/rclone_data/cache_dir/cache-backend \
  --cache-dir ~/rclone_data/cache_dir --config ~/.config/rclone/rclone.conf \
  --buffer-size=0M --volname Media --syslog --log-level INFO \
  --daemon --daemon-timeout=10m --tpslimit 5

It might be fixed for cache, but with VFS I still see the problem. Sonarr/Radarr will hang when it has finished copying one file and starts the next, which I believe starts out rate-limited because the first file is still uploading. It might also be hanging on trying to delete a file while a copy is happening; I'm not too sure.

So on OSX High Sierra:

1) Using cache mount with a cache_tmp_upload_path:
Testing with rclone 1.43.1 I no longer get the locks that I was getting previously. Not sure whether this was due to the mutex idea or something else, but I can’t reproduce it so I won’t chase that anymore.

2) Using vfs, no cache mount:
I moved away from the backend cache as I could not get the same read performance as with just vfs.

So I now use a pretty standard drive->crypt mount setup and I am seeing similar issues as everybody else on this thread with a vfs mount. Some apps just aren't tolerant of slow moves to the cloud.

Small files usually move to the crypt remote just fine. But apps still struggle with slowness due to:

  • Larger file moves
  • Bandwidth taken up by competing apps.

I am going to try the new rclone union mount with:

  • the high priority (read/write) location being a local folder; and
  • the low priority location (read only) being the crypt-atop-drive vfs mount; and
  • a nightly rclone move from the local folder to the crypt remote.

This will be fine for moving new files, but won't work with changes to older files that exist only in the lower-priority, read-only mount location.

Meanwhile, I am really hoping for the rclone union remote functionality to evolve so that the low priority location is not just read-only.

If you’re on Linux, I’d just forget rclone union remotes for now and copy @Animosity022’s GitHub setup which uses mergerfs instead. If you’re on Windows or OSX, I haven’t been able to find any decent union file systems that I could get working with rclone, so cross your fingers for evolution of the rclone union remote!


Did this ever get solved?

I'm having the same problem with Radarr now. Movies stay in the queue even though Radarr has successfully moved them to the rclone mount.

You can follow this issue: