Testing for new vfs cache mode features

v1.52.2-132-g61e4b4db-beta

That looks like a beta build from the master branch; however, these changes have not been merged to master yet, and consequently are not in the latest beta. You will need to try the build from here to get the latest changes:

https://beta.rclone.org/branch/v1.52.1-145-g4d9ad98a-vfs-beta/

Guys, I have some doubts about how long cached files stay on disk with the new VFS cache versus the old one.
In the service below I'm using the old cache backend, but with VFS flags.
I have set --vfs-cache-max-age 720h, but I'm not sure whether this influences what is configured for the cache remote in the config file.

[Unit]
Description = drive
Wants = network-online.target
After = network-online.target

[Service]
Type = notify
User = 0
Group = 0
KillMode = none
RestartSec = 5
ExecStart = /usr/bin/rclone mount cache: /mnt/gdrive \
--config /home/holtzflix/.config/rclone/rclone.conf \
--allow-other \
--allow-non-empty \
--cache-dir /mnt/Cache/vfs \
--drive-skip-gdocs \
--poll-interval 1m \
--vfs-cache-max-age 720h \
--vfs-cache-mode writes \
--vfs-cache-poll-interval 1m \
--vfs-cache-max-size off \
--vfs-read-chunk-size 16M \
--vfs-read-chunk-size-limit 1G \
--gid=1000 \
--uid=1000 \
--umask 002 \
--fast-list \
--log-file /opt/cache-rclone.log \
--log-level INFO \
--cache-chunk-no-memory=true

ExecStop = /bin/fusermount -uz /mnt/gdrive
Restart = on-failure

[Install]
WantedBy = default.target

My config looks like this:

[cache]
type = cache
remote = gdrive:
chunk_size = 10M
info_age = 1h
chunk_path = /mnt/Cache/rclone
db_path = /mnt/Cache/db
chunk_total_size = 30G
rps = 10
workers = 4

What I wanted to know is the difference between the old and the beta.
In the beta I configure --vfs-cache-max-age 720h, and I know the data will remain there for that period, being replaced by new data over time.

  1. In the old way, how long do the chunks stay on disk? If I configure --vfs-cache-max-age, will it be valid for the old cache too?

Thanks, the smaller buffer definitely helps with the initial scan. I have some thoughts on an enhancement that may or may not be feasible, but would greatly help the significant number of users wanting to use this cache for Plex.

Currently, with a mount using --vfs-cache-mode full on this beta, an initial scan in Plex will download all the .nfo files and a small chunk of each video file before scanning the files into the Plex database. On my server this takes 18 hours or so with no existing cache files. But once all these fragments are in the cache, a new scan to add any new items only takes 10-20 minutes.

Here's the issue. Say all these small files take 10GB on disk, the cache max size is 50GB, and each video is 10GB. After watching 4 videos, the cache is full, so when you watch a 5th video it will delete the oldest files in the cache, which in my case would be the 15,000+ small files that represent each item in the library. The next time it scans to pick up new content, it will take 18 hours again to re-cache those 15,000 files, rather than 20 minutes if it had deleted one of the video files in the cache instead.

My request, which would immensely improve this cache for Plex users: could you add a flag that deletes the largest file in the cache first when it hits max-size, or a flag that sets a threshold so files under, say, 20MB do not get deleted from the VFS cache unless there is no other option? Thank you
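Until a flag like that exists, one possible workaround (not an rclone feature, just a sketch) is to exploit the fact that the cache expires the least-recently-accessed files first: bump the access times of the small files yourself so the large videos become the oldest candidates for eviction. A minimal POSIX-shell sketch, where the cache path and the 20M cutoff are assumptions to adjust for your setup:

```shell
# Hypothetical workaround: refresh the access times of small cached files
# so the VFS cache's least-recently-accessed eviction targets the large
# video files instead of the thousands of small metadata files.
keep_small_files_fresh() {
    cache_dir="$1"  # the vfs area under your --cache-dir (an assumption)
    cutoff="$2"     # size threshold, e.g. 20M

    # -size -20M matches files smaller than the cutoff; "touch -a" bumps
    # only the access time, leaving the modification time untouched.
    find "$cache_dir" -type f -size "-$cutoff" -exec touch -a {} +
}

# Example invocation (path is hypothetical):
# keep_small_files_fresh /mnt/Cache/vfs 20M
```

Run from cron (or Task Scheduler) more often than the cache fills up, this should in principle keep the metadata chunks from ever being the oldest entries.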

Any existing analyzed/scanned content is not re-analyzed though, so I'm not sure why you'd see that. Once a library is scanned, it only checks size/mod times and skips anything that hasn't changed.

You can always run a mount for TV shows, mount for Movies or a mount for Music and split things up and cache that way.

Your use case is pretty much useless in my plex scenario (doesn't mean it's not super useful for you but I wouldn't make a generic Plex statement as the use cases for Plex are insane)

For me to really use this, I have to keep a different cache disk that's got some size to it and some cheap storage.

That's interesting, because that's exactly the behaviour I expected when I started this. I believe when I had it running on Linux it did work this way, but I can't confirm. Using the latest Plex Media Server on Windows and the latest rclone beta from this thread, it definitely always requires the small chunks of every file to be downloaded into the cache to re-scan. Perhaps one of my mount parameters is causing this behaviour, if it's abnormal.

I do already have different mounts for tv/movies which helps, but I'll always inevitably run into this scenario with all the small metadata files being deleted.

Rclone mount --allow-other --user-agent=RandomAgent --timeout=1h --cache-dir="D:\rclone_cache" --dir-cache-time=168h --max-read-ahead=512k --no-checksum --read-only --vfs-cache-max-age=700h --vfs-cache-mode=full --vfs-read-chunk-size=128M --vfs-read-chunk-size-limit=off --buffer-size=16M --vfs-cache-poll-interval=5m0s --tpslimit=35 --vfs-cache-max-size 50G --poll-interval=10m --log-level=NOTICE --log-file=D:\rclone_logs\rclone_movies.log --rc --rc-web-gui --rc-addr=localhost:27773 --rc-user=test --rc-pass=test --config="C:\Users\home\.config\rclone\rclone.conf" 720p: C:\ServerApps\teamdrives\720p

Windows or Linux, if it's rescanning files each time, something is incorrect in your setup or not working properly with Plex. A scanned library with the directory/file structure metadata in cache takes 20-30 seconds to scan, since it's just comparing file information. I'd probably start a different thread and work that item out, as it's not normal. @VBB can add some Windows knowledge, as my setup is Linux. @VBB's library is also larger than mine, and his scan on Windows is in the same realm. It should only be checking the file size/date/time on the file.

Yes, I have a fairly large library, and I still have most of my movie folders together in one big folder (rather than in individually alphabetized sub-folders, i.e. A, B, C, etc.). This makes my daily scans take a long time (2 hours+). That's simply a limitation of Windows Explorer, and I've been too lazy to do something about it :laughing:. I haven't done an initial Plex scan in years, so I can't comment on that.

I have no intention of using the new VFS cache mode, because, just like @Animosity022, it wouldn't make sense in my case. I run Plex on a dedicated (remote) server with limited storage space, but with unlimited gig speeds up and down, and no peering issues. But even if I had the necessary space, I just don't think this cache mode would make things better for a Plex user, unless you're having severe issues with normal VFS. I could be totally wrong, of course.

In some cases, I guess it can help with latency (local VFS cache vs. requests to gdrive) and the number of API calls.

So you're mounting it using standard VFS in the stable (non-beta) rclone, caching only the directory structure? Do you ever run into API bans (error 403)?

Would you mind sharing your mount parameters? I could try mounting and making a new library to test it out and see if this is an alternate solution to the issue

Yup, that's how I've been doing it for years. No API bans since early 2018 (when we all used the old cache mode). My current mount, based on @Animosity022's always-awesome settings:

rclone mount --attr-timeout 1000h --dir-cache-time 1000h --poll-interval 0 --rc --read-only --vfs-read-chunk-size 32M -v

Once mounted, I prime it using the following:

rclone rc vfs/refresh recursive=true --fast-list -v

Then I run my daily Plex scan.

Both commands are launched from batch files. I always use the latest available beta version of rclone, but I have not used the branch that introduced the new cache mode. I'm currently on v1.52.2-130.
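For anyone wanting to script the same mount-then-prime sequence, a minimal wrapper on a Linux-style shell might look like the following (the remote name "gdrive:", the mountpoint, and the startup delay are assumptions; the Windows batch-file equivalent is analogous):

```shell
#!/bin/sh
# Sketch of a mount-then-prime wrapper. The remote name, mountpoint, and
# the 5-second delay before priming are placeholders -- adjust to taste.

# Start the static, read-only mount in the background with the rc enabled.
rclone mount gdrive: /mnt/media \
  --attr-timeout 1000h \
  --dir-cache-time 1000h \
  --poll-interval 0 \
  --rc \
  --read-only \
  --vfs-read-chunk-size 32M -v &

# Give the rc server a moment to come up, then prime the directory cache.
sleep 5
rclone rc vfs/refresh recursive=true --fast-list -v
```

With --poll-interval 0 the mount stays static, so re-running the vfs/refresh is what picks up new content.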

For any uploads, I use RcloneBrowser by @kapitainsky.

For any editing/moving/copying within the mount, I use this super simple command:

rclone mount -v


Wonderful, I'll mount this and compare it to the new VFS backend to provide feedback. One question, with poll-interval set to 0, wouldn't there be no updates reflected for 1000h?

Yes, my mount is meant to be static until the next manual prime.

This seems like a bug... I fixed a very similar bug here which happened with --buffer-size 0. Now you don't have buffer size 0 but maybe the underlying cause is similar.

Can you try this beta and if it still doesn't work then post a link to a log with -vv - thanks!

https://beta.rclone.org/branch/v1.52.2-175-g0b033fcc-vfs-beta/

PS The above is ready to merge now and unless something comes up over the weekend I'll merge it to master next week :slight_smile:


Thanks. I've tried running your exact command with the same rclone version, and I've observed similar behaviour: some folders are scanned instantly, but for others Plex wants to look deeper into the files, so a piece of each file is accessed. This is when re-scanning, not the initial scan.

It seems the only foolproof way to have fast scans is to keep that chunk of the files in the cache but then we run into the scenario in my previous post where a larger video file will wipe out thousands of small cached items. @ncw Is it feasible to add any parameters to allow some control over which files get removed from a full cache first?

If that's the case, something is broken in your setup, as that's not how Plex or Emby works. Many, many folks have run with no cache for months/years without having files rescanned.

As i mentioned before, happy to assist and just start a new post and list out your details.

In my case, the reason the scan takes so long is due to the thousands of individual folders. Windows doesn't handle that well. Even when I had everything locally on my NAS, it took just as long. I know for sure it's that, because my TV Shows library with 100,000+ episodes scans within minutes. Cache wouldn't make things any faster here. Quite the opposite, probably.

I've been playing around with the latest beta for an SFTP mount, and I'm not sure if I'm a moron or if this is a new VFS beta thing. When items are removed from the source file system, the deletions aren't showing up, and the files remain in the VFS cache until I manually delete them. I've tried vfs/refresh and vfs/forget. What am I missing? vfs-poll-interval should pick up any deletions on the source, right?

  --allow-other \
  --allow-non-empty \
  --async-read=false \
  --buffer-size=16M \
  --copy-links \
  --fast-list \
  --dir-cache-time=1m \
  --include Media/** \
  --max-read-ahead=200M \
  --poll-interval=30s \
  --rc \
  --rc-addr=localhost:5573 \
  --rc-htpasswd /home/plex/.htpasswd \
  --umask=002 \
  --sftp-disable-hashcheck \
  --syslog \
  --verbose=2 \
  --vfs-cache-max-age=24h \
  --vfs-cache-max-size=200G \
  --vfs-cache-mode=full \
  --vfs-cache-poll-interval=1m \
  --vfs-read-chunk-size=64M \
  --vfs-read-chunk-size-limit=2048M \

It only works for a backend that supports polling.

You should see this with SFTP mount:

2020/06/28 12:56:05 INFO  : sftp://felix@localhost:4022/: poll-interval is not supported by this remote
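Since SFTP can't push change notifications, one common workaround is to re-read the directory tree on a schedule via the rc. A sketch, assuming the mount above is running with its rc on localhost:5573 (the interval and the USER/PASS credentials are placeholders for whatever is in your htpasswd file):

```shell
# Hypothetical crontab entry: every 15 minutes, refresh the directory
# cache so deletions and additions on the SFTP source show up in the
# mount. Assumes the mount was started with --rc on localhost:5573.
*/15 * * * * rclone rc --rc-addr=localhost:5573 --rc-user=USER --rc-pass=PASS vfs/refresh recursive=true
```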

Thanks @Animosity022, any creative suggestions on the best way to keep the local cache and remote file system synced?

Silly me - you are right. Updated to the latest and it is now working as expected. No issues to report so far!