New Feature: vfs-read-chunk-size

I’m trying to work out how --buffer-size works with --vfs-read-chunk-size and whether it adds any value.

As I understand it, if I have --vfs-read-chunk-size 10M --vfs-read-chunk-size-limit 512M --buffer-size 100M then the first 10M is grabbed, then 20M, then 40M, then 80M and so on.

At what point does the 100M buffer get filled if the VFS is continually grabbing more data until the limit is (hopefully) reached, line speed permitting?

--vfs-read-chunk-size will only request blocks of the given size from the backend (for HTTP-based remotes, by using the Range header). It will not buffer these blocks.
--buffer-size will do that buffering for you. It always tries to keep the specified amount of data in memory.
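
As a rough illustration (the URL and byte boundaries here are made up; the real requests depend on the backend), the chunked reads translate into ranged HTTP requests along these lines:

# First 10M chunk, then the next (doubled) 20M chunk -- illustrative values only.
curl -s -o /dev/null -H "Range: bytes=0-10485759"        "https://example.com/file"
curl -s -o /dev/null -H "Range: bytes=10485760-31457279" "https://example.com/file"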

You are right, this is how the chunk size would grow.

The 100M buffer will get filled up as fast as your connection allows it, as soon as you open the file. If you have a 1 Gbit/s connection, rclone will read the first 10M chunk into the buffer in around 100ms and request the next chunk from the remote. This will be copied into the buffer as well, until the 100M buffer is full.
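
As a back-of-the-envelope check on those numbers:

# 1 Gbit/s is roughly 125 MB/s of payload, so a 10M chunk takes about
# 10 / 125 = 0.08 s, i.e. ~100 ms once request overhead is added.
awk 'BEGIN { printf "%.2f s\n", 10 / 125 }'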

Because the buffer gets filled right from the start until it is full, I would recommend setting --vfs-read-chunk-size to at least the --buffer-size. Opening HTTP connections can take some time, so setting --vfs-read-chunk-size too low can reduce your download speed at the beginning of a file (and after seek calls).
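
As a sketch (the remote name and mount point are placeholders), that recommendation with the values from this thread would look something like:

# Hypothetical remote/mount point; chunk size is kept at or above the buffer size.
rclone mount remote: /mnt/remote \
  --vfs-read-chunk-size 128M \
  --vfs-read-chunk-size-limit 512M \
  --buffer-size 100M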

Thanks - that makes sense to me now, i.e. there’s no point requesting 10, then 20, then 40, and a bit of the next 80MB chunk at the start if the file won’t start playing until 100MB is in the buffer - that’s 3 wasted requests.

I’m going to go with 128M like in your example, and --vfs-read-chunk-size-limit 512MB, which I assume will mean I should always have 128M in the buffer and roughly another 512MB queued up once the ceiling is hit.

Filling the buffer will happen in the background. There is no waiting for a full buffer. Once the data is in the buffer, it can be read by the vfs at any time.

There’s probably a nice sweet spot for the chunk size/buffer config depending on your server and your line speed.

Think of the buffer as more of a ‘backup’ for when your line gets congested a little or something happens.

The chunk size is more about how long you may have to wait for something to start or return. Having a smaller chunk size mitigates a slow download a bit but makes more calls. If you have a larger, reliable pipe, making fewer calls and getting more data per call would probably be more efficient.
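
To put rough numbers on that trade-off, here is a small sketch that only assumes the simple chunk doubling described earlier in the thread:

#!/bin/bash
# Rough illustration only: count the ranged requests needed to read a 1 GiB
# file sequentially with --vfs-read-chunk-size 10M and --vfs-read-chunk-size-limit 512M,
# assuming the chunk size doubles on each request up to the limit.
file=$((1024 * 1024 * 1024))
chunk=$((10 * 1024 * 1024))
limit=$((512 * 1024 * 1024))
reqs=0
read_bytes=0
while [ "$read_bytes" -lt "$file" ]; do
  read_bytes=$((read_bytes + chunk))
  reqs=$((reqs + 1))
  next=$((chunk * 2))
  [ "$next" -gt "$limit" ] && next=$limit
  chunk=$next
done
echo "requests needed: $reqs"   # prints 7 for these numbers; a 64M start needs fewer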

Do some testing and see what works best for you.

Hey everyone,

Thanks for all the work that has gone into this.

My head is a small bit melted from this.

What is the difference between running the mount command with --vfs-cache-mode writes and just using the VFS chunked reading without it?

They are independent features.

--vfs-cache-mode writes will create a local cache file when a file is opened for writing on your mounted filesystem. After the file is closed, rclone will move this cache file to the real destination.

VFS chunked reading changes the way read-only files are downloaded from the remote. There should be no impact on download speed when --vfs-read-chunk-size is enabled, unless it is set too low.

Setting --vfs-read-chunk-size 64M --vfs-read-chunk-size-limit off should work for most users (these will probably be the default values soon).
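
A minimal sketch putting the two together (the remote name and mount point are placeholders):

# Hypothetical remote/mount point; combines the suggested chunked-reading values
# with --vfs-cache-mode writes for files opened for writing.
rclone mount remote: /mnt/remote \
  --vfs-read-chunk-size 64M \
  --vfs-read-chunk-size-limit off \
  --vfs-cache-mode writes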

Thanks B4dM4n, that helps clear things up for me.

So in what scenario would it be beneficial to use --vfs-cache-mode writes vs not?

I’m trying to improve the start-up times for media while also optimising upload speed. I’m using the latest rclone beta.

Would there be a way to use VFS so that Radarr/Sonarr could copy files in and they could be watched in Plex straight away? I tried to use --cache-tmp-upload-path and --cache-tmp-wait-time 60m but it still looked like it uploaded straight away and ignored the tmp upload path. I tried this both with and without --vfs-cache-mode writes. Is that tmp upload path just for the cache mount?

Also, in my log file, transfers are showing like this, where it says 0% /off instead of the full percentage, when Radarr/Sonarr are copying files over. Is this related to the issues above?

2018/07/03 13:36:16 INFO  :
Transferred:   16.083 GBytes (21.956 MBytes/s)
Errors:                 0
Checks:                 0
Transferred:            4
Elapsed time:      12m30s
Transferring:
 *   ...layer One 2018 Bluray-1080p.mkv.partial~:  0% /off, 20.453M/s, -

Here is my current mount if it helps:

/home/ybd/rclone-beta/rclone mount gdcrypt: /home/ybd/gmedia \
 --allow-other \
 --dir-cache-time 96h \
 --vfs-read-chunk-size 128M \
 --vfs-read-chunk-size-limit off \
 --vfs-cache-max-age 48h \
 --cache-total-chunk-size 50G \
 --cache-workers 6 \
 --buffer-size 64M \
 --checkers 32 \
 --transfers 8 \
 --drive-chunk-size 128M \
 --stats 10s \
 --attr-timeout 10s \
 --log-file /home/ybd/logs/gmedia.log \
 -v 

(I took the tmp upload path out as it seemed to do nothing)

There are probably some commands in there that are obsolete and don’t work with VFS, but I’m not entirely sure, so I thought I would post it.

Thanks a lot in advance I really appreciate it. :slight_smile:

You can remove all the --cache flags, as those don’t do anything if you aren’t using the cache backend on the mount.
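
For reference, your mount with only the cache-backend flags removed would look like this:

# Sketch of the mount above with --cache-total-chunk-size and --cache-workers
# dropped; everything else left as you had it.
/home/ybd/rclone-beta/rclone mount gdcrypt: /home/ybd/gmedia \
 --allow-other \
 --dir-cache-time 96h \
 --vfs-read-chunk-size 128M \
 --vfs-read-chunk-size-limit off \
 --vfs-cache-max-age 48h \
 --buffer-size 64M \
 --checkers 32 \
 --transfers 8 \
 --drive-chunk-size 128M \
 --stats 10s \
 --attr-timeout 10s \
 --log-file /home/ybd/logs/gmedia.log \
 -v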

I found it easier to use a unionfs/mergerfs mount, and I just script my rclone moves to get items to my GD. Everything can be watched instantly since the files are there locally, and the GD picks up new items every minute.

--vfs-cache-mode writes uploads the file immediately when it is done, as there is no pause for that.

Brilliant mate, thank you.

I am thinking that would be the best approach too. Could you share your full setup, with the scripts to move and how you set up the unionfs/mergerfs?

Is there ever an issue when the upload finishes and you’re watching a file from the local version? Is there a “blip” or does the remote version just not get streamed as it’s seen to be the same? Would really appreciate it.

Also, as an aside, I’ve read your posts about timing mediainfo. I get longer mediainfo times for smaller files and shorter times for bigger files. Is mediainfo important for start-up times, or is it just for analysing files when Plex is scanning?

Thank you, I really appreciate the help.

My current setup is GD -> Crypt, mounting the crypt via:

felix@gemini:/etc/systemd/system$ cat rclone.service
[Unit]
Description=RClone Service
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/bin/rclone mount gcrypt: /GD \
   --allow-other \
   --dir-cache-time 48h \
   --vfs-read-chunk-size 32M \
   --vfs-read-chunk-size-limit 2G \
   --buffer-size 512M \
   --syslog \
   --umask 002 \
   --log-level INFO
ExecStop=/bin/fusermount -uz /GD
Restart=on-abort
User=felix
Group=felix

[Install]
WantedBy=default.target

I use mergerfs as I just like it better, but that’s personal preference. My mergerfs always writes to the first entry, and my rclone mount is RW.

My systemd:

felix@gemini:/etc/systemd/system$ cat mergerfs.service
[Unit]
Description=mergerFS Mounts
After=network-online.target rclone.service
Wants=network-online.target rclone.service
RequiresMountsFor=/GD

[Service]
Type=forking
User=felix
Group=felix
ExecStart=/home/felix/scripts/mergerfs_mount
ExecStop=/usr/bin/sudo /usr/bin/fusermount -uz /gmedia
ExecStartPost=/home/felix/scripts/mergerfs_find
Restart=on-abort
RestartSec=5
StartLimitInterval=60s
StartLimitBurst=3

[Install]
WantedBy=default.target

I use a script to run the actual mount for more flexibility.

felix@gemini:~/scripts$ cat mergerfs_mount
#!/bin/bash

# RClone
/usr/bin/mergerfs -o defaults,sync_read,allow_other,category.action=all,category.create=ff /data/local:/GD /gmedia

I’ve found that sync_read is needed for unionfs and mergerfs (ymmv).

and I do a little find to prime/warm up the cache:

felix@gemini:~/scripts$ cat mergerfs_find
#!/bin/bash
/usr/bin/find /gmedia &

My thought process for my settings is that a transcode will use the Plex setting and read ahead for 600 seconds, so that handles buffering on its own. For a direct stream, I’d rather have a 512M buffer: since no transcode happens, buffering in rclone seemed better.

I run a script overnight to move from local to the cloud and do some clean-up. Not the prettiest, but effective:

felix@gemini:~/scripts$ cat upload_cloud
#!/bin/bash
LOCKFILE="/var/lock/`basename $0`"

(
  # Wait for lock for 5 seconds
  flock -x -w 5 200 || exit 1

# Move older local files to the cloud
DIR="/data/local/Movies"
if [ "$(ls -A $DIR)" ]; then
/usr/bin/rclone move /data/local/Movies/ gcrypt:Movies --checkers 3 --fast-list --syslog -v --tpslimit 3 --transfers 3
cd /data/local/Movies
rmdir /data/local/Movies/*
fi

# Radarr Movies
DIR="/data/local/Radarr_Movies"
if [ "$(ls -A $DIR)" ]; then
/usr/bin/rclone move /data/local/Radarr_Movies/ gcrypt:Radarr_Movies --checkers 3 --fast-list --syslog -v --tpslimit 3 --transfers 3
cd /data/local/Radarr_Movies
rmdir /data/local/Radarr_Movies/*
fi

# TV Shows
DIR="/data/local/TV"
if [ "$(ls -A $DIR)" ]; then
/usr/bin/rclone move /data/local/TV gcrypt:TV --checkers 3 --fast-list --syslog -v --tpslimit 3 --transfers 3
cd /data/local/TV
rmdir /data/local/TV/*
fi

) 200> ${LOCKFILE}

In theory, if someone was playing a file while it was being moved, you might get a blip, but it would only be for a minute. You could schedule the moves for off hours to minimise that impact.
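
For example, a crontab entry like this would run it during off hours (the time of day is just an example):

# Hypothetical crontab line: run the upload/move script at 3am every night.
0 3 * * * /home/felix/scripts/upload_cloud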

This is extremely helpful and I’m going to try to replicate it, as it seems to give the best of both worlds: instant playback in Plex, and uploading overnight makes a lot of sense too so it doesn’t interfere with any torrenting.

A few questions :slight_smile:

I don’t really understand mergerfs, only unionfs, but is that pretty much saying /data/local is the folder to write to and /GD is the folder to read from?

About the reasoning for the 32M/2G chunk-size/chunk-size-limit - are those just what you have found work best for you? I am currently doing 64M or 128M, and 2G or off. Any reasoning behind your values? (tyvm in advance) Do you find the buffer size gave the best start time as well?

Does the find command just warm up the directory cache or is there a file cache too?

Also, what do you use to notify Plex of new content? A setting in Radarr/Sonarr, I am assuming? With this setup it would work perfectly, as you aren’t waiting for any remote uploading to happen, I assume.

Thank you very much. Extremely helpful.

Edit: Do you point Plex at the merged mount? What happens if Plex tries to delete a file? It does it through rclone, I’m imagining?

So mergerfs has a ton more you can do with it, but that also makes it much more complicated.

What my specific mergerfs does is for any create (directory or file), it will always write to the first entry:
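
That comes from the create policy in the mergerfs command I posted above:

# "ff" = first found: every create goes to the first branch listed,
# so new files/directories land in /data/local rather than /GD.
category.create=ff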

Both locations are read/write, so I can always delete/rename/move and basically pretend the split isn’t there.

All my Sonarr/Radarr/Plex/everything points to /gmedia, which is the mergerfs mount point of my local storage and my GD. If you have any specific questions on mergerfs, I can try to help, or the issues on his GitHub are a good resource as he’s very responsive as well. I didn’t like the hidden .unionfs files and crap, which moved me away from that and onto mergerfs.

I’ll probably bump to 64M and see how that works, as that seemed like a better point; I wanted to test my move from 10M to 32M for a little longer and get more data first. The buffer-size doesn’t really do anything positive or negative for start times, as it just keeps more in memory for playback, so if you have a blip or something, it already has the data.

The “cache” for dir-cache is only in memory; nothing is written locally. I just like to prime it as part of the mount so it’s instantly responsive. In this case, think of the directory and its files as a whole entity: if something changes in a directory and rclone detects it via the 1-minute polling, it’ll refresh it and get a new listing of the directory/files.

I use plex-autoscan as a custom hook in Sonarr/Radarr, so it just updates the folders that changed. I also like to use its empty-trash feature: if something is upgraded or deleted, it will empty the trash, and you can give it a failsafe so that if it finds more than 50 files, it won’t auto-empty.

Great, I’m actually running mergerfs now as you are, and it seems to be working great. Even the caching is way better than with unionfs: if I run mediainfo once on unionfs, repeat calls take just as long, whereas on mergerfs they are instant.

My question is, when I try to delete media from inside of Plex I get an error saying there was a problem deleting that item. I mean, it’s not a huge deal, but it would be nice. Can you do this successfully?

Yes, I can delete from Plex without an issue. Any permission issues or anything like that possibly going on?

Yep, you were right, it was a permissions issue with Plex. Working now. Brilliant.

+1 for Mergerfs…

It allows adding/removing data sources without remounting. My Plex is served via rclone-cache, but if that has any issues (bug #2354) I have a monitoring script that switches it over to plexdrive until rclone-cache behaves itself again. As an example, this means I don’t end up with a Plex library convinced that many of its files have gone missing when a mount point has a temporary issue.

I moved from a cache/crypt gdrive mount to a crypt-only mount with the new vfs-read-chunk-size/limit options. Plex performance is much better, and scans are reliable and fast.

What I do miss is the automatic upload caching I had before with the cache-tmp-upload-path/wait-time options. Uploaded files became available immediately, without any of the complexity and overhead added by unionfs, background sync scripts, etc.

Is there any way with rclone to get this immediate availability of uploaded files (without having to use unionfs and background upload scripts)?

Hey. Thanks for all your work on this, guys. I’m using Animosity022’s last posted config. No bans, so far so good.
But two problems here:

  • The Plex buffer is very small compared to PlexDrive.
  • Plex sporadically stops playing a video after 1 second of playback (even very small ones). Sometimes I have to start the video 3-4 times before it finally works.

How can I improve things here? Thanks.

Debug logs would be your best bet as anything else would be guesses.