Testing for new vfs cache mode features

Yup, I will PM you one in just a bit.

Ah, I see, I guess that would be what I want then. Ideally, a dynamic read buffer that can grow, similar to the vfs chunk size. I.e. would it be possible to detect sequential playback vs random playback?

File 1 is opened at Point 1, 2, 3.... and if so grow buffer.

File 1 is opened at Point 1, 87, 173... and if so don't grow buffer. This way a mediainfo, Plex scan, or seeding (lots of random reads) wouldn't download unnecessary chunks.
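To make the request concrete, here is a minimal sketch in Python (not rclone's actual Go code) of one way the two access patterns could be told apart. The names looks_sequential and next_buffer_size, the tolerance of one chunk, and the doubling policy are all assumptions made up for illustration.

```python
def looks_sequential(offsets, chunk_size, tolerance=1):
    """Return True if every read starts in the same chunk as the previous
    read or at most `tolerance` chunks after it (hypothetical heuristic)."""
    chunks = [off // chunk_size for off in offsets]
    return all(0 <= b - a <= tolerance for a, b in zip(chunks, chunks[1:]))


def next_buffer_size(current, offsets, chunk_size, maximum):
    """Double the buffer on sequential access, otherwise leave it unchanged."""
    if len(offsets) >= 2 and looks_sequential(offsets, chunk_size):
        return min(current * 2, maximum)
    return current


# Points 1, 2, 3 grow the buffer; points 1, 87, 173 do not.
print(next_buffer_size(16, [1, 2, 3], chunk_size=1, maximum=64))     # 32
print(next_buffer_size(16, [1, 87, 173], chunk_size=1, maximum=64))  # 16
```

A real implementation would have to track this per open file and decide what to do when several readers interleave, which this sketch ignores.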

I'll try that. Thank you for the clarification.

Ah ok, interesting. Are you aware if Emby/JF operate a different way? What about using MPC/VLC to play from a cloud mount? I'm guessing the latter is definitively sequential with a single file open.

As I understood it, the buffer size is controlling the read ahead and how much to cache to disk rather than to memory (it does affect memory in non-cache implementations). I have a large amount of disk space (not memory) and I am happy to cache whatever for the sake of playback convenience.

The ideal case would be disk-level playback and seeding low-trafficked files at high speeds, e.g. a 3 year old torrent with 2 seeders.

Seeding scenario:

  1. Peer requests chunk from file.
  2. Chunk is sent, complete file downloaded in background.
  3. Rest of the chunks are requested and sent with minimal latency, since everything has been cached.

Disk-level playback:

  1. Useful for skipping around a movie (by chapters for example)
  2. Skipping intros for TV shows/logos for movies (1-2 min of cache ahead is good here)

With a big enough cache, complete file caching might even allow me to easily cache frequently played content (e.g. the newest blockbuster movie is played often so it just never expires from cache). Having the full file on disk ready to go means I can sometimes even avoid the initial 5s load time which has become normal for me, but isn't for others who are used to Netflix for example.

The way it works at the moment is that if you read from point 1 then skip to point 7, rclone will carry on reading --buffer-size of data ahead from point 1 and fill up the buffer, so you'll end up with 2x --buffer-size of data but nothing more read from point 1. Rclone times out each read point after 5 seconds of inactivity, at which point the buffer is dumped to disk and the stream is closed.

You'll notice the --buffer-size parameter was used in two ways there, one for the read-ahead and one for the size of the buffer. We could have another parameter, let's say --vfs-read-ahead, to control this instead.

This will leave --buffer-size controlling the size of the read-ahead buffer (yes, another kind of read-ahead, this time from network to memory) and --vfs-read-ahead controlling the size of the read-ahead that gets stored to disk.

If you were to set this to say 1G then rclone would read up to 1G ahead of where the reader had read to. It has only got 5 seconds to do this though, so in practice rclone would read ahead 5 seconds of data transfer from point 1 (assuming your connection can't read 1G in 5 seconds - if it can then it will read 1G).

For point 7, because the reader is still reading from that point slower than rclone can read from the network, then eventually rclone will get the full 1G ahead + --buffer-size.
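As a back-of-the-envelope check of the two cases above, here is a small Python sketch. The function name, the example bandwidth of 50 MiB/s, and the simplification that an abandoned read point gets exactly 5 seconds of transfer are assumptions for illustration, not measurements.

```python
IDLE_TIMEOUT_S = 5  # inactivity timeout per read point, as described above

MIB = 1024 ** 2
GIB = 1024 ** 3


def expected_read_ahead(read_ahead, bandwidth, buffer_size, reader_active):
    """Rough estimate, in bytes, of how far ahead of the reader rclone gets.

    read_ahead   -- value of --vfs-read-ahead in bytes
    bandwidth    -- assumed network throughput in bytes per second
    buffer_size  -- value of --buffer-size in bytes
    """
    if reader_active:
        # The player keeps reading (point 7): the full read-ahead is
        # eventually reached, plus the in-memory buffer.
        return read_ahead + buffer_size
    # The read point was abandoned (point 1): the downloader only has the
    # idle timeout to fetch data before it is closed.
    return min(read_ahead, bandwidth * IDLE_TIMEOUT_S)


# Abandoned point 1 with --vfs-read-ahead 1G on an assumed 50 MiB/s link:
print(expected_read_ahead(GIB, 50 * MIB, 16 * MIB, reader_active=False) // MIB)  # 250
# Active point 7 with the same settings:
print(expected_read_ahead(GIB, 50 * MIB, 16 * MIB, reader_active=True) // MIB)   # 1040
```

So on that assumed link each seek costs roughly 250 MiB of extra download, not the full 1G.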

When I did this work originally I didn't want to add another VFS parameter because, quite honestly, there are too many parameters as it is! However we've seen from the mostly positive experiences in this thread that --buffer-size is probably a reasonable default so I could introduce --vfs-read-ahead and default it to "off" which will mean - use the value of --buffer-size.

I think this would do pretty much what you want provided you don't mind downloading 5 seconds of data on each seek.

I've taken a look at the log... What happens is

  1. Plex reads a bit from the start of the file
  2. Rclone starts a reader
  3. Rclone waits for the reader
  4. Rclone returns the data to Plex - all good
  5. Plex reads a bit from the end of the file (4GB or so offset)
  6. Rclone notices it has a reader already (from step 2) and that it is within --buffer-size (10G) of the read request point (4G) so it waits for the reader to get to that point
  7. After reading for 30 seconds or so and having cached 4GB of data, rclone returns the 7k that Plex requested

Not starting a new stream when we have one within --buffer-size is useful because that stream might already have the data in its buffer and it is best not to start a new stream if we don't have to. However when buffer size is 10G things work as you see above!

So I think this is another argument for keeping --buffer-size small and having a --vfs-read-ahead parameter. Let's say you had --buffer-size 16M and --vfs-read-ahead 1G. In this case, at step 6, rclone would have opened a new stream, which would have avoided that 30 second delay.
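The decision at step 6 can be sketched in a few lines of Python. This is a simplified model of the behaviour described above, not rclone's code; the function names are hypothetical and the reader is assumed to still be near the start of the file.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def within_buffer(reader_pos, request_offset, buffer_size):
    """True if an existing reader is at or behind the requested offset and
    no more than buffer_size away from it."""
    return 0 <= request_offset - reader_pos <= buffer_size


def handle_read(reader_pos, request_offset, buffer_size):
    """Return which action the simplified model takes for a read request."""
    if within_buffer(reader_pos, request_offset, buffer_size):
        return "wait for existing reader"
    return "open new stream"


# Plex asks for data at a ~4G offset while a reader sits at the start.
print(handle_read(0, 4 * GIB, buffer_size=10 * GIB))  # wait for existing reader
print(handle_read(0, 4 * GIB, buffer_size=16 * MIB))  # open new stream
```

With the 10G buffer the request falls inside the window, so the 7k read has to wait for 4GB to be streamed first. With 16M it falls outside and gets its own stream.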

Looking at this log I can see Plex opens and closes the file 4 times.

This is another thing leading to delays because when Plex closes the file, rclone closes all the readers on that file. So after doing the media info it had a perfectly good reader at the start of the file, which was closed because Plex closed the file before re-opening it to play the video.

So that makes me think rclone should maybe keep the readers open until they expire due to getting to the end of the file, or 5 seconds of inactivity, rather than closing them when the file is closed. I'm not 100% sure whether that would be easy or not but it would improve performance here I think. It might kill performance elsewhere though so needs thought.
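For illustration, here is a minimal Python sketch of that proposal: readers survive a file close and only go away after 5 seconds without activity. The ReaderPool class and its method names are hypothetical, reaching end of file is left out, and time is passed in explicitly to keep the example deterministic.

```python
IDLE_TIMEOUT_S = 5.0  # inactivity timeout mentioned above


class ReaderPool:
    """Toy model of the readers kept for one file (hypothetical)."""

    def __init__(self):
        self.last_activity = {}  # reader name -> time of last read

    def touch(self, reader, now):
        """Record that `reader` served data at time `now`."""
        self.last_activity[reader] = now

    def on_file_close(self):
        """Proposed behaviour: closing the file leaves the readers alone."""

    def expire(self, now):
        """Close and return the readers idle for at least the timeout."""
        expired = [r for r, t in self.last_activity.items()
                   if now - t >= IDLE_TIMEOUT_S]
        for r in expired:
            del self.last_activity[r]
        return expired


pool = ReaderPool()
pool.touch("start-of-file", now=0.0)  # media info scan reads the header
pool.on_file_close()                  # Plex closes the file
print(pool.expire(now=3.0))           # [] - reader still usable on re-open
print(pool.expire(now=5.0))           # ['start-of-file']
```

In this model a re-open within 5 seconds finds the reader at the start of the file still alive, which is the case the log shows being lost today.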

When not using --vfs-cache-mode full, setting --buffer-size large is effective at stopping buffering, at the potential cost of using lots of memory. Because rclone doesn't store anything to disk here, it keeps the read-ahead buffer in memory. However, when using --vfs-cache-mode full we don't need to keep the buffer in memory as we can store it on disk, so this buffer can be small, as its sole purpose here is to allow a small amount of read-ahead from the network.

Summary

  • big --buffer-sizes are bad - maybe rclone should emit a warning here when using --vfs-cache-mode full suggesting setting --vfs-read-ahead instead?
  • a --vfs-read-ahead parameter looks like a good idea
  • not closing the downloaders when the file is closed might be a good idea (more thought needed)
  • waiting to hear back from you whether using --vfs-read-chunk-size 128M (or larger) helps with the initial slow start with --buffer-size 16M

I think I'll split the VFS docs out of the code and put them on their own page so I can write more about everything!

That looks like a good idea.

I think that would require more testing and I'm hesitant to make big changes there. Use case wise, I think most people are streaming things? I'm sure there are other use cases though other than just streaming.

I've never seen a slow start and I've been using the defaults now for my testing anyway.

How do you feel about a --vfs-read-ahead-delay (or a hard-coded value), so that --vfs-read-ahead is only used after a certain time (or number of bytes?) of sequential read activity, to be fairly sure we actually need it?
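A byte-based version of that suggestion could look roughly like the Python sketch below. Note that --vfs-read-ahead-delay is only a proposal in this thread, and the SequentialGate class, its threshold, and its strict "next read starts where the last one ended" rule are assumptions for illustration.

```python
class SequentialGate:
    """Allow read-ahead only after `threshold` bytes of contiguous reads
    (hypothetical model of the proposed delay)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.next_offset = None  # where a contiguous read would start
        self.run_bytes = 0       # length of the current contiguous run

    def on_read(self, offset, length):
        """Record a read and return True if read-ahead should be active."""
        if offset == self.next_offset:
            self.run_bytes += length   # continues the current run
        else:
            self.run_bytes = length    # a seek starts a new run
        self.next_offset = offset + length
        return self.run_bytes >= self.threshold


gate = SequentialGate(threshold=3 * 1024)
print(gate.on_read(0, 1024))      # False
print(gate.on_read(1024, 1024))   # False
print(gate.on_read(2048, 1024))   # True - enough sequential data seen
print(gate.on_read(10**9, 1024))  # False - a seek resets the run
```

A scan that reads a little at the start and a little at the end of the file never reaches the threshold, so it would never trigger the big read-ahead.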

Here is a beta with the --vfs-read-ahead flag. @aga72, can you give this a go with a normal sized --buffer-size and a big --vfs-read-ahead?

v1.52.2-283-gd786027e-vfs-beta on branch vfs (uploaded in 15-30 mins)

Interesting idea. Note that if you read 1 byte from a file you'll only get 5 seconds of read-ahead before the downloader times out, so it isn't unlimited unless you keep reading from the file.

I've tested it successfully with the default --buffer-size and --vfs-read-ahead 4G.
Works great! Starts streaming fast as always AND it seems like the whole file (3.5 GB x265 movie) is downloaded via --vfs-read-ahead.

Great!

I'm just giving it the VFS torture test and if it passes I'll merge it.

While investigating this I discovered a bug which was probably affecting everyone. If you were streaming a file slower than the network could deliver it, rclone kept closing the downloader and reopening it whenever the buffer was full. This could have led to stutter, and it is certainly inefficient. Fixed now!

What would be a good value for --buffer-size with a big --vfs-read-ahead such as 4G?
(with 10G link and 32GB RAM)

The purpose of --buffer-size is to provide the network read-ahead buffer, allowing the thread reading from the network to get a bit ahead of the thread reading from memory. With a large --vfs-read-ahead I don't think --buffer-size needs to be big - the default of 16M should be OK. You could try making it bigger (32M or 64M) or even removing it. All those choices have consequences which are difficult to predict in advance!

Thanks for your advice. Will keep testing it with the default --buffer-size.

Will these features be merged to beta soon? I am looking forward to them, but I'm running in Docker and don't want to build it myself :)

It’s already in beta.

It's in a separate branch, not in the master branch from where the beta builds go out.

So when would that branch be merged to master? :) Looking forward to these features.

I've merged it now - I just needed to complete the VFS torture test on it!

v1.52.2-285-g4d7f9130-beta on branch master (uploaded in 15-30 mins)

Is the default size of --vfs-read-ahead 16M, or something else?

Hello, I am trying to use --vfs-read-ahead for Plex but I don't know if what I'm doing is correct.

I used curl https://rclone.org/install.sh | sudo bash -s beta to upgrade rclone on Ubuntu on a Raspberry Pi 4, and now I have version v1.52.3-294-g61c7ea40-beta

I have this configuration:

[team1prueba]
type = drive
client_id = *******
client_secret = *********
scope = drive
root_folder_id = *********
token = ********
team_drive = *********

[team1pruebacache]
type = cache
remote = team1:/prueba
plex_url = http://xxx.xxx.x.xx:32400
plex_username = *********
plex_password = ********
chunk_size = 10M
info_age = 1d
chunk_total_size = 10G
plex_token = **********

I use this command:

rclone mount team1prueba:/ /home/ubuntu/team1prueba --allow-other

and

rclone mount team1pruebacache:/ /home/ubuntu/team1pruebacache/ --allow-other --vfs-read-ahead 5G

Plex takes data from team1pruebacache but I think it doesn't work. I think I'm doing everything wrong :)

Thanks.

You need to remove the cache backend as this is not part of it. Point to your regular remote if you want to test this.

Hello, like this?

[team1pruebacachevfs]
type = drive
client_id = **********
client_secret = *********
scope = drive
token = **********
team_drive = **********
plex_url = http://xxx.xxx.x.xx:32400
plex_username = ********
plex_password = ********
plex_token = *************

using:

rclone mount team1pruebacachevfs:/ /home/ubuntu/team1pruebacache2/ --allow-other --vfs-read-ahead 5G

I have tested it, but the rclone/.cache dir doesn't show anything: the film plays but it doesn't download to the cache.

You can remove all of this.

You need to add this to your command to use the new feature.