That is a little surprising... I'm interested in your investigations.
debug.log.zip.txt (265.0 KB) I took one file now and just turned debug on and grepped that out. It's just about a minute for that file.
Just remove the txt end and unzip it.
Oh I'm sorry as I figured out what's different...
Plex added the intro analysis and since I'm removing and adding them back in, it's re-doing the intro analysis so taking a minute rather than 3-10 seconds is actually expected.
Whew! I think once you've got the little bits of the files Plex reads cached, it should be a lot quicker.
I've been trying (and loving) the new cache mode, but I can't quite get it to work how I want no matter what options I try.
--cache-dir /mnt/vfscache \
--vfs-cache-mode full \
--vfs-cache-max-age 240h \
--vfs-cache-max-size 100G \
--vfs-read-chunk-size 16M \
--vfs-read-chunk-size-limit=off \
--buffer-size 1G \
--dir-cache-time 24h \
--multi-thread-streams=4 \
--multi-thread-cutoff 4G \
Here are the relevant options that I am messing around with.
Here is my use case, and expected behaviour:
- Play media file.
- Start right away (low initial vfs read chunk size to help with this?) and download continuously at max speed in the background sequentially (chunk size continues to grow allowing me to hit faster speeds).
- Continue to cache the WHOLE file, assuming it's been opened and read from multiple times (streaming, basically, as opposed to a mediainfo).
I figured this is what read chunk size + no size limit would allow me to do, but it caches some of the file, then stops, until I skip forward to an unbuffered part.
I then tried increasing buffer-size, which DID make a difference. Setting it to something obscenely high like 1G or 2G meant that rclone would download at max speed until it filled the buffer. However, the problem with a high buffer size was that the stream wouldn't start until this buffer was full. Shouldn't the stream start playing even while this initial buffer is being cached?
As for multi-thread streams, the documentation says it works with rclone mount, but I'm not sure how it plays together with this new cache implementation.
So am I misunderstanding cache-mode full? Or am I misunderstanding the options and doing something wrong?
It should, yes, but maybe something has got lost in translation...
When I tried this I couldn't replicate the problem - it started delivering data as soon as the file was opened, no matter how large the --buffer-size.
Can you test this again with the latest beta?
Where do the docs say it works with
rclone mount? It used to work with the old
--vfs-cache-mode full but won't any more.
Does this mean that every stream is single threaded now? Was that the same without vfs caching? Can chunked downloading be enabled with the new caching logic?
If you have multiple readers on the file then it will multi thread
The VFS has only ever been single threaded, except for when using cache mode full.
Yes that is how it works now. It will download chunks of the file into a sparse file and store a bit of metadata to keep track of which chunks are downloaded.
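To make that concrete, here's a toy model (my own sketch, not rclone's actual code) of the sparse-file approach: `truncate` creates a file of the right size with no data blocks allocated, chunks get written at their real offsets, and a set of chunk indices stands in for the on-disk metadata.

```python
import os

CHUNK = 16 * 1024 * 1024  # illustrative chunk size, not rclone's real value


class SparseCache:
    """Toy model of caching downloaded chunks into a sparse file."""

    def __init__(self, path, size):
        self.path = path
        self.present = set()  # chunk indices already downloaded (the "metadata")
        with open(path, "wb") as f:
            f.truncate(size)  # sparse file: full logical size, no data blocks yet

    def write_chunk(self, index, data):
        # For illustration we mark the whole chunk present after one write.
        with open(self.path, "r+b") as f:
            f.seek(index * CHUNK)
            f.write(data)
        self.present.add(index)

    def has_range(self, offset, length):
        # A byte range is served from cache only if every chunk it
        # touches has already been downloaded.
        first = offset // CHUNK
        last = (offset + length - 1) // CHUNK
        return all(i in self.present for i in range(first, last + 1))
```

A reader that asks for a range not yet in `present` is what triggers a download in the real implementation.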
So... I finally got some time to test again, and here are my results with the latest beta: it seems I'm getting the exact same issues.
rclone v1.52.2-280-gf50ab981-beta - os/arch: linux/amd64 - go version: go1.14.6
EDIT: All files are being direct played. Forgot to mention.
Using Plex for playback and my relevant settings again for reference:
--cache-dir /mnt/vfscache \
--vfs-cache-mode full \
--vfs-cache-max-age 240h \
--vfs-cache-max-size 100G \
--vfs-read-chunk-size 16M \
--vfs-read-chunk-size-limit=off \
--buffer-size 10G \
--dir-cache-time 24h \
--multi-thread-streams=4 \
--multi-thread-cutoff 4G \
Note the 10G buffer size; this is the primary variable I tested.
Case 1: 10G buffer size, small movie (4.6 GB):
Graph Explanation: Clicking Play on the movie led to a ~30s wait, during which the complete movie was cached, and only then did it start. This is shown on the network graph, confirmed by running a size check on the cache directory. Once completely cached, playback was smooth, and skipping was as expected of a local file.
Performance: Top notch, maxed out my Google Cloud VPS's 2 Gbit/s quite effectively.
Problem: A ~30s load time before starting any movie; in this case the file is not ready until the complete file is cached.
Case 2: 10G buffer size, big movie relative to cache size (21 GB):
Graph Explanation: Clicking Play on the movie led to a ~5s wait, during which some of the movie was cached (~2-5 GB). Playback and scrubbing were smooth for the cached part. Skipping to a non-cached part of the file led to a 10-20s wait (longer than the initial 5s for the movie load???).
Performance: The initial load maxed out the connection (2 Gbit/s), but only for part of the file, after which speeds fell off a cliff (500 Mbps). Skipping ahead allowed me to see faster speeds (1 Gbit/s), BUT not as fast as the initial speed.
Problem: This test performed much closer to what I expected with quick start playback. However, it did not cache 10gb of the file as I expected. Skipping ahead to an unbuffered part also took a longer time, while also not caching another 10gb.
Case 3: 16M buffer size, small movie (4.6 GB):
Graph Explanation: Clicking Play on the movie led to a ~5s wait, during which a very small part of the movie was cached. Scrubbing up to 30s in any direction of cached content seems to be fine. Skipping to an unbuffered part of the movie leads to a spike.
Performance: With a lower buffer size, network performance takes a significant hit. Speeds are in the 100-150 Mbps range at best.
Problem: Playback is essentially similar to vfs-read, except for slightly improved rewind/forward 30s skips (since that content is cached), at the expense of a cache directory. Expected behavior is for the cache to keep growing, since the read chunk size doubles as the file is read.
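For reference, the doubling behaviour mentioned above can be sketched like this (my own illustration of how successive HTTP Range request sizes grow from --vfs-read-chunk-size, capped by --vfs-read-chunk-size-limit):

```python
def chunk_sizes(initial=16 * 2**20, limit=None, n=6):
    """Sizes of successive Range requests during one sequential read:
    each chunk doubles the previous one, capped at `limit` (None = 'off')."""
    size, sizes = initial, []
    for _ in range(n):
        sizes.append(size)
        size = size * 2 if limit is None else min(size * 2, limit)
    return sizes

# With --vfs-read-chunk-size 16M and the limit off, requests grow
# 16M, 32M, 64M, 128M, 256M, 512M, ...
```

Note this only describes how much is *requested* from the remote per connection, not how much the VFS decides to read ahead or store locally.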
Case 4: 16M buffer size, big movie (21 GB):
Same as Case 3, except the amount of the file being cached is the same 100-200M, which in Case 3 meant 30s of playback in any direction; due to the higher bitrate, here it's more like 10s of playback in any direction. Spikes on the graph are: initial file open, forward to an uncached area (2 spikes), and forward again (3 spikes). Slightly higher peak performance.
Case 1: Complete file was cached before read.
Case 2: Expected behaviour, did not cache 10G.
Apologies if this got a little too long. I would include a DEBUG log, however it simply gets way too long in just a small timeframe of testing, and I am unsure how to extract only the relevant info for you instead of sending a ridiculously long log file. I am also unsure how to make a simplified test case.
And just to confirm, some questions:
--vfs-read-chunk-size 16M \
--vfs-read-chunk-size-limit=off \
Should these two settings allow me to exceed the specified cache size?
I also expected some other issues like playback completely stopping, playback not starting at all, intermittent buffering even on cached areas (might have to do with server-client connection). I don't experience these issues with a read only vfs cache.
Sorry, I was just referring to the old docs, and figured the new cache-mode also supported it.
That's generally your problem as it's using 10GB per file and causing it to fill 10GB of buffer. Generally best to use defaults for everything unless there is a reason for changing something.
Pretty sure that does nothing as mounts don't use it.
No, those are just range requests for the HTTP call to Google Drive and have nothing to do with the caching.
Are you direct playing those files? That has a ton of impact on moving around and the speed of your Plex comes into play if you are direct streaming/transcoding.
Well as you can see in Case 2, it worked kind of how I wanted. It was caching, but also concurrently playing. The problem was it wasn't caching enough actually. And I do have a reason for changing defaults, I want to load the whole file when playback is detected (network speed, bandwidth constraints, data wastage is not an issue).
Yup, sorry I've edited that into the main post. Everything was direct playing.
Thanks for a comprehensive series of tests. That kind of high level test is very useful.
Ah, I think that must be the problem... Rclone must be trying to fill the buffer and since you've set it to 10G that means the whole file is downloaded.
I tried to replicate this and I couldn't, so it would be useful for me to see the debug log for Case 1. I'm happy to look at a long debug log - I'm used to it now! If you want you can post a link here or PM me a link. If it just has Case 1 in then it will be easy enough (hopefully!) to work out what is going on.
Your Case 3 is how I imagine people using
--vfs-cache-mode full with a smallish buffer-size.
--vfs-cache-mode full only forward caches
--buffer-size of data so it doesn't download stuff you aren't going to need. So the 100-150mbps is probably expected as it is closer to the actual bitrate of the movie.
It sounds like what you would like is for the vfs cache to continue reading the file. The read-ahead setting at the moment is set by
--buffer-size - this could be another parameter. You could potentially set this to unlimited to read the entire file.
Note that these two parameters control the chunks that are requested from the remote end, they don't control the sizes of things stored on the local end. They are most useful for making Range requests on the remote so it doesn't think you've downloaded the whole file when you haven't because you closed the connection early (so using up your quota). The most efficient thing would be to set
--vfs-read-chunk-size bigger - the default is 128M. That will certainly help with your slow starts... You could make it bigger still if you wanted say 1G.
--vfs-read-chunk-size 16M \
--vfs-read-chunk-size-limit=off \
That's unfortunately not how Plex generally works, as it depends on how you are playing the file. For example, when you play, it opens/closes the file 3 times, so having a huge buffer makes things slower even if the file is being sequentially read/direct played.
If you have more streams going as well, that would not work as well, as it would be buffering a ton of data that's not needed; the majority of players have 'tiny' buffers on them, so having something huge tends to be counterproductive.
If you have plenty of capacity, why front load it? It would get loaded anyway, naturally, as it's being watched.
If you share the why of your use case, it helps others/me understand why you'd want a huge value, as that would blow up the majority of systems unless you really have a lot of spare memory.
Yup, I will PM you one in just a bit.
Ah, I see, I guess that would be what I want then. Ideally, a dynamic read buffer that can grow, similar to vfs-chunk-size. i.e. Would it be possible to detect sequential playback vs random playback?
File 1 is opened at Points 1, 2, 3... and if so, grow the buffer.
File 1 is opened at Points 1, 87, 173... and if so, don't grow the buffer. This way a mediainfo, Plex scan, or seeding (lots of random reads) wouldn't download unnecessary chunks.
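A heuristic along those lines could be very simple. This is purely my own sketch of the idea being proposed, not anything rclone does: treat a file as "sequential" if each read lands within a couple of chunks of the previous one.

```python
def looks_sequential(offsets, chunk=16 * 2**20, tolerance=2):
    """Heuristic sketch: reads count as sequential if each offset is
    within `tolerance` chunks of the previous one. A mediainfo probe
    or Plex scan jumping around the file would fail this test, so the
    read-ahead buffer would never be grown for it."""
    return all(abs(b - a) <= tolerance * chunk
               for a, b in zip(offsets, offsets[1:]))
```

So opens at chunks 1, 2, 3 would grow the buffer, while opens at chunks 1, 87, 173 would not.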
I'll try that. Thank you for the clarification.
Ah ok, interesting. Are you aware if Emby/JF operate a different way? What about using MPC/VLC to play from a cloud mount? I'm guessing the latter is definitively sequential with a single file open.
As I understood it, the buffer size is controlling the read ahead and how much to cache to disk rather than to memory (it does affect memory in non-cache implementations). I have a large amount of disk space (not memory) and I am happy to cache whatever for the sake of playback convenience.
The ideal case would be disk-level playback and seeding low-trafficked files at high speeds. ie. 3 year old torrent with 2 seeders.
- Peer requests chunk from file.
- Chunk is sent, complete file downloaded in background.
- The rest of the chunks are requested and sent with minimal latency, since everything has been cached.
- Useful for skipping around a movie (by chapters for example)
- Skipping intros for TV shows/logos for movies (1-2 min of cache ahead is good here)
With a big enough cache, complete file caching might even allow me to easily cache frequently played content (i.e. the newest blockbuster movie is played often, so it just never expires from the cache). Having the full file on disk ready to go means I can sometimes even avoid the initial 5s load time, which has become normal for me but isn't for others who are used to Netflix, for example.
The way it works at the moment is if you read from point 1 then skip to point 7, rclone will carry on reading
--buffer-size data from point 1 ahead and fill up the buffer, so you'll end up with 2x
--buffer-size data but no more read from point 1. Rclone times out after 5 seconds of inactivity for each read point at which point the buffer will be dumped to disk and the stream closed.
You'll notice the
--buffer-size parameter was used in two ways there, one for the read-ahead and one for the size of the buffer. We could have another parameter, lets say
--vfs-read-ahead to control this instead.
This will leave
--buffer-size controlling the size of the read-ahead buffer (yes another kind of read-ahead this time from network to memory) and
--vfs-read-ahead controlling the size of read-ahead that gets stored to disk.
If you were to set this to say 1G then rclone would read up to 1G ahead of where the reader had read to. It has only got 5 seconds to do this though, so in practice rclone would read ahead 5 seconds of data transfer from point 1 (assuming your connection can't read 1G in 5 seconds - if it can then it will read 1G).
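To put rough numbers on that (using the 2 Gbit/s link mentioned earlier and the 5 second downloader timeout; the function name is mine, for illustration):

```python
def readahead_before_timeout(link_bits_per_s, window_s=5.0, target_bytes=2**30):
    """How many bytes a downloader can read ahead before the 5 s
    inactivity timeout, and whether it reaches the full target."""
    possible = link_bits_per_s / 8 * window_s  # bits -> bytes, over the window
    return min(possible, target_bytes), possible >= target_bytes

# 2 Gbit/s link: 5 s is worth ~1.25 GB, so a 1 GiB read-ahead completes.
# 100 Mbit/s link: 5 s is only ~62.5 MB, so the downloader times out first
# (unless the reader keeps reading and resets the inactivity timer).
```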
For point 7, because the reader is still reading from that point slower than rclone can read from the network, eventually rclone will get the full 1G ahead + --buffer-size.
When I did this work originally I didn't want to add another VFS parameter because, quite honestly, there are too many parameters as it is! However we've seen from the mostly positive experiences in this thread that
--buffer-size is probably a reasonable default so I could introduce
--vfs-read-ahead and default it to "off", which will mean: use the value of --buffer-size.
I think this would do pretty much what you want provided you don't mind downloading 5 seconds of data on each seek.
I've taken a look at the log... What happens is
- Plex reads a bit from the start of the file
- Rclone starts a reader
- Rclone waits for the reader
- Rclone returns the data to Plex - all good
- Plex reads a bit from the end of the file (4GB or so offset)
- Rclone notices it has a reader already (from step 2) and that it is within
--buffer-size(10G) of the read request point (4G) so it waits for the reader to get to that point
- After reading for 30 seconds or so and having cached 4GB of data, rclone returns the 7k that Plex requested
Not starting a new stream when we have one within
--buffer-size is useful because that stream might already have the data in its buffer and it is best not to start a new stream if we don't have to. However when buffer size is 10G things work as you see above!
So I think this another argument for keeping
--buffer-size small and having a
--vfs-read-ahead parameter. Let's say you had
--buffer-size 16M and
--vfs-read-ahead 1G, in this case what would have happened is that at step 6, rclone would have opened a new stream, which would have avoided that 30 second delay.
Looking at this log I can see Plex opens and closes the file 4 times.
This is another thing leading to delays, because when Plex closes the file, rclone closes all the readers on that file. So after doing the mediainfo it had a perfectly good reader at the start of the file, which was closed because Plex closed the file before re-opening it to play the video.
So that makes me think rclone should maybe keep the readers open until they expire due to getting to the end of the file, or 5 seconds of inactivity, rather than closing them when the file is closed. I'm not 100% sure whether that would be easy or not but it would improve performance here I think. It might kill performance elsewhere though so needs thought.
When not using
--vfs-cache-mode full setting
--buffer-size large is effective at stopping buffering at the potential cost of using lots of memory. Because rclone doesn't store stuff to disk here, it stores the read-ahead buffer in memory. However when using
--vfs-cache-mode full we don't need to store the buffer in memory as we can store it on disk, so this buffer can be small as its sole purpose here is to allow a small amount of read-ahead in the network.
- large --buffer-sizes are bad - maybe rclone should emit a warning here when using --vfs-cache-mode full suggesting setting it smaller
- a --vfs-read-ahead parameter looks like a good idea
- not closing the downloaders when the file is closed might be a good idea (more thought needed)
- waiting to hear back from you whether using --vfs-read-chunk-size 128M (or larger) helps with the initial slow start with --vfs-cache-mode full
I think I'll split the VFS docs out of the code and put them on their own page so I can write more about everything!
That looks like a good idea.
I think that would require more testing and I'm hesitant to make big changes there. Use case wise, I think most people are streaming things? I'm sure there are other use cases though other than just streaming.
I've never seen a slow start and I've been using the defaults now for my testing anyway.
how do you feel about a
--vfs-read-ahead-delay (or a hard coded value), so that
--vfs-read-ahead is only used after a certain time (or bytes?) of sequential read activity, to be fairly sure we actually need it?
Here is a beta with the --vfs-read-ahead flag @aga72 can you give this a go with a normal sized --buffer-size and making --vfs-read-ahead larger?
Interesting idea... Note that if you read 1 byte from a file you'll only get 5 seconds of read-ahead before the downloader times out, so it isn't unlimited unless you keep reading from the file.
I've tested it successfully with the default --buffer-size.
Works great! Starts streaming fast as always AND it seems like the whole file (3.5 GB x265 movie) is downloading via --vfs-read-ahead.
I'm just giving it the VFS torture test and if it passes I'll merge it.
While investigating this I discovered a bug which was probably affecting everyone. If you are streaming a file slower than the network can deliver it, rclone kept closing the downloader and reopening it whenever the buffer was full. This would potentially have led to stutter. It is certainly inefficient. Fixed now!