Testing for new vfs cache mode features

vfs read ahead is not a parameter anymore? I guess the vfs read size + buffer parameters are doing the same thing?

2020/06/09 20:50:04 Fatal error: unknown flag: --vfs-read-ahead

EDIT: I can confirm the new beta is working a lot better, it's not downloading like crazy anymore :slight_smile: Very well done. And I guess "Download to multiple places at once in the stream" will do wonders for p2p stuff too.

Sorry, forgot to mention that I took that out! I came to the conclusion that, the way the internal buffering works, the read ahead is effectively --buffer-size, which we already have a flag for.
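So if you want a bigger effective read ahead you would just raise the buffer instead, for example (remote name, mount point and the 64M value are placeholders for illustration, not a recommendation from this thread):

    rclone mount remote: /mnt/remote --buffer-size 64M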

Great!

Thanks! And yes it should do, though I should say that I didn't put a limit on how many downloaders you could have at once, yet! The worst case would be the size of the file divided by the --buffer-size in use...
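As a rough worked example (numbers assumed for illustration, not measured): a 10 GiB file streamed with the default --buffer-size of 16M could, in the theoretical worst case, end up with 10240 / 16 = 640 downloaders on that one file.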

What it will fix is the N people streaming from the one file problem which has been mentioned to me several times.

Do you think it would be possible to have stats about read cache hits? Like the % of chunk reads that were in the cache, that sort of thing? I know I ask a lot :o

So demanding :wink: I hadn't thought of making stats, I think it is a great idea. I'll put it on the list. It would probably be an rc command which could give you some stats on the vfs, maybe vfs/stats. Or maybe make it part of the general rclone stats - then it could come out in the prometheus stats.
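As a very rough sketch of what querying that might look like over the remote control API (vfs/stats here is hypothetical at this point - only core/stats exists today):

    # existing: aggregate transfer stats, also exported to Prometheus
    rclone rc core/stats
    # hypothetical: per-VFS cache stats such as chunk cache hit %
    rclone rc vfs/stats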

The newest beta addresses the speed issues I was seeing.

That is great news! I've been testing it on my home internet which isn't the quickest and it is as fast as everything else. I should do some local testing too I think.

I think I found a bug.

Log: https://pastebin.com/L8R2RU1Y

I cannot play this file on a mount on Windows 10.
It plays fine with rclone 1.51 with the same mount settings.

Edit: If I change the cache path from the default one to another drive then it works, so I guess it's the local path length.

Just a comment. Because you're not using a db, it is now far easier to control the caching outside of rclone which is great.

For my use case, I can see myself wanting to run something like this periodically to clear some file types:

    find vfs -type f -ctime +1 -regextype posix-egrep -iregex ".*\.(mp4|m4v|mkv|iso|avi|srt|idx|sub|sfv)$" -size +10M -printf "%S\t%p\n" | grep -v "^0"

and pass that to xargs or similar to selectively prune certain files over X days old while leaving the rest long-term. It's easily scriptable now, for example as sketched below.
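A sketch of the full pipeline (untested; GNU find/xargs assumed, run from the cache directory):

    # same find as above, then strip the sparseness column and delete the matching cache files
    find vfs -type f -ctime +1 -regextype posix-egrep \
        -iregex ".*\.(mp4|m4v|mkv|iso|avi|srt|idx|sub|sfv)$" -size +10M \
        -printf "%S\t%p\n" | grep -v "^0" | cut -f2- | xargs -r -d '\n' rm -v --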

The cache is really working well; that, and the fact that the downloaded data is more in line with what is needed, means my queries per 100 seconds went from around 150-200 to 40-60 for the same usage.

Do you think the path is too long for the file system? It will fall down there. It's not impossible to fix, but it introduces more complexity.

Yes I really like having the files available in a nice hierarchy.

I may at some point put the metadata into a database but I'll leave the files just like that!

Hello. Can someone confirm that --dir-cache-time is working for them with this beta? I've set it to 168h, and this morning a lot of the cache was cleaned, with messages in the rclone console that it was too old...

Thx!

--dir-cache-time controls the metadata caching, not the file caching; you want these for the file cache:

      --vfs-cache-max-age duration             Max age of objects in the cache. (default 1h0m0s)
      --vfs-cache-max-size SizeSuffix          Max total size of objects in the cache. (default off)

I might change these defaults...
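For example (a sketch only; the remote name, mount point and sizes are placeholders), keeping both the directory metadata and the cached file data around for a week would mean setting both families of flags:

    rclone mount remote: /mnt/remote \
        --vfs-cache-mode full \
        --dir-cache-time 168h \
        --vfs-cache-max-age 168h \
        --vfs-cache-max-size 100G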

Yes, I think the path is too long when it's stored in the default location.

"C:\Users\XXXXXX XXXXXXXX\AppData\Local\rclone\vfsMeta\Google-G-Suite-Crypt"

A strange "bug": some files refuse to open:

2020/06/11 10:42:57 ERROR : IO error: open RW handle failed to open cache file: vfs cache item: check object failed: vfs cache item: open truncate failed: vfs item truncate: failed to open cache file: Le chemin d’accès spécifié est introuvable. (meaning: the specified path cannot be found)
Now the path is pretty long, but I don't remember having this kind of error before.

If you want, I can give you access to 1 Gbps servers for testing!! That shouldn't slow you down lol

I'm really interested in trying this new cache mode... I just have some questions:

  1. Does it handle the same file being opened simultaneously well?

  2. It seems that when a file is opened, it needs to be buffered to disk first before giving data to the application. Isn't it possible to do both simultaneously? Like read the first chunk from the remote and serve it right away to speed up start times, then start caching the next chunks in async mode.

  3. Would any of my current settings conflict with whatever defaults you are using for this cache?

    --vfs-read-chunk-size=10M
    --vfs-read-chunk-size-limit=0
    --buffer-size=0
    --union-search-policy=eprand
    --async-read=false

  4. Do you think storing the cache in RAM would speed things up? Since it looks like a random read pattern, some HDDs could struggle, i.e. can the average HDD max out 1 Gbps if you are doing random reads of small chunks of files?

I just had a look through your log a bit more carefully

The path in question is 199 bytes. Added to C:\Users\XXXXXX XXXXXXXX\AppData\Local\rclone\vfsMeta\Google-G-Suite-Crypt, that could make it longer than 260 characters. See the Windows docs on path length.

In the local backend we are careful to use paths which look like this

To specify an extended-length path, use the "\\?\" prefix. For example, "\\?\D:\very long path".

Which do not have the 260 character limit. So it looks like I need to do that in the vfs cache also.

I'll factor that code out of the local backend to do this and it should fix this problem.

Have a go with this - it uses UNC paths on Windows so shouldn't have this problem. It is otherwise unchanged.

https://beta.rclone.org/branch/v1.52.1-065-gc0601011-vfs-beta/ (uploaded in 15-30 mins)

Are you on Windows? If so try the beta in the post above

Yes!

It writes each chunk to the disk then reads it back from the disk; however it should be in the OS cache at that point, so it's effectively instant. This could potentially be optimized further...

Let me know if it is a big problem - I don't think it should be.

Ah, you need --buffer-size to be set for the cache to be effective. I need to warn the user about this. I recommend the default of 16M to start with, e.g. the sketch below.
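A sketch of the settings from question 3 adjusted to follow that advice (values are illustrative, not a tested recommendation):

    --vfs-cache-mode=full
    --vfs-read-chunk-size=10M
    --vfs-read-chunk-size-limit=0
    --buffer-size=16M
    --async-read=false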

Frequently used files should get buffered by your OS, so provided your working set < your RAM you should be OK. If your working set > your RAM and you are using an HDD then nothing will save you except SSDs!
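If you do still want to experiment with a RAM-backed cache, one approach (a sketch, not something tested in this thread; sizes and paths are placeholders) is to point --cache-dir at a tmpfs on Linux:

    # create an 8G RAM-backed filesystem and use it as rclone's cache directory
    sudo mkdir -p /mnt/rclone-cache
    sudo mount -t tmpfs -o size=8G tmpfs /mnt/rclone-cache
    rclone mount remote: /mnt/remote --vfs-cache-mode full --cache-dir /mnt/rclone-cache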

No more errors with this beta, thx :)
