Testing for new vfs cache mode features

In case this helps convince you to add it to the roadmap :crossed_fingers:: both Windows and Linux support punching holes in files to zero out chunks and reclaim the space. On Linux it is limited to certain filesystems, but all the popular ones are supported. Docs:

https://man7.org/linux/man-pages/man2/fallocate.2.html
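For reference, hole punching can be tried from the shell with util-linux's fallocate(1), which wraps the syscall above (the offset, length and file path below are just placeholders):

# punch a hole over the first 64M of a cached file, keeping its apparent size
fallocate --punch-hole --offset 0 --length 64M path/to/cached/file

# compare apparent size with actual disk usage of the now-sparse file
du -h --apparent-size path/to/cached/file
du -h path/to/cached/file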


To make this more noticeable, I've pinned it so it stays at the top and folks will see it more often. The pin is currently set to expire at the end of the month.


Hey Nick,

It looks like you are storing meta by way of Item entries in Go. I didn't see how this translates to disk storage though. Are you going to make the vfs meta (chunks and/or internal db) available to be stored on a separate storage medium? e.g. --vfs-cache-db /nvme --vfs-cache-files /hdd

Also, will you be allowing parallel uploads? e.g. --vfs-cache-upload-limit 99

Also, will you be allowing a separate upload bandwidth limit? e.g. --vfs-cache-upload-bwlimit 99M

Here is the next beta

This should hopefully sort out downloading issues like downloading the entire universe in the background!

https://beta.rclone.org/branch/v1.52.0-044-g11d5be41-vfs-beta/ (uploaded in 15-30 mins)

- Download to multiple places at once in the stream
- Restart as necessary
- Timeout unused downloaders
- Close reader if too much data skipped
- Only use one file handle per item for writing
- Implement --vfs-read-chunk-size and --vfs-read-chunk-size-limit
- Fix deadlock between asyncbuffer and vfs cache downloader
- Fix display of the stream abandoned error, which should be ignored

I'm storing the metadata on disk as JSON blobs in a parallel directory structure, so in ~/.cache/rclone/vfs you'll find the data and in ~/.cache/rclone/vfsMeta you'll find the metadata.

You can use --cache-dir to direct where you want the cache to go. There isn't a separate setting for the metadata - it is very small though.
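To illustrate the layout (the remote name "gdrive" and the file path below are made up), the two trees mirror each other, so a cached file and its metadata blob sit at the same relative path:

~/.cache/rclone/vfs/gdrive/media/film.mkv      # the cached file data (a sparse file)
~/.cache/rclone/vfsMeta/gdrive/media/film.mkv  # small JSON blob describing what is cached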

The writeback cache allows --transfers uploads at once.

I think that is a separate issue (I'm pretty sure one already exists) about adding --txbwlimit and --rxbwlimit flags alongside --bwlimit, or making the --bwlimit flag take some funkier parameters.

By that, do you mean it takes the existing parameters into account?

Will --transfers limit the total amount (upload and download combined)?

For instance: if we write 20 1 GB files and then try to read a file that is not in the VFS cache, does the read stall while we wait for the uploads to finish with --transfers 10?

Handling the asymmetric bandwidth of most endpoints is what I'm looking to overcome.

It would be best if we could set an upload limit separately from a total or download limit.

It does, and it uses the existing mechanism that the VFS read code uses, so we keep the goodness embodied in that.

No, --transfers only limits the writeback uploads. The downloads done will depend on the application using rclone.
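As a sketch of how that fits together (the remote name and mount point are placeholders), something like this lets up to 8 queued files upload in parallel from the writeback cache, while reads are driven by whatever the applications on the mount ask for:

rclone mount gdrive: /mnt/gdrive --vfs-cache-mode full --transfers 8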

Here are a couple of issues you might find interesting


I don't think it is too hard to do. I'd probably try to think of a syntax for --bwlimit 10M where you could express the upload and download limit separately, maybe --bwlimit 10M:1M. This syntax needs to avoid a , or a space in order to fit into the existing bwlimit parsing (the timetable syntax already uses those as separators).

Some ideas

--bwlimit 10M    # limit up and down to 10M
--bwlimit 10M:1M # limit down to 10M and up to 1M
--bwlimit down10M:up1M # limit down to 10M and up to 1M
--bwlimit 10Md:1Mu # limit down to 10M and up to 1M

I think leaving out one of the two would leave the omitted one unlimited.


vfs read ahead is not a parameter any more? I guess the vfs read chunk size and buffer parameters are doing the same thing?

2020/06/09 20:50:04 Fatal error: unknown flag: --vfs-read-ahead

EDIT: I can confirm the new beta is working a lot better; it's not downloading like crazy anymore :slight_smile: Very well done. And I guess "Download to multiple places at once in the stream" will do wonders for p2p stuff too.

Sorry, I forgot to mention that I took that out! I came to the conclusion that, the way the internal buffering works, the read ahead is effectively --buffer-size, which we already have a flag for.

Great!

Thanks! And yes it should do, though I should say that I didn't put a limit on how many downloaders you could have at once, yet! The worst case would be the size of the file divided by the --buffer-size in use...
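To put a rough number on that worst case (illustrative figures only, assuming the default --buffer-size of 16M):

16 GiB file / 16 MiB buffer = up to 1024 downloaders for a single file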

What it will fix is the "N people streaming from one file" problem, which has been mentioned to me several times.

Do you think it would be possible to have stats about read cache hits? Like the % of chunk reads that were served from the cache, that sort of thing? I know I ask a lot :o


So demanding :wink: I hadn't thought of making stats - I think it is a great idea. I'll put it on the list. It would probably be an rc command which could give you some stats on the vfs, maybe vfs/stats. Or maybe make it part of the general rclone stats - then it could come out in the Prometheus stats.
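If it does land as an rc command called vfs/stats as suggested (still hypothetical at this point), querying it would presumably look something like this:

# run the mount with the remote control enabled
rclone mount gdrive: /mnt/gdrive --vfs-cache-mode full --rc
# then, from another terminal, ask for the proposed stats
rclone rc vfs/stats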


The newest beta addresses the speed issues I was seeing.


That is great news! I've been testing it on my home internet which isn't the quickest and it is as fast as everything else. I should do some local testing too I think.


I think I found a bug.

Log: https://pastebin.com/L8R2RU1Y

I cannot play this file on a mount on Windows 10.
It plays fine with rclone 1.51 with the same mount settings.

Edit: If I change the cache path from the default one to another drive then it works, so I guess it's the local path length.
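For anyone hitting the same thing, the workaround above is just pointing --cache-dir somewhere with shorter paths (the remote, drive letter and cache path below are examples only):

rclone mount gdrive: X: --vfs-cache-mode full --cache-dir D:\rclone-cache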

Just a comment. Because you're not using a db, it is now far easier to control the caching outside of rclone, which is great.

For my use case, I can see myself wanting to run something like this periodically to clear some file types:

find vfs -type f -ctime +1 -regextype posix-egrep -iregex ".*\.(mp4|m4v|mkv|iso|avi|srt|idx|sub|sfv)$" -size +10M -printf "%S\t%p\n" | grep -v "^0"

and pass that to xargs or similar to selectively prune certain files over X days old while leaving the rest long-term. It's easily scriptable now.
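For example, one way to finish that pipeline is the sketch below (the cut/xargs plumbing is mine, and rm is destructive, so swap in echo for a dry run first):

find vfs -type f -ctime +1 -regextype posix-egrep -iregex ".*\.(mp4|m4v|mkv|iso|avi|srt|idx|sub|sfv)$" -size +10M -printf "%S\t%p\n" \
  | grep -v "^0" \
  | cut -f2- \
  | xargs -d '\n' -r rm --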


The cache is really working well - that, and the fact that the downloaded data is more in line with what is needed. My queries per 100 seconds went from around 150-200 to 40-60 for the same usage.


Do you think the path is too long for the filing system? It will fall down there. It's not impossible to fix, but it introduces more complexity.

Yes I really like having the files available in a nice hierarchy.

I may at some point put the metadata into a database but I'll leave the files just like that!