New Feature: vfs-read-chunk-size

Ya that 2G thing would be good to know. So far looking good. Starting to build out the role for it.

The 2GB isn’t part of the command. He’s saying that it would only count towards the ‘requested bytes’, which we can’t see and can’t tell whether there’s a limit on :slight_smile:

All in all, it seems nice, but it adds no value for me since the cache is faster and provides better bang for the buck, as I handle all my items on the cache rather than using extra scripts or whatnot.

My use case is just English Movies/TV though, so it’s a bit cleaner.

Do you write directly to the cache mount or do you use rclone move? The last time I used the cache backend I had problems with files ending up as 0 bytes. I even tried the --vfs-cache-mode writes and --cache-writes flags, but neither worked.

Without the cache backend, using only crypt instead, --vfs-cache-mode writes works flawlessly (thanks @B4dM4n for the fix once again!).

This actually is the reason why I am not using the cache backend right now.

I don’t use cache-writes; I use cache-tmp-upload and write directly to the mount with Sonarr/Radarr.

It autouploads in 60 minutes based on my config.

Just had a look at the Wiki and saw that the files are moved using rclone move, and that’s probably the reason why you are not getting the 0-byte files.

For people like me who do not have a lot of storage on their VPS, a fix for the 0-byte issue would be great. In my case, for instance, I do not have the storage to extract files locally; I have to extract them to the remote.

The only workaround is using the VFS layer on a crypt backend, which unfortunately means not being able to use the cache backend.

Sure I will write a post when the time comes :grinning:

Neither --vfs-read-chunk-size nor --vfs-read-chunk-size-limit will store or cache any data. When enabled, they instruct rclone to use HTTP range requests when getting data from a remote. A simple version of this technique was already used to enable seeking in files: a seek to position x would lead to an HTTP request with the range x to END. The new part is that --vfs-read-chunk-size y results in an HTTP request with the range x to (x+y), and the next range is requested automatically once the current one has been read completely.
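To illustrate, here is a minimal Python sketch of the two request shapes described above; the URL, seek position and chunk size are invented for the example, and this is not rclone’s actual code:

    from urllib.request import Request, urlopen

    url = "https://example.com/remote/file.mkv"  # hypothetical remote object
    x = 512 * 1024 * 1024                        # seek position (512M)
    y = 128 * 1024 * 1024                        # --vfs-read-chunk-size 128M

    # Old behaviour: a seek to x asks for everything from x to the end of the file.
    open_ended = urlopen(Request(url, headers={"Range": f"bytes={x}-"}))

    # With --vfs-read-chunk-size y: only x to x+y is requested; once that chunk has
    # been read completely, the next range (starting at x+y) is requested.
    chunked = urlopen(Request(url, headers={"Range": f"bytes={x}-{x + y - 1}"}))

The closed range is what allows the server to stop sending data at the chunk boundary, instead of streaming until the connection is dropped.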

If you set --vfs-read-chunk-size 1G and read a 10 GB file from your mount, rclone will issue 10 requests to the remote, each covering a range of only 1 GB, instead of the single 10 GB request it would issue without the flag.

It is a tradeoff between “increased number of API calls” and “wasted download quota if files are closed early”. This only makes sense for non-cached mounts. As explained in a previous post, some workloads can produce large amounts of “wasted download quota if closed early” on a non-cached mount.

To reduce the “increased number of API calls” overhead there is the second flag, --vfs-read-chunk-size-limit, which lets the requested HTTP range grow exponentially.
If you set --vfs-read-chunk-size 1G --vfs-read-chunk-size-limit 50G and read a 10 GB file from your mount, there will only be 4 requests: 0-1GB, 1GB-3GB, 3GB-7GB and 7GB-end.
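To make the arithmetic in the two examples concrete, here is a small Python sketch (again, not rclone’s code) that plans the ranges, assuming the chunk size doubles after every request until it reaches the limit:

    GB = 1024 ** 3

    def plan_ranges(file_size, chunk_size, limit=None):
        """Return the (start, end) byte ranges requested while reading a whole file."""
        ranges = []
        offset = 0
        while offset < file_size:
            end = min(offset + chunk_size, file_size)
            ranges.append((offset, end))
            offset = end
            if limit is not None:
                # the requested range doubles after each chunk, capped at the limit
                chunk_size = min(chunk_size * 2, limit)
        return ranges

    # --vfs-read-chunk-size 1G on a 10 GB file: ten 1 GB requests
    print(len(plan_ranges(10 * GB, 1 * GB)))  # 10
    # adding --vfs-read-chunk-size-limit 50G: 0-1GB, 1GB-3GB, 3GB-7GB, 7GB-end
    print([(s // GB, e // GB) for s, e in plan_ranges(10 * GB, 1 * GB, 50 * GB)])  # [(0, 1), (1, 3), (3, 7), (7, 10)]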

The numbers 128M and 2G from my first post are only rough guesses, which may need adjustment. They depend heavily on the use case and the daily limits of the remote. I will probably try 64M and 8G at some point for my Google Drive, but since the current values seem to work, I don’t see a need for a change.

It is worth a try, but unfortunately I think there will be no difference. There is no change in the way open files are handled.

This has been working perfectly. 3 servers scanning over the last 12 hours, 0 API bans, and all of them play well! No need for cache!

@B4dM4n, thanks for the explanation!

@gforce, what settings have you been using?

@B4dM4n - With the dir-cache-time at 48 hours, are you waiting 48 hours for a directory to update? I wasn’t sure how you were moving files up to your GD.

@Animosity022, it isn’t that flag that is responsible for that, but --poll-interval.

I have an even higher dir-cache-time and no problems with new files, although I have to admit that all changes are made within the mount, without any exception.

I see it. I thought that was for cache only.

https://github.com/Admin9705/PlexGuide.com-The-Awesome-Plex-Server/tree/Version-5/ansible/roles/pgdrive

Working through the permission issues for the mounts, but again, none of the servers have gone down :smiley: Cache is pointless to me after this!


Update: BETA 3 Released, permission issues gone; all works well!

Stick with these numbers. They work on 3 servers, and other members have reported that they work well.

@gforce - What did you finally end up using for the conf/mount? What’s the size of the libraries you were scanning? How long did the ‘cold’ scan take?

Also, are you writing directly to the mount or moving in some other way?

It sounds from this thread that --vfs-read-chunk-size is working very well - a great patch - thank you!

If we were to consider turning it on by default, would you use those values?

I’m just wondering how you arrived at those values.

I’m interested in whether anyone has tried reducing --vfs-read-chunk-size below 128M - it seems to me that reducing it would be beneficial in terms of quota usage, at the cost of a few more round trips while it is warming up.

I guess it makes sense not to have the value of --vfs-read-chunk-size smaller than the --buffer-size setting. Perhaps --buffer-size would be a sensible default (or twice that, like you have).

I did some testing this morning and it does work nicely, but it’s still slower than using the cache for me.

Even when primed up, the scan is still pretty slow.

I was testing with this:

ExecStart=/usr/bin/rclone mount gcrypt: /gmedia \
   --allow-other \
   --dir-cache-time=168h \
   --vfs-read-chunk-size 64M \
   --vfs-read-chunk-size-limit 2G \
   --buffer-size 512M \
   --syslog \
   --umask 002 \
   --rc \
   --log-level INFO

The API hits didn’t look bad at all and start time was just as fast as my cached setup. It definitely seems like a good alternative, but if you are looking for a similar experience to a local drive, the cache provides a bit more bang for the buck.

Everyone’s use case is different so was just sharing mine.

200 TB, 3 servers all scanning; unionfs sits on top of the personal drive and the teamdrive, meshing them. Downloaded files bounce to /mnt/move. I have a traditional move service transferring files at 9 MB/s, or supertransfer at max speed bypassing the limit.

What bitrates and resolutions are you playing?
Have you tried 80 Mb/s 4K files?

Thanks for sharing. That’s a bit more complex than I wanted, as I like the simplicity of my setup where everything thinks it’s local.

For both setups, I’ve played 50 Mb/s, 50 GB 4K files without a problem. I’ve never had a 4K download at much higher than that so far.

Are those remuxes that you have tried, or Web 4K files?