Best practices for content that is read in chunks

Hello.

I have a gdrive remote mounted with rclone and fronted by an HTTP web server. That server is in turn fronted by a CDN with large file support enabled, so files are read in chunks (byte range requests) of 2 to 5 megabytes. The mount uses VFS caching, with the cache on a dedicated NVMe drive.

I'm trying to fine-tune the settings to optimize read latency and reduce iowait as much as possible. These are the relevant settings that the mount is running with:

  --vfs-read-chunk-size=64M \
  --vfs-read-chunk-size-limit=2048M \
  --vfs-cache-max-size=500G \
  --vfs-read-ahead=256M \
  --cache-dir=/mnt/md0/cache/rclone \
  --vfs-cache-mode=full \
  --vfs-cache-max-age=720h \
  --buffer-size=64M \
  --dir-cache-time=168h \
  --use-mmap \
  --async-read=true \

While this seems to work, iowait is a bit higher than I'd hope:

[image: iowait graph]

Without that much network activity:

[image: network activity graph]

Are there any recommended settings that I'm missing? Any recommended changes to any of the settings that I already have?

Also, can I get a confirmation that the chunks will likely be read from disk/memory and not generate additional network requests back to gdrive with the current flag choice (especially around buffer-size, vfs-read-chunk-size/limit and vfs-read-ahead)?

Thanks

iowait is a poor metric to watch with a cloud-based remote: the mount will always be waiting on I/O because it is fetching data from the cloud.

When you have something in cache, rclone still needs to compare what you have against the remote, so it does generate some traffic. I generally leave everything at the defaults unless I have a reason to change it.

These are rclone's controls for the range requests it makes to gdrive. Matching them (approximately) to what the CDN asks for might help, so try reducing the vfs read chunk size to, say, 8M.
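To make the effect of that change concrete, here is a rough sketch of the sizes rclone would request from gdrive during one uninterrupted sequential read. It relies on the documented behaviour that the chunk size starts at --vfs-read-chunk-size and doubles after each chunk until --vfs-read-chunk-size-limit is reached. The function name chunk_sequence is made up for illustration, sizes are in MiB, and the real pattern will differ as soon as the reader seeks.

```shell
# chunk_sequence START LIMIT COUNT
# Prints the first COUNT upstream chunk sizes (MiB) for one sequential read,
# assuming the size doubles after every chunk and is capped at LIMIT.
chunk_sequence() {
  size=$1
  limit=$2
  count=$3
  out=""
  i=0
  while [ "$i" -lt "$count" ]; do
    out="${out:+$out }$size"
    size=$((size * 2))
    if [ "$size" -gt "$limit" ]; then
      size=$limit
    fi
    i=$((i + 1))
  done
  echo "$out"
}

chunk_sequence 64 2048 7   # current flags: 64 128 256 512 1024 2048 2048
chunk_sequence 8 2048 7    # with 8M:       8 16 32 64 128 256 512
```

With the 8M starting size the first few upstream requests are much closer to the 2 to 5 megabyte ranges the CDN asks for, which is the point of the suggestion above.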

If you are concerned about latency then reducing this is a good idea. Reducing it will probably reduce throughput though.

If the chunks are on disk, then rclone will serve them from there.
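If you want to check this from the outside, one option is to request the same byte range twice through the web server and compare the timings. This is only a sketch: the helper names byte_range and time_range and the URL in the usage comment are placeholders, not anything from your setup. It uses curl's -r option for the range and -w '%{time_total}' to print the elapsed time.

```shell
# byte_range INDEX CHUNK_BYTES
# Prints "START-END" (inclusive) for the INDEX-th chunk, counting from 0,
# in the form curl's -r option expects.
byte_range() {
  start=$(( $1 * $2 ))
  end=$(( start + $2 - 1 ))
  echo "${start}-${end}"
}

# time_range URL RANGE
# Fetches RANGE from URL, discards the body, prints the total time in seconds.
time_range() {
  curl -s -o /dev/null -r "$2" -w '%{time_total}\n' "$1"
}

# Usage (placeholder URL, 4 MiB chunks, fourth chunk of the file):
#   r=$(byte_range 3 $((4 * 1024 * 1024)))
#   time_range "http://localhost:8080/path/to/file" "$r"   # first read
#   time_range "http://localhost:8080/path/to/file" "$r"   # repeat read
```

If the second request is much faster than the first, the range was most likely served from the VFS cache on disk rather than fetched from gdrive again. Run it against the web server directly, not through the CDN, otherwise you are measuring the CDN's cache instead.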

However, rclone may check that the file still exists when --dir-cache-time expires, or when the directory is changed externally and change notify fires.

Excellent!

My rationale for setting vfs-read-ahead to 256M is that I assume it helps ensure enough has been cached on disk by the time the CDN asks for the next chunk. I can also see how that might keep iowait high, and that wouldn't necessarily be a bad thing. My concern with iowait is requests getting blocked because of API limits, but I feel that reducing some of these values would increase API calls and make it more likely that I get rate limited.
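For a rough feel of how much the starting chunk size changes the number of calls, here is a back-of-the-envelope sketch. It assumes one uninterrupted sequential read and the documented doubling of the chunk size up to the limit; requests_needed is a made-up helper, sizes are in MiB, and seeks or several concurrent readers would change the numbers.

```shell
# requests_needed FILE_MIB START_MIB LIMIT_MIB
# Counts the upstream range requests needed to read FILE_MIB sequentially
# when the chunk size starts at START_MIB and doubles up to LIMIT_MIB.
requests_needed() {
  remaining=$1
  size=$2
  limit=$3
  n=0
  while [ "$remaining" -gt 0 ]; do
    remaining=$((remaining - size))
    n=$((n + 1))
    size=$((size * 2))
    if [ "$size" -gt "$limit" ]; then
      size=$limit
    fi
  done
  echo "$n"
}

requests_needed 1024 64 2048   # 1 GiB file, 64M start: 5
requests_needed 1024 8 2048    # 1 GiB file, 8M start:  8
```

Under those assumptions the difference for a full read of a large file is a handful of requests, because the doubling catches up quickly. The gap would be larger for workloads that seek a lot, since each new read position starts again from the small chunk size.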

I'll test the current settings a bit more, but I don't believe I've experienced any major problems with them. I'd also expect fewer and fewer origin requests as more content gets cached naturally, so blocked requests should become less of a concern. I might play around with the values once I reach a steady state in terms of cache hit ratio at the CDN.


What CDN would be caching this data? Is it actual data or encrypted content?