Rclone, s3, nginx and large video file startup slowness

I'm trying to use rclone to serve an S3-compatible bucket (Linode Object Storage). I'm experimenting with an nginx caching solution for large video files: nginx writes its sliced cache of a large file in 64 MB chunks into the S3 bucket via rclone. The bucket is co-located with the Linode cloud instance (same datacenter), so latency is very low and performance is quite good (pulls from object storage hit upwards of 5 Gbps, in some cases >10 Gbps). I'm using full vfs cache mode with a small 100 GB local cache, so presumably things will fall out of this cache fairly quickly but should still be available in the bucket. This should all be transparent to nginx, whose cache manager decides whether or not an object is in cache, regardless of whether it's served from rclone's local vfs cache or the bucket. The idea is that, instead of going to the origin at around 200 Mbps, performance should be much better when pulling cached content from the local bucket.
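For reference, the mount is set up roughly like this (remote name, bucket, and paths are placeholders, not my literal setup):

```shell
rclone mount linode:nginx-cache /mnt/rclone-cache \
  --vfs-cache-mode full \
  --vfs-cache-max-size 100G \
  --allow-other \
  --log-file /var/log/rclone.log -v
```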

When a video is played from the beginning, things look good: nginx slices the file into chunks and places them in the rclone-mounted path, and rclone schedules uploading them to the bucket while leaving a local copy in the vfs cache. As long as the chunks are local in the vfs cache, things work well. Resuming playback of files on the small side (a couple of GBs) also works well even if they're not in the rclone vfs cache, as pulling these from object storage is very fast.

What I noticed is that resuming playback of large files (40 GB+) when playback was stopped deep into the file (say 1 hour into a 90-minute film) is slow. Looking at server activity when this happens, I can see a spike in network utilization. The time to resume varies, but the deeper into the file playback was stopped, the longer it takes to resume. So my assumption is that rclone is downloading chunks from the start of the file until enough has been downloaded to satisfy the client's requested byte range, and the first byte isn't actually delivered until this happens. It's unclear to me whether this is rclone behavior or just a reaction to how nginx behaves, so I need to look at some logs. Any clues about what I should be looking for? If I enable verbose rclone logging I see a lot of output, but I'm not sure what to look for to confirm whether this is caused by how nginx deals with its cache.

Any pointers are appreciated.

Rclone doesn't do this - it should go straight to the requested byte range.

It might be nginx reading the entire file for some reason (maybe to make a hash to see if it has changed)?

And I think this is right. Nginx has a directive (proxy_cache_max_range_offset) that allows skipping the cache for large ranges. When I enable it, playback appears to resume quickly by reusing the ranges already in cache, while any new ranges not in the cache are forwarded straight to the origin. So anecdotally nginx seems to be to blame; however, I was wondering if there is any logging I could enable on rclone to confirm that it is indeed requesting all of these separate files.
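For anyone searching later, a minimal sketch of what I enabled (cache zone name, origin, and the 1 GB threshold are made up for illustration):

```nginx
# Byte-range requests starting beyond this offset bypass the cache and
# go straight to the proxied server; the response is not cached.
location / {
    slice 64m;
    proxy_set_header Range $slice_range;
    proxy_cache my_cache;
    proxy_cache_key $uri$slice_range;
    proxy_cache_valid 200 206 1h;
    proxy_cache_max_range_offset 1073741824;  # 1 GB, illustrative value
    proxy_pass https://origin.example.com;
}
```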

Side question: assuming nginx is indeed checking every file to see if it has changed, is there any flag in rclone to do that against the remote without actually downloading the file? I tried playing around with "--size-only" and "--checksum", but I didn't notice any performance improvement; the file was still downloaded every time.

Are you running rclone mount or rclone serve?

The way to see this is to use logging with -vv. There you'll see every request in gory detail!

I'm assuming that you are running rclone mount? If nginx is reading the files all the way through then rclone will fetch them from the source.

The log will show exactly what nginx is doing - if you need help interpreting it then post it as a GitHub gist (or similar, e.g. pastebin) and put a link here.

Yup it's a mount.

I couldn't reproduce just now because I believe most of this file is still cached locally but that shouldn't matter for the purposes of validating what nginx is doing:

I'm not sure if there's a read operation for every file all the way to the end based on this.

That would be very surprising to me, and a very inefficient way of determining cache consistency if it were the case. A more proper way would be to request the first byte of the file and assume all of the chunks for that file are stale if that first byte's Last-Modified or ETag has changed. I'm not sure how to confirm this, however, since nginx doesn't seem to have a mechanism for advanced debug logging of its internals (at least based on what I've researched).
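E.g., a one-byte conditional request would be enough to detect a change (the URL and ETag value here are hypothetical):

```shell
# Returns 304 if the object is unchanged, 206 (with one byte) if it changed.
curl -s -o /dev/null -w '%{http_code}\n' \
  -H 'Range: bytes=0-0' \
  -H 'If-None-Match: "hypothetical-etag"' \
  https://origin.example.com/video.mkv
```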

Just to close the loop in case anyone cares: I believe I've worked around this. What I did was set up a nested nginx configuration.

In the default vhost I set up caching so that the slice module is enabled but all requests are forwarded to the origin and not cached, using proxy_cache_max_range_offset 0. This causes the slice module to still use the 64 MB slice configuration (so every large-file request to the backend is split into 64 MB byte ranges). Instead of going to the origin directly, requests are forwarded to an internal/intermediary vhost. This intermediary has the same slice module configuration but does not set proxy_cache_max_range_offset, so it basically passes through the same 64 MB range it gets from the main vhost to the origin, but treats it as its own independent request. Since the range it's getting equals its own chunk size configuration, each chunk file does not need to be split into multiple files. This intermediary caches in the rclone path, which uploads to the bucket.

What this effectively does is abstract any arbitrary byte range from the client into a single 64 MB chunk. The intermediary vhost does not know about the actual file, since it's only getting 64 MB chunk requests from the main vhost. The main vhost's slice module is configured to forward without caching, so once the application requests a byte range deep into the file, the main vhost translates this byte range to the appropriate 64 MB chunk that contains it and makes that request to the intermediary. Since the intermediary only needs to pull this specific range's file from the bucket (and validates cache consistency of each of these files independently of the others), the operation completes very quickly: only this one file needs to be fetched from the bucket and served back to the main vhost, which then serves it to the client.
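Roughly, the configuration looks like this (ports, paths, zone names, and the origin are placeholders; this is a sketch of the idea, not my literal config):

```nginx
# Main (client-facing) vhost: slices requests into 64 MB ranges but never
# caches them itself, forwarding each slice to the intermediary vhost.
proxy_cache_path /var/cache/nginx/front levels=1:2 keys_zone=front:10m max_size=1g;

server {
    listen 80;

    location / {
        slice 64m;                         # split requests into 64 MB byte ranges
        proxy_cache front;
        proxy_cache_key $uri$slice_range;
        proxy_set_header Range $slice_range;
        proxy_cache_max_range_offset 0;    # never cache here; always forward
        proxy_pass http://127.0.0.1:8080;  # intermediary vhost
    }
}

# Intermediary vhost: treats each 64 MB slice as an independent request
# and caches the response on the rclone-mounted path.
proxy_cache_path /mnt/rclone-cache/nginx levels=1:2 keys_zone=back:100m max_size=500g inactive=30d;

server {
    listen 127.0.0.1:8080;

    location / {
        slice 64m;
        proxy_cache back;
        proxy_cache_key $uri$slice_range;
        proxy_set_header Range $slice_range;
        proxy_cache_valid 200 206 30d;
        proxy_pass https://origin.example.com;  # actual origin
    }
}
```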

TTFB is much faster because the specific range's file is the only file that's pulled, and there's no need to validate consistency across multiple chunks. I'm pretty sure I don't need the slice module in the intermediary and can simply forward the range request from the main vhost as-is to the origin, so I'll be experimenting with that next, but for now it's working with much better performance.

EDIT: turned off the slice module in the intermediary to reduce complexity and reconfigured the cache key so it uses the same one as before. Still working well. The slice range generated by the main vhost is passed on to the intermediary, which passes it to the origin and caches the response as a single file.
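The simplified intermediary ends up as something like this (again a sketch; I'm assuming $http_range yields the same range string the slice module produced, since the main vhost's slice module sets the Range header):

```nginx
# No slice module here: the 64 MB range chosen by the main vhost arrives
# in the Range header and the response is cached as a single file.
location / {
    proxy_cache back;
    proxy_cache_key $uri$http_range;      # same key shape as the sliced setup
    proxy_set_header Range $http_range;   # pass the main vhost's range through
    proxy_cache_valid 200 206 30d;
    proxy_pass https://origin.example.com;
}
```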

I think I followed that! Well done for fixing the problem and thanks for reporting the solution.
