Is rclone repeatedly downloading during seeks?

Hi folks,

I’m testing rclone as a read-only mount backend for s3ql (via unionfs). The s3ql filesystem currently stores its files as encrypted 10MB chunks on Amazon Drive.

Most reads show an s3ql file downloaded one time only and FUSE reading the file normally in 128k chunks, but there seem to be multiple download attempts when rclone reports a seek:

12:34:57 DEBUG : s3ql_data_/264/s3ql_data_264611: Dir.Lookup
12:34:57 DEBUG : s3ql_data_/264/s3ql_data_264611: Dir.Lookup OK
12:34:57 DEBUG : s3ql_data_/264/s3ql_data_264611: File.Attr valid=1m0s ino=0 size=10486673 mode=-rw-rw-r--
12:34:57 DEBUG : s3ql_data_/264/s3ql_data_264611: File.Open OpenReadOnly
12:34:57 DEBUG : s3ql_data_/264/s3ql_data_264611: Downloading large object via tempLink
12:35:06 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read size 32768 offset 0
12:35:06 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read OK
12:35:06 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read size 16384 offset 98304
12:35:06 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.seek from 32768 to 98304
12:35:06 DEBUG : s3ql_data_/264/s3ql_data_264611: Downloading large object via tempLink
12:35:06 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read OK
12:35:06 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read size 65536 offset 32768
12:35:07 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.seek from 114688 to 32768
12:35:07 DEBUG : s3ql_data_/264/s3ql_data_264611: Downloading large object via tempLink
12:35:15 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read OK
12:35:15 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read size 114688 offset 114688
12:35:15 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.seek from 98304 to 114688
12:35:15 DEBUG : s3ql_data_/264/s3ql_data_264611: Downloading large object via tempLink
12:35:21 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read OK
12:35:21 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read size 131072 offset 229376
12:35:21 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read OK
12:35:21 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read size 131072 offset 360448
12:35:21 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read OK
...
12:35:22 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read size 131072 offset 10321920
12:35:22 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read OK
12:35:22 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read size 36864 offset 10452992
12:35:22 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Read OK
12:35:22 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Flush
12:35:22 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Flush OK
12:35:22 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Flush
12:35:22 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Flush OK
12:35:22 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Release closing
12:35:22 DEBUG : s3ql_data_/264/s3ql_data_264611: ReadFileHandle.Release OK

This is with rclone v1.36-47-gd86ea86β with mount options:
--allow-other --read-only --buffer-size 100M --acd-templink-threshold 0 --max-read-ahead 100M

Adjusting or removing --buffer-size and --max-read-ahead seems to have no effect. Based on the timestamps, the 10MB file appears to be downloaded again after each seek, which kills read performance.

Is there any way to ensure the file is downloaded only once and serve all reads from that download instead of re-downloading? Or are the logs saying that something entirely different is happening?

The internal read buffer is thrown away if you seek back in the file, even if it is only by one byte.
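
A toy model of that behavior, not rclone's actual code: the hypothetical `StreamingHandle` below keeps a single sequential stream and, as an assumption, reopens it whenever a read's offset differs from the current stream position. Replaying the first five reads from the log above gives four opens, which lines up with the four "Downloading large object via tempLink" entries.

```python
class StreamingHandle:
    """Toy model of a sequential read handle (hypothetical, not rclone's code)."""

    def __init__(self, data: bytes):
        self._data = data      # stands in for the remote object
        self._pos = 0          # current position of the open stream
        self.downloads = 1     # opening the handle starts the first download

    def read(self, size: int, offset: int) -> bytes:
        if offset != self._pos:
            # Assumption: any non-sequential read discards the stream
            # (and its buffer) and starts a new download at `offset`.
            self.downloads += 1
            self._pos = offset
        chunk = self._data[self._pos:self._pos + size]
        self._pos += len(chunk)
        return chunk


# (size, offset) pairs copied from the first reads in the log above.
READS = [
    (32768, 0),
    (16384, 98304),
    (65536, 32768),
    (114688, 114688),
    (131072, 229376),
]

handle = StreamingHandle(bytes(10486673))  # same size as the file in the log
for size, offset in READS:
    handle.read(size, offset)
print(handle.downloads)  # 4
```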

Which storage provider do you use?

@seuffert Do you happen to know where in the code this is handled? If this is the issue, I’m interested in altering the behavior to keep the buffer for this use case (small, 10MB read-only files that are completely static), even if it means maintaining a fork.
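
The desired behavior could be sketched roughly as follows, assuming the whole object fits comfortably in memory (plausible for 10MB chunks). `CachedHandle` and `fetch` are hypothetical names for illustration, not part of rclone: the object is fetched once on the first read, and every later read, in any order, is sliced from that in-memory copy.

```python
from typing import Callable, Optional


class CachedHandle:
    """Fetches the whole object once, then serves reads at any offset from memory."""

    def __init__(self, fetch: Callable[[], bytes]):
        self._fetch = fetch                 # hypothetical callable that downloads the object
        self._data: Optional[bytes] = None  # filled on the first read
        self.downloads = 0

    def read(self, size: int, offset: int) -> bytes:
        if self._data is None:
            self._data = self._fetch()
            self.downloads += 1
        return self._data[offset:offset + size]


# Stand-in for a remote object: 1024 bytes of repeating 0..255.
handle = CachedHandle(lambda: bytes(range(256)) * 4)
first = handle.read(16, 0)
later = handle.read(16, 512)   # forward seek
back = handle.read(16, 16)     # backward seek, still no new download
print(handle.downloads)  # 1
```

The trade-off is memory: each open handle holds a full copy of its file, which only makes sense for small, static objects like these.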

The read performance is otherwise great - I’ve been testing with streaming media files and the playback is fine until the constant re-downloading of the same 10MB s3ql file, at which point playback halts for buffering.

@francesco2013 Amazon Drive

To be continued over at issue #1394
