My use case involves lots of very small (1–2 MB) random reads (between 7 and 64 per file) of larger files over an rclone mount backed by Google Drive. The reads are never the same, so vfs-cache-mode full doesn't seem to help. I'm trying to make the seeks/reads as fast as possible.
I've spent quite some time reading and playing with different flags etc., so I don't think I'm looking for specific help with that. I'm just wondering whether there is anything in the pipeline that might help me, particularly around concurrent reads without using a cache, or other ways to speed up random seeks/reads?
perhaps consider testing with wasabi, an s3 rclone remote, which is known for hot storage.
wasabi does not have all those gdrive api and other limits.
might get better performance.
The scheme outlined in the thread you linked to still seems to me to be quite a good one!
What we could do is something like this for --vfs-cache-mode off:
we get a read for offset X - open the file and return data
we get a read for offset Y which needs seeking
currently we close the reader and reopen at Y
instead we open a new reader for this
we can now read at both places. If a reader is not read for 5 seconds we can close it.
There isn't any development on this in the works at the moment, but it is a nice idea...
I would have thought --vfs-cache-mode full would give better performance for you.
Note that opening files on Google Drive is quite slow at the best of times, and every time you seek you have to open the file afresh, so you may be at the limit of what is possible, I don't know.
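As a sketch, a mount along those lines might combine the full cache mode with a smaller read chunk size to suit 1–2 MB random reads. The remote name `gdrive:` and the mount point are placeholders, and the values are starting points to experiment with, not recommendations:

```shell
# hypothetical example: 'gdrive:' and /mnt/gdrive are placeholders
rclone mount gdrive: /mnt/gdrive \
  --vfs-cache-mode full \
  --vfs-read-chunk-size 2M \
  --vfs-read-chunk-size-limit 16M \
  --buffer-size 0
```

A small initial chunk size avoids requesting far more data than a 1–2 MB read needs, while the chunk size limit lets sequential reads still ramp up.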
Thanks for this. I tried Wasabi (only uploaded 1TB of data as that's the maximum allowed on the free trial). I am getting much better performance. Reads are quicker and much more predictable in terms of latency. The only issue is the price (which is amazing compared to most other storage solutions out there, but still much more expensive than Drive).
I think I've got some choices to make to balance price vs performance.
I think you may be right. I tested GD against Wasabi (as recommended by @asdffdsa above) and got much more predictable reads (sometimes GD was quicker, but on average Wasabi was much better, and I didn't get the occasional very long reads that I do with GD).
I'll keep testing anyway, and will keep an eye out for any rclone development that may help optimise further.
as for price, depends on use-case.
i have local backup servers, i rarely access the cloud data.
i keep latest veeam and other backups in wasabi.
and older backups in aws s3 deep glacier for $1.01/TB/month.
so the overall pricing is very cheap.
it is great that you can use one tool with gdrive and s3 wasabi.