I'd like to know from the rclone community wisdom if rclone could support my use case.
Suppose you have a bunch of data files in a remote storage folder, some useful for an analysis and some not, but you have no way to predict which files will actually be accessed.
What I would like is:
- a virtual file system (mount) that exposes these files locally
- files are (seamlessly) downloaded lazily, when accessed
- files are cached locally, in case the code reads a file twice or the analysis is re-run
- the cache is persistent, so that files used today won't have to be re-downloaded tomorrow (even if I restart the system, e.g. a VM, Docker container, or the mount)
- there is some kind of cache validation, in case files change on the source storage. This validation doesn't have to happen on every access, only when the last validation is too old. It could simply use size/modification-time metadata.
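For context, here is the kind of invocation I have in mind, based on my reading of the rclone docs (the remote name and paths are just placeholders, and I'm not sure these cache-tuning values are sensible):

```shell
# Mount the remote with full VFS caching: files are downloaded lazily on
# first read and kept in an on-disk cache that survives remounts/restarts,
# as long as --cache-dir points at persistent storage.
rclone mount azblob:mycontainer /mnt/data \
  --vfs-cache-mode full \
  --cache-dir /var/cache/rclone \
  --vfs-cache-max-age 720h \
  --vfs-cache-max-size 100G \
  --dir-cache-time 5m
```

My understanding is that `--vfs-cache-mode full` gives the lazy download plus local caching, and `--dir-cache-time` controls how often directory/metadata information is refreshed, which would cover the "validation when too old" requirement. Please correct me if that's wrong.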
From my previous experience with the rclone VFS I believe rclone can support this use case, but I'd like confirmation, and possibly some hints and advice.
Also, the reason I'm asking is that I first considered blobfuse2 for this task, since the current storage is an Azure blob container, but its developer told me that the persistence aspect of this use case is not supported by blobfuse2.
Regards,
Karl