Rclone as an S3 storage cache for production

I'm working at a startup with a microservices architecture: several worker servers download files from a shared S3 storage on request, process them, and so on.
My goal is to download each file only once per worker server and keep it locally. The current implementation is a directory on disk: we check whether the file already exists there before downloading it.
This works, but there are several problems:

  1. Disk usage: we can afford around 1 TB of local disk, and it's not trivial to keep checking how much space the cache is using.
  2. Races: what if two different processes (requests) try to download the same file at the same time and overwrite each other? etc.
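The race in point 2 can be avoided without any coordination between processes by downloading to a temp file and renaming it into place. A minimal sketch, where `fetch_from_s3` is a hypothetical command that writes the object for a given key to stdout (e.g. a wrapper around `rclone cat` or `aws s3 cp`):

```shell
# Race-safe cache fill: rename(2) is atomic on the same filesystem, so a
# reader never sees a partially written file, and if two workers race, the
# result is still exactly one complete copy (last rename wins).
cache_fetch() {
    key="$1"; dest="$2"
    [ -f "$dest" ] && return 0                       # already cached
    tmp="$(mktemp "$(dirname "$dest")/.dl.XXXXXX")"  # same fs as dest
    if fetch_from_s3 "$key" > "$tmp"; then
        mv "$tmp" "$dest"                            # atomic rename
    else
        rm -f "$tmp"                                 # clean up partial download
        return 1
    fi
}
```

This doesn't solve the disk-accounting problem, but it makes concurrent downloads of the same key harmless.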

I've been using rclone for personal purposes for years now, with S3 as a backend plus crypt.
I was thinking of mounting the S3 storage as a local directory, letting rclone control cache size, expiration, etc., and just accessing files from that directory instead of using S3 directly.
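For reference, a sketch of that mount; `remote:bucket` and the paths are placeholders, and the exact limits are assumptions based on the ~1 TB disk mentioned above. `--vfs-cache-mode full` caches whole files locally, `--vfs-cache-max-size` and `--vfs-cache-max-age` bound the cache (rclone evicts the least recently accessed entries when the size cap is exceeded), and `--read-only` suits workers that only read:

```shell
rclone mount remote:bucket /mnt/s3cache \
    --vfs-cache-mode full \
    --cache-dir /data/rclone-cache \
    --vfs-cache-max-size 900G \
    --vfs-cache-max-age 24h \
    --read-only \
    --daemon
```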

What do you think: is it a good idea to use rclone in production this way?


I'm not quite sure I've got the flow, but if you're keeping something locally, it will require that much space.

There's no real collision detection via rclone, so if A uploads a file at 14:01 and B uploads a new version at 14:02, the newest one always wins.

Test it out and see if it works for you.

