Apologies for the unorthodox approach.
Rclone has appeared on our radar as a solution to many of the problems we are currently experiencing at our institution. The documentation and code quality are so good that I don't understand how I missed it until now, so I'm here to ask whether I'm missing something.
Our institution provides open data to researchers anywhere in the world. For example, if you go to http://ftp.ebi.ac.uk/ and click around, you have access to a total of no less than ~40 petabytes of data. Petabytes, that is not a typo.
We usually see more than 5 PB of downloads every month across the several protocols we support. I was planning to retire some of them, but the scientific community wants to keep their pipelines untouched, and that is a cost I'd like to avoid if I can.
We also run several kinds of backends:
- an in-house S3 implementation with an HGST (ActiveScale X100) backend
The protocols I wanted to retire are supported by rclone, and in fact it appears we could replace the existing services handled by vsftpd, httpd, and nginx with rclone. That leads me to the following question:
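For context on what I mean by "replacing" those services, here is a minimal sketch of how rclone could serve a read-only download tree over HTTP and FTP. The remote name `archive:` and the paths, ports, and flags are placeholders for our setup, not a tested production configuration:

```shell
# Serve the public tree read-only over HTTP
# (standing in for httpd/nginx for plain downloads)
rclone serve http archive:public --addr :8080 --read-only

# Serve the same tree over FTP (standing in for vsftpd);
# the port is an example, anonymous access would need auth flags
rclone serve ftp archive:public --addr :2121 --read-only
```

Part of what I'd like to learn from the community is whether this kind of setup holds up at our scale, or whether rclone is usually kept behind a fronting proxy.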
Can you (members of the rclone community) share with us (possibly the next member of the community) answers to the following questions?
- What are the sizes of your deployments?
- How do you scale rclone in real life?
- What problems have you had, and how easy were they to solve?
- How much traffic does your rclone instance handle?