I usually recommend Rclone to users who want to archive data, but I frequently see that users in an HPC-centric research community struggle with the process, e.g.
- finding the data they should archive, because huge files often sit in a forgotten folder seven levels deep
- remembering where the data was archived to and retrieving it quickly when needed
- shepherding archiving jobs of hundreds of TiB that are sometimes interrupted and must then be remembered and resumed
- fewer than 3% of the observed data copy jobs are currently checksummed, because it takes extra effort or users do not know what a checksum is
- deciding whether to delete local data is hard, and comparing source and target takes extra time
- users find working with AWS Glacier cumbersome
This tool is mostly a wrapper around Rclone: it keeps track of some metadata in CSV files and interacts with Glacier and S3-compatible storage. It is designed to be easily replaceable if something better comes along in the future. Excuse the coding style; most of it was generated by ChatGPT-4.
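To illustrate the CSV-metadata idea mentioned above, here is a minimal sketch of how an archive manifest could be kept: one row per archived item, recording the local path, the remote destination, a checksum, and a timestamp. The function and column names (`record_archive`, `archive_manifest.csv`, etc.) are my own assumptions for illustration, not this tool's actual schema.

```python
import csv
import hashlib
import os
import tempfile
import time

def sha256sum(path, bufsize=1 << 20):
    """Stream a file through SHA-256 so huge files are never loaded into RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(bufsize):
            h.update(chunk)
    return h.hexdigest()

def record_archive(manifest_csv, local_path, remote_url):
    """Append one row (path, destination, checksum, timestamp) to the manifest.

    A header row is written only when the manifest does not exist yet.
    """
    new_file = not os.path.exists(manifest_csv)
    with open(manifest_csv, "a", newline="") as f:
        w = csv.writer(f)
        if new_file:
            w.writerow(["local_path", "remote_url", "sha256", "archived_at"])
        w.writerow([
            local_path,
            remote_url,
            sha256sum(local_path),
            time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        ])

# Demo with a throwaway file (hypothetical bucket name)
with tempfile.TemporaryDirectory() as d:
    data = os.path.join(d, "results.bin")
    with open(data, "wb") as f:
        f.write(b"some archived bytes")
    manifest = os.path.join(d, "archive_manifest.csv")
    record_archive(manifest, data, "s3:my-bucket/project-x/results.bin")
    print(open(manifest).read())
```

Keeping this manifest in plain CSV is what makes the tool easy to replace later: any future system (or a plain spreadsheet) can read where the data went and verify it against the stored checksum.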