I’m trying to store backups on S3 using rclone. Conceptually, it works great. However, I have to deal with a fairly large data set, over 10 TB, made up of millions upon millions of small files; the average file size is around 40 KB. If I use rclone sync as is, I take a fairly big performance hit: it is over 10 times slower than backing up 10 TB of files that are each larger than, say, 100 MB.

The solution I’m thinking about is a feature where rclone rcat automatically splits the remote file once it reaches a certain size. It would work much like Unix split: once a certain size is reached, start a new file with a different suffix. For example, if I want to store a 550 GB file in 100 GB chunks, the result would look like:
filename.00 (100 GB)
filename.01 (100 GB)
filename.02 (100 GB)
filename.03 (100 GB)
filename.04 (100 GB)
filename.05 (50 GB)
And rclone cat would reassemble it on the fly. The same could probably be applied to the copy and sync commands.
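
To make the idea concrete, here is a minimal Python sketch of the split-and-reassemble logic. It is not rclone code and does not use any rclone API: the names chunk_name, split_stream and join_chunks are made up for illustration, and a plain dict stands in for the remote. It also assumes a two-digit suffix, which caps a file at 100 chunks.

```python
import io
from typing import BinaryIO, Dict, Iterator, List


def chunk_name(base: str, index: int) -> str:
    # Hypothetical naming scheme: filename.00, filename.01, ...
    return f"{base}.{index:02d}"


def split_stream(src: BinaryIO, base: str, chunk_size: int,
                 store: Dict[str, bytes], buf_size: int = 1 << 16) -> List[str]:
    """Read src to EOF and store it as chunks of at most chunk_size bytes."""
    names: List[str] = []
    while True:
        remaining = chunk_size
        parts = []
        while remaining > 0:
            block = src.read(min(buf_size, remaining))
            if not block:          # end of input
                break
            parts.append(block)
            remaining -= len(block)
        if not parts:              # nothing left to write
            break
        name = chunk_name(base, len(names))
        # A real implementation would stream this to the remote
        # instead of holding the whole chunk in memory.
        store[name] = b"".join(parts)
        names.append(name)
        if remaining > 0:          # short chunk, so the input is exhausted
            break
    return names


def join_chunks(store: Dict[str, bytes], base: str) -> Iterator[bytes]:
    """Yield chunks in suffix order until the next suffix is missing."""
    index = 0
    while chunk_name(base, index) in store:
        yield store[chunk_name(base, index)]
        index += 1


# Scaled-down version of the example above: 550 bytes in 100-byte chunks.
data = bytes(i % 251 for i in range(550))
store: Dict[str, bytes] = {}
names = split_stream(io.BytesIO(data), "filename", 100, store)
restored = b"".join(join_chunks(store, "filename"))
```

Here names holds six entries, filename.00 through filename.05, the last chunk is 50 bytes long, and restored is byte-for-byte identical to data.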
Thoughts on this?
Cheers!
–dima