I'm currently trying to set up a backup solution to save my Dropbox data to AWS S3 in an immutable way, and being a complete beginner when it comes to rclone, I have a few questions. I have read a lot of posts, so I have an idea of a possible solution, but I would gladly hear the opinion of rclone experts.
I first wanted to use something like restic (to save space with deduplication), but I am afraid of repository corruption.
I plan to have 5 remotes in rclone (dropbox as the source, plus the following):
- awss3 (storage_class: STANDARD)
- awss3encrypted (type: crypt, remote = awss3:/encrypted)
- awss3da (storage_class: DEEP_ARCHIVE)
- awss3daencrypted (type: crypt, remote = awss3da:/encrypted)
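For reference, that remote layout would look roughly like this in rclone.conf (just a sketch: in practice the S3 paths would normally include a bucket name, the region is a placeholder, and the crypt passwords are the obscured values generated by rclone config):

```ini
[awss3]
type = s3
provider = AWS
env_auth = true
region = us-east-1
storage_class = STANDARD

[awss3encrypted]
type = crypt
remote = awss3:/encrypted
password = *** placeholder ***

[awss3da]
type = s3
provider = AWS
env_auth = true
region = us-east-1
storage_class = DEEP_ARCHIVE

[awss3daencrypted]
type = crypt
remote = awss3da:/encrypted
password = *** placeholder ***
```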
On AWS S3 (awss3encrypted), I plan to have:
- 6 daily incremental backups (I read that --backup-dir can help with that), with a 1-week retention
- 3 weekly full backups, with a 1-month retention
On AWS S3 Glacier Deep Archive (awss3daencrypted), which has a 180-day minimum retention and therefore can't be used for daily/weekly backups, I plan to have:
- 11 monthly full backups, with a 1-year retention
- 5 yearly full backups, with a 5-year retention
And I plan to set up IAM roles that allow only read/write access (no deletion, which would be handled by retention policies).
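A minimal sketch of the policy I have in mind (the bucket name and ARNs are placeholders, and the exact action list may need adjusting):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::my-backup-bucket",
        "arn:aws:s3:::my-backup-bucket/*"
      ]
    }
  ]
}
```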
I read that I can have immutable data with the --immutable flag.
So basically I plan to run the following commands:
Full backup on Monday:
rclone copy dropbox: awss3encrypted:/full/20211026 --immutable --progress
Incremental backup Tuesday through Sunday:
rclone sync dropbox: awss3encrypted:/full/20211026 --backup-dir=awss3encrypted:/incremental/20211027 --immutable --progress
And with another cron job for the monthly/yearly Deep Archive backups:
Every month:
rclone copy dropbox: awss3daencrypted:/monthly/20211101 --immutable --progress
Every year:
rclone copy dropbox: awss3daencrypted:/yearly/20220101 --immutable --progress
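Put together, I imagine the schedule as crontab entries along these lines (a sketch: the dates are generated with date, whose % must be escaped in a crontab, and LATEST stands for the most recent full-backup directory, which a wrapper script would have to track):

```
# Monday 02:00 - weekly full to S3 Standard
0 2 * * 1 rclone copy dropbox: awss3encrypted:/full/$(date +\%Y\%m\%d) --immutable
# Tuesday-Sunday 02:00 - daily incremental against the latest full
0 2 * * 0,2-6 rclone sync dropbox: awss3encrypted:/full/LATEST --backup-dir=awss3encrypted:/incremental/$(date +\%Y\%m\%d) --immutable
# 1st of each month 03:00 - monthly full to Deep Archive
0 3 1 * * rclone copy dropbox: awss3daencrypted:/monthly/$(date +\%Y\%m\%d) --immutable
# 1st of January 04:00 - yearly full to Deep Archive
0 4 1 1 * rclone copy dropbox: awss3daencrypted:/yearly/$(date +\%Y\%m\%d) --immutable
```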
I am not sure whether rclone copy works with the DEEP_ARCHIVE storage class?
To reduce the number of file transfers on the S3 side, is there a way to tar the Dropbox data during the operation? Another solution would be to first copy the data to the AWS Fargate container, tar the files, then copy the archive to S3, but I was really hoping to do the backup in one command.
With that strategy, I would be using a lot of storage given the number of full backups (even though Deep Archive is cheap).
Due to how Glacier works, I guess I could not do the same kind of incremental backups as I plan for the daily/weekly backups.
I am wondering if something like that could work :
The first month of the year, a full backup directly to S3 Standard instead of Glacier (still with a one-year retention):
rclone copy dropbox: awss3daencrypted:/monthly/20211101 --immutable --progress
Then at the beginning of each of the following 11 months, I'd do an incremental sync, with a lifecycle transition policy moving the data to DEEP_ARCHIVE after one year (as I guess I cannot sync directly to DEEP_ARCHIVE?):
rclone sync dropbox: awss3daencrypted:/monthly/20211101 --backup-dir=awss3daencrypted:/monthly-incremental/20211201
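The transition policy I have in mind would be a standard S3 lifecycle rule, sketched below (bucket and prefix are placeholders; note that the crypt remote encrypts file and directory names by default, so the real prefix in the bucket would be the obfuscated one unless filename encryption is turned off):

```json
{
  "Rules": [
    {
      "ID": "monthly-to-deep-archive",
      "Filter": { "Prefix": "encrypted/monthly/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
      ]
    }
  ]
}
```

This would be applied with something like `aws s3api put-bucket-lifecycle-configuration --bucket my-backup-bucket --lifecycle-configuration file://lifecycle.json`.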
But I am really worried about data corruption. With the --backup-dir flag, have there been cases of data corruption? And if it happens, what could be done to fix it (just run another incremental)?
I saw @asdffdsa mentioned he used Veeam to create the backups. In my case, I'd like to execute the backup directly from the cloud, since the data is pulled straight from Dropbox (maybe on AWS Fargate), so I would not retain the Veeam backup chain; that's why I was leaning towards rclone, which seems able to handle the cloud-to-cloud transfer without keeping the data on disk (just in memory).
What would be the advantage of using Veeam here instead of rclone incrementals? Or, in the case of pure full monthly/yearly backups, is there any point in using something like Veeam, or does its power only shine when you set up things like forever-forward incremental / reverse incremental? Here I'm essentially trying to do a GFS-style backup while managing the sync-to-cloud part myself.
One more thing: in the case of incremental backups with the --backup-dir flag, what would the restore process look like? Would I have to download the last full backup plus all the incremental backups (up to the day I want to restore to), then manually copy the incrementals onto the full backup, or is there something else?
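To make the question concrete, here is what I imagine the restore would look like (just my understanding of --backup-dir, happy to be corrected; paths and dates are illustrative):

```
# Latest state: the "full" directory always holds the current data
rclone copy awss3encrypted:/full/20211026 /restore --progress

# Point-in-time: overlay the incremental directories newest-first,
# back to the target date, so the pre-sync versions they contain
# replace the newer ones
rclone copy awss3encrypted:/incremental/20211029 /restore --progress
rclone copy awss3encrypted:/incremental/20211028 /restore --progress
```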
Anyway thanks for creating such a great tool !