Use case:
We currently have a job that uses rclone copy to copy changed files to a S3 bucket. We now want to put only the updated files into a sub folder/bucket within the parent bucket. Something like:
s3://parent-bucket/2019-01-01/changed-files-on-2019-01-01-go-here-after-job
s3://parent-bucket/2019-01-02/changed-files-on 2019-01-02-go-here-after-job
Question:
Is there a way to accomplish this on S3 using rclone, where all the files, regardless of sub-folders (buckets), are considered under a main bucket? If not, does anyone have suggestion on doing this?
I’m not 100% clear what you want to achieve, but I think using –backup-dir might be helpful. This will put the files that would be overwritten in a sync into a backup directory.
@ncw,
Thanks for the reply. I don’t believe the backup-dir option suits our need as we want the newer (and not the older) files in the dated directory. Reason for doing this is that we process the updated files and we would like to cut down on the time needed to look up changed files.
After searching for a while, I don’t think this is possible so I probably need a different approach. Like parsing the dry-run logs from a “copy” operation and use that list instead.
You can use rclone lsf --files-only --min-age or --max-age to generate lists of objects. You can then feed these back into rclone with the --files-from flag - that might help.
@ncw, when using the --files-from flag for rclone copy, will it download unchanged files if the file is in the file list? If not, then what you suggested will definitely be easier than parsing the dry run logs from a copy command.