Hello Team,
I am evaluating rclone for copying files from Azure Blob Storage to AWS S3.
My requirements are:
1. When files are added to Azure Blob Storage, they should be copied to AWS S3.
2. When files are updated in Azure Blob, they should be updated in AWS S3.
3. When a file is deleted in Azure Blob, it should be deleted in AWS S3.
4. Files that are added only in AWS S3 (never added in Azure) should not be deleted.
5. While copying, all user-defined metadata should be preserved.
If I use `rclone sync`, it does steps 1-3, but it also deletes the extra files on AWS S3, so it fails step 4.
If I use `rclone copy`, it does steps 1, 2 and 4, but not step 3 (deleting files in S3 when they are deleted in Azure).
Neither command preserves user-defined metadata, even with the `--metadata` flag.
Can you help me work out whether this is achievable? Any suggestions would be greatly appreciated.
Run the command 'rclone version' and share the full output of the command.
rclone v1.69.1
- os/version: ubuntu 22.04 (64 bit)
- os/kernel: 6.8.0-1021-azure (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.24.0
- go/linking: static
- go/tags: none
Which cloud storage system are you using?
Source: Azure Blob Storage. Destination: AWS S3.
The command you were trying to run
```shell
rclone sync --metadata azureblob:containername aws_s3:bucketname
```
The Azure rclone remote does not support metadata - see the docs. Look at the last column; the explanation is further down on the same page. So there is nothing to be done about requirement 5.
As for requirements 1 to 4, they cannot be performed as a single operation without some persistence (a memory of the previous state). Steps 3 and 4 are impossible to distinguish if you do not know the previous state of both remotes. That means that to achieve this you have to write some clever software :) And of course you can use rclone as the tool to move the files around.
Or rethink what you really want to achieve.
`rclone sync` running as a cron job will do steps 1-3.
You could achieve step 4 if the files added to AWS have unique names which can be excluded from the sync (steps 1-3).
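As a minimal sketch of the cron approach (the remote names `azureblob:`/`aws_s3:`, the container/bucket names, the schedule and the log path are all assumptions, not values from this thread):

```shell
# Hypothetical crontab entry: run the one-way sync every 15 minutes.
# azureblob: and aws_s3: are assumed remote names set up via `rclone config`.
*/15 * * * * rclone sync azureblob:containername aws_s3:bucketname --log-file /var/log/rclone-sync.log
```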
What do you mean by this? How can we exclude them?
So you can run something like:
```shell
rclone sync azure:bucket aws:bucket --exclude "aws_only_files*"
```
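If more than one naming pattern needs excluding, the rules can also live in a filter file via `--filter-from`. A sketch (the file name and the patterns are assumptions for illustration):

```shell
# filters.txt (hypothetical) would contain one rule per line,
# with "-" marking an exclude, e.g.:
#   - aws_only_files*
#   - s3-only/**
rclone sync azure:bucket aws:bucket --filter-from filters.txt
```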
Is there a way to achieve requirement 5?
Is there an option to fetch the metadata from Azure and update it in S3 via code? Or is there any other option?
rclone does not support metadata on Azure today. It could, if implemented.
So either have a go and submit a PR yourself, find somebody to do it for you, or consider sponsoring development.
I’ll see if I can submit the PR
Also, if we are running rclone as a cron job, how can we monitor whether something fails to copy? Is there a way to monitor failures and successes?
Does that mean you are ready to do all the required development? Great :)
rclone development is tracked on GitHub. Fork the repo -> develop the Azure metadata solution -> propose your code by sending a PR back.
And for sure, once you start, there will be people happy to guide you.
To start, read this doc:
# Contributing to rclone
This is a short guide on how to contribute things to rclone.
## Reporting a bug
If you've just got a question or aren't sure if you've found a bug
then please use the [rclone forum](https://forum.rclone.org/) instead
of filing an issue.
When filing an issue, please include the following information if
possible as well as a description of the problem. Make sure you test
with the [latest beta of rclone](https://beta.rclone.org/):
- Rclone version (e.g. output from `rclone version`)
- Which OS you are using and how many bits (e.g. Windows 10, 64 bit)
- The command you were trying to run (e.g. `rclone copy /tmp remote:tmp`)
- A log of the command with the `-vv` flag (e.g. output from `rclone -vv copy /tmp remote:tmp`)
- if the log contains secrets then edit the file with a text editor first to obscure them
I’ll try my best.
Can you please provide some pointers on how we can monitor for failures? Do we get logs, or is there some other way to find the issues?
Everything can be logged at various levels of detail.
rclone always returns an exit code indicating the operation's status.
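Putting both together, a cron-friendly wrapper sketch might look like this (the remote names, bucket names and log path are assumptions; the key points are `--log-file`/`--log-level` for logging and the non-zero exit code signalling failure):

```shell
#!/bin/sh
# Hypothetical wrapper for the cron job -- adjust names and paths.
LOG=/var/log/rclone-sync.log

rclone sync azureblob:containername aws_s3:bucketname \
  --exclude "aws_only_files*" \
  --log-file "$LOG" --log-level INFO

STATUS=$?
if [ "$STATUS" -ne 0 ]; then
  # Non-zero exit code means the sync did not fully succeed.
  # Hook your alerting here (mail, webhook, etc.); this sketch
  # only appends a marker line to the log.
  echo "rclone sync failed with exit code $STATUS at $(date)" >> "$LOG"
fi
exit "$STATUS"
```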