Hello Team,
I am trying Rclone to copy files from Azure Blob Storage to AWS S3.
There are around 50k files in Azure blob storage.
Rclone is installed on an Azure VM, and we are using the Rclone sync command shown at the bottom of this message. We are planning to run this command as a cron job every minute.
- Does this command check all 50k files every minute on both Azure and AWS?
- Is there a better command I can use to get better performance and reduce cost, since there are 50k files?
- Is there a way I can know/monitor if Rclone stops working (e.g. Rclone crashed)?
- If the file size is big, e.g. 10 MB, does the Rclone sync command take care of such big files?
- Also, the AWS credentials (access and secret keys) are stored in the Rclone config inside the VM. Is that not a security concern, or is there another way?
Run the command 'rclone version' and share the full output of the command.
rclone v1.69.1
- os/version: ubuntu 22.04 (64 bit)
- os/kernel: 6.8.0-1021-azure (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.24.0
- go/linking: static
- go/tags: none
Which cloud storage system are you using?
Source: Azure Blob Storage, Destination: AWS S3
The command you were trying to run
rclone sync --metadata azureblob:containername aws_s3:bucketname
Yes. How else could it do what you ask it to do? sync has to check all files on both the source and the destination...
Do not sync everything every time. Sync only the changes from a specific time window, e.g. for your job running every minute:
rclone sync source: dest: --max-age=61s
You have to make sure, though, that your sync finishes within 1 minute. If not, then you have to implement some logic to extend --max-age as needed. You will probably need something a bit more sophisticated than a simple crontab :) (see the sketch below).
There are many ways to approach this, but --max-age is probably the best one.
Then run a full sync only periodically (once a day?) to double-check that everything is in sync.
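Just as an illustration of that last point (my sketch, not something from this thread): a wrapper script can remember when the last successful sync finished and compute --max-age from that, with a lock so a slow run is not overlapped by the next cron invocation. The remote names come from the command in the question; the state file, log path and safety margin are assumptions.

#!/usr/bin/env bash
# sync-incremental.sh - minimal sketch, adjust paths to taste.
# Incremental rclone sync: --max-age covers the time since the last
# successful run, and flock prevents overlapping cron invocations.
set -euo pipefail

STATE=/var/tmp/rclone-last-sync        # assumed location for the timestamp file
LOCK=/var/tmp/rclone-sync.lock
SRC=azureblob:containername
DST=aws_s3:bucketname

exec 9>"$LOCK"
flock -n 9 || exit 0                   # a previous run is still in progress

now=$(date +%s)
last=$(cat "$STATE" 2>/dev/null || echo $((now - 86400)))   # first run: look back 1 day
age=$(( now - last + 60 ))             # safety margin of 60s

rclone sync "$SRC" "$DST" --max-age "${age}s" \
  --log-file /var/log/rclone-sync.log --log-level INFO

echo "$now" > "$STATE"                 # only reached if the sync succeeded

The cron entry would then be something like:
* * * * * /usr/local/bin/sync-incremental.sh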
It has nothing really to do with rclone... Use your OS capabilities, the same as with any other software you want to monitor.
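As an illustration only (not from the thread): one simple OS-level approach is to check the rclone exit code in the wrapper and alert on failure. The mail address and the assumption of a working local MTA are placeholders; any alerting mechanism would do.

# sketch: alert when rclone exits non-zero
rclone sync azureblob:containername aws_s3:bucketname --max-age 61s \
  --log-file /var/log/rclone-sync.log --log-level INFO
rc=$?
if [ "$rc" -ne 0 ]; then
  echo "rclone sync failed with exit code $rc at $(date)" \
    | mail -s "rclone sync failed" ops@example.com      # ops@example.com is a placeholder
fi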
Yes, it does.
You do not have to store credentials in the config file. You can export them as environment variables, for example. Then it is up to you where you get them from - a local keychain, etc. Whatever meets your security requirements.
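For example (a sketch, assuming the remote is named aws_s3 as in the command above; get-secret stands in for whatever keychain or vault tool you use): rclone reads backend options from environment variables named RCLONE_CONFIG_<REMOTE>_<OPTION>, so the keys never need to be written to rclone.conf.

# configure the aws_s3 remote entirely from the environment
export RCLONE_CONFIG_AWS_S3_TYPE=s3
export RCLONE_CONFIG_AWS_S3_PROVIDER=AWS
export RCLONE_CONFIG_AWS_S3_REGION=us-east-1                                  # placeholder region
export RCLONE_CONFIG_AWS_S3_ACCESS_KEY_ID="$(get-secret aws-access-key)"      # get-secret is a hypothetical helper
export RCLONE_CONFIG_AWS_S3_SECRET_ACCESS_KEY="$(get-secret aws-secret-key)"  # e.g. from Azure Key Vault / local keychain
rclone sync azureblob:containername aws_s3:bucketname --max-age 61s

The S3 backend also has an env_auth option that picks up the standard AWS credential chain (AWS_ACCESS_KEY_ID etc.), if that fits your setup better.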
rclone does not support metadata on Azure Blob today. It is possible, but if you need it then either develop it yourself and submit a PR, find somebody to do it for you, or look at sponsoring the development - https://rclone.com/.
Can we use the Rclone copy command and add tags while copying from Azure to S3? Like common tags to indicate these files were synced from Azure?
Not sure. Check the rclone AWS S3 docs to see if tags are supported.
Looks like Rclone has no support for metadata and tags. I thought of having a marker (like "copied from azure") so that I can identify the files that were copied from Azure.
It might be similar to metadata with Azure: possible but not implemented yet. If you really need it then the same applies :) - the beauty of open source.
You might be right... I had forgotten that a global flag can be used for it.
This is not a very common option. The best thing would be to test whether it works.
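For what it's worth, if the global flag being discussed is --header-upload, a test could look like the line below. X-Amz-Tagging is the S3 request header that carries object tags on upload, but as said above it is best to verify that the tags actually land on the objects (e.g. in the AWS console or with the AWS CLI).

rclone copy azureblob:containername aws_s3:bucketname --header-upload "X-Amz-Tagging: source=azure"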