Rclone ORDER of sync

What is the problem you are having with rclone?

We are using rclone to sync our OCI Object Storage buckets between two cloud regions. However, we are having issues with the order of the sync.
The buckets hold YUM repositories: the packages (RPMs and source RPMs), their metadata, and the repodata.

The resolution of data when we issue a YUM command happens in this order:

  • repodata references the metadata
  • metadata references the packages
  • referenced packages are served with any dependency resolution as needed

For some reason, when rclone syncs the source bucket to the destination bucket, it starts with the repodata.
This breaks YUM requests from our users, because the repodata at the destination already references packages that have not been synced yet. We understand that rclone is fast at syncing between two Object Storage buckets, but our users are issuing YUM requests faster than we expected, before rclone can complete the sync.

Questions:

  • Does rclone have any inherent mechanism for deciding the order in which data is synced, or is it totally random?
  • Is there an option we can pass to the sync so that it starts with the data, then the metadata, and finally the repodata? Essentially, something that gives us control over the order of the sync.

What is your rclone version (output from rclone version)

rclone v1.53.2
- os/arch: linux/amd64
- go version: go1.15.6

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Oracle Linux 7.9 64 bit

Which cloud storage system are you using? (eg Google Drive)

Oracle Cloud Infrastructure(OCI)

The command you were trying to run (eg rclone copy /tmp remote:tmp)

/usr/local/bin/rclone sync <source_profile>:<source_bucket> <dest_profile>:<dest_bucket> --transfers=100 --log-file=<log_location> -v

The rclone config contents with secrets removed.

[rclone-profile]
type = s3
provider = IBMCOS
access_key_id = <redacted>
secret_access_key = <redacted>
region = <oci_region>
endpoint = <redacted_namespace>.<oci_object_storage_endpoint>
location_constraint = <oci_region>
server_side_encryption = AES256
force_path_style = true

A log from the command with the -vv flag

N/A

hi,

kind of a simple suggestion, odds are you thought of it already.
run three syncs, one after the other: packages first, then metadata, then repodata.
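
A minimal sketch of the three-sync idea, assuming the bucket keeps packages, metadata and repodata under separate top-level prefixes. The prefix names `Packages`, `metadata` and `repodata`, the function name `sync_in_order`, and the `RCLONE` override variable are hypothetical and not taken from the original post; adjust them to the real bucket layout.

```shell
# Command to run; override with RCLONE=echo to preview without transferring.
RCLONE="${RCLONE:-rclone}"

# sync_in_order SRC DST
# SRC and DST are remote:bucket paths. Each prefix is synced in turn, so
# repodata only lands after the packages and metadata it refers to.
# The prefix names are assumptions about the bucket layout.
sync_in_order() {
    src="$1"
    dst="$2"
    for prefix in Packages metadata repodata; do
        # Stop at the first failure so a later stage never runs
        # on top of an incomplete earlier stage.
        "$RCLONE" sync "$src/$prefix" "$dst/$prefix" --transfers=100 -v || return 1
    done
}
```

It would be called as `sync_in_order <source_profile>:<source_bucket> <dest_profile>:<dest_bucket>`. Because each `rclone sync` must finish before the next one starts, the ordering between the three groups is strict, at the cost of listing the buckets three times.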

You can use the --order-by flag

If you want exact ordering then you'll need the --check-first flag too.

This may not be precise enough for you though so I think @asdffdsa's idea of 3 syncs is not a bad one!

yes, we did :)

that was the only option and way forward for us!

Do you have any examples for the --order-by and --check-first flags? We would like to explore how well those options work compared with doing 3 syncs.

If you want to try --order-by you need some property of the files that sorts them into the correct order.

So use --check-first, then one of --order-by size, --order-by name or --order-by modtime.

So if the metadata is always created after the data then --check-first --order-by modtime would do that.

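Or if you can name the metadata so that it always sorts later in the alphabet, or if it is always smaller than the data, then you can use --order-by name or --order-by size,descending respectively (the default order is ascending, so a plain --order-by size would send the smallest files first).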
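
As a rough sketch of the modtime variant, assuming the metadata and repodata objects are always written after the packages they describe, a single sync could look like the following. The function name `sync_oldest_first` and the `RCLONE` override variable are hypothetical, added only so the command can be previewed.

```shell
# Command to run; override with RCLONE=echo to preview without transferring.
RCLONE="${RCLONE:-rclone}"

# sync_oldest_first SRC DST
# --check-first finishes all checks before any transfer starts, so the
# whole backlog is known and can be sorted.
# --order-by modtime,ascending starts the oldest files first.
sync_oldest_first() {
    "$RCLONE" sync "$1" "$2" \
        --check-first \
        --order-by modtime,ascending \
        --transfers=100 -v
}
```

Note that --order-by controls the order in which transfers start, not the order in which they finish. With --transfers=100, a small repodata file that starts late can still complete before a large package that started earlier, which is likely why this may not be precise enough and why three separate syncs give a stronger guarantee.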
