OCI Copy 150TB - 1.5 Billion Files Fast

What is the problem you are having with rclone?

I'm trying to optimize for speed and compute power, saturate our OCI Object Storage backend, and migrate 150 TB / 1.5 billion objects between regions (Ashburn to Phoenix). We have 3 top-level folders/prefixes, with a huge amount of folders and data within those 3 folders. My issue here is the sheer object count: 1.5 billion. I decided to split the workload across 3 VMs (each one an A2.Flex with 56 OCPUs (112 cores), 500 GB RAM, and a 56 Gbps NIC), with each VM running against one of the prefixed folders. I don't notice much of my CPU & RAM being utilized, and backend support is barely seeing my listing operations (which are supposed to finish in approx 7 hrs, hopefully).
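For reference, each VM runs a per-prefix copy along these lines (bucket and prefix names are placeholders, same convention as the command further down):

rclone copy <source>:<source_bucket>/prefix1 <destination>:<destination_bucket>/prefix1 --transfers=5000 --checkers=5000 --fast-list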

Based on these specs, what would be the most efficient way to list and transfer this many objects? I see a variety of posts regarding millions of files, either in single directories or deeply nested ones. How do we figure out the proper way to do the listing without running out of memory, while maximizing concurrent processes and saturating the VMs and the OCI Object Storage service? Any guidance is greatly appreciated!

Run the command 'rclone version' and share the full output of the command.

rclone v1.69.1

  • os/version: oracle 8.10 (64 bit)
  • os/kernel: 5.15.0-304.171.4.el8uek.aarch64 (aarch64)
  • os/type: linux
  • os/arch: arm64 (ARMv8 compatible)
  • go/version: go1.24.0
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

OCI - Oracle Cloud Infrastructure

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copy <source>:<source_bucket> <destination>:<destination_bucket> --transfers=5000 --checkers=5000 --fast-list

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[ociashburn]
type = oracleobjectstorage
provider = instance_principal_auth
namespace = XXX
compartment = XXX
region = us-ashburn-1

[ociphoenix]
type = oracleobjectstorage
provider = instance_principal_auth
namespace = XXX
compartment = XXX
region = us-phoenix-1

A log from the command that you were trying to run with the -vv flag

No logs as of right now, early testing.

You might be interested in trying the latest rclone beta, which includes some improvements for such cases.

How are the files distributed under the 3 prefixes/directories? Is there more structure underneath?
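If you want numbers, rclone size will walk a prefix and report its object count and total bytes, though note it does a full listing itself (placeholder names as in your command):

rclone size <source>:<source_bucket>/prefix1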

If you have more than 1,000,000 files in a single directory then you'll likely need the beta @kapitainsky linked above. This is due for release in v1.70.
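The install script can fetch the latest beta directly, if that is easier than downloading a binary by hand:

curl https://rclone.org/install.sh | sudo bash -s beta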

If you don't mind spending the extra transactions, then removing --fast-list will probably speed things up and use less memory, though I would reduce the checkers down a bit, to say 500.
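Without --fast-list, rclone lists each directory as a separate request while it walks the tree, rather than building the whole recursive listing in memory first, so the command would become something like:

rclone copy <source>:<source_bucket> <destination>:<destination_bucket> --transfers=5000 --checkers=500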

I will definitely check out the beta version, looks awesome.

Update here: with 4000 transfers, 2000 checkers, and --fast-list we were able to move 10 million objects per hour, so the full migration should finish in about 6.2 days (1.5 billion objects ÷ 10 million/hour ≈ 150 hours). Each VM is sitting at about 332 GB of memory used after listing, and staying consistent without going OOM.

Folder structure is 3 top-level prefixes; within each of those is a ton of nested->nested->nested folders with files spread everywhere.

If we hit any issues, we will try without --fast-list and bump the checkers down.

And another "bleeding edge" feature which might be handy here.

--hash-filter is not in the master beta yet, but binaries are already generated for the hash-filter branch, so there is no need to compile from source.

It would make it easy to split a massive task into parallel runs without having to worry about whether the processed objects overlap.
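If I read the commit right, --hash-filter K/N deterministically selects roughly 1/N of the files by hashing their paths, so N runs with K = 0..N-1 cover everything exactly once, e.g. three parallel workers:

rclone copy <source>:<source_bucket> <destination>:<destination_bucket> --hash-filter 0/3
rclone copy <source>:<source_bucket> <destination>:<destination_bucket> --hash-filter 1/3
rclone copy <source>:<source_bucket> <destination>:<destination_bucket> --hash-filter 2/3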

Some raw details are in this commit.
