How to migrate billions of objects

dingdong · October 14, 2020, 1:02pm

Hi

I am in the process of evaluating a migration method for our environment, and I was wondering whether rclone would be able to handle our migration needs.

We have on-premises object storage that needs to be migrated to a different vendor. Our largest bucket holds roughly 2.5 billion small objects (average size around 20K). We’re using keys similar to this example: 0111f0/60/00/00/object.dat.

During the migration, we would need to achieve around 1000-1500 TPS. Anything faster would stress the source too much, and anything slower would take too long.

As far as I understand, rclone requires quite a lot of memory and CPU when working with a large number of objects. Is it able to handle this many?

asdffdsa · October 14, 2020, 1:32pm

hello and welcome to the forum

there are many flags that can be tweaked.
for example,
https://rclone.org/docs/#no-check-dest
https://rclone.org/docs/#no-traverse

ncw · October 14, 2020, 3:39pm

The --tpslimit flag has got you covered there

CPU isn't normally a problem, but memory might be.

You say your objects look like 0111f0/60/00/00/object.dat - rclone will treat that as a file path even though it is really just a database key.

The limiting factor is how many files are in a "pseudo directory". Rclone needs to keep the info for each directory in memory. This takes about 1k of RAM per object, so if your largest directory has 1,000,000 objects in it then rclone will use about 1G of RAM per --checkers.

If you want to find the largest directory then you could do something like this

rclone lsf -R --files-only remote:bucket | sort > listing
sed 's/\(.*\)\/\(.*\)$/\1/' < listing | sort | uniq -c | sort -n | tail

Which will output something like this

     22 rclone-integration-tests/2020-09-18-172054
     22 rclone-integration-tests/2020-09-21-181957
     22 rclone-integration-tests/2020-10-07-183607
     25 rclone/bin
     47 rclone/docs/content
     68 rclone/docs/content/commands
    101 100files
    101 100files-copy
   1301 dir
   2000 2000small

This indicates that the directory 2000small has 2000 objects in it.
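
As a variant of the pipeline above, here is a minimal sketch that counts objects per pseudo directory with awk, so the listing does not have to be sorted first. The helper name count_by_dir is made up for this example. It assumes one key per line on stdin and reports keys without a slash under ".". Note that awk keeps one counter per distinct directory in memory, which may itself be sizeable for a bucket with very many prefixes.

```shell
# count_by_dir: read object keys on stdin and print "COUNT DIRECTORY",
# sorted ascending by count (hypothetical helper, not part of rclone).
count_by_dir() {
  awk '{
    dir = "."                                   # keys without a slash
    if (match($0, /.*\//))                      # greedy: up to the last slash
      dir = substr($0, 1, RLENGTH - 1)          # drop the trailing slash
    count[dir]++
  }
  END { for (d in count) print count[d], d }' | sort -n
}
```

It could be used as: rclone lsf -R --files-only remote:bucket | count_by_dir | tail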

If you are migrating from one S3 appliance to another, then you'd use --checksum to stop rclone reading metadata.

dingdong · October 15, 2020, 12:16pm

Thanks for the initial advice.

I ran this against one of our smaller buckets, and it seems that the data is distributed evenly across file paths. As there are 15 file paths, this would roughly mean that we need more than 166GB of memory. This should not be an issue, as we can allocate around 512-768GB if needed. I need to do some more testing and see if I can calculate this against the large bucket. That will take a long time to complete.
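
The 166GB figure can be reproduced with a quick back-of-the-envelope calculation. This is only a sketch: it takes "about 1k of RAM per object" as 1000 bytes, assumes the 2.5 billion objects are spread evenly over the 15 paths, and the function name estimate_ram_gb is made up for this example.

```shell
# estimate_ram_gb OBJECTS_IN_LARGEST_DIR [CHECKERS]
# Prints a rough RAM estimate in GB (decimal).
estimate_ram_gb() {
  local objects="$1"
  local checkers="${2:-8}"        # rclone's default for --checkers is 8
  local bytes_per_object=1000     # assumption: "about 1k" taken as 1000 bytes
  echo $(( objects * checkers * bytes_per_object / 1000000000 ))
}

total_objects=2500000000          # ~2.5 billion objects in the largest bucket
paths=15                          # assumed to be evenly filled
per_path=$(( total_objects / paths ))

estimate_ram_gb "$per_path" 1     # one path listed at a time: prints 166
estimate_ram_gb "$per_path"       # 8 paths listed at once: prints 1333
```

So whether this fits into 512-768GB would depend heavily on how many of the large paths are being listed at the same time.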

Luckily, the bucket we need to migrate is WORM protected and there will not be any new writes to it during the migration. That should make things a bit easier. We just need a 1:1 copy of all objects. I will need to dig deeper into the needed parameters and start testing this tool a bit more.

One issue I’m wondering about is what happens if there’s a network failure or other issue during the migration. Is the only option to start from the beginning? All list operations against the source array are painful to execute. I already noticed that there are no graceful stop or pause options.

ncw · October 17, 2020, 9:47am

To make sure rclone only transfers one of these paths at once you need --checkers 1, otherwise it will try to do several at the same time.

I recommend --checksum. Put --transfers up from the default 4 to get the throughput up.
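
Put together, the flags mentioned so far might look like the sketch below. The remote names, the --transfers value of 32 and the --tpslimit value of 1200 are placeholders chosen for illustration, not recommendations. Note that --tpslimit caps HTTP transactions per second, which is not necessarily the same as objects copied per second.

```shell
RCLONE="${RCLONE:-rclone}"   # set RCLONE=echo to print the command instead of running it

# copy_bucket SRC DST [TRANSFERS] [TPS]
# --checksum      compare by checksum and size instead of modification time
# --checkers 1    list/check one directory at a time to bound memory use
# --transfers     parallel transfers (default is 4); 32 is an arbitrary example
# --tpslimit      cap on HTTP transactions per second; 1200 is an example value
copy_bucket() {
  "$RCLONE" copy "$1" "$2" \
    --checksum \
    --checkers 1 \
    --transfers "${3:-32}" \
    --tpslimit "${4:-1200}"
}
```

For example: copy_bucket src:bucket dst:bucket 64 1500 (src: and dst: being whatever the remotes are called in your rclone config).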

Yes, starting from the beginning is the only option. However, rclone won't transfer stuff it has already transferred.

If you are concerned you could split the transfer into the 15 paths.
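
One way to do the split, as a rough sketch: keep the top-level paths in a text file, copy them one by one, and record each finished path so that a restart skips it. The function name, the two file names and the remote names are all made up for this example; nothing here is built into rclone.

```shell
RCLONE="${RCLONE:-rclone}"   # set RCLONE=echo for a dry run that only prints commands

# copy_prefixes PREFIX_FILE DONE_FILE SRC DST
# PREFIX_FILE: one top-level path per line; DONE_FILE: paths already completed.
copy_prefixes() {
  local prefix_file="$1" done_file="$2" src="$3" dst="$4" prefix
  touch "$done_file"
  while IFS= read -r prefix; do
    [ -n "$prefix" ] || continue                 # ignore empty lines
    if grep -qxF -- "$prefix" "$done_file"; then
      echo "skip $prefix"                        # finished in an earlier run
      continue
    fi
    # </dev/null keeps the command from consuming the prefix list on stdin
    if "$RCLONE" copy "$src/$prefix" "$dst/$prefix" --checksum --checkers 1 </dev/null; then
      echo "$prefix" >> "$done_file"             # mark as completed
    else
      echo "failed $prefix" >&2
      return 1                                   # stop; rerun later to resume
    fi
  done < "$prefix_file"
}
```

For example: copy_prefixes prefixes.txt done.txt src:bucket dst:bucket. A path that failed halfway is simply copied again on the next run, and rclone skips the objects that already made it across.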

You can pause the process with CTRL-Z for an arbitrary amount of time and it will come back just fine.
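
CTRL-Z only works from the terminal where rclone runs in the foreground (fg brings it back). If rclone runs in the background or in another session, sending SIGSTOP and SIGCONT should have a similar effect. A small sketch with made-up helper names:

```shell
# pause_job PID: suspend a running process (similar in effect to CTRL-Z)
pause_job()  { kill -STOP "$1"; }

# resume_job PID: let a suspended process continue where it left off
resume_job() { kill -CONT "$1"; }
```

For example: pause_job 12345 and later resume_job 12345, where 12345 stands for the PID of the rclone process.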

asdffdsa · October 17, 2020, 12:55pm

hi, can you change the transfer rate on the fly with this
https://rclone.org/rc/#core-bwlimit
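
A sketch of how that could look, assuming the copy was started with the --rc flag so that the remote control API is listening (by default on localhost:5572). The function name is made up. Keep in mind this changes the bandwidth limit, not the --tpslimit value.

```shell
RCLONE="${RCLONE:-rclone}"   # set RCLONE=echo to print the command instead of running it

# set_bwlimit RATE: ask a running rclone (started with --rc) to change its
# bandwidth limit, e.g. "10M" for 10 MiB/s or "off" to remove the limit.
set_bwlimit() {
  "$RCLONE" rc core/bwlimit "rate=$1"
}
```

For example: set_bwlimit 10M during business hours and set_bwlimit off at night.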
