So in summary: rclone first lists the entire content of the source "leaf" directory before it starts transferring. If that single directory is the only one to transfer and contains several million objects, the transfer is delayed by the time it takes to list the whole directory, which can add up to several hours. I created a feature request on GitHub.
For now, I ended up with:

- List the entire source prefix:
  rclone lsf --absolute Cloudian:/bucket/prefixC/ > list_of_object_names.txt
  For 90M objects, the result is a 4 GiB text file.
- Split the file into 90 files with 1M lines/objects each.
- Run a for loop around
  rclone copy --checksum --s3-no-head --s3-no-head-object --no-traverse --no-check-dest --files-from-raw list_of_object_files_part00.txt
  for each of the 90 files (a combined sketch follows below).
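Putting the three steps together, here is a minimal bash sketch of what I mean. The split invocation and the destination remote `Dest:/bucket/prefixC/` are placeholders of my own; only the rclone flags come from the steps above, so adjust names and paths to your setup:

```bash
#!/usr/bin/env bash
set -euo pipefail

# 1. List every object under the source prefix once (this is the slow, hours-long step).
rclone lsf --absolute Cloudian:/bucket/prefixC/ > list_of_object_names.txt

# 2. Split the listing into chunks of 1M lines: ...part00.txt, ...part01.txt, ...
split --lines=1000000 --numeric-suffixes --suffix-length=2 --additional-suffix=.txt \
  list_of_object_names.txt list_of_object_files_part

# 3. Copy each chunk separately, so a crash only costs one chunk, not the full listing.
#    "Dest:/bucket/prefixC/" is a made-up destination remote.
for part in list_of_object_files_part[0-9][0-9].txt; do
  rclone copy \
    --checksum --s3-no-head --s3-no-head-object \
    --no-traverse --no-check-dest \
    --files-from-raw "$part" \
    Cloudian:/bucket/prefixC/ Dest:/bucket/prefixC/
done
```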
Like @Ole noted, trying to do it in a single operation did not work well either: I stopped it when rclone had reached 24 GB of memory consumption and no transfer had started.
This way, I still have to list the entire source directory once, but if something unexpected happens during the transfer and rclone or the node crashes, I can at least skip the 3-hour-long listing step when restarting.
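To make restarting even cheaper, a small variation of the loop above (my own sketch, not something from the original run) writes a marker file per chunk, so a re-run only retries the chunks that had not finished:

```bash
# Resume-friendly variant: skip chunks that already have a ".done" marker.
for part in list_of_object_files_part[0-9][0-9].txt; do
  [ -e "${part}.done" ] && continue      # finished in a previous run
  rclone copy \
    --checksum --s3-no-head --s3-no-head-object \
    --no-traverse --no-check-dest \
    --files-from-raw "$part" \
    Cloudian:/bucket/prefixC/ Dest:/bucket/prefixC/ \
    && touch "${part}.done"              # marker only written if the copy succeeded
done
```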
Edit: Thanks @asdffdsa for the --absolute parameter - I added it to the steps.