Almost 24 hours of CPU time during a sync between two large S3 buckets

What is the problem you are having with rclone?

Hi all,

I have read a lot of articles, Stack Overflow threads, and rclone forum topics, but I still have a problem and I'm out of ideas, so I'm asking for help here.

My request is "quite" simple: I need to sync a bucket on Ceph + radosgw to another bucket on Wasabi. The bucket has millions of objects, with a total size of around 100TB.

I ran the command shown in the section below; I tuned the parameters after some tests on a smaller bucket, to make sure the massive read load did not stress Ceph too much.

The timeline for that first sync was:

  • day 0, 08:00: sync is launched, requests are made to the Ceph and Wasabi endpoints
  • day 0, 14:30: no more requests, one CPU core is used at 100% (all the others are idle), and the only logs I see are these:
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       1m0.3s

[...]

Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:   20h59m0.3s
  • day 1, 11:30: data transfer starts, using all the bandwidth I allowed (250 MB/s during the day, 500 MB/s during the night)
  • day 6, 10:00: data transfer is done

Timing summary:

  • listing files in buckets: 6h30m
  • the "CPU thing" (computing a diff? building a graph?): 21h00m
  • data transfer (110 TB): 4d 22h30m

After that first sync, I made a second run, and the timings were:

  • listing files in buckets: 30m
  • the "CPU thing" (computing a diff? building a graph?): 20h00m
  • data transfer (1 TB): 1h30m

So, my question is: what is rclone doing with the CPU during these 21 hours? Is it possible to optimize it, multithread it, etc.?

Run the command 'rclone version' and share the full output of the command.

rclone v1.62.2
- os/version: ubuntu 20.04 (64 bit)
- os/kernel: 5.4.0-148-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.20.2
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Source: S3 (Ceph + radosgw)
Destination: S3 (Wasabi)

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone \
    --config=/etc/rclone/rclone.conf \
    --rc \
    --rc-addr ":5572" \
    --rc-enable-metrics \
    --rc-web-gui \
    --rc-user rclone \
    --rc-pass ************ \
    --max-delete 1000 \
    --checkers 32 \
    --fast-list \
    --transfers 16 \
    --buffer-size=1G \
    --bwlimit "00:00,500M 05:00,250M" \
    --s3-chunk-size 32M \
    --s3-upload-concurrency 16 \
    --s3-list-chunk 50000 \
    --checksum \
    --dump=headers --retries=1 --low-level-retries=1 --log-level=DEBUG --log-file=/tmp/rclone/rclone.log \
    sync \
    ceph:************ \
    wasabi:************

The rclone config contents with secrets removed.

[ceph-production]
type = s3
provider = Ceph
region =
env_auth = false
access_key_id = ****************************
secret_access_key = ****************************
endpoint = **********************
acl = private

[wasabi]
type = s3
provider = Wasabi
env_auth = false
access_key_id = ****************************
secret_access_key = ****************************
endpoint = **********************
acl = private

A log from the command with the -vv flag

It generates a lot of logs; I'm not sure it is relevant to paste them here.

A good question!

So there were no network requests in that 21h? What was in the log when rclone started transferring stuff?

Is it possible the machine was swapping then and running very slowly?
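A quick standard way to check while it is in that phase would be:

vmstat 1

If the si/so columns stay at zero, the box isn't swapping.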

How many files do you have in your largest "directory"? (I know S3 doesn't really have directories, but rclone syncs on a directory-by-directory basis.)

You could run rclone with the --rc flag, then use this to grab a list of running goroutines:

curl http://localhost:5572/debug/pprof/goroutine?debug=1

If you could get one of those when it is in the high-CPU, no-networking phase, that would be incredibly useful.
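If you have the Go toolchain installed, a 30-second CPU profile would also show exactly where the time is going. The rc server exposes the standard Go pprof endpoints, so something like this should work (you'll need to pass your --rc-user/--rc-pass credentials; PASSWORD here is a placeholder):

go tool pprof http://rclone:PASSWORD@localhost:5572/debug/pprof/profile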

I suspect this is something to do with --fast-list. You could try the sync without it; it might be quicker if you raise --checkers above the default (and you have lots of directories to process; if everything is just in one big directory with millions of files then it will make no difference). For example:
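Something like this, purely as an illustration (BUCKET is a placeholder for your masked bucket names, not a tested recipe):

rclone sync ceph:BUCKET wasabi:BUCKET \
    --checkers 64 \
    --transfers 16 \
    --checksum

Without --fast-list, rclone lists each directory with separate requests, so the listing work is spread across all the checkers instead of being done in one single recursive pass.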

Hey, thanks for your response! Here is some data on what you asked.

So there were no network requests in that 21h? What was in the log when rclone started transferring stuff?

No network activity at all. When the transfer started, I got this log just after the last "Transferred: 0 B / 0 B":

2023/06/28 05:43:13 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>                                       
2023/06/28 05:43:13 DEBUG : HTTP REQUEST (req ******************)                                                                    
2023/06/28 05:43:13 DEBUG : GET /**************************** HTTP/1.1                                
Host: **********************
User-Agent: rclone/v1.62.2                                                                                                     
Accept-Encoding: gzip                                                                                                          
Authorization: XXXX                                                                                                            
X-Amz-Content-Sha256: **********************************                               
X-Amz-Date: ************************************

Is it possible the machine was swapping then and running very slowly?

No swap at all; it's a server with 128 GB of RAM.

How many files do you have in your largest "directory" (I know s3 doesn't really have directories but rclone syncs on a directory by directory basis).

I think this is the main problem. At the "root" of the bucket there are only "folders", no files. And when I say a lot, I mean around 500k to 1M of them.
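For reference, a quick way to count the top-level entries exactly (rclone lsf is non-recursive by default; BUCKET stands for our masked bucket name):

rclone lsf ceph:BUCKET | wc -l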

If you could get one of those when it is in high CPU no networking phase that would be incredibly useful.

95 or 96 goroutines during this phase, and sometimes 97. I scraped the metrics during the sync.

I suspect this is something to do with --fast-list - you could try the sync without it

We tested without it, and it took far longer.

it might be quicker if you raise --checkers above the default

I already set it to 32; do you think more would help?

That is a lot, but well within the capacity of your server. As a rule of thumb, expect about 1k of memory usage per file in a directory, so a million entries will only use about 1 GB of RAM for the sync.
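If you want to sanity-check that while it runs, the rc server you already have enabled can report rclone's memory usage via the built-in core/memstats command (adjust the credentials to your --rc-user/--rc-pass; PASSWORD is a placeholder):

curl -s -u rclone:PASSWORD -X POST http://localhost:5572/core/memstats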

If you could get me the curl output that would be perfect.

This shouldn't have any user identifiable info in it, but if you are worried about it then email it to me nick@craig-wood.com - put a link to this forum issue in the email so I've got some context.

I was talking about raising --checkers for use without --fast-list; it probably won't make much difference with --fast-list.

Hi Nick,

I just emailed you the output, thanks.

Got the trace thank you.

I can see what is wrong - it looks like a single directory with 1 million entries has hit a pathological case in the --fast-list code.

The --fast-list code builds a directory tree then makes sure it is normalized. It is this normalization process which is running very slowly. I suspect the current algorithm is O(N²) on number of directory entries and when N = 1,000,000 that makes for a long time processing.
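As a back-of-the-envelope check: with N = 1,000,000 entries, an O(N²) pass is on the order of 10¹² operations, and at the roughly 10⁷ to 10⁸ simple operations per second a single core manages, that works out to anywhere from a few hours to a day of CPU time, which is consistent with the ~21h observed.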

I have tried to reproduce the pathological behavior but haven't managed it yet.

Would it be possible for you to email me a file listing? You'd get this by doing

rclone lsf -R ceph:************ > file-listing

I don't actually need to know the names of the files, just how they are laid out and how many are in each directory. If the file names are confidential and you can't email them to me, then you could do this instead, which replaces every character other than / and newline with an a:

rclone lsf -R ceph:************ | tr -c '/\n' 'a' > cleaned-file-listing

I can probably work with that too!
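If it is easier, even a rough histogram of entries per depth would help; something like this on the cleaned listing should produce one (it counts path components per line, so directories with a trailing / land one deeper):

awk -F/ '{print NF}' cleaned-file-listing | sort -n | uniq -c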

Scratch the above! I've managed to replicate your problem, after noting that you have 1M directories in the root, not files.

Will provide an update when I've worked out what is happening!

Nice, looking forward to seeing how we can fix that.

Do you think it's possible to parallelize this normalization step?

This should take a few seconds at most. I'll post something for you to try in the next few days.

It's not a 1 line fix so needs a bit of thought and testing first!

(And I'm making the 1.63 release tomorrow!)

Perfect, thanks Nick. If you need data to try it out: our bucket has 1 million "folders", with "subfolders" laid out like this:

a/
a/a
a/a/
a/a/
a/a/a
a/a/a/
a/a/a/
a/a/a/
a/a/a/
a/a/a/
a/a/a/a
a/a/a/a
a/a/a/a
a/a/a/a
a/a/a/a
a/a/a/a
a/a/a/a
a/a/a/a
a/a/a/a
a/a/a/a
a/a/a/a
a/a/a/a
a/a/a/a
a/a/a/a
a/a/a/a

I've had a chance to make a fix for this.

Please give this a go. It shouldn't take longer than 1s to process the directory so hopefully this will speed things up!

It's possible there are other bottlenecks; if so, send me the goroutine dump as before.

v1.64.0-beta.7121.89048c7fd.fix-dirtree on branch fix-dirtree (uploaded in 15-30 mins)

Great! I'll give it a try today and will keep you posted.

I just ran some tests and indeed, the "CPU thing" went from ~20 hours to a few seconds. That is crazy! Thanks again, Nick.

Will it be included in the next patch version?

Thanks for testing.

I've merged this to master now which means it will be in the latest beta in 15-30 minutes and released in v1.63.1 (which will likely be in a few days).

PS: If you are working for a business, maybe they'd like rclone support? Check out rclone.com for details.

Perfect, I'll wait for the next release, thanks.
I do work for a company indeed; I'll have a word with my team about it, I'd be glad to support the project.

Hello. This thread was recommended to me by another user here.
I've got a similar, but not identical, issue.
I'm trying to copy a big bucket from a Hitachi HCP to a Scality RING bucket:
44.5 million objects in one "directory", and it hasn't actually copied a single object yet.
It never gets out of the "index" phase; I aborted after about 150-180 hours.

I'm definitely going to test this to see if it solves my issue as well.

Otherwise, maybe Nick can help.
