ncw
(Nick Craig-Wood)
July 5, 2023, 12:09pm
41
I don't know anything about the source machine, but it may list directories more quickly using the v2 list protocol, if that is supported. AWS developed v2 to make listing quicker.
So you can try setting the flag --s3-list-version 2 (it can also go in the config file).
--s3-list-version int Version of ListObjects to use: 1,2 or 0 for auto
On 1.63.0 this will either work or give an error saying it is unsupported. Don't try it on earlier rclone versions, as they are very likely to get into a directory loop if v2 isn't supported.
ncw:
--s3-list-version
rclone copy hcp:bucket1/RES/tblIMAGE/ scality:spring-bucket/dir/tsubdir/ -vv --checksum --s3-versions --progress --create-empty-src-dirs --log-file=tblIMAGE_migration_2023.07.05.13.2é0.txt --dry-run --fast-list --dump headers --s3-list-version 2
Seems to not generate errors at least! I'll run it a while and see if it throttles down as well.
No looping so far at least
Hmmm, I noticed something now, which also points to the issue being with the source object store.
These go fast as lightning:
2023/07/05 13:18:44 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/07/05 13:18:44 DEBUG : HTTP REQUEST (req 0xc000769200)
2023/07/05 13:18:44 DEBUG : GET /spring-bucket?delimiter=&key-marker=RES%2FtblIMAGE%2F16456059-thumb&max-keys=1000&prefix=RES%2FtblIMAGE%2F&version-id-marker=null&versions= HTTP/1.1
Host: new.s3.enpoint.com
User-Agent: rclone/v1.64.0-beta.7128.22a14a8c9
Authorization: XXXX
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20230705T111844Z
Accept-Encoding: gzip
2023/07/05 13:18:44 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/07/05 13:18:44 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2023/07/05 13:18:44 DEBUG : HTTP RESPONSE (req 0xc000769200)
2023/07/05 13:18:44 DEBUG : HTTP/1.1 200 OK
Transfer-Encoding: chunked
Connection: keep-alive
Content-Type: application/xml
Date: Wed, 05 Jul 2023 11:18:44 GMT
Server: openresty
Strict-Transport-Security: max-age=31536000; includeSubdomains ; preload; always
X-Amz-Id-2: bc5a61bf114cb152e84d
X-Amz-Request-Id: bc5a61bf114cb152e84d
2023/07/05 13:18:44 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2023/07/05 13:18:44 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/07/05 13:18:44 DEBUG : HTTP REQUEST (req 0xc000257800)
2023/07/05 13:18:44 DEBUG : GET /spring-bucket?delimiter=&key-marker=RES%2FtblIMAGE%2F16793601&max-keys=1000&prefix=RES%2FtblIMAGE%2F&version-id-marker=null&versions= HTTP/1.1
Host: new.s3.enpoint.com
User-Agent: rclone/v1.64.0-beta.7128.22a14a8c9
Authorization: XXXX
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20230705T111844Z
Accept-Encoding: gzip
2023/07/05 13:18:44 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/07/05 13:18:44 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2023/07/05 13:18:44 DEBUG : HTTP RESPONSE (req 0xc000257800)
2023/07/05 13:18:44 DEBUG : HTTP/1.1 200 OK
Transfer-Encoding: chunked
Connection: keep-alive
Content-Type: application/xml
Date: Wed, 05 Jul 2023 11:18:44 GMT
Server: openresty
Strict-Transport-Security: max-age=31536000; includeSubdomains ; preload; always
X-Amz-Id-2: fe51723b2d46bcd01290
X-Amz-Request-Id: fe51723b2d46bcd01290
2023/07/05 13:18:44 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
But the ones that slow down to a snail's pace are these, from the source object store:
2023/07/05 14:52:58 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/07/05 14:52:58 DEBUG : HTTP REQUEST (req 0xc04c6b3c00)
2023/07/05 14:52:58 DEBUG : GET /bucket1?delimiter=&key-marker=RES%2FtblIMAGE%2F14627169-thumb&max-keys=1000&prefix=RES%2FtblIMAGE%2F&version-id-marker=99969317042945&versions= HTTP/1.1
Host: old.s3.endpoint.com
User-Agent: rclone/v1.64.0-beta.7128.22a14a8c9
Authorization: XXXX
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20230705T125258Z
Accept-Encoding: gzip
2023/07/05 14:52:58 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/07/05 14:53:00 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2023/07/05 14:53:00 DEBUG : HTTP RESPONSE (req 0xc04c6b3c00)
2023/07/05 14:53:00 DEBUG : HTTP/1.1 200 OK
Transfer-Encoding: chunked
Cache-Control: no-cache,no-store,must-revalidate
Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-eval' 'unsafe-inline'; connect-src 'self'; img-src 'self'; style-src 'self' 'unsafe-inline'; object-src 'self'; frame-ancestors 'self';
Content-Type: application/xml;charset=utf-8
Date: Wed, 05 Jul 2023 12:52:58 GMT
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Pragma: no-cache
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin, Access-Control-Request-Headers, Access-Control-Request-Method
Vary: Accept-Encoding, User-Agent
X-Content-Type-Options: nosniff
X-Dns-Prefetch-Control: off
X-Download-Options: noopen
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block
2023/07/05 14:53:00 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2023/07/05 14:53:00 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/07/05 14:53:00 DEBUG : HTTP REQUEST (req 0xc05ba66500)
2023/07/05 14:53:00 DEBUG : GET /bucket1?delimiter=&key-marker=RES%2FtblIMAGE%2F14627810&max-keys=1000&prefix=RES%2FtblIMAGE%2F&version-id-marker=99930884749889&versions= HTTP/1.1
Host: old.s3.endpoint.com
User-Agent: rclone/v1.64.0-beta.7128.22a14a8c9
Authorization: XXXX
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20230705T125300Z
Accept-Encoding: gzip
2023/07/05 14:53:00 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/07/05 14:53:01 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2023/07/05 14:53:01 DEBUG : HTTP RESPONSE (req 0xc05ba66500)
2023/07/05 14:53:01 DEBUG : HTTP/1.1 200 OK
Transfer-Encoding: chunked
Cache-Control: no-cache,no-store,must-revalidate
Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-eval' 'unsafe-inline'; connect-src 'self'; img-src 'self'; style-src 'self' 'unsafe-inline'; object-src 'self'; frame-ancestors 'self';
Content-Type: application/xml;charset=utf-8
Date: Wed, 05 Jul 2023 12:53:00 GMT
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Pragma: no-cache
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin, Access-Control-Request-Headers, Access-Control-Request-Method
Vary: Accept-Encoding, User-Agent
X-Content-Type-Options: nosniff
X-Dns-Prefetch-Control: off
X-Download-Options: noopen
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block
2023/07/05 14:53:01 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
So something smells rotten in the state of my old object storage.
I've opened a support case with the Vendor, so I'll have to troubleshoot it with them.
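To put numbers on the difference between the two endpoints, the request/response pairs in a --dump headers log can be matched up by their req id. The following is a minimal sketch, assuming the log lines look exactly like the excerpts above; the function name latencies_by_host is made up for illustration, and with timestamps at one-second resolution the results are only coarse.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches the marker lines seen in the dumps above, e.g.
# "2023/07/05 14:52:58 DEBUG : HTTP REQUEST (req 0xc04c6b3c00)"
MARKER = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) DEBUG : "
    r"HTTP (REQUEST|RESPONSE) \(req (0x[0-9a-f]+)\)"
)


def latencies_by_host(lines):
    """Return {host: [seconds, ...]} for each matched request/response pair.

    Hypothetical helper: it assumes the Host header follows its own
    HTTP REQUEST marker before any other marker line appears.
    """
    started = {}      # req id -> [request timestamp, host or None]
    current = None    # req id whose request headers are being read
    result = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        m = MARKER.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
            kind, req_id = m.group(2), m.group(3)
            if kind == "REQUEST":
                started[req_id] = [ts, None]
                current = req_id
            else:
                current = None
                if req_id in started:
                    t0, host = started.pop(req_id)
                    result[host or "unknown"].append((ts - t0).total_seconds())
        elif current in started and line.startswith("Host: "):
            started[current][1] = line[len("Host: "):]
    return dict(result)
```

Feeding it the lines of the log file (for example `latencies_by_host(open(path, encoding="utf-8", errors="replace"))`, with path pointing at the --log-file output) gives one list of latencies per Host value, which makes it easy to compare the old and new endpoints.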
The --s3-list-version 2 flag didn't improve the performance.
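One thing visible in the dumps above: the list requests carry versions=, key-marker and version-id-marker, which are the query parameters of the S3 ListObjectVersions call, not of ListObjects v1 (marker) or v2 (list-type=2, continuation-token). That may be why the flag made no visible difference while --s3-versions is set, though this is only a reading of the logs and not something confirmed in this thread. A small sketch for checking which list call a logged request line corresponds to (classify_list_call is a made-up helper name):

```python
from urllib.parse import parse_qs, urlsplit


def classify_list_call(request_target: str) -> str:
    """Guess the S3 list API from the query string of a logged GET line.

    Hypothetical helper, based only on the documented S3 query parameters:
    "versions" -> ListObjectVersions, "list-type=2" -> ListObjectsV2,
    anything else is treated as ListObjects v1.
    """
    # keep_blank_values is needed because "versions=" has an empty value
    query = parse_qs(urlsplit(request_target).query, keep_blank_values=True)
    if "versions" in query:
        return "ListObjectVersions"
    if query.get("list-type") == ["2"]:
        return "ListObjectsV2"
    return "ListObjects (v1)"
```

Passing it the request target from one of the GET lines above (the part between GET and HTTP/1.1) shows which of the three calls was actually sent.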
Does a copy start with a compare?
I.e. does it look at the destination first, then start doing GETs from the source?
Otherwise I don't understand why it would list everything from the destination bucket first (the part that goes well)...
Maybe it does not like User-Agent: rclone/v1.64.0-beta.7128.22a14a8c9
You can change it by adding e.g.:
--user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
But probably only S3 storage folks can shed some light on this abysmal listing performance.
Yeah... I think it's up to Hitachi to answer what the issue is here!
Thanks for all the help.
I'll share the solution here when I have one.
ncw
(Nick Craig-Wood)
July 5, 2023, 2:15pm
46
Good idea. Let us know the outcome.
It was worth a try.
It should be listing both source and destination at the same time. If the destination finishes much sooner, maybe you haven't noticed that?
For comparison, here is an example of a well-performing server: 80 million objects take 2h to list.
With --dump-headers, we can see that there were 80095 HTTP REQUEST, and as many HTTP RESPONSE (200). Processing took about 2 hours.
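As a rough back-of-the-envelope check of those figures, here is a small sketch that turns them into per-request numbers. The function name listing_stats is made up for illustration, and the seconds-per-request figure is only an average spacing between requests: it equals the request latency only if the listing requests were issued strictly one after another, which is an assumption.

```python
def listing_stats(objects: int, requests: int, elapsed_seconds: float) -> dict:
    """Average listing figures derived from totals (illustrative helper)."""
    return {
        "objects_per_request": objects / requests,
        "seconds_per_request": elapsed_seconds / requests,
        "requests_per_second": requests / elapsed_seconds,
    }


# Figures quoted above: 80 million objects, 80095 requests, about 2 hours.
stats = listing_stats(80_000_000, 80_095, 2 * 60 * 60)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

That works out to about 999 objects per request, consistent with max-keys=1000 pages, at roughly 11 requests per second, or about 0.09 s per request on average.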
The server has 30G of RAM. I restarted the copy with the options indicated, with the same result.
Good that things work, but I think it applies to your other thread :) unless it fixes both?
Yes... was the wrong window/thread...