Here is the config (keys and endpoints redacted):
[PPROD1]
type = s3
provider = Other
env_auth = false
access_key_id = *****
secret_access_key = ******
endpoint = ****
acl = bucket-owner-full-control
[PPROD2]
type = s3
provider = Other
env_auth = false
access_key_id = ****
secret_access_key = ****
endpoint = *****
acl = bucket-owner-full-control
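A quick sanity check that both remotes are reachable is a simple top-level listing (this just lists the buckets on each remote):
rclone lsd PPROD1:
rclone lsd PPROD2: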
Raising the number of transfer workers and checkers is the only tuning that gave us high transfer bandwidth.
The OOM happens after 2 hours of transfer at 250 MiB/s.
The buckets hold 700 million objects of 512 KB each.
How can we avoid the OOM while keeping this transfer rate?
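For reference, the sync is launched with something like the following; the bucket name and exact flag values here are illustrative:
rclone sync PPROD1:bucket01 PPROD2:bucket01 --transfers 64 --checkers 128 --log-file sync.log -v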
What's the memory on the system you are using? You can try --use-mmap and see if that helps with the memory, but if you have a huge number of objects there is an open issue with directory caching that would need to be solved.
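For example, keeping the other flags the same (bucket name illustrative):
rclone sync PPROD1:bucket01 PPROD2:bucket01 --use-mmap --transfers 64 --checkers 128
--use-mmap tells rclone to allocate its transfer buffers with mmap instead of the Go allocator, so freed memory can be returned to the OS sooner.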
Yes, that is the case: we have 700 million objects in one folder.
Do you have a timeline for fixing it?
Is it possible to commission the change through a commercial offer?
Hello Nick,
I'm not sure I understand what the keys are, but here is an extract of the log file during a transfer from a test bucket with 1 million objects:
2020/09/17 08:03:13 INFO : blocks/0000afe4/48a3304237720d61000000000000065900000000-esrtay: Copied (new)
2020/09/17 08:03:13 DEBUG : blocks/0000b050/d28beb37fb543a7b000000000000065a00000000-md8wxt: MD5 = 386865483615bf8368be7a8f17bfb2d9 OK
2020/09/17 08:03:13 INFO : blocks/0000b050/d28beb37fb543a7b000000000000065a00000000-md8wxt: Copied (new)
2020/09/17 08:03:13 DEBUG : blocks/0000b092/bf761bb3537d3d51000000000000065700000000-i6o2tp: MD5 = 386865483615bf8368be7a8f17bfb2d9 OK
2020/09/17 08:03:13 INFO : blocks/0000b092/bf761bb3537d3d51000000000000065700000000-i6o2tp: Copied (new)
2020/09/17 08:03:13 DEBUG : blocks/0000962d/8ae178bb2672dd8c000000000000061e00000000-g8iyg: MD5 = 386865483615bf8368be7a8f17bfb2d9 OK
2020/09/17 08:03:13 INFO : blocks/0000962d/8ae178bb2672dd8c000000000000061e00000000-g8iyg: Copied (new)
2020/09/17 08:03:13 DEBUG : blocks/0000b098/bf761bb3518e3d94000000000000065700000000-erha3b: MD5 = 386865483615bf8368be7a8f17bfb2d9 OK
2020/09/17 08:03:13 INFO : blocks/0000b098/bf761bb3518e3d94000000000000065700000000-erha3b: Copied (new)
2020/09/17 08:03:13 DEBUG : blocks/0000b0a1/bf761bb3511d99d3000000000000065700000000-dkbbsa: MD5 = 386865483615bf8368be7a8f17bfb2d9 OK
2020/09/17 08:03:13 INFO : blocks/0000b0a1/bf761bb3511d99d3000000000000065700000000-dkbbsa: Copied (new)
2020/09/17 08:03:13 DEBUG : blocks/000099ca/8ae178bb26702b81000000000000061e00000000-g4pyg: MD5 = 386865483615bf8368be7a8f17bfb2d9 OK
2020/09/17 08:03:13 INFO : blocks/000099ca/8ae178bb26702b81000000000000061e00000000-g4pyg: Copied (new)
2020/09/17 08:03:13 DEBUG : blocks/0000b0b3/1efa7ea3d6bbb003000000000000062500000000-5v3m7j: MD5 = 386865483615bf8368be7a8f17bfb2d9 OK
2020/09/17 08:03:13 INFO : blocks/0000b0b3/1efa7ea3d6bbb003000000000000062500000000-5v3m7j: Copied (new)
2020/09/17 08:03:13 DEBUG : blocks/0000b062/d28beb37fb5b6928000000000000065a00000000-mdj0bj: MD5 = 386865483615bf8368be7a8f17bfb2d9 OK
2020/09/17 08:03:13 INFO : blocks/0000b062/d28beb37fb5b6928000000000000065a00000000-mdj0bj: Copied (new)
The run has been going for 8 hours, and the rclone process now takes 97 GB of RAM and continues to grow.
The key in S3 terminology is this bit: blocks/0000afe4/48a3304237720d61000000000000065900000000-esrtay
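If you want to see how the keys group into directories, something like this will list the top-level prefixes (bucket name illustrative):
rclone lsf PPROD1:bucket01/blocks/ --dirs-only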
How many files are in a typical directory? That is the limiting factor for rclone memory usage, not the total number of files. You can do something like this
rclone size PPROD1:bucket01/blocks/0000afe4
to find out.
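The output will look roughly like this (numbers invented for illustration, assuming about 1.5 million 512 KB objects in the prefix):
Total objects: 1500000
Total size: 732.422 GBytes (786432000000 Bytes)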
If there aren't too many, then I think your problem is that you are iterating too many large directories at once. So reduce --checkers from 128 to, say, --checkers 8 - this will iterate far fewer directories at a time and use 16 times less memory. I don't think it will slow things down.
That is how many objects are in the bucket. You need enough memory to hold that many rclone objects in memory at once, which is probably approx 1.5 GB. If you use --checkers 1 then rclone will only hold one directory in RAM at once, and I think you should be able to sync OK.
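As a rough back-of-envelope, assuming the ballpark figure of about 1 KB of memory per in-memory rclone object:
objects held in RAM ≈ checkers × objects per directory
memory ≈ objects held in RAM × ~1 KB
e.g. 1 checker × 1.5 million objects per directory × 1 KB ≈ 1.5 GB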
If I understand correctly, the number of checkers is the number of directories queued in RAM?
If I set 4 checkers, will I have 4 folders in RAM at the same time?