Has anyone transferred a large number of small files from remote-A to remote-B?

What is the problem you are having with rclone?

When I transfer a billion small files, it seems to be very slow. Has anyone done the same kind of project, or migrated a billion files? Please tell me which options I can use for optimization. Thank you so much!

What is your rclone version (output from rclone version)

  • rclone v1.53.3
  • go version: go1.15.5

Which OS you are using and how many bits (eg Windows 7, 64 bit)

  • os/arch: linux/amd64 CentOS release 6.9 (Final)

Which cloud storage system are you using? (eg Google Drive)

From QINIU Kodo to AWS S3

The command you were trying to run (eg rclone copy /tmp remote:tmp)




rclone copyto qiniu:kodo-bucketname aws:s3-bucketname -P -c --check-first --checkers=32 --transfers=80 --use-mmap  -vv


    

The rclone config contents with secrets removed.




   [qiniu]   
   type = s3   
   provider = Other   
   env_auth = false   
   access_key_id = 123456   
   secret_access_key = 123456   
   endpoint = s3-us-north-1.qiniucs.com   
   acl = bucket-owner-full-control   

   [aws]   
   type = s3   
   provider = AWS   
   access_key_id = 123456   
   secret_access_key = 123456   
   region = ap-southeast-1   
   location_constraint = ap-southeast-1   
   acl = bucket-owner-full-control   
   bucket_acl = public-read-write   



A log from the command with the -vv flag

No log. When I transfer a billion small files, it seems to be very slow. Please tell me which options I can use for optimization. Thank you so much!

hello and welcome to the forum,

slow as compared to what?

some backends, like gdrive, throttle the number of transactions per second.
qiniu - never heard of them in the forum.

Thank you for your reply.
QINIU is a Chinese storage provider.
I have a billion files, about 50 TB in total. When I ran the transfer with "-n" (dry run), it was killed at around 2 TB.
This is the event log:



2020-12-24 05:47:08 ERROR : : Entry doesn't belong in directory "" (same as directory) - ignoring
2020-12-24 12:53:38 ERROR : 10307/91/35: error reading source directory: RequestError: send request failed
caused by: Get "https://s3-us-north-1.qiniucs.com/kodo-bucketname?delimiter=%2F&max-keys=1000&prefix=10307%2F91%2F35%2F": http2: client connection force closed via ClientConn.Close

caused by: http2: client connection force closed via ClientConn.Close
Transferred:             0 / 2.293 TBytes, 0%, 0 Bytes/s, ETA -
Errors:                82 (retrying may help)
Transferred:            0 / 49223111, 0%
Elapsed time:   8h12m25.0s
Killed



I guess this is due to limitations on the QINIU side, but I don't know what "Entry doesn't belong in directory" means.
BTW, I calculated that the whole transfer would take about 7 days. What can I do to speed up the transfer?

Thank you!

you can try to tweak
https://rclone.org/docs/#transfers-n
https://rclone.org/docs/#checkers-n
https://rclone.org/s3/#s3-chunk-size
https://rclone.org/docs/#buffer-size-size
https://rclone.org/s3/#multipart-uploads
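
for example, a rough starting point could look like the command below - the flag values are only illustrative guesses, not tuned for your data, so experiment with them:

   # illustrative values only - adjust for your data set
   rclone copy qiniu:kodo-bucketname aws:s3-bucketname \
       --transfers=64 --checkers=64 \
       --s3-chunk-size=16M --s3-upload-concurrency=4 \
       --buffer-size=16M \
       -P -v

since most of your files are small, the multipart settings (chunk size, upload concurrency) only kick in for objects above the multipart cutoff; the biggest levers are probably --transfers and --checkers. also keep an eye on memory, since usage grows roughly with --transfers x --buffer-size.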

I will try. Thank you.

Hi
After the first sync is complete, I will do incremental data syncs daily. I will use the copyto command; will the MD5 values of 1 billion files be compared each time? Because I found this info:

This doesn't transfer unchanged files, testing by size and modification time or MD5SUM. It doesn't delete files from the destination.

Thank you bro.

hi,
most people use copy, not copyto.

take a read of this
https://rclone.org/docs/#c-checksum
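
as a sketch of the daily incremental run (untested, same remote names as your command above), something like:

   # skip unchanged files by comparing size + MD5 instead of size + modtime
   rclone copy qiniu:kodo-bucketname aws:s3-bucketname --checksum -P -v

as far as i know the MD5s come from the object listings/metadata, so nothing gets re-downloaded, but rclone still has to list and compare every object on each run - with a billion objects that check phase alone will take a while.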

Sure, let me try. Thank you bro.

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.