Transfer of big files (50GB+) from one S3 bucket to another S3 bucket doesn't start

Hi everyone. I'm new to the Rclone forum (though I've used Rclone for a while) and I'm having my first annoying problem (probably caused by my lack of knowledge).

What is the problem you are having with rclone?

Transferring big files (50GB+) from one S3 bucket to another S3 bucket doesn't start.

Run the command 'rclone version' and share the full output of the command.

rclone v1.65.0

  • os/version: amazon 2023 (64 bit)
  • os/kernel: 6.1.61-85.141.amzn2023.x86_64 (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.21.4
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

AWS S3

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync accountA:bucket1/xxxx/file_1_bak.csv accountA:bucket1/xxxx/file_1_bak.csv --stats=10s -vv --stats-one-line  --s3-disable-checksum --s3-chunk-size=5G

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[accountA]
type = s3
provider = AWS
env_auth = true
region = eu-west-1
location_constraint = eu-west-1
acl = bucket-owner-full-control
bucket_acl = private
server_side_encryption = AES256

A log from the command that you were trying to run with the -vv flag

I'll include only part of the log because it contains sensitive information; if more lines are needed, I'll provide them with some parts redacted.

<7>DEBUG : file_1_bak.csv: Need to transfer - File not found at Destination
<7>DEBUG : file_1_bak.csv: Starting  multipart copy with 11 parts
<6>INFO  :           0 B / 48.123 GiB, 0%, 0 B/s, ETA -
<6>INFO  :           0 B / 48.123 GiB, 0%, 0 B/s, ETA -
<6>INFO  :           0 B / 48.123 GiB, 0%, 0 B/s, ETA -
<6>INFO  :           0 B / 48.123 GiB, 0%, 0 B/s, ETA -
<6>INFO  :           0 B / 48.123 GiB, 0%, 0 B/s, ETA -
<6>INFO  :           0 B / 48.123 GiB, 0%, 0 B/s, ETA -
<6>INFO  :           0 B / 48.123 GiB, 0%, 0 B/s, ETA -
<6>INFO  :           0 B / 48.123 GiB, 0%, 0 B/s, ETA -

Thank you so much for your help :slight_smile:

It will start when ready... you set the chunk size to 5G, so a whole chunk has to be downloaded first. Try the default value - which is 5M - and it will start very quickly.

Also note that with the default --s3-upload-concurrency of 4 and a 5G chunk size, you need 20GB of RAM. If you copy multiple files, then with the default --transfers of 4 it grows to 80GB of RAM.

Check the docs and make sure you understand the impact of the flags you are changing. If you're not sure, better to use the defaults.
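For example, something like this (remote and path names are placeholders) drops the overrides and falls back to the 5M chunk default, so progress shows up almost immediately:

rclone sync source:bucket/path dest:bucket/path --stats=10s -vv --stats-one-line

As a rough rule of thumb, peak memory for multipart transfers is about --transfers × --s3-upload-concurrency × --s3-chunk-size, which is where the 4 × 4 × 5G = 80GB figure above comes from.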

Thank you for your response @kapitainsky

I tried using the defaults, but it's still not starting the transfer... I also tried downloading to the local instance (which worked fine) and then uploading to the destination, but it got stuck in the same state, 0 B/s.

I tried with smaller files and it works fine.

welcome to the forum,

you wrote that this goes from one S3 bucket to another s3 bucket, but your command seems to use only a single bucket?

rclone sync accountA:bucket1/xxxx/file_1_bak.csv accountA:bucket1

normally such a transfer would be a server-side copy
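for example, something like this (bucket2 here is a made-up placeholder for your real destination bucket) would be a server-side copy between two different buckets:

rclone sync accountA:bucket1/xxxx/file_1_bak.csv accountA:bucket2/xxxx/ -vv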

that flag, --s3-disable-checksum, should do nothing on a server-side copy?

i could be wrong, but there are two different mechanisms in play with your command - encryption and checksums.

  1. server_side_encryption = AES256 - i believe that is the default value for AWS; it cannot be disabled, but it can be changed.
  2. --s3-disable-checksum - checksums are used by rclone to verify file transfers; they can be disabled.

the problem is that there is no full debug log for us to look at?

Thank you @asdffdsa for your reply.

you wrote that S3 bucket to another s3 bucket but your command seems to use only a single bucket?

It was a typo from anonymizing the bucket names, sorry :sweat_smile:

that flag should do nothing on a server-side copy?

That's right. Deleted.

server_side_encryption = AES256 - i believe that is the default value for AWS, cannot be disabled, but can be changed.

Nope, if it's not set when you upload a file, the file is stored with no additional encryption.


Updating my case: I found that it takes a very long time to start the transfer (even if I download it locally first). Why does it take so long to start the transfer? Can it be optimized?

can you post a full debug log with minimal redactions?

sorry if this is too off-topic, but every object uploaded to aws s3 is encrypted with AES256 by default

  • i always use sse_customer_algorithm = AES256 and in that case, i see practical, important, real-world differences (see the config sketch below).
  • with server_side_encryption = AES256, i am not seeing any practical differences?
    why use it, what is the advantage?
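for reference, a minimal sketch of the sse-c config i mean (the key value here is a made-up placeholder):

[accountA]
type = s3
provider = AWS
sse_customer_algorithm = AES256
sse_customer_key = <your-secret-key>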

i did more testing, i think i might know what the issues are.

i find that the transfer starts immediately.

yes, with server-side copy on s3, i think this can be optimized.

rclone does a single PUT, waits on AWS, then does the next PUT,
whereas S3Browser will do multiple PUTs at the same time.

so, for a 100GiB file with 22 parts:

  • with s3browser, the transfer took approx 6 minutes
  • with rclone, in that same 6-minute time frame, only 3 parts were uploaded.

He, he, you are right! I thought I made this multithreaded but I obviously forgot.

Give this a go

v1.66.0-beta.7558.e71507737.fix-s3-copy-multithread on branch fix-s3-copy-multithread (uploaded in 15-30 mins)

The concurrency is set with --s3-upload-concurrency which defaults to 4.

Maybe this should have its own flag --s3-copy-concurrency with a higher default?
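If you want to experiment, something like this (remote and path names are just examples) should exercise the concurrent copy with a higher concurrency:

rclone sync source:bucket/path dest:bucket/path --s3-upload-concurrency 8 -vv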


i gave it a go. the beta worked but there are a few minor issues.

i want to do some more testing before i post about it.


Yep, I noticed it too, but it is not reflected in the printed stats (when you use the --stats-one-line argument).

Thank you so much for the research. Now it takes around 20 minutes to transfer a ~50GB file.

Thank you @ncw for the quick fix :slight_smile:

No problem at all about the off-topic :smile: Yes, you are absolutely right, I confused server_side_encryption with sse_customer_algorithm.


This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.