Chunker: uploads to GCS (S3) fail if the chunk size is greater than the max part size

What is the problem you are having with rclone?

Uploads to a chunker-wrapped Google Cloud Storage (XML API) remote fail during the server-side copy when the chunk size is greater than the max part size of 5GiB:

$ dd if=/dev/urandom iflag=fullblock of=gcs/24GB.bin bs=64M count=384 status=progress
25769803776 bytes (26 GB) copied, 504.541366 s, 51.1 MB/s
dd: closing output file ‘gcs/24GB.bin’: Input/output error

This works as expected when chunk_size is set to 4GiB.
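For reference, a sketch of the working variant of my chunker remote (only chunk_size differs; everything else is as in the config shared further down):

[gcs]
type = chunker
remote = gcs-wrapped:ewirrick/rclone
chunk_size = 4G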

Run the command 'rclone version' and share the full output of the command.

$ rclone --version
rclone v1.65.1
- os/version: centos 7.9.2009 (64 bit)
- os/kernel: 3.10.0-1160.81.1.el7.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.21.5
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Google Cloud Storage (XML API) + Chunker

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone mount -vv --dump responses --log-file rclone.log --vfs-used-is-size --s3-chunk-size 64M --s3-upload-concurrency 8 gcs: gcs/

The rclone config contents with secrets removed.

[gcs-wrapped]
type = s3
provider = GCS
access_key_id = xxxx
secret_access_key = xxxx
endpoint = https://storage.googleapis.com

[gcs]
type = chunker
remote = gcs-wrapped:ewirrick/rclone
chunk_size = 8G

A log from the command with the -vv flag

2024/01/29 16:37:09 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/01/29 16:37:09 DEBUG : 24GB.bin.rclone_chunk.001_1qj61m: Starting  multipart copy with 2 parts
2024/01/29 16:37:09 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/01/29 16:37:09 DEBUG : HTTP REQUEST (req 0xc000973d00)
2024/01/29 16:37:09 DEBUG : PUT /rclone/24GB.bin.rclone_chunk.001?partNumber=1&uploadId=ABPnzm5A023_HjAuVvyxSxfxy_nlImomB21v_JtLuftTt4X__55-qdrlYZQG4x6m3tNLlkk HTTP/1.1
Host: ewirrick.storage.googleapis.com
User-Agent: rclone/v1.65.1
Content-Length: 0
Authorization: XXXX
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Copy-Source: ewirrick/rclone/24GB.bin.rclone_chunk.001_1qj61m
X-Amz-Copy-Source-Range: bytes=0-4999341931
X-Amz-Date: 20240129T213709Z
Accept-Encoding: gzip

2024/01/29 16:37:09 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/01/29 16:37:09 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/01/29 16:37:09 DEBUG : HTTP RESPONSE (req 0xc000973d00)
2024/01/29 16:37:09 DEBUG : HTTP/2.0 400 Bad Request
Content-Length: 250
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Type: application/xml; charset=UTF-8
Date: Mon, 29 Jan 2024 21:37:09 GMT
Server: UploadServer
X-Guploader-Uploadid: xxxx

<?xml version='1.0' encoding='UTF-8'?><Error><Code>NotImplemented</Code><Message>A header or query you provided requested a function that is not implemented.</Message><Details>UploadPartCopy is not implemented for multipart uploads.</Details></Error>
2024/01/29 16:37:09 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/01/29 16:37:09 DEBUG : 24GB.bin.rclone_chunk.001_1qj61m: Cancelling multipart copy
2024/01/29 16:37:09 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

FWIW, you might want to test with a simple copy command rather than rclone mount.

The same behavior is observed when using the copy command.
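For example, a plain copy along these lines (the local source path is just illustrative) fails with the same NotImplemented error:

rclone copy -vv --s3-chunk-size 64M --s3-upload-concurrency 8 /data/24GB.bin gcs: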

It looks like there is something odd about GCS's S3 multipart server-side copy.

Try experimenting with this flag

  --s3-copy-cutoff SizeSuffix   Cutoff for switching to multipart copy (default 4.656Gi)

Maybe multipart copies aren't supported on GCS, in which case set it really large (--s3-copy-cutoff 1P) or try it smaller (--s3-copy-cutoff 100M).
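For example, added to your existing mount (or copy) command, something like:

rclone mount -vv --s3-copy-cutoff 1P --vfs-used-is-size --s3-chunk-size 64M --s3-upload-concurrency 8 gcs: gcs/
rclone mount -vv --s3-copy-cutoff 100M --vfs-used-is-size --s3-chunk-size 64M --s3-upload-concurrency 8 gcs: gcs/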

--s3-copy-cutoff 1P fixes the issue, but --s3-copy-cutoff 100M does not, so I'm led to believe that multipart copies just aren't supported with the GCS S3 backend
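In case it helps anyone else, I believe the same setting can be persisted in the wrapped remote's config as copy_cutoff (the config-file form of --s3-copy-cutoff), e.g.:

[gcs-wrapped]
type = s3
provider = GCS
access_key_id = xxxx
secret_access_key = xxxx
endpoint = https://storage.googleapis.com
copy_cutoff = 1P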

By the way, the documentation states the maximum multipart copy cutoff is 5GiB, so we may want to update that

Yes, that seems likely. It would be nice to find some confirmation of this (maybe in the Google bug tracker), and then I can put a quirk in for GCS.

If you are obeying the S3 protocol, the max part size is indeed 5GiB, but I think Google is deviating here too.

This needs to be reported to Google if we can't find it in the Google bug tracker.

I saw your issue #323465186 in the Google bug tracker - thanks for making that.

I've added a quirk to rclone which internally sets the --s3-copy-cutoff to the maximum possible for the GCS provider.

Can you give that a go?

v1.66.0-beta.7678.097db64b5.fix-gcs-s3-copy on branch fix-gcs-s3-copy (uploaded in 15-30 mins)

The issue appears to be resolved with this build, thanks

Thank you for testing

I've merged this to master now which means it will be in the latest beta in 15-30 minutes and released in v1.66
