Trying to force single-part with s3-upload-cutoff

What is the problem you are having with rclone?

I'm trying to upload a large file (1G) to an S3-compatible server using rclone. I want it to be a single-part upload.

If I don't set s3-upload-cutoff, the file is uploaded as multipart and includes the x-amz-meta-md5chksum header.

When I set --s3-upload-cutoff to a larger size (4000M or 4G), the file is still uploaded as multipart. However, rclone doesn't set the x-amz-meta-md5chksum header, which seems like a bug. rclone md5sum returns empty for the file.

I also tried with a smaller file (200MB), and copying that works as expected: if I don't set --s3-upload-cutoff, it is uploaded as multipart and has the x-amz-meta-md5chksum header; if I set --s3-upload-cutoff to a larger size, it uploads as a single part and does not have the x-amz-meta-md5chksum header.
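
For reference, this is roughly how I'm checking the checksum on the remote (the remote and path are the ones from the copy command below); the empty output is what I mean by md5sum returning empty:

rclone md5sum remote:files/largefile.rar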

Run the command 'rclone version' and share the full output of the command.

rclone v1.64.2
- os/version: debian 11.1 (64 bit)
- os/kernel: 5.10.0-9-amd64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.21.3
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

CDN77

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone --s3-upload-cutoff=4000M copy largefile.rar remote:files

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[remote]
type = s3
provider = Other
access_key_id = XXX
secret_access_key = XXX
endpoint = https://us-1.cdn77-storage.com
acl = private

A log from the command that you were trying to run with the -vv flag

2023/11/19 02:20:12 DEBUG : rclone: Version "v1.64.2" starting with parameters ["rclone" "-vv" "--s3-upload-cutoff=4000M" "copy" "largefile.rar" "remote:files"]
2023/11/19 02:20:12 DEBUG : Creating backend with remote "largefile.rar"
2023/11/19 02:20:12 DEBUG : Using config file from "/home/user/.config/rclone/rclone.conf"
2023/11/19 02:20:12 DEBUG : fs cache: adding new entry for parent of "largefile.rar", "/home/user/files"
2023/11/19 02:20:12 DEBUG : Creating backend with remote "remote:files"
2023/11/19 02:20:12 DEBUG : remote: detected overridden config - adding "{J8TjX}" suffix to name
2023/11/19 02:20:12 DEBUG : Resolving service "s3" region "us-east-1"
2023/11/19 02:20:12 DEBUG : fs cache: renaming cache item "remote:files" to be canonical "remote{J8TjX}:files"
2023/11/19 02:20:12 DEBUG : largefile.rar: Need to transfer - File not found at Destination
2023/11/19 02:20:12 INFO  : S3 bucket files: Bucket "files" created with ACL "private"
2023/11/19 02:20:20 DEBUG : largefile.rar: open chunk writer: started multipart upload: 2~qG7BsI0hqHPphx3wupRhH9MCLN9RbGi
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: using backend concurrency of 4 instead of --multi-thread-streams 4
2023/11/19 02:20:20 DEBUG : largefile.rar: Starting multi-thread copy with 200 chunks of size 5Mi with 4 parallel streams
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: chunk 4/200 (15728640-20971520) size 5Mi starting
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: chunk 2/200 (5242880-10485760) size 5Mi starting
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: chunk 1/200 (0-5242880) size 5Mi starting
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: chunk 3/200 (10485760-15728640) size 5Mi starting
2023/11/19 02:21:09 DEBUG : largefile.rar: multipart upload wrote chunk 4 with 5242880 bytes and etag "6d36dd88386e0d014e76e5015ad43389"
2023/11/19 02:21:09 DEBUG : largefile.rar: multi-thread copy: chunk 4/200 (15728640-20971520) size 5Mi finished
2023/11/19 02:21:09 DEBUG : largefile.rar: multi-thread copy: chunk 5/200 (20971520-26214400) size 5Mi starting
2023/11/19 02:22:57 DEBUG : largefile.rar: multipart upload wrote chunk 3 with 5242880 bytes and etag "8bdb52b050fae024204a8c070cbc1901"
2023/11/19 02:22:57 DEBUG : largefile.rar: multi-thread copy: chunk 3/200 (10485760-15728640) size 5Mi finished
2023/11/19 02:22:57 DEBUG : largefile.rar: multi-thread copy: chunk 6/200 (26214400-31457280) size 5Mi starting

Some background...

Rclone has 3 ways of uploading files to S3 (example commands are sketched after the list):

  1. a single part upload
  2. a multipart upload controlled by the s3 backend (used if size > --s3-upload-cutoff)
  3. a multipart upload controlled by the rclone core (used if size > --multi-thread-cutoff) - this can be fully concurrent, unlike 2) which is sequential on the source read but concurrent on the writes.
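
To make that concrete with the default cutoffs (200M and 256M, see below), roughly this is what selects each path - the file names and sizes here are just illustrative:

rclone copy small-100M.bin remote:files    # below both cutoffs: scenario 1, single part
rclone copy medium-220M.bin remote:files   # above --s3-upload-cutoff only: scenario 2, backend multipart
rclone copy large-1G.rar remote:files      # above --multi-thread-cutoff: scenario 3, multi-thread copy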

It looks like setting MD5s is broken in scenario 3.

Hmm, yes, rclone is in transition from this being controlled by the backend to being controlled by the rclone core, so you will need to set --multi-thread-streams 0 to disable scenario 3 above.

I think this is a bug. Can you open a new issue on GitHub about this please?

This is very similar to #7424 but I think your problem is to do with the ChunkedWriter in the s3 backend not applying the MD5 metadata from the source so should be much easier to fix.

Files between --s3-upload-cutoff (default 200M) and --multi-thread-cutoff (default 256M) will use the old uploading mechanism (scenario 2).

This suggests a workaround for you though: raise --multi-thread-cutoff to something very large, say --multi-thread-cutoff 1P, and this will disable scenario 3 uploads. Or set --multi-thread-streams 0, which should have the same effect I think.
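
In other words, either of these should keep the MD5 metadata for now (the cutoff values here are just examples):

rclone copy --s3-upload-cutoff=4000M --multi-thread-cutoff=1P largefile.rar remote:files
rclone copy --s3-upload-cutoff=4000M --multi-thread-streams=0 largefile.rar remote:files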

Thank you for the background! That is helpful.

I've submitted a bug report here:

Thanks!

I tried both (separately) and either seems to work. I'm going to use:
--s3-upload-cutoff=4000M --multi-thread-cutoff 4000M

I would still like it to send multiple files at the same time but not split up.
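
If I understand the flags right, something like this should do that - several files in flight at once via --transfers, but each one sent unsplit (the local path and the value 8 are just examples):

rclone copy --s3-upload-cutoff=4000M --multi-thread-cutoff=4000M --transfers=8 /local/files remote:files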


Note that the biggest a single part upload can be is 5G, so if you are uploading files bigger than 5G you need multipart uploads, and in that case you might want that to be --multi-thread-cutoff 1P.

Thanks for making the issue - will take a look soon :slight_smile:
