audouts
November 19, 2023, 10:13am
1
What is the problem you are having with rclone?
I'm trying to upload a large file (1G) to an S3-compatible server using rclone. I want it to be a single-part upload.
If I don't set `s3-upload-cutoff`, the file is uploaded as multipart and includes the `x-amz-meta-md5chksum` header.
When I set `s3-upload-cutoff` to a larger size (4000M or 4G), the file is still uploaded as multipart. However, rclone doesn't generate the `x-amz-meta-md5chksum` header, which seems like a bug. Rclone returns empty for `md5sum`.
I have tried with another file that is smaller (200MB). Copying that file works as expected. If I don't set `s3-upload-cutoff`, it is uploaded as multipart and has the `x-amz-meta-md5chksum` header. If I set `s3-upload-cutoff` to a larger size, it uploads as a single part and does not have the `x-amz-meta-md5chksum` header.
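For reference, this is the kind of check that shows the empty hash; the path just matches the copy command shown further down in this post, and the metadata listing is only an optional extra (it needs an rclone with `lsjson --metadata` support):

```
# List the remote file's MD5 as rclone sees it - after the 4000M-cutoff
# upload this comes back empty.
rclone md5sum remote:files/largefile.rar

# Optionally inspect the object's metadata to confirm whether the
# md5chksum entry is present.
rclone lsjson --metadata remote:files/largefile.rar
```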
Run the command 'rclone version' and share the full output of the command.
rclone v1.64.2
- os/version: debian 11.1 (64 bit)
- os/kernel: 5.10.0-9-amd64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.21.3
- go/linking: static
- go/tags: none
Which cloud storage system are you using? (eg Google Drive)
CDN77
The command you were trying to run (eg `rclone copy /tmp remote:tmp`)
rclone --s3-upload-cutoff=4000M copy largefile.rar remote:files
Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.
[remote]
type = s3
provider = Other
access_key_id = XXX
secret_access_key = XXX
endpoint = https://us-1.cdn77-storage.com
acl = private
A log from the command that you were trying to run with the `-vv` flag
2023/11/19 02:20:12 DEBUG : rclone: Version "v1.64.2" starting with parameters ["rclone" "-vv" "--s3-upload-cutoff=4000M" "copy" "largefile.rar" "remote:files"]
2023/11/19 02:20:12 DEBUG : Creating backend with remote "largefile.rar"
2023/11/19 02:20:12 DEBUG : Using config file from "/home/user/.config/rclone/rclone.conf"
2023/11/19 02:20:12 DEBUG : fs cache: adding new entry for parent of "largefile.rar", "/home/user/files"
2023/11/19 02:20:12 DEBUG : Creating backend with remote "remote:files"
2023/11/19 02:20:12 DEBUG : remote: detected overridden config - adding "{J8TjX}" suffix to name
2023/11/19 02:20:12 DEBUG : Resolving service "s3" region "us-east-1"
2023/11/19 02:20:12 DEBUG : fs cache: renaming cache item "remote:files" to be canonical "remote{J8TjX}:files"
2023/11/19 02:20:12 DEBUG : largefile.rar: Need to transfer - File not found at Destination
2023/11/19 02:20:12 INFO : S3 bucket files: Bucket "files" created with ACL "private"
2023/11/19 02:20:20 DEBUG : largefile.rar: open chunk writer: started multipart upload: 2~qG7BsI0hqHPphx3wupRhH9MCLN9RbGi
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: using backend concurrency of 4 instead of --multi-thread-streams 4
2023/11/19 02:20:20 DEBUG : largefile.rar: Starting multi-thread copy with 200 chunks of size 5Mi with 4 parallel streams
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: chunk 4/200 (15728640-20971520) size 5Mi starting
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: chunk 2/200 (5242880-10485760) size 5Mi starting
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: chunk 1/200 (0-5242880) size 5Mi starting
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: chunk 3/200 (10485760-15728640) size 5Mi starting
2023/11/19 02:21:09 DEBUG : largefile.rar: multipart upload wrote chunk 4 with 5242880 bytes and etag "6d36dd88386e0d014e76e5015ad43389"
2023/11/19 02:21:09 DEBUG : largefile.rar: multi-thread copy: chunk 4/200 (15728640-20971520) size 5Mi finished
2023/11/19 02:21:09 DEBUG : largefile.rar: multi-thread copy: chunk 5/200 (20971520-26214400) size 5Mi starting
2023/11/19 02:22:57 DEBUG : largefile.rar: multipart upload wrote chunk 3 with 5242880 bytes and etag "8bdb52b050fae024204a8c070cbc1901"
2023/11/19 02:22:57 DEBUG : largefile.rar: multi-thread copy: chunk 3/200 (10485760-15728640) size 5Mi finished
2023/11/19 02:22:57 DEBUG : largefile.rar: multi-thread copy: chunk 6/200 (26214400-31457280) size 5Mi starting
ncw
(Nick Craig-Wood)
November 19, 2023, 11:41am
2
Some background...
Rclone has 3 ways of uploading files to s3 (see the sketch after this list):
1. a single part upload
2. a multipart upload controlled by the s3 backend (used if size > `--s3-upload-cutoff`)
3. a multipart upload controlled by the rclone code (used if size > `--multi-thread-cutoff`) - this can be fully concurrent, unlike 2) which is sequential on the source read but concurrent on the writes.
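As a rough sketch, with the default settings the three paths are selected purely by size; the file names and exact sizes below are illustrative only:

```
# Defaults: --s3-upload-cutoff 200M, --multi-thread-cutoff 256M
rclone copy small-100M.bin  remote:files   # 1) single part upload
rclone copy medium-220M.bin remote:files   # 2) multipart driven by the s3 backend
rclone copy big-1G.bin      remote:files   # 3) multipart driven by the rclone core (multi-thread copy)
```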
It looks like setting MD5s is broken in scenario 3.
Hmm, yes, rclone is in transition from this being controlled by the backend to this being controlled by the rclone core, so you will need to set `--multi-thread-streams 0` to disable scenario 3 above.
I think this is a bug. Can you open a new issue on GitHub about this, please?
This is very similar to #7424, but I think your problem is to do with the ChunkedWriter in the s3 backend not applying the MD5 metadata from the source, so it should be much easier to fix.
Files between `--s3-upload-cutoff` (default 200M) and `--multi-thread-cutoff` (default 256M) will use the old uploading mechanism (scenario 2).
This suggests a workaround for you though: raise `--multi-thread-cutoff` to something very large, say `--multi-thread-cutoff 1P`, and this will disable scenario 3 uploads. Or set `--multi-thread-streams 0`, which should have the same effect I think.
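Concretely, using the same file and remote as your command (a sketch, not tested output), the two variants would be:

```
# Variant A: push the multi-thread cutoff out of reach, so files up to
# 4000M go up as a single part and anything larger uses the s3 backend's
# own multipart upload.
rclone copy largefile.rar remote:files --s3-upload-cutoff 4000M --multi-thread-cutoff 1P

# Variant B: disable the multi-thread copier entirely.
rclone copy largefile.rar remote:files --s3-upload-cutoff 4000M --multi-thread-streams 0
```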
audouts
November 19, 2023, 9:24pm
3
Thank you for the background! That is helpful.
I've submitted a bug report here:
opened 09:18PM - 19 Nov 23 UTC
#### The associated forum post URL from `https://forum.rclone.org`
https://forum.rclone.org/t/trying-to-force-single-part-with-s3-upload-cutoff/42950/2
#### What is the problem you are having with rclone?
When uploading a large file (1000M) to S3, I expected rclone to send it as a single part, based on the `--s3-upload-cutoff=4000M` flag. However, the log shows that rclone started a multipart upload. Even though the upload is multipart, rclone does not generate the expected `x-amz-meta-md5chksum` header, and `md5sum` reports an empty value.
If I use the default `s3-upload-cutoff` size (or set it below 1000M), rclone uses the same multipart process but correctly adds the `x-amz-meta-md5chksum` header.
#### What is your rclone version (output from `rclone version`)
```
rclone v1.64.2
- os/version: debian 11.1 (64 bit)
- os/kernel: 5.10.0-9-amd64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.21.3
- go/linking: static
- go/tags: none
```
#### Which OS you are using and how many bits (e.g. Windows 7, 64 bit)
debian 11.1 (64 bit)
#### Which cloud storage system are you using? (e.g. Google Drive)
CDN77 (S3)
#### The command you were trying to run (e.g. `rclone copy /tmp remote:tmp`)
```
rclone --s3-upload-cutoff=4000M copy largefile.rar remote:files
```
#### A log from the command with the `-vv` flag (e.g. output from `rclone -vv copy /tmp remote:tmp`)
```
2023/11/19 02:20:12 DEBUG : rclone: Version "v1.64.2" starting with parameters ["rclone" "-vv" "--s3-upload-cutoff=4000M" "copy" "largefile.rar" "remote:files"]
2023/11/19 02:20:12 DEBUG : Creating backend with remote "largefile.rar"
2023/11/19 02:20:12 DEBUG : Using config file from "/home/user/.config/rclone/rclone.conf"
2023/11/19 02:20:12 DEBUG : fs cache: adding new entry for parent of "largefile.rar", "/home/user/files"
2023/11/19 02:20:12 DEBUG : Creating backend with remote "remote:files"
2023/11/19 02:20:12 DEBUG : remote: detected overridden config - adding "{J8TjX}" suffix to name
2023/11/19 02:20:12 DEBUG : Resolving service "s3" region "us-east-1"
2023/11/19 02:20:12 DEBUG : fs cache: renaming cache item "remote:files" to be canonical "remote{J8TjX}:files"
2023/11/19 02:20:12 DEBUG : largefile.rar: Need to transfer - File not found at Destination
2023/11/19 02:20:12 INFO : S3 bucket files: Bucket "files" created with ACL "private"
2023/11/19 02:20:20 DEBUG : largefile.rar: open chunk writer: started multipart upload: 2~qG7BsI0hqHPphx3wupRhH9MCLN9RbGi
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: using backend concurrency of 4 instead of --multi-thread-streams 4
2023/11/19 02:20:20 DEBUG : largefile.rar: Starting multi-thread copy with 200 chunks of size 5Mi with 4 parallel streams
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: chunk 4/200 (15728640-20971520) size 5Mi starting
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: chunk 2/200 (5242880-10485760) size 5Mi starting
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: chunk 1/200 (0-5242880) size 5Mi starting
2023/11/19 02:20:20 DEBUG : largefile.rar: multi-thread copy: chunk 3/200 (10485760-15728640) size 5Mi starting
2023/11/19 02:21:09 DEBUG : largefile.rar: multipart upload wrote chunk 4 with 5242880 bytes and etag "6d36dd88386e0d014e76e5015ad43389"
2023/11/19 02:21:09 DEBUG : largefile.rar: multi-thread copy: chunk 4/200 (15728640-20971520) size 5Mi finished
2023/11/19 02:21:09 DEBUG : largefile.rar: multi-thread copy: chunk 5/200 (20971520-26214400) size 5Mi starting
2023/11/19 02:22:57 DEBUG : largefile.rar: multipart upload wrote chunk 3 with 5242880 bytes and etag "8bdb52b050fae024204a8c070cbc1901"
2023/11/19 02:22:57 DEBUG : largefile.rar: multi-thread copy: chunk 3/200 (10485760-15728640) size 5Mi finished
2023/11/19 02:22:57 DEBUG : largefile.rar: multi-thread copy: chunk 6/200 (26214400-31457280) size 5Mi starting
```
ncw:
This suggests a workaround for you though: raise `--multi-thread-cutoff` to something very large, say `--multi-thread-cutoff 1P`, and this will disable scenario 3 uploads. Or set `--multi-thread-streams 0`, which should have the same effect I think.
Thanks!
I tried both (separately) and either one works. I'm going to use:
`--s3-upload-cutoff=4000M --multi-thread-cutoff 4000M`
I would still like rclone to transfer multiple files at the same time, just without splitting each file into parts.
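A sketch of that kind of invocation (the source directory and the `--transfers` value are just placeholders, not something from this thread; `--transfers` controls how many files rclone copies in parallel):

```
# Copy several files at once; anything up to 4000M goes up as a single
# part (files larger than the cutoffs would still be uploaded multipart).
rclone copy /local/files remote:files \
    --s3-upload-cutoff 4000M \
    --multi-thread-cutoff 4000M \
    --transfers 4
```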
ncw
(Nick Craig-Wood)
November 20, 2023, 11:26am
4
Note that the biggest a single part upload can be is 5G, so if you are uploading files bigger than 5G you will need multipart uploads; in that case you might want that to be `--multi-thread-cutoff 1P`.
Thanks for making the issue - I will take a look soon.
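For example (a sketch; `hugefile.rar` is just a stand-in for a file over 5G), with `--multi-thread-cutoff 4000M` such a file would still hit the broken multi-thread path, whereas with 1P it falls back to the s3 backend's multipart upload:

```
# Files <= 4000M: single part upload.
# Files > 4000M (including anything over the 5G single-part limit):
# multipart handled by the s3 backend, not the multi-thread copier.
rclone copy hugefile.rar remote:files --s3-upload-cutoff 4000M --multi-thread-cutoff 1P
```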
system
(system)
Closed
November 23, 2023, 11:26am
5
This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.