Multipart upload to B2 sometimes failing with corrupted on transfer: sizes differ

What is the problem you are having with rclone?

Copying a tree of 35 files with `rclone copy` from local disk to Backblaze B2, I see frequent `corrupted on transfer: sizes differ` errors on multipart file uploads. In each case, rclone reports that the destination file size is zero, e.g.:

2023/09/18 17:20:13 ERROR : catalog_sales/20221122_161347_00795_djagr_3a042953-d0a2-4b8d-8c4e-6a88df245253: corrupted on transfer: sizes differ 244695498 vs 0

Six files in the tree are big enough for multipart upload; three transfer correctly, three fail with `corrupted on transfer: sizes differ`. The same three files succeed, and the same three fail, through the two subsequent retries.

Digging deeper: helpfully, rclone just hides the files in B2 on failure, so I can download them. They all match the source files. Further, if I use `--ignore-size`, the copy succeeds with no errors, and a subsequent `rclone check ... --download` shows that all of the files in the destination are identical to those in the source.
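Conceptually, the `--download` check compares the actual bytes of each destination object against the source, rather than trusting reported sizes or hashes. A minimal Python sketch of that idea (the `identical` helper and file names are hypothetical stand-ins, not rclone internals):

```python
import hashlib
import tempfile
from pathlib import Path

def identical(src: Path, downloaded: bytes) -> bool:
    """Compare full content, not metadata (what a --download check amounts to)."""
    return hashlib.sha1(src.read_bytes()).digest() == hashlib.sha1(downloaded).digest()

# Simulate a source file and the bytes fetched back from the remote:
with tempfile.TemporaryDirectory() as d:
    src = Path(d) / "part-00001"
    src.write_bytes(b"example payload")
    assert identical(src, b"example payload")   # bytes match: file is fine
    assert not identical(src, b"truncated")     # any byte difference is caught
```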

So, is the `b2_finish_large_file` API erroneously reporting a zero file size? I ran with `--dump headers,responses` and the `b2_finish_large_file` response contains the expected `contentLength`:

2023/09/18 17:20:13 DEBUG : catalog_sales/20221122_161347_00795_djagr_3a042953-d0a2-4b8d-8c4e-6a88df245253: Finishing large file upload with 3 parts
2023/09/18 17:20:13 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/09/18 17:20:13 DEBUG : HTTP REQUEST (req 0xc000c74c00)
2023/09/18 17:20:13 DEBUG : POST /b2api/v1/b2_finish_large_file HTTP/1.1
User-Agent: rclone/v1.64.0
Content-Length: 259
Authorization: XXXX
Content-Type: application/json
Accept-Encoding: gzip

2023/09/18 17:20:13 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/09/18 17:20:13 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2023/09/18 17:20:13 DEBUG : HTTP RESPONSE (req 0xc000c74c00)
2023/09/18 17:20:13 DEBUG : HTTP/1.1 200 
Content-Length: 854
Cache-Control: max-age=0, no-cache, no-store
Content-Type: application/json;charset=UTF-8
Date: Mon, 18 Sep 2023 17:20:13 GMT

{
  "accountId": "15f935cf4dcb",
  "action": "upload",
  "bucketId": "91a5cf5973f5dc4f848d0c1b",
  "contentLength": 244695498,
  "contentMd5": null,
  "contentSha1": "none",
  "contentType": "application/octet-stream",
  "fileId": "4_z91a5cf5973f5dc4f848d0c1b_f221926aaeb9f362c_d20230918_m172005_c004_v0402008_t0028_u01695057605512",
  "fileInfo": {
    "src_last_modified_millis": "1669152496000",
    "large_file_sha1": "af2db424fe5092a7933e1ca128c0613a1953342f"
  },
  "fileName": "tpcds-benchmark/catalog_sales/20221122_161347_00795_djagr_3a042953-d0a2-4b8d-8c4e-6a88df245253",
  "fileRetention": {
    "isClientAuthorizedToRead": false,
    "value": null
  },
  "legalHold": {
    "isClientAuthorizedToRead": false,
    "value": null
  },
  "serverSideEncryption": {
    "algorithm": null,
    "mode": null
  },
  "uploadTimestamp": 1695057605512
}
2023/09/18 17:20:13 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2023/09/18 17:20:13 ERROR : catalog_sales/20221122_161347_00795_djagr_3a042953-d0a2-4b8d-8c4e-6a88df245253: corrupted on transfer: sizes differ 244695498 vs 0

It seems like, somehow, that size is sometimes being zeroed out.
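The check that fires here boils down to a size comparison after the transfer. A Python sketch of that logic (hypothetical names and signature; rclone's actual code is Go), showing why a destination size of 0 trips the error and why `--ignore-size` sidesteps it:

```python
from typing import Optional

def check_transfer(src_size: int, dst_size: int, ignore_size: bool = False) -> Optional[str]:
    """Post-transfer integrity check: return an error string, or None if OK."""
    if ignore_size:
        return None  # roughly what --ignore-size does: skip the size comparison
    if src_size != dst_size:
        return f"corrupted on transfer: sizes differ {src_size} vs {dst_size}"
    return None

# A destination size of 0 fails the check even though the uploaded bytes
# are correct:
assert check_transfer(244695498, 0) == "corrupted on transfer: sizes differ 244695498 vs 0"
# Skipping the size check lets the otherwise-correct transfer succeed:
assert check_transfer(244695498, 0, ignore_size=True) is None
```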

Run the command 'rclone version' and share the full output of the command.

# rclone version
rclone v1.64.0
- os/version: debian 11.7 (64 bit)
- os/kernel: 5.10.0-23-amd64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.21.1
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Backblaze B2

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone --no-check-dest copy tpcds-benchmark b2remote:metadaddy-rclone/tpcds-benchmark

I added --dump headers,responses -vv during my investigation.

The rclone config contents with secrets removed.

    "b2crypt": {
        "password": "*****",
        "remote": "b2remote:metadaddy-rclone",
        "type": "crypt"
    },
    "b2remote": {
        "account": "*****",
        "key": "*****",
        "type": "b2"
    },
    "s3remote": {
        "access_key_id": "*****",
        "location_constraint": "us-west-1",
        "provider": "AWS",
        "region": "us-west-1",
        "secret_access_key": "*****",
        "type": "s3"
    }

A log from the command with the -vv flag

This code did get re-written for v1.64.0

I stared at the code and I see the problem. In fact I'm surprised it ever worked! I suspect it worked because sometimes the b2 listings were eventually consistent and sometimes they weren't.

Anyway please give this a go.

v1.65.0-beta.7378.6072d314e.fix-b2-multipart on branch fix-b2-multipart (uploaded in 15-30 mins)

If that fixes the problem then I'll put it in a v1.64.1

PS I think this should work in v1.63.1 also.

In fact I'm surprised it ever worked!

Yep - I've been there!

If that fixes the problem

It does - my copy succeeds with this version with no need for --ignore-size.

then I'll put it in a v1.64.1

Cool - I'm writing an article for the Backblaze blog right now about the awesome work on multithreaded transfers in 1.64.0. I'll include a note that I had to use --ignore-size to successfully copy a file tree to B2, and that it should be fixed in 1.64.1. Thanks!

While I'm here, @ncw, is this an accurate summary of the changes to multithreaded file transfers?

Something of a Swiss Army knife for cloud storage, Rclone can copy files, synchronize directories, and even mount remote storage as a local filesystem. Previous versions of Rclone could take advantage of multithreading to accelerate the transfer of ‘large’ files (by default, those of at least 256 MB), but the benefits were limited. When transferring files from, say, Amazon S3 to Backblaze B2, Rclone would start multiple threads to simultaneously read chunks of the file from Amazon S3 into memory, and another set of threads to simultaneously write that data to Backblaze B2. Data transfer speed was limited by the need to synchronize the ‘producer’ threads downloading data into memory with the ‘consumer’ threads uploading the data.

Rclone v1.64.0 completely refactors multithreaded transfers. Now Rclone starts a single set of threads, each of which both reads a chunk of data from the source service into memory, and then writes that chunk to the destination service, iterating through a subset of chunks until the transfer is complete. The threads transfer their chunks of data in parallel, and each transfer is independent of the others. This architecture is both simpler and much, much faster.

Thanks for testing :slight_smile:

I've merged this to master now which means it will be in the latest beta in 15-30 minutes and released in v1.64.1

Previously, what would happen is that the read of data from the source (S3 in this example) was done in a single-threaded, sequential way. Each chunk read from the source was then passed off to another thread to be written in parallel. So if the source was about the same speed as the destination (as is likely when reading/writing over the network), the destination transfer never got the benefit of being parallel, as only one thread was running at once. When reading from disk (the common case), which is much faster than the network, this did give a nice benefit of parallel transfers.
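A toy Python sketch of that older producer/consumer shape (illustrative only; names and chunk size are made up): a single sequential reader feeds a queue that writer threads drain, so at most one chunk is ever being produced at a time, and the writers can only go as fast as the one reader.

```python
import queue
import threading

CHUNK = 4
source = b"abcdefghijklmnopqrstuvwxyz"
dest = bytearray(len(source))
q = queue.Queue(maxsize=1)  # hand-off point between reader and writers

def writer() -> None:
    while True:
        item = q.get()
        if item is None:  # sentinel: no more chunks
            return
        off, data = item
        dest[off : off + len(data)] = data

writers = [threading.Thread(target=writer) for _ in range(4)]
for t in writers:
    t.start()

# The single sequential reader: the bottleneck described above.
for off in range(0, len(source), CHUNK):
    q.put((off, source[off : off + CHUNK]))
for _ in writers:
    q.put(None)
for t in writers:
    t.join()

assert bytes(dest) == source
```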

That is 100% accurate :slight_smile:
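The v1.64.0 shape confirmed above can be sketched the same way: each worker both reads its own chunk and writes it, with no hand-off between separate reader and writer threads (again illustrative Python, not rclone's Go implementation):

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4
source = b"abcdefghijklmnopqrstuvwxyz"
dest = bytearray(len(source))

def transfer_chunk(off: int) -> None:
    # Each worker reads a chunk from the source...
    data = source[off : off + CHUNK]
    # ...and writes it straight to the destination: reads and writes
    # overlap across workers, with no shared queue to synchronize on.
    dest[off : off + len(data)] = data

with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(transfer_chunk, range(0, len(source), CHUNK)))

assert bytes(dest) == source
```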

Commit 6072d314e1e7893ac8ace57477924bc0ee48c2fb seems to cause hangs.
v1.63.1 works.

2023/09/20 13:39:26 INFO  : 
Transferred:      106.381 MiB / 1.577 GiB, 7%, 0 B/s, ETA -
Errors:                 1 (retrying may help)
Checks:             34702 / 34702, 100%
Transferred:            3 / 4, 75%
Elapsed time:     26m30.6s
 * UUOEZZeDzV13DuJJdQZTcw…1OiTwE-ZsWgHzBF9r7IG9n:  5% /1.566Gi, 0/s, 2562047h47m16s

Please do a run with -vv and post the log - thanks.

Previously what would happen...

Thanks, @ncw - I corrected the blog entry. It should go out tomorrow. I'll post a link here :+1:


Well, I don't know how to reproduce it anymore. I'll open a new bug report if this happens again... :man_shrugging:


We published the blog post on the v1.64.0 performance improvements - I'll update it after v1.64.1 comes out :+1: