Transport connection broken

What is the problem you are having with rclone?

Looks like rclone is hanging / failing on files that change often (eg, /var/log/messages).

What is your rclone version (output from rclone version)

rclone v1.51.0

  • os/arch: linux/amd64
  • go version: go1.13.7

Which OS you are using and how many bits (eg Windows 7, 64 bit)

CentOS Linux release 7.6.1810

Which cloud storage system are you using? (eg Google Drive)

backblaze b2

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone -vv --bwlimit 10M --exclude-from /usr/local/etc/rclone.excludes --fast-list --transfers 32 sync /var crypt-sys-var:

The rclone config contents with secrets removed.

[bblaze]
type = b2
account = foo
key = bar

[crypt-sys-var]
type = crypt
remote = bblaze:sys-var-00
password = foobar
password2 = baz

A log from the command with the -vv flag

https://pastebin.com/H9VekFVR

Is the file being modified while the upload is taking place? That would certainly cause some weirdness.

Try adding the --local-no-check-updated flag.
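
For example, adapting the sync command you posted (untested on my end, just a sketch of where the flag goes):

rclone -vv --bwlimit 10M --exclude-from /usr/local/etc/rclone.excludes --fast-list --transfers 32 --local-no-check-updated sync /var crypt-sys-var: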

Hello and thanks for the suggestion. I tried the --local-no-check-updated flag but got the same/similar results:
http://pastebin.com/bVNMGm3W

Looks like a network issue of some kind.

If it were a network issue I think I'd see this across a variety of files. The issue seems to be specific to files which change frequently, and these files may be changing while the sync is taking place.

I'm surprised this is an issue, and more surprised that --local-no-check-updated doesn't address it. There must be something else going on.

So you are 100% sure it is a file-open issue and not a network issue?

Have you searched the forum?

Not sure about Linux, but on Windows we have https://en.wikipedia.org/wiki/Shadow_Copy:
"A snapshot is a read-only point-in-time copy of the volume. Snapshots allow the creation of consistent backups of a volume, ensuring that the contents do not change and are not locked while the backup is being made."

What do you mean by changing? Generally, if a file is being written to, rclone would detect the change and skip the file. How are you validating that it is changing? You only have a small clip of a debug log and I'm not seeing any changing files in that log.

Every time I've run rclone against /var it gets hung up on /var/log/messages (as shown in the pastebin snippets). I don't fully understand rclone's debug output, but this tells me there's contention: ContentLength=149251293 with Body length 149604545

Here's a quick demonstration of what I mean by 'changing':

lsd /var/log/messages; sleep 5; date; lsd /var/log/messages; sleep 5; date; lsd /var/log/messages

-rw-------. 1 root root 203438050 May 19 08:50 /var/log/messages
Tue May 19 08:50:31 PDT 2020
-rw-------. 1 root root 203444015 May 19 08:50 /var/log/messages
Tue May 19 08:50:36 PDT 2020
-rw-------. 1 root root 203448148 May 19 08:50 /var/log/messages

Can you include a full debug log of that with rclone as an example?

Sure. I'd prefer to PM that to you. It'll take an hour or so to complete. Also, I'll need to obfuscate the filenames.

If the file is growing, the size/mod time/hash will be different between when the transfer starts and when it finishes. That will definitely be an issue.
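
For example, you can watch the hash change on a growing file the same way as the lsd demo above (sha1sum is just the standard coreutils tool):

sha1sum /var/log/messages; sleep 5; sha1sum /var/log/messages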

root@s163042:/data/bin# rclone copy /var/log/auth.log b2:x/ -vv 
2020/05/19 11:45:46 DEBUG : rclone: Version "v1.51.0-311-ge91b5095-beta" starting with parameters ["rclone" "copy" "/var/log/auth.log" "b2:x/" "-vv"]
2020/05/19 11:45:46 DEBUG : Using config file from "/root/.rclone.conf"
2020/05/19 11:45:46 DEBUG : fs cache: renaming cache item "/var/log/auth.log" to be canonical "/var/log"
2020/05/19 11:45:47 DEBUG : fs cache: renaming cache item "b2:x/" to be canonical "b2:x"
2020/05/19 11:45:47 DEBUG : auth.log: Need to transfer - File not found at Destination
2020/05/19 11:45:49 DEBUG : auth.log: SHA-1 = c97f8b4145492b5218dc3c076fe414487effd797 (Local file system at /var/log)
2020/05/19 11:45:49 DEBUG : auth.log: SHA-1 = 585ec88343d53653927814e314a0657a6b20e210 (B2 bucket x)
2020/05/19 11:45:49 ERROR : auth.log: corrupted on transfer: SHA-1 hash differ "c97f8b4145492b5218dc3c076fe414487effd797" vs "585ec88343d53653927814e314a0657a6b20e210"
2020/05/19 11:45:49 INFO  : auth.log: Removing failed copy
2020/05/19 11:45:49 ERROR : Attempt 1/3 failed with 1 errors and: corrupted on transfer: SHA-1 hash differ "c97f8b4145492b5218dc3c076fe414487effd797" vs "585ec88343d53653927814e314a0657a6b20e210"
2020/05/19 11:45:49 DEBUG : auth.log: Need to transfer - File not found at Destination
2020/05/19 11:45:51 DEBUG : auth.log: SHA-1 = abc212f2e39880bb96bd40106f2b75a065c632a4 (Local file system at /var/log)
2020/05/19 11:45:51 DEBUG : auth.log: SHA-1 = fa3112e5a0c7cf63aac091bc7ca969e39194a8cf (B2 bucket x)
2020/05/19 11:45:51 ERROR : auth.log: corrupted on transfer: SHA-1 hash differ "abc212f2e39880bb96bd40106f2b75a065c632a4" vs "fa3112e5a0c7cf63aac091bc7ca969e39194a8cf"
2020/05/19 11:45:51 INFO  : auth.log: Removing failed copy
2020/05/19 11:45:52 ERROR : Attempt 2/3 failed with 1 errors and: corrupted on transfer: SHA-1 hash differ "abc212f2e39880bb96bd40106f2b75a065c632a4" vs "fa3112e5a0c7cf63aac091bc7ca969e39194a8cf"
2020/05/19 11:45:52 DEBUG : auth.log: Need to transfer - File not found at Destination
2020/05/19 11:45:54 DEBUG : auth.log: Clearing upload URL because of error: Post "https://pod-000-1137-14.backblaze.com/b2api/v1/b2_upload_file/dfad5c2a9cf5391157cc0f1c/c001_v0001137_t0014": net/http: HTTP/1.x transport connection broken: http: ContentLength=16633886 with Body length 16634107
2020/05/19 11:45:54 DEBUG : pacer: low level retry 1/1 (error Post "https://pod-000-1137-14.backblaze.com/b2api/v1/b2_upload_file/dfad5c2a9cf5391157cc0f1c/c001_v0001137_t0014": net/http: HTTP/1.x transport connection broken: http: ContentLength=16633886 with Body length 16634107)
2020/05/19 11:45:54 DEBUG : pacer: Rate limited, increasing sleep to 20ms
2020/05/19 11:45:54 DEBUG : auth.log: Received error: Post "https://pod-000-1137-14.backblaze.com/b2api/v1/b2_upload_file/dfad5c2a9cf5391157cc0f1c/c001_v0001137_t0014": net/http: HTTP/1.x transport connection broken: http: ContentLength=16633886 with Body length 16634107 - low level retry 1/10
2020/05/19 11:45:54 DEBUG : pacer: Reducing sleep to 10ms
2020/05/19 11:45:56 DEBUG : auth.log: Clearing upload URL because of error: Post "https://pod-000-1043-02.backblaze.com/b2api/v1/b2_upload_file/dfad5c2a9cf5391157cc0f1c/c001_v0001043_t0030": net/http: HTTP/1.x transport connection broken: http: ContentLength=16634107 with Body length 16634565
2020/05/19 11:45:56 DEBUG : pacer: low level retry 1/1 (error Post "https://pod-000-1043-02.backblaze.com/b2api/v1/b2_upload_file/dfad5c2a9cf5391157cc0f1c/c001_v0001043_t0030": net/http: HTTP/1.x transport connection broken: http: ContentLength=16634107 with Body length 16634565)
2020/05/19 11:45:56 DEBUG : pacer: Rate limited, increasing sleep to 20ms
2020/05/19 11:45:56 DEBUG : auth.log: Received error: Post "https://pod-000-1043-02.backblaze.com/b2api/v1/b2_upload_file/dfad5c2a9cf5391157cc0f1c/c001_v0001043_t0030": net/http: HTTP/1.x transport connection broken: http: ContentLength=16634107 with Body length 16634565 - low level retry 2/10
2020/05/19 11:45:56 DEBUG : pacer: Reducing sleep to 10ms
2020/05/19 11:45:57 ERROR : auth.log: Failed to copy: Post "https://pod-000-1136-09.backblaze.com/b2api/v1/b2_upload_file/dfad5c2a9cf5391157cc0f1c/c001_v0001136_t0042": can't copy - source file is being updated (size changed from 16634922 to 16635140)
2020/05/19 11:45:57 ERROR : Attempt 3/3 failed with 1 errors and: Post "https://pod-000-1136-09.backblaze.com/b2api/v1/b2_upload_file/dfad5c2a9cf5391157cc0f1c/c001_v0001136_t0042": can't copy - source file is being updated (size changed from 16634922 to 16635140)
2020/05/19 11:45:57 INFO  : 
Transferred:   	   65.449M / 65.449 MBytes, 100%, 6.964 MBytes/s, ETA 0s
Errors:                 1 (retrying may help)
Elapsed time:         9.3s

2020/05/19 11:45:57 DEBUG : 7 go routines active
2020/05/19 11:45:57 Failed to copy: Post "https://pod-000-1136-09.backblaze.com/b2api/v1/b2_upload_file/dfad5c2a9cf5391157cc0f1c/c001_v0001136_t0042": can't copy - source file is being updated (size changed from 16634922 to 16635140)
root@s163042:/data/bin# rclone copy /var/log/auth.log b2:x/ -vv   --local-no-check-updated
2020/05/19 11:46:43 DEBUG : rclone: Version "v1.51.0-311-ge91b5095-beta" starting with parameters ["rclone" "copy" "/var/log/auth.log" "b2:x/" "-vv" "--local-no-check-updated"]
2020/05/19 11:46:43 DEBUG : Using config file from "/root/.rclone.conf"
2020/05/19 11:46:43 DEBUG : fs cache: renaming cache item "/var/log/auth.log" to be canonical "/var/log"
2020/05/19 11:46:44 DEBUG : fs cache: renaming cache item "b2:x/" to be canonical "b2:x"
2020/05/19 11:46:44 DEBUG : auth.log: Need to transfer - File not found at Destination
2020/05/19 11:46:51 DEBUG : auth.log: SHA-1 = a6816f2ee421925ae702397f93ff6e4dcd497310 (Local file system at /var/log)
2020/05/19 11:46:51 DEBUG : auth.log: SHA-1 = 5e55f551710707b79254a39a326bdadc3ea4925e (B2 bucket x)
2020/05/19 11:46:51 ERROR : auth.log: corrupted on transfer: SHA-1 hash differ "a6816f2ee421925ae702397f93ff6e4dcd497310" vs "5e55f551710707b79254a39a326bdadc3ea4925e"
2020/05/19 11:46:51 INFO  : auth.log: Removing failed copy
2020/05/19 11:46:52 ERROR : Attempt 1/3 failed with 1 errors and: corrupted on transfer: SHA-1 hash differ "a6816f2ee421925ae702397f93ff6e4dcd497310" vs "5e55f551710707b79254a39a326bdadc3ea4925e"
2020/05/19 11:46:52 DEBUG : auth.log: Need to transfer - File not found at Destination
2020/05/19 11:46:54 DEBUG : auth.log: SHA-1 = a6816f2ee421925ae702397f93ff6e4dcd497310 OK
2020/05/19 11:46:54 INFO  : auth.log: Copied (new)
2020/05/19 11:46:54 ERROR : Attempt 2/3 succeeded
2020/05/19 11:46:54 INFO  : 
Transferred:   	   31.741M / 31.741 MBytes, 100%, 3.040 MBytes/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed time:        10.4s

You'll notice it completed on the second attempt, but it took two tries because auth.log was getting updated. The HTTP connection issues are manifesting because of the retries: I've run this repeatedly and it only happens when it retries.

I do agree that --local-no-check-updated isn't working as expected, though I'm not sure if it should. I've tested it on Google Drive and it works as expected, but not on b2.

I mean, just do a copy of a single file that was causing the problem. You don't have to do the whole thing.

So if you are confident /var/log/messages is causing it, use rclone copy and replicate with that.

So for Google Drive, I see something like this:

felix@gemini:/data$ rclone copy test.mkv GD: -vv
2020/05/19 12:04:47 DEBUG : rclone: Version "v1.51.0" starting with parameters ["rclone" "copy" "test.mkv" "GD:" "-vv"]
2020/05/19 12:04:47 DEBUG : Using config file from "/opt/rclone/rclone.conf"
2020/05/19 12:04:48 DEBUG : test.mkv: Need to transfer - File not found at Destination
2020/05/19 12:04:48 DEBUG : test.mkv: Sending chunk 0 length 445644800
2020/05/19 12:04:51 ERROR : test.mkv: Failed to copy: Post https://www.googleapis.com/upload/drive/v3/files?alt=json&fields=id%2Cname%2Csize%2Cmd5Checksum%2Ctrashed%2CmodifiedTime%2CcreatedTime%2CmimeType%2Cparents%2CwebViewLink&supportsAllDrives=true&uploadType=resumable&upload_id=AAANsUmhRcIkdZ4AA-r2HOH_p9eSm8bWca1bGxwTgYMo5Yknslgy9darKgHx-cn1bnKPiaMil1WHWEG0uG-PfngElQ: can't copy - source file is being updated (size changed from 445644800 to 630194176)
2020/05/19 12:04:51 ERROR : Attempt 1/3 failed with 1 errors and: Post https://www.googleapis.com/upload/drive/v3/files?alt=json&fields=id%2Cname%2Csize%2Cmd5Checksum%2Ctrashed%2CmodifiedTime%2CcreatedTime%2CmimeType%2Cparents%2CwebViewLink&supportsAllDrives=true&uploadType=resumable&upload_id=AAANsUmhRcIkdZ4AA-r2HOH_p9eSm8bWca1bGxwTgYMo5Yknslgy9darKgHx-cn1bnKPiaMil1WHWEG0uG-PfngElQ: can't copy - source file is being updated (size changed from 445644800 to 630194176)
2020/05/19 12:04:52 DEBUG : test.mkv: Need to transfer - File not found at Destination
2020/05/19 12:04:52 DEBUG : test.mkv: Sending chunk 0 length 739246080
2020/05/19 12:04:53 ERROR : test.mkv: Failed to copy: Post https://www.googleapis.com/upload/drive/v3/files?alt=json&fields=id%2Cname%2Csize%2Cmd5Checksum%2Ctrashed%2CmodifiedTime%2CcreatedTime%2CmimeType%2Cparents%2CwebViewLink&supportsAllDrives=true&uploadType=resumable&upload_id=AAANsUkzjyBH00fnJ_BDsVpjm1t-0t5CYrE3TpAFUsmaxKDGJIb9JtX7ITC7B-YXIT6NXjvl4Q6ZHSaz8sXnfTFl6Bw: can't copy - source file is being updated (size changed from 739246080 to 769130496)
2020/05/19 12:04:53 ERROR : Attempt 2/3 failed with 1 errors and: Post https://www.googleapis.com/upload/drive/v3/files?alt=json&fields=id%2Cname%2Csize%2Cmd5Checksum%2Ctrashed%2CmodifiedTime%2CcreatedTime%2CmimeType%2Cparents%2CwebViewLink&supportsAllDrives=true&uploadType=resumable&upload_id=AAANsUkzjyBH00fnJ_BDsVpjm1t-0t5CYrE3TpAFUsmaxKDGJIb9JtX7ITC7B-YXIT6NXjvl4Q6ZHSaz8sXnfTFl6Bw: can't copy - source file is being updated (size changed from 739246080 to 769130496)
2020/05/19 12:04:53 DEBUG : test.mkv: Need to transfer - File not found at Destination
2020/05/19 12:04:54 DEBUG : test.mkv: Sending chunk 0 length 838598656
2020/05/19 12:04:55 ERROR : test.mkv: Failed to copy: Post https://www.googleapis.com/upload/drive/v3/files?alt=json&fields=id%2Cname%2Csize%2Cmd5Checksum%2Ctrashed%2CmodifiedTime%2CcreatedTime%2CmimeType%2Cparents%2CwebViewLink&supportsAllDrives=true&uploadType=resumable&upload_id=AAANsUnfdN9Q3uVwwnSShcvSwmfDztJ-2CDjcp6-D8nmTGTVPtzLyzfL8ok21r5_BLiosyF_qmOHKS1xWS0dIt75MQ: can't copy - source file is being updated (size changed from 838598656 to 979894272)
2020/05/19 12:04:55 ERROR : Attempt 3/3 failed with 1 errors and: Post https://www.googleapis.com/upload/drive/v3/files?alt=json&fields=id%2Cname%2Csize%2Cmd5Checksum%2Ctrashed%2CmodifiedTime%2CcreatedTime%2CmimeType%2Cparents%2CwebViewLink&supportsAllDrives=true&uploadType=resumable&upload_id=AAANsUnfdN9Q3uVwwnSShcvSwmfDztJ-2CDjcp6-D8nmTGTVPtzLyzfL8ok21r5_BLiosyF_qmOHKS1xWS0dIt75MQ: can't copy - source file is being updated (size changed from 838598656 to 979894272)
2020/05/19 12:04:55 Failed to copy: Post https://www.googleapis.com/upload/drive/v3/files?alt=json&fields=id%2Cname%2Csize%2Cmd5Checksum%2Ctrashed%2CmodifiedTime%2CcreatedTime%2CmimeType%2Cparents%2CwebViewLink&supportsAllDrives=true&uploadType=resumable&upload_id=AAANsUnfdN9Q3uVwwnSShcvSwmfDztJ-2CDjcp6-D8nmTGTVPtzLyzfL8ok21r5_BLiosyF_qmOHKS1xWS0dIt75MQ: can't copy - source file is being updated (size changed from 838598656 to 979894272)

as it failed after 3 retries.

Yes, exactly the same. On b2, it is consistent with the other messages after the retry.

I don't think those messages are really an issue. They are from the file being updated, which IS NORMAL. But I'd think --local-no-check-updated should address it (as long as he understands he'll get corrupt transfers).

Would the transfers be entirely corrupted or just truncated?

Just to be clear, I am using --local-no-check-updated and the main issue is that the retries take a long time. I'm not so concerned about an exact match of source/target, but I'd expect the mismatch to be noted and rclone to move on immediately.

I don't think they will even transfer. That is why it is failing. It can't copy a file that is being written to. The best thing you can do is exclude them, or use snapshots (real backups).
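
For example, the busy logs could be added to the excludes file you're already passing with --exclude-from (entries are relative to the sync root /var; these are examples only):

# /usr/local/etc/rclone.excludes
/log/messages
/log/auth.log

For the snapshot route on Linux, if /var is on LVM, here's a rough, untested sketch (the volume group, logical volume and mount point names are made up and would need to match your system):

lvcreate --size 1G --snapshot --name var-snap /dev/vg0/var    # frozen point-in-time copy of the LV holding /var
mkdir -p /mnt/var-snap && mount -o ro /dev/vg0/var-snap /mnt/var-snap    # add nouuid if the filesystem is XFS
rclone sync /mnt/var-snap crypt-sys-var:                      # back up the consistent snapshot instead of the live /var
umount /mnt/var-snap && lvremove -f /dev/vg0/var-snap         # clean up afterwards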

I'm not sure if this will work for you, but this is how I do it:

tar -zcvf - --exclude-from=/tmp/exc / | rclone rcat robgs-cryptp:/vpsbackups/data_$(date "+%Y-%m-%d_%H:%M:%S").tar.gz -v

I use tar rather than having rclone sync this. Just another option. Yes, tar still needs to deal with changing files.

EDIT: I'm not 100% sure this will work on b2. You'd have to try it. b2 might need the checksum up front, which means it would need to cache the file locally first. I'm not sure.
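
One quick way to find out would be to pipe some throwaway data through rcat to the b2-backed remote and watch the -vv output (the path and size here are arbitrary):

head -c 100M /dev/urandom | rclone rcat crypt-sys-var:rcat-test.bin -vv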

For what it's worth, I use rsnapshot: that gives you a copy somewhere else which you can ship any way you want. I keep local copies on a separate disk and upload when I want a copy offsite.

From what I am reading, this won't work on b2 in the same way.

B2 needs to know the SHA1 hash of a file before uploading it (for non-chunked uploads). For transfers from local disk or compatible remotes (like onedrive & acd which use SHA1) the file is streamed. However, when uploading from an incompatible remote which can't provide SHA1 sums in advance (like crypt), the b2 code will stream the file to a temporary file on disk, calculate the SHA1, then upload the file, but only for files less than --b2-upload-cutoff - files larger than this will be chunked and uploaded without a temporary file. The B2 remote will store temporary files in your OS's normal place for temporary files, or where the "TMPDIR" environment variable points to.

Since it needs to know that beforehand, it can't stream a file to the remote with an unknown size. So it'll cache to disk first.
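
If the temporary-file step does turn out to matter, two knobs from that passage could be worth a try (the values and paths here are arbitrary and untested with this setup):

# point the spool area somewhere with room, via TMPDIR
TMPDIR=/bigdisk/tmp rclone sync /var crypt-sys-var: -vv

# or lower the cutoff (and chunk size) so large files take the chunked path with no temp file
rclone sync /var crypt-sys-var: --b2-chunk-size 25M --b2-upload-cutoff 50M -vv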