Panic after updating to v1.63.1

What is the problem you are having with rclone?

I just updated rclone from v1.59.1 to v1.63.1 and, after changing the Nextcloud URLs (replacing /webdav with /dav/files/USERNAME), I get a panic. I did the update in the first place to try to mitigate Rclone v1.60.1 stalls and does not exit correctly when using --max-transfer - #15 (my problem is identical to the one described there; I found that --check-first works around it, but that is not a good solution for me since I want to run rclone periodically and a full check each run is counterproductive).

Posting here as requested by the GH issue template.

Run the command 'rclone version' and share the full output of the command.

rclone v1.63.1

  • os/version: ubuntu 22.04 (64 bit)
  • os/kernel: 5.15.0-76-generic (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.20.6
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

The pair in question is OneDrive to Nextcloud (Hetzner).

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone --transfers 3 --tpslimit 12 --max-transfer 64M --max-backlog 1024 --retries-sleep 1s --checkers 2 --cutoff-mode soft --progress -vvv sync onedrive-1:/abc nextcloud-1:/xyz

The rclone config contents with secrets removed.

[onedrive-1]
type = onedrive
token = 🤫
drive_id = 🤫
drive_type = business

[onedrive-2]
type = onedrive
token = 🤫
drive_id = 🤫
drive_type = personal

[dropbox]
type = dropbox
token = 🤫

[nextcloud-1]
type = webdav
url = https://🤫.your-storageshare.de/remote.php/dav/files/🤫
vendor = nextcloud
user = 🤫
pass = 🤫

[nextcloud-2]
type = webdav
url = https://🤫.your-storageshare.de/remote.php/dav/files/🤫
vendor = nextcloud
user = 🤫
pass = 🤫

A log from the command with the -vv flag

2023-07-23 22:53:43 DEBUG : %file1%: Need to transfer - File not found at Destination
2023-07-23 22:53:43 DEBUG : webdav root '%folder1%': Waiting for checks to finish
2023-07-23 22:53:43 DEBUG : webdav root '%folder1%': Waiting for transfers to finish
Transferred:      295.595 MiB / 4.773 GiB, 6%, 7.874 MiB/s, ETA 9m43s
Checks:              1047 / 1047, 100%
Transferred:           31 / 472, 7%
Elapsed time:      1m34.0s
Transferring:
 * %file2%: 55% /181.729Mi, 3.197Mi/s, 25s
 * %file3%: 48% /187.884Mi, 3.119Mi/s, 31s
 * %file4%: 65% /150.408Mi, 3.396Mi/s, 15s
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x7691e4]

goroutine 2305 [running]:
net/http.(*readTrackingBody).Read(0x346b140?, {0xc003b1e000?, 0xc000f76b68?, 0x780870?})
        net/http/transport.go:646 +0x24
net/http.(*http2clientStream).writeRequestBody(0xc000e62480, 0xc0010e7400)
        net/http/h2_bundle.go:8646 +0x3fa
net/http.(*http2clientStream).writeRequest(0xc000e62480, 0xc0010e7400)
        net/http/h2_bundle.go:8375 +0x7d9
net/http.(*http2clientStream).doRequest(0x55ca0a?, 0x100?)
        net/http/h2_bundle.go:8261 +0x1e
created by net/http.(*http2ClientConn).RoundTrip
        net/http/h2_bundle.go:8190 +0x34a

Can't provide the entire log due to privacy concerns (project names etc.). I added a newline between 15s and panic for readability.

Is there more of the traceback? The traceback you've sent is entirely in the Go standard library, so this is looking like a Go bug.

Something to do with HTTP/2.

Can you reproduce this bug easily?

If you've got more log can you email the entire log to nick@craig-wood.com please and I'll analyse it further.

Thanks

Thanks @ncw,

There is no more traceback. Perhaps I need a dev build for that, or some ulimit -c tweaking? Will export GOTRACEBACK=all help on a production build?

Not every time, but I just got it again, with this command line:

rclone --transfers 3 --tpslimit 12 --max-duration 5m --max-backlog 256 --retries-sleep 1s --cutoff-mode hard --progress -v copy:

Transferred:        2.548 GiB / 40.130 GiB, 6%, 11.493 MiB/s, ETA 55m48s
Checks:               120 / 120, 100%
Transferred:          182 / 449, 41%
Elapsed time:      4m19.9s
Transferring:
 *        %file1%:  9% /5.000Gi, 4.468Mi/s, 17m16s
 *        %file2%: 16% /2.928Gi, 4.924Mi/s, 8m29s
 *        %file3%: 13% /76.868Mi, 2.248Mi/s, 29s
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x7691e4]

goroutine 3169 [running]:
net/http.(*readTrackingBody).Read(0x346b140?, {0xc000e80000?, 0xc000a18b68?, 0x780870?})
        net/http/transport.go:646 +0x24
net/http.(*http2clientStream).writeRequestBody(0xc000b10180, 0xc001cb0100)
        net/http/h2_bundle.go:8646 +0x3fa
net/http.(*http2clientStream).writeRequest(0xc000b10180, 0xc001cb0100)
        net/http/h2_bundle.go:8375 +0x7d9
net/http.(*http2clientStream).doRequest(0x20?, 0xc001bb8300?)
        net/http/h2_bundle.go:8261 +0x1e
created by net/http.(*http2ClientConn).RoundTrip
        net/http/h2_bundle.go:8190 +0x34a

Not for the faint of heart :smiley:

Yes, it does. @ncw, here is a better trace with GOTRACEBACK=all: panic: runtime error: invalid memory address or nil pointer dereference [signal … - Pastebin.com

Could net/http: http2 close request body and read it again · Issue #59426 · golang/go · GitHub be helpful?

Thanks.

I'm not much the wiser though!

Looking at the go source this looks like readTrackingBody was set up with a nil io.ReadCloser.

Tracing through the code, I think the only way this could happen is if request.GetBody returned a nil body.

Now that is a user implemented call, so let's see if we have made a mistake...

So we do implement this in the webdav backend which is one of the backends you are using.

And it looks like this is in the new nextcloud chunking code...

Ah yes, here it is

		getBody := func() (io.ReadCloser, error) {
			// RepeatableReader{} plays well with accounting so rewinding doesn't make the progress buggy
			if _, err := in.Seek(0, io.SeekStart); err == nil {
				return nil, err
			}

			return io.NopCloser(in), nil
		}

I think this is probably just a typo, but since it is new code I made an issue to get the submitter involved: webdav: nextcloud: new chunking code causing a panic · Issue #7168 · rclone/rclone · GitHub


This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.