1.49.4 / Plex / Internal Errors on Google Drive

thestigma · October 1, 2019, 4:10pm

Or maybe I should try running this instead? (rather than the same test again)

1.49.4 compiled with go1.12
https://pub.rclone.org/rclone-v1.49.4-go1.12.6.zip

If anyone wants to take charge and assign some tasks for who tests what (given that it takes time to reproduce) we may narrow this down faster.

I reproduced it on Win10, torrent client via mount - just re-checking loads of existing data.
It took a very long time though, and that may be due to Plex being far more aggressive in multi-threading its requests? That might be a factor.

mechanimal82 · October 1, 2019, 7:45pm

I had this after updating the beta (see here)... I thought I cleared it by recreating everything, but the issue came back. I ended up reverting back to a previous stable and assumed I'd cocked something up and so just stayed on the stable 1.49.1 which seems to run OK for me.

ncw · October 1, 2019, 9:11pm

I had a report on this issue that rclone 1.49.4 compiled with go1.12 does not exhibit problems.

So I'd really appreciate it if anyone having problems could try to replicate with this binary:

https://pub.rclone.org/rclone-v1.49.4-go1.12.6.zip (linux)

Or this one

https://github.com/rclone/rclone/files/3677423/rclone-v1.49.4-go1.12.zip (windows)

Another thing worth trying, with the original v1.49.4 as released, is to set

export GODEBUG=tls13=0

or for windows

set GODEBUG=tls13=0

(These need to be done in the same terminal window that you run the rclone commands in.)
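A minimal sketch of that setup, assuming a POSIX shell on Linux. The `echo` check is only a suggestion for confirming that the variable is set before rclone is started, and the Windows lines are shown as comments for comparison.

```shell
# Sketch only: disable TLS 1.3 for Go programs started from this shell session.
# Note the two '=' signs: the variable is GODEBUG and its value is "tls13=0".
export GODEBUG=tls13=0

# Confirm the variable is set (and exported) before starting rclone from here.
echo "GODEBUG=${GODEBUG}"

# Windows cmd.exe equivalent (no space after the variable name):
#   set GODEBUG=tls13=0
#   echo %GODEBUG%
```

Any rclone command started from the same session afterwards inherits the setting; a new terminal window does not.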

Of all the changes in the go1.13 release log, that TLS change looks like the most likely to cause a regression.

thestigma · October 1, 2019, 9:19pm

Testing that now to see if I can reproduce using this.

EDIT: This version claims it does not have a "mount" command? ...?
That breaks a bunch of my scripts, so that's quite inconvenient to test from. Is this intentional @ncw ?

Until otherwise instructed - testing this instead:

rclone v1.49.4-158-g1dc8bcd4-beta

os/arch: windows/amd64

go version: go1.13.1

With this set at the top of all my scripts:

set GODEBUG tls13=0

Animosity022 · October 1, 2019, 9:44pm

I've not had much luck after those few days in repeating the issue unfortunately. I tried via the Plex route and no dice.

I wonder if someone reading the thread can test it out as well.

ncw · October 2, 2019, 6:44am

Just a consequence of how I built it...

That is useful - thanks!

ncw · October 2, 2019, 12:39pm

OK I've decided to revert the 1.49.x builds back to using go1.12

Here is a proposed build - testing appreciated! This is code identical to 1.49.4, except that it is built with go1.12 and the rpm generation is fixed.

If it tests well I'll turn it into 1.49.5

https://beta.rclone.org/branch/v1.49.4-003-g2c093214-v1.49-fixes-beta/ (uploaded in 15-30 mins)

ncw · October 2, 2019, 12:55pm

Broke that build... try this one

https://beta.rclone.org/branch/v1.49.4-003-gb5ea6af6-v1.49-fixes-beta/ (uploaded in 15-30 mins)

thestigma · October 2, 2019, 5:40pm

Reporting back with test results:

This test (below) failed after almost a day or so of on-and-off use, with this error:
2019/10/02 19:36:29 INFO : Google drive root 'Crypt1': Change notify listener failure: Get https://www.googleapis.com/drive/v3/changes?alt=json&fields=nextPageToken%2CnewStartPageToken%2Cchanges(fileId%2Cfile(name%2Cparents%2CmimeType))&includeItemsFromAllDrives=true&pageSize=1000&pageToken=418426&prettyPrint=false&supportsAllDrives=true&teamDriveId=0AFnbCru1fFmnUk9PVA: stream error: stream ID 1215; INTERNAL_ERROR
2019/10/02 19:36:29 ERROR : IO error: open file failed: Get https://www.googleapis.com/drive/v3/files/1puFG-QpSpRf0wT8MAwWgmUt3F0LTvg6_?alt=media: stream error: stream ID 1217; INTERNAL_ERROR
2019/10/02 19:36:37 INFO :

It did seem able to recover from the failure, but it did generate an I/O error to the OS. Not the exact same error, but I assume it's related since I've never seen these problems before recently.

thestigma:

Until otherwise instructed - testing this instead:

rclone v1.49.4-158-g1dc8bcd4-beta

os/arch: windows/amd64

go version: go1.13.1

With this set at the top of all my scripts:
set GODEBUG tls13=0
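For anyone comparing builds, one rough way to see how often the stream errors quoted in this report occur is to count them in the rclone log. This is only a sketch: `count_stream_errors` is a hypothetical helper name, and the log path is an assumption (whatever file was passed to rclone's `--log-file`).

```shell
# Hypothetical helper: count HTTP/2 stream resets in an rclone log file.
count_stream_errors() {
    # grep -c prints the number of matching lines. It exits with status 1
    # when there are no matches, so "|| true" keeps this usable under "set -e".
    grep -c 'stream error: stream ID [0-9]*; INTERNAL_ERROR' "$1" || true
}

# Example (assumed path):
#   count_stream_errors "$HOME/rclone-mount.log"
```

A count of 0 after a long run under load would be a weak hint that a build is unaffected, though given how long the problem takes to show up it would not be conclusive.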

thestigma · October 2, 2019, 5:41pm

Testing this next...

ncw · October 2, 2019, 7:13pm

Thanks! If that is OK then I'll wrap it up into a release.

This will come to bite us again when I release 1.50 as that is currently building with go1.13 so hopefully the go team have a fix here: Go1.13.2 Milestone · GitHub

That would be a good thing to test later, to see whether 1.49.4 works or not with the not yet released go1.13.2 compiler...

sweh · October 3, 2019, 12:30am

Do any of those bugs explain the current problem? If not you might want to raise an issue with them. The only http2 issue is a memory leak...

ncw · October 3, 2019, 7:04am

To make a decent bug report to the go team I need to be able to replicate it myself so I can bisect the go source to find the problem commit.

I'm not sure any of those explain the problem but then I'm not sure exactly what the problem is since disabling http2 didn't seem to fix it.

Animosity022 · October 3, 2019, 1:30pm

Yeah, the general problem I've been having is recreating it without breaking my primary system (which I don't want to do).

I've tried a number of things to replicate, but can't seem to do it.

The problem seems to be related to a good number of open files (10-25ish) and a lot of quick opens and closes on those files.

I tried rescanning a mount with just the problematic version, but the problem didn't reoccur. I tried running multiple ffprobe/mediainfos spawning off 20-30 at a time and couldn't get the problem back.
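A rough sketch of that kind of load, for anyone who wants to try reproducing it on a test mount. This is not the exact test described above: `stress_open_close` is a hypothetical helper, it uses `head` rather than ffprobe/mediainfo, and the mount path and 64 KiB read size are assumptions.

```shell
# Hypothetical stress helper: read the first 64 KiB of every file under a
# directory, with several reads in flight, to mimic many quick open/close cycles.
stress_open_close() {
    dir="$1"             # directory to scan, e.g. the rclone mount point
    parallel="${2:-25}"  # number of concurrent readers (default 25)
    # One head process per file; xargs -P keeps up to $parallel running at once.
    find "$dir" -type f -print0 |
        xargs -0 -P "$parallel" -n 1 head -c 65536 >/dev/null
}

# Example (assumed mount path):
#   stress_open_close /mnt/gdrive 25
```

Reading only the start of each file keeps the downloaded volume low while still forcing an open, a short read and a close per file.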

It's always possible that it wasn't the new http2 at all and those few days were just a problem on the API side too. It's hard to tell for sure.

I just know for about 48 hours on that version, I was dead in the water.

VBB · October 3, 2019, 4:14pm

For me, the problem did reoccur the next night when I ran a scan again, and even during the day when some people were streaming. This is when I was still on the latest beta. Yesterday, I switched to the build with the fix, and no more issues. Ran a full scan overnight, unprimed mount. Not a single error.

thestigma · October 3, 2019, 4:32pm

No errors on this so far but... all the other times it took a lot of time to manifest, so I wouldn't call this conclusive until I've seen it stable for significantly longer. I haven't really found any way to trigger it outside of putting the system under load for a long time.

ncw · October 3, 2019, 4:42pm

Thanks for testing...

This sounds like a tricky one all round!

Morphy · October 4, 2019, 9:06am

@ncw

Will you remove 1.49.4 from your download section since it's buggy? Maybe put 1.49.3 online?

ncw · October 4, 2019, 9:31am

This link is a proposed v1.49.5 which I'll release once everyone is happy!

https://beta.rclone.org/branch/v1.49.4-003-gb5ea6af6-v1.49-fixes-beta/

So if you could test that it would be much appreciated.

Morphy · October 4, 2019, 1:11pm

Hi ncw

I don't have the option to test today. I'm booked the entire weekend.

Well, it was just a thought, since the download section on your main page will have a faulty rclone?