Rclone aborting while copying large directory to ACD?

Hello,

I’m using rclone repeatedly here to copy a large local directory tree (actually a backup of my main machine) with a few million, mostly small files to an Amazon Cloud Drive encrypted remote, and it’s aborting after hours of operation with the following:

2016/11/03 02:24:19 amazon drive root 'egt6ipr9ofpjnfac0587sqrd8g/dbcd091fet91ga6ortqn0dgimg': Finished reading "4c0aqobr2rnf998n2o0nvljhr4/b7ivnbt6enei8k36730qh687vk/pj4spvnq8hb1tkgjrh52ttq110/kmtlepfej8hhmdai6e4l5u2rbs/aioh54p49aips0g2496jvkvnpo/bkhi5bfokdq8kvq0k94a0s0nv4/4obet05vev4tvk8ksavfs333lk/jq1nml5humulhf9sa244rb6uck/de39t1lmfngbd7phm82vft8mh4/m4qubodpigap8eki0lr0k15cgee8kna6ejbnne467ncb15l3jm80/8g3a37iqpramu6q7ek32qrldj4/5h53hnctf79a8ni4vb20u6b90g/lsn4njrfqhtnvpp839g2mcvl3c/u2d1faie3sfjbrf3h4esv9jlbc/"
2016/11/03 02:24:24 amazon drive root 'egt6ipr9ofpjnfac0587sqrd8g/dbcd091fet91ga6ortqn0dgimg': Error reading : HTTP code 502: "502 Bad Gateway", reponse body:
 <html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>We're sorry!</title>
</head>

<body style="padding: 4em; color: black; background-color: white; font-family: Verdana,Arial,Helvetica,sans-serif;">

<h2>We're sorry!</h2>
<p>
An error occurred when we tried to process your request.
Rest assured, we're already working on the problem and expect to resolve it shortly.
</p>

<h2>Désolés!</h2>
<p>
Une erreur s'est produite lorsque nous avons tenté de traiter votre requête.
Soyez assuré que nous travaillons déjà à la résolution du problème que nous pensons trouver très rapidement.
</p>

</body>
</html>

2016/11/03 02:24:24 Attempt 3/3 failed with 0 errors and: HTTP code 502: "502 Bad Gateway", reponse body: <html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>We're sorry!</title>
</head>

<body style="padding: 4em; color: black; background-color: white; font-family: Verdana,Arial,Helvetica,sans-serif;">

<h2>We're sorry!</h2>
<p>
An error occurred when we tried to process your request.
Rest assured, we're already working on the problem and expect to resolve it shortly.
</p>

<h2>Désolés!</h2>
<p>
Une erreur s'est produite lorsque nous avons tenté de traiter votre requête.
Soyez assuré que nous travaillons déjà à la résolution du problème que nous pensons trouver très rapidement.
</p>

</body>
</html>

2016/11/03 02:24:24 Failed to copy: HTTP code 502: "502 Bad Gateway", reponse body: <html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>We're sorry!</title>
</head>

<body style="padding: 4em; color: black; background-color: white; font-family: Verdana,Arial,Helvetica,sans-serif;">

<h2>We're sorry!</h2>
<p>
An error occurred when we tried to process your request.
Rest assured, we're already working on the problem and expect to resolve it shortly.
</p>

<h2>Désolés!</h2>
<p>
Une erreur s'est produite lorsque nous avons tenté de traiter votre requête.
Soyez assuré que nous travaillons déjà à la résolution du problème que nous pensons trouver très rapidement.
</p>

</body>
</html>

I understand that this is happening because Amazon is returning repeated “502 Bad Gateway” errors as logged above.

How can I make rclone not abort in such a scenario, but rather wait a while and try again, repeatedly? I already have about 1M files in the ACD remote directory, and it takes rclone hours just to re-read it, only to abort when it hits such an error… :-/

Perhaps I should increase one of the “retries” options? In this case, which one?

Thanks,

Durval.

502 isn’t an error message we normally retry on… I’ve added this in

http://beta.rclone.org/v1.33-99-g452c681/ (will be uploaded in 15-20 mins)

With that build you may not need to change --low-level-retries, but raising it might help too.
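
For readers wondering what "retrying on 502" amounts to, here is a minimal Python sketch of the general idea: treat certain HTTP status codes as transient and retry with a growing delay. This is not rclone's actual code. The names `HTTPStatusError`, `call_with_retries` and `RETRYABLE_STATUS`, and the delay values, are all made up for illustration; only the 502 status and the value 10 come from this thread.

```python
import time

# Assumption for this sketch: only 502 (the code seen in this thread) is
# treated as transient. Other codes could be added to the set.
RETRYABLE_STATUS = {502}


class HTTPStatusError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed request."""

    def __init__(self, status):
        super().__init__(f"HTTP code {status}")
        self.status = status


def call_with_retries(operation, max_tries=10, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call operation(); on a retryable status, wait and try again.

    The delay doubles after each failed attempt, capped at max_delay.
    The last error is re-raised once max_tries attempts have failed.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return operation()
        except HTTPStatusError as err:
            if err.status not in RETRYABLE_STATUS or attempt == max_tries:
                raise
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
```

With the default of 10 tries and these illustrative delays, an outage has to last several minutes before the call gives up, which is why a higher retry count helps only if the 502s clear up within that window.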

This has happened to me before. It usually occurs within the first few minutes of starting rclone, and goes away if I wait a couple of hours before retrying. If I don’t wait, or wait only a couple of minutes, the error persists.

I’ve always assumed it was a problem on Amazon’s end.

Hello Nick,

Thanks for the prompt response! I will be sure to run it the next time the currently running rclone aborts (it is still not even 50% done, so it will almost certainly abort again).

Will do. Would “--low-level-retries=10” be adequate, or would you recommend a different value?

Cheers,

Durval.

It really depends on how long you keep getting that 502 error. 10 is usually sufficient, but try with the -v flag, watch the retries and see what happens.

Hello Nick,

Makes sense. Using 10 for now, will monitor and report my results back here, for those that will come a-googling…

Cheers,

Durval.

Any update on using this beta? I was getting a lot of errors uploading large files before too. I consumed 6TB of upload, only to see 3.5TB of files uploaded to ACD due to multiple large files failing.

This beta isn’t working so well for me, getting tons of ‘error HTTP code 500: “500 Internal Server Error”, response body: {“message”:“Internal failure”}’ when uploading files >1GB.

Had to revert back to v1.33-93-g5c96781-acd-range-fixβ/

I think that is probably Amazon rather than the beta. Can you try again with http://beta.rclone.org/v1.33-99-g452c681/ ?

Hello @ncw, everyone,

My latest transfer (with the beta version you recommended above) has been going on non-stop for over 36h now, and has so far transferred 624GB and 285K files, not a single abort.

Reviewing the “-v” log, I saw just three “502 Bad Gateway” errors, at around 20:18, 20:24 and 20:37 UTC today, but all three were on different files, so I didn’t get to test your modification yet.
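
For anyone wanting to do the same kind of log review, below is a rough Python sketch that tallies “502 Bad Gateway” log lines per hour. It assumes the log line layout shown in the first post (a `YYYY/MM/DD HH:MM:SS` timestamp at the start of each entry); the function name `count_502_per_hour` is made up for this example. Note that it counts log lines rather than distinct failures, and a single failure can produce several lines, as in the excerpt at the top of this thread.

```python
import re
from collections import Counter

# Timestamp prefix as seen in the log excerpt in this thread,
# e.g. "2016/11/03 02:24:24 ...". Other log layouts would need a new pattern.
LINE_START = re.compile(r"^(\d{4}/\d{2}/\d{2}) (\d{2}):\d{2}:\d{2} ")


def count_502_per_hour(lines):
    """Return a Counter mapping 'YYYY/MM/DD HH:00' to the number of
    timestamped log lines mentioning '502 Bad Gateway'."""
    counts = Counter()
    for line in lines:
        match = LINE_START.match(line)
        # Continuation lines (the HTML error body) carry no timestamp
        # and are skipped.
        if match and "502 Bad Gateway" in line:
            counts[f"{match.group(1)} {match.group(2)}:00"] += 1
    return counts
```

To run it over a saved log, pass an open file object, for example `count_502_per_hour(open("rclone.log"))`, where `rclone.log` is a placeholder file name.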

Seems Amazon got its act together for now; I still have a lot of data to upload during the next few weeks, so can you please tell whether this modification has been incorporated on rclone master, or should I keep using the above beta version for now and refrain from updating rclone until I get a large cluster of those errors to be able to test it out?

Thanks,

Durval.

Ok upload seems to be working better now. Amazon must have fixed something on their end.

Hello,

Answering my own question, to those that may come a-googling:

Apparently the newest “stable” version incorporates it, and so it should become a “common fixture” for the time being:

On a related note: adding the --low-level-retries option as recommended above had the unfortunate side effect of also adding numerous useless retries for other errors, especially those caused by overly long directory and file names; see https://github.com/ncw/rclone/issues/219#issuecomment-253397112 . This caused my transfers to stall completely for many hours whenever rclone stumbled upon a directory with an overly long name containing lots of files… :-/ So I’m redoing it, this time specifying --exclude="????...????*" (with 175 ‘?’ characters) on the command line, as @ncw recommended here: https://github.com/ncw/rclone/issues/762#issuecomment-251376745

EDIT: the limit that worked for me was 160 characters, not 175; see my comment here: https://github.com/ncw/rclone/issues/762#issuecomment-258724572
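
Typing 160 question marks by hand is error-prone, so here is a small Python sketch that builds the pattern. The helper name `exclude_pattern` is made up for this example; the value 160 is the limit reported above, and the right value for other setups may differ.

```python
def exclude_pattern(limit):
    """Build a glob of `limit` '?' characters followed by '*'.

    Each '?' stands for exactly one character and '*' for zero or more,
    so the glob matches names that are at least `limit` characters long.
    """
    return "?" * limit + "*"


pattern = exclude_pattern(160)
# Print the option in the form used in this thread, ready to paste.
print(f'--exclude="{pattern}"')
```

The printed text can be pasted into the rclone command line as is; the double quotes keep the shell from expanding the wildcards itself.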

Cheers,

Durval.