HTTP remote does not retry on 502 Error

What is the problem you are having with rclone?

I'm using rclone to download files via an http remote, and when a particular file throws a 502 error (relatively common), rclone doesn't do any low-level retries or fail in any way. This can cause files to be deleted when running a sync operation. rclone does retry for some remotes (e.g. 1Fichier), but I couldn't find any errors that the HTTP backend code will automatically retry on.

Run the command 'rclone version' and share the full output of the command.

rclone v1.66.0

  • os/version: Microsoft Windows 11 Home 23H2 (64 bit)
  • os/kernel: 10.0.22631.3593 (x86_64)
  • os/type: windows
  • os/arch: amd64
  • go/version: go1.22.1
  • go/linking: static
  • go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

HTTP

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync --fast-list --check-first --transfers 5 remote_http:path out_dir -vv

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[remote_http]
type = http
url = https://remote_http.com/files/

A log from the command that you were trying to run with the -vv flag

INFO: 2024/05/22 08:26:49 DEBUG : rclone: Version "v1.66.0" starting with parameters ["rclone" "sync" "--fast-list" "--check-first" "--transfers" "5" "remote_http:path" "out_dir" "-vv"]
INFO: 2024/05/22 08:26:49 DEBUG : Creating backend with remote "remote_http:path"
INFO: 2024/05/22 08:26:49 DEBUG : Using config file from "XXX\\rclone\\rclone.conf"
INFO: 2024/05/22 08:26:49 DEBUG : Root: remote_http:path
INFO: 2024/05/22 08:26:49 DEBUG : Creating backend with remote "out_dir"
INFO: 2024/05/22 08:26:49 DEBUG : fs cache: renaming cache item "out_dir" to be canonical "//?/out_dir"
INFO: 2024/05/22 08:26:49 INFO  : Local file system at //?/out_dir: Running all checks before starting transfers
INFO: 2024/05/22 08:26:55 DEBUG : file1.zip: skipping because of error: failed to stat: HTTP Error: 502 Bad Gateway
INFO: 2024/05/22 08:26:55 DEBUG : file2.zip: skipping because of error: failed to stat: HTTP Error: 502 Bad Gateway
INFO: 2024/05/22 08:26:56 DEBUG : file3.zip: skipping because of error: failed to stat: HTTP Error: 502 Bad Gateway

welcome to the forum,

what other copy tools have you tested?

Hi,

Thanks for the response. I'm not sure what you mean by other copy tools? I tried rclone copy rather than rclone sync and ran into the same thing. wget works but isn't granular enough for the kind of sync I'm trying to do here, which does a lot of filtering, and it also occasionally fails on the downloads. I've noticed that other remote types have errors that they'll automatically retry on; I think the HTTP remote should be able to do the same?

ok, so the error is with the server, not rclone.

that makes sense.

pretty sure, http remotes do not support --fast-list

rclone backend features http | grep "ListR"
                "ListR": false,

for testing, can you run a simpler command and post the full, complete debug output?
rclone sync remote_http:path out_dir -vv

I agree, the server is pretty flaky but I thought rclone had some features that would help that out. You're also right about the --fast-list, I've got it in there for completeness but it doesn't actually do anything. The full log looks much the same:

INFO: 2024/05/22 16:52:32 DEBUG : rclone: Version "v1.66.0" starting with parameters ["rclone" "sync" "--transfers=5" "remote_http:path" "out_dir" "-vv" "--filter" "+ {{(file1.zip|file2.zip)}}" "--filter" "- *"]
INFO: 2024/05/22 16:52:32 DEBUG : Creating backend with remote "remote_http:path"
INFO: 2024/05/22 16:52:32 DEBUG : Using config file from "XXX\\rclone\\rclone.conf"
INFO: 2024/05/22 16:52:32 DEBUG : Root: remote_http:path
INFO: 2024/05/22 16:52:32 DEBUG : Creating backend with remote "out_dir"
INFO: 2024/05/22 16:52:32 DEBUG : fs cache: renaming cache item "out_dir" to be canonical "//?/out_dir"
INFO: 2024/05/22 16:52:36 DEBUG : fileX.zip: skipping because of error: failed to stat: HTTP Error: 502 Bad Gateway

[Way more 502s here, inconsistent from run to run]

INFO: 2024/05/22 16:53:26 DEBUG : file1.zip: Size and modification time the same (differ by 0s, within tolerance 1s)
INFO: 2024/05/22 16:53:26 DEBUG : file1.zip: Unchanged skipping
INFO: 2024/05/22 16:53:26 DEBUG : Local file system at //?/out_dir: Waiting for transfers to finish
INFO: 2024/05/22 16:53:26 DEBUG : Waiting for deletions to finish
INFO: 2024/05/22 16:53:26 INFO  : There was nothing to transfer
INFO: 2024/05/22 16:53:26 INFO  : 
INFO: Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
INFO: Checks:                 1/1, 100%
INFO: Elapsed time:        54.2s
INFO: 
INFO: 2024/05/22 16:53:26 DEBUG : 3 go routines active

the real issue is that if the 502 happens on a file that's in the include filter, it won't get transferred, and without -vv rclone won't even show that the 502 error is the reason
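One way to surface these skips from a wrapper script is to scan the -vv output for the DEBUG line shown in the logs. This is only a rough sketch: it assumes the message format stays exactly as printed by rclone v1.66.0 in this thread, and `find_stat_failures` is a hypothetical helper name, not part of rclone.

```python
import re

# Matches the DEBUG line from the logs in this thread. The message format is an
# assumption based on rclone v1.66.0 output and may differ in other versions.
SKIP_RE = re.compile(
    r"DEBUG : (?P<name>.+?): skipping because of error: "
    r"failed to stat: HTTP Error: (?P<status>\d{3})"
)


def find_stat_failures(log_lines):
    """Return (file name, HTTP status) pairs for files skipped during listing."""
    failures = []
    for line in log_lines:
        match = SKIP_RE.search(line)
        if match:
            failures.append((match.group("name"), int(match.group("status"))))
    return failures
```

A wrapper could then treat a non-empty result as a failed run instead of trusting the "There was nothing to transfer" summary.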

that looks weird, with INFO: at the start

i would expect
2024/05/22 16:53:26 DEBUG

Yeah, it's running through a python script. So that's the python logger formatting around the rclone log
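For illustration, a relay along these lines would produce that prefix. This is a sketch, not the actual script: the logger name `rclone_wrapper` and the helpers `make_logger` and `relay` are made up for the example.

```python
import io
import logging


def make_logger(stream):
    """Build a logger that writes "LEVEL: message" lines to the given stream."""
    logger = logging.getLogger("rclone_wrapper")  # hypothetical logger name
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def relay(lines, logger):
    """Forward each rclone output line through the Python logger at INFO level."""
    for line in lines:
        logger.info(line.rstrip("\n"))


if __name__ == "__main__":
    out = io.StringIO()
    relay(["2024/05/22 16:53:26 DEBUG : file1.zip: Unchanged skipping\n"], make_logger(out))
    print(out.getvalue(), end="")
```

Because every line is logged at INFO regardless of rclone's own level, the rclone DEBUG lines all end up with an "INFO: " prefix.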

for testing, let's keep it simple, run rclone on the command line, ok?

Sure, just putting the command directly into terminal:

2024/05/22 17:30:10 DEBUG : rclone: Version "v1.66.0" starting with parameters ["E:\\Program Files (x86)\\Rclone\\rclone.exe" "sync" "--transfers=5" "remote_http:path" "out_dir" "-vv" "--filter" "+ {{(file1.zip|file2.zip)}}" "--filter" "- *"]
2024/05/22 17:30:10 DEBUG : Creating backend with remote "remote_http:path"
2024/05/22 17:30:10 DEBUG : Using config file from "XXX\\rclone\\rclone.conf"
2024/05/22 17:30:10 DEBUG : Root: remote_http:path
2024/05/22 17:30:10 DEBUG : Creating backend with remote "out_dir"
2024/05/22 17:30:10 DEBUG : fs cache: renaming cache item "out_dir" to be canonical "//?/out_dir"
2024/05/22 17:32:55 DEBUG : fileX.zip: skipping because of error: failed to stat: HTTP Error: 502 Bad Gateway

[More 502s here]

2024/05/22 17:35:02 DEBUG : file1.zip: Unchanged skipping
2024/05/22 17:35:02 DEBUG : Local file system at //?/out_dir: Waiting for transfers to finish
2024/05/22 17:35:02 DEBUG : Waiting for deletions to finish
2024/05/22 17:35:02 INFO  : There was nothing to transfer
2024/05/22 17:35:02 INFO  :
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:                 1 / 1, 100%
Elapsed time:        53.6s

2024/05/22 17:35:02 DEBUG : 3 go routines active

ok, so same output as compared with python.

i do not use http remotes much, so not sure why rclone is not doing retries.
maybe it is not supported for http remotes.

DEBUG : fileX.zip: skipping because of error: failed to stat: HTTP Error: 502 Bad Gateway
i would think rclone should output ERROR, not DEBUG
maybe that is the reason, rclone is not retrying?

I agree. I was poking around in the code a little earlier and saw that the HTTP remote has no status codes it will retry on, which I think it should have? That, or the debug message should be raised to an error, one or the other. Unless the HTTP protocol doesn't support that? Not an expert here.
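To make the idea concrete, here is a rough sketch of retrying on a set of status codes. It is Python for illustration only, not rclone's Go code: `RETRY_STATUSES`, `stat_with_retries`, the `do_request` callable, and the backoff values are all assumptions. The default of 10 attempts mirrors rclone's default for --low-level-retries.

```python
import time

# Assumed set of transient statuses worth retrying. Backends that do retry
# define their own lists; this one is only illustrative.
RETRY_STATUSES = {429, 500, 502, 503, 504}


def stat_with_retries(do_request, retries=10, base_delay=0.1, sleep=time.sleep):
    """Call do_request() until it returns a non-retryable status.

    do_request is a hypothetical callable returning an HTTP status code.
    Returns the last status seen, after at most `retries` attempts.
    """
    status = None
    for attempt in range(retries):
        status = do_request()
        if status not in RETRY_STATUSES:
            return status
        # Exponential backoff between attempts, but not after the last one.
        if attempt < retries - 1:
            sleep(base_delay * 2 ** attempt)
    return status
```

If the status is still 502 after the last attempt, the caller gets it back and can report a real error rather than silently skipping the file.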

ok, i think these links provide the answers

about retries
the http backend isn't using the low level retries framework

about the DEBUG versus ERROR in the log
The skipping files that return an HTTP error would be done here

The low level retries thing seems totally sensible to me, would it be possible to get this implemented?

sure, rclone is open-source, you are welcome to write the source code.
or sponsor the primary author.
or post at the github issue.
