Help To Understand Rclone Retry And Timeout

What is the problem you are having with rclone?

I'm using rclone rcd to sync (download files) from my remote (AWS S3) to local.
If the internet disconnects, rclone seems to retry, using the default --retries and --low-level-retries. I'm seeing inconsistent behavior with --timeout and --retries.

Answers to the questions below will help me understand retries better.

I have the following questions:

  1. Will rclone retry even if the internet is not restored after --timeout is met?
  2. Does rclone start retrying as soon as the internet is disconnected, or does it wait until --timeout has elapsed before retrying?

Run the command 'rclone version' and share the full output of the command.

rclone v1.66.0

  • os/version: Microsoft Windows 10 Enterprise 22H2 (64 bit)
  • os/kernel: 10.0.19045.4412 (x86_64)
  • os/type: windows
  • os/arch: amd64
  • go/version: go1.22.1
  • go/linking: static
  • go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

AWS S3

The command you were trying to run (eg rclone copy /tmp remote:tmp)

http://localhost:5572/operations/copyfile

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

Paste config here

A log from the command that you were trying to run with the -vv flag

Paste  log here

@ncw @Animosity022 @asdffdsa

Please someone help.

  1. Will rclone retry even if the internet is not restored after --timeout is met?
  2. Does rclone start retrying as soon as the internet is disconnected, or does it wait until --timeout has elapsed before retrying?
  3. I see that even after the timeout, the rclone "/core/stats" and "/job/status" APIs don't return an error. It seems to be retrying indefinitely. Is this the expected behavior?

You can help yourself by testing 1. and 2.

I tested this multiple times and here are my observations:
rclone starts retrying only after --timeout is met,
which means that if the timeout is set to 5 min, rclone retries after 5 min.

Also, please clarify point 3.

Is this the expected behavior?

Timeout is related to a transaction timeout for something like a TCP request.

     --timeout Duration                   IO idle timeout (default 5m0s)
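To make "IO idle timeout" concrete, here is a rough model of an idle timer. It is only a sketch, not rclone's code: `first_idle_timeout` and `minutes` are made-up helper names, and the arrival times are illustrative numbers. The idea is that the timer resets whenever data arrives and only fires once nothing has arrived for the whole timeout.

```python
from datetime import timedelta


def first_idle_timeout(data_times, timeout, end):
    """Return the offset at which an idle timer would first fire, or None.

    data_times: ascending offsets (timedelta from transfer start) at which
                some data arrived on the connection.
    timeout:    longest gap without data that is tolerated.
    end:        offset at which observation stops.
    """
    previous = timedelta(0)  # the timer starts when the transfer starts
    for moment in list(data_times) + [end]:
        if moment - previous > timeout:
            # No data for longer than `timeout`: the timer fires here.
            return previous + timeout
        previous = moment  # data arrived, so the idle timer resets
    return None


def minutes(n):
    return timedelta(minutes=n)


# Illustrative numbers: data flows until minute 6, then nothing arrives
# until minute 15; the timeout is the default 5 minutes.
arrivals = [minutes(m) for m in (1, 2, 3, 4, 5, 6, 15, 16, 17, 18, 19)]
print(first_idle_timeout(arrivals, timeout=minutes(5), end=minutes(20)))
# prints 0:11:00
```

Under this model an error is only noticed 5 minutes after the last byte arrived, which would line up with a retry appearing one timeout period after the connection drops.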

Retries and timeouts are kind of different.

Retries happen at the low and high level depending on the flags and can be for a connection break or something odd. Generally, on a healthy connection, you don't get any retries.
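As a mental model of the two levels, the sketch below nests a per-step retry budget inside a whole-job retry budget. It is an assumption-laden illustration, not rclone's implementation: `run_step`, `run_job` and `flaky_step` are made-up names, the defaults 3 and 10 mirror the documented defaults of --retries and --low-level-retries, and it says nothing about whether the high-level retries apply to calls made through the rc API.

```python
def run_step(step, low_level_retries):
    """Try one low-level step (think: a single read) up to N times."""
    for _ in range(low_level_retries):
        try:
            step()
            return True
        except OSError:
            continue  # transient I/O failure: use up one low-level retry
    return False


def run_job(steps, retries=3, low_level_retries=10):
    """Run every step; restart the whole job if any step gives up.

    Returns the high-level attempt number that succeeded, or None.
    """
    for attempt in range(1, retries + 1):
        if all(run_step(step, low_level_retries) for step in steps):
            return attempt
    return None


# Hypothetical step that times out twice and then succeeds.
failures_left = [2]


def flaky_step():
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise TimeoutError("i/o timeout")


print(run_job([flaky_step]))  # prints 1
```

Here the two timeouts are absorbed by the low-level budget, so the job succeeds on its first high-level attempt. On a healthy connection neither budget is touched.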

If you have something, a debug log really helps; we can't guess the output or what the issue might be.

Overall, if your internet dies, a cloud storage tool is going to have issues.

I generally stop my rclone traffic if my internet goes down and fire it back up when it comes back.

I am not sure what command you are running or which flags you are using, since we don't have a debug log file.

Pick one file, use the CLI to avoid the API stuff and recreate it in the most simple fashion and we can use the API after and see if the API has an issue.

Thanks for the reply @Animosity022
The rcd command I'm using:
rclone rcd --no-console --rc-user=user --rc-pass=pass --s3-upload-concurrency=8
--local-no-check-updated=true --rc-job-expire-duration=24h --s3-upload-cutoff=1G --s3-no-check-bucket

Please note that I have not explicitly set --retries or --low-level-retries, so the default values should apply.
Once rcd initializes, I'm using
http://localhost:5572/operations/copyfile
to copy files from AWS S3 to a local Windows directory.
During the copy I use
http://127.0.0.1:5572/job/status and http://127.0.0.1:5572/core/stats to get the progress of the copy.
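For reference, this is roughly how the two responses could be read together when polling. It is a minimal sketch under assumptions: `summarize` is a made-up helper, the sample dictionaries are invented rather than real output, and only the `finished`, `success` and `error` fields of job/status and the `errors` and `lastError` fields of core/stats are used.

```python
def summarize(job_status, core_stats):
    """Turn parsed job/status and core/stats JSON into a one-line summary."""
    if job_status.get("finished"):
        if job_status.get("success"):
            return "finished OK"
        return "failed: " + (job_status.get("error") or "unknown error")
    errors = core_stats.get("errors", 0)
    if errors:
        last = core_stats.get("lastError", "n/a")
        return f"running, {errors} error(s) so far; last: {last}"
    return "running, no errors reported yet"


# Invented sample responses, for illustration only.
still_running = {"finished": False, "success": False, "error": ""}
print(summarize(still_running, {"errors": 0}))
# prints: running, no errors reported yet

failed = {"finished": True, "success": False, "error": "i/o timeout"}
print(summarize(failed, {"errors": 1, "lastError": "i/o timeout"}))
# prints: failed: i/o timeout
```

In this sketch an unfinished job with no counted errors is simply reported as running. Whether retries that are still in progress show up in either response is not established in this thread.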

Scenario 1:
Say the copy/download takes 20 mins to complete. After 6 mins, the internet disconnects, and it is restored at the 15-min mark. So from 6 to 15, which is 9 mins, there was no internet. I see the following in the logs at the 11-min mark, which is the 5-min timeout after the disconnect:
2024/06/12 13:08:05 DEBUG : Data/abc.txt: Reopening on read failure after 20511 bytes: retry 1/10: read tcp 192.168.0.103:26749->3.5.3.167:443: i/o timeout
2024/06/12 13:08:54 DEBUG : Data/DataFile5/158_sample0158_d17f39fa-16ca-408d-8dd6-c68094c4e2a0.wiff2: Received error: read tcp 192.168.0.103:26749->3.5.3.167:443: i/o timeout - low level retry 1/10

The main issue in this scenario is that /job/status and /core/stats don't give an error and there are no more retries, just 1/10.

Scenario 2:
Say the copy/download takes 20 mins to complete. After 6 mins, the internet disconnects. The internet is restored 1 min after the disconnection.
In this case I'm seeing inconsistent behavior. At times rclone retries soon after the internet is restored, and at times the retry happens only 5 mins after the internet is restored.

I need help understanding what the expected behavior is in both of the above cases.

If you want help, please help me help you by giving me what I'm looking for to help answer your question.

Please see the attached logs.
The retry and the low-level retry each occurred only once, and this happened exactly after 5 mins.

rclonelog.txt (1.6 MB)