Copyurl fails with stream error (wget and curl work)

What is the problem you are having with rclone?

rclone is failing to download this publicly available file:

rclone copyurl https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip .

Run the command 'rclone version' and share the full output of the command.

(top) [ec2-user@ip-172-31-31-213 ~]$ rclone version
rclone v1.65.2

  • os/version: amazon 2023 (64 bit)
  • os/kernel: 6.1.72-96.166.amzn2023.aarch64 (aarch64)
  • os/type: linux
  • os/arch: arm64 (ARMv8 compatible)
  • go/version: go1.21.6
  • go/linking: static
  • go/tags: none

The rclone config contents with secrets removed.

Not using rclone config.

A log from the command with the -vv flag

2024/02/01 10:59:45 NOTICE: Config file "/home/ec2-user/.config/rclone/rclone.conf" not found - using defaults
2024/02/01 10:59:45 ERROR : Attempt 1/3 failed with 1 errors and: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": stream error: stream ID 1; INTERNAL_ERROR; received from peer
2024/02/01 10:59:45 ERROR : Attempt 2/3 failed with 1 errors and: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": stream error: stream ID 3; INTERNAL_ERROR; received from peer
2024/02/01 10:59:45 ERROR : Attempt 3/3 failed with 1 errors and: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": stream error: stream ID 5; INTERNAL_ERROR; received from peer

Additional info

I can download this file via browser, using curl and wget.

Firstly, your command will save the download as a local file named . (dot).
Add the --auto-filename flag to use the original filename, or specify the name explicitly.
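
A minimal sketch of the two fixes, assuming a POSIX shell. The helper url_basename is hypothetical: it only illustrates the name you would expect when the file name is taken from the last segment of the URL path, and rclone's own logic may differ. The commands are printed rather than run.

```shell
#!/bin/sh
# Hypothetical helper: last path segment of a URL, with any query string
# or fragment stripped. Illustration only, not rclone's implementation.
url_basename() {
    path=${1%%\?*}       # drop "?query"
    path=${path%%\#*}    # drop "#fragment"
    printf '%s\n' "${path##*/}"
}

url='https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip'

# Option 1: let rclone pick the name and treat "." as the target directory.
echo "rclone copyurl $url --auto-filename ."

# Option 2: give the destination file name explicitly.
echo "rclone copyurl $url ./$(url_basename "$url")"
```

Either form avoids writing to a file literally called "." in the current directory.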

Secondly, it all works for me (but slowly).

$ rclone copyurl https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip --auto-filename . -vv -P
2024/02/01 11:48:40 DEBUG : rclone: Version "v1.65.2" starting with parameters ["rclone" "copyurl" "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip" "-a" "." "-vv" "-P"]
2024/02/01 11:48:40 DEBUG : Creating backend with remote "."
2024/02/01 11:48:40 DEBUG : Using config file from "/Users/kptsky/.config/rclone/rclone.conf"
2024/02/01 11:48:40 DEBUG : fs cache: renaming cache item "." to be canonical "/Users/kptsky/Temp/test"
2024/02/01 11:48:41 DEBUG : 2023_30m_cdls.zip: File name found in url
Transferred:   	    1.918 GiB / 1.918 GiB, 100%, 2.762 MiB/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed time:     10m16.8s
2024/02/01 11:58:57 INFO  :
Transferred:   	    1.918 GiB / 1.918 GiB, 100%, 2.762 MiB/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed time:     10m16.8s

2024/02/01 11:58:57 DEBUG : 7 go routines active

I succeeded only after a few failed attempts with errors like:

2024/02/01 11:34:09 NOTICE: .: Removing partially written file on error: read tcp 172.20.10.10:54954->23.64.47.233:443: read: operation timed out
2024/02/01 11:34:09 ERROR : .: Post request put error: read tcp 172.20.10.10:54954->23.64.47.233:443: read: operation timed out
2024/02/01 11:34:09 ERROR : Attempt 1/3 failed with 2 errors and: read tcp 172.20.10.10:54954->23.64.47.233:443: read: operation timed out

This suggests the most likely cause is an unstable network on the source server's side.

Thank you for your reply! I'm sorry, I was a bit too hasty; indeed, it works on my machine, which has the following:

rclone v1.62.2
- os/version: ubuntu 22.04 (64 bit)
- os/kernel: 6.5.0-14-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.20.2
- go/linking: static
- go/tags: none

It does not, however, work on two EC2 instances on AWS. I tried arm64 (r6g) and amd64 (t3). I'm getting the stream error pasted above. The arm64 version is in the first post; the amd64 one is this:

(top) [ec2-user@ip-172-31-31-24 ~]$ rclone version
rclone v1.65.2-DEV
- os/version: amazon 2023 (64 bit)
- os/kernel: 6.1.72-96.166.amzn2023.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.21.6
- go/linking: static
- go/tags: none

I'm installing rclone with mamba install rclone, getting an error, then running rclone selfupdate, and getting the same error.

edit:

I tried to install using sudo -v ; curl https://rclone.org/install.sh | sudo bash, same error.

So there is some strange network setup there. Proxy? etc.

It definitely does not look like an rclone problem.

To my understanding these instances are as "default" as it gets, with no specific configuration. I am able to download files from other locations, for example:

rclone copyurl https://geodata.ucdavis.edu/gadm/gadm4.1/shp/gadm41_USA_shp.zip out

Maybe there is some kind of test suite I could run?

Sometimes things break when some network infrastructure is not configured properly (this can be anywhere between your server and geodata.ucdavis.edu). This is often related to HTTP2 support and IPv6 path configuration.

I would try:

--disable-http2
--bind 0.0.0.0 or --bind ::0

e.g.:

rclone copyurl https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip --auto-filename . -vv -P --disable-http2 --bind 0.0.0.0

to disable HTTP2 and force the IPv4 path.
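
The flag combinations above can be tried systematically. A minimal sketch, assuming a POSIX shell (sh or bash, where unquoted variables are word-split); print_attempts is a hypothetical helper that only prints the commands, so nothing touches the network until you copy a line or pipe the output to sh.

```shell
#!/bin/sh
# Hypothetical helper: print one rclone command per flag combination
# (default or HTTP/2 disabled, each with no bind, IPv4 bind, or IPv6 bind).
print_attempts() {
    url=$1
    for http in "" "--disable-http2"; do
        for bind in "" "--bind 0.0.0.0" "--bind ::0"; do
            # $http and $bind are left unquoted on purpose so that
            # empty values disappear from the printed command line.
            echo rclone copyurl "$url" --auto-filename . -vv -P $http $bind
        done
    done
}

print_attempts 'https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip'
```

Running the six printed commands one at a time and noting which error each produces makes it easier to see whether the protocol version or the address family changes the outcome.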

Thank you for the suggestion! In this case I do not get an error, but the transfer does not start (cancelled after 6 minutes):

(top) [ec2-user@ip-172-31-24-69 ~]$ rclone copyurl https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip --auto-filename . -vv -P --disable-http2 --bind 0.0.0.0
2024/02/01 14:38:50 DEBUG : rclone: Version "v1.65.2-DEV" starting with parameters ["rclone" "copyurl" "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip" "--auto-filename" "." "-vv" "-P" "--disable-http2" "--bind" "0.0.0.0"]
2024/02/01 14:38:50 DEBUG : Creating backend with remote "."
2024/02/01 14:38:50 NOTICE: Config file "/home/ec2-user/.config/rclone/rclone.conf" not found - using defaults
2024/02/01 14:38:50 DEBUG : fs cache: renaming cache item "." to be canonical "/home/ec2-user"
2024/02/01 14:43:50 ERROR : Attempt 1/3 failed with 1 errors and: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": net/http: timeout awaiting response headers
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:      6m32.0s^C

Try --disable-http2 only; then, if there are still issues, try

--disable-http2 --bind ::0 or only --bind ::0

With only --disable-http2 I get the same non-starting transfer as with --disable-http2 --bind 0.0.0.0.

With --disable-http2 --bind ::0 and with --bind ::0 alone I get "network is unreachable":

(top) [ec2-user@ip-172-31-24-69 ~]$ rclone copyurl https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip --auto-filename . -vv --disable-http2 --bind ::0
2024/02/01 15:24:40 DEBUG : rclone: Version "v1.65.2-DEV" starting with parameters ["rclone" "copyurl" "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip" "--auto-filename" "." "-vv" "--disable-http2" "--bind" "::0"]
2024/02/01 15:24:40 DEBUG : Creating backend with remote "."
2024/02/01 15:24:40 NOTICE: Config file "/home/ec2-user/.config/rclone/rclone.conf" not found - using defaults
2024/02/01 15:24:40 DEBUG : fs cache: renaming cache item "." to be canonical "/home/ec2-user"
2024/02/01 15:24:40 ERROR : Attempt 1/3 failed with 1 errors and: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": dial tcp6 [::]:0->[2600:1409:d000:58d::2938]:443: connect: network is unreachable
2024/02/01 15:24:40 ERROR : Attempt 2/3 failed with 1 errors and: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": dial tcp6 [::]:0->[2600:1409:d000:596::2938]:443: connect: network is unreachable
2024/02/01 15:24:40 ERROR : Attempt 3/3 failed with 1 errors and: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": dial tcp6 [::]:0->[2600:1409:d000:58d::2938]:443: connect: network is unreachable
2024/02/01 15:24:40 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Errors:                 1 (retrying may help)
Elapsed time:         0.0s

2024/02/01 15:24:40 DEBUG : 4 go routines active
2024/02/01 15:24:40 Failed to copyurl: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": dial tcp6 [::]:0->[2600:1409:d000:58d::2938]:443: connect: network is unreachable

Maybe others have some better ideas - my skills in troubleshooting network issues are limited.

As @kapitainsky noted above this is an HTTP2 error. It is from the server so the server looks unhappy for some reason.

The experiments with --bind mean that it isn't an IPv4 vs IPv6 issue.

The download works fine with and without --disable-http2 so it isn't an HTTP1 vs HTTP2 problem.

I think the remaining reasons could be

  • some kind of network proxy between you and the server - assuming you are just starting up a vanilla VM then this is unlikely, but if you are in some kind of VPC then I guess it could be possible.
  • the server has banned downloads from AWS IPs for some reason (abuse maybe)

You might also want to experiment with setting a --user-agent to that used by a browser - that might help.

Thank you ncw!

I created a brand new AWS account, and created a default EC2 instance (next, next, next..), installed rclone from https://rclone.org/install.sh script. Same failure. Version displays this:

[ec2-user@ip-172-31-18-212 ~]$ rclone version
rclone v1.65.2
- os/version: amazon 2023 (64 bit)
- os/kernel: 6.1.72-96.166.amzn2023.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.21.6
- go/linking: static
- go/tags: none

I tried the top user agent from https://www.useragents.me/. Same failure. Command looks like this:

[ec2-user@ip-172-31-18-212 ~]$ rclone copyurl https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip out --user-agent="MozilMozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.3"
2024/02/02 14:01:24 NOTICE: Config file "/home/ec2-user/.config/rclone/rclone.conf" not found - using defaults
2024/02/02 14:01:24 ERROR : Attempt 1/3 failed with 1 errors and: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": stream error: stream ID 1; INTERNAL_ERROR; received from peer
2024/02/02 14:01:24 ERROR : Attempt 2/3 failed with 1 errors and: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": stream error: stream ID 3; INTERNAL_ERROR; received from peer
2024/02/02 14:01:24 ERROR : Attempt 3/3 failed with 1 errors and: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": stream error: stream ID 5; INTERNAL_ERROR; received from peer
2024/02/02 14:01:24 Failed to copyurl: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": stream error: stream ID 5; INTERNAL_ERROR; received from peer

The IP is not blocked, as I can get that file with curl and wget:

[ec2-user@ip-172-31-18-212 ~]$ wget https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip
--2024-02-02 14:02:13--  https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip
Resolving www.nass.usda.gov (www.nass.usda.gov)... 23.6.101.171, 2600:1409:9800:989::2938, 2600:1409:9800:985::2938
Connecting to www.nass.usda.gov (www.nass.usda.gov)|23.6.101.171|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2059094665 (1.9G) [application/zip]
Saving to: ‘2023_30m_cdls.zip’

2023_30m_cdls.zip                                0%[                                                                                                  ]       0  --.-KB/s           
<truncated>          
2023_30m_cdls.zip                              100%[=================================================================================================>]   1.92G  62.6MB/s    in 30s     

2024-02-02 14:02:46 (64.7 MB/s) - ‘2023_30m_cdls.zip’ saved [2059094665/2059094665]

I still suspect a bug in rclone; let me know if I can help triage this further.

The next thing to try would be attempting the transfer with

-vv --dump headers --retries 1 

And see what that says.

This is what I got:

[ec2-user@ip-172-31-25-145 ~]$ rclone copyurl https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip out -vv --dump headers --retries 1 
2024/02/04 10:20:06 DEBUG : rclone: Version "v1.65.2" starting with parameters ["rclone" "copyurl" "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip" "out" "-vv" "--dump" "headers" "--retries" "1"]
2024/02/04 10:20:06 DEBUG : Creating backend with remote "."
2024/02/04 10:20:06 NOTICE: Config file "/home/ec2-user/.config/rclone/rclone.conf" not found - using defaults
2024/02/04 10:20:06 DEBUG : fs cache: renaming cache item "." to be canonical "/home/ec2-user"
2024/02/04 10:20:06 DEBUG : You have specified to dump information. Please be noted that the Accept-Encoding as shown may not be correct in the request and the response may not show Content-Encoding if the go standard libraries auto gzip encoding was in effect. In this case the body of the request will be gunzipped before showing it.
2024/02/04 10:20:06 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/02/04 10:20:06 DEBUG : HTTP REQUEST (req 0xc00090d200)
2024/02/04 10:20:06 DEBUG : GET /Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip HTTP/1.1
Host: www.nass.usda.gov
User-Agent: rclone/v1.65.2
Accept-Encoding: gzip

2024/02/04 10:20:06 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/02/04 10:20:06 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/02/04 10:20:06 DEBUG : HTTP RESPONSE (req 0xc00090d200)
2024/02/04 10:20:06 DEBUG : Error: stream error: stream ID 1; INTERNAL_ERROR; received from peer
2024/02/04 10:20:06 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/02/04 10:20:06 ERROR : Attempt 1/1 failed with 1 errors and: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": stream error: stream ID 1; INTERNAL_ERROR; received from peer
2024/02/04 10:20:06 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Errors:                 1 (retrying may help)
Elapsed time:         0.0s

2024/02/04 10:20:06 DEBUG : 5 go routines active
2024/02/04 10:20:06 Failed to copyurl: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": stream error: stream ID 1; INTERNAL_ERROR; received from peer
[ec2-user@ip-172-31-25-145 ~]$ 

Hmm that was less enlightening than I hoped.

Can you try the same thing with

--disable-http2

Also?

And then try --dump bodies instead of --dump headers if that replicates the problem.

Thanks

First command (ctrl+c after 3 minutes):

[ec2-user@ip-172-31-23-19 ~]$ rclone copyurl https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip out -vv --dump headers --retries 1 --disable-http2
2024/02/05 09:23:31 DEBUG : rclone: Version "v1.65.2" starting with parameters ["rclone" "copyurl" "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip" "out" "-vv" "--dump" "headers" "--retries" "1" "--disable-http2"]
2024/02/05 09:23:31 DEBUG : Creating backend with remote "."
2024/02/05 09:23:31 NOTICE: Config file "/home/ec2-user/.config/rclone/rclone.conf" not found - using defaults
2024/02/05 09:23:31 DEBUG : fs cache: renaming cache item "." to be canonical "/home/ec2-user"
2024/02/05 09:23:31 DEBUG : You have specified to dump information. Please be noted that the Accept-Encoding as shown may not be correct in the request and the response may not show Content-Encoding if the go standard libraries auto gzip encoding was in effect. In this case the body of the request will be gunzipped before showing it.
2024/02/05 09:23:31 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/02/05 09:23:31 DEBUG : HTTP REQUEST (req 0xc00094d200)
2024/02/05 09:23:31 DEBUG : GET /Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip HTTP/1.1
Host: www.nass.usda.gov
User-Agent: rclone/v1.65.2
Accept-Encoding: gzip

2024/02/05 09:23:31 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/02/05 09:24:31 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       1m0.0s

2024/02/05 09:25:31 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       2m0.0s

2024/02/05 09:26:31 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       3m0.0s

The second command (this time I left it running for longer and it timed out):

[ec2-user@ip-172-31-23-19 ~]$ rclone copyurl https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip out -vv --dump bodies --retries 1 --disable-http2
2024/02/05 09:31:02 DEBUG : rclone: Version "v1.65.2" starting with parameters ["rclone" "copyurl" "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip" "out" "-vv" "--dump" "bodies" "--retries" "1" "--disable-http2"]
2024/02/05 09:31:02 DEBUG : Creating backend with remote "."
2024/02/05 09:31:02 NOTICE: Config file "/home/ec2-user/.config/rclone/rclone.conf" not found - using defaults
2024/02/05 09:31:02 DEBUG : fs cache: renaming cache item "." to be canonical "/home/ec2-user"
2024/02/05 09:31:02 DEBUG : You have specified to dump information. Please be noted that the Accept-Encoding as shown may not be correct in the request and the response may not show Content-Encoding if the go standard libraries auto gzip encoding was in effect. In this case the body of the request will be gunzipped before showing it.
2024/02/05 09:31:02 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/02/05 09:31:02 DEBUG : HTTP REQUEST (req 0xc00090b200)
2024/02/05 09:31:02 DEBUG : GET /Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip HTTP/1.1
Host: www.nass.usda.gov
User-Agent: rclone/v1.65.2
Accept-Encoding: gzip

2024/02/05 09:31:02 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/02/05 09:32:02 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       1m0.0s

2024/02/05 09:33:02 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       2m0.0s

2024/02/05 09:34:02 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       3m0.0s

2024/02/05 09:35:02 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       4m0.0s

2024/02/05 09:36:02 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       5m0.0s

2024/02/05 09:36:02 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/02/05 09:36:02 DEBUG : HTTP RESPONSE (req 0xc00090b200)
2024/02/05 09:36:02 DEBUG : Error: net/http: timeout awaiting response headers
2024/02/05 09:36:02 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/02/05 09:36:02 ERROR : Attempt 1/1 failed with 1 errors and: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": net/http: timeout awaiting response headers
2024/02/05 09:36:02 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Errors:                 1 (retrying may help)
Elapsed time:       5m0.0s

2024/02/05 09:36:02 DEBUG : 6 go routines active
2024/02/05 09:36:02 Failed to copyurl: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": net/http: timeout awaiting response headers

That's quite different behaviour - a timeout rather than an internal server error when you use HTTP1 rather than HTTP2.

Try the first of these again with

--user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"

And we can see if it does something different.

You can also try --timeout 15m to give it longer, but I doubt that will help.
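
The failure modes seen so far in this thread can be told apart mechanically when comparing runs. A minimal sketch, assuming a POSIX shell; classify_error and its labels are hypothetical, and the patterns are simply the error strings quoted in the logs above.

```shell
#!/bin/sh
# Hypothetical helper: map an rclone error line to a short label for the
# failure mode it suggests. Labels are made up for this illustration.
classify_error() {
    case $1 in
        *"stream error"*"INTERNAL_ERROR"*)     echo "http2-stream-error" ;;
        *"timeout awaiting response headers"*) echo "no-response-headers" ;;
        *"network is unreachable"*)            echo "no-route" ;;
        *"operation timed out"*)               echo "read-timeout" ;;
        *)                                     echo "unknown" ;;
    esac
}

# Example use on a saved log (the file name rclone.log is an assumption):
#   grep 'ERROR' rclone.log | while IFS= read -r line; do
#       classify_error "$line"
#   done | sort | uniq -c
```

With HTTP2 the runs above fall into the first class, and with --disable-http2 they fall into the second, which is the difference being pointed out here.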

Here it is:

[ec2-user@ip-172-31-23-19 ~]$ rclone copyurl https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip out -vv --dump headers --retries 1 --disable-http2 --user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
2024/02/05 15:09:34 DEBUG : rclone: Version "v1.65.2" starting with parameters ["rclone" "copyurl" "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip" "out" "-vv" "--dump" "headers" "--retries" "1" "--disable-http2" "--user-agent" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"]
2024/02/05 15:09:34 DEBUG : Creating backend with remote "."
2024/02/05 15:09:34 NOTICE: Config file "/home/ec2-user/.config/rclone/rclone.conf" not found - using defaults
2024/02/05 15:09:34 DEBUG : fs cache: renaming cache item "." to be canonical "/home/ec2-user"
2024/02/05 15:09:34 DEBUG : You have specified to dump information. Please be noted that the Accept-Encoding as shown may not be correct in the request and the response may not show Content-Encoding if the go standard libraries auto gzip encoding was in effect. In this case the body of the request will be gunzipped before showing it.
2024/02/05 15:09:34 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/02/05 15:09:34 DEBUG : HTTP REQUEST (req 0xc000949200)
2024/02/05 15:09:34 DEBUG : GET /Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip HTTP/1.1
Host: www.nass.usda.gov
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)
Accept-Encoding: gzip

2024/02/05 15:09:34 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/02/05 15:10:34 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       1m0.0s

2024/02/05 15:11:34 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       2m0.0s

2024/02/05 15:12:34 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       3m0.0s

2024/02/05 15:13:34 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       4m0.0s

2024/02/05 15:14:34 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       5m0.0s

2024/02/05 15:14:34 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/02/05 15:14:34 DEBUG : HTTP RESPONSE (req 0xc000949200)
2024/02/05 15:14:34 DEBUG : Error: net/http: timeout awaiting response headers
2024/02/05 15:14:34 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/02/05 15:14:34 ERROR : Attempt 1/1 failed with 1 errors and: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": net/http: timeout awaiting response headers
2024/02/05 15:14:34 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Errors:                 1 (retrying may help)
Elapsed time:       5m0.0s

2024/02/05 15:14:34 DEBUG : 6 go routines active
2024/02/05 15:14:34 Failed to copyurl: Get "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip": net/http: timeout awaiting response headers

Can you do

host www.nass.usda.gov

And check you get the same IPs on the failing machine and the one that works?

Maybe the DNS is messed up

You could check with netstat -tuanp that rclone is connecting to the correct IP. It might be something up with Go's DNS resolution.
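
To make the comparison between the two machines less error-prone, the addresses can be pulled out of the host output and diffed. A minimal sketch, assuming a POSIX shell with awk and sort available; resolved_ips is a hypothetical helper, and the file names in the usage comments are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: read `host` output on stdin and print only the
# resolved IPv4 and IPv6 addresses, one per line, sorted.
resolved_ips() {
    awk '/ has address / || / has IPv6 address / { print $NF }' | sort
}

# Usage (run the first line on each machine, then compare the two files):
#   host www.nass.usda.gov | resolved_ips > ips-working.txt
#   host www.nass.usda.gov | resolved_ips > ips-failing.txt
#   diff ips-working.txt ips-failing.txt
#
# To see which address rclone actually connected to while a transfer hangs:
#   sudo netstat -tuanp | grep rclone
```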

IP addresses are indeed different.

My machine (where rclone works):

📎 psarka@xps:~$ host www.nass.usda.gov
www.nass.usda.gov is an alias for www.nass.usda.gov.edgekey.net.
www.nass.usda.gov.edgekey.net is an alias for e10552.dscx.akamaiedge.net.
e10552.dscx.akamaiedge.net has address 23.197.137.116
e10552.dscx.akamaiedge.net has IPv6 address 2a02:26f0:3900:3af::2938
e10552.dscx.akamaiedge.net has IPv6 address 2a02:26f0:3900:3a4::2938

EC2 instance (where it does not):

[ec2-user@ip-172-31-23-19 ~]$ host www.nass.usda.gov
www.nass.usda.gov is an alias for www.nass.usda.gov.edgekey.net.
www.nass.usda.gov.edgekey.net is an alias for e10552.dscx.akamaiedge.net.
e10552.dscx.akamaiedge.net has address 23.6.101.171
e10552.dscx.akamaiedge.net has IPv6 address 2600:1409:9800:985::2938
e10552.dscx.akamaiedge.net has IPv6 address 2600:1409:9800:989::2938

Netstat confirms that rclone is connecting to the address returned by host:

[ec2-user@ip-172-31-23-19 ~]$ sudo netstat -tuanp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      2162/sshd: /usr/sbi 
tcp        0    540 172.31.23.19:22         18.237.140.164:31818    ESTABLISHED 68019/sshd: ec2-use 
tcp        0      0 172.31.23.19:33842      52.119.167.123:443      ESTABLISHED 2157/amazon-ssm-age 
tcp        0      0 172.31.23.19:22         138.2.234.220:48850     TIME_WAIT   -                   
tcp        0      0 172.31.23.19:58744      23.6.101.171:443        ESTABLISHED 69516/rclone        
tcp6       0      0 :::22                   :::*                    LISTEN      2162/sshd: /usr/sbi 
udp        0      0 127.0.0.1:323           0.0.0.0:*                           2191/chronyd        
udp        0      0 172.31.23.19:68         0.0.0.0:*                           1966/systemd-networ 
udp6       0      0 ::1:323                 :::*                                2191/chronyd        
udp6       0      0 fe80::17:97ff:fe61::546 :::*                                1966/systemd-networ 

wget is using the same IP and downloading successfully:

[ec2-user@ip-172-31-23-19 ~]$ wget -vv https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip 
--2024-02-06 10:17:04--  https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2023_30m_cdls.zip
Resolving www.nass.usda.gov (www.nass.usda.gov)... 23.6.101.171, 2600:1409:9800:985::2938, 2600:1409:9800:989::2938
Connecting to www.nass.usda.gov (www.nass.usda.gov)|23.6.101.171|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2059094665 (1.9G) [application/zip]
Saving to: ‘2023_30m_cdls.zip’

2023_30m_cdls.zip                                     0%[                                                                                                                   ]       0  --.-KB/s        
2023_30m_cdls.zip                                     1%[>                                                                                                                  ]  20.71M   103MB/s        
2023_30m_cdls.zip                                     2%[=>                                                                                                                 ]  46.19M   115MB/s        
2023_30m_cdls.zip                                     3%[===>                                                                                                               ]  74.33M   123MB/s        
2023_30m_cdls.zip                                     4%[====>                                                                                                              ]  97.79M   122MB/s        
2023_30m_cdls.zip                                     6%[======>                                                                                                            ] 121.16M   121MB/s        

^C