Corrupt files when transferring from FTP to local/S3, only sometimes

What is the problem you are having with rclone?

I am having -seemingly- random problems with transferring files from an FTP server to an S3 bucket. I saw seemingly because I cannot reproduce them effectively, which makes me feel it is a server problem and not an Rclone problem.

The logs that are presented below are part of some experimentation I did yesterday, where I transferred 20 files & their .md5 counterparts from the FTP server to S3. After, I compared the content of the .md5 file to the ETag and in all 8 runs, between 16-18 files were getting corrupt. Gunzip-ing them threw an error, etc.

However, today I retried the test but transferring to local and all files were transferring correctly. After thinking for a bit that the corruption was happening when streaming to S3, I re-tested the FTP -> S3 run and files were arriving proper.

I believe that the issue is the 450 errors I am having. Opinions? Ways to solve this issue? Rclone ends up copying the files, so should I manually check the checksums and delete corrupt -> retry the copy?

Run the command 'rclone version' and share the full output of the command.

rclone v1.62.2

  • os/version: darwin 13.3.1 (64 bit)

  • os/kernel: 22.4.0 (arm64)

  • os/type: darwin

  • os/arch: arm64 (ARMv8 compatible)

  • go/version: go1.20.2

  • go/linking: dynamic

  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

AWS S3

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copy pubmed-ftp:pubmed/baseline aws-s3:technology-experiments/temp/test-rclone-cli --include "*23n{101,056}*.xml*" -vv

The rclone config contents with secrets removed.

[pubmed-ftp]
type = ftp
host = ftp.ncbi.nlm.nih.gov
pass = <obscured-email>
user = anonymous

[aws-s3]
type = s3
provider = AWS
env_auth = true
region = eu-west-1
location_constraint = eu-west-1

A log from the command with the -vv flag

I will cut and paste select items from yesterday's log since it ended up huge. Today's log was very clean, if needed let me know and I'll paste it in a reply.

It begins with this

2023/06/14 13:24:26 DEBUG : rclone: Version "v1.62.2" starting with parameters ["rclone" "copy" "pubmed-ftp:pubmed/baseline" "aws-s3:technology-experiments/temp/test-rclone-cli-3" "--include" "*23n{101,056}*.xml*" "-vv" "--log-file=test-rclone-cli-3.txt"]
2023/06/14 13:24:26 DEBUG : Creating backend with remote "pubmed-ftp:pubmed/baseline"
2023/06/14 13:24:26 DEBUG : Using config file from "/Users/<user>/.config/rclone/rclone.conf"
2023/06/14 13:24:26 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: Connecting to FTP server
2023/06/14 13:24:26 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","ftp.ncbi.nlm.nih.gov:21")
2023/06/14 13:24:27 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:24:28 DEBUG : Creating backend with remote "aws-s3:technology-experiments/temp/test-rclone-cli-3"
2023/06/14 13:24:28 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41e:250::13]:50262")
2023/06/14 13:24:28 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:24:31 DEBUG : pubmed23n0003.xml.gz.md5: Excluded
2023/06/14 13:24:31 DEBUG : pubmed23n0003.xml.gz: Excluded
2023/06/14 13:24:31 DEBUG : pubmed23n0004.xml.gz.md5: Excluded
2023/06/14 13:24:31 DEBUG : pubmed23n0004.xml.gz: Excluded

It continues with plenty of exclusions, until

2023/06/14 13:24:31 DEBUG : pubmed23n1165.xml.gz: Excluded
2023/06/14 13:24:31 DEBUG : pubmed23n1165.xml.gz.md5: Excluded
2023/06/14 13:24:31 DEBUG : README.txt: Excluded
2023/06/14 13:24:31 DEBUG : S3 bucket technology-experiments path temp/test-rclone-cli-3: Waiting for checks to finish
2023/06/14 13:24:31 DEBUG : S3 bucket technology-experiments path temp/test-rclone-cli-3: Waiting for transfers to finish
2023/06/14 13:24:31 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: Connecting to FTP server
2023/06/14 13:24:31 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: Connecting to FTP server
2023/06/14 13:24:31 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: Connecting to FTP server
2023/06/14 13:24:31 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","ftp.ncbi.nlm.nih.gov:21")
2023/06/14 13:24:31 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","ftp.ncbi.nlm.nih.gov:21")
2023/06/14 13:24:31 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","ftp.ncbi.nlm.nih.gov:21")
2023/06/14 13:24:31 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41e:250::13]:50032")
2023/06/14 13:24:31 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:24:31 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:24:31 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:24:31 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:24:32 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41f:250::230]:50399")
2023/06/14 13:24:32 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:24:32 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41f:250::230]:50003")
2023/06/14 13:24:32 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41f:250::230]:50399")
2023/06/14 13:24:32 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:24:32 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:24:33 INFO  : pubmed23n0560.xml.gz.md5: Copied (new)
2023/06/14 13:24:33 INFO  : pubmed23n0561.xml.gz.md5: Copied (new)
2023/06/14 13:24:33 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41f:250::230]:50421")
2023/06/14 13:24:33 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41f:250::230]:50368")
2023/06/14 13:24:33 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:24:33 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:24:33 INFO  : pubmed23n0562.xml.gz.md5: Copied (new)
2023/06/14 13:24:33 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41f:250::230]:50442")
2023/06/14 13:24:33 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:24:51 INFO  : pubmed23n0560.xml.gz: Copied (new)
2023/06/14 13:24:51 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41e:250::13]:50189")
2023/06/14 13:24:51 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:24:52 INFO  : pubmed23n0563.xml.gz.md5: Copied (new)
2023/06/14 13:24:52 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41e:250::13]:50322")
2023/06/14 13:24:52 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:25:18 INFO  : pubmed23n0564.xml.gz: Copied (new)
2023/06/14 13:25:18 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41e:250::13]:50456")
2023/06/14 13:25:18 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:25:18 INFO  : pubmed23n0564.xml.gz.md5: Copied (new)
2023/06/14 13:25:18 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41e:250::13]:50195")
2023/06/14 13:25:19 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:25:28 INFO  : 
Transferred:   	  135.096 MiB / 913.124 MiB, 15%, 2.189 MiB/s, ETA 5m55s
Transferred:            7 / 40, 18%
Elapsed time:       1m1.9s
Transferring:
 *                          pubmed23n0561.xml.gz: 93% /28.855Mi, 429.869Ki/s, 4s
 *                          pubmed23n0562.xml.gz: 76% /31.347Mi, 433.851Ki/s, 17s
 *                          pubmed23n0563.xml.gz: 51% /31.125Mi, 327.070Ki/s, 47s
 *                          pubmed23n0565.xml.gz: 32% /27.554Mi, 955.955Ki/s, 19s

2023/06/14 13:25:39 ERROR : pubmed23n0561.xml.gz: Failed to copy: 1 error occurred:
	* 450 Transfer aborted. Link to file server lost

2023/06/14 13:25:39 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: Connecting to FTP server
2023/06/14 13:25:39 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","ftp.ncbi.nlm.nih.gov:21")
2023/06/14 13:25:40 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:25:41 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41e:250::13]:50325")
2023/06/14 13:25:41 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:25:41 INFO  : pubmed23n0565.xml.gz.md5: Copied (new)
2023/06/14 13:25:41 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41e:250::13]:50100")
2023/06/14 13:25:41 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:25:42 ERROR : pubmed23n0562.xml.gz: Failed to copy: 1 error occurred:
	* 450 Transfer aborted. Link to file server lost

2023/06/14 13:25:42 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: Connecting to FTP server
2023/06/14 13:25:42 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","ftp.ncbi.nlm.nih.gov:21")
2023/06/14 13:25:42 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:25:43 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41e:250::13]:50066")
2023/06/14 13:25:43 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:25:44 INFO  : pubmed23n0566.xml.gz.md5: Copied (new)
2023/06/14 13:25:44 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41e:250::13]:50478")
2023/06/14 13:25:44 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:25:46 INFO  : pubmed23n0565.xml.gz: Copied (new)
2023/06/14 13:25:46 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41e:250::13]:50122")
2023/06/14 13:25:46 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:25:46 INFO  : pubmed23n0567.xml.gz.md5: Copied (new)
2023/06/14 13:25:46 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: dial("tcp","[2607:f220:41e:250::13]:50280")
2023/06/14 13:25:46 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/14 13:25:53 ERROR : pubmed23n0563.xml.gz: Failed to copy: 1 error occurred:
	* 450 Transfer aborted. Link to file server lost

This keeps going and finally, it ends with this

2023/06/14 13:32:17 DEBUG : pubmed23n1165.xml.gz: Excluded
2023/06/14 13:32:17 DEBUG : pubmed23n1165.xml.gz.md5: Excluded
2023/06/14 13:32:17 DEBUG : README.txt: Excluded
2023/06/14 13:32:17 DEBUG : S3 bucket technology-experiments path temp/test-rclone-cli-3: Waiting for checks to finish
2023/06/14 13:32:17 DEBUG : pubmed23n0561.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:17 DEBUG : pubmed23n0561.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:17 DEBUG : pubmed23n0562.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:17 DEBUG : pubmed23n0562.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:17 DEBUG : pubmed23n0562.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:17 DEBUG : pubmed23n0562.xml.gz: Unchanged skipping
2023/06/14 13:32:17 DEBUG : pubmed23n0563.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:17 DEBUG : pubmed23n0563.xml.gz: Unchanged skipping
2023/06/14 13:32:17 DEBUG : pubmed23n0560.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:17 DEBUG : pubmed23n0560.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:17 DEBUG : pubmed23n0563.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:17 DEBUG : pubmed23n0563.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:17 DEBUG : pubmed23n0560.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:17 DEBUG : pubmed23n0560.xml.gz: Unchanged skipping
2023/06/14 13:32:17 DEBUG : pubmed23n0561.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:17 DEBUG : pubmed23n0561.xml.gz: Unchanged skipping
2023/06/14 13:32:17 DEBUG : pubmed23n0565.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:17 DEBUG : pubmed23n0565.xml.gz: Unchanged skipping
2023/06/14 13:32:17 DEBUG : pubmed23n0564.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:17 DEBUG : pubmed23n0564.xml.gz: Unchanged skipping
2023/06/14 13:32:17 DEBUG : pubmed23n0564.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:17 DEBUG : pubmed23n0564.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n0566.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n0566.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n0565.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n0565.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n0568.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n0568.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n0566.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n0566.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n0567.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n0567.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n0568.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n0568.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n0569.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n0569.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n0567.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n0567.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n0569.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n0569.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1010.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1010.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1011.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1011.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1010.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1010.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1012.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1012.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1011.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1011.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1012.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1012.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1013.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1013.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1013.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1013.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1014.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1014.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1015.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1015.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1014.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1014.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1016.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1016.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1015.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1015.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1016.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1016.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1017.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1017.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1017.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1017.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1018.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1018.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1018.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1018.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1019.xml.gz.md5: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1019.xml.gz.md5: Unchanged skipping
2023/06/14 13:32:18 DEBUG : pubmed23n1019.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/14 13:32:18 DEBUG : pubmed23n1019.xml.gz: Unchanged skipping
2023/06/14 13:32:18 DEBUG : S3 bucket technology-experiments path temp/test-rclone-cli-3: Waiting for transfers to finish
2023/06/14 13:32:18 ERROR : Attempt 2/3 succeeded
2023/06/14 13:32:18 INFO  : 
Transferred:   	 1007.072 MiB / 1007.072 MiB, 100%, 1.506 MiB/s, ETA 0s
Checks:                40 / 40, 100%
Transferred:           23 / 23, 100%
Elapsed time:      7m51.3s

2023/06/14 13:32:18 DEBUG : 27 go routines active
2023/06/14 13:32:18 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: closing 1 unused connections

maybe this will spark an idea.
https://forum.rclone.org/t/rclone-ftps-explicit-copy-failed-to-copy-update-stor-450-transfer-aborted-link-to-file-server-lost/24721

perhaps use rclone sync

Most likely.

When the file transfer is aborted rclone should notice and delete the partially uploaded file.

Do you think this isn't happening? Are the uploaded files too small?

The other thing that might be happing is if rclone gets an error while download a file it will retry by opening the file and skipping a certain number of bytes. It is possible this isn't working properly on the FTP server you are using.

So are the corrupted files the correct length?

Can you link the corrupted files with the 450 Transfer aborted error? This is quite difficult unless you set --transfers 1

Can you post a link to a full log with -vv --dump headers

Can you share an example of a corrupted file and the original uncorrupted one - that would be very helpful.

1 Like

Hello Nick, thanks for the response!

Rclone unfortunately does not delete the corrupt files, instead it keeps moving them from the ftp server to the s3 bucket.

Files are about 30-60MB large. Corrupted file size ends up exactly the same, I guess that's why Rclone doesn't remove them.

As far as I can tell from my logs and the checks I'm running, a 450 error will result in a corrupted file but not all corrupted files show up as 450 errors.

Another interesting point is that the problem appears when directly moving stuff from FTP to S3. When I move from FTP to local, Rclone works as intended. When moving from local to S3, again no problem although I only have tried 2 testing runs.

What I think is causing the issue is the added latency from reading from FTP and writing to S3, or something along these lines. Worst case scenario, I split the transfer into 2 parts and since this will run from inside an EC2 instance, it will be practically the same. Best case, the rapid speed transfer from EC2 to S3 will remove the issue by itself.

As for an example of a corrupted and uncorrupted file, I can tell you that these are gziped xml files and gunzip cannot open the corrupted ones. However, I can read them with bufio and compress/gzip and here is the point of corruption, where the XML just goes crazy after line 5159247.

5159236         </Reference>
5159237         <Reference>
5159238           <Citation>Molecules. 2020 Sep 09;25(18):</Citation>
5159239           <ArticleIdList>
5159240             <ArticleId IdType="pubmed">32916992</ArticleId>
5159241           </ArticleIdList>
5159242         </Reference>
5159243         <Reference>
5159244           <Citation>Chem Biol Interact. 2007 Jan 30;165(2):106-16</Citation>
5159245           <ArticleIdList>
5159246             <ArticleId IdType="pubmed">1718867     <Reference>
5159247           <Citation>Hepatology. 2007 Sep;46(3):823-30</Citation>
5159248           <ArticleIdLiC     >
5159249    4dwar e>
5159250  Pt>
5159251             8tol. 2019 Jan;sIdLi7 Jane>
5159252         <Reference>
5159253           <Citation>Mo</QualeUI=;p(2):106-105e>
5159254 ce>
5159255 tespirence>
5159256   a<LastName>Bhattacharh <Art86<   an anatitmon>
5159257   ;rticle;Krnln, t 20H-58, Haridwar 249 405, Uttarakhand, India.</Affil>
5159258   -2019 Jan;sIdLi7 Jane>
5159259         <Reference>
5159260           <Citation>Mo</QualeUI=;p(2):4
5159261   a<Las</RefR          <ArticleIdList>
5159262             <ArticleId IdType="pubmed">24949y>30</Day>
5159263       </DateCompleted>
5159264       <DateRevised>
5159265         <Year>2021</Yea       </AS8c9Journal>
5159266           <ISSN IssnType="Electronic">1420-3049</ISSN>
5159267           <Journa>)-105eiIdList>
5159268         </Reference>
5159269         <Reference>
5159270           <Citation>BMB Rep.  i>
5159271           <Citation>Molecules. 2020 Sep 09;25(18):</Citation>
5159272           <ArtCt30266an;sId</Initifarrti2   , 3oiQualifieier444   0    tati6n aw      </Aeference>
5159273         <Reference>
5159274           <Citatie="Electronic">1420-3049</ISSN>
5159275           <Journa>)-105eiIdList>
5159276         </Reference>
5159277         <Reference>
5159278        2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2     <48 6an;sId</Initifarrti2   , 3oiQualifieier444   0    tati6n aw      </Aeference>
5159279         <Reference>
5159280           e4 2 2 nocriptorName>
5159281   eId    l/Aefelce>
5159282  3elycerol</p5v5on of anaesthesia, attempted to: (1) clarify the evidence for Russell and Bur8>
5159283    atio600e5a7 <RefeX;leIdList>
5159284                    <ArticleId IdType="pubmed">26490400</Art    ,ualifieier444   0    tati6n aw   nO rn of anaesthesia, attempted to: 3     cleName>Joshi</LastNa01vyrrticl   <LastNa2</ArticleId>
5159285   040 </Rep.      0g</RefeGsue>100aR25 <Ci5Srust,jali Research Foundation Trust, NH-58, Haridwar 249 405, Uttarakhanilieeeame U   ,uM1n      59x58v6eameOfSubsanaesthesalifi <DescriptorNaptorNaptorferenceMr2):1atliationInfo>
5159286     onists;37(3):3187 Survival</DescriptorName>
5159287           <QualifierName UI="Q000187" MajorTopia.3390srts;37adiSpp5e e823-</Artctros-6   CjorTo rn of anaem l, 3arakhanilieeeame U   ,uM1n      59x58v6eameOfSubsanaesthesalifi <Descrticn

Yes it sounds like the corruption is in the file stream but leaves the file the same size.

Unfortunately FTP doesn't have checksums so we can't detect the corruption except by the file being the wrong size.

How sure are you that there are corrupted files without 450 errors? It would be nice to think that the 450 errors and the subsequent re-open of the file were the cause of the problem.

Yes I'd agree with you that it is probably the extra latency which is causing the 450 errors. Rclone streams the files so it will couple the upload to s3 with the download from ftp.

I'd be very surprised if there was any problem from local to s3 as this is error checked in multiple different ways.

I just noticed - this error code is wrong... Transfer aborted should be 426 and rclone would be retrying that error code.

* 450 Transfer aborted. Link to file server lost

I can make the server give these 450 errors quite easily like this

rclone -vv copy --retries 1 --multi-thread-cutoff 10M pubmed-ftp:pubmed/baseline/pubmed23n0561.xml.gz .

Which attempts to download the file using 2 download threads - it really doesn't like that!

So I suspect the 450 errors are something to do with too many connections

I tried your exact command line with the latest beta and I get slightly different errors

2023/06/20 12:23:05 DEBUG : pacer: low level retry 1/1 (error RequestError: send request failed
caused by: Put "https://rclone-pubmed-issue.s3.eu-west-2.amazonaws.com/pubmed23n0566.xml.gz": net/http: HTTP/1.x transport connection broken: http: ContentLength=32835909 with Body length 38997941)

And that looks like the FTP server stopping sending us data before the file finished. The file seems to go through on a retry though.

Do you see these errors in your log?

Maybe you could give the latest beta a try?

1 Like

I've managed to replicate this locally.

It looks like data is being repeated in the corrupted files.

  • 0 is the hex dump of the original
  • 1 is the hex dump of the corrupted version
$ grep ZUBW 0 1

0:02022a70: d5 5a 55 42 57 ad 40 61 b9 b2 c7 2f 8c d1 56 09  .ZUBW.@a.../..V.

1:02022a70: d5 5a 55 42 57 ad 40 61 b9 b2 c7 2f 8c d1 56 09  .ZUBW.@a.../..V.
1:02269b80: 9d 6c 51 f2 7a d2 e0 66 d5 5a 55 42 57 ad 40 61  .lQ.z..f.ZUBW.@a
1:024b6530: d5 5a 55 42 57 ad 40 61 b9 b2 c7 2f 8c d1 56 09  .ZUBW.@a.../..V.
1:02702ed0: 9d 6c 51 f2 7a d2 e0 66 d5 5a 55 42 57 ad 40 61  .lQ.z..f.ZUBW.@a
1:0294f880: d5 5a 55 42 57 ad 40 61 b9 b2 c7 2f 8c d1 56 09  .ZUBW.@a.../..V.
1:02b7c510: 9d 6c 51 f2 7a d2 e0 66 d5 5a 55 42 57 ad 40 61  .lQ.z..f.ZUBW.@a
1:02dc8ec0: d5 5a 55 42 57 ad 40 61 b9 b2 c7 2f 8c d1 56 09  .ZUBW.@a.../..V.

I found the target strings with

strings pubmed-ftp-copy/pubmed23n1012.xml.gz | sort | uniq -c | sort -n > 0strings-count
strings pubmed-s3-copy/pubmed23n1012.xml.gz | sort | uniq -c | sort -n > 1strings-count

And made the hex dumps with

xxd -g1 pubmed-ftp-copy/pubmed23n1012.xml.gz > 0
xxd -g1 pubmed-s3-copy/pubmed23n1012.xml.gz > 1

I checked the MD5 digests and the files which went ftp -> local were all ok, but the files which went s3 -> ftp -> local had corruptions which is all as you stated above.

Here is the log for the corrupted transfer

$ grep -P 'pubmed23n1012\.xml\.gz[^.]' pubmed.log
2023/06/20 12:19:46 DEBUG : pubmed23n1012.xml.gz: Need to transfer - File not found at Destination
 *                          pubmed23n1012.xml.gz: 18% /59.254Mi, 573.104Ki/s, 1m25s
 *                          pubmed23n1012.xml.gz: 77% /59.254Mi, 602.150Ki/s, 22s
caused by: Put "https://rclone-pubmed-issue.s3.eu-west-2.amazonaws.com/pubmed23n1012.xml.gz": net/http: HTTP/1.x transport connection broken: http: ContentLength=62131835 with Body length 68674067)
2023/06/20 12:25:09 DEBUG : pubmed23n1012.xml.gz: Received error: RequestError: send request failed
caused by: Put "https://rclone-pubmed-issue.s3.eu-west-2.amazonaws.com/pubmed23n1012.xml.gz": net/http: HTTP/1.x transport connection broken: http: ContentLength=62131835 with Body length 68674067 - low level retry 1/10
 *                          pubmed23n1012.xml.gz: 27% /59.254Mi, 482.706Ki/s, 1m30s
 *                          pubmed23n1012.xml.gz: 78% /59.254Mi, 496.474Ki/s, 26s
caused by: Put "https://rclone-pubmed-issue.s3.eu-west-2.amazonaws.com/pubmed23n1012.xml.gz": net/http: HTTP/1.x transport connection broken: http: ContentLength=62131835 with Body length 69025931)
2023/06/20 12:27:09 DEBUG : pubmed23n1012.xml.gz: Received error: RequestError: send request failed
caused by: Put "https://rclone-pubmed-issue.s3.eu-west-2.amazonaws.com/pubmed23n1012.xml.gz": net/http: HTTP/1.x transport connection broken: http: ContentLength=62131835 with Body length 69025931 - low level retry 2/10
 *                          pubmed23n1012.xml.gz: 26% /59.254Mi, 477.992Ki/s, 1m33s
 *                          pubmed23n1012.xml.gz: 76% /59.254Mi, 500.632Ki/s, 28s
caused by: Put "https://rclone-pubmed-issue.s3.eu-west-2.amazonaws.com/pubmed23n1012.xml.gz": net/http: HTTP/1.x transport connection broken: http: ContentLength=62131835 with Body length 74868579)
2023/06/20 12:29:13 DEBUG : pubmed23n1012.xml.gz: Received error: RequestError: send request failed
caused by: Put "https://rclone-pubmed-issue.s3.eu-west-2.amazonaws.com/pubmed23n1012.xml.gz": net/http: HTTP/1.x transport connection broken: http: ContentLength=62131835 with Body length 74868579 - low level retry 3/10
 *                          pubmed23n1012.xml.gz: 28% /59.254Mi, 557.217Ki/s, 1m17s
 *                          pubmed23n1012.xml.gz: 84% /59.254Mi, 556.091Ki/s, 17s
2023/06/20 12:31:02 INFO  : pubmed23n1012.xml.gz: Copied (new)
2023/06/20 12:41:40 DEBUG : pubmed23n1012.xml.gz: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/06/20 12:41:40 DEBUG : pubmed23n1012.xml.gz: Unchanged skipping

So it looks like things went wrong with the server quite a few times whereas copying it straight to local went hitch free.

I'll investigate further...

Wow, first of all thanks for all the effort and the thoughtful replies.

I'm trying myself with subsets of the data and testing --transfers values of 1, 2, 3 to see if the issue, which is probably FTP server side, is solved if I just reduce the number of parallel connections.

I'm not very knowledgable on networking and network protocols, do you think Rclone should have behaved differently in this case? Should it have aborted the transfers with errors altogether?

The following command ended up downloading 30 files & their md5 counterparts without any issues at all. The log file didn't have any error messages at all and the checksums ended up matching perfectly. Will continue the tests and post the results.

rclone copy pubmed-ftp:pubmed/baseline aws-s3:technology-experiments/temp/transfer1 --include "*23n{101,056,022}*.xml*" -v --transfers 1 --log-file=transfer1.txt

Sure enough, with --transfers 2 we get 450s. The following is the result of running

cat transfer2.txt | grep -C 1 ERROR
2023/06/20 17:12:13 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/20 17:12:24 ERROR : pubmed23n0225.xml.gz: Failed to copy: 1 error occurred:
	* 450 Transfer aborted. Link to file server lost
--
2023/06/20 17:30:14 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/20 17:30:37 ERROR : pubmed23n0568.xml.gz: Failed to copy: 1 error occurred:
	* 450 Transfer aborted. Link to file server lost
--

2023/06/20 17:32:07 ERROR : pubmed23n1010.xml.gz: Failed to copy: 1 error occurred:
	* 450 Transfer aborted. Link to file server lost
--
2023/06/20 17:33:10 DEBUG : ftp://ftp.ncbi.nlm.nih.gov:21/pubmed/baseline: > dial: conn=*fshttp.timeoutConn, err=<nil>
2023/06/20 17:33:34 ERROR : pubmed23n1013.xml.gz: Failed to copy: 1 error occurred:
	* 450 Transfer aborted. Link to file server lost
--

2023/06/20 17:35:10 ERROR : pubmed23n1016.xml.gz: Failed to copy: 1 error occurred:
	* 450 Transfer aborted. Link to file server lost
--
2023/06/20 17:36:12 INFO  : pubmed23n1018.xml.gz: Copied (new)
2023/06/20 17:36:12 ERROR : Attempt 1/3 failed with 5 errors and: 1 error occurred:
	* 450 Transfer aborted. Link to file server lost
--
2023/06/20 17:36:15 DEBUG : S3 bucket technology-experiments path temp/transfer2: Waiting for transfers to finish
2023/06/20 17:36:15 ERROR : Attempt 2/3 succeeded
2023/06/20 17:36:15 INFO  : 

The reason I said previously that 450s corrupt files but not all corrupt files are 450s, is that when checking the results of this run, I got 8 corrupt files while only 5 in the logs. It could be a logic bug with my code, or it could be a currupt .md5 file while the data file is correct (this one is highly unlikely).

Good thinking with --transfers 1. I suspect the server is behind some firewall or proxy which is closing the connections when they are idle and doing a bit of a messy job of it.

When I do a transfer from ftp -> local with --bwlimit 1M I see the same issues so we can rule s3 out of this equation.

This gives

ERROR : pubmed23n0562.xml.gz.hurohug1.partial: corrupted on transfer: sizes differ 32869870 vs 35247486

and

pubmed23n1012.xml.gz: Received error: 1 error occurred:

  • EOF

Which I think are different manifestation of the same problem.

The transfer from ftp -> local with bandwidth limit did not produce any corrupted files though - the local backend noticed every aborted transfer so there is something which could be improved here

1 Like

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.