AWS S3 to DO Spaces


#1

Hi, I’m the author of https://natalian.org/2018/01/27/S3_versus_Spaces/ and I’m simply trying to mirror my authoritative S3 bucket to DigitalOcean’s “Space” with rclone sync --fast-list -v s3:s.natalian.org spaces:natalian.

It’s never completed without error. After an hour it usually exits like so:

2018/02/28 14:50:29 INFO  : S3 bucket natalian: Waiting for checks to finish
2018/02/28 14:50:30 INFO  : S3 bucket natalian: Waiting for transfers to finish
2018/02/28 14:50:30 ERROR : S3 bucket natalian: not deleting files as there were IO errors
2018/02/28 14:50:30 ERROR : S3 bucket natalian: not deleting directories as there were IO errors
2018/02/28 14:50:30 ERROR : Attempt 3/3 failed with 1 errors and: corrupted on transfer: sizes differ 361475 vs 5487094
2018/02/28 14:50:30 Failed to sync: corrupted on transfer: sizes differ 361475 vs 5487094

Any ideas how to proceed? I’ve retried the command about 10 times over the space of two weeks, and I can’t tell whether each run is making progress through the issues.

I’m an Arch Linux user.

rclone v1.39
- os/arch: linux/amd64
- go version: go1.9.2

Once the mirror is complete, I will hopefully set up a service to run rclone based on a YYYY-MM prefix every day, to make sure the mirror is good.
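A rough sketch of what that daily, prefix-limited run might look like, in Python. It assumes the objects live under YYYY-MM-DD/ directories (as the paths in the logs in this thread suggest) and that an rclone --include filter is an acceptable way to express the "YYYY-MM prefix". The function name monthly_sync_command is made up for illustration, and the function only builds the command, it does not run it.

```python
from datetime import date


def monthly_sync_command(day, source, dest):
    """Build an rclone sync command limited to the month that `day` falls in.

    Assumption: keys look like 2018-03-09/file.png, so the rooted filter
    "/2018-03-*/**" selects everything for March 2018.
    """
    prefix = day.strftime("%Y-%m")
    return [
        "rclone", "sync", "--fast-list",
        "--include", f"/{prefix}-*/**",
        source, dest,
    ]


cmd = monthly_sync_command(date(2018, 3, 9), "s3:s.natalian.org", "spaces:natalian")
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run from whatever service or timer runs the job each day.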

Since I’m having so much difficulty mirroring my data, I’m wondering how other businesses back up their data between cloud services!


#2

If you could make a log with -vv and email it to me at nick@craig-wood.com then I’ll take a look. Please put a link to this thread in the email so I can keep it straight!

Thanks


#3

Thanks for the log!

The problem file is this one

2018/03/01 17:34:18 ERROR : 2013-07-30/encode.log.gz: corrupted on transfer: sizes differ 361475 vs 5487094

Is this file being written to during the transfer?

Another thing you could try is --no-gzip-encoding - that might help if for some reason DO is ungzipping the file on the fly.
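To illustrate why transparent un-gzipping would produce exactly this kind of error, here is a small Python sketch. It only demonstrates the hypothesis above (that something in the path decompresses the .gz object on the fly); it does not show what S3, DO or rclone actually do internally. The sample log text and the function name transfer_sizes are invented.

```python
import gzip


def transfer_sizes(payload):
    """Return (stored_size, delivered_size) for a gzipped object that is
    decompressed somewhere between the source and the destination."""
    stored = gzip.compress(payload)       # what the bucket reports as the size
    delivered = gzip.decompress(stored)   # what arrives if it is un-gzipped in flight
    return len(stored), len(delivered)


# Repetitive text, like an encoder log, compresses very well.
log_text = b"frame=  100 fps= 25 q=28.0 size=    1024kB\n" * 50_000
stored_size, delivered_size = transfer_sizes(log_text)
print(f"sizes differ {stored_size} vs {delivered_size}")
```

A size check that compares the listed size with the number of bytes actually transferred would then fail, even though no data was lost.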


#4

--no-gzip-encoding did appear to make it complete. Before there was no “2013-07-30/encode.log.gz” on the destination btw.

Though… is this a successful completion of an rclone invocation?

2018/03/09 09:57:18 INFO  : Waiting for deletions to finish
2018/03/09 09:57:18 INFO  : 2018-01-27/signup.png: Deleted
2018/03/09 09:57:18 INFO  : 2018-01-27/: Deleted
2018/03/09 09:57:18 INFO  :
Transferred:   84.434 MBytes (137.833 kBytes/s)
Errors:                 0
Checks:             43150
Transferred:           42
Elapsed time:    10m27.2s

2018/03/09 09:57:18 DEBUG : Go routines at exit 16
2018/03/09 09:57:18 DEBUG : rclone: Version "v1.39" finishing with parameters ["rclone" "-vv" "--log-file" "rclone.log" "sync" "--no-gzip-encoding" "--fast-list" "-v" "s3:s.natalian.org" "spaces:natalian"]

I didn’t expect any deletions. Furthermore, exit 16?

Exit 16 is not present in the list of exit codes in the man page on my Arch Linux system!


#5

Yes, that is a successful completion :slight_smile:

DO put some files in your first bucket for you - rclone deleted those as they didn’t exist on the source and you used sync. Use copy if you don’t want that.
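The difference between the two commands can be sketched with a toy model in Python. This is not rclone's code: it compares file names only and ignores files that exist on both sides but differ. The function name plan is hypothetical, and the file names are borrowed from the logs in this thread.

```python
def plan(source, dest, mode):
    """Return (to_transfer, to_delete) for a 'sync' or 'copy' of file names."""
    if mode not in ("sync", "copy"):
        raise ValueError("mode must be 'sync' or 'copy'")
    to_transfer = sorted(source - dest)
    # Only sync makes the destination match the source by deleting extras.
    to_delete = sorted(dest - source) if mode == "sync" else []
    return to_transfer, to_delete


source = {"2013-07-30/encode.log.gz"}
dest = {"2018-01-27/signup.png"}  # present on the destination only

print(plan(source, dest, "sync"))
print(plan(source, dest, "copy"))
```

With sync the destination-only file ends up in the delete list; with copy it is left alone.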

“Go routines at exit 16” is only a DEBUG message, not an exit code: it means “16 go routines at exit”. You won’t see it with -v.

I’ll re-word it since you aren’t the first person to have misread it.


#6

I was trying to replicate this problem to report it to DigitalOcean but I couldn’t… What is in that 2013-07-30/encode.log.gz file? Any chance you could email it to me at nick@craig-wood.com so I can experiment further?


#7

Sorry for the belated reply. I sent you the link via email Nick!

I do have two more questions that might warrant new threads, but here goes:

  1. Can --no-gzip-encoding be set in ~/.config/rclone/rclone.conf?
  2. How do people keep two buckets in sync? A daily cron job, or is there a smarter way, such as triggering copies on events?

#8

Thanks.

I don’t have any problems uploading that file, and it looks like you didn’t either in your no-gzip log. :frowning:

No, --no-gzip-encoding can’t be set in the config file, but you can set an environment variable instead, e.g. export RCLONE_NO_GZIP_ENCODING=true
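If the job is launched from a script rather than an interactive shell, the same variable can be passed in the child process environment. A minimal Python sketch, where env_with_no_gzip is a made-up helper name:

```python
import os


def env_with_no_gzip(base_env=None):
    """Return a copy of the environment with RCLONE_NO_GZIP_ENCODING=true added."""
    env = dict(os.environ if base_env is None else base_env)
    env["RCLONE_NO_GZIP_ENCODING"] = "true"
    return env


# Intended use (not executed here, since it needs rclone installed):
#   subprocess.run(["rclone", "sync", "s3:s.natalian.org", "spaces:natalian"],
#                  env=env_with_no_gzip())
print(env_with_no_gzip({})["RCLONE_NO_GZIP_ENCODING"])
```

Copying the environment rather than mutating os.environ keeps the setting scoped to the rclone call.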

A daily cron job is what most people do. Though I expect you could be cleverer if you knew something about S3 events…

Note that if you use --checksum it will likely run faster, as the default modtime comparison needs an extra transaction per file, so I’d expect the check with --fast-list to be extremely quick!
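Putting those suggestions together, a daily job could look roughly like the Python sketch below. The flag combination is simply the one discussed in this thread, the remote names are the ones used above, and the names MIRROR_COMMAND and mirror are hypothetical. It relies only on rclone exiting with status 0 when the run succeeds.

```python
import subprocess

# Assumption: these remotes are already configured in rclone.conf.
MIRROR_COMMAND = [
    "rclone", "sync",
    "--checksum", "--fast-list", "--no-gzip-encoding",
    "s3:s.natalian.org", "spaces:natalian",
]


def mirror(run=subprocess.run):
    """Run the mirror once; return True if rclone exited with status 0.

    `run` is injectable so the function can be exercised without rclone.
    """
    result = run(MIRROR_COMMAND)
    return result.returncode == 0
```

A cron entry would call a small script that runs mirror() and exits non-zero when it returns False, so that cron can report a failed run.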