Incrementally copy append-only files to S3

What is the problem you are having with rclone?

I'm continuously syncing log files to S3 with rclone. Since these are log files, they are only ever appended to, but rclone appears to re-upload the whole file each time. In the worst case this means I'll pay for N^2 ingress into S3.
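(For a rough sense of scale: if a file grows by about 1 MiB between syncs and each sync re-uploads the whole file, then after N syncs roughly 1 + 2 + ... + N = N(N+1)/2 MiB has gone over the wire for a file that only ever contained N MiB.)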

What is your rclone version (output from rclone version)

1.56.0

Which cloud storage system are you using? (eg Google Drive)

AWS S3

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone.exe copy -v --config rclone.conf ..\logs s3-log-upload:my-bucket-name/some-path/

The rclone config contents with secrets removed.

[s3-log-upload]
type = s3
provider = AWS
...
region = us-west-1
acl = private
storage_class = REDUCED_REDUNDANCY

A log from the command with the -vv flag

This is the log I get after appending 1MB to an existing file and copying to S3:

2021/08/31 12:08:44 DEBUG : rclone: Version "v1.56.0" starting with parameters ["rclone.exe" "copy" "--checksum" "-vv" "--config" "rclone.conf" "..\\logs" "s3-log-upload:juicelabs-logs/testtest/"]
2021/08/31 12:08:44 DEBUG : Creating backend with remote "..\\logs"
2021/08/31 12:08:44 DEBUG : Using config file from "C:\\Users\\adam\\juicelabs\\juice\\ops\\rclone.conf"
2021/08/31 12:08:44 DEBUG : fs cache: renaming cache item "..\\logs" to be canonical "//?/C:/Users/adam/juicelabs/juice/logs"
2021/08/31 12:08:44 DEBUG : Creating backend with remote "s3-log-upload:juicelabs-logs/testtest/"
2021/08/31 12:08:44 DEBUG : fs cache: renaming cache item "s3-log-upload:juicelabs-logs/testtest/" to be canonical "s3-log-upload:juicelabs-logs/testtest"
2021/08/31 12:08:44 DEBUG : S3 bucket juicelabs-logs path testtest: Waiting for checks to finish
2021/08/31 12:08:44 DEBUG : randlog: Sizes differ (src 5242880 vs dst 4194304)
2021/08/31 12:08:44 DEBUG : juice.log: md5 = 6ca536fcefc7e9aa46b481fbde99d9ac OK
2021/08/31 12:08:44 DEBUG : juice.log: Size and md5 of src and dst objects identical
2021/08/31 12:08:44 DEBUG : juice.log: Unchanged skipping
2021/08/31 12:08:44 DEBUG : S3 bucket juicelabs-logs path testtest: Waiting for transfers to finish
2021/08/31 12:08:48 DEBUG : randlog: md5 = e938d0bf7e58e1bd2d8d9b1f7328be85 OK
2021/08/31 12:08:48 INFO  : randlog: Copied (replaced existing)
2021/08/31 12:08:48 INFO  :
Transferred:            5Mi / 5 MiByte, 100%, 1.475 MiByte/s, ETA 0s
Checks:                 2 / 2, 100%
Transferred:            1 / 1, 100%
Elapsed time:         4.1s

2021/08/31 12:08:48 DEBUG : 5 go routines active

Hello and welcome to the forum,

The S3 API does not support append operations.

This might be a workaround:
https://aws.amazon.com/blogs/developer/efficient-amazon-s3-object-concatenation-using-the-aws-sdk-for-ruby/
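For reference, the trick in that post can be sketched with boto3: start a multipart upload, copy the existing object in server-side as part 1 with UploadPartCopy, upload just the new bytes as part 2, and complete the upload. The bucket/key names below are made up, and it only works when the existing object is at least 5 MiB, since every part of a multipart upload except the last must meet that minimum.

import boto3

def append_via_multipart_copy(bucket, key, new_bytes):
    # Append new_bytes to an existing S3 object without re-uploading it.
    # The existing object is copied server-side as part 1, so only the
    # appended bytes are actually transferred.
    s3 = boto3.client("s3")
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]

    # Part 1: server-side copy of the current object (no download/upload).
    part1 = s3.upload_part_copy(
        Bucket=bucket, Key=key, PartNumber=1, UploadId=upload_id,
        CopySource={"Bucket": bucket, "Key": key},
    )

    # Part 2: only the freshly appended bytes go over the wire.
    part2 = s3.upload_part(
        Bucket=bucket, Key=key, PartNumber=2, UploadId=upload_id,
        Body=new_bytes,
    )

    s3.complete_multipart_upload(
        Bucket=bucket, Key=key, UploadId=upload_id,
        MultipartUpload={"Parts": [
            {"PartNumber": 1, "ETag": part1["CopyPartResult"]["ETag"]},
            {"PartNumber": 2, "ETag": part2["ETag"]},
        ]},
    )

# e.g. append_via_multipart_copy("juicelabs-logs", "testtest/juice.log", tail_bytes)

The catch is that 5 MiB minimum part size, so small or freshly rotated logs would still need a plain upload.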

Thanks. That actually seems like a pretty nice way of implementing append, but I guess it's not in rclone at the moment.

@ahupp If you log a feature request on GitHub, that's the best way to get the ball rolling.

Link the post here.

