Copy to AWS S3 with minimal policy

Looking for a nice way to use the copy command without allowing the user to get files from my bucket.
So here is the policy I've used:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowPutObject",
            "Action": [
                "s3:ListBucket",
                "s3:PutObject"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::my-bucket-name",
                "arn:aws:s3:::my-bucket-name/*"
            ]
        }
    ]
}

I ran the copy command with this policy and the files were copied, but the logs show something strange.
Here is the log for one of the files (the same pattern repeats for all of them):

2022/12/29 11:27:55 DEBUG : fs cache: renaming cache item "s3_aws:/data-collector-nd/DF/Nano/Test/019/Coast" to be canonical "s3_aws:data-collector-nd/DF/Nano/Test/019/Coast"
2022/12/29 11:27:56 NOTICE: Coast-Test (1).jpeg: Failed to read metadata: Forbidden: Forbidden
status code: 403, request id: KZ66B0XZJWTJZ7RS, host id: sPmpbG2pR11bus6T0xlqtMwvDz9wgejCQ4KaGnrMPSUHNFIzM+zOARfjVu+qlLI/sURYpBJ8pWk=
2022/12/29 11:27:56 DEBUG : Coast-Test (1).jpeg: Modification times differ by 335h41m28.24063s: 2022-12-15 11:46:27.8482423 +0200 IST, 2022-12-29 11:27:56.0888723 +0200 IST m=+0.853214601
2022/12/29 11:27:56 DEBUG : Coast-Test (1).jpeg: md5 = 186f5d7eaaa2d3035844f3790f85b7da OK

It seems that at the end it checks the checksum, the file looks OK, and there is no need to copy it, but before that rclone looks for a metadata match and can't access the file to check it. I think I misunderstand how copy works.
For each file, does it download the file into a temp folder, check metadata and checksum, and copy if something changed? Or does it just list? Why isn't my policy enough?


as far as i know, to allow rclone to read the metadata, you need to allow s3:GetObject
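
for example, a rough sketch of your policy with s3:GetObject added - keep in mind this also lets the key download objects, which is the thing you wanted to avoid:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowPutAndReadMetadata",
            "Action": [
                "s3:ListBucket",
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::my-bucket-name",
                "arn:aws:s3:::my-bucket-name/*"
            ]
        }
    ]
}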

rclone does not use a temp folder with S3.
rclone does not download the dest file, rclone simply reads the metadata.

for each source file, rclone checks the dest by reading metadata.
then rclone decides if there is a need to copy the source file to the dest.

in addition, before rclone copies a file from source to dest,
rclone calculates the md5 of the source file.
after the upload completes, rclone compares the md5 of the source to the md5 generated by the s3 provider.
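
as a side note, for a plain single-part upload the etag that s3 returns is the md5 of the object, so you can reproduce that comparison by hand (bucket and key names below are just placeholders; multipart uploads get a different etag format):

md5sum file.ext
aws s3api head-object --bucket my-bucket-name --key file.ext --query ETag
# for a single-part upload, both outputs should show the same 32-character hex digest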

to prevent downloads, i use the following:
--- MFA for IAM users - without that, rclone, or any app, cannot download
--- SSE-C - without that, rclone, or any app, cannot download
--- session token - again, without that, rclone cannot download.
--- another option, which i have not yet tested: create a policy that requires a specific header - a rough sketch below.
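
if you want to try the header idea, an untested sketch - a bucket policy that denies PutObject unless the SSE-C algorithm header is present (the bucket name is a placeholder):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUploadsWithoutSSEC",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-bucket-name/*",
            "Condition": {
                "Null": {
                    "s3:x-amz-server-side-encryption-customer-algorithm": "true"
                }
            }
        }
    ]
}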

have a read of
https://rclone.org/s3/#reducing-costs
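
iirc, the gist of that page is to have rclone compare by size or checksum, so the pre-copy check does not need a per-object HEAD; the upload itself may still HEAD the object afterwards unless you also use the --s3-no-head / --s3-no-head-object flags shown further down. roughly:

rclone copy /path/to/source s3_aws:my-bucket-name/path --size-only
# or compare md5 (taken from the etag in the listing for single-part uploads):
rclone copy /path/to/source s3_aws:my-bucket-name/path --checksum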


Thx for your answer, it is very informative.
Maybe some context will help you understand my situation:
Let's say I have 3 customers that need to upload data that improves my algorithm.
I want to provide them with a scheduled rclone job using the same access key for all of them. This key will let them upload data on an ongoing basis, but if someone tries to abuse my key and look at other customers' data, he will be blocked, since he will only be able to list.

Maybe I need to improve my policy (for example, something prefix-scoped like the sketch below, if each customer eventually gets its own key), but I just dropped by to check whether we can avoid the metadata checks.
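
Something like this maybe, assuming each customer eventually gets its own key (customer-a and the bucket name are placeholders):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowUploadToOwnPrefix",
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-bucket-name/customer-a/*"
        },
        {
            "Sid": "AllowListOwnPrefix",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::my-bucket-name",
            "Condition": {
                "StringLike": {
                    "s3:prefix": "customer-a/*"
                }
            }
        }
    ]
}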

By the way - I saw the logs report that rclone can't get the modification date/time, but using Cyberduck with the same key I was able to see it, so we are back to "no need for access to the object".

which modtime do you mean?
did you read the link i shared?

the modtime as saved by s3, which is the time the file was uploaded,
or
the modtime that rclone saves as metadata, which is the actual modtime of the source file.
by default, rclone uses its own modtime, and that is stored as metadata that rclone has to HEAD the object to read.

notice the HEAD on the first run, to get X-Amz-Meta-Mtime

rclone lsl wasabi01:5gib --dump=headers 
DEBUG : HTTP REQUEST (req 0xc0007bd000)
DEBUG : GET /?delimiter=&encoding-type=url&list-type=2&max-keys=1000&prefix= HTTP/1.1
Host: 5gib.s3.us-east-2.wasabisys.com
User-Agent: rclone/v1.60.1
Authorization: XXXX
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20221229T152442Z
Accept-Encoding: gzip

2022/12/29 10:24:42 DEBUG : HTTP RESPONSE (req 0xc0007bd000)
2022/12/29 10:24:42 DEBUG : HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Thu, 29 Dec 2022 15:24:42 GMT
Server: WasabiS3/7.10.1198-2022-12-14-39a7a2e69e (B33-U25)
X-Amz-Bucket-Region: us-east-2
X-Amz-Id-2: zN5oEx5cpjLfnezcv1gQt4wPEWUWfr6+42Da3jpyf8KfqrJ5Om7V2FItSoZiHmJpvfqCDplOGf57
X-Amz-Request-Id: 59DEEC7686C62699:B

2022/12/29 10:24:42 DEBUG : HTTP REQUEST (req 0xc000195b00)
2022/12/29 10:24:42 DEBUG : HEAD /file.ext HTTP/1.1
Host: 5gib.s3.us-east-2.wasabisys.com
User-Agent: rclone/v1.60.1
Authorization: XXXX
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20221229T152442Z

2022/12/29 10:24:42 DEBUG : HTTP RESPONSE (req 0xc000195b00)
2022/12/29 10:24:42 DEBUG : HTTP/1.1 200 OK
Content-Length: 1
Accept-Ranges: bytes
Content-Type: application/octet-stream
Date: Thu, 29 Dec 2022 15:24:42 GMT
Etag: "c4ca4238a0b923820dcc509a6f75849b"
Last-Modified: Thu, 29 Dec 2022 15:12:11 GMT
Server: WasabiS3/7.10.1198-2022-12-14-39a7a2e69e (B33-U25)
X-Amz-Id-2: hq2+76BhqNDgM9AtZdmkzMeZ/ZqgyUPONjQKD2SkrcDCC83hzodJ3a748c31oyAKX+xPsD3HYxCM
X-Amz-Meta-Mtime: 18000
X-Amz-Request-Id: E889A98A1768D6F8:B

#################################

rclone.exe lsl wasabi01:5gib --use-server-modtime  --dump=headers 
DEBUG : HTTP REQUEST (req 0xc000792c00)
DEBUG : GET /?delimiter=&encoding-type=url&list-type=2&max-keys=1000&prefix= HTTP/1.1
Host: 5gib.s3.us-east-2.wasabisys.com
User-Agent: rclone/v1.60.1
Authorization: XXXX
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20221229T152442Z
Accept-Encoding: gzip

2022/12/29 10:24:42 DEBUG : HTTP RESPONSE (req 0xc000792c00)
2022/12/29 10:24:42 DEBUG : HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Thu, 29 Dec 2022 15:24:43 GMT
Server: WasabiS3/7.10.1198-2022-12-14-39a7a2e69e (XB27-U40)
X-Amz-Bucket-Region: us-east-2
X-Amz-Id-2: KztM4ZOkbISLJvyckhmq7dfrs8eqDogr88y9cK6ixSxf3dDSubbF72xBl1Rcsk9nbQH4wegR9ivj
X-Amz-Request-Id: 35FE2DF195B0FC42:B

        1 2022-12-29 10:12:11.000000000 file.ext

using this policy and this command, i was able to upload a file

rclone copy file.ext zork: -vv --s3-no-check-bucket --s3-no-head --s3-no-head-object

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::redacted:user/name"
      },
      "Action": "s3:PutObject",
      "Resource": [
        "arn:aws:s3:::bucket/*",
        "arn:aws:s3:::bucket"
      ]
    }
  ]
}

and redacted debug output

rclone copy file.ext zork: -vv --s3-no-check-bucket --s3-no-head --s3-no-head-object 
DEBUG : file.ext: Sizes differ (src 1 vs dst 0)
DEBUG : file.ext: md5 = c4ca4238a0b923820dcc509a6f75849b OK
INFO  : file.ext: Copied (replaced existing)
INFO  : 
Transferred:   	          1 B / 1 B, 100%, 0 B/s, ETA -
Transferred:            1 / 1, 100%
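
fwiw, if you do not want each customer to have to remember those flags, the same options can be baked into the remote itself, something like this in rclone.conf (credentials and region omitted):

[zork]
type = s3
provider = AWS
no_check_bucket = true
no_head = true
no_head_object = true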

Using PUT only, rclone can't know what is already in the bucket, and will copy everything again, right?

i believe that is right.


For such a case, I would use GUI tools like Goodsync and Gs Richcopy 360 to upload to AWS S3 and handle this issue easily.
