Amazon S3 Credentials Expiring

What is the problem you are having with rclone?

Amazon S3 credentials from an IAM Role aren't being refreshed

Run the command 'rclone version' and share the full output of the command.

# rclone version
rclone v1.62.2
- os/version: alpine 3.16.2 (64 bit)
- os/kernel: 5.15.0-1019-aws (aarch64)
- os/type: linux
- os/arch: arm64 (ARMv8 compatible)
- go/version: go1.20.2
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Amazon S3

The command you were trying to run (eg rclone copy /tmp remote:tmp)

/usr/bin/rclone mount -v -v s3-bucket:my-bucket/my-folder /mnt/my-folder/ --write-back-cache --vfs-cache-mode full --local-no-sparse --read-only --allow-other --file-perms 755 --transfers 16 --daemon --checksum --log-file /var/log/rclone.log

The rclone config contents with secrets removed.

[s3-bucket]
type = s3
provider = AWS
env_auth = true
region = eu-west-1
location_constraint = eu-west-1
server_side_encryption = aws:kms
sse_kms_key_id = arn:aws:kms:us-east-1:*
storage_class = STANDARD

A log from the command with the -vv flag

2023/04/21 15:00:16 INFO  : vfs cache: cleaned: objects 0 (was 0) in use 0, to upload 0, uploading 0, total size 0 (was 0)
2023/04/21 15:01:16 INFO  : vfs cache: cleaned: objects 0 (was 0) in use 0, to upload 0, uploading 0, total size 0 (was 0)
2023/04/21 15:01:55 DEBUG : /: Lookup: name="index.html"
2023/04/21 15:01:55 DEBUG : : Re-reading directory (23h31m44.379977112s old)
2023/04/21 15:02:16 INFO  : vfs cache: cleaned: objects 0 (was 0) in use 0, to upload 0, uploading 0, total size 0 (was 0)
2023/04/21 15:02:35 ERROR : /: Dir.Stat error: ExpiredToken: The provided token has expired.
	status code: 400, request id: ...REDACTED..., host id: ...REDACTED...
2023/04/21 15:02:35 ERROR : IO error: ExpiredToken: The provided token has expired.
	status code: 400, request id: ...REDACTED..., host id: ...REDACTED...
2023/04/21 15:02:35 DEBUG : /: >Lookup: node=<nil>, err=ExpiredToken: The provided token has expired.
	status code: 400, request id: ...REDACTED..., host id: ...REDACTED...
2023/04/21 15:02:35 DEBUG : /: Lookup: name="index.html"
2023/04/21 15:02:35 DEBUG : : Re-reading directory (23h32m24.743309925s old)
2023/04/21 15:03:16 INFO  : vfs cache: cleaned: objects 0 (was 0) in use 0, to upload 0, uploading 0, total size 0 (was 0)
2023/04/21 15:03:31 ERROR : /: Dir.Stat error: ExpiredToken: The provided token has expired.
	status code: 400, request id: ...REDACTED..., host id: ...REDACTED...
2023/04/21 15:03:31 ERROR : IO error: ExpiredToken: The provided token has expired.
	status code: 400, request id: ...REDACTED..., host id: ...REDACTED...
2023/04/21 15:03:31 DEBUG : /: >Lookup: node=<nil>, err=ExpiredToken: The provided token has expired.
	status code: 400, request id: ...REDACTED..., host id: ...REDACTED...
2023/04/21 15:03:31 DEBUG : /: Attr: 
2023/04/21 15:03:31 DEBUG : /: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>
2023/04/21 15:03:31 DEBUG : /: Attr: 
2023/04/21 15:03:31 DEBUG : /: >Attr: attr=valid=1s ino=0 size=0 mode=drwxr-xr-x, err=<nil>

hello and welcome to the forum,

what kind of token, session token?

rclone is using the IAM Role credentials (assigned to the EC2 instance running the docker container in which rclone runs in), with a role expiry time of 1hr 30mins.

The IAM Role has the following permissions assigned:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:GetAccessPoint",
                "s3:ListAllMyBuckets",
                "s3:CreateJob",
                "s3:GetObjectRetention",
                "s3:GetObjectVersionTagging",
                "s3:GetObjectAttributes",
                "s3:GetObjectVersionAttributes",
                "s3:GetObjectVersionTorrent",
                "s3:PutObject",
                "s3:GetObjectAcl",
                "s3:GetObject",
                "s3:GetObjectVersionAcl",
                "s3:GetObjectTagging",
                "s3:GetObjectVersionForReplication",
                "s3:GetObjectVersion",
                "s3:GetBucketTagging",
                "s3:ListBucketVersions",
                "s3:ListBucket",
                "s3:GetBucketVersioning",
                "s3:GetBucketAcl",
                "s3:GetBucketLocation",
                "s3:GetBucketPolicy"
            ],
            "Resource": [
                "arn:aws:s3:::*"
            ]
        }
    ]
}

On the AWS CloudTrail console, it appears that tokens are being successfully requested every hour, but rclone doesn't seem to be using the latest token that was fetched. We found this by looking at the /proc/<RCLONE_PID>/environ file, which had an AWS_SESSION_TOKEN that didn't match the latest one in the CloudTrail events.
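For anyone wanting to reproduce that check: the environ file is NUL-delimited, so it can be read like this (shown here against a sample file; on the real system you would read /proc/<RCLONE_PID>/environ instead — the file name and token value below are made up for illustration):

```shell
# Stand-in for /proc/<RCLONE_PID>/environ, which is NUL-delimited
printf 'PATH=/usr/bin\0AWS_SESSION_TOKEN=EXAMPLETOKEN\0HOME=/root\0' > environ.sample

# Translate NULs to newlines and pick out the session token
tr '\0' '\n' < environ.sample | grep '^AWS_SESSION_TOKEN='
# prints AWS_SESSION_TOKEN=EXAMPLETOKEN
```

One caveat with this method: /proc/<PID>/environ reflects the environment as it was when the process started, so it will never show credentials that a process refreshes internally after startup.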

Rclone delegates all of this to the AWS Go SDK, so the problem will be with rclone's use of the SDK, or perhaps a bug in the SDK itself.

Are those token requests coming from rclone? Can you tell that?

That looks like the case in the log. Are there requests which work too? Or do they all fail?

Is that how rclone picks up the auth? From an environment variable? That would explain why it isn't being updated.

Can you get rclone to fetch the STS token itself? It should know how to do that (at least I know others have done it successfully). If rclone fetches the token itself then it will know how to refresh it.

If you look at this post you'll see a description of how to make rclone fetch the STS token itself using the profile.
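A rough sketch of the profile approach (the profile name and role ARN below are placeholders, and env_auth = true must stay set in the rclone config so the SDK reads the shared config):

```ini
# ~/.aws/config — "rclone-s3" is a placeholder profile name
[profile rclone-s3]
region = eu-west-1
credential_source = Ec2InstanceMetadata
role_arn = arn:aws:iam::123456789012:role/MyRole
```

You would then point rclone at the profile, e.g. by running it with `AWS_PROFILE=rclone-s3` in the environment, so the SDK performs the AssumeRole call itself and knows when the resulting credentials expire.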

Yes, as rclone is the only service on the EC2 instance calling the AWS API.

When rclone first starts up, the access seems to be fine, the files are accessible by the webserver as seen in the logs soon after rclone startup:

2023/04/20 13:11:26 DEBUG : /: >Lookup: node=<nil>, err=no such file or directory
2023/04/20 13:12:16 INFO  : vfs cache: cleaned: objects 0 (was 0) in use 0, to upload 0, uploading 0, total size 0 (was 0)
2023/04/20 13:13:10 DEBUG : /: Lookup: name="index.html"
2023/04/20 13:13:10 DEBUG : /: >Lookup: node=index.html, err=<nil>
2023/04/20 13:13:10 DEBUG : index.html: Attr: 
2023/04/20 13:13:10 DEBUG : index.html: >Attr: a=valid=1s ino=0 size=25169 mode=-rwxr-xr-x, err=<nil>
2023/04/20 13:13:10 DEBUG : index.html: Open: flags=OpenReadOnly+OpenNonblock+0x20000
2023/04/20 13:13:10 DEBUG : index.html: Open: flags=O_RDONLY|0x20800
2023/04/20 13:13:10 DEBUG : index.html: newRWFileHandle: 
2023/04/20 13:13:10 DEBUG : index.html: >newRWFileHandle: err=<nil>
2023/04/20 13:13:10 DEBUG : index.html: >Open: fd=index.html (rw), err=<nil>
2023/04/20 13:13:10 DEBUG : index.html: >Open: fh=&{index.html (rw)}, err=<nil>
2023/04/20 13:13:10 DEBUG : &{index.html (rw)}: Read: len=28672, offset=0
2023/04/20 13:13:10 DEBUG : index.html(0x40005d0200): _readAt: size=28672, off=0
2023/04/20 13:13:10 DEBUG : index.html(0x40005d0200): openPending: 
2023/04/20 13:13:10 DEBUG : index.html: vfs cache: checking remote fingerprint "25169,2023-04-13 11:40:42 +0000 UTC," against cached fingerprint ""
2023/04/20 13:13:10 DEBUG : index.html: vfs cache: truncate to size=25169
2023/04/20 13:13:10 DEBUG : : Added virtual directory entry vAddFile: "index.html"
2023/04/20 13:13:10 DEBUG : index.html(0x40005d0200): >openPending: err=<nil>
2023/04/20 13:13:10 DEBUG : vfs cache: looking for range={Pos:0 Size:25169} in [] - present false
2023/04/20 13:13:10 DEBUG : index.html: ChunkedReader.RangeSeek from -1 to 0 length -1
2023/04/20 13:13:10 DEBUG : index.html: ChunkedReader.Read at -1 length 32768 chunkOffset 0 chunkSize 134217728
2023/04/20 13:13:10 DEBUG : index.html: ChunkedReader.openRange at 0 length 134217728
2023/04/20 13:13:10 DEBUG : index.html(0x40005d0200): >_readAt: n=25169, err=EOF
2023/04/20 13:13:10 DEBUG : &{index.html (rw)}: >Read: read=25169, err=<nil>

We aren't specifying or supplying AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY or AWS_SESSION_TOKEN anywhere explicitly. Rclone does seem to know how to fetch the token when it starts up (the credentials are available on the EC2 instance via curl http://169.254.169.254/latest/meta-data/iam/security-credentials/<<IAM_ROLE_NAME>>), which is why the files in the S3 bucket are accessible just after rclone starts.

But after around 12 hours, the ExpiredToken errors appear in the logs. Even after they do, the CloudTrail console still shows successful token requests.

This seems like it might be a bug in the SDK if it is fetching the token but not using it.

You could try using the latest beta. I went through the issues and git log for the SDK and I didn't see anything that looked relevant though.

Are there any other errors in the rclone log?

You could try with -vv --dump headers to see if you can see the token being refreshed in rclone's log.

Hi,

I'm sorry to have wasted your time. It turns out we were supplying AWS_SESSION_TOKEN in an environment variable during the container startup. We have removed that from the startup script, and rclone now fetches the session token dynamically using IMDSv2, as confirmed by --dump headers.
Thank you very much for your help.

Great - glad you got it sorted. AWS auth problems are really hard to debug!

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.