S3 back-end appears to be pulling incorrect timestamps

What is the problem you are having with rclone?

I routinely generate data files in AWS and store them in S3 buckets. Sometimes, I have to deliver these files to users in Box, rather than S3. I've correctly configured rclone to recognize both my S3 buckets and my Box spaces, and would like to 'sync' from S3 to Box occasionally, using a command line like "rclone -Pv sync s3:bucket/path/ box:new/path/". And the syncs work, at least the first time through.

While my 'sync' commands seem to work, I ran across something I thought was very odd: rclone appears to report a different timestamp when accessing S3 than the AWS cli and the AWS s3api. It's off by more than a month.

Question: where does rclone get the timestamp for the s3 target? I am not doing well following the source (not much of a Go coder), but it kind of looks like something's being incorrectly processed.

Run the command 'rclone version' and share the full output of the command.

$ rclone --version
rclone v1.59.2
- os/version: ubuntu 20.04 (64 bit)
- os/kernel: 5.15.0-1020-aws (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.18.6
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

AWS S3 in the us-east-2 region is where the problem is evident.

The command you were trying to run (eg rclone copy /tmp remote:tmp)

So far, so good:

$ rclone -Pv sync s3:bucket/path/ box:new/path/
2022-09-29 01:56:52 INFO  : There was nothing to transfer
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:               3 / 3, 100%
Elapsed time:         3.0s
2022/09/29 01:56:52 INFO  :
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:               3 / 3, 100%
Elapsed time:         3.0s

$ rclone lsl s3:bucket/path/
  5473345 2021-07-13 17:19:36.000000000 file1
    10393 2021-07-13 17:19:37.000000000 file2
    47341 2021-07-13 17:19:36.000000000 file3

$ rclone lsl box:new/path/
  5473345 2021-07-13 17:19:36.000000000 file1
    10393 2021-07-13 17:19:37.000000000 file2
    47341 2021-07-13 17:19:36.000000000 file3

However, the timestamps in rclone's output for the s3: target don't match the timestamps reported by S3 using Amazon's software (which is authoritative).

$ aws --version
aws-cli/1.18.69 Python/3.8.10 Linux/5.15.0-1020-aws botocore/1.16.19

$ aws s3 ls s3://bucket/path/
2021-08-13 14:51:52    5473345 file1
2021-08-13 14:51:52      10393 file2
2021-08-13 14:51:52      47341 file3

$ aws s3api list-objects-v2 --bucket bucket --prefix path/
{
    "Contents": [
        {
            "Key": "path/file1",
            "LastModified": "2021-08-13T14:51:52.000Z",
            "ETag": "\"1a3548fa5915aaa146eb8c5e31fa5a5a\"",
            "Size": 5473345,
            "StorageClass": "STANDARD"
        },
        {
            "Key": "path/file2",
            "LastModified": "2021-08-13T14:51:52.000Z",
            "ETag": "\"68c67e0ef8f0b589375a19613335c6cc\"",
            "Size": 10393,
            "StorageClass": "STANDARD"
        },
        {
            "Key": "path/file3",
            "LastModified": "2021-08-13T14:51:52.000Z",
            "ETag": "\"b7ec034626e20719f6da13fe41beef05\"",
            "Size": 47341,
            "StorageClass": "STANDARD"
        }
    ]
}

The rclone config contents with secrets removed.

[s3]
type = s3
provider = AWS
env_auth = false
access_key_id = ACCESS_KEY_ID
secret_access_key = SECRET_ACCESS_KEY
region = us-east-2
location_constraint = us-east-2
acl = bucket-owner-full-control
storage_class = STANDARD

[box]
type = box
token = {"access_token":"ACCESS_TOKEN","token_type":"bearer","refresh_token":"REFRESH_TOKEN","expiry":"2022-09-29T03:02:24.240289647Z"}

A log from the command with the -vv flag

rclone -vv lsl s3:bucket/path/
2022/09/29 12:30:40 DEBUG : rclone: Version "v1.59.2" starting with parameters ["rclone" "-vv" "lsl" "s3:bucket/path/"]
2022/09/29 12:30:40 DEBUG : Creating backend with remote "s3:bucket/path/"
2022/09/29 12:30:40 DEBUG : Using config file from "/home/wyang/.config/rclone/rclone.conf"
2022/09/29 12:30:40 DEBUG : fs cache: renaming cache item "s3:bucket/path/" to be canonical "s3:bucket/path/"
  5473345 2021-07-13 17:19:36.000000000 file1
    10393 2021-07-13 17:19:37.000000000 file2
    47341 2021-07-13 17:19:36.000000000 file3

hello and welcome to the forum,

when rclone transfers a file to aws, it adds a header with the modtime of the file.
if you transfer a file to aws, not using rclone, that header is missing.

take a read of these, and let me know if you have any questions.
https://rclone.org/s3/#modified-time

https://rclone.org/docs/#use-server-modtime

I'm not sure we're talking about the same thing. I am copying files from AWS to Box. AWS is the authoritative data source, and its timestamps are being modified. But when rclone accesses AWS, it reports incorrect date/time stamps for ModTime.

the files in aws, did you upload them with rclone?

did you read the two links i shared up above?

I uploaded the files into S3 using the AWS CLI. At no point in time is rclone writing to S3.

I did read both the links -- unless I'm misreading these docs, it reads as though the modified-time and use-server-modtime sections are talking about writing to S3 (which is never being done with rclone). This potential bug is about rclone seeming to read date/time stamps out of S3 incorrectly.

ok,
i uploaded a file from local to aws s3 via the web browser.

seems ok to me

rclone_1.59.2 lsl ./file.ext 
  1448900 2021-12-05 16:49:44.944618100 file.ext

rclone_1.59.2 lsl aws01:bucket\file.ext 
  1448900 2022-09-29 09:44:59.000000000 file.ext

aws s3 ls bucket 
2022-09-29 09:44:59    1448900 file.ext

I'm concerned we're talking about different issues.

To reproduce my symptoms, you should be doing the following:

$ date | aws s3 cp - s3://bucket/test-file

$ rclone lsl aws01:bucket/test-file

aws cli doesn't use X-Amz-Meta-Mtime
rclone uses X-Amz-Meta-Mtime

Without X-Amz-Meta-Mtime, rclone doesn't see the right mod time.

The "right" mod time is the time stored in S3 as the LastModified element of the JSON returned by the API. Rclone doesn't seem to see or use that.

Where does the X-Amz-Meta-Mtime element get stored in S3, and how would I access/clear it?

it is stored as a header, as seen from s3 browser software tool
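
For anyone who wants to look at that value without a GUI tool, here is a minimal Python sketch. It assumes the third-party boto3 package, whose `head_object` call returns the user-defined metadata in a `Metadata` dict with the `x-amz-meta-` prefix stripped, so the value should appear under the key `mtime`. The helper name `describe_timestamps` is made up for this example, and the bucket and key are the placeholders used earlier in the thread.

```python
from typing import Any, Dict


def describe_timestamps(client: Any, bucket: str, key: str) -> Dict[str, Any]:
    """Return both timestamps S3 holds for one object.

    `client` is expected to behave like boto3.client("s3"); only
    head_object is called, so no object data is downloaded.
    """
    resp = client.head_object(Bucket=bucket, Key=key)
    return {
        # Set by S3 itself when the object was written.
        "last_modified": resp["LastModified"],
        # User metadata (x-amz-meta-mtime); None if the uploader never set it.
        "mtime_metadata": resp.get("Metadata", {}).get("mtime"),
    }


# Usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   s3 = boto3.client("s3", region_name="us-east-2")
#   print(describe_timestamps(s3, "bucket", "path/file1"))
```

As for clearing it: as far as I know, S3 does not let you edit object metadata in place. The usual route is to copy the object over itself with replaced metadata, which also resets LastModified, so simply ignoring the value on the rclone side may be easier.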

The "right" modtime for rclone is stored here and documented as @asdffdsa shared:

Amazon S3 (rclone.org)

Okay, I get it, and I did misunderstand the documentation. You can make rclone ignore the x-amz-meta-mtime object metadata in S3 by using the '--use-server-modtime' flag (which is what I want to do).

Uses the metadata, not the LastModified element, for each S3 object key:

$ rclone lsl s3:bucket/path/
  5473345 2021-07-13 17:19:36.000000000 file1
    10393 2021-07-13 17:19:37.000000000 file2
    47341 2021-07-13 17:19:36.000000000 file3

Uses the LastModified element for each S3 object key:

$ rclone lsl s3:bucket/path/ --use-server-modtime
  5473345 2021-08-13 14:51:52.000000000 file1
    10393 2021-08-13 14:51:52.000000000 file2
    47341 2021-08-13 14:51:52.000000000 file3

Thanks!
