I am using the S3 backend and listing object versions using --s3-versions. I am interested not only in when an object version was created but also in when it was deleted. I think of the deletion time as when the version becomes non-current, which corresponds to the creation time of a new version or a deletion marker.
Is there currently a way to get this information from rclone? Does this seem like a reasonably common want?
For me, the most convenient implementation would be a flag that sets the deletion time as the object version's modification date. I understand this could be confusing, as the modification date is already used both for rclone's modification date and object birth time (when using --use-server-modtime).
We understand each other :). On AWS, the time of deletion is indeed available in the delete marker with the same key, not in the object version metadata itself. If an object is overwritten, the "deletion time" of the old object version is the creation time of the new version.
Setting a bit of metadata (dtime) would indeed make a lot of sense and probably be a better option compared to setting modification times. It would be great though if it were possible to get dtime without also retrieving (rclone) modification times. Asking rclone to retrieve this metadata (e.g. by setting --metadata) slows down the list operation a lot (at least two orders of magnitude).
My use case is to analyze the contents of a bucket (distribution of file sizes, growth over time, etc). For this I export a list of object versions using rclone lsjson. I am connecting to buckets residing on MinIO; on AWS I would probably use Amazon S3 Inventory.
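As a rough sketch of the size-distribution part of that analysis: the snippet below buckets files by powers of two, reading the JSON array that rclone lsjson prints. It relies only on the `Size` and `IsDir` fields of that output; the function name `size_histogram` and the choice of power-of-two buckets are my own assumptions, not anything rclone provides.

```python
import json
from collections import Counter


def size_histogram(lsjson_text):
    """Count files per power-of-two size bucket from `rclone lsjson` output.

    Hypothetical helper: the bucket label is the bit length of the size,
    so 0 means empty files and n means sizes in [2**(n-1), 2**n).
    """
    counts = Counter()
    for entry in json.loads(lsjson_text):
        if entry.get("IsDir"):
            continue  # directories carry no meaningful size
        size = entry["Size"]
        if size < 0:
            continue  # skip entries whose size is unknown
        counts[size.bit_length()] += 1
    return dict(counts)
```

With versions and delete markers listed, the zero-size markers would all land in bucket 0, so they may need to be filtered out before drawing conclusions about real files.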
Thanks for pointing me to the relevant piece of source code. I don't quite understand the code structure yet, but I do get from this that I should be able to see the delete markers themselves if I could show "hidden" files somehow. Is that possible?
It makes delete markers visible when you pass the --s3-version-deleted flag or set version_deleted=true in the config.
From the docs:
Show deleted file markers when using versions.
This shows deleted file markers in the listing when using versions. These will appear as 0 size files. The only operation which can be performed on them is deletion.
Deleting a delete marker will reveal the previous version.
Deleted files will always show with a timestamp.
This does mean you can now undelete files by deleting the delete marker, which there was no way of doing in rclone previously!
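To illustrate why deleting a delete marker undeletes a file, here is a toy in-memory model of a single key in a versioned bucket. It is only a mental model under my own assumptions: the class and method names (`VersionedKey`, `put`, `delete`, `delete_version`, `current`) are hypothetical and are not rclone or S3 API calls.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    version_id: str
    size: int
    is_delete_marker: bool = False


@dataclass
class VersionedKey:
    """Toy model of one key in a versioned bucket; newest version is last."""
    versions: list = field(default_factory=list)

    def put(self, version_id, size):
        # Uploading creates a new current version on top of the stack.
        self.versions.append(Version(version_id, size))

    def delete(self, marker_id):
        # A plain delete adds a zero-size delete marker; no data is removed.
        self.versions.append(Version(marker_id, 0, is_delete_marker=True))

    def delete_version(self, version_id):
        # Deleting a specific version (or marker) removes that entry for good.
        self.versions = [v for v in self.versions if v.version_id != version_id]

    def current(self):
        """Return the visible object, or None if the key appears deleted."""
        if not self.versions or self.versions[-1].is_delete_marker:
            return None
        return self.versions[-1]
```

In this model, calling `put`, then `delete`, makes `current()` return `None`; calling `delete_version` on the marker makes the earlier version current again.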
Apologies for taking so very long to get back to you.
The listing works perfectly and allows me to do the analytics I wanted to do. Many thanks!
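For anyone wanting to do the same, here is a minimal sketch of how a deletion time can be derived from such a listing: each entry's deletion time is the creation time of the next entry (object version or delete marker) for the same key. It assumes the name layout seen in my listing further down (`<key>-vYYYY-MM-DD-HHMMSS-mmm`, with the suffix at the very end of the name) and that the suffix timestamp is in UTC; the helper names `parse_version` and `deletion_times` are made up for this example.

```python
import re
from collections import defaultdict
from datetime import datetime, timezone

# Assumed layout: <key>-vYYYY-MM-DD-HHMMSS-mmm, with the suffix last in the name.
VERSION_RE = re.compile(
    r"^(?P<key>.+)-v(?P<ts>\d{4}-\d{2}-\d{2}-\d{6})-(?P<ms>\d{3})$"
)


def parse_version(name):
    """Split a versioned name into (key, creation time). Hypothetical helper."""
    m = VERSION_RE.match(name)
    if m is None:
        raise ValueError(f"no version suffix in {name!r}")
    created = datetime.strptime(m["ts"], "%Y-%m-%d-%H%M%S").replace(
        microsecond=int(m["ms"]) * 1000,
        tzinfo=timezone.utc,  # assumption: suffix timestamps are UTC
    )
    return m["key"], created


def deletion_times(names):
    """Map each versioned name to its successor's creation time.

    The newest entry of each key has no successor and maps to None.
    """
    by_key = defaultdict(list)
    for name in names:
        key, created = parse_version(name)
        by_key[key].append((created, name))

    dtimes = {}
    for versions in by_key.values():
        versions.sort()  # oldest first
        for (_, name), nxt in zip(versions, versions[1:] + [None]):
            dtimes[name] = nxt[0] if nxt else None
    return dtimes
```

Delete markers get a deletion time too whenever something newer exists for the same key; filter them out afterwards if only real object versions matter.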
Unfortunately, I could not get removing a delete marker to work. Maybe I am doing something wrong? I would expect my attempts below to work, as deleting object versions works like this.
export RCLONE_S3_REGION=eu-west-1
export RCLONE_S3_ENV_AUTH=true
export RCLONE_S3_PROVIDER=AWS
./rclone version
# rclone v1.65.0-beta.7469.ee1c4d9d0.fix-s3-version-delete
# - os/version: darwin 14.1.1 (64 bit)
# - os/kernel: 23.1.0 (arm64)
# - os/type: darwin
# - os/arch: arm64 (ARMv8 compatible)
# - go/version: go1.21.4
# - go/linking: dynamic
# - go/tags: cmount
./rclone ls :s3:$bucket/tmp --s3-versions --s3-version-deleted
# 0 testfile-v2023-11-21-211605-000
# 69409026 testfile-v2023-11-21-211543-000
# 0 testfile-v2023-11-20-233459-000
# this is correct: there is a single object version, and two delete markers, one of which is current
./rclone ls :s3:$bucket/tmp
# no results, as expected, since the current object version is a delete marker
./rclone deletefile :s3:$bucket/tmp/testfile-v2023-11-21-211605-000 --s3-versions --s3-version-deleted
# 2023/11/21 22:24:18 ERROR : Attempt 1/3 failed with 1 errors and: :s3:.../tmp/testfile-v2023-11-21-211605-000 is a directory or doesn't exist: object not found
# 2023/11/21 22:24:18 ERROR : Attempt 2/3 failed with 1 errors and: :s3:.../tmp/testfile-v2023-11-21-211605-000 is a directory or doesn't exist: object not found
# 2023/11/21 22:24:18 ERROR : Attempt 3/3 failed with 1 errors and: :s3:.../tmp/testfile-v2023-11-21-211605-000 is a directory or doesn't exist: object not found
# 2023/11/21 22:24:18 Failed to deletefile: :s3:.../tmp/testfile-v2023-11-21-211605-000 is a directory or doesn't exist: object not found
./rclone delete :s3:$bucket/tmp/testfile-v2023-11-21-211605-000 --s3-versions --s3-version-deleted
# no result, does not remove delete marker
./rclone delete :s3:$bucket/tmp/testfile-v2023-11-21-211543-000 --s3-versions
# no result, does remove the object version
I agree that being able to undelete files using rclone would be a cool new feature!