Happy to relog this as a bug with the requisite fields, just let me know. Thank you.
I've been running a couple of tests with rclone to list the objects in an S3 bucket.
I noticed that it was pretty slow compared to the AWS CLI, and I attribute this to the per-object HEAD requests that rclone makes.
Test case: a bucket with 15k objects, nested 1-3 folders deep:
rclone lsl remote:s3_bucket/prefix/ --fast-list --recursive
takes around 25 mins
aws s3 ls bucket/prefix --recursive > out.txt
takes about 30 seconds
I was expecting a few seconds, since a list_objects API call can return up to 1k objects per call, so about 15 calls to list 15k objects shouldn't take very long. My understanding is that --fast-list makes rclone use this API, so it should be quick.
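To sanity-check that estimate, here is a minimal Python sketch of the paging arithmetic. It assumes the 1k-objects-per-call page size mentioned above. The function name list_calls_needed is made up for illustration and is not part of rclone or the AWS SDK.

```python
import math


def list_calls_needed(object_count: int, page_size: int = 1000) -> int:
    """Return how many paged list calls are needed to enumerate a bucket.

    Assumes every page except the last is full (page_size objects).
    """
    if object_count < 0 or page_size <= 0:
        raise ValueError("object_count must be >= 0 and page_size must be > 0")
    # Even an empty listing costs one request.
    return max(1, math.ceil(object_count / page_size))


print(list_calls_needed(15_000))  # 15
```

So a listing-only run over this bucket should need on the order of 15 requests.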
However, as I understand it, extra per-object HEAD requests are made to retrieve additional data for the user.
The rclone docs on ls explain that, when working with S3, the purpose of the additional HEAD requests is basically to obtain the last modification time and the MIME type.
Only when I disable both of those features is the command quick (a few seconds):
rclone lsl remote:bucket/prefix --fast-list --recursive --no-modtime --no-mimetype > out
And I can see (with Dump) that only the list_objects API call is made, with 1000 objects returned per page as it pages through, and that no per-object requests are made.
In this case, as expected, the modtime is missing from the output.
However, the last modification time is actually returned in the list objects XML response (the mime type is unfortunately not returned, fair enough).
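To illustrate the point, here is a small Python sketch that pulls the key, size, and LastModified value out of a list response body using only the standard library. The XML below is a hand-written sample in the shape of an S3 ListBucketResult; the keys, sizes, and timestamps are made up, and parse_listing is a hypothetical helper, not rclone code.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Hand-written sample body; keys, sizes, and timestamps are illustrative only.
SAMPLE_RESPONSE = """\
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>bucket</Name>
  <Prefix>prefix/</Prefix>
  <IsTruncated>false</IsTruncated>
  <Contents>
    <Key>prefix/a/file1.txt</Key>
    <LastModified>2021-03-01T10:15:30.000Z</LastModified>
    <Size>1024</Size>
  </Contents>
  <Contents>
    <Key>prefix/a/b/file2.bin</Key>
    <LastModified>2021-03-02T08:00:00.000Z</LastModified>
    <Size>2048</Size>
  </Contents>
</ListBucketResult>
"""

# The listing elements live in the S3 XML namespace.
NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def parse_listing(xml_text):
    """Return (key, size, last_modified) tuples from one listing page."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.findall("s3:Contents", NS):
        key = item.findtext("s3:Key", namespaces=NS)
        size = int(item.findtext("s3:Size", namespaces=NS))
        stamp = item.findtext("s3:LastModified", namespaces=NS)
        # The timestamp is in UTC with a trailing "Z".
        modified = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
        entries.append((key, size, modified.replace(tzinfo=timezone.utc)))
    return entries


for key, size, modified in parse_listing(SAMPLE_RESPONSE):
    print(modified.isoformat(), size, key)
```

No per-object request is involved here: every field printed comes from the listing page itself. The value is the timestamp S3 itself recorded for the object.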
It would therefore be great if, for S3, rclone could use the last modified date from the list_objects response, and only make the per-object HEAD requests if the MIME type is needed.
So then in theory this command would be super quick (no HEAD requests per object), and include the mod time, but not the mimetype:
rclone lsl remote:bucket/prefix --fast-list --recursive --no-mimetype > out
Since lsl does not display the mimetype, this could help a lot as a default behavior. But lsjson, for example, does show the mimetype, and therefore the user would need to opt out of the mimetype specifically if they want the extra-fast listing.
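To put rough numbers on the proposal, here is a minimal Python sketch of the request count under the suggested behavior. It is a simplified model, not a description of how rclone works internally: it assumes one HEAD request per object whenever the mimetype is wanted, none otherwise, and full pages of 1000 objects. The name estimated_requests is hypothetical.

```python
import math


def estimated_requests(object_count: int, want_mimetype: bool,
                       page_size: int = 1000) -> int:
    """Estimate total requests for a listing under the proposed behavior.

    Modtime is taken from the listing pages, so only the mimetype
    triggers per-object HEAD requests (assumption of this model).
    """
    list_calls = max(1, math.ceil(object_count / page_size))
    head_calls = object_count if want_mimetype else 0
    return list_calls + head_calls


print(estimated_requests(15_000, want_mimetype=False))  # 15
print(estimated_requests(15_000, want_mimetype=True))   # 15015
```

Under this model, the 15k-object bucket drops from about 15,015 requests to 15 when the mimetype is not needed.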