Can copy with rclone but not list (aws s3 cli)

What is the problem you are having with rclone?

I'm using rclone to proxy and cache an AWS S3 bucket. I am able to run aws s3 cp s3://mybucket/myfolder/file.txt --endpoint-url http://rclone-s3-service to copy a file down just fine, but I'm not able to list the contents of any folder in the same bucket. I don't receive any errors; zero files are returned. Here's the command, which always returns empty with no errors: aws s3 ls s3://mybucket/myfolder/ --endpoint-url http://rclone-s3-service --debug

From the aws cli --debug output, I noticed that the AWS CLI isn't including the folder I'm trying to list in the path it sends to my rclone server. Instead, it passes it as a 'prefix' query parameter in the URL.

It's sending this:
http://rclone-s3-service/mybucket?list-type=2&prefix=myfolder%2F&encoding-type=url

I would expect it to be this:

http://rclone-s3-service/mybucket/myfolder?list-type=2&prefix=myfolder%2F&encoding-type=url

AWS debug output here:

2024-09-06 16:13:10,969 - MainThread - botocore.endpoint - DEBUG - Sending http request: <AWSPreparedRequest stream_output=False, method=GET, url=http://rclone-s3-service/mybucket?list-type=2&prefix=myfolder%2F&encoding-type=url, headers={..}

I have verified that if I use curl to hit http://rclone-s3-service/mybucket/myfolder it does list properly, so it seems to be an issue with the aws cli. I am using aws cli v2.
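
For reference, here is roughly how I tested by hand (a sketch; since my serve isn't using --auth-key, unsigned curl requests are accepted):

# Path-style request that lists properly for me:
curl "http://rclone-s3-service/mybucket/myfolder"

# The form the AWS CLI actually sends (bucket in the path, folder only as a ListObjectsV2 prefix query parameter):
curl "http://rclone-s3-service/mybucket?list-type=2&prefix=myfolder%2F&encoding-type=url"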

Run the command 'rclone version' and share the full output of the command.

rclone v1.67.0

  • os/version: alpine 3.20.0 (64 bit)
  • os/kernel: 6.8.0-1008-aws (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.22.4
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

AWS S3

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[s3]
type = s3
provider = AWS
env_auth = true
region = us-west-1
endpoint = s3.amazonaws.com

A log from the command that you were trying to run with the -vv flag

No logs are generated on the server side despite having -vv set.

I tried to replicate this with the latest beta (shortly to become v1.68)

Setup

mkdir /tmp/src
mkdir /tmp/src/bucket
mkdir /tmp/src/bucket/dir
echo hello > /tmp/src/bucket/dir/file.txt

Serve the s3 bucket

rclone serve s3 -vv --auth-key ACCESS_KEY_ID,SECRET_ACCESS_KEY /tmp/src

And in another terminal

export AWS_ACCESS_KEY_ID=ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY=SECRET_ACCESS_KEY

Listing works ok using aws v2.15.37

$ aws s3 ls s3://bucket/dir/ --endpoint-url http://127.0.0.1:8080/
2024-09-07 17:06:19          6 file.txt

What are you doing differently?


I'm serving an AWS S3 bucket; it seems like you're serving local files in your example. I'm also set up for caching and using a different version.

Additionally, I had to set the base URL flag to the name of the bucket to even get copying to work; otherwise I kept getting 404/not found errors. So now copying works, but not listing.

It would be helpful if you could supply the rclone command line, then I can compare.

I am guessing it is the base URL flag that is the problem; I don't think you should need that.

Try without the base URL with the latest beta also.


Here you go, sorry, I should have included this in my original message. This is from my Kubernetes deployment YAML.

args:
        - serve
        - s3
        - s3:mybucket
        - --log-level=DEBUG
        - --addr=:8080
        - --baseurl=mybucket
        - --vfs-cache-mode=full
        - --vfs-cache-max-age=1m
        - --vfs-cache-max-size=10G

The reason I added the baseurl is that none of the commands would work without it. I spent many hours trying to figure out why until I noticed that the AWS CLI wasn't hitting the URL in the way rclone was expecting.

Without setting the baseurl, I get The specified bucket does not exist errors for everything. When I debugged the AWS CLI output, I saw that it was appending the bucket name to the URL when hitting rclone, while by default rclone expects requests at the root URL. Setting baseurl was the only way I could find to make the two communicate properly... for copy commands anyway.
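
To illustrate how I understand the path handling (a sketch, not necessarily exactly what rclone does internally):

# What the AWS CLI sends for a copy (path-style addressing, bucket first in the path):
#   GET http://rclone-s3-service/mybucket/myfolder/file.txt
# What rclone serve s3 with source s3:mybucket expects by default:
#   GET http://rclone-s3-service/myfolder/file.txt
# --baseurl=mybucket makes rclone serve under the /mybucket prefix, so copies line up:
rclone serve s3 s3:mybucket --baseurl=mybucket --addr=:8080 --vfs-cache-mode=full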

Is there a docker image available on dockerhub with the new beta version? Happy to try that. I'm using the newest one available that I could find.

@ncw I was able to find the instructions for using the beta docker image, so I went ahead and updated and removed the baseurl flag. Now, like before, I get The specified bucket does not exist errors from all AWS commands.

Here's the command I ran: aws s3 ls s3://mybucket/myfolder/ --endpoint-url http://rclone-s3-service --debug

Here's partial output from the AWS CLI debug log. As you can see, the URL the AWS CLI is trying to hit is http://rclone-s3-service/mybucket instead of http://rclone-s3-service/myfolder.

2024-09-07 19:39:42,379 - MainThread - botocore.endpoint - DEBUG - Sending http request: <AWSPreparedRequest stream_output=False, method=GET, url=http://rclone-s3-service/mybucket?list-type=2&prefix=folder%2F&delimiter=%2F&encoding-type=url, headers={...}

Setting the baseurl flag fixes this for the aws s3 cp command and I'm able to copy files, but it doesn't fix the aws s3 ls command.

My environment is a base ubuntu docker image with only aws cli v2 installed, nothing more. And my rclone environment is your provided docker image. Nothing else is custom. It's really a pretty basic setup.
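
In case it helps, this is roughly how the test container is built (a sketch; the exact base image and package versions are assumptions):

# install aws cli v2 on a plain ubuntu image (x86_64)
apt-get update && apt-get install -y curl unzip
curl -sSL "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o awscliv2.zip
unzip awscliv2.zip && ./aws/install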

Here is the problem. If you change s3:mybucket to just s3: and remove the --baseurl then it will work with the aws s3 commands you've been using.

Or alternatively, if you only want to serve s3:mybucket, then remove the --baseurl and in your aws s3 commands use s3://myfolder/, i.e. leave the mybucket out.

Remember that rclone serve s3 treats each directory in the root of the source as a bucket. In your case you are using s3:mybucket as the source, so it is treating myfolder as a bucket.
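
To summarize the two options as a sketch (endpoint and flags assumed from earlier in the thread):

# Option 1: serve the whole remote, so real S3 buckets appear as buckets
rclone serve s3 s3: --addr=:8080
#   aws s3 ls s3://mybucket/myfolder/ --endpoint-url http://rclone-s3-service

# Option 2: serve a single bucket, so its top-level folders become the buckets
rclone serve s3 s3:mybucket --addr=:8080
#   aws s3 ls s3://myfolder/ --endpoint-url http://rclone-s3-service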

OK, I have made those changes, but am now getting The specified bucket does not exist from the AWS CLI.

Then in the rclone logs I see:

2024/09/08 16:32:04 DEBUG : rclone: Version "v1.68.0-beta.8292.9a02c0402" starting with parameters ["rclone" "serve" "s3" "s3:" "--log-level=DEBUG" "--addr=:8080" "--vfs-cache-mode=full" "--vfs-cache-max-age=1m" "--vfs-cache-max-size=10G"]
2024/09/08 16:32:04 DEBUG : Creating backend with remote "s3:"
2024/09/08 16:32:04 DEBUG : Using config file from "/config/rclone/rclone.conf"
2024/09/08 16:32:04 NOTICE: serve s3: No auth provided so allowing anonymous access
2024/09/08 16:32:04 INFO  : S3 root: poll-interval is not supported by this remote
2024/09/08 16:32:04 DEBUG : vfs cache: root is "/root/.cache/rclone"
2024/09/08 16:32:04 DEBUG : vfs cache: data root is "/root/.cache/rclone/vfs/s3"
2024/09/08 16:32:04 DEBUG : vfs cache: metadata root is "/root/.cache/rclone/vfsMeta/s3"
2024/09/08 16:32:04 DEBUG : Creating backend with remote ":local,encoding='Slash,Dot':/root/.cache/rclone/vfs/s3/"
2024/09/08 16:32:04 DEBUG : :local: detected overridden config - adding "{bxYPm}" suffix to name
2024/09/08 16:32:04 DEBUG : fs cache: renaming cache item ":local,encoding='Slash,Dot':/root/.cache/rclone/vfs/s3/" to be canonical ":local{bxYPm}:/root/.cache/rclone/vfs/s3"
2024/09/08 16:32:04 DEBUG : Creating backend with remote ":local,encoding='Slash,Dot':/root/.cache/rclone/vfsMeta/s3/"
2024/09/08 16:32:04 DEBUG : :local: detected overridden config - adding "{bxYPm}" suffix to name
2024/09/08 16:32:04 DEBUG : fs cache: renaming cache item ":local,encoding='Slash,Dot':/root/.cache/rclone/vfsMeta/s3/" to be canonical ":local{bxYPm}:/root/.cache/rclone/vfsMeta/s3"
2024/09/08 16:32:04 INFO  : vfs cache: cleaned: objects 0 (was 0) in use 0, to upload 0, uploading 0, total size 0 (was 0)
2024/09/08 16:32:04 NOTICE: S3 root: Starting s3 server on [http://[::]:8080/]


2024/09/08 16:32:53 DEBUG : serve s3: LIST BUCKET
2024/09/08 16:32:53 ERROR : /: Dir.Stat error: operation error S3: ListBuckets, https response error StatusCode: 400, RequestID: BZ5MTJ1ABMSR1VV6, HostID: Q3ScbHLZJORclD5fpfQcCR7hgzUs47yAeHAP2mLZxozsiKPe6TBLOpcYVbO2nRP3aJi1rG0n1jE=, api error AuthorizationHeaderMalformed: The authorization header is malformed; the region 'us-west-1' is wrong; expecting 'us-east-1'



2024/09/08 16:33:04 INFO  : vfs cache: cleaned: objects 0 (was 0) in use 0, to upload 0, uploading 0, total size 0 (was 0)
2024/09/08 16:33:20 DEBUG : serve s3: LIST BUCKET
2024/09/08 16:33:20 ERROR : /: Dir.Stat error: operation error S3: ListBuckets, https response error StatusCode: 400, RequestID: X32MBG9QKJS4VDAW, HostID: FOvgDCjiSdNA4TcQtg/nP9Bk1iw1oyo3pE5uzn48zrlhzC58of1ICDEQZFTTBc7HuMlmV0om7xg=, api error AuthorizationHeaderMalformed: The authorization header is malformed; the region 'us-west-1' is wrong; expecting 'us-east-1'

Here's some debug output from the AWS CLI:

2024-09-08 09:38:55,338 - MainThread - botocore.endpoint - DEBUG - Sending http request: <AWSPreparedRequest stream_output=False, method=GET, url=http://rclone-s3-service/mybucket?list-type=2&prefix=myfolder&delimiter=%2F&encoding-type=url, headers={...}

Here's my rclone config

    [s3]
    type = s3
    provider = AWS
    env_auth = true
    region = us-west-1
    endpoint = s3.amazonaws.com

My bucket is in us-west-1 and that's what I have set everywhere. Not sure why it's complaining about us-east-1. Everything works fine when I remove rclone from the mix, so it doesn't seem to be an issue on the AWS/IAM/S3 side.
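
One guess on my side (untested, just an assumption): endpoint = s3.amazonaws.com is the global endpoint, which expects us-east-1 signing for ListBuckets, so a regional endpoint might avoid the mismatch:

    [s3]
    type = s3
    provider = AWS
    env_auth = true
    region = us-west-1
    # assumption: regional endpoint instead of the global one (or drop the endpoint line entirely)
    endpoint = s3.us-west-1.amazonaws.com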

--- update ----

I just tried your suggestion of :

Or alternatively if you only want to serve s3:mybucket then remove the --baseurl and in your aws s3 commands use s3://myfolder/ so leave the mybucket out.

And that does work for all commands! Yay! That said, I would prefer to keep the bucket name in the s3:// path for a more seamless experience. We have a lot of automation built for our app and changing that would be a PITA. It would be nice if we could just add the endpoint URL and leave everything else as-is. Of course, if that's not possible we will adapt... hoping you might have another tip though :). Thanks so much.

Does rclone have permissions to list buckets? You'll need that for this method.

So does

rclone lsd s3:

work?

If you want you can use the combine backend to do that, something like this

rclone serve s3 -vv --auth-key ACCESS_KEY_ID,SECRET_ACCESS_KEY ':combine,upstreams="mybucket=s3:mybucket":'

It is probably easier to use if you put it in the config file

[s3bucket]
type = combine
upstreams = mybucket=s3:mybucket

then

rclone serve s3 -vv --auth-key ACCESS_KEY_ID,SECRET_ACCESS_KEY s3bucket:
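
With that combine setup the aws commands keep the bucket name in the path, e.g. (endpoint URL assumed from your messages above):

aws s3 cp s3://mybucket/myfolder/file.txt . --endpoint-url http://rclone-s3-service
aws s3 ls s3://mybucket/myfolder/ --endpoint-url http://rclone-s3-service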

OK, I have updated to your suggested configuration. Things do appear to work now (I can copy and list) with one small issue: I'm not able to list anything in the root of the bucket, I just get the error list index out of range.

I have tried both:

aws s3 ls s3://mybucket/ --endpoint-url http://rclone-s3-service

and (without the trailing slash)

aws s3 ls s3://mybucket --endpoint-url http://rclone-s3-service

If I add a folder (aws s3 ls s3://mybucket/myfolder/), it does work.

Here's the message in the rclone log:

2024/09/09 18:32:25 DEBUG : serve s3: bucketname:%!(EXTRA string=mybucket, string=prefix:, gofakes3.Prefix=prefix:"", delim:"/", string=page:, string={Marker: HasMarker:false MaxKeys:1000})

Here is my config:

    [s3]
    type = s3
    provider = AWS
    env_auth = true
    region = us-west-1
    endpoint = s3.amazonaws.com

    [s3bucket]
    type = combine
    upstreams = mybucket=s3:mybucket

Here's the command I'm using:

args:
        - serve
        - s3
        - "s3bucket:"
        - --log-level=DEBUG
        - --addr=:8080
        - --vfs-cache-mode=full
        - --vfs-cache-max-age=1m
        - --vfs-cache-max-size=10G

Also, per your question about the command rclone lsd s3:, when I run that from an sh shell on my rclone pod, I get this error:

/data # rclone lsd s3:
2024/09/09 18:36:39 ERROR : : error listing: operation error S3: ListBuckets, https response error StatusCode: 400, RequestID: BN5P0BFKPNDWPSFS, HostID: FzFvDzhrEl5ux/lIt/qeLdYn0cT/We3Z5rkv/NzlWpapmKKz2P5HLJ/+2FkmWGL3A3Ol/i3JdNs=, api error AuthorizationHeaderMalformed: The authorization header is malformed; the region 'us-west-1' is wrong; expecting 'us-east-1'
2024/09/09 18:36:39 NOTICE: Failed to lsd with 2 errors: last error was: operation error S3: ListBuckets, https response error StatusCode: 400, RequestID: BN5P0BFKPNDWPSFS, HostID: FzFvDzhrEl5ux/lIt/qeLdYn0cT/We3Z5rkv/NzlWpapmKKz2P5HLJ/+2FkmWGL3A3Ol/i3JdNs=, api error AuthorizationHeaderMalformed: The authorization header is malformed; the region 'us-west-1' is wrong; expecting 'us-east-1'

The role does have full permissions to the bucket, and I'm able to run these commands without rclone and they work fine, so it doesn't seem to be permission related.

Great

I saw some of those too when testing. They look more like a bug in the aws s3 tool to me, but they are being provoked by something rclone is sending.

If you can make that work then the original config will work.
