List --s3-versions files looping bug

What is the problem you are having with rclone?

rclone ls --s3-versions is going into an infinite loop repeating the same files.

The problem seems to be with unversioned files, which are relisted infinitely. The only way to stop it is to kill the process.

Is there any way to list only versioned files?

Run the command 'rclone version' and share the full output of the command.

rclone v1.64.2

  • os/version: ubuntu 22.04 (64 bit)
  • os/kernel: 6.2.0-1018-gcp (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.21.3
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

google cloud storage

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone ls --s3-versions remote:archive/2022/

The rclone config contents with secrets removed.

[remote]
type = s3
provider = GCS
access_key_id = 
secret_access_key = 
endpoint = https://storage.googleapis.com

A log from the command with the -vv flag

2023/11/20 12:43:56 DEBUG : rclone: Version "v1.64.2" starting with parameters ["rclone" "ls" "--s3-versions" "remote:archive/2022/7" "-vv"]
2023/11/20 12:43:56 DEBUG : Creating backend with remote "remote:archive/2022/7"
2023/11/20 12:43:56 DEBUG : Using config file from "/root/.config/rclone/rclone.conf"
2023/11/20 12:43:56 DEBUG : remote: detected overridden config - adding "{pO73u}" suffix to name
2023/11/20 12:43:56 DEBUG : Resolving service "s3" region "us-east-1"
2023/11/20 12:43:56 DEBUG : fs cache: renaming cache item "remote:archive/2022/7" to be canonical "remote{pO73u}:nd-srv/pacs-data/archive/2022/7"

Here is the log of the loop. Rename the .log file to .rar.
log.rar.log (12.6 KB)

This is usually caused by --s3-list-version being set wrong for the provider. Can you try different versions for that and see if it makes a difference?

I tried --s3-list-version=1 and --s3-list-version=2. Same problem.

If I remove --s3-versions it works fine.

Can you do a log with -vv --dump bodies? That should show what the problem is.

Thanks

log.rar.txt (82.7 KB)

rename "log.rar.txt" to "log.txt.rar"

Thank you.

It has definitely got stuck in a loop - you can see it fetching the same thing over and over again

GET /?delimiter=&encoding-type=url&max-keys=1000&prefix=pacs-data%2Farchive%2F2022%2F8%2F&versions= HTTP/1.1

When I decode it, that is requesting a prefix of pacs-data/archive/2022/8/.
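For anyone who wants to check the decoding themselves, here is a minimal Python sketch using only the standard library. It is just an illustration, not something rclone runs. It also shows that the repeated request carries no key-marker or version-id-marker parameter (the continuation parameters S3 uses for version listings), which fits a listing that restarts from the beginning each time.

```python
from urllib.parse import parse_qs, urlsplit

# Path and query string copied from the request line in the log.
request_target = (
    "/?delimiter=&encoding-type=url&max-keys=1000"
    "&prefix=pacs-data%2Farchive%2F2022%2F8%2F&versions="
)

# keep_blank_values=True preserves the empty "delimiter" and "versions" parameters.
params = parse_qs(urlsplit(request_target).query, keep_blank_values=True)

prefix = params["prefix"][0]
print("prefix:", prefix)

# No continuation markers are present in the repeated request.
print("has key-marker:", "key-marker" in params)
print("has version-id-marker:", "version-id-marker" in params)
```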

But why? That is the question.

Looking at the list response we see

<?xml version='1.0' encoding='UTF-8'?>
<ListBucketResult xmlns='http://doc.s3.amazonaws.com/2006-03-01'>
  <Name>nd-srv</Name>
  <Prefix>pacs-data/archive/2022/8/</Prefix>
  <Marker>
  </Marker>
  <GenerationMarker>
  </GenerationMarker>
  <NextMarker>pacs-data/archive/2022/8/1/10/11992158/8EC0C0C4/77407295</NextMarker>
  <NextGenerationMarker>1683228165974407</NextGenerationMarker>
  <MaxKeys>1000</MaxKeys>
  <IsTruncated>true</IsTruncated>
  <Encoding-Type>url</Encoding-Type>
  <Version>
    <Key>pacs-data/archive/2022/8/1/10/08E75EBF/F9A5759E/271EE0F6</Key>
    <Generation>1683224794134204</Generation>
    <MetaGeneration>1</MetaGeneration>
    <IsLatest>true</IsLatest>
    <LastModified>2023-05-04T18:26:34.136Z</LastModified>
    <DeletedTime>
    </DeletedTime>
    <ETag>"ea549ded855170e0670a00e5d2e13a0a"</ETag>
    <Size>54366</Size>
  </Version>

...snip lots of similar blocks...

  <Version>
    <Key>pacs-data/archive/2022/8/1/10/11992158/8EC0C0C4/77407295</Key>
    <Generation>1683228165974407</Generation>
    <MetaGeneration>1</MetaGeneration>
    <IsLatest>true</IsLatest>
    <LastModified>2023-05-04T19:22:45.975Z</LastModified>
    <DeletedTime>
    </DeletedTime>
    <ETag>"361b28d02accd86fef5cf92791795699"</ETag>
    <Size>243496</Size>
  </Version>
</ListBucketResult>

Something strange has definitely gone on here.

According to the AWS docs, this is what the response should look like

<?xml version="1.0" encoding="UTF-8"?>
<ListVersionsResult>
   <IsTruncated>boolean</IsTruncated>
   <KeyMarker>string</KeyMarker>
   <VersionIdMarker>string</VersionIdMarker>
   <NextKeyMarker>string</NextKeyMarker>
   <NextVersionIdMarker>string</NextVersionIdMarker>
   <Version>
      <ChecksumAlgorithm>string</ChecksumAlgorithm>
      ...
      <ETag>string</ETag>
      <IsLatest>boolean</IsLatest>
      <Key>string</Key>
      <LastModified>timestamp</LastModified>
      <Owner>
         <DisplayName>string</DisplayName>
         <ID>string</ID>
      </Owner>
      <RestoreStatus>
         <IsRestoreInProgress>boolean</IsRestoreInProgress>
         <RestoreExpiryDate>timestamp</RestoreExpiryDate>
      </RestoreStatus>
      <Size>long</Size>
      <StorageClass>string</StorageClass>
      <VersionId>string</VersionId>
   </Version>
   ...
   <DeleteMarker>
      <IsLatest>boolean</IsLatest>
      <Key>string</Key>
      <LastModified>timestamp</LastModified>
      <Owner>
         <DisplayName>string</DisplayName>
         <ID>string</ID>
      </Owner>
      <VersionId>string</VersionId>
   </DeleteMarker>
   ...
   <Name>string</Name>
   <Prefix>string</Prefix>
   <Delimiter>string</Delimiter>
   <MaxKeys>integer</MaxKeys>
   <CommonPrefixes>
      <Prefix>string</Prefix>
   </CommonPrefixes>
   ...
   <EncodingType>string</EncodingType>
</ListVersionsResult>

In particular, note that the response should be a ListVersionsResult but GCS has given us a ListBucketResult.

This is why rclone is looping: it expects to find NextKeyMarker and NextVersionIdMarker in the response, but they aren't there, so it starts the listing over again each time.
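To make the mismatch concrete, here is a small Python sketch that pulls the continuation fields out of both response shapes. It is only an illustration: rclone itself goes through the AWS S3 SDK for Go, the GCS sample is a trimmed copy of the response in the log, and the marker values in the AWS-style sample are made up.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the GCS response from the log (Version blocks omitted).
GCS_RESPONSE = """<ListBucketResult xmlns='http://doc.s3.amazonaws.com/2006-03-01'>
  <Prefix>pacs-data/archive/2022/8/</Prefix>
  <NextMarker>pacs-data/archive/2022/8/1/10/11992158/8EC0C0C4/77407295</NextMarker>
  <NextGenerationMarker>1683228165974407</NextGenerationMarker>
  <IsTruncated>true</IsTruncated>
</ListBucketResult>"""

# AWS-shaped response; the marker values here are hypothetical.
AWS_STYLE_RESPONSE = """<ListVersionsResult>
  <IsTruncated>true</IsTruncated>
  <NextKeyMarker>example/key</NextKeyMarker>
  <NextVersionIdMarker>example-version-id</NextVersionIdMarker>
</ListVersionsResult>"""


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def summarize(xml_text):
    """Return (root element, IsTruncated, NextKeyMarker, NextVersionIdMarker)."""
    root = ET.fromstring(xml_text)
    # Only the scalar top-level fields matter here; repeated elements are ignored.
    fields = {local_name(child.tag): (child.text or "").strip() for child in root}
    return (
        local_name(root.tag),
        fields.get("IsTruncated") == "true",
        fields.get("NextKeyMarker"),
        fields.get("NextVersionIdMarker"),
    )


print(summarize(GCS_RESPONSE))
print(summarize(AWS_STYLE_RESPONSE))
```

The GCS response is truncated but carries NextMarker and NextGenerationMarker instead of the two fields an S3 client looks for, so a client reading only the S3 field names sees a truncated page with nothing to continue from.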

However, that doesn't explain why GCS is giving us the wrong listing result.

A bit of searching finds a comment which indicates GCS is being incompatible here. There is some more discussion on the Drupal forum about this.

I can't find a definitive statement in the GCS docs about this, but looking at the docs for the API call it just doesn't look compatible with S3.

Rclone should definitely notice the response is wrong and give an error, which would have saved you from the looping problem.
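The idea of such a check can be sketched in Python as below. This is not rclone's actual code (rclone is written in Go); the function names, the page dictionary layout and the ProtocolError class are all hypothetical, chosen only to show a pagination loop that refuses to continue when a page claims to be truncated but supplies no next key marker.

```python
class ProtocolError(Exception):
    """Raised when a listing response cannot be continued safely."""


def list_all_versions(fetch_page):
    """Collect versions from a paginated listing.

    fetch_page(key_marker, version_id_marker) is a hypothetical callable that
    returns a dict with 'versions', 'is_truncated', 'next_key_marker' and
    'next_version_id_marker'.
    """
    key_marker = None
    version_id_marker = None
    versions = []
    while True:
        page = fetch_page(key_marker, version_id_marker)
        versions.extend(page["versions"])
        if not page["is_truncated"]:
            return versions
        # Without this guard the markers stay empty and the first page
        # would be fetched again forever.
        if not page.get("next_key_marker"):
            raise ProtocolError(
                "received versions listing with IsTruncated set with no NextKeyMarker"
            )
        key_marker = page["next_key_marker"]
        version_id_marker = page.get("next_version_id_marker")


def gcs_like_page(key_marker, version_id_marker):
    """Mimic the response in the log: truncated, but no S3-style markers."""
    return {
        "versions": ["a", "b"],
        "is_truncated": True,
        "next_key_marker": None,
        "next_version_id_marker": None,
    }


try:
    list_all_versions(gcs_like_page)
except ProtocolError as err:
    print("Failed to list:", err)
```

Failing fast like this turns an endless loop into a clear error, at the cost of not being able to list versions at all on a provider that answers this way.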

I have fixed that here:

v1.65.0-beta.7523.15747cc9f.fix-s3-gcs-looping on branch fix-s3-gcs-looping (uploaded in 15-30 mins)

However I think the real problem here is that the GCS S3 interface isn't fully AWS S3 compatible. I couldn't find any google issues about this so it would probably be worth making one if you agree with my diagnosis.

I can't make a workaround for this - the AWS S3 SDK is very inflexible unfortunately.

nice!

2023/11/21 14:03:56 Failed to ls: s3 protocol error: received versions listing with IsTruncated set with no NextKeyMarker

For now we have solved our problem using the Google API (slow and painful).

It's not something we need to use often. In the future we can think and see how to work with versioning using the Google API.

Glad that worked :)

Sorry rclone couldn't hack it for you this time!

Do you think it is worth reporting a Google bug about this? I guess if I do that, then at least I can link to it in the rclone docs!

I have reported lots of bugs but Google rarely seems to do anything about them, alas.

no problem. Thank you for the assistance!

Where is the right place to report this to Google? I never went back.

https://issuetracker.google.com/u/1/issues/312292516

I did it here. Not sure it is the right place.

That is perfect. I added some more detail to the bug report!

I've merged this to master now which means it will be in the latest beta in 15-30 minutes and released in v1.65

I put a note in the docs also.
