Wasabi directory with more than 1,000 files

What is the problem you are having with rclone?

When syncing files to Wasabi S3 storage I see that the same files get synced over and over.
I noticed that the files that get copied multiple times are file numbers 1001 and higher in the directory. So it seems that rclone is only fetching the first 1,000 files in the directory when doing the comparison and then assumes everything after that needs to be uploaded.

This is causing my storage usage to skyrocket, because it actually uploads the files again on every sync. :cry:

To make things worse, because it thinks the file is new it also overwrites the modification date, which there is currently no fix for (see the "Rclone sync between two WebDAV issues" thread), except to delete the file and re-upload it, which would also be very bad for usage charges.

I looked at this same folder in the WebUI and it also seems to only show the first 1,000 files.

rclone ls "wasabi3:/derek/path to/timelapse files" also shows only 1,000 files.
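For reference, a quick way to confirm how many objects rclone actually sees under that prefix (the remote name and path below are just the ones from above) is to count the listing, or ask for a summary:

# count the objects rclone lists under the path
rclone ls "wasabi3:/derek/path to/timelapse files" | wc -l

# or report the total object count and size for the same path
rclone size "wasabi3:/derek/path to/timelapse files"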

What is your rclone version (output from rclone version)

rclone v1.50.2

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Linux, 64 bit

Which cloud storage system are you using? (eg Google Drive)

Nextcloud (with Local Storage) and Wasabi S3

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync NC: wasabi3:/derek -P --size-only -v

A log from the command with the -vv flag (eg output from rclone -vv copy /tmp remote:tmp)

2020-02-02 21:12:10 INFO : Photos/Timelapse/File1001.JPG: Copied (new)
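As an aside, the same comparison can be reproduced without uploading anything further (and without adding to the storage bill) by appending --dry-run to the command above; sync then only reports what it would transfer:

# shows the planned transfers, but copies nothing
rclone sync NC: wasabi3:/derek -P --size-only -v --dry-run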

hello,
can you supply this?

A log from the command with the -vv flag (eg output from rclone -vv copy /tmp remote:tmp )

Here you go. You can see it quickly checks the 1,000 files and then proceeds to copy the rest :man_facepalming:

2020-02-02 23:14:46 DEBUG : G0026935.JPG: Size and modification time the same (differ by 0s, within tolerance 1s)
2020-02-02 23:14:46 DEBUG : G0026935.JPG: Unchanged skipping
2020-02-02 23:14:46 DEBUG : G0026936.JPG: Size and modification time the same (differ by 0s, within tolerance 1s)
2020-02-02 23:14:46 DEBUG : G0026936.JPG: Unchanged skipping
2020-02-02 23:14:46 DEBUG : G0026937.JPG: Size and modification time the same (differ by 0s, within tolerance 1s)
2020-02-02 23:14:46 DEBUG : G0026937.JPG: Unchanged skipping
2020-02-02 23:14:46 DEBUG : G0026938.JPG: Size and modification time the same (differ by 0s, within tolerance 1s)
2020-02-02 23:14:46 DEBUG : G0026938.JPG: Unchanged skipping
2020-02-02 23:14:46 DEBUG : G0026939.JPG: Size and modification time the same (differ by 0s, within tolerance 1s)
2020-02-02 23:14:46 DEBUG : G0026939.JPG: Unchanged skipping
2020-02-02 23:14:46 INFO : S3 bucket derek path Photos/path to/Time Lapse: Waiting for transfers to finish
2020-02-02 23:14:47 INFO : G0026949.JPG: Copied (new)
2020-02-02 23:14:47 INFO : G0026948.JPG: Copied (new)
2020-02-02 23:14:47 INFO : G0026950.JPG: Copied (new)
2020-02-02 23:14:48 INFO : G0026951.JPG: Copied (new)
2020-02-02 23:14:52 INFO : G0026953.JPG: Copied (new)
2020-02-02 23:14:52 INFO : G0026952.JPG: Copied (new)
2020-02-02 23:14:52 INFO : G0026954.JPG: Copied (new)
2020-02-02 23:14:52 INFO : G0026955.JPG: Copied (new)
2020-02-02 23:14:56 INFO : G0026956.JPG: Copied (new)
2020-02-02 23:14:56 INFO : G0026957.JPG: Copied (new)
2020-02-02 23:14:56 INFO : G0026958.JPG: Copied (new)
2020-02-02 23:14:56 INFO : G0026959.JPG: Copied (new)
Transferred: 45.745M / 2.450 GBytes, 2%, 2.009 MBytes/s, ETA 20m26s
Errors: 0
Checks: 1000 / 1000, 100%
Transferred: 20 / 1092, 2%
Elapsed time: 22.7s
Transferring:

 * G0026960.JPG: transferring
 * G0026961.JPG: transferring
 * G0026962.JPG: transferring
 * G0026963.JPG: transferring
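Assuming a log like the one above has been saved to a file (the name sync.log below is just illustrative), the 1,000 boundary can be confirmed by counting the two kinds of lines:

# how many files were compared and skipped (the first 1,000)
grep -c "Unchanged skipping" sync.log

# how many files were treated as new and copied
grep -c "Copied (new)" sync.log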

perhaps you can update to the latest version and test again.
i have many TB of data in wasabi.

there was a bug concerning file counts above 1,000 files.
i helped do some testing to prove that there was a bug, and more testing to prove that the bug was fixed.
but if i remember correctly, that was when using rclone mount.

in addition, just now, i re-ran the same scripts i used when testing for that bug.
for me, rclone ls is not limited to the first 1,000 files.

to help see if there is a bug, can you do what i did and see what happens? (a rough bash equivalent for linux is sketched right after these steps)

  1. on your local file system, create a NEW folder with 2,000 files using a script like this:
    For /L %%i in (1,1,2000) do fsutil file createnew ".\2000files\test.%%i.txt" 1
  2. copy those files to wasabi to a NEW bucket and folder
    rclone.1.51.0.exe copy .\2000files\ wasabieast2:\2000files\test\ -vv --log-file=copy2000files.txt
    and check the log
    Transferred: 2000 / 2000, 100%
  3. do the rclone ls command
    rclone.1.51.0.exe ls wasabieast2:\2000files\test\ > ls2000files.txt
  4. do a rclone sync and check the log
    rclone.1.51.0.exe sync .\2000files\ wasabieast2:\2000files\test\ -vv --log-file=sync2000files.txt

i found that old post about the bug; it was with rclone mount and a space character in the path+filename.

so, just now, i re-ran my tests with a space in the source and dest.
again, no problems.

For /L %%i in (1,1,2000) do fsutil file createnew ".\2000 files\test %%i.txt" 1
rclone.1.51.0.exe copy ".\2000 files" "wasabieast2:\2000files\t est" -vv --log-file=copy2000files.txt
rclone.1.51.0.exe ls "wasabieast2:\2000files\t est" > ls2000files.txt
rclone.1.51.0.exe sync ".\2000 files" "wasabieast2:\2000files\t est" -vv --log-file=sync2000files.txt

Thanks - I installed the latest version as you suggested (using the manual install script, because my repos aren't updated yet) and now it seems to work fine.
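For anyone else stuck on an old distro package, the manual install script mentioned here is presumably the one documented at rclone.org/install, along the lines of:

# fetch and run the official install script (installs the latest stable rclone)
curl https://rclone.org/install.sh | sudo bash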

Yes, this was fixed a while back and I had it queued to go out in a point release for 1.50.x, but I never made the release - sorry! It is all fixed in 1.51 though.

No worries. Thanks for such a great tool. In the end, I think S3 just isn't a good storage back-end for Nextcloud or other use cases where the modified date is so important.

The only reason I had such an old version is that I'm using Nethserver. I'll raise a feature request asking them to update their repository to a newer version. :slight_smile:

