AWS S3: a lot of checking (with template)

What is the problem you are having with rclone?

I'm getting a lot of...

Checking:

  • 2021/3/1/12/53…152B/07727062/03457411: checking
  • 2021/3/1/12/53…152B/07727062/0345E886: checking
  • 2021/3/1/12/53…152B/07727062/0346D145: checking
  • 2021/3/1/12/53…152B/07727062/03482E63: checking

The docs have a section on "Avoiding HEAD requests to read the modification time" (ModTime).

So I used --checksum, but it still keeps checking... am I doing the right thing?

How can I avoid this check, and what would the downsides be?

Google Cloud Storage doesn't have this problem... this is with AWS S3.

--checksum still keeps checking, but --size-only seems to help.

With --size-only, rclone sync does not check the files again. Is there a problem with using this flag? Is there another way to avoid this checking?

What is your rclone version (output from rclone version)

rclone v1.56.0

Which cloud storage system are you using? (eg Google Drive)

AWS S3

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync archive remote:/archive -v -P --checkers 16 --transfers 8

The rclone config contents with secrets removed.

type = s3
access_key_id = 
secret_access_key = 
storage_class = ONEZONE_IA

A log from the command with the -vv flag

2021-09-03 11:37:18 DEBUG : 2021/9/3/6/57669A05/6165DD8F/5E9915A5: Unchanged skipping
2021-09-03 11:37:18 DEBUG : 2021/9/3/6/57669A05/6165DD8F/5E9977BD: Size and modification time the same (differ by 0s, within tolerance 1ns)
2021-09-03 11:37:18 DEBUG : 2021/9/3/6/57669A05/6165DD8F/5E9977BD: Unchanged skipping
2021-09-03 11:37:18 DEBUG : 2021/9/3/6/57669A05/6165DD8F/5E998A81: Size and modification time the same (differ by 0s, within tolerance 1ns)
2021-09-03 11:37:18 DEBUG : 2021/9/3/6/57669A05/6165DD8F/5E998A81: Unchanged skipping
2021-09-03 11:37:18 DEBUG : 2021/9/3/6/57669A05/6165DD8F/5E99EC18: Size and modification time the same (differ by 0s, within tolerance 1ns)

rclone sync has to check the source against the dest.

that is what the log shows: rclone checking each source file against the corresponding dest file.

Why doesn't Google Cloud Storage have it?

How can I avoid it on AWS?

sorry, not sure what you are asking?

rclone sync has to check the source to dest, regardless of the backend.

The files are already on the backend, and when I run rclone sync again on AWS, it applies this "checking" to all the files again.

the files might be in the backend, but rclone does not know that.

every time rclone sync is run, rclone has to compare all the source files against the dest.

yes. I got it.

Google Cloud Storage doesn't do these checks again. It seems AWS is doing something extra that would be billed.

with aws s3, there is a cost per api call
as far as i know, with gcs, there is also a cost per api call
https://cloud.google.com/storage/pricing#price-tables

wasabi, an s3 clone that i use, does not charge for api calls.

with aws s3, or any s3 clone that charges per api call,
if you want to reduce the number of api calls, then use one of the options documented at
https://rclone.org/s3/#avoiding-head-requests-to-read-the-modification-time
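As a back-of-envelope illustration of why that docs section matters for billing: on S3 the modification time lives in object metadata, which a listing does not return, so reading it costs one HEAD request per object. The sketch below is only a rough estimate, not a measurement of what rclone actually sends. The function name `estimated_requests` is hypothetical, and it assumes a flat listing that returns up to 1000 keys per request.

```python
import math

def estimated_requests(num_objects, needs_modtime, page_size=1000):
    """Rough request count for one sync pass over an S3 destination.

    Assumptions (not taken from rclone's source): one flat listing that
    returns up to `page_size` keys per call, plus one HEAD per object
    whenever the modification time has to be read from metadata.
    """
    list_calls = math.ceil(num_objects / page_size)
    head_calls = num_objects if needs_modtime else 0
    return list_calls + head_calls

# Default comparison (size + modtime) versus a mode that skips modtime.
print(estimated_requests(2500, needs_modtime=True))
print(estimated_requests(2500, needs_modtime=False))
```

Under these assumptions, comparison modes that do not need the stored modification time avoid the per-object term entirely.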

great. I'm here.... Amazon S3

Now we are back to the question I asked at first.

--checksum still keeps checking, but --size-only seems to help.

With --size-only, rclone sync does not check the files again. Is there a problem with using this flag? Is there another way to avoid this checking?

there is no problem using that flag

DEBUG : rclone: Version "v1.56.0" starting with parameters ["c:\\data\\rclone\\scripts\\rclone.exe" "sync" "wasabi01:testfolder01" "d:\\test" "--size-only" "-vv"]
DEBUG : Creating backend with remote "wasabi01:testfolder01"
DEBUG : file01.txt: Sizes identical
DEBUG : file01.txt: Unchanged skipping
DEBUG : file02.txt: Sizes identical
DEBUG : file02.txt: Unchanged skipping
DEBUG : rclone: Version "v1.56.0" starting with parameters ["c:\\data\\rclone\\scripts\\rclone.exe" "sync" "wasabi01:testfolder01" "d:\\test" "--checksum" "-vv"]
DEBUG : file01.txt: md5 = 898a314f5208bb3d0eec9280c3793acf OK
DEBUG : file01.txt: Size and md5 of src and dst objects identical
DEBUG : file01.txt: Unchanged skipping
DEBUG : file02.txt: md5 = 898a314f5208bb3d0eec9280c3793acf OK
DEBUG : file02.txt: Size and md5 of src and dst objects identical
DEBUG : file02.txt: Unchanged skipping

Why didn't --checksum solve it in this case?

can you explain in more detail?

solve what?

what case?

With --checksum it keeps doing a lot of checking, like in the first messages...
With --size-only it does not.

Is --checksum the default?

as you can see there are two source files

  • file01.txt
  • file02.txt

in each debug log, rclone sync compares all source files against the dest.

there is more than one way to compare a file:

  • --checksum compares files using checksums.
  • --size-only compares files using file size.
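To make the difference between the modes concrete, here is a simplified sketch of the decision each one makes. It is not rclone's code: `FileInfo` and `unchanged` are hypothetical names, and it ignores any further fallbacks rclone may apply.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FileInfo:
    size: int                  # size in bytes
    modtime_ns: int            # modification time, nanoseconds since epoch
    md5: Optional[str] = None  # hex digest, if one is available

def unchanged(src, dst, mode="default", tolerance_ns=1):
    """Return True if dst would be skipped as unchanged (simplified)."""
    if src.size != dst.size:
        return False  # every mode treats a size mismatch as a change
    if mode == "size-only":
        return True   # like --size-only: nothing else is consulted
    if mode == "checksum":
        # like --checksum: sizes match, so compare hashes
        return src.md5 is not None and src.md5 == dst.md5
    # default: size and modification time, within a tolerance
    return abs(src.modtime_ns - dst.modtime_ns) <= tolerance_ns

src = FileInfo(size=10, modtime_ns=1_000, md5="abc")
dst = FileInfo(size=10, modtime_ns=2_000, md5="abc")
for mode in ("default", "size-only", "checksum"):
    print(mode, unchanged(src, dst, mode))
```

In this example the two files have the same size and hash but different timestamps, so only the default mode reports a change.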

When none is defined in the command, like below...

rclone sync archive remote:/archive -v -P --checkers 16 --transfers 8

neither --checksum nor --size-only

which is the default type of comparison?

that is documented at https://rclone.org/commands/rclone_sync/
"testing by size and modification time"

to see that for yourself, you can run the command with debug output.
change -v to -vv

2021/09/03 14:51:08 DEBUG : rclone: Version "v1.56.0" starting with parameters ["rclone" "sync" "archive" "/archive" "--config" "/rclone.conf" "-vv" "-P" "--checkers" "16" "--transfers" "8"]

Where does it say which is the default?

The docs say "testing by size and modification time or MD5SUM". Which is the default?

the default as per the debug log

DEBUG : rclone: Version "v1.56.0" starting with parameters ["c:\\data\\rclone\\scripts\\rclone.exe" "sync" "wasabi01:testfolder01" "d:\\test" "-vv"]
DEBUG : file01.txt: Size and modification time the same (differ by 0s, within tolerance 100ns)
DEBUG : file01.txt: Unchanged skipping
DEBUG : file02.txt: Size and modification time the same (differ by 0s, within tolerance 100ns)
DEBUG : file02.txt: Unchanged skipping

The reason you see checking with --checksum is that rclone is computing checksums of the files on your local disk, which takes some time.

If you want it to run quicker, either use --size-only, or --update --use-server-modtime as in the avoiding HEAD requests docs.
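For a sense of why hashing local files takes time, here is a minimal sketch of computing an MD5 for a file on disk. It is not rclone's implementation, and `md5_of_file` is a hypothetical helper. The point is that every byte of the file has to be read, whereas a size comparison needs no file contents at all.

```python
import hashlib

def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks; the whole file is read once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `md5_of_file` on each source file costs time proportional to the total data size, which is consistent with files staying in the "checking" state for longer under --checksum.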

The docs say this applies to rclone sync and rclone copy.

Does it work for rclone move? (in case the file already exists in the destination)