I expected it to be already in sync. But rclone tries to copy again all the files from s3 to the file system.
If I increase the verbosity, I see:
`Modification times differ by 9229h50m27.441916489s: 2018-09-23 23:12:03 +0200 CEST, 2019-10-13 13:02:30.441916489 +0200 CEST`
It looks like it is comparing the original creation time of the file stored in the encrypted S3 with the time I synced the file onto the filesystem, and it wants to copy the file again. But this does not happen with the encrypted-sftp backend. Does anyone know why, please? I do not understand, even after reading rclone's S3 backend documentation.
The data is almost 2 TB, so I would like to avoid copying it all again if possible.
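For reference, a sketch of the kind of command that produces the message above. The remote name `s3crypt:` and the paths are hypothetical stand-ins, not taken from the original post:

```shell
# Hypothetical sync from an encrypted S3 remote to local disk.
# -v raises verbosity enough to surface the
# "Modification times differ" messages quoted above.
rclone sync s3crypt:backup /mnt/data -v
```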
It sounds like what may be happening is that the encrypted file's modtime is being compared rather than the real file's modtime (the one inside the encrypted file). A second complication is that there are 3 different filesystems in play here. It may not be obvious to most users, but each of these systems has different rules about how its timestamps work, so something might have gone sideways in that regard.
The exact how or why of what might trigger this is a little beyond me though, we might need @ncw to chime in here and suggest how we best troubleshoot this.
But what I can say is that there are many ways to work around this, at least.
We could compare on hashes instead of the usual modtime+size. This is much more accurate anyway (at the cost of a little CPU); the only problem is I'm not sure whether you can use it directly on an encrypted S3 --> local transfer.
I would ask you to just try it first of all:
add --checksum to your rclone command
and see if you get a "--checksum ignored because filesystems do not share a common hash" message.
Rclone may be able to work around that automatically with a manual hash calculation (if so, there will be no error) - but this is the part I can't say for sure off the top of my head.
There are also more ways to go about this, but give me some feedback on what you think may be appropriate, and test --checksum first.
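Something like the following, keeping the hypothetical remote/path names from before, is all that is meant here:

```shell
# Same hypothetical sync, but --checksum makes rclone compare
# file hashes instead of modtime+size when deciding what to copy.
rclone sync s3crypt:backup /mnt/data --checksum -v
```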
So, I tried with --checksum, and it does not try to copy the files anymore, so that seems to work.
It seems that for some files the date was changed, but for others it was not. It is not clear why. I am now running --dry-run over the full dataset to get an idea of how much would be copied.
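A dry run over the whole dataset would look something like this (again with placeholder remote/path names):

```shell
# --dry-run reports what rclone *would* copy without actually
# transferring anything, so it is safe to run on the full dataset.
rclone sync s3crypt:backup /mnt/data --checksum --dry-run -v
```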
If --checksum works then I'd just consider that your fix.
If you don't get that error I talked about at the very start then it means it is working.
But of course, if you want to dig further into this to see if there is a bug somewhere to correct, you can do that too. In that case I suspect we are going to need some -vv logging to track the modtime and the comparison.
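For that kind of debugging, one way (sketched with the same placeholder names) is to capture the very verbose output to a file so the per-file modtime comparison can be inspected afterwards:

```shell
# -vv logs debug-level detail, including modtime comparisons;
# --log-file writes it somewhere persistent, and --dry-run
# keeps the run read-only while investigating.
rclone sync s3crypt:backup /mnt/data -vv --dry-run --log-file=rclone-debug.log
```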
So just to be clear - is rclone doing the comparison wrong but the file has a correct modtime?
Or does the file actually have a wrong modtime because it got changed/corrupted somewhere?
If it is the latter, can you maybe see where this happens? It's pretty likely it is happening in one of the specific transfer steps between remotes.