Not understanding why a file is deleted

hello to all,
i do not understand why the log is stating "EN07.vbm: Deleted".
that file was already on wasabi and the md5 matched, but there was a difference in modtime.
so why is that file reported as deleted, when it was not deleted in wasabi?

thanks,

2019/10/09 12:43:50 DEBUG : rclone: Version "v1.49.3" starting with parameters ["C:\data\rclone\scripts\rclone.exe" "move" "wasabiwest01:vserver03-en07.veaamfull\backup\" "wasabiwest01:vserver03\vserver03.en07.veaamfull\rclone\backup\" "--stats=0" "--progress" "--log-level" "DEBUG" "--log-file=C:\data\rclone\scripts\move\rclone.log"]
2019/10/09 12:43:50 DEBUG : Using RCLONE_CONFIG_PASS password.
2019/10/09 12:43:50 DEBUG : Using config file from "c:\data\rclone\scripts\rclone.conf"
2019/10/09 12:43:51 INFO : S3 bucket vserver03 path vserver03.en07.veaamfull/rclone/backup: Waiting for checks to finish
2019/10/09 12:43:51 DEBUG : EN07.vbm: Modification times differ by 41s: 2019-10-09 16:40:52 +0000 UTC, 2019-10-09 16:41:33 +0000 UTC
2019/10/09 12:43:51 DEBUG : EN07.vbm: MD5 = 49776cabff22963b28e242f696973b28 OK
2019/10/09 12:43:51 INFO : EN07.vbm: Updated modification time in destination
2019/10/09 12:43:51 DEBUG : EN07.vbm: Unchanged skipping
2019/10/09 12:43:51 INFO : EN07.vbm: Deleted

If EN07.vbm already existed in both the source and the destination,
and you then did a MOVE,
then it would be normal for the file in the source to get deleted after the file in the destination has been checked (and in this case also had its modtime updated).

TLDR: I assume the log is talking about deleting the file in the source, not the destination - and that would be as intended.

Does that make sense to you, or did I miss something?

sorry,
the monkey in me got confused.

He who knows he is a fool is not the biggest fool; He who knows he is confused is not in the worst confusion.

-Chuang Tzu

nice quote,

as this is a server-side move, which really is a server-side copy and then a delete, i do not understand why rclone.exe is checking the files before the transfer.
the files in wasabi should already have an md5 checksum, correct?

how is rclone doing the checking? is it downloading the file to my local server or what?

Transferred: 0 / 202.788 GBytes, 0%, 0 Bytes/s, ETA -
Errors: 0
Checks: 2 / 3, 67%
Transferred: 0 / 5, 0%
Elapsed time: 10m49.8s
Checking:
 * EN072019-05-09T073008.vbk: checking
Transferring:
 * EN072019-05-09T073008.vbk: transferring

"checking" can mean a lot of things depending on the circumstance.

Normally it "checks" the size/modtime (as well as general existence) of both the file on the source and destination. It then uses that information to make a choice about what to do. skip, transfer, skip + update modtime ect.

Assuming both drives have matching hash types and it's not unencrypted-->encrypted or encrypted-->different encryption, then it will check the hash too if needed. I think it did that automatically in your deletion case after it saw a different modtime. Then it saw the hash was the same and just decided to update the modtime rather than retransfer.

You can force rclone to always compare by hash (where possible) using the --checksum flag.
This is generally a good idea when you can, but it's also not really needed in most cases. Consider it "paranoid mode".
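
If you wanted to try it here, it might look something like this (the remote and paths are just lifted from your log, so adjust them to your real ones):

rclone move wasabiwest01:vserver03-en07.veaamfull/backup/ wasabiwest01:vserver03/vserver03.en07.veaamfull/rclone/backup/ --checksum --progress

With --checksum the "same or not" decision is made on size + hash instead of size + modtime, so a modtime-only difference like the 41s one in your log shouldn't be what drives the decision.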

All the info it needs to do a "check" comes from listing the files and their metadata (which it has to do anyway just to know the files exist in the first place). No local processing or transferring is needed. The reason it can check a hash without processing the file locally is that cloud filesystems have already pre-calculated the hash and stored it as metadata for the file, so rclone can just use that.

The only time it costs resources to check a hash is if you force --checksum while transferring from a local drive to a remote. In that specific case rclone has to read the entire file and calculate the hash on the fly (which costs some CPU time). Normal filesystems don't store hashes in metadata, but since it's your local system rclone can of course calculate them for you on the fly. That also means your local filesystem can technically be compatible with any type of hash, unlike the cloud filesystems, which each use a specific type.
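
If you want to see the difference for yourself, comparing these two is a cheap experiment (the paths are hypothetical, substitute your own):

rclone md5sum wasabiwest01:vserver03/vserver03.en07.veaamfull/rclone/backup/
rclone md5sum C:\data\somelocalfolder

The first one should come back quickly because Wasabi already holds the MD5s as object metadata; the second one has to read every file in the folder and hash it on the fly.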

thanks, but that did not answer my question.

in this specific situation, a server-side move, which is a server-side copy and a delete:
i want to know exactly the sequence of events.
what exactly is rclone checking?

thanks

I'll try to be more succinct (which is not my best trait) :slight_smile:

  • You order rclone to move all files in locationA/folderA to locationB/folderB
  • rclone then needs to know what files you are talking about, so it issues a "list" request to the server (for that specific folder) to get a text list of path, name, access time, modtime, hash etc. for all files (this is called metadata).
  • rclone then needs to know what to do with those files, because it doesn't want to just dumbly move and overwrite. That would be inefficient. So it issues a "list" request to the destination directory too.
  • Rclone now knows the files + metadata for both sides (including hashes). It compares the two lists and goes "aha, we need to transfer these ones, skip these ones, and just do some metadata updates here". The goal is to make all files on the destination the same as on the source, but do as little as possible to make that happen. Comparing 2 files like this is what "one check" on the -P progress indicator means. This check is done locally on your CPU - but it's only comparing some text data and is trivial. (There's an example of how to watch this just after the list.)
  • Rclone executes its plan - doing transfers and/or changing metadata on existing files as needed.
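
If you want to watch that decision-making without changing anything, a dry run with debug logging should show you the whole plan (same placeholder paths as before):

rclone move wasabiwest01:vserver03-en07.veaamfull/backup/ wasabiwest01:vserver03/vserver03.en07.veaamfull/rclone/backup/ --dry-run -vv

You'll see per-file lines like the "Unchanged skipping" and "Modification times differ by ..." messages from your own log, but nothing is actually copied, updated or deleted.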

Note that the default way of listing is folder-by-folder (although with multiple threads if needed - this is what "--checkers 8" is about).
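
If the checking phase itself ever becomes a bottleneck, --checkers is the knob to turn. A minimal sketch (placeholder paths again):

rclone move wasabiwest01:vserver03-en07.veaamfull/backup/ wasabiwest01:vserver03/vserver03.en07.veaamfull/rclone/backup/ --checkers 16

That doubles the default of 8 parallel checkers; for a handful of large backup files like yours it probably won't change much.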

An alternative way of listing is using --fast-list. This can allow very large listing operations (such as syncing an entire drive or other large folder structure) to be done in a single request rather than many small ones. This can be much, much faster, so consider using it if your cloud provider supports it.

EDIT: yes, S3 does seem to support fast list. I absolutely recommend you use this on any operation involving files across many folders or complex directory structures. A fast-list of my entire drive (85K files) takes about 60 seconds. A traditional list would take more like 13-15 minutes.
https://rclone.org/s3/#fast-list
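
A sketch of what that could look like for your move (placeholder paths as before):

rclone move wasabiwest01:vserver03-en07.veaamfull/backup/ wasabiwest01:vserver03/vserver03.en07.veaamfull/rclone/backup/ --fast-list --progress

Just keep in mind --fast-list trades memory for API calls: it loads the whole recursive listing into RAM instead of walking folder by folder.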

thanks again.
let me digest that

Here is how I think of things...

The core copying parts of rclone sync, copy and move all do exactly the same thing! They are implemented by the same code.

So all 3 operations do the equivalent of rclone copy and then

  • sync deletes any files in the destination that weren't on the source
  • move deletes any files in the source that are on the destination

So if you copy up a file with move that is already there (checked by size+modtime or size+checksum), move won't copy it; it will just delete the source file.
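
To put the same thing in command form (placeholder remotes, and purely a mental model rather than what rclone literally executes):

rclone copy wasabiwest01:src/ wasabiwest01:dst/   <- the shared copying core
rclone move wasabiwest01:src/ wasabiwest01:dst/   <- copy core + delete source files that are on the destination
rclone sync wasabiwest01:src/ wasabiwest01:dst/   <- copy core + delete destination files that aren't on the source

So in your log, EN07.vbm hit the "already there, just fix the modtime" path of the copy core, and then the move step deleted the source copy.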

thanks, that helps.
nice and succinct!

by the way, rclone is awesome, glad that i donated

Thank you and thank you :smiley:
