Feature question: how does RClone track renames and moves?

Hi there!

I've been using RClone with success for quite some time and it works extremely well.

I've got one question though: how does RClone track renames?

Let's say I have a file named a and I rename it to b, or even move it to another directory. How is it able to tell that b is a renamed a? Does it just check all files and, if a file has been removed but there is a new file with the exact same size and modification time, treat that as a rename?

I'm just curious about how this works internally :slight_smile:

Hi Clément,

I guess you already read the docs:
https://rclone.org/docs/#track-renames
https://rclone.org/docs/#track-renames-strategy-hash-modtime-leaf-size

I can't give a better explanation, but I see you are a developer, so I suggest you just skim the code by doing a free-text search for "trackRenames" in rclone/sync.go at master · rclone/rclone · GitHub


Thanks, I was hoping there was some deep dive in the docs but it seems like it's not the case. I'll go check the source code then :slight_smile:

I remember @ncw explaining it somewhere in the forum or Github, but can't find it at the moment. You may have better luck :slightly_smiling_face:

We should probably put an explanation for this in the docs...

Conceptually, what happens is this:

An rclone sync with --track-renames runs like a normal sync, but keeps track of objects which exist in the destination but not in the source (which would normally be deleted), and objects which exist in the source but not in the destination (which would normally be transferred). These objects are then candidates for renaming.

After the sync, rclone matches up the source-only and destination-only objects using the --track-renames-strategy and either renames the destination object or transfers the source object and deletes the destination object.

The actual implementation works like this - you can compare this with the source code:

When --track-renames is enabled the sync is done as normal, but

  • if an object is only in the source, it is added to the renameCheck list and not transferred
  • if an object is only in the destination it is added to the dstFiles map keyed by path and not deleted

At the end of the sync rclone then

  • creates the renameMap from the dstFiles using --track-renames-strategy to define the map key
  • matches all of the files in renameCheck against renameMap and, if they match, renames them, and if they don't, transfers them
  • after this, deletes any files remaining in dstFiles
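
To make the steps above concrete, here is a minimal Python sketch of the same bookkeeping. It is not rclone's code (rclone is written in Go): `Obj`, `strategy_key` and `plan_renames` are made-up names for illustration, only `rename_check`, `dst_files` and `rename_map` mirror the names used above, and the strategy is assumed to be size plus modtime.

```python
from collections import defaultdict
from typing import NamedTuple


class Obj(NamedTuple):
    path: str
    size: int
    modtime: float  # seconds since the epoch


def strategy_key(obj):
    # Assumption for this sketch: the strategy is size + modtime.
    return (obj.size, obj.modtime)


def plan_renames(src, dst):
    """Return (renames, transfers, deletes) for two listings keyed by path."""
    # During the sync: objects present on one side only are deferred.
    rename_check = [o for p, o in src.items() if p not in dst]  # source only
    dst_files = {p: o for p, o in dst.items() if p not in src}  # destination only

    # At the end: build the rename map from dst_files, keyed by the strategy.
    rename_map = defaultdict(list)
    for obj in dst_files.values():
        rename_map[strategy_key(obj)].append(obj)

    renames, transfers = [], []
    for s in rename_check:
        candidates = rename_map.get(strategy_key(s))
        if candidates:
            d = candidates.pop()      # takes a candidate, no uniqueness check
            del dst_files[d.path]     # renamed, so no longer a delete candidate
            renames.append((d.path, s.path))
        else:
            transfers.append(s.path)

    # Whatever is still in dst_files gets deleted.
    deletes = sorted(dst_files)
    return renames, transfers, deletes
```

With a source containing `b` and `new` and a destination containing `a` (same size and modtime as `b`) and `old`, this plans one rename (`a` to `b`), one transfer (`new`) and one delete (`old`).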

OK, so if I understand correctly, RClone is stateless? It doesn't store information about the source objects anywhere? It just makes a list of source-only and destination-only files and checks whether a file has the same size and modification time to decide if it's a renamed file?

That is correct.

Yes, that is it.


I see, thanks for your explanation :slight_smile:

Just to be pedantic for the sake of clarity, most of rclone is stateless. Some notable exceptions are bisync and the hasher remote.

Also, just a warning on the modtime strategy: it will not check for a unique match. So if you have two files with identical sizes (to the byte) and ModTimes (to the resolution of the system), it can produce a false match. Not super likely, but it can happen. I opened a ticket or a forum post about it and eventually plan to muddle my way through golang to add an option to enforce uniqueness, but it has been OBE (overtaken by events).

If you're interested, I investigated my own file system. The few false matches are from:

  • Small files in a directory where something like touch * has been executed
  • macOS Sparsebundle Disk Images where exactly 8388608 byte blocks are created
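
If you want to run a similar check on your own tree, a rough Python sketch follows. It is only an illustration of the idea, not the script used for the investigation above; `scan` and `collisions` are hypothetical names, and it compares modtimes at nanosecond resolution, which may be finer than what a given remote reports.

```python
import os
import stat
from collections import defaultdict


def scan(root):
    """Yield (relative path, size, modtime in ns) for each regular file under root."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if stat.S_ISREG(st.st_mode):  # skip symlinks and special files
                yield os.path.relpath(full, root), st.st_size, st.st_mtime_ns


def collisions(entries):
    """Group entries by (size, modtime) and keep only keys shared by 2+ paths."""
    groups = defaultdict(list)
    for path, size, modtime in entries:
        groups[(size, modtime)].append(path)
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}
```

Calling `collisions(scan("/some/dir"))` returns a dict of the ambiguous keys; an empty dict means size plus modtime identifies every file uniquely at that resolution.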

That's exactly what I thought. I'm currently building a synchronisation tool myself (a kind of Rclone for more specialized usages) and that's the problem I had as well.

I don't see how to ensure correctness while still being stateless. FFS (FreeFileSync) achieves reliable rename detection by storing each file's node ID in a state file and reusing it each time. But that requires keeping state and does not work on every filesystem.
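
For contrast, a stateful approach along those lines could look roughly like the Python sketch below. This is not FreeFileSync's actual code or state format; `snapshot` and `detect_renames` are hypothetical names, and it assumes a local filesystem where a file keeps its (device, inode) pair across a rename.

```python
import os


def snapshot(root):
    """Map (device, inode) -> relative path for every file under root.

    Caveat: hard links share an inode, so only one of their paths is kept.
    """
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            state[(st.st_dev, st.st_ino)] = os.path.relpath(full, root)
    return state


def detect_renames(old_state, new_state):
    """Return sorted (old path, new path) pairs whose file ID is unchanged.

    Caveat: inode numbers can be reused after a delete, so a real tool
    would want to cross-check size or modtime as well.
    """
    return sorted(
        (old_path, new_state[file_id])
        for file_id, old_path in old_state.items()
        if file_id in new_state and new_state[file_id] != old_path
    )
```

The old snapshot would have to be saved between runs, which is exactly the state that a stateless tool does not have.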

And I'm not sure whether it's possible to actually achieve stateless and reliable renaming tracking :confused:

I'm not sure whether it's possible to actually achieve stateless and reliable renaming tracking

I've given a lot of thought to this when I developed syncrclone (a competitor to the built-in bisync with some notable pros and cons). But, alas, syncrclone is stateful.

It is possible to be reliable if you have hashes. If you don't have hashes, what I do in syncrclone is only allow the move if it can uniquely match. Still a (small) risk but that's worth taking. And the thing is, the safe answer is to just transfer again.
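
A sketch of that unique-match rule in Python, for illustration only. It is not syncrclone's actual code; `unique_matches` and the `(path, size, modtime)` tuple layout are assumptions, and here a move is allowed only when the key is unique on both sides.

```python
from collections import defaultdict


def unique_matches(src_only, dst_only):
    """Pair source-only and destination-only files whose key is unambiguous.

    Each entry is a (path, size, modtime) tuple. Returns (moves, transfers),
    where moves are (destination path, source path) pairs.
    """
    def by_key(entries):
        groups = defaultdict(list)
        for path, size, modtime in entries:
            groups[(size, modtime)].append(path)
        return groups

    src_groups, dst_groups = by_key(src_only), by_key(dst_only)
    moves, transfers = [], []
    for key, src_paths in src_groups.items():
        dst_paths = dst_groups.get(key, [])
        if len(src_paths) == 1 and len(dst_paths) == 1:
            moves.append((dst_paths[0], src_paths[0]))  # exactly one candidate each side
        else:
            transfers.extend(src_paths)  # ambiguous or unmatched: just transfer again
    return sorted(moves), sorted(transfers)
```

Anything ambiguous falls back to a plain transfer, which costs bandwidth but cannot attach the wrong content to a path.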

Remember, from @ncw's answer, it only compares files that exist on one side only (the ones that look deleted or new). So having more than one file with the same size and ModTime is only an issue if more than one of them has been moved.


Interesting point. Rclone just picks one according to the source, but maybe it should be giving a warning or an error if there are duplicates?

I think so. We talked about it in the past and you pointed me to where in the code to start. I just haven't had the time.

I think it should be a flag though, since (a) you otherwise break backward compatibility and (b) you can no longer short-circuit the loop on the first match, so in theory it can be slower. (I mainly use Python, where loops are very inefficient. It is less of an issue with Golang.)

Different question - but fits to the topic @ncw :
According to the docs, this doesn't currently work on encrypted destinations. Are there any plans to implement this feature, or is there a technical limitation for this use case?

See:

rclone cryptcheck

Already there.


You can use ModTimes (and some of the others) if the remote supports it. Maybe even the hasher remote around encrypted if you wanted.

--track-renames-strategy modtime is a possibility.

It would be possible in theory to use the same methodology that rclone cryptcheck uses to calculate the encrypted checksums and use those, but that is a lot of work!

