Script that checks for rearranged local files and moves them remotely

Hi, is anyone aware of a script (using rclone underneath) that will compare files (name/size/date is fine) that exist in both a local and a remote but in differing directory structures, and then move the remote files to match the local hierarchy? Move is supported on the remote (SFTP). I have a big collection of photos that has been reorganized locally (into a very different folder structure) and I would rather not have to re-upload everything (or manually move it all around). It's an encrypted remote, so other CLI tools won't work since they can't decrypt without rclone. I'd also rather not use mount for this, but perhaps that will turn out to be the easier route to a solution.
Thanks!


Separate from rclone, I have been meaning to write a script that would do this. Every now and again during a backup I forget that I have done some major reorganisation recently, and a whole lot of stuff gets deleted from one dir on the destination and then recopied to a new dir on the destination. So annoying. Why doesn't my computer automatically understand the situation and just do what I want?

Hi allobrogica and philip_rhoades, it sounds like you are looking for these sync flags:

--track-renames
--track-renames-strategy

Please use caution when changing your sync flags; test with --dry-run first.
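For example, something along these lines (the local path and remote name are placeholders, not taken from your setup):

rclone sync /path/to/local/photos remote:photos --track-renames --dry-run

The default rename-matching strategy is hash; on remotes without usable hashes you can pass something like --track-renames-strategy modtime,leaf instead. Once the dry-run output shows the server-side moves you expect, drop --dry-run to apply them.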

Thanks. I'd forgotten about it, and the last time I tried, it didn't work on a crypt remote with files in radically different folder structures. I'll test it.

Thanks for that - I will have a look at those switches!

Just be aware that rclone currently does not handle well the case of multiple moved files having the same ModTime and size. I had a post about it before but have not had the chance to learn enough golang to add the fix. It should refuse the move if the match is not unique.

The OP is using SFTP, which does support hashes, but it will be very slow to compute the hashes on both sides (though it may be worth it).
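For completeness: matching by hash is the default when --track-renames is enabled, so a plain invocation like the one below (paths and remote name are illustrative) would be what triggers hashing on both ends over SFTP:

rclone sync /path/to/local/photos sftp-remote:photos --track-renames --dry-run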


Thanks, good to know. In my case I'll use --track-renames as a first pass to minimize re-uploads, then do a second pass without --track-renames. The dry-run tests went well (with modtime,leaf), and it appears --track-renames will do exactly what I wanted, even on a repository with thousands of files.
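Roughly, that two-pass approach might look like this (the local path and remote name are placeholders):

rclone sync /path/to/local/photos crypt-remote:photos --track-renames --track-renames-strategy modtime,leaf --dry-run
rclone sync /path/to/local/photos crypt-remote:photos --track-renames --track-renames-strategy modtime,leaf
rclone sync /path/to/local/photos crypt-remote:photos

The first command verifies what will happen, the second performs the server-side moves (plus any remaining transfers), and subsequent normal backups can then omit --track-renames.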

To be clear, will your first pass be with --track-renames-strategy modtime? Assuming you're using that because the remote doesn't have hashes (e.g. crypt, HTTP, WebDAV), the second pass will not fix any false positives (though, honestly, leaf may be sufficient).

It really depends on your system. I know that on my computer I have files with exactly the same size and modtime, but you may not. (Mine come from macOS sparse disk bundles, where the bands are all exactly 8388608 bytes, have a modtime from when they were written (which is the same at macOS time resolution), and have hex names.)

Of course, YMMV. I just wanted to give you the heads up.

I'll use modtime,leaf (and yes, it's a crypt remote). In my current use case it's only photo/video files, and I'm really only looking to have it move most of the files that I recently massively reorganized locally. Once they're moved around remotely, I'll run the normal backup without --track-renames going forward, since this is a one-time re-org of files.

Thousands of the files were actually removed from the hierarchy locally (rejected photos), so I'll also be able to use the --backup-dir switch to move those files (mostly) into a separate rejected hierarchy. Rclone's flexibility at work. Now if only CarbonCopyCloner could become as smart as rclone has become.
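A sketch of combining the two, assuming the rejected files should land in a sibling directory on the same remote (names are placeholders; --backup-dir must be on the same remote as the destination and outside the sync path):

rclone sync /path/to/local/photos crypt-remote:photos --track-renames --track-renames-strategy modtime,leaf --backup-dir crypt-remote:rejected --dry-run

With --backup-dir, files that would otherwise be deleted from the destination are moved into crypt-remote:rejected instead.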

Your notes are good in case I want to use --track-renames for other stuff in the future. Separately, I learned the hard way long ago not to trust any kind of incremental backup of sparsebundles (and long ago switched important stuff to sparseimages just to be safe). But if it's Time Machine backups or something, you don't really have a choice, I suppose.

The question being asked in the OP is essentially unsolved and possibly unsolvable (for a fragmented and diverse enough directory structure).

Personally I've taken to just littering the cloud with extra copies of files that fit this case, wasting the cloud storage space (because mine is unlimited).

You can easily run dedupe-style operations, but only within a folder.
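For reference, rclone's own dedupe command is one such operation (the path is a placeholder; note it only finds duplicate file names on backends that can hold them, or duplicate content with --by-hash on backends that support a hash):

rclone dedupe newest remote:photos/2020 --dry-run

Here newest keeps the most recently modified of the duplicates; other modes such as oldest, largest, or rename are available.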