The dedupe command doesn't work unless the files have identical names

https://rclone.org/commands/rclone_dedupe/

D:\rclone-v1.47>rclone dedupe -vv --dedupe-mode newest "D:\gallery-dl\gallery-dl"

2020/03/10 06:03:36 DEBUG : rclone: Version "v1.47.0" starting with parameters ["rclone" "dedupe" "-vv" "--dedupe-mode" "newest" "D:\gallery-dl\gallery-dl"]
2020/03/10 06:03:36 DEBUG : Using config file from "MYREALNAME.conf"
2020/03/10 06:03:36 INFO : Local file system at \\?\D:\gallery-dl\gallery-dl: Looking for duplicates using newest mode.
2020/03/10 06:03:36 DEBUG : 2 go routines active
2020/03/10 06:03:36 DEBUG : rclone: Version "v1.47.0" finishing with parameters ["rclone" "dedupe" "-vv" "--dedupe-mode" "newest" "D:\gallery-dl\gallery-dl"]

Note: this folder is full of hundreds of duplicate files, yet the process finished instantly. I don't think it checked a single MD5.

Note 2: none of the files have the same name, because Windows wouldn't allow that; all the duplicates have been renamed to _001 or _002. I guess dedupe doesn't compare MD5s between files without identical names? It would be useful if it did, because my computer won't let any such file exist, so my cloud backup is unlikely to either.

Note 3: I uploaded the same folder to Google Drive and ran the dedupe again. It did nothing at all and completed instantly.

Dedupe is for use with Google Drive:

https://rclone.org/commands/rclone_dedupe/#synopsis

Ah, dang, so there's no similar command for files with different names?

The thing is, I've got two directories I'm trying to merge, and there are many duplicates. Any way I use rclone copy "localdrive" "googledrive" will result in copy overwriting things as it sees fit.

Unfortunately, I'm not a Windows guy, so I really don't have a good answer for deduping Windows folders. I'm sure one of the many Windows folks who lurk around here can help you out, though!

This is a problem small enough that I could just scan the roughly 500 duplicates and delete them myself, but I'd like to take this as a chance to learn.

I've had this problem literally thousands of times. Sometimes I solve it, and sometimes I just let my cloud backup get cluttered.

I know running dedupe on a million files while ignoring filenames would be impractical, but for a few hundred small files it really wouldn't take long to compute all the MD5s and compare every file against every other. I guess rclone doesn't have a command for that yet, though.

Well, I deleted the 250 or so duplicates one at a time by hand. Now I'm re-uploading them to Google Drive, but really, it'd be great to find a tool that would do this for me, ideally via rclone, because like I said, I've hit this problem maybe a thousand times before, given up, and just sent the files to Google Drive anyway.

The thing is, I don't need to deduplicate the whole cloud drive, just specific folders. I know that files in different folders won't be duplicates, but files within one folder often ARE duplicates that I stupidly let Windows create before cloning that mistake to Google Drive.

There is an open issue for rclone to be able to do this.

You can do it manually with rclone: run rclone md5sum, sort the output, and look for duplicated hashes.

It would be a nice addition to rclone dedupe though.
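As a sketch of that manual approach for local folders (with a remote you'd parse the output of `rclone md5sum` instead, which prints one `<hash>  <path>` line per file), here's a hypothetical Python helper that hashes everything under a folder and groups paths by MD5. The folder path in the comment is just an example:

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_duplicates(folder):
    """Return {md5: [paths]} for every hash shared by more than one file."""
    by_hash = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            digest = hashlib.md5(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}

# Example (hypothetical path):
#   for digest, paths in find_duplicates("D:/gallery-dl/gallery-dl").items():
#       # keep paths[0], treat paths[1:] as deletion candidates
#       print(digest, [str(p) for p in paths])
```

For a remote, the equivalent one-liner would be something like `rclone md5sum remote:path | sort | uniq -w32 -D` (assuming GNU uniq, comparing only the 32-character hash prefix), which prints every line whose hash repeats.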


On Windows, you can use dupeGuru.

And you might want to update your rclone to the latest stable version, 1.51.
