Will dedupe work like this?
I did not fill out the form since this is a general question, but I am running Windows with rclone v1.65.2.
So I have 2 main folders inside Google TD
Clean
Spaces
Inside those folders are thousands of folders, all exactly 1 level deep, meaning there are no nested folders.
So like this:
Clean\Folder1\Files
Clean\Folder2\Files
Spaces\Folder1\Files
Spaces\Folder2\Files
My goal is to KEEP all files inside Clean and ONLY remove duplicate files inside the Spaces folders where the MD5 matches a file inside Clean.
I am trying to clarify whether rclone will go through the folders in alphabetical order, the way I see them.
I notice that when I do moves with rclone it can skip around, which is not an issue when moving, but I need dedupe not to do the same.
Is this the right command for what I want rclone to do?
rclone -vvP --fast-list --tpslimit=100 --drive-use-trash=false dedupe --by-hash Remote:/ --dedupe-mode first
Meaning it would iterate all folders and files inside Clean\ then only delete the ones with matching MD5 in Spaces?
As per the docs, dedupe "considers files to be identical if they have the same file path and the same hash."
So you cannot dedupe across different paths.
Crap, that is not what I thought. I totally missed that!
I guess the only option then is doing this in rclone:
rclone lsf --recursive --format hi --separator "," Remote:/Clean > .Remote-Clean.txt
rclone lsf --recursive --format hi --separator "," Remote:/Spaces > .Remote-Spaces.txt
At least then I can compare the Clean hashes to the Spaces hashes, then use the Google API to purge the file IDs of the duplicates.
A bit more work, but it will do it.
I only request hi (hash and ID) since I do not care about file names, just removing the Spaces files whose hash also exists in Clean.
A lot more manual work, and more tools to do it.
Unless you know a better solution.
I would go a similar path, I think. And if it is a repetitive task I would spend some time automating it as much as possible.
Having a full list of files with hashes, I think it should be relatively easy to process by script and list the dupes. Some work the first time, only pressing enter after that.
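As a rough illustration of that kind of script, here is a minimal Python sketch. It assumes the two listings were produced with --format hi and --separator "," as above, so each line is MD5,FileID, and that entries without a hash (folders, Google Docs) show up with an empty first field. The function names and the sample hashes and IDs are made up for the example.

```python
def read_hash_id_rows(text):
    """Parse 'MD5,FileID' lines into (md5, file_id) pairs."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        md5, _, file_id = line.partition(",")
        # Skip entries with no hash (assumed: folders, Google Docs).
        if md5 and file_id:
            rows.append((md5.lower(), file_id))
    return rows


def spaces_duplicates(clean_text, spaces_text):
    """Return the IDs of Spaces files whose MD5 also exists in Clean."""
    clean_hashes = {md5 for md5, _ in read_hash_id_rows(clean_text)}
    return [
        file_id
        for md5, file_id in read_hash_id_rows(spaces_text)
        if md5 in clean_hashes
    ]


# Made-up sample data standing in for the two listing files.
clean_listing = (
    "d41d8cd98f00b204e9800998ecf8427e,idC1\n"
    ",idCleanDir\n"
    "900150983cd24fb0d6963f7d28e17f72,idC2\n"
)
spaces_listing = (
    "D41D8CD98F00B204E9800998ECF8427E,idS1\n"
    "ffffffffffffffffffffffffffffffff,idS2\n"
    ",idSpacesDir\n"
)

for file_id in spaces_duplicates(clean_listing, spaces_listing):
    print(file_id)
```

For real use, the two sample strings would be replaced with the contents of .Remote-Clean.txt and .Remote-Spaces.txt (check the encoding of the redirected files first, since it depends on the shell). Only Spaces IDs are ever returned, so nothing in Clean ends up on the purge list, and the order in which rclone listed the folders does not matter.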
My solution is that I keep all my remotes in Excel, which auto-generates my rclone command lines so I can simply copy and paste them.
To dedupe I use EmEditor, as it will compare 2 files based on MD5 and only report the lines where the MD5 value matches.
Then it is a matter of copying the GDriveID from the results and pasting it into a tool that calls the Google API to purge the file IDs.
That is the best automation I have come up with.
I am not aware of rclone having the ability to purge by file ID. I wish it could, as that would actually make this a little easier.
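One possible way to stay inside rclone, sketched here as an untested idea rather than a confirmed workflow, is to list Spaces with paths instead of IDs (--format hp) and turn the matches into a list for rclone delete --files-from. The Python below assumes each Spaces line is MD5,path and that a set of Clean hashes has already been collected; the function name, file names, and sample values are hypothetical.

```python
def dupes_as_files_from(clean_hashes, spaces_hp_text):
    """Return Spaces paths whose MD5 is in clean_hashes, one per list entry."""
    paths = []
    for line in spaces_hp_text.splitlines():
        # Split on the first separator only, so commas in paths survive.
        md5, sep, path = line.partition(",")
        if not sep or not md5 or not path:
            continue  # no hash (folder, Google Doc) or malformed line
        if path.endswith("/"):
            continue  # folder entry
        if md5.lower() in clean_hashes:
            paths.append(path)
    return paths


# Made-up sample data.
clean_hashes = {"aaa"}
spaces_hp = (
    "aaa,Folder1/a.bin\n"
    "bbb,Folder1/b.bin\n"
    ",Folder1/\n"
    "AAA,Folder2/x,y.bin\n"
)

print("\n".join(dupes_as_files_from(clean_hashes, spaces_hp)))
```

The printed paths would be saved to a file (say spaces-dupes.txt) and passed to something like rclone delete Remote:/Spaces --files-from spaces-dupes.txt --dry-run, dropping --dry-run only once the output looks right. One caveat: Google Drive allows several files with the same name in one folder, and a path cannot tell those apart, so the file ID route is still the safer one in that case.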
asdffdsa
(jojothehumanmonkey)
July 30, 2024, 12:59pm
7
InfoR3aper:
To dedupe I use emeditor
It might be easier to use rclone mount and then run whatever dedupe tool you want.