Best way to manage massive Google Drive

I have a massive Google Drive (about 12TB in about 1.2M files) that I want to dedupe, explore with ncdu, and run a few other cleanup operations on.

Would you use the cache backend, or is that no longer valid?

Would you use --fast-list?

Any other tweaks?

I will not be adding data while I do this ... I just want to clean things up, delete large folders & junk, and dedupe.

Thanks!
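For context, this is roughly the kind of thing I plan to run (drive: is my remote name and old-junk is just a placeholder path):

rclone ncdu drive: --fast-list                          # browse usage interactively
rclone purge drive:old-junk --dry-run                   # preview deleting a whole folder (placeholder path)
rclone dedupe drive: --by-hash --dedupe-mode list -P    # list duplicate files without changing anything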

This is not really a massive amount of data, especially for maintenance when you are not downloading or uploading anything.

Do not use the cache backend - it is deprecated.

--fast-list is a good idea.
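For example, you can compare a full recursive listing with and without it (remote name drive: assumed):

rclone size drive:               # recursive listing, directory by directory
rclone size drive: --fast-list   # same listing using fewer, larger API calls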

As long as you have a decent network connection and a modern computer, everything should work out of the box.

Make sure that your remote is using a customised client_id/secret.
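If you are not sure, run rclone config and look at the drive remote; the section in rclone.conf should look roughly like this, with your own client_id/client_secret filled in (all values here are placeholders):

[drive]
type = drive
client_id = 123456789-xxxxxxxx.apps.googleusercontent.com
client_secret = xxxxxxxxxxxxxxxx
scope = drive
token = {...}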

Well, I tried running dedupe on a single folder that has
Total usage: 706.166Gi, Objects: 4.189k
and I cancelled it after 6 hours (it had returned 1 set of dupes). I didn't see any indication of how much was left.

Well, nobody can see what you are doing or what your setup is.

4k objects is nothing, really. Something is definitely wrong.

Should I run with logging and show you? Not sure what you're asking.

I just ran this:
rclone dedupe drive:folder1 --interactive --by-hash -P -v

dedupe considers files to be identical if they have the same file path and the same hash. If the backend does not support hashes (e.g. crypt wrapping Google Drive) then they will never be found to be identical. If you use the --size-only flag then files will be considered identical if they have the same size (any hash will be ignored). This can be useful on crypt backends which do not support hashes.
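A quick way to check whether your remote reports hashes at all (plain Google Drive should; a crypt remote will not) is something like:

rclone md5sum drive:folder1 | head    # blank hashes or an error means the backend does not supply MD5s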

Are you using --size-only? This isn't a recommendation, but based on the documentation you'll need to deal with the caveats of this flag.
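If you do want to share a log, something like this would capture debug output while changing nothing (the folder name is taken from your command; list mode and the log file name are just suggestions):

rclone dedupe drive:folder1 --by-hash --dedupe-mode list -P -vv --log-file dedupe.log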
