I need to dedupe about 200 TB on a weekly basis.
Can I use this on a cache? If I can’t, I’ll get 403’d every time I do this.
Can I dedupe folders only? I don’t care about duplicate files; all I want to do is merge any duplicate folders.
I think (but I’m not 100% sure) that the cache doesn’t allow duplicate file names, so no, you can’t do this via the cache.
@remus.bunduc is that correct?
Not currently, no. It would need just about the same amount of directory traversal either way.
To avoid bans, I suggest you use a very small TPS limit, say
--tpslimit 0.1 to do one transaction every 10 seconds. It might take a very long time, but it shouldn’t get you banned.
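To get a rough feel for how long "a very long time" is, here is a minimal sketch of the arithmetic: at a limit of 0.1 transactions per second, each transaction takes at least 10 seconds, so the run time is at least the transaction count divided by the limit. The function name and the 50,000-transaction figure are made up for illustration; the real count depends on how many directories and listings the run touches.

```python
from datetime import timedelta


def estimated_duration(transactions: int, tps_limit: float) -> timedelta:
    """Lower bound on wall-clock time when requests are capped at tps_limit per second."""
    if tps_limit <= 0:
        raise ValueError("tps_limit must be positive")
    # Each transaction occupies at least 1 / tps_limit seconds.
    return timedelta(seconds=transactions / tps_limit)


if __name__ == "__main__":
    # Hypothetical workload: 50,000 API transactions for one full pass.
    print(estimated_duration(50_000, 0.1))
```

With those assumed numbers a single pass comes out at close to six days, which matters if the job has to fit inside a weekly window.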