The crypt overlay currently uses base32 to encode filenames, which results in a large overhead.
I use a crypt backend with Dropbox, which is semi-case-sensitive and also does not support the full character set of base32k.
I was hoping to add another option to --crypt-filename-encoding, something like base512 (or base260?) that would:
attempt a base32 encoding, and, if the encoded length exceeded a threshold,
encode in base512 instead, using a specifically curated character set supported on many (all?) backends, where none of the characters case-fold into each other. The character set could be disjoint from the base32 one to prevent confusion when decoding.
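To make the proposal concrete, here is a rough Python sketch of the fallback logic (rclone itself is written in Go, so this is only an illustration). Everything in it is an assumption on my part: the names `encode_name` and `decode_name`, the 255-character threshold, the end-marker bit, and especially the alphabet, which is just 512 caseless CJK code points used as a placeholder and has not been checked against any backend.

```python
import base64

# Hypothetical placeholder alphabet: 512 caseless code points (U+4E00..U+4FFF).
# A real alphabet would have to be curated and tested per backend.
ALPHABET_512 = [chr(0x4E00 + i) for i in range(512)]
INDEX_512 = {c: i for i, c in enumerate(ALPHABET_512)}
THRESHOLD = 255  # assumed per-name limit, counted in characters


def encode_base32(data: bytes) -> str:
    # Lowercase base32 with the '=' padding stripped.
    return base64.b32encode(data).decode("ascii").rstrip("=").lower()


def decode_base32(text: str) -> bytes:
    # Restore the '=' padding that encode_base32 removed.
    padded = text.upper() + "=" * (-len(text) % 8)
    return base64.b32decode(padded)


def encode_base512(data: bytes) -> str:
    # Append a 1 bit as an end marker, then zero-pad to a multiple of 9 bits,
    # so the decoder can recover the exact byte length.
    bits = (int.from_bytes(data, "big") << 1) | 1
    nbits = 8 * len(data) + 1
    pad = -nbits % 9
    bits <<= pad
    nchars = (nbits + pad) // 9
    return "".join(
        ALPHABET_512[(bits >> (9 * (nchars - 1 - i))) & 0x1FF]
        for i in range(nchars)
    )


def decode_base512(text: str) -> bytes:
    bits = 0
    for ch in text:
        bits = (bits << 9) | INDEX_512[ch]
    # Count trailing zero bits (the padding), then drop them and the marker.
    pad = (bits & -bits).bit_length() - 1
    bits >>= pad + 1
    nbits = 9 * len(text) - pad - 1
    return bits.to_bytes(nbits // 8, "big")


def encode_name(ciphertext: bytes, threshold: int = THRESHOLD) -> str:
    # Prefer base32; fall back to base512 only when base32 is too long.
    name = encode_base32(ciphertext)
    if len(name) <= threshold:
        return name
    return encode_base512(ciphertext)


def decode_name(name: str) -> bytes:
    # The two alphabets are disjoint, so the first character picks the decoder.
    if name and name[0] in INDEX_512:
        return decode_base512(name)
    return decode_base32(name)
```

Because short names still come out as plain base32, existing base32-encoded names would keep decoding unchanged under this scheme; only names that would otherwise exceed the threshold use the larger alphabet.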
Why base512? Dropbox has the same filename limit that most operating systems do (260 characters), so this way I would be able to put any file I can have in my filesystem into my crypt remote. Base32, and to a lesser extent base64, won't let me do this, and base32k doesn't work with Dropbox.
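As a back-of-the-envelope check of the length argument, the sketch below estimates the longest plaintext name (in bytes) that fits under a character limit for each encoding. It assumes the limit is counted in characters, that the encoded name carries no padding characters, and that the encrypted name is padded PKCS#7-style to whole 16-byte blocks; that last part is my understanding of how crypt encrypts names, so treat it as an assumption. The function name `max_plaintext_len` is made up for this example.

```python
def max_plaintext_len(limit_chars: int, bits_per_char: int, block: int = 16) -> int:
    # Largest ciphertext, in bytes, whose unpadded encoding fits in limit_chars.
    max_cipher = (limit_chars * bits_per_char) // 8
    # Assumption: the ciphertext is a whole number of blocks and the padding
    # always adds at least one byte (PKCS#7-style).
    padded = (max_cipher // block) * block
    return max(padded - 1, 0)


for name, bits in [("base32", 5), ("base64", 6), ("base512", 9)]:
    print(name, max_plaintext_len(260, bits))
# base32 159
# base64 191
# base512 287
```

Under these assumptions only base512 leaves room for a plaintext name as long as the limit itself.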
The benefit of doing it this way is that people could use it as a drop-in replacement, without having to re-encode filenames.
Would a feature like this be welcomed? I would be happy to implement it.
The question is whether Dropbox counts filename length in UTF-8 bytes or in something else.
If it counts in UTF-8, then a larger alphabet will result in longer filenames as measured in UTF-8 bytes, because only the first 128 code points are encoded as a single byte in UTF-8. Once you exclude the control characters and capitals, base32 is basically all that is left.
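The point about counting units can be checked with a little arithmetic. The sketch below computes how many payload bits each alphabet carries per counted unit (character, UTF-8 byte, or UTF-16 code unit). The 512-character alphabet here is hypothetical: it is simply the range U+0100..U+02FF, chosen only because every code point in it takes two bytes in UTF-8. It is not a proposal for a real alphabet, since it contains case pairs.

```python
import math

BASE32 = "abcdefghijklmnopqrstuvwxyz234567"
# Hypothetical 512-character alphabet, every member 2 bytes long in UTF-8.
BASE512 = [chr(cp) for cp in range(0x0100, 0x0300)]


def bits_per_unit(alphabet, unit: str) -> float:
    # Payload bits carried by one character of the alphabet.
    bits_per_char = math.log2(len(alphabet))
    if unit == "chars":
        worst = 1
    elif unit == "utf8":
        worst = max(len(c.encode("utf-8")) for c in alphabet)
    elif unit == "utf16":
        worst = max(len(c.encode("utf-16-le")) // 2 for c in alphabet)
    else:
        raise ValueError(f"unknown unit: {unit}")
    # Divide by the worst-case cost of one character in the counted unit.
    return bits_per_char / worst


for unit in ("chars", "utf8", "utf16"):
    print(unit, bits_per_unit(BASE32, unit), bits_per_unit(BASE512, unit))
# chars 5.0 9.0
# utf8 5.0 4.5
# utf16 5.0 9.0
```

So if the limit is counted in UTF-8 bytes, this 512-character alphabet is actually slightly worse than base32 (4.5 versus 5 bits per byte); it only wins when the limit is counted in characters or UTF-16 code units.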
uuencoding (see Wikipedia) is another single-case set, but it includes '\', '/' and '*', which most filesystems do not allow.
Base32k is basically a hack for OneDrive, which counts filename length in UTF-16.