Crypt backend filename encoding

Related GitHub request: Base32768 file name encoding for crypt backend · Issue #5801 · rclone/rclone · GitHub
Related forum post: Base32768 to compress filename length

The crypt overlay currently uses base32 to encode filenames, which results in a large overhead.

I use a crypt backend with Dropbox, which is only semi-case-sensitive and also does not support the full character set of base32k (base32768).

I was hoping to add another option to --crypt-filename-encoding, something like base512 (or base260?) that would:

  • attempt a base32 encoding, and if the encoded length exceeds a threshold:
  • encode in base512 instead, using a specifically curated character set that is supported on many (all?) backends and in which no two characters case-fold into each other. The character set could be different from the base32 one to prevent confusion
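
A rough Python sketch of the two steps above, purely to illustrate the idea. Everything named here is hypothetical: `WIDE_ALPHABET` is a placeholder (512 consecutive CJK ideographs, picked only because they have no upper or lower case, and not checked against Dropbox or any other backend), the threshold of 255 is an assumed value, and standard base32 from the Python library stands in for whatever crypt does internally. Decoding is left out.

```python
import base64

# Placeholder alphabet (assumption): 512 caseless code points from U+4E00.
# A real implementation would need a curated, backend-tested set.
WIDE_ALPHABET = [chr(0x4E00 + i) for i in range(512)]
BITS = 9  # 2**9 == 512 symbols


def encode_base32(data: bytes) -> str:
    """Lower-case base32 with the '=' padding stripped."""
    return base64.b32encode(data).decode("ascii").rstrip("=").lower()


def encode_wide(data: bytes) -> str:
    """Pack the input 9 bits at a time into WIDE_ALPHABET."""
    acc = 0    # bit accumulator
    nbits = 0  # number of unread bits currently held in acc
    out = []
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= BITS:
            nbits -= BITS
            out.append(WIDE_ALPHABET[(acc >> nbits) & 0x1FF])
        acc &= (1 << nbits) - 1  # keep only the bits not yet emitted
    if nbits:
        # Left-align the leftover bits and pad the symbol with zero bits.
        out.append(WIDE_ALPHABET[(acc << (BITS - nbits)) & 0x1FF])
    return "".join(out)


def encode_name(ciphertext: bytes, threshold: int = 255) -> str:
    """Use base32 when it fits, otherwise fall back to the wide alphabet."""
    short = encode_base32(ciphertext)
    if len(short) <= threshold:
        return short
    return encode_wide(ciphertext)
```

Because the two alphabets do not overlap, a decoder could tell which form a stored name uses by looking at its first character, which is what would make this usable as a drop-in option.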

Why base512? Dropbox has the same filename limit that most operating systems do (260 characters), so this way I would be able to put any file I can have on my filesystem into my crypt remote (base32, and to a lesser extent base64, won't let me do this, and base32k doesn't work with Dropbox).

The benefit of doing it this way is that people could use it as a drop-in replacement, without having to re-encode filenames.

Would a feature like this be welcomed? I would be happy to implement it.

Seems like a really good idea!

The question would be whether Dropbox counts length in UTF-8 bytes or in something else.
If it counts UTF-8 bytes, then a larger character set will result in longer filenames as measured in UTF-8, because only the first 128 code points are encoded as a single byte in UTF-8. Excluding the control characters and capitals, base32 uses basically all of them.
uuencoding (see "uuencoding - Wikipedia") is another single-case set, but it includes '\', '/' and '*', which most filesystems would not allow.
Base32k is basically a hack for OneDrive, which counts length in UTF-16.
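
To make the counting question concrete, here is a small Python sketch (not taken from rclone) that measures the same name three ways: Unicode code points, UTF-8 bytes and UTF-16 code units. `name_lengths` is a hypothetical helper used only for illustration.

```python
def name_lengths(name: str) -> dict:
    """Return the length of a name under three common counting rules."""
    return {
        "code_points": len(name),
        "utf8_bytes": len(name.encode("utf-8")),
        # utf-16-le adds no byte order mark, so bytes / 2 == code units
        "utf16_units": len(name.encode("utf-16-le")) // 2,
    }


for sample in ("a", "\ua840", "\U0001F600"):
    print(repr(sample), name_lengths(sample))
```

An ASCII letter costs 1 under every rule. U+A840 is one code point and one UTF-16 unit but three UTF-8 bytes, and a code point outside the Basic Multilingual Plane takes four UTF-8 bytes and two UTF-16 units. Which rule a backend applies decides whether a wider alphabet actually buys any room.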

Check out

I think I made the length test use various widths of Unicode characters.

rclone test info is very slick.

// Output of rclone test info --check-length

// Dropbox backend - not encrypted
maxFileLength = 255 // for 1 byte unicode characters
maxFileLength = 255 // for 2 byte unicode characters
maxFileLength = 255 // for 3 byte unicode characters
maxFileLength = -1 // for 4 byte unicode characters


// Dropbox backend - encrypted with base32
maxFileLength = 143 // for 1 byte unicode characters
maxFileLength = 71 // for 2 byte unicode characters
maxFileLength = 47 // for 3 byte unicode characters
maxFileLength = 35 // for 4 byte unicode characters

So Dropbox at least counts characters - not bytes
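
The encrypted figures can be reproduced with a little arithmetic. This is a sketch under two assumptions that are not stated in this thread: that crypt pads the name with PKCS#7 to a multiple of 16 bytes before encrypting (so at least one padding byte is always added), and that the encoded name may be at most 255 characters, as in the unencrypted test above. `max_plaintext_bytes` is a hypothetical helper, not part of rclone.

```python
BLOCK = 16  # assumed cipher block size in bytes


def max_plaintext_bytes(bits_per_char: int, max_encoded_chars: int = 255) -> int:
    """Longest plaintext name (in bytes) whose encoded form still fits."""
    blocks = 0
    while True:
        padded_bits = (blocks + 1) * BLOCK * 8
        encoded_chars = -(-padded_bits // bits_per_char)  # ceiling division
        if encoded_chars > max_encoded_chars:
            break
        blocks += 1
    # PKCS#7 always adds at least one byte, so one byte of the last block is lost.
    return blocks * BLOCK - 1


for label, bits in (("base32", 5), ("base64", 6), ("base512", 9), ("base32768", 15)):
    limit = max_plaintext_bytes(bits)
    print(label, limit, [limit // width for width in (1, 2, 3, 4)])
```

With these assumptions base32 (5 bits per character) gives 143 bytes, and dividing by the character width gives 71, 47 and 35, which matches the output above. A 9-bit alphabet such as the proposed base512 would give 271, and 15 bits per character would give 463.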

Nice. Can you try a test with base32768 encoding on Dropbox?

The problem with base32768 is that Dropbox doesn't support newer Unicode code points, which seem to be used by b32k. E.g.,

azmi@stiorra:~$ rclone touch azmi-dbx:/good_stuff/foo
azmi@stiorra:~$ rclone ls azmi-dbx:/good_stuff
        0 foo
azmi@stiorra:~$ rclone touch azmi-dbx:/good_stuff/ꡀ
azmi@stiorra:~$ rclone ls azmi-dbx:/good_stuff
        0 foo

I did run the test though, and it seems to have worked?

// crypt-b32k
maxFileLength = 463 // for 1 byte unicode characters
maxFileLength = 231 // for 2 byte unicode characters
maxFileLength = 154 // for 3 byte unicode characters
maxFileLength = 115 // for 4 byte unicode characters

Before azmi responded, I kicked off a Dropbox test as well using:

rclone test info --all dbcrypt-test:/test

with config:

[DB]
type = dropbox
client_id = <REDACTED>
client_secret = <REDACTED>
token = <REDACTED>
server_side_across_configs = true

[dbcrypt-test]
type = crypt
remote = DB:/Test
password = <REDACTED>
password2 = <REDACTED>
directory_name_encryption = true
filename_encoding = base32768
server_side_across_configs = true

The result of which was:

// dbcrypt-test (filename_encoding = base32768)
stringNeedsEscaping = []rune{
	'/', '\x00'
}
maxFileLength = 463 // for 1 byte unicode characters
maxFileLength = 231 // for 2 byte unicode characters
maxFileLength = 154 // for 3 byte unicode characters
maxFileLength = 115 // for 4 byte unicode characters
canWriteUnnormalized = true
canReadUnnormalized   = true
canReadRenormalized   = false
canStream = true