Improve encrypted directory entry performance

A thought on encrypted directory names. As I understand it, the only way to find out whether a directory entry exists is to read the entire set of encrypted directory entries and decrypt them all.

Now, with caching this isn't so bad, but I'm wondering if we can do better.

Every directory entry is encrypted with the key and the salt. Currently the salt is random, but I'm wondering if it could be the parent directory entry's encrypted name.

This way, one could easily determine whether a directory entry exists without having to read a large directory.

i.e.

dir_a
    file_a
dir_b
    file_c
    file_d

Let's ignore how we encrypt things in the root for now.

But if dir_a gets encrypted to ASFS (i.e. the name that exists on the remote)
and dir_b gets encrypted to $RFWD,
then
file_a -> encFunc(ASFS, key, file_a)
file_c -> encFunc($RFWD, key, file_c)
file_d -> encFunc($RFWD, key, file_d)
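
The mapping above can be sketched roughly as follows. This is only an illustration under loud assumptions: `enc_func` is a hypothetical name, and HMAC-SHA256 stands in for the real name cipher purely because it is keyed and deterministic. It is one-way, so unlike a real scheme it cannot decrypt names.

```python
import base64
import hashlib
import hmac


def enc_func(parent_enc: str, key: bytes, name: str) -> str:
    """Hypothetical stand-in for encFunc(parent, key, name).

    HMAC-SHA256 is used only because it is deterministic and keyed; it is
    one-way, so unlike a real name cipher it cannot be decrypted.
    """
    msg = parent_enc.encode() + b"\x00" + name.encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    # Truncate and base32-encode to get a short, filesystem-safe token.
    return base64.b32encode(digest[:10]).decode().lower()


key = b"demo key, not a real secret"

# Encrypted directory names as they already exist on the remote.
dir_a_enc = "ASFS"
dir_b_enc = "$RFWD"

file_a = enc_func(dir_a_enc, key, "file_a")
file_c = enc_func(dir_b_enc, key, "file_c")
file_d = enc_func(dir_b_enc, key, "file_d")

# The same plaintext name under two different parents.
same_name_a = enc_func(dir_a_enc, key, "file_a")
same_name_b = enc_func(dir_b_enc, key, "file_a")
```

Since the parent's encrypted name is already known when you walk a path, the child's encrypted name can be computed directly, and the same plaintext name no longer maps to the same token in every directory.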

When a file is moved to a new directory, its filename gets re-encrypted based on its new parent (which probably has to happen anyway today within move support).

For root directories, perhaps we can generate a random salt to be used by all of them.

Now, I see the security implication: you severely cut down the number of keys that have to be tried to decrypt all files, but this might be a valid tradeoff in terms of performance.

Thoughts?

No, we know exactly the name of a directory when encrypted so we don't have to search for it. This is why crypt uses the EME encryption: https://rclone.org/crypt/#name-encryption
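
To illustrate why no search is needed: with a deterministic name cipher, the encrypted name can be computed from the plaintext and checked directly. A minimal sketch, assuming each path segment is encrypted independently with the same key. HMAC-SHA256 is a stand-in here, not EME (crypt's real cipher is reversible), the function names are hypothetical, and a Python set plays the role of the remote.

```python
import base64
import hashlib
import hmac


def encrypt_name(key: bytes, name: str) -> str:
    """Hypothetical deterministic name transform (HMAC stand-in, not EME)."""
    digest = hmac.new(key, name.encode(), hashlib.sha256).digest()
    return base64.b32encode(digest[:10]).decode().lower()


def encrypt_path(key: bytes, path: str) -> str:
    """Encrypt each path segment independently of its parent."""
    return "/".join(encrypt_name(key, seg) for seg in path.split("/"))


key = b"demo key, not a real secret"

# Pretend this set is what exists on the remote.
remote = {encrypt_path(key, p) for p in ("dir_a/file_a", "dir_b/file_c")}


def exists(path: str) -> bool:
    # One direct membership check; no listing or decrypting of siblings.
    return encrypt_path(key, path) in remote
```

With these assumptions, `exists("dir_a/file_a")` needs only the key and the plaintext path, never a listing of the directory.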

So my system would actually be stronger then. I.e. today, if I have

/dir_a/file_a
/dir_b/file_a

I'll get something like

SFSF/DEFA
WFSF/DEFA

But in what I described, it would still be easy to do lookups, while each directory entry would encrypt differently based on the directory it is part of.

But OK, perhaps not as big a win as I was thinking.

I think using the parent directory's name is the system that another encrypted overlay uses. You can also have a per-directory IV, which is what gocryptfs uses.

It is a good idea, but in my view it compromises data integrity: you can't rename the parent directory without renaming all the files inside it, and you can't decrypt a filename without knowing the parent directory's name, which means trouble in data recovery situations.
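
The rename cost can be made concrete with a small sketch. Everything here is an assumption for illustration: `enc_func` and `encrypt_tree` are hypothetical names, HMAC-SHA256 is a one-way stand-in for a real name cipher, `ROOT` is a made-up shared salt for top-level entries, and the `sub/file_e` entries are invented to show nesting.

```python
import base64
import hashlib
import hmac


def enc_func(parent_enc: str, key: bytes, name: str) -> str:
    """Hypothetical parent-salted name transform (HMAC stand-in, one-way)."""
    msg = parent_enc.encode() + b"\x00" + name.encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return base64.b32encode(digest[:10]).decode().lower()


def encrypt_tree(parent_enc: str, key: bytes, tree: dict) -> dict:
    """Map each plaintext path to its encrypted path.

    `tree` maps a name to a nested dict (directory) or None (file).
    """
    out = {}
    for name, children in tree.items():
        enc = enc_func(parent_enc, key, name)
        out[name] = enc
        if children is not None:
            for sub, sub_enc in encrypt_tree(enc, key, children).items():
                out[f"{name}/{sub}"] = f"{enc}/{sub_enc}"
    return out


key = b"demo key, not a real secret"
root_salt = "ROOT"  # hypothetical shared salt for top-level entries
contents = {"file_c": None, "file_d": None, "sub": {"file_e": None}}

# The same contents, before and after renaming dir_b to dir_b2.
before = encrypt_tree(root_salt, key, {"dir_b": contents})
after = encrypt_tree(root_salt, key, {"dir_b2": contents})

# Count entries below the renamed directory whose own encrypted name changed.
below = ("file_c", "file_d", "sub", "sub/file_e")
changed = sum(
    before[f"dir_b/{p}"].split("/")[-1] != after[f"dir_b2/{p}"].split("/")[-1]
    for p in below
)
```

In this sketch `changed` covers every entry under the renamed directory, including the nested one, so a single directory rename turns into a rename of the whole subtree on the remote.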

There is a good summary of encrypted overlay filesystems on the gocryptfs website.

Rclone is very similar to gocryptfs (it uses EME so avoids a prefix leak), but it does leak identical file names, and the max file name length is 143.

Ah, reasonable point. As usual it's a give and take, and we have to get out of our own frames of reference (which really aren't even always true, i.e. while my remote is mostly append-only, I do move things around here and there).
