Enabling sharing encrypted files

spotter · January 26, 2020, 7:21am

Throwing out an idea:

It would be interesting if one could use crypt on a Google Drive, but also share individual files (encrypted ones, by providing the key).

I see 2 impediments to this

1. Being able to associate a decrypted file path with the encrypted path one wants (probably solvable)
2. Being able to encrypt each file with an "individual key" that won't reveal one's master key.

re 2: a thought. One encrypts the file name as it does today with the master key (with salting, presumably), then uses that encrypted file name + master key to create a per-file key that can be regenerated at will to encrypt the file.

if we could share the file with the real name + the per-file key, we could easily write a utility to enable people to fetch the individual files we want to share.

note: I think if this is doable, one would still want to run this encryption scheme by an expert to ensure it doesn't create a weakness in the overall encryption

thoughts?

ncw · January 26, 2020, 10:17am

Rclone crypt doesn't have "per file" encryption at the moment, so if you shared a file + key you'd be sharing the key for the whole crypt.

If you didn't mind that (say you set up a crypt for sharing) then you could make this work with rclone as it is with a little bit of manual work.

Files encrypted with an individual key might be best left to using an external program, eg gpg?

spotter · January 26, 2020, 11:22am

I agree that right now it wouldn't work. but I don't see why it would be that hard to do.

one would need to generate a per-file key in a predictable way, that's why my thought experiment used the salted / encrypted file name to do that.

i.e.

in

// put implements Put or PutStream
func (f *Fs) put(ctx context.Context, in io.Reader, src fs.ObjectInfo, options []fs.OpenOption, put putFn) (fs.Object, error) {
        // Encrypt the data into wrappedIn
        wrappedIn, err := f.cipher.EncryptData(in)
        if err != nil {
                return nil, err
        }

if we could generate a per-file key (in a predictable manner) and use that to encrypt it, this per-file key could be shared without providing any other access to one's files.

though haven't been able to understand yet how the read side works. still working on that.

ncw · January 26, 2020, 12:36pm

How about providing a new rclone command encrypt and another decrypt which would encrypt a stream...

then you could do

# upload
rclone encrypt </path/to/file PASSWORD | rclone rcat remote:path/to/file
# download
rclone cat remote:path/to/file | rclone decrypt >/path/to/file PASSWORD

Though there are existing utilities which could stand in for rclone there, eg gpg -c

spotter · January 26, 2020, 3:27pm

yes, if you only wanted to share 1 or 2 files here or there, it makes sense. I was thinking more along the lines of where you are storing a massive amount of data encrypted and want to be able to expose individual files and let people download and decrypt them without giving away your master password.

i.e. I only need one password to decrypt everything, but each file automatically has its own password (created by some function of master password and encrypted filename with master password, if this would be considered secure).

So I can query rclone

rclone export remote:/some/path/file

for a base physical backend, export (if supported) will just "export" the object, i.e. make it publicly visible to those who can fetch it and return the URL to it. crypt will wrap that, returning said URL, the "real name" (not the encrypted name) and the per-file encryption key.

Using this data, it would be simple to create a little utility to download and decrypt the file without exposing any of your other data.

one could also do

rclone unexport remote:/some/path/file to remove a file from being publicly available (if it was)

ncw · January 26, 2020, 4:06pm

This is sounding complicated! A per-file encryption key would mean storing it somewhere (in an additional file maybe or in the header) or deriving it from the file's name... That would introduce quite a bit of inefficiency in crypt and/or make it so you couldn't rename files.

spotter · January 26, 2020, 4:10pm

hmm, reasonable point re renaming. but yes, perhaps that makes it too complicated as you say. I'm not sure there's a good way to derive it otherwise. the only option would be, as you say, to create a header for the file on creation, but that is much more complicated

personally, I like the export/unexport idea, but can see why it might be biting off too much

spotter · January 27, 2020, 12:30pm

so digging into this further, I see the magic and nonce are stored and have to be read to decrypt a file. It would seem "easy" (scare quotes) to add a per-file encryption key in this context. I can imagine this adding a little bit of overhead (creating / encrypting the per-file key on creation and reading / decrypting it on read), but is this large in the context of everything else that is done?

ncw · January 27, 2020, 2:16pm

At the moment, the encryption for each file depends on

(nonce, masterKey)

I think what you are proposing is that we derive a per file encryption key

random = 32 bytes (say) of random numbers
fileEncryptionKey = derivationFunction(random, masterKey)
File decrypted with (nonce, fileEncryptionKey)

We then store

(nonce, random) in the file in the clear.

This would mean that we could decrypt files as normal using the masterKey. We could also give people fileEncryptionKey for a single file without compromising masterKey.

I think that would work, depending on exactly what was used for the derivationFunction. A hash function keyed on masterKey would probably be the right approach or maybe we could use scrypt which rclone uses already, however that is quite computationally expensive.
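A minimal sketch of the derivation described above, assuming HMAC-SHA256 as the "hash function keyed on masterKey". The function names, the 32-byte salt size and the choice of HMAC-SHA256 are illustrative assumptions, not rclone's crypt code.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// deriveFileKey plays the role of derivationFunction(random, masterKey).
// Hypothetical helper: HMAC-SHA256 keyed on masterKey, applied to the
// per-file random bytes, yields a 32-byte per-file key.
func deriveFileKey(random, masterKey []byte) [32]byte {
	mac := hmac.New(sha256.New, masterKey)
	mac.Write(random)
	var key [32]byte
	copy(key[:], mac.Sum(nil))
	return key
}

// newFileRandom returns 32 fresh random bytes, which would be stored in
// the file header in the clear next to the nonce.
func newFileRandom() ([]byte, error) {
	random := make([]byte, 32)
	if _, err := rand.Read(random); err != nil {
		return nil, err
	}
	return random, nil
}

func main() {
	masterKey := []byte("example master key, not a real one")

	random, err := newFileRandom()
	if err != nil {
		panic(err)
	}

	writeKey := deriveFileKey(random, masterKey) // at upload
	readKey := deriveFileKey(random, masterKey)  // at download, random read back from the header

	fmt.Println("keys match:", writeKey == readKey)
	fmt.Printf("shareable per-file key: %x\n", writeKey)
}
```

Because the key depends only on the stored random bytes and masterKey, renaming the file would not change it, and handing out one per-file key does not let the recipient work backwards to masterKey as long as the keyed hash is one-way.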

spotter · January 27, 2020, 3:35pm

I'm not sure we need a nonce (the random nature of the per-file key means that 2 files with the same content will be encrypted differently, which I think is the point of the nonce here?) but I'm not a crypto expert, so perhaps it's needed for something else?

i'm not sure we even need a generation function, i.e. instead of storing random numbers, why not just store masterKey(fileKey)? any secure pseudo-random generator seeded with a good source of randomness (e.g. https://wiki.tcl-lang.org/page/Cryptographically+secure+random+numbers+using+%2Fdev%2Furandom) should be good enough to generate these keys.

each time rclone starts up (either as fs or standalone) it will seed its keygenerator with a secure seed.

every time it needs to make a new key, just takes the next key produced. encrypts it with master key and stores in header.

when it wants to read file, it reads header, like it has to for nonce, decrypts fileKey and creates a new cipher object using that key instead of master key.

but as I said earlier: I would be interested in having this design vetted by someone with more knowledge of crypto than me. All I've learned in my years using crypto is that it is easy to get stupid things wrong when one implements things on one's own
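The wrap-and-store steps above can be sketched as follows. This is only an illustration under stated assumptions: AES-256-GCM from the Go standard library stands in for whatever authenticated cipher crypt would actually use, and wrapKey, unwrapKey and the header layout (nonce followed by ciphertext) are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// newAEAD builds an AES-256-GCM instance from a 32-byte master key.
func newAEAD(masterKey []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(masterKey)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// wrapKey encrypts fileKey under masterKey and returns nonce||ciphertext,
// which is what would be written into the file header.
func wrapKey(masterKey, fileKey []byte) ([]byte, error) {
	aead, err := newAEAD(masterKey)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends the ciphertext and tag to its first argument (the nonce).
	return aead.Seal(nonce, nonce, fileKey, nil), nil
}

// unwrapKey reverses wrapKey; it returns an error if the header was
// modified or the wrong master key is used.
func unwrapKey(masterKey, header []byte) ([]byte, error) {
	aead, err := newAEAD(masterKey)
	if err != nil {
		return nil, err
	}
	if len(header) < aead.NonceSize() {
		return nil, fmt.Errorf("header too short: %d bytes", len(header))
	}
	nonce, sealed := header[:aead.NonceSize()], header[aead.NonceSize():]
	return aead.Open(nil, nonce, sealed, nil)
}

func main() {
	masterKey := make([]byte, 32)
	fileKey := make([]byte, 32)
	if _, err := rand.Read(masterKey); err != nil {
		panic(err)
	}
	if _, err := rand.Read(fileKey); err != nil {
		panic(err)
	}

	header, err := wrapKey(masterKey, fileKey) // on create
	if err != nil {
		panic(err)
	}
	recovered, err := unwrapKey(masterKey, header) // on read
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip ok:", string(recovered) == string(fileKey))
}
```

In this variant the per-file key is independent of the master key, so sharing it reveals nothing about the master key beyond one wrapped/unwrapped pair; the cost is the extra header bytes and one extra authenticated decryption per file open.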

ncw · January 27, 2020, 9:34pm

You are spot on for the purpose of the nonce and the nonce could take the place of the random bytes above I think. Not sure I'd be comfortable using nacl secretbox without a random nonce though.

I think this scheme would work, but there is an important detail, namely how the key is encrypted with the masterkey. I'd probably use nacl secretbox again which gives encryption and authentication at the cost of a 24-byte random nonce + 16 bytes of authenticator.

With this scheme, when you give someone the file decryption key to a file, then effectively you give them the plaintext and some encrypted ciphertext for the same thing (a crib) which is what you need for a known plaintext attack.

Using the one-way function I suggested doesn't have that problem, however it is academic as all modern ciphers are resistant to known plaintext attacks.

Indeed! I've done quite a lot of crypto stuff but I'm not a cryptographer by any means!

It might be good to look at some prior art, maybe the LUKS encryption scheme which solves the same problem.

spotter · January 28, 2020, 9:24am

Getting back to my thought on export / unexport functionality.

thinking that each backend should return an object of something like

type FileExport struct {
    URLs []string
    Name *string
    Password *string
}

where pointer types are used, the field would be optional (i.e. might be added by others or inherent / provided by the backend on actual download)

some backends might say that if you try to export a directory, they will fail (I'd argue it doesn't make sense for crypt or chunker as directories are changeable over time, but could wrap it in another struct if one really wanted to do that)

the idea is that I say

rclone export remote:/path/to/file

and each backend knows how to translate this request (possibly into multiple requests) to its underlying backend (perhaps all backends have to take a group of requests as from an API perspective, perhaps possible to group the requests to limit API calls and would just have to iterate over them itself if can't group them)

each backend would respond to its caller with the struct, which would then get "translated" appropriately. i.e. chunker would make sure the full list is populated (if it does the iteration itself) and crypt would add the unencrypted name and password.

it finally gets returned to the user as a serialized out json or yaml object.

I can then write a simple tool to read every file in order (skipping header data) / combining multiple / decrypting (if necessary) into a single file.

unexport would work in reverse, not having to return anything besides succeeded/failed and would remove the permissions that allow the file to be downloaded by anyone.

ncw · January 28, 2020, 1:39pm

That is essentially what the rclone link command does, only it only does one file or dir at once.

Can you describe your use case - what problem you are trying to solve - that would help me understand your proposal better.

spotter · January 28, 2020, 7:42pm

I'm using an encrypted store as essentially a cheap cold store backup (I mean, for $144 a year, once you hit 3 TB of storage you are at the same cost as AWS Glacier but without any of the other fees, and any other data is then essentially free), but sometimes I might want to share something with someone else. Today that would mean copying it locally, then uploading it back, a serious waste of time. with per-file keys, I could make it visible, share the key (basically the struct as described) and someone could make use of it.

rclone link is a basis for what I'd want, but I think it's important for there to be an undo for it as well. i.e. if propagated through a crypt backend, it would be a small pain to unpublish it (doable based on id returned, but annoying)

ncw · January 28, 2020, 9:01pm

Interesting point!

Ah I see!

The file will show as shared on the drive interface, but that isn't terribly helpful since you'll be seeing encrypted file names...

OK, so given the per-file encryption you'd do something like

rclone link --decrypt crypteddrive:path/to/file

This could then print an rclone command like this

rclone copyurl --decrypt KEY PublicURL filename.ext

Which you'd pass onto your user.

spotter · January 28, 2020, 9:38pm

essentially yes. to me it feels like a very useful thing (think it would also be useful to delink files through rclone, perhaps with a warning that it will remove all permissions for other users)

ncw · January 29, 2020, 8:00am

Well I'm thinking of doing an update to the crypt file format to add hashes and other metadata, so adding per file encryption there would be the time to do it. I've made a note of it!

The link/unlink are relatively easy in comparison!

spotter · January 29, 2020, 10:05am

any idea if there would be an efficient way to convert from file format v1 to v2 (perhaps within GCP? or would GCP have the same 750GB a day limit)

i.e. if it didn't, presumably one could do rclone move remote_old: remote_new: or something along those lines?

ncw · January 29, 2020, 11:00am

If you were doing per-file encryption you'd be re-encrypting the files so I think rclone move would be the right solution.

fpraden · January 30, 2020, 5:16pm

Hi,
Maybe if you update the crypt file format, it may be wise to address other concerns about encryption, like

delegation of an entire directory
Adding/deleting user right to decrypt files/dir and subdir
long filename / pathname

Maybe having some kind of GPG encryption to add users, a shared "master" key for directories/files encrypted via GPG, and removing rights by changing the "master" key of the files.

It may be a little harder to think through this kind of feature.

I was thinking of that for myself, to integrate it with git-annex and the rclone module to have a big repository and share some parts of it with multiple users with different rights
(that's why I saw this post)

Some research linked to that subject I found (in French) https://hal.inria.fr/tel-02394349
Best