Crypt hash possible?

Hey guys :slight_smile:
I might have an idea on how to make the hash work for crypt...
I guess that the files are locally encrypted before they are uploaded, right?
Couldn't you create one file per directory, both locally and remotely, with the following things in it:
name; encrypted name; unencrypted hash (with room for different hash types) + encrypted hash (with room for different hash types)
At the time you upload the file, you could compare the MD5 value you have saved against the one from the Google interface.
You could also check the encrypted file again after upload.
what do you think?
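As a rough sketch of that per-directory file (everything here is hypothetical: the .hashindex name, the field layout, and the demo data are illustration only, not anything crypt actually writes), the plaintext side could be generated like this:

```shell
# Hypothetical per-directory index: one line per file with
# "name;plaintext-md5". The "encrypted name" and "encrypted hash"
# columns proposed above would have to be filled in by crypt itself.
mkdir -p demo
printf 'hello\n' > demo/test1.txt
for f in demo/*; do
  printf '%s;%s\n' "$(basename "$f")" "$(md5sum "$f" | cut -d' ' -f1)"
done > demo/.hashindex
cat demo/.hashindex
```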

And when you use the hash command, you could update these files with additional hashes.

Greetings
Xyz00777

No, they are encrypted on the fly as they are being transferred.

Each time you upload a file, the hash changes because of the encryption. You'd have to keep a list of source and destination file names and somehow match them together in some fashion. It's definitely a complex issue.

Because I don't know how it will look here, I uploaded my idea to Pastebin because of the formatting.

I hope I haven't forgotten anything :slight_smile:
https://pastebin.com/nD6JM2qu

Hmm, so you mean you'd keep a local mapping of hash(encrypted file) to hash(plaintext file)

When you looked up a hash on drive, to get the actual hash you'd run it through the map.

Neat idea... You would be unlucky to get a hash collision so I think that map is all you would need.
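A minimal sketch of that map, using throwaway local files (the "ciphertext" here is just a stand-in string, and the map format of "encrypted hash, space, plaintext hash" is an assumption, not an existing rclone feature):

```shell
# Build one map line: md5(encrypted file) -> md5(plaintext file).
plain=$(mktemp); printf 'some data\n' > "$plain"
enc=$(mktemp);   printf 'pretend-ciphertext\n' > "$enc"  # stand-in for crypt output
map=$(mktemp)
printf '%s %s\n' "$(md5sum "$enc" | cut -d' ' -f1)" \
                 "$(md5sum "$plain" | cut -d' ' -f1)" >> "$map"
# Lookup: given the hash the remote reports (the encrypted one),
# recover the plaintext hash to compare against a local md5sum.
awk -v k="$(md5sum "$enc" | cut -d' ' -f1)" '$1 == k {print $2}' "$map"
```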

I'm considering a V2 crypt file format which would have a header or footer with a hash in it, BTW

And the header/footer would get extracted when the file gets downloaded, to verify the hash? (How would you handle it with mount? There the files mostly don't get fully downloaded, only viewed...)

But even then, an additional file with names + hashes would be great for an integrity check with two hashes:
when you download a file with the header and the encrypted file got corrupted during the download, you can't be sure whether only the header/footer, the data, or both got corrupted.
Here you need an additional file with the hash of the unencrypted data in it, so you can still verify the hash when:
the data is encrypted and the hash is not the same as the one in the header/footer.
Because of that, the additional file should get uploaded too. That raises a good question:

  • Do you want to save the filename + hash as a single file per uploaded file?
    (easier to find the correct hash, and easier to implement because you don't have to download the complete per-directory file)
    (the name of the hash file could be .hashFILENAME (the literal word "hash", not the hash of the file) or something like that, so it stays hidden)
  • Or do you want to save all filenames + hashes in the same file?
    (you have to search inside the file for the correct name and extract it to check whether the hash is correct)
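The first option could be sketched like this (the ".hash" prefix and the "md5:" line format are the hypothetical convention discussed here, not an existing rclone feature):

```shell
# One hidden sidecar per file: ".hash<name>" holding its hash(es).
f='funvideo.mkv'
printf 'dummy video data\n' > "$f"
sidecar=".hash${f}"
printf 'md5:%s\n' "$(md5sum "$f" | cut -d' ' -f1)" > "$sidecar"
cat "$sidecar"
```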

You could probably also save the hash of the encrypted file as reported by the storage provider, so you can check whether the file got corrupted on the provider's side, and if so, upload it again.
(It would be great to get a notification about this when you copy or sync files.)
(A dedicated command to check whether the encrypted file hashes are still the same or have changed, so you could upload those files again, would probably be worth it.)
I know the chance that a file gets corrupted is really small, but I don't think it's a huge effort to do this too, because I think the technical backend is already finished? (where the storage provider supports it)

What do you want to do with the data that is already uploaded? I think many people have a big amount of data in the cloud; downloading it all and uploading it again with the additional header/footer would be far from nice...
(I finished my 5 TB backup a few days ago and the upload took around 4 weeks.)
Here the additional file would be great (download speed is usually much better than upload speed), because you could implement it like this:

  • New data gets uploaded with the hash in the header/footer plus a file with filename + hash.
  • Old data only gets the additional file with filename + hash,
    because when you download an old file and it has no hash in the header/footer, you can fall back to the hash in the .hashFILENAME file that gets downloaded alongside it.
    This way you only have one hash to verify the integrity of the file, but that's better than nothing.

An additional command would be great for people who want to download their old files from the cloud, add the hash to the header/footer, and upload the files again (naturally with a check that the re-uploaded data arrived correctly :wink:)

Here we would need an additional flag or marker, saved in the header or somewhere, so we can verify (without downloading) whether a file already has a hash in the header/footer or not. Or maybe in the .hashFILENAME file?
Like: the command starts; the first file has X in the first part of the header = no need to download and update it; the second file has no X in the first part of the header = download it, add the hash to the header/footer, upload it again, and so on.
The upload must work the same way as in my first idea (Pastebin), so the file with the new hash doesn't get corrupted during upload...
What do you think about my download/upload idea? Or is this already implemented like that for unencrypted files?

When the file gets opened, the hashes for it will become available, yes.

Mount checks hashes if the whole file gets downloaded

If you want hashes in an additional file you can do that already with the chunker backend - you set the chunk size very big and use the md5all hash option. I think that works.
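For reference, that setup might look roughly like this in rclone.conf — the remote names here are placeholders, and you should check the chunker backend docs for the current option names before relying on this:

```ini
[hashed]
type = chunker
# point this at your existing (e.g. crypt) remote
remote = mycrypt:
# make the chunk size huge so files are effectively never split
chunk_size = 1P
# md5all: store an MD5 for every file, even unchunked ones
hash_type = md5all
```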

I quite often save md5sum files when archiving things (generate these with rclone: rclone md5sum path > MD5SUM), which is kind of what you are talking about too.
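That workflow, sketched with coreutils (rclone md5sum emits the same "hash  name" lines as md5sum, so a file generated with rclone should be checkable the same way with md5sum -c):

```shell
# Record hashes when archiving...
mkdir -p archive
printf 'important notes\n' > archive/notes.txt
( cd archive && md5sum notes.txt > MD5SUM )
# ...and verify later that nothing got corrupted.
( cd archive && md5sum -c MD5SUM )
```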

Writing extra files gets really complicated quickly so I'm trying to avoid it for crypt...

Okay, and what about the files that are already uploaded?

Because these files currently don't have any checksum in the header/footer to verify against? And as I said, it wouldn't be awesome if people had to download and UPLOAD all of their data again...
That's why I was thinking about the additional file with the hash inside it...
I thought the unencrypted filenames could look something like this:
test1.txt
.hashtest1.txt.txt
funvideo.mkv
.hashfunvideo.mkv.txt
TEST.excel
.hashTEST.excel.txt

It is a good idea for backwards compatibility. So keep the crypt format the same and add an additional sidecar which you could upload afterwards (not sure where you'd get the hashes from though without downloading the file?).

I think downloading the file, creating the hash, and adding the hash (however that's done) to the file that is already uploaded is okay, because download speed is normally much higher than upload speed.
I don't know how you could add the hash data to the already-uploaded data, which is why I suggested the additional file...

OK

I meant the additional file.

Okay :+1: Do you need any additional information or anything?
Should I open a GitHub issue, or will you open it?
What is the procedure now?

I added a note of this conversation to my notes for the Crypt v2 redesign :slight_smile:

Is there a place where I can see the development progress?

Sure: https://github.com/rclone/rclone/issues/3667

Feel free to comment on there if you want!

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.