For every file on Google Drive I pre-made a .hashm file containing its MD5. How do I verify them?

Indeed, for file1.txt,
the checksum file is named file1.txt-EQ.hashm;
it's pre-calculated on my PC and then uploaded with the archive to Google Drive.

The .hashm file content has 4 columns: md5, filename, timestamp (of any activity), action (create/verify).
The last two may not be standard and may be meaningful only to me.
Let's talk about the 1st and 2nd: these are the same as the standard md5 file format.

So I've got a folder with subfolders on Google Drive.
I read that rclone only RECENTLY added the md5sum command,
which

  1. can retrieve the MD5 for a single file (pre-calculated by the remote server),
    and 2. (GPT told me) it can also retrieve MD5s for a whole folder, in which case the relative path/filename is in the 2nd column (illustrated below).
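
For illustration only (the file names and hashes below are made up), recursive output over a folder prints the hash, two spaces, then the path relative to the folder you asked for:

0123456789abcdef0123456789abcdef  sub/file1.txt
fedcba9876543210fedcba9876543210  sub/deeper/file2.txt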

Are there existing command-line parameters that make rclone md5sum check the server-pre-calculated MD5s against the local-PC-calculated MD5s stored in files on Google Drive, for every single file in a multi-layer folder?

If not, I'll have to use Python to do that.
thank you.

Example filename and content of the md5 .hashm:

0008_passwd_240115T122912.7z.001-EQ.hashm

7eb4d73972a07d9b02499eb0cae415bd *0008_passwd_240115T122912.7z.001 *240116T025734 *create

(The -EQ is for my own script, to indicate that the file's checksum is the same as it was at creation/calculation time; it could be removed if needed.)
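
For reference, a minimal Python sketch that parses one such line, assuming the fields are joined by " *" exactly as in the sample above (parse_hashm_line is a hypothetical helper name):

def parse_hashm_line(line):
    # .hashm fields: md5, filename, timestamp, action, joined by " *"
    md5, filename, timestamp, action = line.rstrip("\n").split(" *")
    return md5, filename, timestamp, action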

thanks

GPT-4 helped me write the Python scripts that handle this.

But as for this post's title:
GPT-4 says rclone md5sum can't do it,
and suggests I write some Python scripts with the help of rclone md5sum instead.

====== GPT-4

To verify the integrity of your files on Google Drive against your pre-calculated .hashm files, you can use a script that compares the MD5 checksums from your .hashm files with the checksums obtained using rclone md5sum. Here's a general approach:

  1. Retrieve Checksums from Google Drive: Use rclone md5sum remote:path/to/folder to get the MD5 checksums of the files stored on Google Drive. This command will list the checksums with their corresponding file names.
  2. Parse Your .hashm Files: Write a script (Python would be a good choice) to parse the .hashm files and extract the MD5 checksums and associated file names.
  3. Compare Checksums: In your script, compare the MD5 checksums from the .hashm files with those retrieved from Google Drive. This can be done by matching file names and then checking if the checksums are identical.
  4. Report Discrepancies: The script should report any discrepancies between the checksums, indicating potential issues with file integrity.

This approach involves some scripting but would automate the process of verifying file integrity against your pre-calculated hashes.
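
As a minimal sketch of step 1 (not GPT-4's code; it assumes rclone is on PATH and that md5sum prints "hash, two spaces, relative path" per file):

import subprocess

def remote_md5s(remote_path):
    # recursively list "md5  relative/path" lines for the remote folder
    out = subprocess.run(["rclone", "md5sum", remote_path],
                         capture_output=True, text=True, check=True).stdout
    sums = {}
    for line in out.splitlines():
        md5, _, rel_path = line.partition("  ")
        sums[rel_path] = md5
    return sums

Step 2 then corresponds to a parser like the parse_hashm_line sketch above.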


You

Do rclone md5sum's parameters already do this at the moment?

ChatGPT

As of now, rclone md5sum does not inherently compare pre-calculated MD5 checksums in your .hashm files with the checksums obtained from the remote storage. It primarily lists the MD5 checksums for files on the remote storage. To perform the comparison you're describing, you would still need to use a custom script or additional software to parse your .hashm files and then compare these checksums with those retrieved via rclone md5sum. This functionality is not built into rclone md5sum directly. For the latest updates or more specific functionalities, you might want to check the official rclone documentation.

Depends on your definition of RECENTLY - you can check on GitHub that it has been present at least since 2016, which means from the start...

Instead of asking GPT and trying to make sense of its half-truth answers, I suggest you check the documentation for rclone md5sum.

Another way to get hashes from gdrive is to use rclone lsf - you can configure which fields to list, e.g.:

rclone lsf drive: --format "tsph"
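
To consume that from a script, a minimal Python sketch (assuming the default ";" separator and that no path contains a ";"; rclone's --separator flag can change the delimiter):

import subprocess

def lsf_hashes(remote):
    # "tsph" prints "time;size;path;hash" per file; -R recurses,
    # --files-only skips directory entries
    out = subprocess.run(
        ["rclone", "lsf", remote, "-R", "--files-only", "--format", "tsph"],
        capture_output=True, text=True, check=True,
    ).stdout
    return {path: md5 for mtime, size, path, md5 in
            (line.split(";") for line in out.splitlines())}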

Nope. You will have to use your script-fu.

But the most important thing is: what exactly do you want to achieve? Do you want to check that the Google-stored MD5 hashes are correct?

Yeah, reality is very different.

md5sum takes folders, not individual files.

So at the moment, AI is an assistant only.

thanks

This is not really documented, but it does actually work with individual files:

% rclone md5sum example.txt
7db9f70bb54f2801534fd94280ea74ea  example.txt

Something pretty close to what you're looking for is the --checkfile HASH option in rclone check. But it does require your sumfile to be in the exact format shown above (hash + 2 spaces + relative path), so you would need to first get your file into this format (i.e. strip out the timestamp columns and such.)
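
For example, a minimal conversion sketch (assuming the .hashm fields are joined by " *" as in the sample earlier in the thread):

import sys

# read .hashm lines ("md5 *name *timestamp *action") on stdin and
# emit md5sum format ("md5" + two spaces + name) on stdout
for line in sys.stdin:
    fields = line.rstrip("\n").split(" *")
    sys.stdout.write(f"{fields[0]}  {fields[1]}\n")

The converted file can then be fed to something like rclone check --checkfile MD5 sums.md5 remote:folder (see the rclone check docs for the exact invocation on your version).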


I finally got this mostly done by using Python.

  1. Grab the whole folder's MD5s from Google Drive via rclone md5sum.

  2. Then compare them to the locally stored .hashm files of my own format, as sketched below.

Indeed, the checksums are computed by Google Drive independently, without knowing my pre-calculated values, so comparing the two MD5s is a valid integrity check.
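
For completeness, a minimal sketch of that comparison; it reuses the hypothetical parse_hashm_line and remote_md5s helpers sketched earlier in the thread and assumes the local folder tree mirrors the remote one, with each .hashm sitting next to the file it describes:

from pathlib import Path

def verify(local_root, remote_path):
    remote = remote_md5s(remote_path)  # relative path -> md5 from rclone
    for hashm in Path(local_root).rglob("*.hashm"):
        # each .hashm holds one line describing one archive file
        md5, name, _ts, _act = parse_hashm_line(
            hashm.read_text().splitlines()[0])
        rel = (hashm.parent / name).relative_to(local_root).as_posix()
        print("OK" if remote.get(rel) == md5 else "MISMATCH", rel)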

thanks all
