Sync Container Files to Encrypted Remote (Cryptcheck)

What is the problem you are having with rclone?

I have some container files that I want to sync to an encrypted remote. When I modify files within the container files, the container file itself does not change in size or modified date (edit: however, the MD5 of the container does change). Because of this, sync doesn't detect any changes and the updated container files are not synced to the remote.

I understand that --checksum doesn't work on encrypted remotes, but I'm a bit confused on how I'm supposed to use cryptcheck with the command I'm running (listed below). cryptcheck does indeed detect that the container files have changed, I'm just not sure how to incorporate it into my backup script (ie. do I parse the output of cryptcheck for the file names, then do a copy of them to the remote? is it possible to keep this to a single command?).

What is your rclone version (output from rclone version)

rclone v1.52.2
- os/arch: darwin/amd64
- go version: go1.14.4

Which OS you are using and how many bits (eg Windows 7, 64 bit)

MacOS Catalina 10.15.5

Which cloud storage system are you using? (eg Google Drive)

DigitalOcean Spaces

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone -v -P sync /Users/me/Documents digitaloceanspace:spacename/Documents

A log from the command with the -vv flag

When I run the command above, it works normally but just doesn't notice the changed container files. It lists them in the output with: Size and modification time the same (differ by 0s, within tolerance 1ns)

If the size/modtime doesn't change and you are trying to copy it to a crypt remote, how would you know the source changed?

You'd have to figure out a way to identify changes and upload those as you can't use a checksum unless you get really programmatic and keep a copy of source and remote and compare.

If the size/modtime doesn't change and you are trying to copy it to a crypt remote, how would you know the source changed?

Because the MD5 hash of the source container changes when I update/change its content (sorry, just realizing that I didn't mention this in my post). This is why cryptcheck recognizes that the container files on source and remote are different (their MD5's are different). I just don't know how to programmatically incorporate that output into a way of copying or syncing the updated source container to the encrypted remote.

Right, but how you know it changed is what I was asking, not how you know the remote does not match.

If you track the local, that's one way to do it.

I don't think it's going to be effective to run cryptcheck all the time to compare MD5 sums, as that doesn't seem like it would scale well. How many files are you dealing with?

Right, but how you know it changed is what I was asking, not how you know the remote does not match.

Ah I see. I know because I update it on the source manually (I think this is what you are asking?). I run sync to the remote daily, but I manually update this container maybe 1-3 times a month.

I don't think it's going to be effective to run cryptcheck all the time to compare MD5 sums, as that doesn't seem like it would scale well. How many files are you dealing with?

Just one!

I love how elegant just running a single sync command daily is and I was hoping to somehow make that work with this container file too.

So if you are running it and you know it changed, how about just forcing it to transfer?

--ignore-times                         Don't skip files that match size and time - transfer all files

If it's only one file, that would work.

You could also just script it: grab the checksum of the file, compare it with the checksum from the previous run, and upload the file if it has changed.

One file is not bad at all to work around.
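A minimal sketch of that checksum-tracking idea, in case it helps. The function names `file_md5` and `upload_if_changed` and the state file are made up for illustration; the only rclone pieces used are `copy` and `--ignore-times`. It assumes either `md5sum` (Linux) or `md5` (macOS) is installed.

```shell
#!/bin/sh
# Hypothetical helper: print the MD5 of a file with whichever tool exists.
file_md5() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum "$1" | awk '{print $1}'
    else
        md5 -q "$1"   # macOS
    fi
}

# Upload file $1 to remote directory $2 only if its MD5 differs from the
# one recorded in state file $3 on the previous run.
upload_if_changed() {
    file=$1
    dest=$2
    state=$3
    new=$(file_md5 "$file") || return 1
    [ -n "$new" ] || return 1
    old=$(cat "$state" 2>/dev/null || true)
    if [ "$new" = "$old" ]; then
        echo "unchanged"
        return 0
    fi
    # Size and modtime look identical to rclone, so force the transfer.
    rclone copy --ignore-times "$file" "$dest" || return 1
    # Record the new checksum only after a successful upload.
    printf '%s\n' "$new" > "$state"
    echo "uploaded"
}
```

Something like `upload_if_changed /Users/me/Documents/container.img digitaloceanspace:spacename/Documents ~/.container.md5` would then run after the normal sync (the container name and state file path are placeholders). The trade-off is that the checksum has to be stored locally.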

It sounds like scripting this is my only option unfortunately.

I'm going to write a script that:

  1. Runs my normal sync (the one in my original post)
  2. Runs cryptcheck on this particular file (I don't want to store the MD5 locally)
  3. If cryptcheck says source and remote don't match....
  4. Copy the container from source to remote with --ignore-times

Does this sound like the best way to do this? I was hoping for something a little simpler, but this is okay :)

And if this is the best way to do it, do I use copy or sync in step #4?

That sounds fine too.

I'd just use copy as we are talking about one file and sync can be destructive if things aren't what you'd expect.

I only use sync when I want to keep source and destination the same. With one file, copy is plenty fine and safer in case you mistype something.
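For reference, the four steps with copy for step 4 could be sketched roughly like this. The paths come from the command in the original post; `CONTAINER` and the function name `backup` are hypothetical, and using `--include` to limit cryptcheck and copy to the one file is an assumption about how to target it, not something confirmed in this thread. It relies on cryptcheck exiting non-zero when it finds a difference, which also happens on errors, so an error triggers a re-upload too.

```shell
#!/bin/sh
# Paths from the original post; CONTAINER is a hypothetical file name
# relative to SRC.
SRC=${SRC:-/Users/me/Documents}
DST=${DST:-digitaloceanspace:spacename/Documents}
CONTAINER=${CONTAINER:-container.img}

backup() {
    # Step 1: the normal daily sync.
    rclone -v sync "$SRC" "$DST" || return 1

    # Steps 2-3: check only the container file. A non-zero exit means
    # cryptcheck found a difference (or hit an error).
    if rclone cryptcheck --one-way --include "/$CONTAINER" "$SRC" "$DST"; then
        echo "container matches"
        return 0
    fi

    # Step 4: force the upload even though size and modtime match.
    echo "container differs, re-uploading"
    rclone copy --ignore-times --include "/$CONTAINER" "$SRC" "$DST"
}
```

Calling `backup` from the daily job would replace the bare sync. Note that `--include` takes a glob pattern, so a container name with special characters would need escaping.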

Great, thanks for your help!

Not a problem! Happy we got something figured out as it's always fun to learn and see what people are doing as it teaches me more!

Not to drag this thread out, since I have a workable solution, but I am curious: is that something that makes sense as a feature for rclone? ie. allow --checksum to be used with crypt and encrypted remotes.

It would be a great addition in my case, but I obviously don't know everything that would go along with creating something like that.

Edit: looks like this has been discussed here: Make --checksum work with crypt remotes

And I think there's an open issue on Github here: https://github.com/rclone/rclone/issues/1712

Yep, once that issue gets solved, you'd have an easier workaround.

There is also this: https://github.com/rclone/rclone/issues/3667

Which is probably my preferred approach over --cryptsum

Sorry missed the beginning of this thread...

Are you mounting the files? I've noticed that mount doesn't update the modtime of the files. One thing you could do is get whatever mounts the files to touch them.

Mount not updating the modtime of loopback mounted files is probably a bug in the linux kernel...

Are you mounting the files?

Yes.

One thing you could do is get whatever mounts the files to touch them.

This would update the mod time, true, but the tools I'm using can't do this unfortunately.

I think my only practical option right now is to run cryptcheck, parse those results, touch the changed files, then run my normal sync. This is slightly different than the steps I outlined previously but I think it's a better, more universal solution.

If you use cryptcheck from the latest beta then you won't need to parse the output, as it now has some useful reporting flags:

$ rclone cryptcheck --help

rclone cryptcheck checks a remote against a crypted remote. This is
the equivalent of running rclone check, but able to check the
checksums of the crypted remote.

For it to work the underlying remote of the cryptedremote must support
some kind of checksum.

It works by reading the nonce from each file on the cryptedremote: and
using that to encrypt each file on the remote:. It then checks the
checksum of the underlying file on the cryptedremote: against the
checksum of the file it has just encrypted.

Use it like this

rclone cryptcheck /path/to/files encryptedremote:path

You can use it like this also, but that will involve downloading all
the files in remote:path.

rclone cryptcheck remote:path encryptedremote:path

After it has run it will log the status of the encryptedremote:.

If you supply the --one-way flag, it will only check that files in
the source match the files in the destination, not the other way
around. This means that extra files in the destination that are not in
the source will not be detected.

The --differ, --missing-on-dst, --missing-on-src, --src-only
and --error flags write paths, one per line, to the file name (or
stdout if it is -) supplied. What they write is described in the
help below. For example --differ will write all paths which are
present on both the source and destination but different.

The --combined flag will write a file (or stdout) which contains all
file paths with a symbol and then a space and then the path to tell
you what happened to it. These are reminiscent of diff files.

  • = path means path was found in source and destination and was identical
  • - path means path was missing on the source, so only in the destination
  • + path means path was missing on the destination, so only in the source
  • * path means path was present in source and destination but different.
  • ! path means there was an error reading or hashing the source or dest.

Usage:
rclone cryptcheck remote:path cryptedremote:path [flags]

Flags:

      --combined string         Make a combined report of changes to this file
      --differ string           Report all non-matching files to this file
      --error string            Report all files with errors (hashing or reading) to this file
  -h, --help                    help for cryptcheck
      --match string            Report all matching files to this file
      --missing-on-dst string   Report all files missing from the destination to this file
      --missing-on-src string   Report all files missing from the source to this file
      --one-way                 Check one way only, source files must exist on remote
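With those flags, the "cryptcheck, touch, then sync" idea could look something like the sketch below. It assumes a build that has `--differ` (the beta mentioned above), and that `--differ -` prints only the differing paths, relative to the source, on stdout. The function name `touch_changed_and_sync` is made up, and the paths are the ones from the original post.

```shell
#!/bin/sh
# Paths from the original post.
SRC=${SRC:-/Users/me/Documents}
DST=${DST:-digitaloceanspace:spacename/Documents}

touch_changed_and_sync() {
    # List files whose content differs on the crypt remote, one path per
    # line, and bump the local modtime of each so sync will pick it up.
    # cryptcheck exits non-zero when files differ, so its status is ignored.
    rclone cryptcheck --one-way --differ - "$SRC" "$DST" |
    while IFS= read -r path; do
        if [ -n "$path" ]; then
            touch "$SRC/$path"
        fi
    done

    # The normal sync now sees a newer modtime and re-uploads those files.
    rclone -v sync "$SRC" "$DST"
}
```

This keeps the daily job to one function call and works for any number of container files, at the cost of cryptcheck re-encrypting and hashing every source file on each run.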
