Files sometimes get corrupted after saving to an encrypted rclone mount (B2)

What is the problem you are having with rclone?

I'm mounting a crypt remote (Backblaze B2) via systemd. This works fine most of the time, but occasionally a file gets corrupted (0 bytes) during a save.

What is your rclone version (output from rclone version)

$ rclone --version
rclone v1.55.0
- os/type: linux
- os/arch: amd64
- go/version: go1.16.2
- go/linking: static
- go/tags: cmount

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Linux Mint 20.1

Which cloud storage system are you using? (eg Google Drive)

Backblaze B2

The command you were trying to run (eg rclone copy /tmp remote:tmp)

I'm just saving a file to a mounted, crypted Backblaze B2 remote.
The remote is mounted via systemd as follows:

[Unit]
Description=B2 encrypted remote
AssertPathIsDirectory=/mnt/rclone/crypt-b2-remote
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
ExecStart=/usr/bin/rclone mount \
        --config=/root/.config/rclone/rclone.conf \
        --allow-other \
        --cache-tmp-upload-path=/tmp/rclone/upload-crypt-b2-remote \
        --cache-chunk-path=/tmp/rclone/chunks-crypt-b2-remote \
        --cache-workers=8 \
        --cache-writes \
        --cache-dir=/tmp/rclone/vfs-crypt-b2-remote \
        --cache-db-path=/tmp/rclone/db-crypt-b2-remote \
        --vfs-cache-mode full \
        --checksum \
        --stats=0 \
        --checkers=16 \
        --dir-cache-time=30s \
        --cache-info-age=5m \
        --fast-list crypt-b2-remote:/ /mnt/rclone/crypt-b2-remote
ExecStop=/root/scripts/rclone_unmount.sh /mnt/rclone/crypt-b2-remote
Restart=always
RestartSec=10

[Install]
WantedBy=default.target
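
(The unmount script itself isn't reproduced here; assuming it just lazily unmounts the path it's given, a minimal sketch of such a helper would be:)

#!/bin/bash
# Minimal sketch of the unmount helper (the real rclone_unmount.sh may differ).
# Lazily unmount the FUSE mountpoint given as the first argument.
fusermount -uz "$1"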

The rclone config contents with secrets removed.

[b2-remote]
type = b2
account = <b2-account>
key = <b2-key>
hard_delete = true

[crypt-b2-remote]
type = crypt
remote = b2-remote:/mybucket
filename_encryption = standard
directory_name_encryption = true
password = <1024-bit pwd>
password2 = <1024-bit salt>

A log from the command with the -vv flag

The remote is mounted via systemd; this is the relevant output from the systemd journal:

May 18 20:33:01 mylaptop rclone[1865]: ERROR : testdir/mykeepass.kdbx: vfs cache: failed to open item: internal error: item "testdir/mykeepass.kdbx" already open in the cache
May 18 20:33:01 mylaptop rclone[1865]: ERROR : testdir/mykeepass.kdbx: Non-out-of-space error encountered during open
May 18 20:46:31 mylaptop rclone[1865]: ERROR : testdir/mykeepass.kdbx: vfs cache: failed to open item: internal error: item "testdir/mykeepass.kdbx" already open in the cache
May 18 20:46:31 mylaptop rclone[1865]: ERROR : testdir/mykeepass.kdbx: Non-out-of-space error encountered during open
May 18 20:52:49 mylaptop rclone[1865]: ERROR : testdir/mykeepass.kdbx: vfs cache: failed to open item: internal error: item "testdir/mykeepass.kdbx" already open in the cache
May 18 20:52:49 mylaptop rclone[1865]: ERROR : testdir/mykeepass.kdbx: Non-out-of-space error encountered during open
May 19 10:41:34 mylaptop rclone[1865]: ERROR : testdir/mykeepass.kdbx: vfs cache: failed to open item: internal error: item "testdir/mykeepass.kdbx" already open in the cache
May 19 10:41:34 mylaptop rclone[1865]: ERROR : testdir/mykeepass.kdbx: Non-out-of-space error encountered during open

So it seems it's the VFS cache that's causing the issue: "already open in the cache".

Edit: fixed the output of "rclone --version"

You've got a ton of flags for the cache backend (the --cache-* ones), but you aren't using that backend.

All of them can be removed.

--fast-list does nothing on a mount and can be removed as well.

Add -vv and --log-file /tmp/rclone.log to your mount command and share the full log.
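
With the cache backend flags and --fast-list gone, the ExecStart shrinks to roughly this (a sketch based on your unit; I've also dropped --checksum and --checkers here, which mainly matter for sync-style commands, so adjust as needed):

ExecStart=/usr/bin/rclone mount \
        --config=/root/.config/rclone/rclone.conf \
        --allow-other \
        --cache-dir=/tmp/rclone/vfs-crypt-b2-remote \
        --vfs-cache-mode full \
        --dir-cache-time=30s \
        --stats=0 \
        -vv \
        --log-file=/tmp/rclone.log \
        crypt-b2-remote:/ /mnt/rclone/crypt-b2-remote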

OK, thanks, I removed them.
I thought they were also related to the VFS cache and wanted to avoid conflicts in case I run multiple rclone mounts on the same system.

Alright, so that's probably the reason why I'm still seeing a high number of class C transactions with B2.
According to the docs --fast-list is a global flag, so I thought it also applied to a mount.

Done. I'll report back with the full log when it happens again.

The mount-specific flags are here:

https://rclone.org/commands/rclone_mount/

You probably want to give it a max size too:

--vfs-cache-max-size SizeSuffix      Max total size of objects in the cache. (default off)
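
For example (10G is just an illustrative value):

        --vfs-cache-max-size 10G \

Note the cache can still exceed the limit while files are open; only closed files get evicted.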

Hmm, that looks like a bug.

Can you find a way to replicate that?

If I can replicate it here then I can fix it :slight_smile:

Hi Nick,

I know a reproducer would be helpful, but unfortunately I haven't found a way to reproduce it yet.
I first noticed it because at some point my entire KeePass database was suddenly gone, right after I added a new entry.
What happened is that after adding the new entry the .kdbx file was auto-saved but got corrupted. Luckily I had a recent backup of the file, and after restoring that backup I re-added the new entry; this time the autosave worked just fine (as usual).
I regularly add or change entries in the KeePass database, and from time to time it gets corrupted and I have to restore a backup.

Unfortunately that's the only info I have.

I had a look at the code and I think it might be a race between the cache cleaner and opening files, but the problem wasn't obvious to me from reading the code.

I tried writing a program to make it happen, but I haven't succeeded yet.

Can you try 1.55.1 please? That had an important fix in the VFS layer which might be relevant, but it might not, so it would be worth running it to see if it fixes things.

If you could capture a log with -vv of this event happening, then it would be nearly as good as a reproduction.

I appreciate the log might be a) large and b) have secret stuff in, so instead of posting a link here you might like to private message me a link or email it to nick@craig-wood.com

Thanks for taking the time to investigate the issue in more detail.

Actually, I just upgraded to 1.55.1 because I experienced another issue, related to an SFTP backend, which might be resolved in this latest version as well.

The -vv option is already enabled as well, because that's what Animosity022 suggested in an earlier post.
If it happens again with 1.55.1, I will definitely provide you with the detailed (-vv) log.

You may also want to harden your KeePass settings for File Input/Output:

[screenshot: KeePass's File Input/Output options, including "Use file transactions for writing databases"]

@ncw it just happened again with v1.55.1

I'll PM you a link to the logfile.
For privacy reasons, however, I had to strip the logfile, because it contained a lot of (very) sensitive customer data. So I removed all older, non-relevant log entries and kept only the entries from around the time the issue happened.

Hi, thanks for the hint, but I'm using KeePassXC on Linux, and as far as I can see it doesn't support file transactions, unfortunately.

Thanks

That looks fine. All I really need are the log lines that mention keepass and I think they are all there.
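
For anyone following along, something as simple as this pulls those lines out (assuming the log path suggested earlier in the thread):

grep -i keepass /tmp/rclone.log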

At first glance I see that KeePass writes to a temporary file then renames it. I bet that is the cause of the problem - there are an unbelievable number of corner cases involved in renaming stuff!

I will investigate further and report back here.
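
If anyone wants to try to reproduce it in the meantime, a crude loop that mimics that save pattern (write a temp file, rename it over the target) on the mount might tickle the race. Paths here are just examples, and there's no guarantee this triggers it:

#!/bin/bash
# Hypothetical stress test: imitate KeePass's save-by-rename on the mount
# and check the target's size after every iteration.
target=/mnt/rclone/crypt-b2-remote/testdir/stress.kdbx
for i in $(seq 1 1000); do
    tmp="$target.tmp"
    head -c 65536 /dev/urandom > "$tmp"    # write a fresh "database"
    mv "$tmp" "$target"                    # rename over the old file
    size=$(stat -c %s "$target")
    if [ "$size" -ne 65536 ]; then
        echo "possible corruption at iteration $i: size=$size"
        exit 1
    fi
done
echo "no corruption seen in 1000 iterations"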

Great.

Yes, AFAIK KeePass indeed writes to a temp file first.

Thanks!

not sure of your exact use case, but i depend heavily on keepass and try to be paranoid about it.
i would not use rclone mount, as rclone copy/sync/move is more reliable.
i want to get the database backed up as quickly and reliably as possible,
so i trigger a backup each time the database is saved.

using keepass, i have a trigger that:

  1. copies the database to a local backup server
  2. 7-zips it with a strong password and copies it to wasabi
  3. copies the database to another crypt on onedrive

each command uses --backup-dir with a date.time stamp, giving forever-forward incremental backups,
and i tweak the bucket policies to prevent deletions.

this would be a simplified example
rclone.exe copy "C:\Users\user01\AppData\Local\Temp\keepass\zip\keepass.20210317.100528.7z" "wasabicryptbt:en07/keepass/zip/backup" --backup-dir=wasabicryptbt:en07/keepass/rclone/archive/20210317.100528 --log-level=DEBUG --log-file=C:\data\rclone\logs\keepass_wasabicryptbt_files_zip\20210317.100528\rclone.log
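
the other steps follow the same pattern; for example the onedrive step might look like this (remote and path names made up for illustration):

rclone.exe copy "C:\Users\user01\Documents\keepass\keepass.kdbx" "onedrivecrypt:en07/keepass/backup" --backup-dir=onedrivecrypt:en07/keepass/archive/20210317.100528 --log-level=DEBUG --log-file=C:\data\rclone\logs\keepass_onedrivecrypt\20210317.100528\rclone.log

anything the copy would overwrite gets moved into the dated --backup-dir instead of being lost, which is what makes it forever-forward incremental.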
