Zero bytes files in VFS cache won't play (B2 backend)

What is the problem you are having with rclone?

I'm using fix-vfs-empty-dirs branch. Been working perfectly. A few days ago I added some 5-6GB files to the mount. I tried to play these files today, but the media player immediately errored out (a generic cannot play file error).

When I inspected Rclone's cache directory, I noticed that these files were no longer cached, and they were completely uploaded to B2. It was a little weird because other files (older and newer than these) were still cached. I thought "okay, it's a bug where rclone can't play files that were evicted from cache".

Then I looked closer, and noticed that rclone actually can read other files that are evicted. However, all of them take at least a few kb of space in the cache (or they aren't in the cache at all). On the other hand, the broken files show up as 0 bytes on disk in the cache dir. Not 4kb, not 1mb, just 0. Could this be the issue?

Extra detail: when I first ran into this, it seemed none of the files were working. I restarted the rclone mount. After that, only 0 bytes files weren't working, but others work. I can't tell if this was a one-time anomaly. Prior to adding these 5gb/6gb new files, everything always worked without issues.

Extra detail 2: If I just mount rclone on my Mac directly to B2 (no vfs), I can play these files just fine in media players. Only linux box with full-cache VFS has this issue.

Run the command 'rclone version' and share the full output of the command.

rclone v1.61.0-beta.6591.3f77e5648.fix-vfs-empty-dirs
- os/version: ubuntu 22.04 (64 bit)
- os/kernel: 5.15.0-58-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.19.3
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

B2

The command you were trying to run (eg rclone copy /tmp remote:tmp)

/usr/bin/rclone mount B2:BucketName/Dir /home/user/MountedDir \
  --allow-other \
  --b2-chunk-size 50M \
  --b2-hard-delete \
  --buffer-size 64M \
  --bwlimit "07:00,1M:off 23:45,off" \
  --cache-dir /home/user/RCloneCache \
  --config /home/user/.config/rclone/rclone.conf \
  --dir-cache-time 87600h \
  --disable-http2 \
  --fast-list \
  --log-level DEBUG \
  --poll-interval 0 \
  --transfers 16 \
  --use-mmap \
  --vfs-cache-max-age 8760h \
  --vfs-cache-max-size 1400G \
  --vfs-cache-mode full \
  --vfs-write-back 15m \
  --vfs-read-ahead 200M \
  --vfs-read-chunk-size-limit 500M \
  --rc \
  --rc-no-auth

The rclone config contents with secrets removed.

[B2]
type = b2
account = [censored]
key = [censored]
endpoint =

A log from the command with the -vv flag

I am running the mount with debug log as a systemd unit, so there are thousands of lines in systemd journal. Hard to choose unless I know what to look for.

As a small experiment I tried to remove one of those zero bytes files from rclone's cache. After I did that, I was successfully able to play this file from the mount. I checked cache again, and the file now weighs a few hundred MBs (since it correctly grabbed the chunk from B2), and can continue working. Will remove the rest of these 0 byte stubs to make it work again, but wonder what caused these 0 byte files in the first place.

Before you remove them, can you find the vfsMeta file for the file and post it? It's a JSON file with info about the file.

I already removed them, but their meta is still there. They all look like this:

{
	"ModTime": "2023-01-16T17:21:02.018545388-05:00",
	"ATime": "2023-01-16T17:21:02.023691214-05:00",
	"Size": 0,
	"Rs": null,
	"Fingerprint": "",
	"Dirty": false
}

I haven't touched most of these files since removing stubs, so I don't think rclone touched their meta.

Interesting. That is a valid metadata for a zero length file.

The empty fingerprint is strange though - I don't think the fingerprint should ever be empty.

Do any other meta files have empty fingerprints?

Rclone should likely be rejecting metadata with an empty fingerprint when loading it off disk.

However we shouldn't be saving metadata like that in the first place - do you think you can make a repro where rclone saves metadata like that?

Ran this command:

grep -rnl 'RCloneCache/vfsMeta/' -e "\"Fingerprint\": \"\","

Found the same files that I had the issue with, and one old jpg. Nothing else. This is out of 3755 meta files, counted as follows:

find RCloneCache/vfsMeta -type f | wc -l
3755

I will try soon. It seems some anomaly occurred, so my hopes aren't high, but will give it a shot.

If you can that would be much appreciated :slight_smile:

Looks like we're in luck.

Repro summary

  • Mounted B2 with full vfs
  • Copied file.zip into it
  • Stopped the mount before write back had a chance to begin (default 5s)
  • The vfsMeta file has no fingerprint!

Note: my write-back when I encountered the issue was 15 minutes.


  • Started the mount again
  • Rclone set the fingerprint and uploaded the file (seems okay)
  • ls on the mount fails with "Input/output error"!

Note: It seems rclone doesn't recover gracefully when you stopped the mount with files pending write-back.


  • Restart the mount
  • ls works now

Commands

Assuming these exist:

  • B2 remote
  • Bucket/MyDir in B2 remote
  • /home/user
  • /home/user/.config/rclone/rclone.conf
  • /home/user/file30mb.zip

In first terminal:

cd /home/user
mkdir foo
mkdir foocache
rclone mount B2:Bucket/MyDir /home/user/foo --allow-other --b2-chunk-size 50M --b2-hard-delete --buffer-size 64M --cache-dir /home/user/foocache --config /home/user/.config/rclone/rclone.conf --dir-cache-time 87600h --disable-http2 --fast-list --log-level DEBUG --poll-interval 0 --transfers 16 --use-mmap --vfs-cache-max-age 8760h --vfs-cache-max-size 1400G --vfs-cache-mode full --vfs-read-ahead 200M --vfs-read-chunk-size-limit 500M

In second terminal:

cp file30mb.zip foo/

In first terminal QUICKLY:

Ctrl+C

Inspect /home/user/foocache for missing fingerprints.
Then mount again (same rclone mount command).

Try to list directory:

ls foo/
ls: reading directory 'foo/': Input/output error

The rclone log will seem to have no issues when reading:

2023/01/21 14:57:54 DEBUG : /: Attr:
2023/01/21 14:57:54 DEBUG : /: >Attr: attr=valid=1s ino=0 size=0 mode=drwxrwxr-x, err=<nil>
2023/01/21 14:57:54 DEBUG : /: ReadDirAll:
2023/01/21 14:57:54 DEBUG : /: >ReadDirAll: item=12, err=<nil>

Remount rclone again, ls works now.

Let me know if any questions. :slight_smile:

@ncw Posting a reminder just in case this slipped through the cracks. (Not trying to rush you at all). If you don’t want me to remind occasionally, let me know.

Sorry been busy with other stuff!

I can only replicate this problem with the fix-vfs-empty-dirs - it works fine on master. So it must be a bug in that patch :frowning:

You are right that the IO error isn't returned from rclone.

2023/01/31 17:58:49 DEBUG : /: Attr: 
2023/01/31 17:58:49 DEBUG : /: >Attr: attr=valid=1s ino=0 size=0 mode=drwxrwxr-x, err=<nil>
2023/01/31 17:58:49 DEBUG : /: ReadDirAll: 
2023/01/31 17:58:49 DEBUG : /: >ReadDirAll: item=3, err=<nil>

I wonder if this is because rclone created the root directory behind the kernel's back.

Let's test that - I'll try the experiment again with the MyDir already existing - no difference.

OK Let's try the experiment with rclone cmount which uses libfuse directly - no difference and also no errors in the log.

I grepped the rclone code and there is no way it can be returning EIO (Input/output error) without writing a log, so I think this must be coming from the kernel.

I think what is happening is that rclone is adding the root directory to the root directory which is confusing the kernel into giving EIO.

Try this

v1.62.0-beta.6712.64a9c7b5c.fix-vfs-empty-dirs on branch fix-vfs-empty-dirs (uploaded in 15-30 mins)

1 Like

@ncw Appreciate that you rebased the branch. :hugs: I'm living on it, so you just gave me an update.

Tried this build, and it no longer errors-out on ls after remount. :+1: (Albeit the "no fingerprint" vfsCache record is still produced.)

One question β€” does write-back make rclone wait before deleting a file from a remote?

  1. I waited for foo/file30mb.zip to be uploaded to B2
  2. I ran rm foo/file30mb.zip (saw in log that it was removed from cache)
  3. I restarted the mount
  4. The file was there again
  5. I rm'ed it again and waited longer
  6. I restarted the mount and it was gone now

Seems like if I didn't rm it the second time and waited longer, it would've never gotten removed? :thinking:

What should happen is that if a file is being uploaded, the upload is cancelled and deleted locally, otherwise it is deleted remotely.

I'd like to see logs of this... If the file was removed from the cache then it shouldn't capable of being uploaded and rclone should have removed it from the remote.

If the file was removed from the cache then it shouldn't capable of being uploaded

I waited purposefully for it to be uploaded first. So it was entirely on B2 when I did the rm. I suspect I possibly interrupted rclone before it got a chance to run rm on the remote. (It only did rm in cache).

It should rm both the cache and the remote file - if you can see it not doing that then a log of it will help me track down the problem.

Hm, I can't reproduce the rm issue anymore. :man_shrugging:

If you do manage to in the future, let me know!