B2 mount becomes unreadable

What is the problem you are having with rclone?

Was watching a long video directly from a mounted B2 remote. After 30 minutes the video froze and nothing responded. On my Ubuntu box the rclone log shows normal activity: VFS doing uploads, nothing wrong. Navigating to the video file via ls works, but trying to read from the beginning of the file with head -n1 filename hangs forever.

I restarted the mounts and it started working again. While trying to figure out what might have caused this, I noticed that rclone was using ~7GB of RAM. That seemed like more than I expected with my mount config, so I reduced --transfers from 16 to 10 and --buffer-size from 64M to 32M, then reloaded the mount.

I continued watching the video, this time also watching the RAM. RAM rose quickly to 6.2GB, but then stayed there and the video kept working fine. Another 1.5 hours later, the same thing happened: the video froze again.

Worth noting is that rclone has been uploading stuff to B2 this entire time, while I've been watching.

Run the command 'rclone version' and share the full output of the command.

rclone v1.64.0
- os/version: ubuntu 22.04 (64 bit)
- os/kernel: 5.15.0-84-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.21.1
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

B2

The command you were trying to run (eg rclone copy /tmp remote:tmp)

Please let me know if I'm doing something unadvisable in these arguments.
Note: the time of the bug was outside of the 1M --bwlimit window.

/usr/bin/rclone mount B2:MyBucket/dir /path/to/mount \
  --allow-other \
  --b2-chunk-size 50M \
  --b2-hard-delete \
  --buffer-size 32M \
  --bwlimit "07:00,1M:off 23:45,off" \
  --cache-dir /home/user/RCloneCache \
  --config /home/user/.config/rclone/rclone.conf \
  --dir-cache-time 87600h \
  --disable-http2 \
  --fast-list \
  --log-level INFO \
  --poll-interval 0 \
  --transfers 10 \
  --use-mmap \
  --vfs-cache-max-age 8760h \
  --vfs-cache-max-size 1400G \
  --vfs-cache-mode full \
  --vfs-write-back 15m \
  --vfs-read-ahead 200M \
  --vfs-read-chunk-size-limit 500M \
  --rc --rc-no-auth

The rclone config contents with secrets removed.

[B2]
type = b2
account = *removed*
key = *removed*

A log from the command with the -vv flag

I was running with log level INFO but have now switched to DEBUG. I'm leaving this report here anyway, because I see there was a lot of B2 refactoring recently, along with mentions of hanging, in case this is already a known issue. I will update with a DEBUG log if I catch it again.

Update: The last 20 mins of video finished without an issue, so will have to try reproducing this log at another time.

It's probably the uploads that ran you out of memory; that would be my guess.

You've set that smaller than the default which is probably a good idea.

However, you can have --transfers * --b2-upload-concurrency of these chunks in memory at once.

The default for --b2-upload-concurrency is unexpectedly high at 16, so you could be using 10 * 16 * 50M = 8GB of RAM!
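As a quick sanity check of that arithmetic (my own back-of-envelope calculation, not rclone output), the worst-case upload buffer usage works out like this:

```shell
#!/bin/sh
# Worst-case RAM held by in-flight upload chunks (illustrative arithmetic only):
#   transfers * upload_concurrency * chunk_size
transfers=10      # --transfers
concurrency=16    # --b2-upload-concurrency (v1.64 default)
chunk_mb=50       # --b2-chunk-size, in MB
total_mb=$((transfers * concurrency * chunk_mb))
echo "worst case: ${total_mb} MB"   # 10 * 16 * 50 = 8000 MB, about 8 GB
```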

Note that in 1.63 you could only have --transfers * --b2-chunk-size in progress at once, so that has definitely changed.

I'd probably change --b2-upload-concurrency to be 4 which will still fill your upload pipe I expect.

I think after the changes in v1.64 I need to reconsider the default value of --b2-upload-concurrency.

B2 recommends a chunk size of 100M, which is what we use by default, but these chunks need to be stored in memory.

Thanks! I have 16GB of RAM here, so will try --b2-upload-concurrency 6, since I don't mind 3GB used for this.

--b2-chunk-size: I prefer it small-ish because my upload speed is only ~4-5 MB/s (~40 Mbit/s). If I need to restart the mounts for some reason, I don't have to wait too long for them to exit gracefully (they finish uploading the current chunk and quit).

Will report back once I learn more.

Thanks - I look forward to your report!

I should probably do a systematic study of the best values for --b2-upload-concurrency and --b2-chunk-size, since the internals of rclone have changed significantly since I last looked at this. And by default, using 4 * 100M * 16 = 6.4GB is way too much memory.

I'll run a little test.


Small issue just now: after I made these adjustments, I am getting slow downloads from B2. Another video is playing and keeps stopping, then continues after a bit, every minute or so. Network activity on the Ubuntu box shows ~1-3 MB/s from B2. The video is about 30GB. My internet has 1 Gbit/s download (~120 MB/s). I can't tell yet whether it's B2's fault or rclone's, but I don't remember B2 being this slow before.

  • The logs (DEBUG this time) show vfs cache constantly reading various ranges at various positions.
  • The RAM is staying at 4.9G.
  • No uploads seem to be in progress currently.

Made the following adjustments:

  • --b2-upload-concurrency from 6 to 4
  • --buffer-size from 32M to 50M
  • --transfers from 10 to 20
  • added new option: --multi-thread-streams set to 40

Now I'm getting a fairly consistent ~25 MB/s download, but let's see how RAM goes.

(I don't care about upload speeds as much as downloads).

Update: Interesting. systemctl status for this mount shows 8G memory usage, but top shows very little (almost negligible) usage for rclone or anything else (sorted by RAM). I wonder if the way systemd counts memory is somehow cumulative, while mmap actually releases memory quickly.

Here are the results of a test I did. The results are a bit erratic, but they seem to show that a chunk size of 25M is enough to get most of the speed improvement and that a concurrency of 6 or 8 is good.

1000M multipart --chunk-size 5M   --b2-upload-concurrency 2    6.748 MiB/s
1000M multipart --chunk-size 10M  --b2-upload-concurrency 2    5.297 MiB/s
1000M multipart --chunk-size 25M  --b2-upload-concurrency 2    5.143 MiB/s
1000M multipart --chunk-size 50M  --b2-upload-concurrency 2   11.143 MiB/s
1000M multipart --chunk-size 100M --b2-upload-concurrency 2   12.753 MiB/s
1000M multipart --chunk-size 5M   --b2-upload-concurrency 3   13.064 MiB/s
1000M multipart --chunk-size 10M  --b2-upload-concurrency 3    4.849 MiB/s
1000M multipart --chunk-size 25M  --b2-upload-concurrency 3   29.885 MiB/s
1000M multipart --chunk-size 50M  --b2-upload-concurrency 3   13.473 MiB/s
1000M multipart --chunk-size 100M --b2-upload-concurrency 3   33.488 MiB/s
1000M multipart --chunk-size 5M   --b2-upload-concurrency 4    9.759 MiB/s
1000M multipart --chunk-size 10M  --b2-upload-concurrency 4    4.917 MiB/s
1000M multipart --chunk-size 25M  --b2-upload-concurrency 4   14.423 MiB/s
1000M multipart --chunk-size 50M  --b2-upload-concurrency 4   11.163 MiB/s
1000M multipart --chunk-size 100M --b2-upload-concurrency 4   25.351 MiB/s
1000M multipart --chunk-size 5M   --b2-upload-concurrency 6    8.326 MiB/s
1000M multipart --chunk-size 10M  --b2-upload-concurrency 6   12.887 MiB/s
1000M multipart --chunk-size 25M  --b2-upload-concurrency 6   62.486 MiB/s
1000M multipart --chunk-size 50M  --b2-upload-concurrency 6   33.309 MiB/s
1000M multipart --chunk-size 100M --b2-upload-concurrency 6   66.665 MiB/s
1000M multipart --chunk-size 5M   --b2-upload-concurrency 8    8.230 MiB/s
1000M multipart --chunk-size 10M  --b2-upload-concurrency 8   19.279 MiB/s
1000M multipart --chunk-size 25M  --b2-upload-concurrency 8   71.427 MiB/s
1000M multipart --chunk-size 50M  --b2-upload-concurrency 8   66.660 MiB/s
1000M multipart --chunk-size 100M --b2-upload-concurrency 8   36.113 MiB/s

Unfortunately the test didn't complete for reasons I'm investigating!
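For anyone wanting to repeat a sweep like the one above, here is a minimal sketch of the benchmark matrix. It only prints the rclone copy invocations it would run; the test file, remote, and bucket names are placeholders, and timing/measurement is left out.

```shell
#!/bin/sh
# Print one `rclone copy` command per chunk-size/concurrency combination
# from the results table. Placeholder source file and remote path; wrap
# each command with your own timing to reproduce the measurements.
for chunk in 5M 10M 25M 50M 100M; do
  for conc in 2 3 4 6 8; do
    echo "rclone copy 1000M-testfile B2:bench-bucket/dst" \
         "--b2-chunk-size $chunk --b2-upload-concurrency $conc"
  done
done
```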

There is an extremely hand-wavy explanation here

Personally I'd trust the value in top, but I agree with the article: memory in Unix is complicated!

Familiar with this. I wish there were a pragmatic measure of "non-reallocatable memory that one should worry about".

Were you testing via the mount/VFS, or with individual commands? Curious whether the reasons it didn't complete are related to the original post. I no longer think RAM was the issue, because top never showed rclone using more than 10% of RAM even at the peak systemctl memory stat, and I've since seen a higher systemctl stat without rclone breaking. Also, after reducing --transfers to 10, it failed the same way anyway, even though it would have had enough memory.

I was testing with rclone copy. What mount with --vfs-cache-mode writes or full does to upload a file is essentially identical.

The test didn't complete because of a deadlock I found in the b2 code :frowning:

Here is a fix for that.

v1.65.0-beta.7390.2a30c417a.fix-b2-upload-url on branch fix-b2-upload-url (uploaded in 15-30 mins)


I think you have a typo in your smiley face. It should be :smiley: cause yay, you identified a potential deadlock!

The links are no longer right, I believe these are correct:

v1.65.0-beta.7391.55c3c221b.fix-b2-upload-url-lock on branch fix-b2-upload-url-lock.

I installed this version, let's see what happens.

P.S. Wrote a small bash script to easily install on ubuntu. Usage:

./install-rclone.sh v1.65.0-beta.7390.2a30c417a.fix-b2-upload-url
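For reference, a minimal sketch of what such a script might look like. The original script wasn't posted, so this is my own reconstruction: it derives the branch name from the version string and prints the download URL it would fetch. The beta.rclone.org /branch/ URL layout here is an assumption; double-check it before use.

```shell
#!/bin/sh
# Hypothetical install-helper sketch (not the poster's actual script).
# Branch betas appear to be published under beta.rclone.org/branch/<name>/;
# that URL pattern is an assumption and may need adjusting.
version="${1:-v1.65.0-beta.7390.2a30c417a.fix-b2-upload-url}"
branch=$(echo "$version" | cut -d. -f6-)   # text after the commit hash
url="https://beta.rclone.org/branch/${branch}/${version}/rclone-${version}-linux-amd64.zip"
echo "would download: $url"
# A real script would then fetch the zip, unzip it, and copy the
# rclone binary into /usr/bin with the right permissions.
```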

How did it go?

Nice! I could get rclone selfupdate to do this. At the moment it can't install a branch but no reason why it couldn't.

Haven't had a chance to test-drive it properly yet. The mounted full VFS has had no issues so far. I saw a few B2 upload retries due to "500 internal error" responses, but that seems like intermittent B2 trouble. Might check a long video again tonight; will report back.

Sounds convenient, and it seems like you make the branch name part of the version string, so could it work with the existing --version flag?


So far I have not encountered issues, except this one (the entire 1.4T cache got invalidated), which came seemingly as a result of the config changes I made here. Figured it deserves its own thread.

Hi,

Thanks for the fix. I can confirm it works. See my issue here: Rclone copy to B2 stuck for large file - #2 by Animosity022

Please merge :wink:

Best Regards,
Jorgensen

I've merged all the B2 fixes to master now, which means it will be in the latest beta in 15-30 minutes and released in v1.64.1.

I'm still on the fix-b2-upload-url-lock branch.

I just had my rclone mount crash with

  • --b2-upload-concurrency 4
  • --buffer-size 50M
  • --b2-chunk-size 50M
  • --transfers 20
  • --multi-thread-streams 40

I had noticed 91% RAM usage during the uploads beforehand. After recovering from the crash, the mount went back to uploading and crashed again, consuming all available RAM (16GB).

Is it because I added multiple big files to the mount at the same time? Is it also related to --multi-thread-streams?

Is there a way I can make uploads use a low concurrency, but downloads use high concurrency, without sacrificing too much ram?

And is there a way I can adjust config without invalidating my cache (keeping the canonical name the same)?

On re-reading the docs, it looks like --multi-thread-streams replaces --b2-upload-concurrency, making uploads use 40 streams * 20 transfers * 50M = 40G of RAM :scream:.

All my questions still stand. I need high concurrency downloads, but not uploads. :thinking:

Update: After some research I will try the following change to the mount:

--b2-upload-concurrency 20
--buffer-size 50M
--b2-chunk-size 50M
--transfers 2
--multi-thread-streams 20

What I expect this to do:

When streaming a single video, it will probably use 20 streams * 1 transfer * 50M = ~1G of RAM.

When uploading multiple files, it will max out at 20 streams * 2 transfers * 50M = ~2G of RAM.

Let's see if I'm right.
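The two expectations above, written out as arithmetic (my own back-of-envelope numbers, mirroring the estimates in this post):

```shell
#!/bin/sh
# Expected upload memory with the new flags (illustrative arithmetic only).
streams=20     # --b2-upload-concurrency (and --multi-thread-streams)
chunk_mb=50    # --b2-chunk-size, in MB
single=$((streams * 1 * chunk_mb))   # one file streaming: 1 transfer
multi=$((streams * 2 * chunk_mb))    # uploads maxed out: 2 --transfers
echo "single transfer: ${single} MB, two transfers: ${multi} MB"
# single transfer: 1000 MB, two transfers: 2000 MB
```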

Update 2: And yes, unfortunately this invalidated the cache again. :frowning: I stopped the mount, manually moved the caches to the newly created dirs, and started it again. It seems to be working.

Update 3: Looks like my calculations were correct. Indeed, this uses 13% of RAM (~2GB) for uploading. The key thing to understand is that having many --transfers is not that important, because that only controls how many files are transferred in parallel. On top of transfers, big files are also split into chunks and transferred concurrently thanks to --multi-thread-streams / --[backend]-upload-concurrency.

rclone doesn't use those buffers for downloads. It will use --buffer-size to buffer in RAM; the rest will be on disk.

That sounds correct to me.

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.