Rclone mount crashing during parallel directory listing operations with `Assertion failed: (tmpin > *in), function _citrus_iconv_std_iconv_convert`

UPDATE (see below for more details)

The root cause of this seems to be related to parallel directory listing/file stat operations. Reproducing it requires parallelism. See my update below.

What is the problem you are having with rclone?

rclone mount crashes non-deterministically during file listing operations on my putio backend. I can semi-consistently trigger the crash by running fd . some_folder, where some_folder is a folder on putio with many files/folders in it.

Primary error message: Assertion failed: (tmpin > *in), function _citrus_iconv_std_iconv_convert, file citrus_iconv_std.c, line 1059. I believe that specific code can be found online here.

Run the command 'rclone version' and share the full output of the command.

rclone v1.65.0
- os/version: darwin 14.1 (64 bit)
- os/kernel: 23.1.0 (arm64)
- os/type: darwin
- os/arch: arm64 (ARMv8 compatible)
- go/version: go1.21.4
- go/linking: dynamic
- go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

putio

The command you were trying to run (eg rclone copy /tmp remote:tmp)

my putio backend is being mounted with rclone with this command:

rclone mount -v --rc --rc-no-auth --rc-addr=localhost:5573 --vfs-read-ahead=2M --vfs-cache-mode full --vfs-read-chunk-size=20M --vfs-read-chunk-size-limit=1G --vfs-cache-max-age=4800h --vfs-cache-max-size=130G --volname putio_mount_direct --fast-list --dir-cache-time 36h --checkers 10 --timeout 10s --contimeout 30s putio:/ /Users/chris/putio_mount_direct

The rclone config contents with secrets removed.

My putio configuration from my rclone.conf

[putio]
type = putio
token = {"access_token":"xxxx","expiry":"0001-01-01T00:00:00Z"}

A log from the command with the -vv flag

The key error is:

Assertion failed: (tmpin > *in), function _citrus_iconv_std_iconv_convert, file citrus_iconv_std.c, line 1059.

Any ideas at all about what could be causing this, or ideas for workarounds?

Edit: Self-investigation report

Looking at this more, the assertion seems to be in the macOS implementation of iconv. Specifically this line of code: https://github.com/apple-open-source/macos/blob/414fd262e186f544ada6544ce90c0d265ec70834/libiconv/libiconv_modules/iconv_std/citrus_iconv_std.c#L1059

It seems like running repeat 100 fd . > /dev/null in an rclone-mounted directory with sub-directories will consistently trigger the crash. However, if I disable parallelism by running repeat 100 fd --threads 1 . > /dev/null, the crash never occurs. Similarly, running find . or tree doesn't trigger a crash, presumably because they're single threaded. Introducing parallelism with repeat 100 { find . & } does indeed cause the crash, so it's not something specific to fd.
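For anyone who wants to try the parallel reproduction without fd or zsh's repeat, here is a minimal sketch in plain sh. The function name stress_listing and its default round/job counts are made up for illustration, and the final ls check assumes that a crashed mount leaves the mount point unlistable, which may not always hold.

```shell
# Hypothetical helper: hammer a directory tree with concurrent listings.
# usage: stress_listing DIR [ROUNDS] [JOBS]
stress_listing() {
    dir=$1
    rounds=${2:-100}
    jobs=${3:-8}
    i=0
    while [ "$i" -lt "$rounds" ]; do
        j=0
        while [ "$j" -lt "$jobs" ]; do
            # Each find walks the tree in its own process, so the mount
            # receives overlapping readdir/stat requests.
            find "$dir" > /dev/null 2>&1 &
            j=$((j + 1))
        done
        wait    # let this round finish before starting the next one
        i=$((i + 1))
    done
    # Assumption: if the mount process died, listing the mount point fails.
    ls "$dir" > /dev/null 2>&1
}
```

Running stress_listing /Users/chris/putio_mount_direct/some_folder 100 8 should exercise roughly the same pattern as repeat 100 { find . & }, with a non-zero exit status hinting that the mount went away.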

So I think the root cause is something related to parallelism around directory listing. That makes sense because I initially encountered this crash a few times when I wasn't running fd, but I must have been doing some sort of other parallel directory listing/stat operations.

Behind the scenes rclone adds this option to your fuse mount

-o modules=iconv,from_code=UTF-8,to_code=UTF-8-MAC

And it looks very much like the iconv code isn't thread safe or it isn't being called in a thread safe manner.

To test this you can try adding

-o modules=iconv,from_code=UTF-8,to_code=UTF-8

Which effectively disables the iconv code and will work fine provided you don't have accented characters in your file names.

I had a brief look through the iconv code and it doesn't look obviously non thread safe, so I wonder if there is a problem with the calling code - namely libfuse (provided by macfuse) or possibly cgofuse.

I'm reasonably sure that this isn't a problem in the rclone code though!

Did you try upgrading macfuse?

If you wanted to dig in to this then trying to replicate it with one of macfuse example file systems (eg sshfs) would be what I'd do next. If you can replicate then you can report a bug.


Thank you so much for the detailed reply! Your analysis makes sense to me. I'm a heavy daily rclone user, so it's odd that I never ran into this before, but I recently switched to an M3 Apple Silicon machine and upgraded from a very old version of macFUSE, so likely one of those changes accounts for this new bug popping up. Or possibly it has always existed and the increased core count and higher speed make it occur much more often.

I did try using -o modules=iconv,from_code=UTF-8,to_code=UTF-8 like you recommend, but the crash still occurs the same way so it doesn't seem to be disabling iconv for some reason. I confirmed with the -vv logs that this option is indeed being passed:

DEBUG : Putio root 'some_folder': Mounting with options: ["-o" "attr_timeout=1" "-o" "fsname=putio:some_folder" ... "-o" "modules=iconv,from_code=UTF-8,to_code=UTF-8"]
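A small sketch for checking which iconv options actually reach the mount, in case more than one modules=iconv entry ends up in the list. It assumes the -vv output was saved to a file and that the debug line keeps the quoted-option format shown above; the function name iconv_opts_from_log is made up for illustration.

```shell
# Hypothetical helper: read rclone -vv output on stdin and print every
# modules=iconv option found on "Mounting with options" lines, one per line.
iconv_opts_from_log() {
    grep 'Mounting with options' \
        | grep -o '"modules=iconv[^"]*"' \
        | tr -d '"'
}
```

With the log saved as, say, rclone.log, running iconv_opts_from_log < rclone.log lists each entry on its own line, so a second entry with to_code=UTF-8-MAC would be easy to spot.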

I seem to already be running the most recent 4.5.0 version of macFUSE:

$ plutil -p /Library/Frameworks/macFUSE.framework/**/Info.plist | grep "CFBundleVersion"
  "CFBundleVersion" => "4.5.0"

I'll look into replicating this bug with sshfs like you recommended and will report it to macFUSE if I can repro. Thanks again for your help! Narrowing down the cause makes this easy to avoid. In the meantime I'll just set up an alias to default to using fd --threads=1, and I suspect that will avoid most of the crash-inducing parallelism.
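One way to set up that default, sketched as a shell function rather than an alias so it also applies in scripts that source the same file. This is only an illustration of the workaround and assumes fd is on the PATH.

```shell
# Wrapper that forces fd to walk directories with a single thread.
# "command" bypasses this function and runs the real fd binary.
fd() {
    command fd --threads=1 "$@"
}
```

Putting this in ~/.zshrc or ~/.bashrc makes every interactive fd call single threaded; a --threads value given explicitly after it is expected, though not verified here, to take precedence.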

Cheers!

There is probably a better way of doing that, but it might involve commenting it out in the source code...

You could try just -o modules=iconv or comment out these lines in the code


I've confirmed that commenting this out has prevented the race condition crash from happening! Thanks again. (Though odd that that !findOptions check isn't behaving like it seems it should)

EDIT: Actually it looks like my self-compiled rclone binary (made with just go build) isn't using cmount/FUSE. I made the modifications on v1.65.0, and the mount command there appears to seamlessly fall back to using a local NFS network mount instead of FUSE. That explains why I don't see the debug messages I added in mountOptions: I assume that code path isn't even being run.

I'll try figuring out how to compile rclone in a way that it actually uses FUSE.

EDIT2: Got rclone to use FUSE (go build -tags cmount) and I can confirm that commenting out that code fixes the issue! And now I actually see my debug messages. Though this journey has made me quite optimistic about the future where Apple finally kills kernel extensions and rclone has to rely on local NFS mounts instead. They seem to work quite well.
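To avoid the same confusion, a quick check of whether a self-built binary includes the cmount tag can be sketched like this. It relies on the go/tags line that rclone version prints (as in the version output earlier in this thread); the function name has_cmount_tag is made up for illustration.

```shell
# Hypothetical helper: read `rclone version` output on stdin and succeed
# only if the go/tags line mentions cmount.
has_cmount_tag() {
    grep -q '^- go/tags:.*cmount'
}
```

For example, after go build -tags cmount in the rclone source directory, running ./rclone version | has_cmount_tag && echo "built with cmount" should print the message, while a plain go build should not.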

Great.

There isn't a way of saying no iconv please at the moment and there probably should be!

That is good to know :)
