Possible special character encoding issue on macOS

This is something I've spent quite a lot of time on in the past!

The problem is that macOS stores its file names in Unicode NFD (decomposed) form rather than NFC (composed) form, which is what everyone else uses.

This is the difference between the two forms of the Tést.txt file.

All the cloud providers (and in fact everyone else in the entire universe) use NFC format. In NFC the é is a single code point which is \xc3\xa9 in UTF-8, whereas in NFD it is a plain e followed by a combining acute accent, which is e\xcc\x81. rclone copy goes to some effort to match the two types of normalization up. rclone mount doesn't though.
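If you want to see the two forms for yourself, here is a small Python sketch (just an illustration using the standard unicodedata module, nothing to do with rclone's own code). The name is written with escapes so there is no doubt about which form goes in.

```python
import unicodedata

# The same visible name, written with an escape so the input form is unambiguous.
name = "T\u00e9st.txt"

nfc = unicodedata.normalize("NFC", name)  # é as one code point, U+00E9
nfd = unicodedata.normalize("NFD", name)  # e followed by combining acute, U+0301

print(nfc.encode("utf-8"))  # b'T\xc3\xa9st.txt'
print(nfd.encode("utf-8"))  # b'Te\xcc\x81st.txt'
print(nfc == nfd)           # False: compared as-is, they are different names
```

Both strings look identical on screen, which is exactly why this is so confusing.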

What -o modules=iconv,from_code=UTF-8,to_code=UTF-8 does is tell fuse not to touch the UTF-8 format rclone uses.

The default here is -o modules=iconv,from_code=UTF-8,to_code=UTF-8-MAC which tells fuse to convert the UTF-8 rclone uses into NFD UTF-8 which macOS likes.

This used to work fine! However I believe that either newer versions of macOS don't actually need the NFD form any more, or something has changed in macFUSE.

Note that in your example above, ls gave the file name as Tést.txt, which is 54 65 cc 81 73 74 2e 74 78 74 (NFD), but you typed Tést.txt, which is 54 c3 a9 73 74 2e 74 78 74 (NFC). I think if you'd cut and pasted exactly what you got from ls it would have worked.

That is macOS doing the changing, not rclone.
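A quick way to check which form a name is really in (for example one pasted from ls versus one you typed) is to dump its bytes. This is a rough Python sketch; describe is a helper name I made up for illustration, and unicodedata.is_normalized needs Python 3.8 or later.

```python
import unicodedata

def describe(name: str) -> str:
    """Return the UTF-8 bytes of a name in hex, plus the normal form(s) it is in."""
    forms = [f for f in ("NFC", "NFD") if unicodedata.is_normalized(f, name)]
    hex_bytes = name.encode("utf-8").hex(" ")
    return f"{hex_bytes} ({', '.join(forms) or 'neither'})"

print(describe("Te\u0301st.txt"))  # 54 65 cc 81 73 74 2e 74 78 74 (NFD)
print(describe("T\u00e9st.txt"))   # 54 c3 a9 73 74 2e 74 78 74 (NFC)
```

A plain ASCII name reports both forms, since there is nothing in it to compose or decompose.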

Maybe rclone should be doing the NFD->NFC conversion itself in rclone mount on macOS so you can use either normalisation.
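To give an idea of what I mean, here is a minimal Python sketch of a lookup that accepts either normalisation. This is only an illustration of the idea and not how rclone does or would do it; find_entry and remote_listing are hypothetical names, and it assumes the remote listing is small enough to scan.

```python
import unicodedata

def find_entry(requested, listing):
    """Return the listing entry that matches `requested` once both are in NFC, else None."""
    wanted = unicodedata.normalize("NFC", requested)
    for entry in listing:
        if unicodedata.normalize("NFC", entry) == wanted:
            return entry
    return None

# Hypothetical remote listing holding the NFC name, as cloud providers store it.
remote_listing = ["T\u00e9st.txt"]

# macOS hands over the NFD spelling; the lookup still finds the NFC entry.
print(find_entry("Te\u0301st.txt", remote_listing) == remote_listing[0])  # True
print(find_entry("Other.txt", remote_listing))                            # None
```

The point is that the comparison happens on normalised names, while the entry returned keeps whatever form the remote actually stores.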

Anyway this is a can of worms which you can thank Apple for!
