Help with local encoding and foreign characters - illegal byte sequence

What is the problem you are having with rclone?

Getting "illegal byte sequence" with some Asian characters. Local file system is APFS, remote is Box. I have experimented with --local-encoding and --box-encoding but don't seem to get different results.

Run the command 'rclone version' and share the full output of the command.

rclone v1.65.0

  • os/version: darwin 14.4.1 (64 bit)
  • os/kernel: 23.4.0 (arm64)
  • os/type: darwin
  • os/arch: arm64 (ARMv8 compatible)
  • go/version: go1.21.4
  • go/linking: dynamic
  • go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

Box - but we are copying to local APFS

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync box:"xxx/" /Volumes/18\ TB\ WD\ RAID/Box/"xxx/" --retries 1 --ignore-errors --transfers=16 --config /Users/xxx/.config/rclone/rclone.conf -v -P --local-encoding "Slash,InvalidUtf8,Dot" --inplace -vvv

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[box]
type = box
token = XXX

A log from the command that you were trying to run with the -vv flag

2024/09/16 10:12:25 DEBUG : rclone: Version "v1.65.0" starting with parameters ["rclone" "sync" "box:xxx/" "/Volumes/18 TB WD RAID/xxx/" "--retries" "1" "--ignore-errors" "--transfers=16" "--config" "/Users/xxx/.config/rclone/rclone.conf" "-v" "-P" "--local-encoding" "Slash,InvalidUtf8,Dot" "--inplace" "-vvv"]
2024/09/16 10:12:31 DEBUG : Zack ￰゚ヘチ￰゚レモ￰゚ホハ.vcf: Need to transfer - File not found at Destination
2024/09/16 10:12:31 ERROR : Zack ￰゚ヘチ￰゚レモ￰゚ホハ.vcf: Failed to copy: open /Volumes/18 TB WD RAID/xxx/Zack ￰゚ヘチ￰゚レモ￰゚ホハ.vcf: illegal byte sequence
2024/09/16 10:12:31 ERROR : Attempt 1/1 failed with 1 errors and: open /Volumes/18 TB WD RAID/xxx/Zack ￰゚ヘチ￰゚レモ￰゚ホハ.vcf: illegal byte sequence

i also quickly attempted to create a new crypt local backend with the same results. i used --crypt-filename-encoding=base32768 which resulted in the same "illegal byte sequence" on the local backend.

hi,

might test without rclone, using the OS directly, something like
touch '/Volumes/18 TB WD RAID/xxx/Zack ￰゚ヘチ￰゚レモ￰゚ホハ.vcf'
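The same OS-level check can be scripted, which makes the exact error code visible. This is only a sketch, assuming Python 3 is available; `try_create` is a hypothetical helper, not part of rclone. "illegal byte sequence" is the message the OS prints for the `EILSEQ` error code.

```python
import errno
import os
import tempfile


def try_create(directory, name):
    """Try to create an empty file called `name` in `directory`.

    Returns None on success, or the errno symbol (e.g. "EILSEQ") on failure.
    """
    path = os.path.join(directory, name)
    try:
        # O_EXCL: fail instead of reusing a file that already exists.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    os.remove(path)  # clean up the probe file
    return None


if __name__ == "__main__":
    # Demo against a temporary directory; point it at the real volume to test that.
    with tempfile.TemporaryDirectory() as tmp:
        print(try_create(tmp, "plain-name.vcf"))
```

To test the real destination, call `try_create` with the directory on the target volume and the problem filename.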

have you tested --local-unicode-normalization ?

that is an old version of rclone, might want to rclone selfupdate

good call with the touch. actually if i do that as you typed it, it works, but if i copy/paste from the output it does not. i'm thinking it may be some odd whitespace character in the filename.

i can update rclone for sure and will now but doubt that addresses this.

also yes - i did try --local-unicode-normalization - it had no effect. when i did debug logging it didn't seem to trigger anything.

a lot has changed around encoding on macOS since your old version. In general there is no point at all in investigating problems with old rclone versions - it is a waste of time.

not sure how to export what the character is, but there seems to be something hidden before the ゚ characters.

sorry - yes i did update to 1.68.0 with the same results

can you post a screenshot snippet of the filename from box website?

i have to work on that - the credentials i have actually don't have access to that path - only our rclone api connection does.

i have tracked down the unicode character though which is throwing the error:

\uFFF0

nice, i was just going to ask if you could figure that out.

found this
U+FFF0 is not a valid unicode character (it is an unassigned code point in the Specials block)

so in theory - shouldn't this flag address this character: --local-encoding "Slash,InvalidUtf8,Dot"? i don't see this local encoding being triggered for this file whatsoever
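One possible explanation, offered as an assumption rather than a confirmed diagnosis: U+FFF0 is unassigned, but it still encodes to well-formed UTF-8 bytes, so a check that only looks for invalid UTF-8 would have nothing to flag. The sketch below uses only the Python standard library; `is_valid_utf8` is a hypothetical helper and says nothing about how rclone implements its encoding.

```python
import unicodedata


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes as well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


ch = "\ufff0"
encoded = ch.encode("utf-8")

print(encoded.hex())             # efbfb0
print(is_valid_utf8(encoded))    # True: the bytes themselves are well-formed
print(unicodedata.category(ch))  # Cn: the code point is unassigned
print(is_valid_utf8(b"\xf0\x9f"))  # False: a truncated sequence is what "invalid UTF-8" looks like
```

If that reading is right, the rejection would be coming from the filesystem disliking the unassigned code point, not from malformed bytes.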

i do not know. maybe @kapitainsky knows about encodings and macOS?

appreciate your help :slight_smile:

Can you tell me where you see this character?

i actually don't see it - but when i copy/paste from the rclone error output and try to delete character by character, i see that there are 2 hidden characters in the string. i ended up going online and pasting it into a unicode encoder and that is what popped up. not sure if there is an easy command i can use in the terminal as well.

Zack ￰゚ヘチ￰゚レモ￰゚ホハ.vcf
\u005a\u0061\u0063\u006b\u0020\ufff0\uff9f\uff8d\uff81\ufff0\uff9f\uff9a\uff93\ufff0\uff9f\uff8e\uff8a\u002e\u0076\u0063\u0066
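For doing the same thing in the terminal, a small script can list every code point in a name, hidden ones included. A minimal sketch, assuming Python 3 is installed; `describe` is a hypothetical helper, and the default name is just the first part of the filename above written with escapes so nothing gets lost in copy/paste.

```python
import sys
import unicodedata


def describe(text):
    """Return one (code point, category, name) tuple per character."""
    rows = []
    for ch in text:
        rows.append((
            f"U+{ord(ch):04X}",
            unicodedata.category(ch),
            unicodedata.name(ch, "<no name>"),  # unassigned code points have no name
        ))
    return rows


if __name__ == "__main__":
    # Pass a filename as the first argument, or fall back to the escaped sample.
    name = sys.argv[1] if len(sys.argv) > 1 else "Zack \ufff0\uff9f\uff8d\uff81.vcf"
    for row in describe(name):
        print(*row)
```

Unassigned code points show up with category `Cn` and no name, which makes them easy to spot in the output.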

Thx. It is a rather weird issue... but for sure we have a problem :)

Thank you - appreciate the help - it's all foreign characters to me! :slight_smile:

let me know if there is anything more helpful i can provide as these characters are hard to copy.
