Some characters seem not to be handled in passwords

What is the problem you are having with rclone?

I have set up a ProtonDrive remote with a password (and a second factor, but I don't think it is relevant here). In the config process, I pasted the password Áf@É´.å4¹Tª6}ï>o"\TL generated by my password manager, which then produces the following error:

$ rclone lsd MyProtonRemote:
2023/11/07 10:25:40.429633 ERROR RESTY 401 GET https://mail.proton.me/api/core/v4/users: Invalid access token (Code=401, Status=401), Attempt 1
2023/11/07 10:25:40.556197 ERROR RESTY 422 POST https://mail.proton.me/api/auth/v4/refresh: Invalid refresh token (Code=10013, Status=422), Attempt 1
2023/11/07 10:25:42.134211 ERROR RESTY 422 POST https://mail.proton.me/api/auth/v4: The password is not correct. Please try again with a different password. (Code=8002, Status=422), Attempt 1:
Failed to create file system for "MyProtonRemote:": couldn't initialize a new proton drive instance: 422 POST https://mail.proton.me/api/auth/v4: The password is not correct. Please try again with a different password. (Code=8002, Status=422)

When editing the remote and updating the password with something containing only letters and digits (and updating the password in my Proton account, of course), the connection is accepted without error.

Run the command 'rclone version' and share the full output of the command.

$ rclone version                  
rclone v1.64.0
- os/version: debian kali-rolling (64 bit)
- os/kernel: 6.5.0-kali3-amd64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.21.1
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

ProtonDrive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

$ rclone lsd MyProtonRemote:

The rclone config contents with secrets removed.

Please note I have willingly disclosed the password hash below, so that someone might compare the theoretical hash of the password Áf@É´.å4¹Tª6}ï>o"\TL with the actual one.

[MyProtonRemote]
type = protondrive
username = XXX
password = 7bh4ow42gEFwKdXlL3kqR9AkicZ9jN5tqzi75HjlcicVRcYJpI70PnSe
2fa = XXXXXX
client_uid = 
client_access_token = 
client_refresh_token = 
client_salted_key_pass =

A log from the command with the -vv flag

Please note that following my tests, Rclone had cached credentials from a password containing only bare Latin letters and digits, which was working. So when reverting to the faulty password to produce the log below, it seems Rclone first tried the cached tokens/credentials, which has produced additional DEBUG logs that, I believe, may be ignored.

$ rclone lsd MyProtonRemote: -vv           
<7>DEBUG : rclone: Version "v1.64.0" starting with parameters ["rclone" "lsd" "MyProtonRemote:" "-vv"]
<7>DEBUG : rclone: systemd logging support activated
<7>DEBUG : Creating backend with remote "MyProtonRemote:"
<7>DEBUG : Using config file from "/home/fred/.config/rclone/rclone.conf"
<7>DEBUG : proton drive root link ID '': Has cached credentials
2023/11/07 11:42:12.365736 ERROR RESTY 401 GET https://mail.proton.me/api/core/v4/users: Invalid access token (Code=401, Status=401), Attempt 1
2023/11/07 11:42:12.523031 ERROR RESTY 422 POST https://mail.proton.me/api/auth/v4/refresh: Invalid refresh token (Code=10013, Status=422), Attempt 1
<7>DEBUG : Saving config "client_uid" in section "MyProtonRemote" of the config file
<7>DEBUG : Saving config "client_access_token" in section "MyProtonRemote" of the config file
<7>DEBUG : Saving config "client_refresh_token" in section "MyProtonRemote" of the config file
<7>DEBUG : Saving config "client_salted_key_pass" in section "MyProtonRemote" of the config file
<7>DEBUG : proton drive root link ID '': Cached credential doesn't work, clearing and using the fallback login method
<7>DEBUG : Saving config "client_uid" in section "MyProtonRemote" of the config file
<7>DEBUG : Saving config "client_access_token" in section "MyProtonRemote" of the config file
<7>DEBUG : Saving config "client_refresh_token" in section "MyProtonRemote" of the config file
<7>DEBUG : Saving config "client_salted_key_pass" in section "MyProtonRemote" of the config file
<7>DEBUG : proton drive root link ID '': couldn't initialize a new proton drive instance using cached credentials: failed to refresh auth: failed to refresh auth, de-auth: 422 POST https://mail.proton.me/api/auth/v4/refresh: Invalid refresh token (Code=10013, Status=422)
<7>DEBUG : proton drive root link ID '': Using username and password to log in
2023/11/07 11:42:14.090172 ERROR RESTY 422 POST https://mail.proton.me/api/auth/v4: The password is not correct. Please try again with a different password. (Code=8002, Status=422), Attempt 1
Failed to create file system for "MyProtonRemote:": couldn't initialize a new proton drive instance: 422 POST https://mail.proton.me/api/auth/v4: The password is not correct. Please try again with a different password. (Code=8002, Status=422)

Additional information

I am on an up-to-date Debian, and I use zsh in the terminal. When I paste the faulty password in the terminal, it is displayed correctly.

It is possible that there is a bug somewhere preventing passwords like yours from working correctly. Maybe somebody will feel like trying to figure out where the issue is.

However, it is not very smart to use Unicode characters for passwords (nor usernames). The problem is that, for example, the character Å could be encoded as the code point U+212B, or U+00C5, or the sequence U+0041 U+030A. Which one you get will depend on your operating system, your regional settings and the way all software components handle Unicode (all dependency libraries etc.). This means that results are unpredictable, as hashing different bytes will produce a different value.

IMO it is better to stick to alphanumeric and special characters from ASCII only. They always behave the same - always, no exceptions.
A randomly generated 20-character all-ASCII password has about 128 bits of entropy, which is unbreakable by brute force nowadays. If that is not enough, a 39-character one will have 256 bits of entropy.
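
Entropy figures like these can be checked with a short calculation. This is only a sketch, assuming each character is drawn uniformly at random from the 95 printable ASCII characters (the alphabet size is an assumption, and the helper names are made up for illustration):

```python
import math

# Assumption: the 95 printable ASCII characters (space included),
# each position chosen uniformly and independently.
PRINTABLE_ASCII = 95


def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy in bits of a uniformly random password of the given length."""
    return length * math.log2(alphabet_size)


def min_length(target_bits: float, alphabet_size: int) -> int:
    """Smallest length whose entropy reaches at least target_bits."""
    return math.ceil(target_bits / math.log2(alphabet_size))


print(min_length(128, PRINTABLE_ASCII))  # 20
print(min_length(256, PRINTABLE_ASCII))  # 39
print(round(entropy_bits(20, PRINTABLE_ASCII), 1))  # 131.4
```

A larger alphabet lowers the required length, but only logarithmically, so a few extra ASCII characters buy the same entropy.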

I think that is exactly the problem.

Here are the two passwords - the one you entered (a) and the one I recovered from the obscured password (b).

>>> a='''Áf@É´.å4¹Tª6}ï>o"\TL'''
>>> b='''Áf@É ́.å41Ta6}ï>o"\TL'''

You can see differences in normalization.
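
The differences between a and b look like what Unicode compatibility normalization (NFKC) would produce. That NFKC is the exact form applied is an assumption here, not something confirmed in this thread. A small sketch with Python's standard unicodedata module shows which characters would change:

```python
import unicodedata

# The password from the first post, written with explicit escapes so the
# exact code points are unambiguous.
a = "\u00c1f@\u00c9\u00b4.\u00e54\u00b9T\u00aa6}\u00ef>o\"\\TL"

# Assumption: NFKC is the normalization form being applied.
b = unicodedata.normalize("NFKC", a)

# List every character of the original that NFKC rewrites.
for ch in a:
    folded = unicodedata.normalize("NFKC", ch)
    if folded != ch:
        targets = " ".join(f"U+{ord(c):04X}" for c in folded)
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {targets}")

print(a == b)  # False: the normalized password is a different string
```

Under that assumption, the acute accent becomes a space plus a combining accent, the superscript one becomes a plain 1 and the ordinal indicator becomes a plain a, which matches what is visible in b.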

I think rclone must be normalizing the password. I know it does this for config passwords you have to type in for exactly the reason @kapitainsky said above. Maybe it shouldn't be doing it for passwords in the config file though as the user never has to type those in.

The problem is that, for example, the character Å could be encoded as the code point U+212B, or U+00C5, or the sequence U+0041 U+030A.

Is it still true if everything works in UTF-8? I have been keen on ensuring everything was UTF-8 on my system.

Moreover, I have like 250-300 passwords generated by my password manager with the same charset, and it's the first time I have this issue on Debian (I had it once in an Android app silently dismissing such a character while its web app counterpart would correctly handle those characters when setting the password or logging in).

Personally, I prefer to use "Extended ASCII" as many web sites prevent passwords that are more than 15 or 16 characters, or sometimes even less, and I aim to get 200 bits of entropy, just in case there is a massive quantum breakthrough (possibly combined with AI) in the coming years. I don't really expect that to happen, but it's a matter of principle/posture and it's (usually) cheap.

Anyway, I changed my password to something that Rclone accepts. But I subsequently had to update the password in my 5 Android apps + 2 desktop apps + 2 pinned tabs in Firefox. I thought it might happen to other users too.

I am not sure I understand the purpose of normalisation. For the similar issue mentioned above, a developer told me that they want characters to exist on keyboards. I am not sure I understood correctly and, if so, whether it is relevant, as password managers increasingly have us paste or pull passwords.

It is irrelevant - UTF-8 is the Unicode Transformation Format using 8-bit (one-byte) units - hence the 8 in its name. It is only a way to encode Unicode characters as a string of bytes.

The problem with Unicode in passwords is that some characters can be represented by several different code points - like the Å example. Your password manager stores one and always uses the same one - so it should always work.

But when your password is transferred to other system/program it can use different code points - all will look exactly the same - Å will be Å but underlying bytes will be different.
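
As an illustration of the Å example, here is a minimal sketch showing that three representations which render the same way produce different UTF-8 bytes, and therefore different hashes. SHA-256 is used only as a stand-in; it is not claimed to be what any particular service uses:

```python
import hashlib

# Three ways to represent what renders as the same character.
forms = {
    "ANGSTROM SIGN (U+212B)": "\u212b",
    "precomposed A with ring (U+00C5)": "\u00c5",
    "A + combining ring (U+0041 U+030A)": "A\u030a",
}

for label, text in forms.items():
    raw = text.encode("utf-8")
    digest = hashlib.sha256(raw).hexdigest()[:16]  # shortened for display
    print(f"{label}: bytes={raw.hex()} sha256={digest}...")
```

The UTF-8 bytes are e284ab, c385 and 41cc8a respectively, so everything being UTF-8 does not make the three equivalent.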

I also look at this from a very practical perspective - what if one day I have to type Áf@É´.å4¹Tª6}ï>o"\TL by hand? On an ssh terminal with a keyboard layout different from the one I am used to? Not possible, I am afraid :)

I am a lazy person and I prefer not to create artificial problems. And Unicode in passwords is just such a thing - no benefit and many potential issues.

I think you are correct @kapitainsky

However, I think this handling of passwords for the config file is wrong - we should not be normalizing the passwords.

I note also that if you use rclone obscure to obscure the password then we do not normalize it - it is just rclone config which is normalizing the password, so there is definitely some inconsistency here.

@Freedim can you open a new issue on Github with a link to this forum post please and I'll think about how to make a fix - thanks!

100% - it is up to the user to provide the string - rclone should not touch it.

I am not a very advanced user, so sorry if my questions are silly...

But when your password is transferred to other system/program it can use different code points - all will look exactly the same - Å will be Å but underlying bytes will be different.

Is this implying that transferring a string actually transfers symbols, from which the recipient has to figure out the code point? I would have expected the opposite: that transferring a string was technically transferring code points, that would then be represented by a symbol, potentially different from the intended one.

I also look at this from a very practical perspective - what if one day I have to type Áf@É´.å4¹Tª6}ï>o"\TL by hand? On an ssh terminal with a keyboard layout different from the one I am used to? Not possible, I am afraid :)

To me this is good, besides having more entropy for fewer characters. Indeed, one needs to be able to access my password manager and pull or copy/paste the password; a mere keyboard will not be enough to spoof me :wink:

Yes, code points are transferred, but then different systems convert (normalise) them based on their own approach. There is more than one way to do it - four to be precise - and these ways are not always compatible. And it has nothing to do with programmers not paying enough attention or making mistakes (which happens too).
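
The four ways are the standard Unicode normalization forms NFC, NFD, NFKC and NFKD. A quick sketch of how they diverge on a two-character sample (the sample string is picked only for illustration):

```python
import unicodedata

# Sample: precomposed A with ring (U+00C5) followed by superscript one (U+00B9).
sample = "\u00c5\u00b9"

results = {
    form: unicodedata.normalize(form, sample)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

for form, text in results.items():
    code_points = " ".join(f"U+{ord(c):04X}" for c in text)
    print(f"{form:4} -> {code_points}")
```

Each form gives a different code point sequence for this sample, so two programs that normalise differently will hash different bytes for the same visible password.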

It is all a bit complicated... If you are interested, here is a fantastic primer on the subject:

https://mcilloni.ovh/2023/07/23/unicode-is-hard/

All thanks to the Unicode Consortium :) But they try their best to maintain backward compatibility while at the same time growing the Unicode set. And like everything in life, Unicode isn't perfect.

Thanks for the link. I have read about charsets many times. And every time, including now with your link, I learn a lot. It seems it's an endless topic.

@Freedim did you make an issue about this?

Yes:
https://github.com/rclone/rclone/issues/7407
