I have a OneDrive remote, with a crypt remote pointing to it. As per default, the crypt remote has directory name encryption enabled. I've noticed that creating directories in the crypt with certain names fails with the following error:
Failed to mkdir: failed to make directory: invalidRequest: The provided name cannot contain leading, or trailing, spaces.
This is affecting several directories, almost all with Japanese characters in their names. An example is "βγ«γΌγ" (without quotation marks).
Run the command 'rclone version' and share the full output of the command.
rclone v1.57.0
os/version: raspbian 10.11 (64 bit)
os/kernel: 5.10.103-v8+ (aarch64)
os/type: linux
os/arch: arm64
go/version: go1.17.2
go/linking: static
go/tags: none
Which cloud storage system are you using? (eg Google Drive)
OneDrive
The command you were trying to run (eg rclone copy /tmp remote:tmp)
rclone mkdir cucryo:βγ«γΌγ
The rclone config contents with secrets removed.
[cuone]
type = onedrive
drive_type = business
[cucryo]
type = crypt
remote = cuone:O
filename_encryption = obfuscate
A log from the command with the -vv flag
2022/03/17 16:09:08 DEBUG : rclone: Version "v1.57.0" starting with parameters ["rclone" "mkdir" "cucryo:βγ«γΌγ" "-vv"]
2022/03/17 16:09:08 DEBUG : Creating backend with remote "cucryo:βγ«γΌγ"
2022/03/17 16:09:08 DEBUG : Using config file from "/home/pi/.config/rclone/rclone.conf"
2022/03/17 16:09:08 DEBUG : Creating backend with remote "cuone:O/220.βγγ₯\u3000"
2022/03/17 16:09:12 DEBUG : Encrypted drive 'cucryo:βγ«γΌγ': Making directory
2022/03/17 16:09:14 ERROR : Attempt 1/3 failed with 1 errors and: failed to make directory: invalidRequest: The provided name cannot contain leading, or trailing, spaces.
2022/03/17 16:09:14 DEBUG : Encrypted drive 'cucryo:βγ«γΌγ': Making directory
2022/03/17 16:09:15 ERROR : Attempt 2/3 failed with 1 errors and: failed to make directory: invalidRequest: The provided name cannot contain leading, or trailing, spaces.
2022/03/17 16:09:15 DEBUG : Encrypted drive 'cucryo:βγ«γΌγ': Making directory
2022/03/17 16:09:16 ERROR : Attempt 3/3 failed with 1 errors and: failed to make directory: invalidRequest: The provided name cannot contain leading, or trailing, spaces.
2022/03/17 16:09:16 DEBUG : 5 go routines active
2022/03/17 16:09:16 Failed to mkdir: failed to make directory: invalidRequest: The provided name cannot contain leading, or trailing, spaces.
I did originally try to go with the standard filename encryption and didn't run into this issue then, but I was running into OneDrive's path length limit instead.
Thanks for the suggestion, after a little testing it looks like base32768 filename encoding does let me avoid running into the OneDrive path length limit. Still, I'm curious if the filename obfuscation producing spaces is considered a bug, and if it's feasible to fix?
It's not a space, it is some sort of Unicode equivalent: the obfuscated name in your log ends in \u3000.
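As a quick illustration (rclone itself is written in Go; Python is used here only as a sketch), the following checks the last character of the obfuscated name from the -vv log. The helper name `describe` is made up for this example. It suggests why a trailing-whitespace rule would reject the name even though no ASCII space is involved:

```python
import unicodedata

def describe(ch):
    """Return (Unicode name, general category, is-whitespace) for one character."""
    return unicodedata.name(ch), unicodedata.category(ch), ch.isspace()

# "\u3000" is the final character of the obfuscated name in the log.
print(describe("\u3000"))  # ('IDEOGRAPHIC SPACE', 'Zs', True)
print("\u3000" == " ")     # False: it is not the ASCII space
```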
It is unfortunate that obfuscate produces these, but it is difficult to change in a backwards compatible way. What do you think @sweh?
The other alternative would be to encode the trailing unicode space like we already encode a trailing normal space in the OneDrive backend. This is possible but has backwards compatibility implications too.
If it's the last character and newRune is a whitespace then quote the original rune instead.
However, as you say, that could break existing files. The deobfuscate routine would still work, but filename matching might break on services that allow whitespace endings.
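To make the idea concrete, here is a minimal Python sketch of "quote the original rune when the last obfuscated rune would be whitespace". It is a toy, not rclone's actual obfuscateSegment(): the rotation within a 256-code-point block, the `!` quote marker, and the function names are all assumptions chosen for illustration.

```python
QUOTE = "!"  # hypothetical quote marker: the next character is stored as-is

def rotate(ch, shift):
    """Toy rotation: move ch by shift within its 256-code-point block."""
    base = ord(ch) - ord(ch) % 256
    return chr(base + (ord(ch) - base + shift) % 256)

def obfuscate(name, shift):
    out = []
    last = len(name) - 1
    for i, ch in enumerate(name):
        new = rotate(ch, shift)
        # Quote the original character if the rotated one would collide
        # with the marker, or would leave the name ending in whitespace.
        if new == QUOTE or (i == last and new.isspace()):
            out.append(QUOTE + ch)
        else:
            out.append(new)
    return "".join(out)

def deobfuscate(obf, shift):
    out = []
    chars = iter(obf)
    for ch in chars:
        if ch == QUOTE:
            out.append(next(chars, ""))  # quoted: take the next char literally
        else:
            out.append(rotate(ch, -shift))
    return "".join(out)

name = "\u30ab\u30a2"  # with shift 94 the last character rotates to U+3000
obf = obfuscate(name, 94)
print(obf[-1].isspace())             # False
print(deobfuscate(obf, 94) == name)  # True
```

Decoding still works for names written before such a change, since the decoder already understands quoting. The compatibility risk is on the encoding side: a name that used to obfuscate to one string would now obfuscate to a different one, so lookups against already-stored names could miss. The toy also does not handle an original name that itself ends in whitespace.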
Similarly expanding the OneDrive character kludge mechanism to handle unicode spaces might let filename matching work (we'd need to test all the variations of whitespace to see if OneDrive rejects them all; if so we know no existing file has that pattern! And we'd also need to test other backends also using that kludge...) ugh... that's getting complicated.
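As a starting point for that testing, a small Python sketch can list candidate characters. It assumes "whitespace" means Unicode general category Zs (space separator), which is only one possible definition; which of these OneDrive actually rejects is not known here and would have to be tested against the service.

```python
import sys
import unicodedata

def space_separators():
    """Return every code point whose Unicode general category is Zs."""
    return [chr(cp) for cp in range(sys.maxunicode + 1)
            if unicodedata.category(chr(cp)) == "Zs"]

# Print each candidate so it can be tried as a trailing character.
for ch in space_separators():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The exact list depends on the Unicode version bundled with the Python interpreter, so it is worth checking the output rather than hard-coding a count.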
I guess it might be possible to add an advanced boolean obfuscate_whitespace (hmm, needs a better name!) to allow the user to turn on the IsSpace(newRune) test in obfuscateSegment(). That'll run the risk of breaking files if the user specifically turns it on, but won't impact anything else.
EDIT: Thinking on it more; the theoretically correct place (maybe with a global "utf8_whitespace" option) would be in the filename kludging routine. But because this doesn't use a reversible mapping, it'd break the obfuscate routines (the rune expected wouldn't be the rune seen). So I think it'd have to be done as a layering violation in the obfuscate routines themselves.
Aside: Windows actually does allow for whitespace at the end of file... in cygwin you can create a file "hello.txt " with a space at the end. But then windows explorer and CMD and similar just can't handle it! Stoopid Microsoft; stoopid!
Yes, I think this could be changed in the obfuscate routines, but I think that is probably too much of a backwards compatibility break.
With the benefit of hindsight we should have avoided those Unicode space-like characters!
Actually the encoding is reversible (see "Overview of cloud storage systems"). It lets you store any filename on any backend, so you can use OneDrive to back up your Linux computer and not have to worry about all the characters in filenames that Windows doesn't allow.
We'd have to find new mapping characters for the various Unicode spaces though. Currently we map ASCII space onto ␠ in RightSpace escaping (there is an escaping mechanism if that character is in existing files).
My feeling is that this is the edgiest of edge cases though and I'm not sure it is worth fixing since there is a good workaround.
I wonder if there's an easier answer: instead of replacing the trailing whitespace with an alternate character, add an extra non-space character. That way we know the name doesn't end in whitespace. Then Decode just needs to look for that extra character and remove it.
Then we don't need to find 6 or 8 or whatever whitespace alternatives!
Ah, so make a "remove this character" escape sequence.
That would work for trailing unicode space at least and would be much easier than finding so many unicode equivalents.
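A minimal Python sketch of that "remove this character" idea, under stated assumptions: the marker character (`‡`, U+2021) and the function names are hypothetical, and this is not how rclone's encoder is implemented. To stay reversible, a name that already ends in the marker gets one more marker appended, so Decode can always strip exactly one.

```python
MARKER = "\u2021"  # hypothetical non-space padding character

def encode_trailing(name):
    """Append MARKER if the name ends in whitespace or in MARKER itself."""
    if name and (name[-1].isspace() or name.endswith(MARKER)):
        return name + MARKER
    return name

def decode_trailing(name):
    """Strip a single trailing MARKER, if present."""
    if name.endswith(MARKER):
        return name[:-1]
    return name

for original in ["plain", "ends with space ", "ideographic\u3000", "has marker" + MARKER]:
    stored = encode_trailing(original)
    print(repr(stored), stored[-1].isspace(), decode_trailing(stored) == original)
```

The same backwards compatibility caveat applies: a file stored before such a change whose name already ends in the marker would be decoded with that character removed.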
That said, I think I'm going to park this issue in "too difficult for too little gain", as in general there is very little use of Unicode spaces, and even less of them trailing a file name. The obfuscate thing is unfortunate, but there is a good workaround: the base32768 encoding scheme, which was designed specifically for OneDrive.