ERROR File name in UTF8 must be no more than 1024 bytes

What is the problem you are having with rclone?

A previously working backup to Backblaze B2 fails with the error: ERROR File name in UTF8 must be no more than 1024 bytes. After the third such error the backup fails. It takes about 10 minutes. I'm trying to back up about 1TB.

Yes, I understand that this is similar to an old forum post here. I tried what that old post suggested but I'm still in the dark.

I sure wish that error message told me WHICH file was causing trouble, or that rclone could munge filenames appropriately so this didn't happen, or that the error could be ignored. But I guess that's not possible?

I used the rclone lsf command to scan for long file names and the longest found is only 300 characters. I'm using the --skip-links switch so it shouldn't be a symlink issue? No? Yes?

The original post linked above says something about when he 'set the dest' he saw the problem. I didn't understand what it was he did at that point. Maybe I could do the same?

I'm seeking some advice.

Run the command 'rclone version' and share the full output of the command.

rclone v1.64.2

  • os/version: freebsd 13.1-release-p7 (64 bit)
  • os/kernel: 13.1-release-p7 (amd64)
  • os/type: freebsd
  • os/arch: amd64
  • go/version: go1.21.3
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Backblaze B2

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync --skip-links --b2-hard-delete --modify-window 1s --fast-list --min-age 15m --log-level DEBUG --log-file /root/rclone-debug.log /mnt/NADODATA secret:

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[remote]
type = b2
account = XXX
key = XXX
endpoint = 

[secret]
type = crypt
remote = remote:NADOBACK
filename_encryption = standard
password = XXX
password2 = XXX

A log from the command that you were trying to run with the -vv flag

2023/11/22 19:55:43 DEBUG : --min-age 15m0s to 2023-11-22 19:40:43.965102049 +0000 UTC m=-899.826410527
2023/11/22 19:55:43 DEBUG : rclone: Version "v1.64.2" starting with parameters ["install/rclone" "sync" "--skip-links" "--b2-hard-delete" "--modify-window" "1s" "--fast-list" "--min-age" "15m" "--log-level" "DEBUG" "--log-file" "/root/rclone-debug.log" "/mnt/NADODATA" "secret:"]
2023/11/22 19:55:43 DEBUG : Creating backend with remote "/mnt/NADODATA"
2023/11/22 19:55:43 DEBUG : Using config file from "/root/.config/rclone/rclone.conf"
2023/11/22 19:55:43 DEBUG : local: detected overridden config - adding "{HK82T}" suffix to name
2023/11/22 19:55:43 DEBUG : fs cache: renaming cache item "/mnt/NADODATA" to be canonical "local{HK82T}:/mnt/NADODATA"
2023/11/22 19:55:43 DEBUG : Creating backend with remote "secret:"
2023/11/22 19:55:44 DEBUG : Creating backend with remote "remote:NADOBACK"
2023/11/22 19:55:44 DEBUG : remote: detected overridden config - adding "{jlU5h}" suffix to name
2023/11/22 19:55:44 DEBUG : fs cache: renaming cache item "remote:NADOBACK" to be canonical "remote{jlU5h}:NADOBACK"
2023/11/22 19:56:44 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       1m0.5s

2023/11/22 19:57:44 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       2m0.5s

2023/11/22 19:58:44 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       3m0.5s

2023/11/22 19:59:21 ERROR : Encrypted drive 'secret:': error reading destination root directory: File name in UTF8 must be no more than 1024 bytes (400 bad_request)
2023/11/22 19:59:21 DEBUG : Encrypted drive 'secret:': Waiting for checks to finish
2023/11/22 19:59:21 DEBUG : Encrypted drive 'secret:': Waiting for transfers to finish
2023/11/22 19:59:21 ERROR : Encrypted drive 'secret:': not deleting files as there were IO errors
2023/11/22 19:59:21 ERROR : Encrypted drive 'secret:': not deleting directories as there were IO errors
2023/11/22 19:59:21 ERROR : Attempt 1/3 failed with 1 errors and: File name in UTF8 must be no more than 1024 bytes (400 bad_request)
2023/11/22 19:59:44 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       4m0.5s

2023/11/22 20:00:44 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       5m0.5s

2023/11/22 20:01:44 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       6m0.5s

2023/11/22 20:02:44 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       7m0.5s

2023/11/22 20:03:01 ERROR : Encrypted drive 'secret:': error reading destination root directory: File name in UTF8 must be no more than 1024 bytes (400 bad_request)
2023/11/22 20:03:01 DEBUG : Encrypted drive 'secret:': Waiting for checks to finish
2023/11/22 20:03:01 DEBUG : Encrypted drive 'secret:': Waiting for transfers to finish
2023/11/22 20:03:01 ERROR : Encrypted drive 'secret:': not deleting files as there were IO errors
2023/11/22 20:03:01 ERROR : Encrypted drive 'secret:': not deleting directories as there were IO errors
2023/11/22 20:03:01 ERROR : Attempt 2/3 failed with 1 errors and: File name in UTF8 must be no more than 1024 bytes (400 bad_request)
2023/11/22 20:03:44 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       8m0.5s

2023/11/22 20:04:44 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       9m0.5s

2023/11/22 20:05:44 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:      10m0.5s

2023/11/22 20:06:29 ERROR : Encrypted drive 'secret:': error reading destination root directory: File name in UTF8 must be no more than 1024 bytes (400 bad_request)
2023/11/22 20:06:29 DEBUG : Encrypted drive 'secret:': Waiting for checks to finish
2023/11/22 20:06:29 DEBUG : Encrypted drive 'secret:': Waiting for transfers to finish
2023/11/22 20:06:29 ERROR : Encrypted drive 'secret:': not deleting files as there were IO errors
2023/11/22 20:06:29 ERROR : Encrypted drive 'secret:': not deleting directories as there were IO errors
2023/11/22 20:06:29 ERROR : Attempt 3/3 failed with 1 errors and: File name in UTF8 must be no more than 1024 bytes (400 bad_request)
2023/11/22 20:06:29 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Errors:                 1 (retrying may help)
Elapsed time:     10m45.8s

2023/11/22 20:06:29 DEBUG : 6 go routines active
2023/11/22 20:06:29 Failed to sync: File name in UTF8 must be no more than 1024 bytes (400 bad_request)

You are using a crypt remote, which means your max file/path length is much shorter than what the raw remote allows. This is the "cost" of name encryption.

/a/b/c/test.txt

when encrypted, becomes this on the B2 remote:

NADOBACK/opudp9b9cvlcd6tc3v361rlh0c/n7m3943f52iafvug368ug93d44/pcvo5isl5l91bsqqrckn6cc220/8k3am05ki43ju5avbrahe3o4ck

You are also using the default crypt name encoding, base32, which is the safest but also the biggest offender in terms of "consuming" file/path length :)

If B2 allows case-sensitive file names, you could use base64, which is a bit better than base32 in terms of length.

You could also check whether base32768 (the best, but it does not work on all remotes) is an option. Run:

rclone test info --check-length --check-base32768 remote:test-info

and post the output here.

In general, for a crypt remote: keep your paths shallow and your names short. If you're not sure how short, test to find the limit yourself. Unfortunately there is no simple formula to calculate it.
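Even without an exact formula, a rough estimate is possible. The sketch below assumes the scheme described in the rclone crypt docs for standard name encryption: each path segment is padded to a multiple of 16 bytes, encrypted, and then base32 encoded without padding characters. The function names are made up for illustration, and the result should be treated as an estimate, not a guarantee. It does reproduce the 26-character segments in the example above.

```python
BLOCK = 16  # assumed block size: each name segment is padded to a multiple of this


def encrypted_segment_len(segment: str) -> int:
    """Estimated length of one path segment after crypt (standard, base32)."""
    n = len(segment.encode("utf-8"))
    padded = (n // BLOCK + 1) * BLOCK   # padding always adds between 1 and 16 bytes
    return (padded * 8 + 4) // 5        # base32 carries 5 bits per character, rounded up


def encrypted_path_len(path: str) -> int:
    """Estimated length of a whole path: segments are encrypted one by one, joined by '/'."""
    segments = [s for s in path.split("/") if s]
    return sum(encrypted_segment_len(s) for s in segments) + max(len(segments) - 1, 0)


print(encrypted_segment_len("test.txt"))          # 26
print(encrypted_path_len("/a/b/c/test.txt"))      # 107
print(encrypted_path_len("/".join(["d"] * 40)))   # 1079
```

The last line is the worst-case shape: a 79-character plaintext path made of 40 one-letter directories is already estimated to be over 1024 once encrypted, because every segment costs at least 26 characters.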

Of course, the basic problem with using rclone for backup is that I don't control what the filenames are. And to compound matters, the error message doesn't tell me the offending filename. Rubbing salt in the wound is rclone's unwillingness to continue. This means that my backup is fragile - it might suddenly stop working. If I was a ransomware author, I'd create a long filename and wait a few months... :wink: Is there any way to force rclone to continue past an error like this, to just skip over files with offending names, or to report the offending file?

Anyway, in the output below, is MaxFileLength the length of the leaf filename, or does it refer to the full path name? Length before encryption or after? I know I have many full path names in the 300 range in plaintext, none anywhere near 600. Dunno what they are once encrypted. They can't double in length, can they?

rclone test info --check-length --check-base32768 secret:
2023/11/23 12:48:34 EME operates on 1 to 128 block-cipher blocks, you passed 513
2023/11/23 12:48:34 EME operates on 1 to 128 block-cipher blocks, you passed 257
2023/11/23 12:48:34 EME operates on 1 to 128 block-cipher blocks, you passed 129
2023/11/23 12:48:37 EME operates on 1 to 128 block-cipher blocks, you passed 1025
2023/11/23 12:48:37 EME operates on 1 to 128 block-cipher blocks, you passed 513
2023/11/23 12:48:37 EME operates on 1 to 128 block-cipher blocks, you passed 257
2023/11/23 12:48:37 EME operates on 1 to 128 block-cipher blocks, you passed 129
2023/11/23 12:48:40 EME operates on 1 to 128 block-cipher blocks, you passed 1537
2023/11/23 12:48:40 EME operates on 1 to 128 block-cipher blocks, you passed 769
2023/11/23 12:48:40 EME operates on 1 to 128 block-cipher blocks, you passed 385
2023/11/23 12:48:40 EME operates on 1 to 128 block-cipher blocks, you passed 193
2023/11/23 12:48:43 EME operates on 1 to 128 block-cipher blocks, you passed 2049
2023/11/23 12:48:43 EME operates on 1 to 128 block-cipher blocks, you passed 1025
2023/11/23 12:48:43 EME operates on 1 to 128 block-cipher blocks, you passed 513
2023/11/23 12:48:43 EME operates on 1 to 128 block-cipher blocks, you passed 257
2023/11/23 12:48:43 EME operates on 1 to 128 block-cipher blocks, you passed 129
2023/11/23 12:50:28 NOTICE: Encrypted drive 'secret:test-base32768': 0 differences found
2023/11/23 12:50:28 NOTICE: Encrypted drive 'secret:test-base32768': 1028 hashes could not be checked
2023/11/23 12:50:28 NOTICE: Encrypted drive 'secret:test-base32768': 1028 matching files
// secret
maxFileLength = 639 // for 1 byte unicode characters
maxFileLength = 319 // for 2 byte unicode characters
maxFileLength = 213 // for 3 byte unicode characters
maxFileLength = 159 // for 4 byte unicode characters
base32768isOK = true // make sure maxFileLength for 2 byte unicode chars is the same as for 1 byte characters

OK, I do not think base32768 is a good option for B2, as they count Unicode in bytes and not characters.

Well... this is the same problem with any backup software. What is important here is that rclone is not a backup program. It is a utility for copying files from A to B, with a few extra bells and whistles, like crypt, added for the convenience of many simple tasks.

Yes, but like the previous point, it always applies, rclone or not. You are the system operator: you should plan your backup around the max path length... and not blame the tools :)

If you want to handle path/file name length limits and logic, you have to build a wrapper around rclone yourself.
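As a starting point for such a wrapper, here is a minimal sketch of a pre-flight scan that lists local files whose encrypted path is likely to exceed the 1024-byte limit from the error message. It relies on the same assumption as the crypt docs describe (16-byte padding per segment, base32 without padding). `find_offenders`, `estimated_encrypted_len` and the `prefix` parameter are hypothetical names, not part of rclone.

```python
import os

B2_LIMIT = 1024  # bytes, taken from the error message in this thread


def estimated_encrypted_len(rel_path: str) -> int:
    """Rough length of rel_path after crypt (standard, base32), segment by segment."""
    parts = [p for p in rel_path.split("/") if p]
    total = 0
    for p in parts:
        n = len(p.encode("utf-8", errors="surrogateescape"))
        padded = (n // 16 + 1) * 16          # assumed: padded to a multiple of 16 bytes
        total += (padded * 8 + 4) // 5       # assumed: base32, 5 bits per character
    return total + max(len(parts) - 1, 0)    # plus the '/' separators


def find_offenders(root: str, prefix: str = "", limit: int = B2_LIMIT):
    """Yield (relative_path, estimated_length) for files likely to exceed the limit.

    prefix is any folder path inside the bucket that sits in front of the
    encrypted name (empty when the crypt wraps the bucket root).
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root).replace(os.sep, "/")
            est = len(prefix.encode("utf-8")) + estimated_encrypted_len(rel)
            if est > limit:
                yield rel, est
```

Something like `for rel, est in find_offenders("/mnt/NADODATA"): print(est, rel)` would print the suspects before the sync starts. The list could then be fed to rclone's `--exclude-from`, keeping in mind that filter patterns treat characters such as `*`, `?`, `[` and `{` specially, so names containing them would need escaping first.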

And IMO if you need a backup solution, look at programs like kopia or restic (they can use rclone too) or some commercial offerings.

It is only leaf length. It is raw B2 - no encryption.

Look at my example again... double in length? worst case is much worse:)

If I used kopia or restic and one day they broke because of some unanticipated limitation, could I blame them? Nope. Still on me. I get it. Yes, I understand the system admin's motto is "down not across" or at least "mea culpa".

In any case, so rclone has a filename limit. Got it. And it fails fatally if it hits a long filename, and it doesn't say what the offending file is. And it's my fault. If I want better behavior, I build defensive structure around rclone or use a tool that has this filtering built already.

Do I have it about right?

This is definitely wrong. I am sort of sure it used to show it in the past.... I will have to test it again.

The big problem with the design of crypt name encryption is that you cannot easily plan your max path/file name length. It comes with the benefit of being easy to use, but at a cost... It is a bit like EncFS...

It would be incredibly useful if rclone reported the offending file/path at the point it failed. I'm using log level DEBUG and nothing useful is reported. Is there a verbosity higher than DEBUG?

Also, any reason why there isn't a --ignore-offending-files option?
Let me guess. The offending file error is generated by the remote. Rclone can't "read" the error and interpret it as something that could be mitigated by skipping a file. Did I guess right?

Agree with you on this 100%. But still checking

BTW

Running it against a crypt remote is pointless :) Unless you plan on creating nested crypts :)

You should run it against the B2 bucket/folder.

OK, there is some value in these results :) The max length of a name made of 1-byte characters on this crypt remote is 639 characters.

Your example is a very small initial filename, 8 characters. I would expect there's some overhead in the encryption that disproportionately expands an 8 character name. Would a 300 character filename more than double in length? How did you generate that test example, by the way?

Correct. It is all described in the crypt docs. But it means it is difficult to say what the max allowed path is.

Sometimes a shorter one can be longer once encrypted than a long one.

I don't fully understand the second sentence, and I ran the command because you suggested it -- did I do it wrong?

I asked (based on your config file names):

Reading that quickly, it says it adds a salt. And pads up to a multiple of 16 bytes. So that's overhead. I'm unclear on why the encrypted name would expand more than that. Nevertheless, I get the point that there will be expansion and the full pathname length limit in plaintext might be a LOT less than 1024, and could easily be less than 128. Which is kinda' lame, but it is what it is. Mea culpa. :slight_smile:
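To put numbers on that expansion, here is a small sketch that assumes the only overhead is the padding to a multiple of 16 bytes followed by base32 encoding without padding characters (no extra per-name bytes). Both function names are hypothetical. Under those assumptions it reproduces the 639 reported by the test against secret: above.

```python
def encoded_len(plain_bytes: int) -> int:
    """Estimated base32 length of one encrypted name segment."""
    padded = (plain_bytes // 16 + 1) * 16   # padding always adds 1..16 bytes
    return (padded * 8 + 4) // 5            # 5 bits per base32 character, rounded up


def max_plaintext_bytes(encoded_limit: int, bits_per_char: int = 5) -> int:
    """Largest plaintext segment (bytes) whose encoded form fits in encoded_limit characters."""
    raw_bytes = encoded_limit * bits_per_char // 8   # bytes the encoding can carry
    padded = raw_bytes // 16 * 16                    # ciphertext is a whole number of blocks
    return max(padded - 1, 0)                        # padding takes at least one byte


print(encoded_len(8))             # 26
print(encoded_len(300))           # 487
print(max_plaintext_bytes(1024))  # 639
print(max_plaintext_bytes(255))   # 143
```

So a single 300-byte name grows by roughly 1.6x rather than doubling, while an 8-byte name more than triples. The expensive case is a path built from many short segments, since each one is padded separately.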

I thought "test-info" on the right was a placeholder for some optional bucket name. It's something else? I'll re run it. Don't fully understand what I'm doing.

This is why I prefer kopia or restic for backup. They encapsulate everything in well-defined file-based containers, so I do not have to worry about what the target remote supports, as long as it supports the bare minimum needed to store the containers.

Well now I know.

The whole goal of rclone test info --check-length --check-base32768 is an empirical test to establish:

  1. the max file name length
  2. support for all characters used by the base32768 encoding

The goal is to see whether a given remote can use base32768 and whether it makes sense.

Some remotes count max characters regardless of whether they are 1 or 2 bytes.

So e.g. instead of the 255 x 5 bits that base32 can squeeze in, we can have 255 x 15 bits, which is a mega difference.
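The arithmetic can be sketched as follows. It is a back-of-the-envelope comparison that assumes a remote limiting names to 255 characters (not bytes) and ignores any framing overhead the encodings may add. `payload_bytes` is a hypothetical helper.

```python
# Payload bits carried by each encoded character
ENCODINGS = {"base32": 5, "base64": 6, "base32768": 15}


def payload_bytes(max_chars: int, bits_per_char: int) -> int:
    """Whole bytes of encrypted name that fit into max_chars encoded characters."""
    return max_chars * bits_per_char // 8


for name, bits in ENCODINGS.items():
    print(name, payload_bytes(255, bits))
# base32 159
# base64 191
# base32768 478
```

On a remote that counts bytes instead of characters, the gain from base32768 shrinks or disappears, because each of its characters takes more than one byte in UTF-8.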

I understand what it's testing. What I don't understand is the destination. What's different between just the remote name "secret:" and the remote name "secret:test-info"?

Anyway:

rclone test info --check-length --check-base32768 remote:test-info
2023/11/23 13:53:40 NOTICE: B2 bucket test-info path test-base32768: 0 differences found
2023/11/23 13:53:40 NOTICE: B2 bucket test-info path test-base32768: 1028 matching files
// remote
maxFileLength = 1024 // for 1 byte unicode characters
maxFileLength = 512 // for 2 byte unicode characters
maxFileLength = 341 // for 3 byte unicode characters
maxFileLength = 256 // for 4 byte unicode characters
base32768isOK = true // make sure maxFileLength for 2 byte unicode chars is the same as for 1 byte characters
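One way to read this output programmatically is sketched below: if the limit shrinks as characters get wider, the remote is counting bytes, which is what the B2 numbers above show. The parser only assumes the `maxFileLength = N // for W byte unicode characters` line format seen in this thread; `parse_test_info` and `counts_bytes` are hypothetical helpers.

```python
import re


def parse_test_info(text: str) -> dict:
    """Map UTF-8 character width (1..4) to the measured maxFileLength."""
    pattern = re.compile(r"maxFileLength = (\d+) // for (\d) byte unicode characters")
    return {int(width): int(length) for length, width in pattern.findall(text)}


def counts_bytes(limits: dict) -> bool:
    """True if wider characters get a smaller limit, i.e. the remote counts bytes."""
    return limits[2] < limits[1]


OUTPUT = """\
maxFileLength = 1024 // for 1 byte unicode characters
maxFileLength = 512 // for 2 byte unicode characters
maxFileLength = 341 // for 3 byte unicode characters
maxFileLength = 256 // for 4 byte unicode characters
"""

limits = parse_test_info(OUTPUT)
print(limits)                # {1: 1024, 2: 512, 3: 341, 4: 256}
print(counts_bytes(limits))  # True
```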