ERROR File name in UTF8 must be no more than 1024 bytes

If this maxFileLength = 1024 refers to the whole encrypted path, I can see how a 300-character path could exceed it, because all the little path components get bloated many times over by encryption. But if the 1024 refers to the leaf filename, I don't see how the expansion could be very large: a long leaf filename wouldn't more than double in size.
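For a rough sense of scale, here is a back-of-the-envelope sketch of the per-segment expansion, based on my reading of the crypt docs (each path segment is PKCS#7-padded to a 16-byte multiple, EME-encrypted, then base32-encoded); treat the exact arithmetic as an assumption:

# rough crypt name expansion for one path segment (a sketch; numbers
# assume PKCS#7 padding to 16 bytes, EME, then unpadded base32)
n=200                            # plaintext segment length in bytes
padded=$(( (n / 16 + 1) * 16 ))  # PKCS#7 always adds 1-16 bytes -> 208
enc=$(( (padded * 8 + 4) / 5 ))  # base32 is ceil(8/5) of that   -> 333
echo "$n bytes -> $enc chars"

So a long segment grows by a factor of roughly 1.6, consistent with a leaf name less than doubling, while a path of many tiny segments (each padded up to at least 16 bytes before encoding) can blow up far more.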

Anyway.

  1. The key question: why does rclone not report the path/file name when the maximum length is exceeded for a crypt remote?

  2. Can such errors be ignored?

I do not have answers, at least not yet, so anybody with ideas, please jump in.

Thanks for taking the time to look at this with me.

In the meantime, until we have better answers, you can find the longest encrypted names manually:

  1. Get all source file names (including their paths):
    rclone lsf /mnt/NADODATA --recursive --files-only > all_names.txt

  2. Then loop over all_names.txt and generate the encrypted values (see the sketch below the list):
    rclone backend encode secret: "$line"

This way it should be easy to identify which file(s) are causing the problem.
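A minimal sketch of that loop, assuming the all_names.txt from step 1 and that rclone backend encode prints one encoded path per input name:

# loop over all_names.txt and record each encrypted path
while IFS= read -r line; do
    rclone backend encode secret: "$line"
done < all_names.txt > all_names_encrypted.txt

Sorting all_names_encrypted.txt by line length then shows the longest encrypted names (note this runs rclone once per file, so it is slow on big trees).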

Looking again at the original debug log, I am starting to think that this is not a problem with transferring files whose path length exceeds the B2 limit. During an attempted transfer we would at least have the file name logged.

It looks to me like the error originates in the checking phase:

2023/11/22 19:59:21 ERROR : Encrypted drive 'secret:': error reading destination root directory: File name in UTF8 must be no more than 1024 bytes (400 bad_request)
2023/11/22 19:59:21 DEBUG : Encrypted drive 'secret:': Waiting for checks to finish

which is weird.

@ncw may I ask you to have a look?

That would be nice, wouldn't it!

I think what is happening is that rclone is looking to see if a directory exists, and the B2 API returns the error "File name in UTF8 must be no more than 1024 bytes".

I think it is talking about the full path length, not just the length of the file name; check the docs for more info.

If you run your command with -vv --dump headers --retries 1, you'll hopefully see the file name in question in the HTTP transaction that fails.
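For example, something like this, substituting your own source and remote (the paths here are just the ones from earlier in the thread):

rclone sync /mnt/NADODATA secret: -vv --dump headers --retries 1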

If you want to find the longest path in the source, you can do it like this:

rclone lsf --skip-links -R /mnt/NADODATA | awk '{ if (length($0) > max) {max = length($0); line = $0} } END { print line }'
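In the same spirit, two variants (my sketches, untested on your data) for the longest leaf name and the deepest nesting:

# longest leaf filename
rclone lsf --skip-links -R /mnt/NADODATA | awk -F/ '{ if (length($NF) > max) { max = length($NF); line = $0 } } END { print line }'

# most path segments
rclone lsf --skip-links -R /mnt/NADODATA | awk -F/ '{ if (NF > max) { max = NF; line = $0 } } END { print line }'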

Thanks to you and @kapitainsky and @ncw for considering my issue.

I rewrote my rclone script to recurse one level into the tree and run rclone separately on each first-level subdirectory. I ran this over the Thanksgiving holiday. It had two benefits: first, it got the backup done for most of the data, allowing me to better enjoy the turkey; second, it identified the subtree with (supposedly) the problem pathname.

I've previously run the rclone lsf scan to determine the longest full pathname, which works out to be 313 characters. That's the full path, so the longest leaf filename is necessarily shorter. And I've recently seen that this 313-character maximum-length full pathname is in a subtree that backs up successfully, not the one that gives the error; ergo, this "longest" pathname is not the problem. I also found the longest leaf filename, 200 characters, which likewise occurs in a subtree that rclones with no error. And my deepest path (25 segments) is also in a subtree that rclones without error.

So neither the longest filename nor the longest pathname is causing this problem. I suppose I could figure out what the longest encrypted filename is, but it seems to me this must be the encryption of the longest filename? Not true for the longest encrypted full pathname, of course, because something like /x/x/x/x/x/x/x/x/ could grow a lot more than /xxxxxxxx. Perhaps I'll do some awk magic to find this (see the sketch below).
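The awk magic might look something like this: a sketch that reuses the backend encode idea from earlier in the thread, assuming it prints one encoded path per line:

# find the plaintext path whose encrypted form is longest
# (slow: one rclone invocation per file)
rclone lsf --skip-links -R --files-only /mnt/NADODATA | while IFS= read -r f; do
    printf '%s\t%s\n' "$(rclone backend encode secret: "$f")" "$f"
done | awk -F'\t' '{ if (length($1) > max) { max = length($1); line = $2 } } END { print max, line }'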

Or maybe I'll try recursing deeper into the tree to narrow down where the error is occurring. I sure wish the error message were more specific.

I tried this. Sadly, I remain in the dark. I can't post the whole log as it's too long, but it ends this way...

2023/11/26 15:30:00 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2023/11/26 15:30:00 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/11/26 15:30:00 DEBUG : HTTP REQUEST (req 0xc016b55800)
2023/11/26 15:30:00 DEBUG : POST /b2api/v1/b2_list_file_names HTTP/1.1
Host: api001.backblazeb2.com
User-Agent: rclone/v1.64.2
Content-Length: 980
Authorization: XXXX
Content-Type: application/json
Accept-Encoding: gzip

2023/11/26 15:30:00 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/11/26 15:30:00 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2023/11/26 15:30:00 DEBUG : HTTP RESPONSE (req 0xc016b55800)
2023/11/26 15:30:00 DEBUG : HTTP/1.1 200 
Content-Length: 1390852
Cache-Control: max-age=0, no-cache, no-store
Content-Type: application/json;charset=UTF-8
Date: Sun, 26 Nov 2023 15:30:00 GMT

2023/11/26 15:30:00 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2023/11/26 15:30:01 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/11/26 15:30:01 DEBUG : HTTP REQUEST (req 0xc015fa4d00)
2023/11/26 15:30:01 DEBUG : POST /b2api/v1/b2_list_file_names HTTP/1.1
Host: api001.backblazeb2.com
User-Agent: rclone/v1.64.2
Content-Length: 980
Authorization: XXXX
Content-Type: application/json
Accept-Encoding: gzip

2023/11/26 15:30:01 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2023/11/26 15:30:01 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2023/11/26 15:30:01 DEBUG : HTTP RESPONSE (req 0xc015fa4d00)
2023/11/26 15:30:01 DEBUG : HTTP/1.1 400 
Connection: close
Content-Length: 110
Cache-Control: max-age=0, no-cache, no-store
Content-Type: application/json;charset=utf-8
Date: Sun, 26 Nov 2023 15:30:01 GMT

2023/11/26 15:30:01 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2023/11/26 15:30:01 ERROR : Encrypted drive 'secret:backup': error reading destination root directory: File name in UTF8 must be no more than 1024 bytes (400 bad_request)
2023/11/26 15:30:01 DEBUG : Encrypted drive 'secret:backup': Waiting for checks to finish
2023/11/26 15:30:01 DEBUG : Encrypted drive 'secret:backup': Waiting for transfers to finish
2023/11/26 15:30:01 ERROR : Encrypted drive 'secret:backup': not deleting files as there were IO errors
2023/11/26 15:30:01 ERROR : Encrypted drive 'secret:backup': not deleting directories as there were IO errors
2023/11/26 15:30:01 ERROR : Attempt 1/1 failed with 1 errors and: File name in UTF8 must be no more than 1024 bytes (400 bad_request)
2023/11/26 15:30:01 INFO  : 

By doing separate rclones of the first- and second-level directories, I've been able to complete 99% of the rclone (important for reducing stress about stale backups) and narrow the problem down to a specific subtree.

Interestingly, the offending subtree happens to be the same subtree where the rclone is running and writing its log. Since this rclone used to work just fine, it can't be anything inherent about this self-reference causing the problem. Nevertheless, it's curious. I'll keep recursing into the subtrees to find the problem unless anyone has a better idea.


It looks like that is the problem. B2 has returned a 400 error, which I guess is where that error is coming from.

If you do --dump bodies --retries 1 --dry-run, then you'll see the bad file name in that request.

I added --dry-run as you don't want the actual file contents in your log.
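In other words, something like this, again substituting your own sync command (the destination here is just the one shown in the log above):

rclone sync /mnt/NADODATA secret:backup -vv --dump bodies --retries 1 --dry-run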

That was the trick. Got it!

2023/11/27 19:02:52 ERROR : BACKBLAZE2/root/mnt/NADODATA/backup/jails/BACKBLAZE2/root/mnt/NADODATA/backup/jails/BACKBLAZE2/root/mnt/NADODATA/backup/jails/BACKBLAZE2/root/mnt/NADODATA/backup/jails/BACKBLAZE2/root/mnt/NADODATA/backup/hwsrv/var/lib/dpkg/info: error reading destination directory: File name in UTF8 must be no more than 1024 bytes (400 bad_request)

Clearly this is a copy loop of some sort, created by some botched command in the past, probably by me. That's why the second rule of sysadmining is: never do anything.

But this is easy enough to investigate, isolate, and fix, now that I know where the heck it is.

Thank you @ncw for showing me the detailed arcana of the --dump flag to reveal these hidden error details. Maybe some future update of rclone could present these details more directly in the error message?

And thanks to @kapitainsky for your first reply.

Some lessons learned.

As previously concluded, the longest encrypted path isn't necessarily the longest unencrypted path, nor the path with the most segments. Some middling path with enough segments and long enough segment names can end up the longest after encryption. That's why the plaintext scans didn't reveal anything.

I'm keeping these multiple rclones spread across the top level. They seem like a win and are zero work now that I've written a script to do them. Until rclone someday develops the ability to work past this long-filename error, the script can somewhat work around it by running rclone separately on each top-level subdirectory and reporting any error for human investigation. That will help buy time should this issue, or a different one, come up in future years. I think that's good enough in my situation to leave it at that and let me go back to doing nothing.

LOG_LEVEL=INFO

# Run a separate rclone sync for each first-level subdirectory of $SOURCE
# (set elsewhere), logging each run to its own file.
for PATHNAME in "$SOURCE"/*/; do
    echo "**** Begin RCLONE subdirectory: $PATHNAME"
    SUBFOLDER=$(basename "$PATHNAME")
    LOG="/var/log/rclone/rclone-$SUBFOLDER.log"
    rclone sync --skip-links --b2-hard-delete --modify-window 1s \
      --fast-list --min-age 15m --log-level "$LOG_LEVEL" \
      --log-file "$LOG" "$SOURCE/$SUBFOLDER" "secret:$SUBFOLDER"
    if [ "$?" -ne 0 ]; then
        # On failure, show the tail of the log and email the whole thing.
        tail "$LOG"
        echo "ERROR in $SUBFOLDER"
        mail -s "ERROR IN RCLONE" adminemergency@mydomain.com < "$LOG"
    else
        tail -6 "$LOG"
        echo "Completed OK"
    fi
    echo "**** End"
    echo ""
done

Again, thanks for your help.

C

Thanks for sharing your script! It can be handy.

PS: It appears this path of doom is not on my source filesystem; it's on the remote. Nothing like it exists on my source. It must've been created on the remote by some botched copy loop in the past.

That would also explain why scanning the source tree for long filenames didn't turn up anything.

Indeed, I can see the copy loop here on the remote:

# rclone lsd secret:backup/jails/BACKBLAZE2/root/mnt/NADODATA/backup/jails/BACKBLAZE2
          -1 2000-01-01 00:00:00        -1 root

At some point in the past, /mnt was included in a copy that created an endless directory loop on the remote.

I'm running the rclone purge command now. Boy is it slow. I guess each leaf in this immense errant subtree needs to be unlinked individually.
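For the record, the purge amounted to something like this, with the path taken from the lsd output above (a --dry-run pass first is a sensible precaution):

rclone purge --dry-run secret:backup/jails/BACKBLAZE2   # preview what would be deleted
rclone purge secret:backup/jails/BACKBLAZE2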

:rofl:

You would have found it with the lsf command if it were on the source.

Did it work OK?

Yep. The purge took an hour or so, but the /mnt/path/of/doom/path/of/doom/path/of/doom/path/of/doom.. loop was eventually deleted. With that gone, rclone seems to be running fine.

Thanks again for the help.

