Filename too long

I did some tests and it turns out that the culprit isn't crypt on its own, but crypt when it's run together with chunker.

config is below:

This contains a crypt remote with a 100-char password and one with a 3-char password.

[local]
type = local

[crpt100-local]
type = crypt
remote = local:
filename_encryption = standard
directory_name_encryption = true
password = xxx100
password2 = xxx100

[chnk-crpt100-local]
type = chunker
remote = crpt100-local:
chunk_size = 1M
hash_type = md5

[crpt3pw-local]
type = crypt
remote = local:
filename_encryption = standard
directory_name_encryption = true
password = xxx3
password2 = xxx3

[chnk-crpt3pw-local]
type = chunker
remote = crpt3pw-local:
chunk_size = 1M
hash_type = md5

The source leaf file is:

357A75YK6EO4IO3E3EY434VXHBHQJEEJG7HOBP3UZSY7ZU2JRXZX37X4YIFPNPKHB6TPVKNJMZW5NDMJ2266P47QJVFFAIXB6JC4ZB5F5LLS6VQ=

This is 112 chars long.

When it goes through the 100-char crypt WITHOUT the chunker, there are no errors, as shown below:

2019/11/04 22:54:27 INFO : d/33/BZTFK2NX4C4OQVFZZ7CQOKY2HCUVIR/357A75YK6EO4IO3E3EY434VXHBHQJEEJG7HOBP3UZSY7ZU2JRXZX37X4YIFPNPKHB6TPVKNJMZW5NDMJ2266P47QJVFFAIXB6JC4ZB5F5LLS6VQ=: Copied (new)

When the same source file goes through the 3-char crypt WITH chunker, we get the LONG filenames that cause the errors, as shown below:

2019/11/04 22:56:36 ERROR : d/33/BZTFK2NX4C4OQVFZZ7CQOKY2HCUVIR/357A75YK6EO4IO3E3EY434VXHBHQJEEJG7HOBP3UZSY7ZU2JRXZX37X4YIFPNPKHB6TPVKNJMZW5NDMJ2266P47QJVFFAIXB6JC4ZB5F5LLS6VQ=: Failed to copy: open /mnt/data/sync/test-data-cyrptom/mrqupgirc9apqh4g34durgucnc/v16cb0h0k313v72o884aubs2nk/4rf2tlgbh9188jkssacnc3qfb8/bl481m1isan9v7c805riiakrv32honddfikc3rbs6gl27g0a67k0/mi46msgd6l3u161tukc1na4p9a1euih56d8toghq9rih1v92450efmbpnja1co0786s3gs5ngqvhpqe3tflikmo05bsq5n6egqunmcm84dsat0f112j3g63874eihkbcnfgvqkcogvhalnr531gu2ltos9p3nhada8df8h0356lmnm2otbnd74g37d95qhf8h78i0v325qcv5u9der1u88ve6hlk3r60ajnvqdk7eccbfonriqhdjllaq3ch4h3p: file name too long

The resulting encrypted leaf name is 256 chars long. Other files are longer.

The command used for the 3-char password crypt with chunker:

2019/11/04 22:56:28 DEBUG : rclone: Version "v1.50.0-016-g7bf05631-beta" starting with parameters ["rclone" "sync" "/mnt/data/sync/data-cryptom/" "chnk-crpt50-local:50/" "--fast-list" "-P" "-vv" "--log-file" "50.log"]

and for the 100-char password crypt without chunker:

2019/11/04 22:54:23 DEBUG : rclone: Version "v1.50.0-016-g7bf05631-beta" starting with parameters ["rclone" "sync" "/mnt/data/sync/data-cryptom/" "crpt100-local:100/" "-vv" "-P" "--log-file" "100.log"]

Maybe it's the additional information that the chunker adds to the filename, although it still shouldn't be that long?

Did a further test with chunker over crypt.

Original leaf name is 124 chars:

3TSFEBDYIT6R2HUB2UB23QD4LOOFLPVLW7GG6QGTCH5CSVYWM2MN6H6HFY6JDBX742Y5Z5KUVBKTD6QOOL7JJY4PF44ZTUQIQJBUAP2O7WBC5NUWC27HWIX7WYBQ

Reducing the chunk suffix to 3 chars (-##) results in an encrypted filename of 256 chars:

rclone sync /mnt/data/sync/data-cryptom/ chnk-crpt50-local:50b/ --fast-list -P --log-file 50b.log --chunker-name-format *-##

2019/11/04 23:30:12 ERROR : d/45/RG7TCUP4XFCRDQ6A3PHP3YEVXQ2526/3TSFEBDYIT6R2HUB2UB23QD4LOOFLPVLW7GG6QGTCH5CSVYWM2MN6H6HFY6JDBX742Y5Z5KUVBKTD6QOOL7JJY4PF44ZTUQIQJBUAP2O7WBC5NUWC27HWIX7WYBQ====: Failed to copy: open /mnt/data/sync/test-data-cyrptom/c92hs35rokdkqqatmveibrl5vc/v16cb0h0k313v72o884aubs2nk/4gbe0393a49rfoodn06l10hek8/nvk41r1guk0jfengvs77bbu2eno1k243jvqtboq9g8vb8okcasb0/4gdtkkk8okq958pbt61vcvscr5irccp5gdm2pmbld0866muu9m6tuu5ucoq2hfbagqqaclsdu6gjl95fg2sv0qistcdg6j55i85kpgio5jhv3d3r7a91hvi50tchg2meqsj8b0k9163vakgdcqkejio67cvdpvfuqovoe41l525af8efvao6qmpeehgekh30m82t6ru004n4ctojnmr7vekqr84n74s6fi61dv0us2n0e8csteaho7bs0debsurm: file name too long

Using the standard chunker suffix, the encrypted filename is 282 chars:

rclone sync /mnt/data/sync/data-cryptom/ chnk-crpt50-local:50/ --fast-list -P --log-file 50a.log

2019/11/04 23:34:54 ERROR : d/45/RG7TCUP4XFCRDQ6A3PHP3YEVXQ2526/3TSFEBDYIT6R2HUB2UB23QD4LOOFLPVLW7GG6QGTCH5CSVYWM2MN6H6HFY6JDBX742Y5Z5KUVBKTD6QOOL7JJY4PF44ZTUQIQJBUAP2O7WBC5NUWC27HWIX7WYBQ====: Failed to copy: open /mnt/data/sync/test-data-cyrptom/mrqupgirc9apqh4g34durgucnc/v16cb0h0k313v72o884aubs2nk/4gbe0393a49rfoodn06l10hek8/nvk41r1guk0jfengvs77bbu2eno1k243jvqtboq9g8vb8okcasb0/s4c2qptfsdob2lm6dihd7hu7qlshnu2kpa6pklm2mg02nseklfho6ag0qibv5afdg9t5bncjbqslv7babe438ebi7sqe9gl7ti835g2lq5p1dd7e8hvjpfehd6ac2b1sact5m8qa7nu2mtbgahqodqah0kkg1955l812m0u6foja93l1b25nfs2udgq2v5aarqpqqdejqfaud1c5handtjpnf9n9iojplsc6dg08ec6i98nj1d61e62sutumrbova2j2jhtospcnb6i4lusia71gug: file name too long

Thanks for the test. This makes more sense at least :slight_smile:

Yes the chunker does effectively add a little more to the name, but it's a few chars right? (is it 4?, 5?)
I can't quite see why that would make the encrypted leaf name jump up so drastically. The chars chunker appends to the name should be no different to any other chars (ie. a leaf-name that was 128char to begin with but not using chunker) - and that should still be well within your limits.

Unfortunately I have extremely limited practical experience with chunker yet. It is still very new and I have literally tested it once. I only really know the main principles it operates by. That kind of makes it difficult to say exactly what the issue might be if it relates to that backend.

It almost sounds like the names are getting translated to unicode in chunker and suddenly take up twice the space or something. There is just too much of an increase here to account for.

NCW gave you the formula for the encrypted name length above. If you input the number of chars in one of your chunker output file names (pre-encryption), does it come out right? I kind of expect it won't, based on your data.

If you jump over a 16 byte boundary then you'll suddenly get 16*8/5 = 25.6, i.e. about 26 more characters in your file name. That might explain it.
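The boundary effect can be checked with a short Python sketch. It assumes what the rclone crypt docs describe for standard filename encryption: each name segment is padded (PKCS#7) up to the next multiple of 16 bytes, encrypted, and then base32-encoded without padding. The function name `encrypted_len` is made up for illustration and is not part of rclone.

```python
BLOCK = 16  # cipher block size in bytes (assumption taken from the crypt docs)


def encrypted_len(plain_len):
    """Estimated length of one crypt-encrypted name segment.

    plain_len is the name length in bytes (equal to chars for ASCII names).
    PKCS#7 always adds 1..16 bytes of padding; unpadded base32 then emits
    one character per 5 bits, rounded up.
    """
    padded = (plain_len // BLOCK + 1) * BLOCK
    return (padded * 8 + 4) // 5  # integer ceil(padded * 8 / 5)


for n in (111, 112, 127, 128, 143, 144):
    print(n, "->", encrypted_len(n))
```

Under these assumptions, going from a 127-byte to a 128-byte name adds 26 characters (205 to 231), and 144 bytes is the first length that produces a 256-character name.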

Yes, but I can't reconcile that with a jump from 124char to 256char - assuming I am understanding the data above correctly.

The chunker adds only a few chars to the file name as far as I remember. rclone lsf in the right remote will show the names.

I've checked the chunker without crypt and it only adds ".rclone_chunk.###" to the right of each filename, i.e. 17 chars.

That doesn't explain the increase in filename length though?

It does seem like it shouldn't be able to account for all of it according to your data, no.
It will obviously increase the filename_size somewhat, but not so drastically as you describe.
At this point I don't really have any strong idea about what may be causing the problem.

Sidenote: is the postfix configurable? I haven't played around with chunker enough to know. If it is, then I'd consider setting something shorter even if it may not directly be the cause of this problem, because that seems unnecessarily long. I assume this is a "magic word" it looks for to identify a chunk, but aside from the numbers at the end that could probably be shortened to just 3-4 uncommonly used characters, making the postfix 6-7 chars instead of 17 to save on namespace.

EDIT: Apparently yes:
--chunker-name-format

The chunker backend wasn't developed by NCW, but contributed by Ivan Andreev, so you could perhaps ask him if he has any knowledge about this. I don't know him personally, and I don't know if he frequents the forum, but you can probably find him on github and make contact there. Assuming that this is a genuine bug in chunker that you can reproduce for him, he is likely to be interested in fixing it.

If you can list with rclone ls the remote immediately under the chunker remote it will show you the file names.

BTW I have this issue in my rclone cache-crypt mount too.
I think this is because my paths are really too long.

edit: 280 chars
regards

As in the above post, the default postfix adds 17 chars for chunker. I changed this to 3 chars, as it's configurable.

Original filename was 124 char.

  1. 124 + 3 char post fix = 127 char unencrypted = 256 encrypted chunker

  2. 124 + 17 char post fix = 141 char unencrypted = 282 encrypted chunker

Either way, it makes it impossible for me to use the encrypted chunker

Ah, I see the problem. It is that chunker generates intermediate names which are longer. I did a little experiment with a 127 char file name, and chunker tries to save this under a temporary name

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.rclone_chunk.001..tmp_1573122616

which encrypts to this 282 char file name

95ooh9l4s9urukjojv46tq2lgott5upqcuvolbanjpfej46458ltld8vck5pn1i7u0m4hi4j4q82nnq4rnbgun7tgqp35u9871sj5tcefued8dlc5b7q00er37vdt445n1365a9tjptlfgjqb8ukdpuk6ud6aiqdidhvvcv8j2dlebvpriebbe1kbgipnp4j65c4srk3jofs3pir4ibao71ejo9bkhaq1jh6o9li5huffschk4ncqpo56vs4j5opbbeohbfbuccnrud93o2h9dopr8

I used this config (with randomly generated passwords I don't care about :slight_smile:)

[crypt]
type = crypt
remote = /tmp/secret
filename_encryption = standard
password = igAQKcbmiGoYHJyj8-eWL-FxRZBQ19EIqZTC
password2 = UhLYvwXume37V_w8sR-iaT7k4nTofzjWNUjC

[chunker]
type = chunker
remote = crypt:
chunk_size = 1M
hash_type = md5

And created a 2MB file whose name was 127 "a" characters and tried to copy it in

$ rclone --config /tmp/rclone.config copy -vv aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa chunker: --retries 1 --low-level-retries 1
2019/11/07 10:35:44 DEBUG : rclone: Version "v1.50.1-024-ge557586e-mount2-v2-beta" starting with parameters ["rclone" "--config" "/tmp/rclone.config" "copy" "-vv" "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" "chunker:" "--retries" "1" "--low-level-retries" "1"]
2019/11/07 10:35:44 DEBUG : Using config file from "/tmp/rclone.config"
2019/11/07 10:35:45 DEBUG : aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa: Need to transfer - File not found at Destination
2019/11/07 10:35:45 ERROR : aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa: Failed to copy: open /tmp/secret/mtf0f7hst60i7onogcudk3blf9dp3c18k0vqaoj9gqvpd51q99rkvdphqirrakdt1tpii6rd9em1jn24ghv58f4chdncu92p7q0etdeuk859d837k8thu50amh6mnotb3eh0g09l9b85epk4g0ij643m3nidm5adnsglj1hgclr7ulf5q7qu56krjl75ve91ob7hh97m6ihttc66cp54f7s5vd3neloj5vkoql5b18h9h79ju8motovei30dern4do63f60gu1g5d223jpoph43a64: file name too long
2019/11/07 10:35:45 ERROR : Attempt 1/1 failed with 3 errors and: open /tmp/secret/mtf0f7hst60i7onogcudk3blf9dp3c18k0vqaoj9gqvpd51q99rkvdphqirrakdt1tpii6rd9em1jn24ghv58f4chdncu92p7q0etdeuk859d837k8thu50amh6mnotb3eh0g09l9b85epk4g0ij643m3nidm5adnsglj1hgclr7ulf5q7qu56krjl75ve91ob7hh97m6ihttc66cp54f7s5vd3neloj5vkoql5b18h9h79ju8motovei30dern4do63f60gu1g5d223jpoph43a64: file name too long
2019/11/07 10:35:45 Failed to copy with 3 errors: last error was: open /tmp/secret/mtf0f7hst60i7onogcudk3blf9dp3c18k0vqaoj9gqvpd51q99rkvdphqirrakdt1tpii6rd9em1jn24ghv58f4chdncu92p7q0etdeuk859d837k8thu50amh6mnotb3eh0g09l9b85epk4g0ij643m3nidm5adnsglj1hgclr7ulf5q7qu56krjl75ve91ob7hh97m6ihttc66cp54f7s5vd3neloj5vkoql5b18h9h79ju8motovei30dern4do63f60gu1g5d223jpoph43a64: file name too long

You can then use cryptdecode to work out what the file name is

$ rclone cryptdecode --config /tmp/rclone.config crypt: mtf0f7hst60i7onogcudk3blf9dp3c18k0vqaoj9gqvpd51q99rkvdphqirrakdt1tpii6rd9em1jn24ghv58f4chdncu92p7q0etdeuk859d837k8thu50amh6mnotb3eh0g09l9b85epk4g0ij643m3nidm5adnsglj1hgclr7ulf5q7qu56krjl75ve91ob7hh97m6ihttc66cp54f7s5vd3neloj5vkoql5b18h9h79ju8motovei30dern4do63f60gu1g5d223jpoph43a64
mtf0f7hst60i7onogcudk3blf9dp3c18k0vqaoj9gqvpd51q99rkvdphqirrakdt1tpii6rd9em1jn24ghv58f4chdncu92p7q0etdeuk859d837k8thu50amh6mnotb3eh0g09l9b85epk4g0ij643m3nidm5adnsglj1hgclr7ulf5q7qu56krjl75ve91ob7hh97m6ihttc66cp54f7s5vd3neloj5vkoql5b18h9h79ju8motovei30dern4do63f60gu1g5d223jpoph43a64 	 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.rclone_chunk.001..tmp_1573122945
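The lengths in this experiment can be reproduced with a small Python sketch. It assumes the crypt length rule from the rclone docs (PKCS#7 padding to 16 bytes, then unpadded base32) and takes the suffix shapes from the temporary names shown above. The helper names are made up for illustration.

```python
BLOCK = 16  # cipher block size in bytes (assumption taken from the crypt docs)

CHUNK_SUFFIX = ".rclone_chunk.001"  # default chunk suffix seen in the thread
TMP_SUFFIX = "..tmp_1573122616"     # temporary suffix with a 10-digit timestamp


def encrypted_len(plain_len):
    """Estimated crypt name length: pad to 16 bytes, then unpadded base32."""
    padded = (plain_len // BLOCK + 1) * BLOCK
    return (padded * 8 + 4) // 5  # integer ceil(padded * 8 / 5)


def temp_name(leaf, chunk_suffix=CHUNK_SUFFIX, tmp_suffix=TMP_SUFFIX):
    """Name that chunker hands to crypt while a chunk is being uploaded."""
    return leaf + chunk_suffix + tmp_suffix


# The 127-char test name from the experiment above.
experiment = temp_name("a" * 127)
print(len(experiment), encrypted_len(len(experiment)))  # 160 282

# A stand-in for the 112-char leaf from the first post.
first_post = temp_name("x" * 112)
print(len(first_post), encrypted_len(len(first_post)))  # 145 256
```

So the 17-char chunk suffix is only half of the overhead. The 16-char temporary suffix is what pushes these names over the next 16-byte boundary.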

@lawmanuk can you please make a new issue on github about this with a link to this forum discussion and I'll see if the author of the chunker backend can come up with some ideas - thanks.

Done.

Separately, is there a way to shorten names on encryption, rather than lengthen them even a bit, which creates potential for errors?

That way, there could never be a crypt error due to filenames (separate from the chunker issue). I know some other projects like cryfs / cryptomator shorten the filenames on purpose. As you said, you don't want to create an index file (although I support any solution which helps to tighten security of metadata etc.), but I was wondering if there is any way to use some kind of compression algorithm to reduce filename length during crypt? Long shot, I know.

There's something else going on here, unrelated to chunker.

I did a normal crypt WITHOUT chunker. Password was 100 char but that shouldn't affect anything.

unencrypted leaf file = 155 chars
N16(1) General form of injunction for interim application or originating application (formal parts - see complete N16 for wording of operating clauses).doc

encrypted filename = 256 chars
6ler7m5ckoj3u8637387v5er3culocvn9mga98cotqpo62vtg6hulhmcmql128b0cgcbgt3hjlsvend1v4c393vsrga4c1bj4d2umjif7jap3jac43sslc4bank2rvh8jpg5grnp7vntl242qmbhif75oi77si73cv4gsl5138fo058rpojfuhoo5rb0fgf73mem2dus9mm1dudf0iijn7rl27inrctshc9tujq6io2arcjljo29ou2j838adtap

Not sure why, as no chunking involved.
2019/11/08 22:52:14 ERROR : law/public prosecutions/ASBO - Civil Injunctions/N16(1) General form of injunction for interim application or originating application (formal parts - see complete N16 for wording of operating clauses).doc: Failed to copy: open /mnt/data/sync/test-data-cyrptom/gegoo0eeq2bmkda326m3klb19k/fiika42v42dinkc00g8i3dj6ug/ngll69cm12c74qe9hckum0tp2eib15ep9na52b7ikrt5j5u43kcg/780j9mf7agf8l65m15rk03lotj5equgmcialibnbiol6afcmeag0/6ler7m5ckoj3u8637387v5er3culocvn9mga98cotqpo62vtg6hulhmcmql128b0cgcbgt3hjlsvend1v4c393vsrga4c1bj4d2umjif7jap3jac43sslc4bank2rvh8jpg5grnp7vntl242qmbhif75oi77si73cv4gsl5138fo058rpojfuhoo5rb0fgf73mem2dus9mm1dudf0iijn7rl27inrctshc9tujq6io2arcjljo29ou2j838adtap: file name too long
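A quick check of this case in Python, as a sketch only. It assumes the crypt length rule from the rclone docs (PKCS#7 padding to 16 bytes, then unpadded base32) and a 255-byte per-name limit on the local filesystem. `encrypted_len` and `max_plain_len` are illustrative names, not rclone functions.

```python
BLOCK = 16   # cipher block size in bytes (assumption taken from the crypt docs)
LIMIT = 255  # assumed per-name limit of the underlying filesystem


def encrypted_len(plain_len):
    """Estimated crypt name length: pad to 16 bytes, then unpadded base32."""
    padded = (plain_len // BLOCK + 1) * BLOCK
    return (padded * 8 + 4) // 5  # integer ceil(padded * 8 / 5)


def max_plain_len(limit=LIMIT):
    """Longest plain name whose encrypted form still fits within limit."""
    n = 0
    while encrypted_len(n + 1) <= limit:
        n += 1
    return n


name = "N16(1) General form of injunction for interim application or originating application (formal parts - see complete N16 for wording of operating clauses).doc"

print(len(name))                 # 155
print(encrypted_len(len(name)))  # 256
print(max_plain_len())           # 143
```

Under these assumptions, any plain name longer than 143 bytes already encrypts to more than 255 characters, with or without chunker.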

My bad. When coding chunker, I only cared about the full path+filename limits imposed by the linux/windows VFS layer (I believe it's 4096 on linux kernels) but completely forgot about the filename (without path) limits imposed by filesystems, and didn't know about crypt limits at all.

I admit that 16 chars for temporary suffix ("..tmp_1234567890") is too much. My excuse is it's just first public release. So with current design and default settings chunker requires 34 chars (".rclone_chunk.001..tmp_1234567890"). This leaves 255-34=221 for chunker-over-disk files and 143-34=109 for chunker-over-crypt.
I think the length can be reduced to 7 chars ("_abcxyz") squashing 9 (still need underscore or something for filename matcher to separate temporary part from chunk number).

Basic chunk name format is configurable so a user in need can change it to eg "*.rcc##" and squash 10 chars more (provided there are no more than 99 chunks per file but you can configure chunk size to avoid that). This leaves 255-13=242 for normal users and 143-13=130 for long but secret file names.

You could remove the "rcc" suffix altogether and keep ".#" only but then chunker will treat anything named "file.doc.1" as a chunk of "file.doc" so you should know what you are doing.
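The ambiguity is easy to see with a toy matcher in Python. This is not chunker's real matching code. It is a sketch that assumes `*` stands for the base file name and a run of `#` for a chunk number with at least that many digits.

```python
import re


def chunk_pattern(name_format):
    """Turn a chunker-style name format into a regex (illustrative only).

    '*' becomes the base file name and a run of k '#' becomes "k or more
    digits". Both readings are assumptions of this sketch.
    """
    parts = []
    for token in re.findall(r"\*|#+|[^*#]+", name_format):
        if token == "*":
            parts.append(r"(?P<base>.+)")
        elif token.startswith("#"):
            parts.append(r"(?P<num>\d{%d,})" % len(token))
        else:
            parts.append(re.escape(token))
    return re.compile("^" + "".join(parts) + "$")


bare = chunk_pattern("*.#")
tagged = chunk_pattern("*.rcc##")

# With the bare format an ordinary file looks like a chunk of "file.doc".
print(bare.match("file.doc.1").group("base"))         # file.doc
# With a tag in the format the same file is left alone.
print(tagged.match("file.doc.1"))                     # None
print(tagged.match("file.doc.rcc01").group("base"))   # file.doc
```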

I will now think about a simplistic random number generator which avoids name clashes and still fits in 6 base-36 digits without extra accesses to remote directory listing. The patch will also add a configuration setting to let users of rclone v1.50 roll back to the previous way of temporary naming in case they start seeing new temporaries as normal files on a shared storage but can't immediately upgrade rclone.
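To give a feel for the size of such a suffix, here is a Python sketch of one way to produce an underscore plus 6 base-36 digits. It is not the planned patch. It simply draws a random number, and the names `to_base36` and `temp_suffix` are hypothetical.

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_lowercase  # 36 symbols: 0-9, a-z
WIDTH = 6                                          # base-36 digits in the suffix


def to_base36(value, width=WIDTH):
    """Encode value as exactly `width` base-36 digits, zero padded."""
    if not 0 <= value < 36 ** width:
        raise ValueError("value does not fit in the requested width")
    digits = []
    for _ in range(width):
        value, rem = divmod(value, 36)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def temp_suffix():
    """Hypothetical 7-char temporary suffix: '_' plus 6 random digits."""
    return "_" + to_base36(secrets.randbelow(36 ** WIDTH))


print(36 ** WIDTH)    # 2176782336 possible values
print(temp_suffix())  # random on every call
```

With roughly 2.2 billion possible values, clashes between temporaries of the same file are unlikely, though a random draw alone cannot rule them out.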

I'm not sure what can be done about crypt'ed filenames. Probably someone in the community will come up with a patch augmenting "filename_encryption=standard" with new "filename_encryption=weak_but_shorter". Rclone is a free community driven project, contributions are always welcome.

And I would appreciate it if the topic starter could take a minute and say a few words about their use case. Chunker will not normally kick in unless the file size exceeds chunk size, which should normally be more than some storage limit, eg 100 MB on box.com or 2 GB on fat32. Do you really have lots of, say, 5 GB files with 160-char names? I would believe a large 5 GB file with a very long, very secret name like "me_naked_riding_a_cow_in_siberia_aahhhhh_99chars.2019-11-09_17-33.naked-cow-travel-boy-river-fun-tags.HiDPIx1024.MOV" is not a mainstream case or can be easily solved manually.

PS I think file name limits (not only in crypt or chunker) call for a new "tinyname" overlay in rclone that will map long file names to shortened or random strings using a configurable index in local fs. This could also enrich other metadata besides filenames, eg add hashsum support to remotes lacking it or serve as a generic metadata container like file tags or descriptions. Contributors wanted.
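A rough Python sketch of the mapping idea, to make it concrete. Nothing like this exists in rclone as far as this thread goes. The class name `TinyNameIndex`, the 16-char hash prefix and the in-memory dict (standing in for the "configurable index in local fs") are all assumptions.

```python
import hashlib


class TinyNameIndex:
    """Hypothetical long-name to short-name index, kept in memory only."""

    def __init__(self, max_len=32, short_len=16):
        self.max_len = max_len      # names up to this length pass through
        self.short_len = short_len  # starting length of the short name
        self.short_to_long = {}

    def shorten(self, name):
        """Return the name to store on the remote for `name`."""
        if len(name) <= self.max_len:
            return name
        digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
        size = self.short_len
        short = digest[:size]
        # Lengthen the prefix if a different long name already owns it.
        while self.short_to_long.get(short, name) != name:
            size += 1
            short = digest[:size]
        self.short_to_long[short] = name
        return short

    def restore(self, stored):
        """Map a stored name back to the original (unknown names pass through)."""
        return self.short_to_long.get(stored, stored)


index = TinyNameIndex()
long_name = "a" * 127 + ".rclone_chunk.001"
short = index.shorten(long_name)
print(len(short), index.restore(short) == long_name)  # 16 True
```

A real overlay would have to persist the index and keep it in sync between machines, which is the hard part this sketch leaves out.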

Hi,

Thanks for response.

I use rclone with crypt to sync 2 machines (only 1 online at any one time) via a cloud store in the middle.

I'm not using chunker for 5gb files etc. Just using it to remove metadata of file size of a normal backup (so set at 1M) with random data files eg. pdf, ms-word docs, videos, books, excel sheets - so the files aren't usually very big.

Originally intended to use rclone to sync the data (one way for now as 2 way not available). I've played around with using the original data as source, or various systems which reduce metadata for cloud storage (eg. cryfs, cryptomator) - then syncing those. The source examples I've given above are cryptomator files.

For now I have abandoned chunker and the setups above because of the filename length problems, and use the approach below instead.

As rclone doesn't have compression, deduplication or file joining (to increase everything to 1M size) - I've decided to use borg backup (https://www.borgbackup.org/) to create the actual backup as it has all above functions and creates compressed (zstd 9 very high!), deduplicated (amazing), flattened directory structure, backup with all files to a set size (eg. 1M or 5M chunks I ended up with).

Borg doesn't have sync, so I then use rclone crypt to sync this backup to the cloud, and sync down to other computer when needed.

Although a 2 step process (until I can find 1 program that can do it all), it can run smoothly from a script and achieves:

  1. Sync
  2. Compression
  3. Deduplication
  4. Complete removal of all metadata (no filenames, no directory structure, no file sizes)

Not using restic (which can backup to rclone cloud), as it doesn't have sync option/compression/or inbuilt chunker.

For subsequent backups, borg does this extremely quickly with an incremental process that's amazing, as internally it's not working with files but with chunks of data only.

Fair enough. Thank you for input.
Just a few comments. AFAIK:

  • borg can encrypt backups so "rclone crypt" is an overkill
  • restic is able to chunk since it uses basically the same Rabin-fingerprint based floating-window chunking algorithm as borgbackup; restic is generally less mature however

true re overkill... but....

  1. rclone is being used for sync anyway, so no additional effort to add crypt which helps to obfuscate which system is being used. eg. borg or some other, as filenames are encrypted. Not knowing which system is being used removes any starting point for attacker.

  2. If rclone header is obfuscated in later editions that will assist further.

  3. restic doesn't have any compression.

thanks

ok
btw if 2 machines were allowed to run online simultaneously, you could try https://github.com/syncthing/syncthing

syncthing is amazing and I used to use it when they were both on. Now I only keep one on at a time, in different locations.