My second rclone project too long filenames crypt

I have a few thoughts and even an idea for a "name shortener" remote (but I lack the Go experience. If only rclone were a Python tool...)

Thoughts:

  • You say it is a hashsum. A hashsum of what? The file name, or the file contents?
  • What about subdirs?
  • What happens if you write to remote: on computerA and then read on computerB? That would break a lot of workflows!

My Idea

This is just rough, but here is one idea I had for a name-shortener remote. What it does is take every name (each path component, for directories), hash it, and keep the first 10 or so characters of the hash (in base32). It then writes a sidecar file containing the original name.

When writing to a remote, it has to write the sidecar and the file.

When reading, it has to read the sidecar. However, you can speed that up with a local database (maybe LRU-based?) of hashes and names. This is just to speed it up. If it gets nuked, the ground truth is still on the remote.
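A minimal sketch of that read path, in Python since that is what I know. Everything here is hypothetical: the NameCache class, the read_sidecar callable (a stand-in for whatever actually fetches a ".name" file from the remote), and the max_entries limit. It also keeps the cache in memory only, where a real version would want something persistent on disk.

```python
from collections import OrderedDict


class NameCache:
    """Small in-memory LRU cache mapping shortened paths to original names."""

    def __init__(self, read_sidecar, max_entries=10000):
        # read_sidecar is a hypothetical callable: given "<short path>.name",
        # it returns the sidecar content (the original name) from the remote.
        self._read_sidecar = read_sidecar
        self._max_entries = max_entries
        self._entries = OrderedDict()

    def original_name(self, short_path):
        if short_path in self._entries:
            # Cache hit: mark as most recently used, no remote access needed.
            self._entries.move_to_end(short_path)
            return self._entries[short_path]
        # Cache miss: the remote is the ground truth, so read the sidecar.
        name = self._read_sidecar(short_path + ".name")
        self._entries[short_path] = name
        if len(self._entries) > self._max_entries:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return name
```

Throwing the cache away only costs extra sidecar reads; nothing is lost because every answer can be rebuilt from the remote.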

So,

this-is-a-long-directory-name-so-deal-with-it/short/medium_dir/redicoulous-filename-with-lots-of-information-such-as-date-20211225T122500.47854778.ext

will encode to:

7revsnsh3c/ud2ou7mrjf/t4u3eolo6y/idq5svfuhi

and will create (if they don't exist) the following: (<filename> : <content>)

7revsnsh3c.name : this-is-a-long-directory-name-so-deal-with-it
7revsnsh3c/ud2ou7mrjf.name : short 
7revsnsh3c/ud2ou7mrjf/t4u3eolo6y.name : medium_dir
7revsnsh3c/ud2ou7mrjf/t4u3eolo6y/idq5svfuhi.name : redicoulous-filename-with-lots-of-information-such-as-date-20211225T122500.47854778.ext
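The mapping above can be sketched roughly like this. Big assumption: I am using SHA-256 here just for illustration, so the short names it produces will not match the made-up ones in my example. short_name and encode_path are hypothetical helper names, and the sketch only computes the names and sidecar contents; it does not touch any remote.

```python
import base64
import hashlib


def short_name(name, length=10):
    """First `length` base32 characters of a hash of `name` (SHA-256 is an assumption)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return base64.b32encode(digest).decode("ascii").lower()[:length]


def encode_path(path, length=10):
    """Return (short_path, sidecars).

    sidecars maps each "<short prefix>.name" file to the original
    path component it stands for.
    """
    short_parts = []
    sidecars = {}
    for part in path.split("/"):
        # Each component is hashed on its own, directories included.
        short_parts.append(short_name(part, length))
        sidecars["/".join(short_parts) + ".name"] = part
    return "/".join(short_parts), sidecars


if __name__ == "__main__":
    short, sidecars = encode_path("some-long-directory-name/short/a-very-long-filename.ext")
    print(short)
    for filename, content in sidecars.items():
        print(filename, ":", content)
```

With only 10 base32 characters (50 bits) per component, collisions are unlikely but possible, so a real implementation would want to check the existing sidecar before assuming a short name is free.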

This way, each component is only 10 characters and the "truth" always lives on the remote, but once a directory has been listed (and as long as you don't change the cache dir), later lookups are super fast.