Remove metadata leaks

Hi,

It would be great if rclone could remove the following data leaks:

  1. Directory Structure
  2. File sizes
  3. Maybe less doable, but remove the identifying rclone header in each encrypted file.

This could be done by:

  1. Flattening the directory structure, similar to how CryFS or Cryptomator do.

  2. Hiding file sizes by specifying a fixed file size. E.g. 10M would:
    a. Merge all small files into 10M files
    b. Split all larger files into 10M pieces (I know chunker can do this already).
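
To make (2) a bit more concrete, here is a minimal sketch of the idea in Python. It is not anything rclone does today; `plan_blocks` and the extent tuples are made-up names for illustration. It treats all files as one stream and cuts it into fixed-size blocks, so small files get merged and large files get split in the same pass. The last block would be padded up to the full size when written.

```python
from typing import Dict, List, Tuple

# (file name, offset within that file, number of bytes) - hypothetical layout
Extent = Tuple[str, int, int]


def plan_blocks(files: Dict[str, int], block_size: int) -> List[List[Extent]]:
    """Assign byte ranges of files to fixed-size blocks.

    `files` maps a file name to its size in bytes. Each returned block is a
    list of extents whose lengths sum to at most `block_size`.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")

    blocks: List[List[Extent]] = []
    current: List[Extent] = []
    free = block_size  # bytes still unused in the current block

    for name, size in files.items():
        offset = 0
        while offset < size:
            take = min(free, size - offset)
            current.append((name, offset, take))
            offset += take
            free -= take
            if free == 0:  # block is full, start a new one
                blocks.append(current)
                current = []
                free = block_size

    if current:
        # Partially filled last block; it would be padded to block_size on write.
        blocks.append(current)
    return blocks
```

Note that empty files produce no extents here, so they would have to be recorded somewhere else (e.g. in whatever index maps names to blocks).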

Combined with crypt, this would give rclone all the features of both Cryptomator and CryFS, with many other features besides.

Even if (3) is too big a task, (1) and even more so (2) would be really helpful for security purposes.

Thanks

I believe rclone had flattening early on - but it was removed. I think it had something to do with limitations of certain cloud backends making that approach incompatible. You would have to get an answer from @ncw for the details I think (and I would be interested to know too).

It would be nice if we could have it as an optional feature for those backends that can support it - but it may be that this would break a lot of other code and require a lot of functions to basically be maintained twice to support both. I don't know. Maybe NCW can elaborate, as it's a fascinating topic and I wasn't around back when this was supposedly still a thing.

About file sizes:
Encrypted files are already not the exact same size as the original. They are slightly longer, and I don't think you can trivially calculate what the real size is without having the crypt key. But of course they are very obviously "about" the same length... that info is pretty hard to misuse though when it's not exact.
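
For a rough idea of how close the sizes are, here is a small sketch. It assumes the crypt file layout as I understand it: a 32 byte file header, then the data in blocks of up to 64 KiB, each with a 16 byte authenticator in front. Those constants are my assumption, so check them against the crypt docs before relying on them. If that layout is right, the overhead depends only on the length, so the original size could be worked back from the encrypted size after all.

```python
# Assumed crypt layout constants (not verified here; treat as illustrative)
HEADER_SIZE = 32             # file header: magic string plus nonce
BLOCK_DATA_SIZE = 64 * 1024  # plaintext bytes per block
BLOCK_OVERHEAD = 16          # authenticator bytes added to every block


def encrypted_size(plain_size: int) -> int:
    """Size of the encrypted file for a plaintext of `plain_size` bytes."""
    if plain_size < 0:
        raise ValueError("size must not be negative")
    blocks = -(-plain_size // BLOCK_DATA_SIZE)  # ceiling division
    return HEADER_SIZE + plain_size + blocks * BLOCK_OVERHEAD


def plain_size_from_encrypted(enc_size: int) -> int:
    """Inverse of encrypted_size under the same assumed layout."""
    body = enc_size - HEADER_SIZE
    if body < 0:
        raise ValueError("too short to contain a header")
    full_blocks, rest = divmod(body, BLOCK_DATA_SIZE + BLOCK_OVERHEAD)
    if rest == 0:
        return full_blocks * BLOCK_DATA_SIZE
    if rest <= BLOCK_OVERHEAD:
        # A trailing block must hold at least one byte of data.
        raise ValueError("not a valid encrypted size")
    return full_blocks * BLOCK_DATA_SIZE + rest - BLOCK_OVERHEAD
```

Under these assumptions a 1 byte file comes out as 49 bytes, and the overhead stays well below 0.1% for large files.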

For large files you know about chunker...
But I absolutely agree that a "merger" would be really nice for small files (and I have some ideas on how to implement it I will pitch to NCW). I am personally more interested in the performance potential of it than the security benefit - as many/most cloud backends struggle in performance on many tiny files. Transferring them as one larger unit would be very beneficial in a lot of cases.

I gave you the link to the issues page in the other thread if you want to formalize any of these ideas and add more detail to them. And should you know how to program, then by all means make a pull request and have a go at it. The code is open to contributions and NCW will assist where needed. He is always very keen on help as he has a ton on his to-do list already for this project.

This is possible, but requires keeping index files which need to be kept in sync. This makes things much more complicated and fragile - if the index is missing your files may be lost.
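
A toy sketch of why the index matters (Python, purely illustrative; `pack` and `read_file` are made-up names, not rclone functions). Several files are concatenated into one blob, and the only record of where each file starts and how long it is lives in the index:

```python
import json
from typing import Dict, Tuple


def pack(files: Dict[str, bytes]) -> Tuple[bytes, str]:
    """Concatenate files into one blob and return (blob, index as JSON)."""
    blob = bytearray()
    index = {}
    for name, data in files.items():
        # The index is the only place the name, offset and real size are stored.
        index[name] = {"offset": len(blob), "size": len(data)}
        blob += data
    return bytes(blob), json.dumps(index)


def read_file(blob: bytes, index_json: str, name: str) -> bytes:
    """Recover one file from the blob; raises KeyError if it is not indexed."""
    entry = json.loads(index_json)[name]
    start = entry["offset"]
    return blob[start:start + entry["size"]]
```

Without the index the blob is just one run of bytes with no file boundaries or names, and a stale index (out of sync with the blob) would return the wrong bytes without any error.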

There are ideas about this in the rclone issues.

Doing this would require the index file to read the actual size from.

Actually this would be easy! The 6 character "RCLONE" string could be configurable by the user. There needs to be a "magic" string at the start but it could be user definable.
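
Roughly what that could look like, as a sketch only (Python rather than rclone's Go; the function names and the 24 byte nonce length are assumptions for illustration, and no such option exists yet):

```python
import os

NONCE_SIZE = 24  # assumed nonce length, for illustration only


def make_header(magic: bytes, nonce: bytes) -> bytes:
    """Build a file header from a user-chosen magic string and a nonce."""
    if not magic:
        raise ValueError("magic string must not be empty")
    if len(nonce) != NONCE_SIZE:
        raise ValueError("nonce has the wrong length")
    return magic + nonce


def parse_header(data: bytes, magic: bytes) -> bytes:
    """Check the magic string and return the nonce that follows it."""
    if not data.startswith(magic):
        raise ValueError("magic string does not match")
    nonce = data[len(magic):len(magic) + NONCE_SIZE]
    if len(nonce) != NONCE_SIZE:
        raise ValueError("header is truncated")
    return nonce


# Example: a header written with a custom magic string
nonce = os.urandom(NONCE_SIZE)
header = make_header(b"MYFILE", nonce)
```

The reader has to be configured with the same magic string as the writer, so files written with the default string and files written with a custom one would not be interchangeable.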

That does actually seem like a good security feature then, because I'm sure lots of other encryption schemes use base32 - and not knowing for sure which scheme is even used would be valuable obfuscation...

I second that. It would be good to make that header user-definable, if not drop it completely.
