Over the last year @id01 and I have worked on a new overlay remote that implements transparent compression.
We now have a first beta available here
The remote implements three compression options: lz4, gzip and xz, for fast, balanced and strong compression respectively.
If you are interested please go ahead and test it. You can report any issues here or in the relevant pull request on github.
it looks like there are two files created in the remote for each file in the source, is that correct?
and if so, why is there a need for that second file?
the filename of the source file is mangled, instead of simply adding a file extension,
i would have expected only one file named dump1.bin.gz
on windows,
i tried to rclone mount to access the compressed remote. rclone.exe mount presstest01: p:
my system locked up so bad, i could not even run task manager, via ctrl+alt+delete.
and my taskbar crashed as well.
luckily i had task scheduler open, and was able to end the task.
using latest winfsp, v1.6
also, when the mount does work, i cannot copy files, getting errors like
2020/03/24 11:41:30 NOTICE: S3 bucket presstest path 01: Streaming uploads using chunk size 5M will have maximum file size of 48.828G
2020/03/24 11:41:30 DEBUG : 50files/dump1.binffffffffffffffff.gz: multipart upload starting chunk 1 size 1.000M offset 0/off
2020/03/24 11:41:30 DEBUG : 50files/dump1.binffffffffffffffff.gz: Size and modification time the same (differ by 0s, within tolerance 1ns)
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x28 pc=0xe86fa5]
Very nice! I've been keeping a close eye on this in github.
Not sure when I will realistically have time to test it out thoroughly as I have my hands full atm - but it's definitely going on my list.
Size-info is still stored in the name though right? Or did you manage to find a better solution for this?
i think that one big problem with the compressed remote is that the .gz file must contain the original filename.
if a file named dump1.bin is compressed into a .gz, the name of the file inside the .gz must not change.
as it is now, the filename is changed to dump1.bin0000100000000000
so as it is now, i must use rclone to uncompress that file.
I think the files should be cross-compatible.
But the current implementation apparently has to rely on storing the (real) filesize in the name.
So I think the only "damage" here is the name gets a little messed up at the end.
I don't love this solution either, but when we don't really have a reliable structure to store metadata like this centrally (which would be a large and difficult undertaking) you have to choose between some imperfect solutions. You could save this data in an accompanying file instead - but then you'd suddenly double the amount of files which would also not be ideal...
It's a hard problem is what I'm getting at, not just a silly design flaw.
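To make the trade-off above concrete, here is a minimal sketch of storing the real file size in the name. The layout is an assumption made for illustration only: a fixed-width suffix of 16 hex digits before the extension, with all `f` digits standing for "size unknown". The helper names `encode_name` and `decode_name` are hypothetical, and the remote's actual encoding may differ.

```python
import re

# Hypothetical layout: <original name> + 16 lowercase hex digits + <extension>.
# The width and the all-"f" marker for an unknown size are assumptions made
# for this sketch, not the remote's documented format.
SIZE_DIGITS = 16
UNKNOWN_SIZE = (1 << 64) - 1  # renders as ffffffffffffffff

_PATTERN = re.compile(r"^(?P<name>.+)(?P<size>[0-9a-f]{16})(?P<ext>\.[a-z0-9]+)$")


def encode_name(name: str, size: int, ext: str = ".gz") -> str:
    """Append the uncompressed size to the name as fixed-width hex."""
    if not 0 <= size <= UNKNOWN_SIZE:
        raise ValueError("size out of range")
    return f"{name}{size:0{SIZE_DIGITS}x}{ext}"


def decode_name(stored: str):
    """Split a stored name into (original name, size or None, extension)."""
    match = _PATTERN.match(stored)
    if match is None:
        raise ValueError(f"not an encoded name: {stored!r}")
    size = int(match.group("size"), 16)
    real_size = None if size == UNKNOWN_SIZE else size
    return match.group("name"), real_size, match.group("ext")
```

The upside of this approach is that a plain directory listing is enough to report the real size, with no second file and no extra request per file. The downside is exactly what was pointed out: the name no longer matches the source file.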
i love rclone, have donated to rclone with time in the forum and money from my wallet, need rclone.
but at some point, trying to be everything to everybody is not productive.
Yes, but that's not a transparent solution. Making the solution transparent is what makes this problem hard, otherwise it would be just as easy as you demonstrate. If you don't have any problem with zipping and unzipping your own stuff then this remote isn't something you need.
Hey, good to see that someone is testing this now. Maybe I should have been a little clearer: this is very much still in development.
The documentation hasn't been written yet. I was under the impression that the setup is pretty straightforward if you've set up any other rclone backend before, but documentation is on my todo list.
This is necessary to store metadata and allow seeking.
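As a rough illustration of why seeking needs extra metadata, here is a generic sketch and not the press implementation. It assumes the data is compressed in independent fixed-size blocks with zlib, and that a small index of block offsets is kept alongside the payload (for example in a second file). The names `compress_blocks` and `read_range` and the block size are hypothetical.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # hypothetical block size for this sketch


def compress_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Compress data in independent blocks; return (payload, metadata)."""
    payload = bytearray()
    offsets = []  # start of each compressed block inside payload
    for start in range(0, len(data), block_size):
        offsets.append(len(payload))
        payload += zlib.compress(data[start:start + block_size])
    offsets.append(len(payload))  # end sentinel
    metadata = {"size": len(data), "block_size": block_size, "offsets": offsets}
    return bytes(payload), metadata


def read_range(payload: bytes, metadata: dict, offset: int, length: int) -> bytes:
    """Return data[offset:offset+length], decompressing only the blocks needed."""
    end = min(offset + length, metadata["size"])
    if offset >= end:
        return b""
    block_size = metadata["block_size"]
    offsets = metadata["offsets"]
    first = offset // block_size
    last = (end - 1) // block_size
    out = bytearray()
    for i in range(first, last + 1):
        out += zlib.decompress(payload[offsets[i]:offsets[i + 1]])
    skip = offset - first * block_size
    return bytes(out[skip:skip + (end - offset)])
```

Without the offsets, a read from the middle of the file would have to decompress everything before it. With them, only the blocks that overlap the requested range are fetched and decompressed.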
This looks like an actual bug; I'll be looking into it. Note that using press with mount might not be optimal, because the extra overhead introduced by compression blocks I/O for longer.
This is still the best solution we have currently. I've been thinking about a general rclone metadata framework that stores metadata per folder, but this is something that still needs discussion.
For the gzip compression this can actually be fixed: the header supports storing the original filename. It's just a limitation of our current gzip implementation that I hadn't really thought about yet; same for xz, actually. In the case of lz4 it's not possible to fix this.
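For reference, the gzip header field in question is FNAME (RFC 1952). The sketch below uses Python's standard `gzip` module rather than rclone's Go code, purely to show the field being written and read back. The helper names `gzip_with_name` and `read_gzip_name` are hypothetical.

```python
import gzip
import io


def gzip_with_name(data: bytes, original_name: str) -> bytes:
    """Compress data and record original_name in the gzip FNAME header field."""
    buf = io.BytesIO()
    # GzipFile stores only the base name of `filename` in the header.
    with gzip.GzipFile(filename=original_name, mode="wb", fileobj=buf, mtime=0) as gz:
        gz.write(data)
    return buf.getvalue()


def read_gzip_name(blob: bytes):
    """Return the FNAME field of a gzip member, or None if it is absent."""
    if blob[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    flags = blob[3]
    pos = 10  # fixed header: magic, method, flags, mtime, xfl, os
    if flags & 0x04:  # FEXTRA: 2-byte little-endian length, then payload
        xlen = int.from_bytes(blob[pos:pos + 2], "little")
        pos += 2 + xlen
    if not flags & 0x08:  # FNAME flag not set
        return None
    end = blob.index(b"\x00", pos)  # the name is zero-terminated
    return blob[pos:end].decode("latin-1")
```

With the name stored this way, a file downloaded outside rclone could be restored under its original name by a tool that honours the field, for example `gunzip -N`.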
There is definitely some merit to that. Luckily rclone is highly modular: implementing a new backend doesn't pollute rclone's core in any way, and you only have to change a single line to remove a backend. Rclone also supports plugins now, and while I'm personally not a fan of plugins in general, I'd have no problem with separating certain functionality into a plugin. To me personally, transparent compression aligns far more with "rsync for cloud storage" than a dlna server or the entire rest of serve tbh.