I've looked at rar2fs before, and it is a very enticing idea.
I think for me the big killer is that it can't support writing at all (the RAR licensing only permits third-party tools like rar2fs to read archives, not create them). However, you could always work around this by having a different part of the system create and update the archives, since that is pretty easy to script...
Is anyone aware of a way to work around this? I don't really understand why I couldn't just lean on my normal WinRAR installation to do that part, rather than rar2fs needing to do it internally (and not being allowed to because of the licensing problem).
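To expand on the scripting workaround, here is a rough sketch of what the "create and upload elsewhere" part could look like. The remote name and paths are just placeholders, and it assumes the regular rar command-line binary is available next to rclone:

```
# Pack a folder into a rar archive locally (the rar binary ships with WinRAR),
# then push the result up to the Cloud remote with rclone.
rar a -r /tmp/photos-2019.rar /data/photos/2019/
rclone copy /tmp/photos-2019.rar gdrive:archives/photos/
```

That way rar2fs only ever needs read access to what is already sitting on the remote.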
Anyway, I think you have identified the main problem here - that so many files need to be accessed and read just to get at their content information. Most Cloud services that aren't premium pay-per-action types tend to limit the number of requests/transfers per second. For Gdrive, for example, this is about 2-3/sec. So if you then have to read a handful of bytes from thousands of files (or more), it is going to take a lot of time regardless of your CPU or bandwidth...
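To put rough numbers on it: at ~2.5 requests per second, just opening 5,000 archives to read their headers is already 5,000 / 2.5 = 2,000 seconds - over half an hour - before a single byte of actual content has been transferred.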
I can see a few potential solutions...
The simplest possible thing would just be to make fewer, larger archives. The size of an archive doesn't make it much slower to grab the metadata from; it's the number of separate file accesses that hurts. So if it is practical to bundle more data together into bigger archives, the time spent scales down pretty much linearly with the archive count.
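Using the same rough numbers as above: merging 5,000 archives into 500 bigger ones would cut that initial scan from ~33 minutes down to ~3-4 minutes.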
The ideal solution would be to have a local cache of the metadata. Since rclone gets the hash value of each file when it lists them, it would be simple to compare those against the existing cache. Is the hash still the same? Then we know the data is the same as before and we don't have to access the file at all. Only new or changed files would have to be accessed - which would obviously speed up the process immensely, or even eliminate it entirely in a lot of cases. You'd still have to list everything at some point - but only once, until the file actually changes...
However, this would probably require some integration between rar2fs and rclone so the two programs can coordinate (i.e. rclone would probably need to own the metadata cache, since rclone is the one that knows the hashes, while rar2fs would need to be aware of the cache and use it to make optimized choices). While this would be a near-perfect solution I think - it would need some work to make happen.
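Just to illustrate the hash-comparison principle (this is not how an actual integration would look - rar2fs knows nothing about any of this today): rclone can already hand you the hashes in a normal listing, which is far cheaper than opening every archive, so spotting new or changed archives is basically a diff against the listing from last time. The remote name is a placeholder; Gdrive reports md5 natively, so nothing gets downloaded:

```
# Grab the current hash listing from the remote (a listing operation, not a
# per-file open), sorted so it can be diffed against the previous run
rclone md5sum gdrive:archives | sort > new.txt

# old.txt is the listing saved from the previous run, produced the same way.
# Lines only present in new.txt = archives that are new or changed, i.e. the
# only ones whose headers actually need to be read again.
comm -13 old.txt new.txt
```

In an actual integration, rar2fs would only need to read headers (via the mount) for the archives that show up in that diff, and could serve everything else straight from the cached metadata.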
One general optimization tip that should apply to any solution is to enable "quick open information" (under Compression --> Options in WinRAR). I recommend setting this to "enable for all files". It is not on by default, and the only downside is a (quite trivial) size increase in the archive.
What does it do? Well, it adds a duplicate of the file listing, with all the metadata, right at the front of the archive. Normally that information is spread around many different places in the file.
This usually doesn't matter so much when reading from an SSD or even an HDD, since those can random-access the file very quickly, but when reading an archive from a Cloud the benefit is immense. Rather than having to do dozens or hundreds of seeks (which are fast in rclone compared to opening new files, yet still hundreds of times slower than on a local drive), we can just read one block of data and get everything we need.
Just make a large and complex archive, compress it once with this off and once with it on, then try to open each via a mount. With the feature disabled you might have to wait several minutes for the contents to show; with it on, it is nearly instant. I still use it for all rar files - especially anything that might go to my cloud storage.
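If you'd rather run that test from the command line than the GUI, I believe the relevant switch is -qo (please double-check against your rar version's help - I normally set this in the WinRAR dialog):

```
# Same source data, once without and once with quick open information for all files
rar a -r -qo- no-quickopen.rar /data/media/some-folder/
rar a -r -qo+ with-quickopen.rar /data/media/some-folder/
```

Put both archives behind a mount and compare how long each takes to list.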
This doesn't fundamentally solve the problem that each file still needs to be accessed once, but at least each one will list much faster - especially archives that contain lots of files.
You will have to recompress the archive to enable this option, I think. In theory it might be possible to add the quick open records to an existing archive after the fact, but I'm not sure that's an option anywhere. In any case it necessitates a re-upload to the cloud, since the file changes...
I've kind of been hoping we could get a rar2fs integration at some point, as it would be a fantastically efficient way to handle lots of small files (as a single file on the cloud side) - drastically increasing performance, but also heavily mitigating the file-count limits that many Clouds have (like 400,000 for Gdrive).
@ncw An interesting topic for you to be aware of perhaps?