I think rclone needs some sort of inverted chunker style overlay.
Chunker takes very large files and breaks them down into smaller files. But what about people who have tons of very small files and want an overlay that somehow merges them into larger files?
This sounds pointless: who would want to download a 1 GB file just to access a 1 KB file?
But if you're using Amazon Deep Glacier you'll hardly ever be downloading files. You might, however, be making too many list/HEAD API calls and overpaying for them. I know about --update --use-server-modtime and --s3-no-head, but even getting 1000 items per request with --fast-list, the way my data is currently stored would generate far too many API requests over time from the list calls alone (at least that's my fear, and it's why I've never tried it).
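For context, the kind of transfer I mean would look something like this (a rough sketch only; `deeparchive:` is a placeholder name for an S3 remote configured with the Deep Archive storage class):

```
# Sketch: sync while trying to minimise per-object API calls.
# 'deeparchive:' is a placeholder S3 remote with storage_class = DEEP_ARCHIVE.
rclone sync /local/backups deeparchive:backups \
  --update --use-server-modtime \
  --s3-no-head \
  --fast-list
```

Even then, every run still has to list the whole destination, which is where my file count becomes the problem.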
I've been looking at Amazon Deep Glacier, and it would be perfect for my use case, EXCEPT I have too many files. I've made my backups in a lazy fashion, with millions of tiny files just sitting loose.
Since I started using Google's cloud storage I've downloaded less than 0.1-1% of the data, but I am constantly checking metadata in order to update backups. I never use mount. I never use Plex. I do often use rclone copy and target a very large directory, though. Like I said, there are just too many files in my cloud storage right now: too many small files generating too much metadata traffic for Amazon Deep Glacier's per-request pricing.
Maybe I just need to be a better user of rclone; obviously if I were smarter I'd have already set all this up. But in my imagination, if chunker can exist, why not an invertedchunker? rclone does so much for me without me having to know what I'm doing, so what's one more thing?
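For comparison, the existing chunker overlay is just a thin wrapper in the config (remote names here are made up), and I'm imagining something of roughly the same shape, pointed in the other direction:

```
# Existing chunker overlay: splits large files into parts on the wrapped remote.
# 's3remote:' is a placeholder for a backend you already have configured.
[bigfiles]
type = chunker
remote = s3remote:bucket/path
chunk_size = 1G
hash_type = md5
```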
Edit: The problem would be how to access the metadata for all the files inside the archive/zip; that's not something a standard archive.zip supports. A second file containing all the metadata for the archive would be needed, but in order to work on Deep Glacier this second file would need to be stored on an alternate remote, or some such. It's an almost impossible problem to solve, I guess? The metadata files for each archive/inverted-chunk could maybe be stored on normal S3, and then the Deep Glacier chunks would essentially never need to be touched; you'd never use up those API requests, and instead you'd just be downloading small metadata files from S3. These metadata files would be small and would fit within S3's free bandwidth allowance.
But this would be the most complicated overlay layer ever coded for rclone, because it would require two underlying remotes: one would be Deep Glacier, or any other API-restrictive remote, even ones rclone already has workarounds for (like box.com), and the second would be a remote with limited storage but less limited API requests.
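Purely as a thought experiment, the config for such an overlay might look something like this. To be clear, nothing like this exists in rclone today and every option name here is invented:

```
# Hypothetical only -- no such backend exists in rclone.
# 'remote' would hold the big merged archives (API-restricted, cheap storage);
# 'metadata_remote' would hold the small per-archive listing files.
[invertedchunker]
type = invertedchunker
remote = deeparchive:backups
metadata_remote = s3standard:backup-metadata
pack_size = 1G
```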
Sorry if this makes no sense. I'm just trying to think of a solution that would make Amazon Deep Glacier easier to use. I think this sort of solution is possible, but it's so far beyond me that maybe I'm describing it poorly. In essence I'd want an overlay that would convert lsd, lsl, and check-by-date style commands targeted at Amazon Deep Glacier into downloads of small metadata files from some other remote that doesn't restrict API calls at all, but does restrict storage. Heck, even Google Drive instead of S3 could probably host all the tiny metadata files that would be created to pair with the inverted-chunker overlay.
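In the meantime, the closest manual approximation I can think of would be something like the following (remote names are placeholders, and this is just a sketch of the workflow, not a polished solution):

```
# Rough manual version of the idea, with placeholder remote names.
# 1. Pack a directory full of tiny files into one archive.
zip -r photos-2019.zip /backups/photos-2019

# 2. Push the big archive to the API-restricted remote.
rclone copy photos-2019.zip deeparchive:archives

# 3. Save a listing of what's inside as a small metadata file...
unzip -l photos-2019.zip > photos-2019.listing.txt

# 4. ...and keep that listing on a remote with cheap/unrestricted API calls,
#    so routine checks never touch Deep Glacier at all.
rclone copy photos-2019.listing.txt gdrive:backup-metadata
```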