Because of how Google Drive stores files, it's very inefficient to look up a file by path: foo/bar/baz requires searching for folders named foo to find the ID, then searching for folders named bar that have that ID as parent, and finally searching for files named baz whose parent is the ID found for bar. That is one request per path component.
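To make the cost concrete, here is a minimal sketch of that lookup against an in-memory stand-in for the remote. `FakeDrive`, `Item` and `resolve` are hypothetical names invented for illustration, not rclone or Drive API code; the only assumption carried over from the description above is that each item is addressed by ID and knows only its parent's ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    id: str
    name: str
    parent: str  # ID of the containing folder


class FakeDrive:
    """Hypothetical stand-in for an ID-addressed remote; counts lookups."""

    def __init__(self, items):
        self.items = list(items)
        self.requests = 0

    def find(self, name, parent):
        # One simulated request: "items named `name` whose parent is `parent`".
        self.requests += 1
        for item in self.items:
            if item.name == name and item.parent == parent:
                return item.id
        return None


def resolve(drive, path, root="root"):
    """Walk the path one component at a time, as described above."""
    parent = root
    for part in path.split("/"):
        parent = drive.find(part, parent)
        if parent is None:
            return None
    return parent


drive = FakeDrive([
    Item("1", "foo", "root"),
    Item("2", "bar", "1"),
    Item("3", "baz", "2"),
])
file_id = resolve(drive, "foo/bar/baz")
print(file_id, drive.requests)
```

In this toy model a path with three components costs three lookups, and the cost grows linearly with depth.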
I was wondering if you'd consider adding an optional (behind a flag) optimization where rclone simply never stores directories upstream? Specifically, store everything in a single (root) directory and put the whole path in the file name. The only disadvantage would be that empty folders wouldn't be stored, but I think this is not a big deal in most cases and the performance gains could be significant.
Interesting, I thought this was Google Drive specific. In that case, a separate overlay backend does make more sense. I also realized that this is actually the default behavior of bucket-based remotes, according to the rclone mount docs:
The bucket based remotes (eg Swift, S3, Google Compute Storage, B2, Hubic) do not support the concept of empty directories, so empty directories will have a tendency to disappear once they fall out of the directory cache.
So I guess this could be just a generic way to treat a remote as bucket-based even if it's not that by nature.
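As a rough illustration of what "treating a remote as bucket-based" could mean, the sketch below derives a directory listing purely from flat names that contain the whole path. `list_dir` is a hypothetical helper, not an rclone function, and it assumes `/` is used as the separator inside the stored name.

```python
def list_dir(flat_names, directory):
    """Return (subdirs, files) directly under `directory`, derived from flat names."""
    prefix = directory.strip("/")
    prefix = prefix + "/" if prefix else ""
    subdirs, files = set(), set()
    for name in flat_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition("/")
        if not head:
            continue  # ignore names that end exactly at the prefix
        if sep:
            subdirs.add(head)  # something deeper exists, so `head` is a directory
        else:
            files.add(head)
    return sorted(subdirs), sorted(files)


names = ["foo/bar/baz", "foo/bar/qux", "foo/readme", "top"]
root_listing = list_dir(names, "")
foo_listing = list_dir(names, "foo")
bar_listing = list_dir(names, "foo/bar")
```

A directory only shows up here because some file name passes through it, which is exactly why empty directories cannot be represented in this scheme.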
Listing the directory would be very time consuming though - are you thinking that rclone should cache it locally?
Yes, but such a cache can be maintained very efficiently. Currently, rclone needs at least <number of directories> requests to build a full file tree. With no directories, it would only need about <number of files> / 1000 requests (since 1000 is the maximum number of files returned per request, IIRC), which on its own would probably already be very useful for rclone sync.
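A back-of-the-envelope comparison of the two estimates, as a sketch only: the function names are made up, the page size of 1000 is the figure recalled above, and the per-directory count is a lower bound (a directory with more entries than one page holds would need extra requests).

```python
import math


def requests_per_directory(num_dirs):
    # Lower bound: at least one listing request for every directory.
    return max(1, num_dirs)


def requests_flat(num_files, page_size=1000):
    # One paged listing over all files; page_size is assumed to be 1000.
    return max(1, math.ceil(num_files / page_size))


# Hypothetical tree: 5,000 directories holding 100,000 files in total.
tree_walk = requests_per_directory(5_000)
flat_walk = requests_flat(100_000)
print(tree_walk, flat_walk)
```

For this made-up tree the flat listing needs 100 requests against at least 5,000 for the directory walk. The gap narrows as the average number of files per directory approaches the page size.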
Also, updating it subsequently can usually be done in a single request (if you sort by modification date when you request a list of files, the changed files come first). With this, the cache can potentially be very long-lived without needing a full rebuild. The only ill effect that comes to mind would be deleted files still appearing in listings (but they wouldn't be readable anyway).
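A minimal sketch of that refresh, assuming the remote can return files newest-first in pages. `refresh` and `list_newest_first` are hypothetical names, and the remote is simulated with a dict of name to modification time; nothing here is taken from rclone's code.

```python
def refresh(cache, list_newest_first, last_sync, page_size=1000):
    """Merge entries modified after `last_sync` into `cache`.

    Returns (new_last_sync, number_of_requests).
    """
    requests = 0
    newest = last_sync
    offset = 0
    while True:
        page = list_newest_first(offset, page_size)
        requests += 1
        for name, mtime in page:
            if mtime <= last_sync:
                # Everything from here on is older than the last sync: stop.
                return newest, requests
            cache[name] = mtime
            newest = max(newest, mtime)
        if len(page) < page_size:
            return newest, requests
        offset += page_size


# Simulated remote: "c" was added after the cache was last synced at time 20.
remote = {"a": 10, "b": 20, "c": 30}


def list_newest_first(offset, limit):
    ordered = sorted(remote.items(), key=lambda kv: kv[1], reverse=True)
    return ordered[offset:offset + limit]


cache = {"a": 10, "b": 20}
last_sync, used = refresh(cache, list_newest_first, last_sync=20)
```

As long as fewer than one page of files changed since the last sync, this costs a single request. It never removes entries, so deleted files stay in the cache until a full rebuild, and a file modified at exactly the `last_sync` timestamp would be missed by the `<=` comparison.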
I'll just note that if you are doing a recursive traversal, rclone can use a fancy listing algorithm on Google Drive which typically does about 1/10th of the requests. This is the so-called --fast-list.