Scaling issue on mounted encrypted s3?

vvyhszer · May 18, 2021, 2:37pm

Let's say I've set up rclone (v1.55.1 on amd64 linux) to create an encrypted drive using Amazon S3:

[drivewithmanyfiles]
type = crypt
remote = s3:mybucket
directory_name_encryption = true

[s3]
type = s3
provider = AWS

And let's say I've mounted the directory:

rclone mount drivewithmanyfiles: /mnt/drivewithmanyfiles

I remember many filesystems (particularly old ones like FAT) have problems when you have a lot of files (like, thousands or more) in a single directory with no subdirectories, so it's better to have a tree-like structure where you have, say, ten sub-directories for each directory, and ten sub-sub-directories for each sub-directory, etc.

Problem is, Amazon S3 doesn't really have the concept of a sub-directory, right? So if I do

ls /mnt/drivewithmanyfiles

will rclone have to go through every single file on the entire filesystem, great-great-grandchildren and all, just to list the subdirectories of the root directory?

The reason I ask is that I've been thinking of making a single, giant filesystem on an AWS bucket, but if this is the case, maybe it's better to have multiple smaller filesystems spread across multiple AWS buckets. The fundamental scaling issue still exists, but if I use ten buckets (/mnt/drivewithmanyfiles1, /mnt/drivewithmanyfiles2... /mnt/drivewithmanyfiles10), would rclone work ten times faster when I ask it to list /mnt/drivewithmanyfiles1?

ncw · May 18, 2021, 3:30pm

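No, it won't: rclone doesn't have to go through every file. Fortunately, the S3 protocol lets you list a single "directory", and the listing shows "prefixes" for the entries under that directory instead of returning everything beneath them.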
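
To illustrate the idea, here is a minimal Python sketch of what a delimiter-based listing returns. It is only a model over an in-memory sorted list of keys, not rclone's code and not a claim about how S3 works server-side. The function name `list_directory` and the sample keys are made up for the example, and it assumes a single-character delimiter. The real S3 ListObjectsV2 call takes `Prefix` and `Delimiter` parameters and returns the matching objects plus a list of common prefixes.

```python
from bisect import bisect_left


def list_directory(sorted_keys, prefix="", delimiter="/"):
    """Model of a delimiter listing: return (files, prefixes) directly under prefix.

    sorted_keys must be sorted; delimiter is assumed to be a single character.
    """
    files, prefixes = [], []
    i = bisect_left(sorted_keys, prefix)
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        key = sorted_keys[i]
        rest = key[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            # No delimiter after the prefix: an object directly in this "directory".
            files.append(key)
            i += 1
        else:
            # Everything sharing this prefix collapses into one entry.
            common = prefix + rest[:cut + 1]
            prefixes.append(common)
            # Jump past all keys under the common prefix without visiting them:
            # the first string that sorts after every "common..." key is the
            # prefix with its final delimiter replaced by the next character.
            upper = common[:-1] + chr(ord(delimiter) + 1)
            i = bisect_left(sorted_keys, upper, i)
    return files, prefixes


# Hypothetical bucket contents, purely for illustration.
keys = sorted([
    "a/1.txt",
    "a/deep/2.txt",
    "a/deep/er/3.txt",
    "b/4.txt",
    "readme.txt",
])

print(list_directory(keys))        # root: one file plus the prefixes "a/" and "b/"
print(list_directory(keys, "a/"))  # one level down: "a/1.txt" plus the prefix "a/deep/"
```

In this model, listing the root produces one entry per top-level "directory" however many objects sit underneath it, which is why splitting the data across ten buckets shouldn't be needed just to keep `ls` on the mount point fast.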
