In Swift, when I have a large number of files in a bucket, ls operations take ages to complete. Is this a known issue, and if so, is there a way to mitigate it?
swift list returns very quickly (3-4s), but rclone ls swift: will easily run for many minutes before returning. The bucket I have in front of me holds about 11k objects.
Of course, this also affects rclone copy operations with --include filters, which is what I'm really after.
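For reference, a minimal way to reproduce the comparison described above might look like this (the container name "bucket" is a placeholder; both commands assume a configured Swift remote):

```shell
# Listing with the native Swift client - fast in this case
time swift list bucket

# Listing the same container through rclone - slow in this case
time rclone ls swift:bucket
```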
First thing to try - use the --fast-list flag.
@ncw, I tried that. In fact, I should have pasted my full (non-trimmed down) command line:
rclone --config rclonerc --fast-list ls swift:bucket
I don’t have timings right now, mainly because the commands take so long to run. I’ll try to get them and report back.
It would be worth running with -vv --dump-headers - that will show you the requests rclone is doing.
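A sketch of that debug run, capturing the verbose output to a file for inspection (remote and log names are placeholders):

```shell
# -vv enables debug logging; --dump-headers prints each HTTP request/response header,
# which makes the per-object HEAD requests visible in the log
rclone -vv --dump-headers ls swift:bucket 2> rclone-debug.log
```

Grepping the log for HEAD requests should reveal whether rclone is issuing one per object.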
rclone has to do a HEAD request on every large file to find its size, so if you have a directory full of large files, that may be the cause.
If that is the case then you can increase --checkers to do more stuff in parallel.
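For example, raising the checker count from the default of 8 (the value 64 here is just an illustration, not a recommendation):

```shell
# --checkers controls how many checking operations run in parallel
rclone --checkers 64 --fast-list ls swift:bucket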
Indeed, it is a directory of many large compressed log files. They are not all huge, but there are likely thousands that are.
Is there a way to turn off size checking until after the --include/--exclude computation is done? At least in my case, that would solve my problem, since I'm only copying a couple of named files.
Is there a way to turn off size checking entirely?
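For context, the kind of copy I'm actually after looks something like this (the include pattern and destination path are made up for illustration):

```shell
# Copy only a couple of named files out of the large container
rclone --config rclonerc copy --include "app-2021-01-01.log.gz" swift:bucket /local/dest
```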
I just remembered we have discussed this before in the forum:
Ls slow on large directories (500~ files) (Hubic)
The number of checkers makes no difference, as it is one per directory.
I made this issue about it:
rclone could defer reading the size until it is actually used. That would work too I expect.
You can try the --ignore-size flag - I'm not sure that will help though.
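A sketch of that attempt, combined with the earlier flags (remote name is a placeholder):

```shell
# --ignore-size tells rclone to skip size checks when comparing objects
rclone --ignore-size --fast-list ls swift:bucket
```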
swift list: 8s
rclone ls with --fast-list, --checkers 8 (default): 9m
rclone ls with --fast-list, --checkers 256: 8m
rclone ls with --fast-list, --checkers 1024: 7m45s
I don’t think swift list reads the size does it?
Thanks for the pointer to the GH issue.
swift list -l does, and it doesn't appear to impact the run time of the command.
But I see that many of the files I would expect to be large are reported as zero bytes, probably signaling that they need another lookup and the swift client doesn't do that for you.
I just tried it and, as you suspected, it doesn't appear to change the speed. Ah well.