Issue using '--checkers' flag with 'rclone delete' on Lyve Cloud storage

StrugglingUser · December 20, 2022, 7:17pm

What is the problem you are having with rclone?

Greetings,

I'm attempting to use an rclone 'delete' command with the '--checkers=16' flag set on a Seagate Lyve Cloud S3 bucket. However, upon initialization, I only see 4 checks being done, which is also below the default value if I'm not mistaken (the default is 8, I think). I cancelled the command after a moment or two, but longer runs show the same issue. Additionally, the same is true of the 'purge' command, though it appears that 5 checkers are started for it.

I'm unsure of the cause of this and am hopeful someone with more experience might have a suggestion.

I apologize for the edits to the paths, I hope they're still readable with information omitted.

Thank you.

Run the command 'rclone version' and share the full output of the command.

rclone v1.60.1
- os/version: centos 7.9.2009 (64 bit)
- os/kernel: 3.10.0-1160.66.1.el7.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.19.3
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Seagate Lyve Cloud S3

The command you were trying to run (eg `rclone copy /tmp remote:tmp`)

rclone delete remote:bucket/<path> --checkers=16 -P

The rclone config contents with secrets removed.

[remote]
type = s3
provider = LyveCloud
env_auth = false
...
...
endpoint = s3.us-east-1.lyvecloud.seagate.com
acl = private

A log from the command with the `-vv` flag

2022/12/20 14:01:50 DEBUG : rclone: Version "v1.60.1" starting with parameters ["rclone" "delete" "remote:bucket/<path>" "--checkers=16" "-P" "-vv"]
2022/12/20 14:01:50 DEBUG : Creating backend with remote "remote:bucket/<path>"
2022/12/20 14:01:50 DEBUG : Using config file from "/root/.config/rclone/rclone.conf"
2022-12-20 14:01:50 DEBUG : Waiting for deletions to finish
2022-12-20 14:01:50 INFO  : <file> : deleted
...
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:                21 / 25, 84%
Deleted:               25 (files), 0 (dirs)
Elapsed time:         1.4s
Checking:
<path>/<file>: checking
<path>/<file>: checking
<path>/<file>: checking
<path>/<file>: checking^C

Ole · December 21, 2022, 7:27am

Hi StrugglingUser,

Each --checker checks a complete folder/directory at a time, so your observation isn't unusual.

Initially there is only one checker checking the top-level folder, it then starts a number of checkers to check all the folders found in the top-level folder and so forth.

I therefore guess you have 4-5 sub folders with a lot of files in remote:bucket/<path>.

StrugglingUser · December 21, 2022, 1:45pm

Hi Ole,

Thank you for your reply!

There are certainly more sub folders, easily upwards of 8-16, and within each are easily 400+ individual files or further sub-directories as well.

Unfortunately, I'm still a bit confused about this. Take a directory like 'miniconda2/pkgs': it contains sub-directories that easily hold thousands of files. Why are more checkers not being started when the number of files to check is greater than the number of running checkers (4 in my case)?

I ran the 'rclone delete ... --checkers=16 -P' command for 20 minutes, and 9264 files were deleted, which by a napkin approximation is about 6-7 checks per second. Then I reran the command using '--checkers=4', and after 5m24s, 2666 files were deleted, which is about 7-8 checks per second. For completeness, I also ran the command using '--checkers=1', and after 5m4s, 2616 files were deleted, which is still about 7-8 checks per second.

Wouldn't we expect this rate to be closer to the number of checkers specified or at least see rates that are not equal (assuming my napkin approximation is correct)?

~# rclone delete remote:bucket/<path> --checkers=16 -P
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:              9264 / 9268, 100%
Deleted:             9268 (files), 0 (dirs)
Elapsed time:      20m0.7s
Checking:
 * miniconda2/pkgs/pillow…ckages/PIL/_imaging.so: checking
 * miniconda2/pkgs/pillow…ages/PIL/_imagingtk.so: checking
 * miniconda2/pkgs/pillow…IL/_tkinter_finder.pyc: checking
 * miniconda2/pkgs/pillow…-packages/PIL/_util.py: checking

~# rclone delete remote:bucket/<path> --checkers=4 -P
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:              2666 / 2670, 100%
Deleted:             2670 (files), 0 (dirs)
Elapsed time:      5m24.1s
Checking:
 * miniconda2/pkgs/pip-19…ternal/req/req_file.py: checking
 * miniconda2/pkgs/pip-19…p/_internal/resolve.py: checking
 * miniconda2/pkgs/pip-19…ncoding.cpython-37.pyc: checking
 * miniconda2/pkgs/pip-19…esystem.cpython-37.pyc: checking

~# rclone delete remote:bucket/<path> --checkers=1 -P
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:              2612 / 2616, 100%
Deleted:             2616 (files), 0 (dirs)
Elapsed time:       5m4.1s
Checking:
 * miniconda2/pkgs/plotly…axis/title/__init__.py: checking
 * miniconda2/pkgs/plotly…title/font/__init__.py: checking
 * miniconda2/pkgs/plotly…yout/shape/__init__.py: checking
 * miniconda2/pkgs/plotly…shape/line/__init__.py: checking
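
The napkin arithmetic above can be rechecked with a short Python sketch that uses only the file counts and elapsed times quoted in this post (the variable and function names are illustrative). Exact division gives slightly higher figures than the napkin estimates, roughly 7.7, 8.2 and 8.6 per second, but the point stands: the rate barely changes with the number of checkers.

```python
# Figures copied from the three runs above: (checkers, files deleted, elapsed seconds).
runs = [
    (16, 9264, 20 * 60 + 0.7),   # 20m0.7s
    (4, 2666, 5 * 60 + 24.1),    # 5m24.1s
    (1, 2616, 5 * 60 + 4.1),     # 5m4.1s
]


def deletes_per_second(files, seconds):
    """Average deletion rate over a whole run."""
    return files / seconds


# Map each --checkers value to its average rate, rounded to one decimal place.
rates = {
    checkers: round(deletes_per_second(files, seconds), 1)
    for checkers, files, seconds in runs
}

for checkers, rate in rates.items():
    print(f"--checkers={checkers}: {rate} deletes/s")
```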

Ole · December 21, 2022, 4:33pm

Great test and analysis. It looks like LyveCloud has a (hidden?) rate limit on file deletion which is easily saturated by a single rclone checker.

I briefly searched for LyveCloud API rate limits, but didn't find anything. I did however note the first page in this brochure having the statement "No API charges" and this typically comes with (somewhat hidden) API rate limits. Think "There is no such thing as a free lunch", so what is the catch?

I therefore suggest you ask LyveCloud how fast you are supposed to be able to delete files, and perhaps mention one of their other statements, such as: "No friction"

More checkers are not started right away because each checker performs the equivalent of

ls -l miniconda2/pkgs

to find the files and sub-directories of the directory. It then deletes the files and starts new checkers for each of the sub-directories (up to the max. --checkers).

So you will likely see more checkers starting if you leave it running, but that unfortunately won't increase speed if already limited by LyveCloud when using a single checker.
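
To make the description above concrete, here is a toy Python model of it. It is only a sketch of the explanation in this post, not rclone's actual scheduler: it assumes directories are processed level by level, one checker per directory, and the directory layout and the function name `checkers_busy_per_level` are made up for illustration.

```python
def checkers_busy_per_level(tree, max_checkers):
    """Toy model: each checker lists one directory; the sub-directories
    it finds become the work for the next round, capped at max_checkers."""
    level = [tree]
    busy = []
    while level:
        busy.append(min(len(level), max_checkers))
        # Directories are dicts, files are None; keep only the directories.
        level = [
            child
            for directory in level
            for child in directory.values()
            if isinstance(child, dict)
        ]
    return busy


def make_leaf():
    """A directory that holds three files and no sub-directories."""
    return {f"file{i}": None for i in range(3)}


# Hypothetical layout: a top-level directory with 4 sub-directories,
# each of which holds 5 sub-directories of files.
tree = {f"pkg{i}": {f"sub{j}": make_leaf() for j in range(5)} for i in range(4)}

print(checkers_busy_per_level(tree, max_checkers=16))
```

In this model the number of busy checkers starts at one and only grows as deeper levels expose more directories, which is why a short run can show far fewer than the configured maximum.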

StrugglingUser · December 21, 2022, 5:31pm

Thank you again for your reply, and your explanation of how the deletion command functions. That helped a lot with my understanding.

Based on this, it does seem like a limit has been specified on the backend in Lyve Cloud. I'll follow up with Seagate next to inquire about this and will reply on this forum post once I know more, assuming it's okay to leave this post up a while longer.

Thank you again for your help!

StrugglingUser · January 10, 2023, 10:22pm

As a follow-up to this post, Seagate confirmed that the backend API does not have a statically assigned number for concurrent checkers. The issue persists as of this writing.

What would be the next possible steps in investigating this?

ncw · January 10, 2023, 10:48pm

Just a thought. Try increasing --transfers too. It might be that rclone is treating deletes as transfers instead of checks.

StrugglingUser · January 11, 2023, 1:14am

Well, I feel thoroughly silly. That did the trick! I tested using transfers=[1, 4, 8] and found that the number of concurrent checks adjusted accordingly. Thank you all!
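
For anyone repeating the test, a minimal Python sketch that only builds and prints the command lines; it does not run rclone. The helper name `build_delete_command` is made up for illustration, while `--transfers`, `--checkers` and `-P` are the flags used in this thread. rclone's documented default for `--transfers` is 4, which matches the four concurrent entries seen in the earlier runs.

```python
import shlex


def build_delete_command(remote_path, transfers, checkers=16):
    """Return the rclone argument list; nothing is executed here."""
    return [
        "rclone", "delete", remote_path,
        f"--transfers={transfers}",
        f"--checkers={checkers}",
        "-P",
    ]


# One command per --transfers value tried in this post.
commands = [build_delete_command("remote:bucket/<path>", n) for n in (1, 4, 8)]

for cmd in commands:
    # shlex.join quotes the <path> placeholder so the line is shell-safe.
    print(shlex.join(cmd))
```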

ncw · January 11, 2023, 11:28am

Yes, for some reason we count Deletes as transfers, not checks.

I don't think it says this in the docs.

@Ole @albertony do you think it should say this in the docs? Or should we change deletes to use checkers as their limiter, as deletes are typically a quick API call?

Perhaps we need a better philosophy as to what checkers and transfers are.

Maybe checkers are general parallelism and transfers controls how many bandwidth eating transfers (or server side transfers) we do at once?

Ole · January 11, 2023, 9:34pm

... on S3, SFTP and similar - and painfully slow on throttling remotes like OneDrive, Google Drive, Dropbox etc.

That was also my mistaken assumption and the reason I assumed throttling and didn't consider --transfers.

The Seagate S3 server however has a response time/latency around 500ms (304s/2616*4) on a deletion, which makes it have a deletion rate around 8 files per second (with 4 transfers), which is comparable with OneDrive, Google Drive, Dropbox etc.
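
The arithmetic behind those figures, as a quick Python sketch. It assumes the run used 4 concurrent deletes (rclone's default `--transfers`), and the variable names are illustrative. It gives a latency of about 465 ms per delete.

```python
elapsed_s = 304        # 5m4s, from the --checkers=1 run quoted earlier
files_deleted = 2616
transfers = 4          # assumption: the default --transfers was in effect

# With `transfers` deletes in flight at once, each delete occupies
# one slot for roughly this long:
latency_s = elapsed_s / files_deleted * transfers

# The aggregate rate is the number of slots divided by the per-delete latency.
rate_per_s = transfers / latency_s

print(f"latency ~ {latency_s * 1000:.0f} ms, rate ~ {rate_per_s:.1f} deletes/s")
```

If the latency per delete stays constant, this model predicts that doubling `--transfers` roughly doubles the deletion rate, as long as the server does not throttle.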

I would therefore probably have given the exact same advice (to check with Seagate), even if I knew for sure that deletes were performed by --transfers.

I think the main thing that made both @StrugglingUser and me read the situation wrongly was the stats:

StrugglingUser:

Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:              2612 / 2616, 100%
Deleted:             2616 (files), 0 (dirs)
Elapsed time:       5m4.1s
Checking:
 * miniconda2/pkgs/plotly…axis/title/__init__.py: checking
 * miniconda2/pkgs/plotly…title/font/__init__.py: checking
 * miniconda2/pkgs/plotly…yout/shape/__init__.py: checking
 * miniconda2/pkgs/plotly…shape/line/__init__.py: checking

where something performed by the --transfers is listed as "Checking:" and the --checkers don't advance ahead of the --transfers doing the deletes. Both of us would probably have made different decisions if presented with something like this:

Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:              8,612 / 12,616, 66%
Deleted:             2,616 / 4,312 (files), 0 / 17 (dirs)
Elapsed time:       5m4.1s
Checking:
 * miniconda2/pkgs/pillow…ages/PIL/_imagingtk.so: checking
 * miniconda2/pkgs/pillow…IL/_tkinter_finder.pyc: checking
 * miniconda2/pkgs/pillow…-packages/PIL/_util.py: checking
 * miniconda2/pkgs/pip-19…p/_internal/resolve.py: checking
 * miniconda2/pkgs/pip-19…ncoding.cpython-37.pyc: checking
 * miniconda2/pkgs/pip-19…esystem.cpython-37.pyc: checking
Transferring:
 * miniconda2/pkgs/plotly…axis/title/__init__.py: deleting
 * miniconda2/pkgs/plotly…title/font/__init__.py: deleting
 * miniconda2/pkgs/plotly…yout/shape/__init__.py: deleting
 * miniconda2/pkgs/plotly…shape/line/__init__.py: deleting

I understand that this is a lot easier said than done, but just changing "Checking:" to "Transferring:" and "checking" to "deleting" would certainly have rung a bell in my head.

I would be extremely wary of moving deletions from the transfers to the checkers, it would break much of the tuning advice I have given by potentially triggering heavy throttling in OneDrive, Google Drive, Dropbox etc.
