Progress flag for rclone purge 😲

I found a past topic on this that didn't get traction. I'd value this option a lot, especially for local filesystems.

Acknowledging that this could be expensive for a cloud target, maybe just warn people of that in the man page and/or only allow the feature for remotes that won't take a cost hit (i.e. local).

For cloud targets, maybe it just talks about MiB/s deleted or some other metric that doesn't require a remote walk before or during the operation?

I can put in a separate feature topic for it but in the same vein, is there any way to control the parallelization of rclone purge? How parallel is it by default? It doesn't complain when I put in --transfers=N but I suspect it's just ignoring the flag. Thanks!!

#~: rclone -P --transfers=16 purge local:/mnt/nas1/delete_dir

Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Deleted:                0 (files), 1 (dirs)
Elapsed time:      9m57.5s
# rclone version
rclone v1.57.0-DEV
- os/version: rocky 9.2 (64 bit)
- os/kernel: 5.14.0-284.30.1.el9_2.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.17.2
- go/linking: dynamic

BTW, you are using an ANCIENT rclone version, 20+ releases behind the latest one.

Try --checkers 16

On some backends purge is an atomic operation that happens instantly. On others it is implemented by the backend, and on some by the rclone core.

You'll get better stats with a newer rclone

I don't understand going out of your way to point this out when it has nothing whatsoever to do with my question. I use the online docs for rclone which should always be the latest and it doesn't mention the root of my question.

In any case, I upgraded my version to the latest and there is zero change in functionality.

Thanks Nick. I upgraded my version and also tried --checkers=16 and saw no change in behavior.

You'll get better stats with a newer rclone

I didn't get any better stats (still shows zeros) on the latest version

Online docs are for the latest version (v1.65.2 now) - you are using v1.57.0 beta... Nobody is really interested in troubleshooting a years-old rclone release.

Just a thought, but for a local filesystem why use something built to work with cloud providers (rclone)? Wouldn't a simple rm -rf do the job better?

rclone is not trying to replace every imaginable local tool/command. Its focus is to excel when dealing with cloud storage. OS-provided tools are in most cases much better and faster.

It would be a pure waste of time to try to provide stats for local filesystem operations which have no meaning for online storage... IMO

Hmm that should have helped... I had a look at the code and I discovered that the local backend implements its own Purge rather than using the rclone fallback which has the nice stats and parallel running.

I did a bit of code archaeology and discovered it was implemented in 2014 for v1.02 and doesn't appear to have any special reason for existing. Unfortunately I didn't write anything useful in the commit message and I certainly don't remember after 10 years!

Deleting the Purge implementation causes stats to appear and --checkers to have an effect!

It will produce stats like this

Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:             10000 / 10000, 100%
Deleted:            10000 (files), 1001 (dirs)
Elapsed time:         0.1s
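
For anyone curious why removing a backend-specific Purge changes the behaviour, here is a rough Go sketch of the general pattern: the core checks whether a backend offers its own purge and otherwise falls back to a generic list-and-delete loop that it can count. All names here (`Backend`, `Purger`, `memBackend`, `nativeBackend`, `purge`) are made up for illustration and this is not rclone's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// Backend is a hypothetical minimal remote: it can list and remove files.
type Backend interface {
	List(dir string) []string
	Remove(path string) error
}

// Purger is a hypothetical optional interface for backends that can
// delete a whole tree themselves.
type Purger interface {
	Purge(dir string) error
}

// memBackend is an in-memory stand-in with no Purge method.
type memBackend struct{ files map[string]bool }

func (m *memBackend) List(dir string) []string {
	var out []string
	for p := range m.files {
		if strings.HasPrefix(p, dir+"/") {
			out = append(out, p)
		}
	}
	return out
}

func (m *memBackend) Remove(path string) error {
	if !m.files[path] {
		return fmt.Errorf("not found: %s", path)
	}
	delete(m.files, path)
	return nil
}

// nativeBackend adds its own Purge on top of memBackend.
type nativeBackend struct{ *memBackend }

func (n nativeBackend) Purge(dir string) error {
	for _, p := range n.List(dir) {
		delete(n.files, p)
	}
	return nil
}

// purge prefers the backend's own Purge. In that case the caller sees no
// per-file events, so it has nothing to count. Otherwise it deletes file
// by file and can report how many it removed.
func purge(b Backend, dir string) (deleted int, native bool, err error) {
	if p, ok := b.(Purger); ok {
		return 0, true, p.Purge(dir)
	}
	for _, path := range b.List(dir) {
		if e := b.Remove(path); e != nil {
			return deleted, false, e
		}
		deleted++
	}
	return deleted, false, nil
}

// newFiles builds a small fixed file set for the demo.
func newFiles() map[string]bool {
	return map[string]bool{"d/a": true, "d/b": true, "x/c": true}
}

func main() {
	n, native, _ := purge(&memBackend{files: newFiles()}, "d")
	fmt.Println("generic path: deleted =", n, "native =", native)
	n, native, _ = purge(nativeBackend{&memBackend{files: newFiles()}}, "d")
	fmt.Println("native path:  deleted =", n, "native =", native)
}
```

In this sketch the native path does the work but reports a count of zero, which mirrors the all-zero stats in the original post; the generic path is the one that can feed a progress display.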

Give this a go

v1.66.0-beta.7678.212c4d1d9.fix-local-purge on branch fix-local-purge (uploaded in 15-30 mins)

It isn't every day that you

  • find a bug from 10 years ago
  • make the code faster and more functional by deleting 25 lines of code

🙂


I think you'll find that the rclone purge in the branch above is now faster than rm -rf (I didn't time it), as it runs up to --checkers deletes at once, and all I had to do was delete 25 lines of code 😉
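
To make "runs --checkers deletes at once" concrete, here is a minimal Go sketch of a bounded worker pool deleting files in parallel and then removing the emptied directories deepest first. It is only an illustration of the idea under my own assumptions, not how rclone implements it; `purgeParallel`, `makeTree` and the `checkers` parameter are hypothetical names.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sync"
	"sync/atomic"
)

// purgeParallel deletes all files under root with at most `checkers`
// concurrent workers, then removes the directories, deepest first.
func purgeParallel(root string, checkers int) (files, dirs int64, err error) {
	if checkers < 1 {
		checkers = 1
	}
	paths := make(chan string)
	var wg sync.WaitGroup
	var once sync.Once
	var firstErr error
	for i := 0; i < checkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range paths {
				if e := os.Remove(p); e != nil {
					once.Do(func() { firstErr = e }) // remember the first failure
					continue
				}
				atomic.AddInt64(&files, 1)
			}
		}()
	}
	var dirList []string
	walkErr := filepath.WalkDir(root, func(p string, d fs.DirEntry, e error) error {
		if e != nil {
			return e
		}
		if d.IsDir() {
			dirList = append(dirList, p) // parents are recorded before children
		} else {
			paths <- p
		}
		return nil
	})
	close(paths)
	wg.Wait()
	if walkErr != nil {
		return files, dirs, walkErr
	}
	if firstErr != nil {
		return files, dirs, firstErr
	}
	// Reverse order removes children before their parents.
	for i := len(dirList) - 1; i >= 0; i-- {
		if e := os.Remove(dirList[i]); e != nil {
			return files, dirs, e
		}
		dirs++
	}
	return files, dirs, nil
}

// makeTree builds a throwaway directory tree in the system temp dir.
func makeTree(nDirs, filesPerDir int) (string, error) {
	root, err := os.MkdirTemp("", "purge-sketch-")
	if err != nil {
		return "", err
	}
	for d := 0; d < nDirs; d++ {
		dir := filepath.Join(root, fmt.Sprintf("dir%d", d))
		if err := os.Mkdir(dir, 0o755); err != nil {
			return "", err
		}
		for f := 0; f < filesPerDir; f++ {
			name := filepath.Join(dir, fmt.Sprintf("file%d", f))
			if err := os.WriteFile(name, []byte("x"), 0o644); err != nil {
				return "", err
			}
		}
	}
	return root, nil
}

func main() {
	root, err := makeTree(3, 5)
	if err != nil {
		panic(err)
	}
	files, dirs, err := purgeParallel(root, 4)
	fmt.Println("files:", files, "dirs:", dirs, "err:", err)
}
```

Whether something like this actually beats rm -rf will depend on the filesystem; on network-backed storage such as a NAS mount, having several deletes in flight is where the gain would most plausibly come from.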


I have several thoughts on this. First, it's pretty clear that rclone wasn't only built to work with "cloud providers". It's a powerful, parallelized tool with lots of types of remotes, including support for several filesystems and file-share platforms that technically aren't cloud.

Second, most POSIX commands aren't natively parallelized so when you're interacting with filesystems it's useful to have a tool that you can thread to do something like removal of millions of files. I have other tools to do highly-parallel file/dir removals on Linux but I like rclone so I wanted to see why this feature didn't seem to have much flesh and it seems like Nick digging into it helped a lot. If I hadn't seen rclone purge listed in the documentation I wouldn't have asked them to write it. Someone had clearly already put some time into it though so I wanted to understand why the behavior was different.


Fair points. Thx for explaining your take on this.
