Progress on `size` command

We have a ton of files, and commands such as size can take a long time with no indication of how long, which is frustrating. Is there a way for the --progress flag to also work on directory listings? If not, please add one.

This could also work on the ls commands that output JSON.

hello and welcome to the forum,

rclone has to iterate over an unknown file system.
how would rclone know the difference between 1% complete and 99% complete?

Many filesystems send the number of pages or number of objects in the header.

EDIT: and if that is not possible, at least list the number of files processed and time taken, so we know it's doing something.

for example?

as for cloud providers, there is no file system and no way to get such information.
in fact, many providers limit the number of files listed per API call to 1000, so many such API calls are required, and with many providers each API call costs money.
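
To illustrate the point about paged listings, here is a minimal sketch in Python. It is not rclone code: `list_page` is a hypothetical stand-in for a provider's listing call, and the in-memory list of sizes stands in for a bucket. The caller only learns that it has reached the end when the last page comes back without a continuation token, so no total is available up front for a percentage.

```python
from typing import List, Optional, Tuple

PAGE_SIZE = 1000  # per-call limit mentioned above; real limits vary by provider


def list_page(objects: List[int], token: Optional[int]) -> Tuple[List[int], Optional[int]]:
    """Hypothetical stand-in for one listing API call.

    Returns up to PAGE_SIZE object sizes plus a continuation token,
    or None as the token when there are no more pages.
    """
    start = token or 0
    page = objects[start:start + PAGE_SIZE]
    next_start = start + len(page)
    return page, (next_start if next_start < len(objects) else None)


def total_size(objects: List[int]) -> Tuple[int, int, int]:
    """Sum sizes page by page; returns (total_bytes, object_count, api_calls)."""
    total = count = calls = 0
    token = None
    while True:
        page, token = list_page(objects, token)
        calls += 1
        total += sum(page)
        count += len(page)
        if token is None:  # only now do we know the listing is complete
            return total, count, calls
```

With 2,500 objects this makes three calls, and the running `count` is the only progress figure known before the final page arrives.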

edit to your edit:

that is possible.

rclone size is a super expensive call as it has to walk through a remote and go directory by directory.

What's your use case, and what are you trying to do? That's probably the better question, as size tends to be avoided since it's very API heavy.

this might better explain what is happening
https://rclone.org/s3/#avoiding-get-requests-to-read-directory-listings

OK then, what about going with the rsync approach? It shows something like 321/654, where the first number is how many are left to process and the second is the total discovered so far.
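
A rough sketch of that kind of counter, in Python, under some loud assumptions: the "remote" is a nested dict where dict values are directories and int values are file sizes, and `size_with_counter` is a made-up name. It is neither rsync's nor rclone's implementation; it only shows that a remaining/discovered pair can be maintained from information the walk already has.

```python
from collections import deque
from typing import List, Tuple


def size_with_counter(tree: dict) -> Tuple[int, List[str]]:
    """Walk a nested dict and record 'remaining/discovered' after each entry.

    Dict values are directories, int values are file sizes (an assumed
    toy model of a remote, for illustration only).
    """
    pending = deque([tree])   # entries discovered but not yet processed
    discovered = 1            # the root counts as the first discovered entry
    processed = 0
    total = 0
    snapshots = []
    while pending:
        node = pending.popleft()
        if isinstance(node, dict):
            # Listing a directory is what discovers new entries.
            pending.extend(node.values())
            discovered += len(node)
        else:
            total += node
        processed += 1
        snapshots.append(f"{discovered - processed}/{discovered}")
    return total, snapshots
```

Note that the second number grows as directories are listed, so the pair is an indication of activity rather than a reliable percentage.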

I'll reiterate this.

Neither of the above requires making any extra calls, while still showing that progress is being made. Usually people have an idea of how many files there are, and can guesstimate if they see how many have been processed. Anything is better than nothing.

I think @ncw could comment, but my understanding is that it walks through the file system by fetching 'chunks' of data, and it doesn't know what it will find until it progresses.

I'm not sure what the original use case of rclone size was, so I'm not sure it's used much.

"Progress" doesn't necessarily mean "estimated percent completed". At the most basic, a simple number that gets incremented when it does a thing. See my rsync suggestion.

Not showing anything during a long task is very bad UX, which means I don't know if the program is doing what it claims to be doing or if it's stuck and not doing anything.
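
As a sketch of the "simple number that gets incremented" idea, the Python below sums file sizes under a local directory and reports a running count and elapsed time every so often. It is an illustration only, not how rclone works: `local_size`, `report_every`, and `report` are names invented here, and it walks a local path with `os.walk` rather than a remote.

```python
import os
import time
from typing import Callable, Tuple


def local_size(root: str, report_every: int = 1000,
               report: Callable[[str], None] = print) -> Tuple[int, int]:
    """Return (file_count, total_bytes) for root, reporting progress as it goes."""
    start = time.monotonic()
    files = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
            files += 1
            if files % report_every == 0:
                elapsed = time.monotonic() - start
                report(f"{files} files, {total} bytes, {elapsed:.1f}s elapsed")
    return files, total
```

The counter costs nothing beyond the listing that is already being done, which is the point being made above: no total is needed to show that work is happening.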

The progress/stats seems to be explicitly disabled for the size command by the second parameter (showStats=false) in this function call. The same is seen for the ls command.

I guess the purpose is to keep the default output from these commands clean from any progress information, if used in a subsequent script/program - otherwise stats would print every 1 minute with the current defaults.

There may be ways to get the best of both worlds, thinking...

... nice, the code already allows the best from both worlds :sweat_smile:

The trick is to also set the --stats parameter (to anything you like):

    rclone size myRemote --progress --stats 1s

Please note there is a known issue if you use RCLONE_STATS="1s"; see GitHub #5341.

What kind of output do you get from that? I just see empty progress:

    Transferred:             0 / 0 Bytes, -, 0 Bytes/s, ETA -
    Errors:                 0
    Checks:                 0 / 0, -
    Transferred:            0 / 0, -
    Elapsed time:          0s

I see an update of elapsed time every 5 sec using this command:

    rclone size OneDrive: --progress --stats 5s

    Transferred:             0 / 0 Bytes, -, 0 Bytes/s, ETA -
    Elapsed time:        15.5s

on this version/platform:

    rclone v1.55.1
    - os/type: windows
    - os/arch: amd64
    - go/version: go1.16.3
    - go/linking: dynamic
    - go/tags: cmount

What updates? I think the OP is saying they are looking for some progress percentage or something along those lines.

I see just 0/0 and nothing changes when I use that command.

The elapsed-time update is basic, similar to a rotating gear or hourglass, and as such fulfils the OP's request to at least have some kind of indication that the program is busy/working.

I fully agree, it would be nice if the progress/stats information also included the number of directory items processed/pending. Especially in situations like size or a sync with a very restrictive filter.

I just use ncdu instead of size, seems to give much more info in similar time to complete the scan.

I know you aren't supposed to have a favourite child, but of all the rclone commands rclone ncdu is my favourite :slight_smile: