Hash status/progress bar


#1

Hi again!

Is it hard to add a status or progress bar feature for file hashing, just so you know how it is going? As of now, when hashing big files before upload, rclone just sits there apparently doing nothing, and you have to hope that everything is working as expected. If possible, it would also be nice to have a speed measurement and an estimate of the time left. In my head it should look much like the status for uploading, but instead of being under the category “Transferring:” it could have its own category like “Hashing:”.

Eg (faked data in case it wasn’t obvious):
Transferred: 600.00G / 8.000 TBytes, 7,5%, 8.063 MBytes/s, ETA 12041h47m7s
Errors: 0
Checks: 100 / 100, 100%
Transferred: 2 / 10, 80%
Elapsed time: 20h11m7.8s
Hashing:
* file_to_upload_nr001…018-01-01T235959.ext: 80% /1T, 15M/s, ETA 5h47m7s
* file_to_upload_nr002…018-01-01T235959.ext: 40% /1T, 14M/s, ETA 15h47m7s
* file_to_upload_nr003…018-01-01T235959.ext: 50% /1T, 13M/s, ETA 12h47m7s
* file_to_upload_nr004…018-01-01T235959.ext: 60% /1T, 16M/s, ETA 5h47m7s
* file_to_upload_nr008…018-01-01T235959.ext: 80% /1T, 10M/s, ETA 5h47m7s
Transferring:
* file_to_upload_nr005…018-01-01T235959.ext: 20% /1T, 15M/s, ETA 5h47m7s
* file_to_upload_nr006…018-01-01T235959.ext: 10% /1T, 13M/s, ETA 5h47m7s
* file_to_upload_nr007…018-01-01T235959.ext: 30% /1T, 17M/s, ETA 5h47m7s
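
The speed and ETA figures in a line like the ones above come down to simple arithmetic on bytes processed and elapsed time. A minimal sketch in Go (the language rclone is written in), assuming an average-speed estimate; the function name estimate is hypothetical and not part of rclone:

```go
package main

import (
	"fmt"
	"time"
)

// estimate is a hypothetical helper, not an rclone function.
// It returns the average speed in bytes per second and the estimated
// time remaining, given bytes hashed so far, total size, and elapsed time.
func estimate(done, total int64, elapsed time.Duration) (float64, time.Duration) {
	if elapsed <= 0 || done <= 0 {
		// Nothing measured yet, so no meaningful estimate.
		return 0, 0
	}
	speed := float64(done) / elapsed.Seconds()
	remaining := float64(total-done) / speed
	return speed, time.Duration(remaining * float64(time.Second))
}

func main() {
	done, total := int64(800), int64(1000)
	speed, eta := estimate(done, total, 80*time.Second)
	pct := 100 * float64(done) / float64(total)
	fmt.Printf("%.0f%% /%d, %.0f B/s, ETA %v\n", pct, total, speed, eta)
}
```

An average over the whole run is the simplest choice; a moving average over the last few seconds would react faster to speed changes.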


Another thing (Maybe this should be another question?)
The “…” in the middle of the filename appeared when I upgraded rclone from 1.38 to 1.44. Is there a bug somewhere causing this? If I remember right, it was just dots before ("..."). This happens when the filename is too long.

Best Regards


#2

Use the -P flag or --progress to see a dynamic display.

I changed the symbol to use a Unicode ellipsis character. If you set your terminal to UTF-8 it should display properly. UTF-8 is the sensible setting for a terminal nowadays, I think.


#3

I’ve tried the -P flag, but that doesn’t show the status of the hashing. It just shows the usual status without reprinting the information to the console each time. (Which is a nice function, but not what I’m asking for.)

I would like to see how much of a file rclone has hashed while it is hashing. So instead of rclone reporting that it has transferred 0% of a file while it is hashing it, it should report something like 54% hashed, and not say that it is transferring the file until the hashing is complete.

It is getting pretty late/early here so I don’t know if I made it easier to understand or not, but if you want me to clarify anything just tell me!


#4

Use something similar to this. Don’t forget to update rclone.
rclone -P copy /root/videos gdrive:videos


#5

Ah, I understand now! Yes, the progress of the hashing isn’t shown. In fact it would be very difficult to show because the backends do their own hashing. Fetching a hash from Google Drive only takes one HTTP transaction, but hashing a file on the local backend means reading it and hashing it.

I’m not quite sure how we would implement this…


#6

Exactly!

At the moment I’m not very conversant with the rclone code. But do you mean that you are using a specific library for each back-end (storage provider), and that each library has its own hashing function that you have no source code for? If that is the case, it gets harder… I don’t have a complete solution for this either. Maybe somebody else knows a better way to do this? But just throwing out some quick thoughts: could it be possible to see how much data has been read by the hashing function? Like overloading/replacing the function that the hash function uses to read the file from disk, and in that function keeping track of the amount of data that has been read, together with the total file size, and then calculating the “assumed” status of the hash? I’m not sure if this is possible without recompiling the library though. It depends on whether the library is statically or dynamically linked.
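
In Go, the idea of replacing the read function can be approximated by wrapping the reader that feeds the hash. The following is a minimal sketch under that assumption, not rclone code; countingReader, hashWithCount and demoCount are hypothetical names, and MD5 is used only as an example hash:

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// countingReader (hypothetical) wraps an io.Reader and counts the bytes
// that pass through it, so the hash code itself needs no changes.
type countingReader struct {
	r io.Reader
	n int64 // bytes read so far
}

func (c *countingReader) Read(p []byte) (int, error) {
	n, err := c.r.Read(p)
	c.n += int64(n)
	return n, err
}

// hashWithCount hashes everything from r and returns the hex digest
// together with the number of bytes that were read.
func hashWithCount(r io.Reader) (string, int64, error) {
	cr := &countingReader{r: r}
	h := md5.New()
	if _, err := io.Copy(h, cr); err != nil {
		return "", cr.n, err
	}
	return hex.EncodeToString(h.Sum(nil)), cr.n, nil
}

// demoCount hashes a small in-memory string and reports the "assumed"
// progress as bytes read divided by the known total size.
func demoCount() (sum string, read int64, pct float64) {
	data := "hello world"
	sum, read, _ = hashWithCount(strings.NewReader(data))
	pct = 100 * float64(read) / float64(len(data))
	return sum, read, pct
}

func main() {
	sum, read, pct := demoCount()
	fmt.Printf("md5=%s read=%d bytes (%.0f%%)\n", sum, read, pct)
}
```

In a real display the counter would be read while the copy is still running, which would need an atomic or mutex-protected counter rather than a plain field.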

Since you liked my last diagram I made one for this too :wink:


#7

That is correct, except I do have the source of all the backends; they are part of rclone.

I’d just have to work out a way of getting the data out of the backend.

This could be a new method, say HashWithStatus() which could work as the current Hash method, but call back a status function with progress.

That would probably be the best way of breaking the encapsulation without coupling the modules together.
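
A rough sketch of what such a callback-based method might look like in Go. Everything here is an assumption for illustration: StatusFunc, hashWithStatus and demoStatus are hypothetical names, the signature is not rclone's actual Hash method, and SHA-1 is just an example hash:

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// StatusFunc is a hypothetical progress callback: bytes hashed so far
// and the total size of the object being hashed.
type StatusFunc func(done, total int64)

// hashWithStatus reads r in chunks of the given size, feeds each chunk to
// the hash, and calls status after every chunk. A nil status is allowed.
func hashWithStatus(r io.Reader, total int64, chunk int, status StatusFunc) (string, error) {
	h := sha1.New()
	buf := make([]byte, chunk)
	var done int64
	for {
		n, err := r.Read(buf)
		if n > 0 {
			h.Write(buf[:n]) // hash.Hash writes never return an error
			done += int64(n)
			if status != nil {
				status(done, total)
			}
		}
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// demoStatus hashes an 11-byte string in 4-byte chunks and records how
// often the callback fired and the last value it reported.
func demoStatus() (calls int, last int64, sum string) {
	data := "hello world"
	sum, _ = hashWithStatus(strings.NewReader(data), int64(len(data)), 4,
		func(done, total int64) {
			calls++
			last = done
		})
	return calls, last, sum
}

func main() {
	calls, last, sum := demoStatus()
	fmt.Printf("sha1=%s callbacks=%d last=%d\n", sum, calls, last)
}
```

The caller only supplies a function, so the backend does not need to know anything about how or where the progress is displayed.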

I’m not sure how I’d display the info though.

What would you suggest here?


#8

Is rclone written in an object-oriented style? If so, couldn’t you just add a member variable to the hash object that is accessible by other functions? If not, could you make a variable that is either global (I know, this is bad practice) or shared through an alias/pointer or equivalent? Then at every update of the status (in the command line), just read the value of that variable.

For the display, I think it would be nice if the hashing was shown in a different list than transferring, like in the first message I sent in this thread:
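
The shared-variable idea could be sketched in Go roughly as follows, assuming the hashing side and the display side run in different goroutines and share a pointer to a small progress record. The type hashProgress and the function simulate are hypothetical and not rclone code; sync/atomic keeps the shared counter safe without a lock:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// hashProgress is a hypothetical shared progress record.
type hashProgress struct {
	done  int64 // bytes hashed so far, only accessed atomically
	total int64 // file size, set once before hashing starts
}

// add is called by the hashing side after each chunk.
func (p *hashProgress) add(n int) {
	atomic.AddInt64(&p.done, int64(n))
}

// percent is called by the display side whenever it redraws.
func (p *hashProgress) percent() float64 {
	if p.total <= 0 {
		return 0
	}
	return 100 * float64(atomic.LoadInt64(&p.done)) / float64(p.total)
}

// simulate runs a fake hashing goroutine that reports chunks of the
// given size, waits for it to finish, and returns the final percentage.
func simulate(total int64, chunk int) float64 {
	p := &hashProgress{total: total}
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for left := total; left > 0; left -= int64(chunk) {
			n := chunk
			if int64(n) > left {
				n = int(left) // last chunk may be shorter
			}
			p.add(n)
		}
	}()
	wg.Wait()
	return p.percent()
}

func main() {
	fmt.Printf("hashed: %.0f%%\n", simulate(1000, 64))
}
```

A real display would call percent() on a timer while the hashing goroutine is still running, instead of waiting for it to finish as this sketch does.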

Transferred: 600.00G / 8.000 TBytes, 7,5%, 8.063 MBytes/s, ETA 12041h47m7s
Errors: 0
Checks: 100 / 100, 100%
Transferred: 2 / 10, 80%
Elapsed time: 20h11m7.8s

Hashing:
* file_to_upload_nr001…018-01-01T235959.ext: 80% /1T, 15M/s, ETA 5h47m7s
* file_to_upload_nr002…018-01-01T235959.ext: 40% /1T, 14M/s, ETA 15h47m7s
* file_to_upload_nr003…018-01-01T235959.ext: 50% /1T, 13M/s, ETA 12h47m7s
* file_to_upload_nr004…018-01-01T235959.ext: 60% /1T, 16M/s, ETA 5h47m7s
* file_to_upload_nr008…018-01-01T235959.ext: 80% /1T, 10M/s, ETA 5h47m7s
Transferring:
* file_to_upload_nr005…018-01-01T235959.ext: 20% /1T, 15M/s, ETA 5h47m7s
* file_to_upload_nr006…018-01-01T235959.ext: 10% /1T, 13M/s, ETA 5h47m7s
* file_to_upload_nr007…018-01-01T235959.ext: 30% /1T, 17M/s, ETA 5h47m7s


#9

I see!

Do you want to make a new issue on GitHub about this so this discussion doesn’t get forgotten?

Would you like to help implement it?


#10

I’d like to, but I don’t think I have the time at the moment. I’m currently working all my waking hours, and sometimes even the hours meant for sleeping, just to keep up with a tight schedule for a big company program. And I don’t think I want to put even more work on my shoulders right now. But if you or somebody else needs any help when implementing it, you can always send me a message, and I’ll see what time I can spare!