More stats details

A normal --stats 3m would give an output like:
2017/08/15 13:36:25 INFO :
Transferred: 39 Bytes (10 Bytes/s)
Errors: 0
Checks: 0
Transferred: 1
Elapsed time: 3.8s

In addition to this it would be nice to see some more, like:

2017/08/15 13:36:25 INFO :
Transferred size: 39 Bytes (10 Bytes/s)
Remaining size: 30 MBytes
Errors: 0
Checks: 0
Deleted: 10
Transferred: 1
Pending Files: 2
Elapsed time: 3.8s
Remaining Time: 5 mins

There are various issues on GitHub about improving rclone’s stats reporting.

The Deleted stat would be easy to add - we already have that.

Pending Files - the stats already show you the list of files that are currently being transferred.

However, a true list of pending files plus an estimate of the remaining time would require rclone to make two passes through the listing, or it would require a mode where it does all the listing first and then queues the transfers for afterwards. Either way, the listing is the most time-consuming part on some remotes (e.g. Google Drive), so having to do it twice, or wait for it to complete before transferring any files, would make rclone run a lot slower…

Perhaps the thing to do would be to queue up to 100 files (say) for transfer and update the stats as we go along. This will make the stats change rather a lot as rclone runs through, but for syncs where not a lot changes the stats would be accurate.
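
For illustration, here is a minimal sketch of that idea, assuming a hypothetical queueTransfers helper and a simple atomic counter for a pending-files stat (this is not rclone's actual pipeline, just the shape of it):

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    // pendingFiles is a hypothetical stat: files queued but not yet
    // transferred. It is only a lower bound until the listing finishes.
    var pendingFiles int64

    // queueTransfers drains the listing into a bounded queue of (say) 100
    // entries, updating the pending-files stat as items come and go.
    func queueTransfers(listing <-chan string, transfer func(string)) {
        queue := make(chan string, 100) // never hold the whole listing

        go func() {
            for name := range listing {
                atomic.AddInt64(&pendingFiles, 1)
                queue <- name // blocks once 100 items are waiting
            }
            close(queue)
        }()

        for name := range queue {
            transfer(name)
            atomic.AddInt64(&pendingFiles, -1)
        }
    }

    func main() {
        listing := make(chan string)
        go func() {
            for i := 0; i < 5; i++ {
                listing <- fmt.Sprintf("file%d.txt", i)
            }
            close(listing)
        }()
        queueTransfers(listing, func(name string) {
            fmt.Printf("transferring %s (pending: %d)\n",
                name, atomic.LoadInt64(&pendingFiles))
        })
    }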

Hmm, there isn’t an easy solution to this!

Of course, there are different types of copies and some may be easier to estimate or enumerate than others. I think if you’re coming FROM local TO remote, it’s relatively easy to count the files ahead of time and it won’t use a lot of resources. The reverse, however, isn’t true (both require a full pass to see what needs to be copied and how much data that is, but doing so on a remote could take a long time before even the first bit is copied). When you add in the advanced capabilities like filtering, include/exclude lists, etc., it could get even more intensive.

I do notice that if I’m copying a large number of files from a remote with a filter (say, “rclone sync remoteA: remoteB:”), it starts copying/deleting as soon as the first criterion is met (i.e., delete from B whatever isn’t on A, and copy from A anything that isn’t on B or is different on B), while another thread is running the checks. I may be copying 8 files in parallel while the checks counter is still increasing. I’m assuming these checks are determining what ELSE needs to be copied or deleted in advance. Could that data be included in the stats without adding a lot of overhead (that is, pending deletes, pending copies)? There would have to be the understanding that these numbers aren’t final until the checks are complete (so you might see “pending copies: 5” and later see “pending copies: 9”).

rclone has to enumerate the local and the remote in lockstep to work out what needs to be transferred, so you don’t really win anything here.

Yes, there are some internal queues whose lengths don’t get surfaced. These are quite small though (usually --checkers or --transfers long).
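
To illustrate the lockstep point, a simplified sketch (not rclone's actual sync code): with both listings sorted by name, the source and destination are walked together and a decision is made per entry, so the full set of pending transfers only becomes known as the walk progresses.

    package main

    import "fmt"

    // matchListings walks two name-sorted listings in lockstep and says
    // what a sync would do with each entry. Simplified: the real code also
    // compares sizes, modtimes and hashes, and handles directories.
    func matchListings(src, dst []string) {
        i, j := 0, 0
        for i < len(src) || j < len(dst) {
            switch {
            case j >= len(dst) || (i < len(src) && src[i] < dst[j]):
                fmt.Println("transfer:", src[i]) // only on source: copy it
                i++
            case i >= len(src) || src[i] > dst[j]:
                fmt.Println("delete:", dst[j]) // only on destination: delete it
                j++
            default:
                fmt.Println("check:", src[i]) // on both: check whether it changed
                i++
                j++
            }
        }
    }

    func main() {
        matchListings(
            []string{"a.txt", "b.txt", "d.txt"},
            []string{"b.txt", "c.txt"},
        )
        // transfer: a.txt, check: b.txt, delete: c.txt, transfer: d.txt
    }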

Some statistics are funny, but I’ve gotten used to them, like this:

Transferring:
 *   ....mp4: 71% /285.571M, 11.069M/s, 2213473h42m8.290448384s

No, there are no problems with my connection, it’s just a slightly larger file.

Then, a few seconds later the file finishes being sent.

I don’t understand how it can possibly ever print that!

The code is here: https://github.com/ncw/rclone/blob/f9a1a7e700ff540490c5cc22e4a40502a8c89d6b/fs/accounting/accounting.go#L218-L241

It should always return an integer, so if it didn’t that means that 1E9 * seconds must have overflowed an int64. That in turn must mean seconds was > 18,446,744,073.

However, we know that avg = 11,069,000 more or less, and left = (100-71)*285.571E6/100 = 82,815,590, so float(left) / avg = 7.48s, which is what we expected to see.
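
For reference, a simplified sketch of that back-of-the-envelope ETA arithmetic (not the accounting.go code linked above), using the numbers from the example:

    package main

    import (
        "fmt"
        "time"
    )

    // eta is the estimated time remaining: bytes left divided by the
    // average speed, converted to a time.Duration (int64 nanoseconds).
    func eta(left int64, avg float64) time.Duration {
        if avg <= 0 {
            return 0
        }
        return time.Duration(float64(left) / avg * float64(time.Second))
    }

    func main() {
        // Numbers from the example: 285.571M file, 71% done, 11.069M/s.
        size := int64(285.571e6)
        left := size - int64(0.71*float64(size)) // ~82.8 MB still to send
        fmt.Println(eta(left, 11.069e6))         // prints roughly 7.5s
    }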

This makes me think maybe there is a memory corruption going on - in which case time to break out the race detector.

Or maybe this is a Windows bug of some kind (I think you are on Windows, aren’t you?).

Any chance you could build rclone with go build -race and see if you can reproduce it and whether you get any RACE messages? If not, I can send you a race-enabled build tomorrow.

Indeed…

If there is an intermittent connection problem (a disconnection for a few ms, for example), could it push the estimate to infinity and cause the overflow?

I would not be surprised if it’s a windows bug … :wink:

Yes, it’s Windows 10 Pro 64-bit (1803) running on an i5-7200U with 8 GB of RAM.

I think I’d prefer that, because I don’t have git etc. on this computer.

BTW, I’m running Rclone with these parameters:

rclone: Version "v1.41-080-g3ef938eb" starting with parameters
["RCLONE" "sync" "drive_local:D:\\ioioioioioio"
"Wasabi:ioioioioioioi"
"--filter-from" "ioioioioi.txt"
"--config=D:\\ioioioioi\\rclone.conf"
"--stats-unit=bits"
"--stats" "15s"
"--stats-log-level" "DEBUG" 
"-vv"
"--log-file=D:\\Logs\\ioioioio_2018-06-10_17-42-49.log"
"--backup-dir" "Wasabi:ioioioioi/2018-06/2018-06-10_17-42-49/ioioioio/"]

I think the problem is that there is internal buffering in the S3 backend, so the samples given to the moving average are (5M, 0, 0, 0, 5M) rather than something smoother.

I had a good look at the source of the library that rclone was using to calculate the moving averages and I think there is a problem with the way rclone is using it (those zeroes reset its internal state). I replaced it with a very simple bit of arithmetic instead. Can you have a go with this and see if it has the same problem?

https://beta.rclone.org/v1.41-102-g0e4eab1b-fix-stats/ (uploaded in 15-30 mins)
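
For illustration, a minimal sketch of the sort of simple arithmetic that avoids the reset problem: an exponentially weighted average where zero-byte samples just pull the speed down instead of wiping any state (this is not necessarily the exact arithmetic in the fix):

    package main

    import (
        "fmt"
        "time"
    )

    // ewmaSpeed keeps an exponentially weighted moving average of speed.
    type ewmaSpeed struct {
        avg    float64 // bytes/s
        weight float64 // smoothing factor in (0, 1]
    }

    // add folds one sample window into the average: n bytes seen in dt.
    func (e *ewmaSpeed) add(n int64, dt time.Duration) {
        sample := float64(n) / dt.Seconds()
        if e.avg == 0 {
            e.avg = sample
            return
        }
        e.avg = e.weight*sample + (1-e.weight)*e.avg
    }

    func main() {
        // Bursty samples like the S3 buffering produces: 5M, 0, 0, 0, 5M.
        s := ewmaSpeed{weight: 0.2}
        for _, n := range []int64{5e6, 0, 0, 0, 5e6} {
            s.add(n, time.Second)
            fmt.Printf("%.0f bytes/s\n", s.avg)
        }
    }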

I think the statistics are more consistent now. However, there is still something strange: it seems that rclone reads a piece/chunk of the file and then calculates the statistic, then “freezes” the calculation until it reads the next piece/chunk, at which point it updates the statistic. See what happens on the rows where the read % is updated:

(Remembering that there are 4 simultaneous transfers)

	Line 1898:  *   .../20160101_135553.mp4: 21% /114.575M, 9.725M/s, 1m13s
	Line 1914:  *   .../20160101_135553.mp4: 21% /114.575M, 3.694M/s, 3m14s
	Line 1926:  *   .../20160101_135553.mp4: 21% /114.575M, 1.403M/s, 8m30s
	Line 1938:  *   .../20160101_135553.mp4: 21% /114.575M, 545.603k/s, 22m24s
	Line 1950:  *   .../20160101_135553.mp4: 21% /114.575M, 207.227k/s, 59m1s
	Line 1963:  *   .../20160101_135553.mp4: 21% /114.575M, 78.707k/s, 2h35m23s
	Line 1976:  *   .../20160101_135553.mp4: 21% /114.575M, 29.894k/s, 6h49m6s
	Line 1988:  *   .../20160101_135553.mp4: 21% /114.575M, 11.354k/s, 17h57m8s
	Line 2001:  *   .../20160101_135553.mp4: 21% /114.575M, 4.312k/s, 47h15m57s
	Line 2013:  *   .../20160101_135553.mp4: 21% /114.575M, 1.638k/s, 124h26m45s
	Line 2025:  *   .../20160101_135553.mp4: 21% /114.575M, 637/s, 327h39m3s
	Line 2037:  *   .../20160101_135553.mp4: 21% /114.575M, 241/s, 862h39m55s
	Line 2050:  *   .../20160101_135553.mp4: 21% /114.575M, 91/s, 2271h17m36s
	Line 2062:  *   .../20160101_135553.mp4: 21% /114.575M, 34/s, 5980h2m24s
	Line 2074:  *   .../20160101_135553.mp4: 21% /114.575M, 13/s, 15744h43m14s
	Line 2087:  *   .../20160101_135553.mp4: 21% /114.575M, 5/s, 41453h56m23s
	Line 2099:  *   .../20160101_135553.mp4: 26% /114.575M, 2.322M/s, 4m51s
	Line 2112:  *   .../20160101_135553.mp4: 26% /114.575M, 902.903k/s, 12m47s
	Line 2139:  *   .../20160101_135553.mp4: 26% /114.575M, 342.934k/s, 33m40s
	Line 2179:  *   .../20160101_135553.mp4: 26% /114.575M, 130.250k/s, 1h28m39s
	Line 2191:  *   .../20160101_135553.mp4: 30% /114.575M, 1.122M/s, 9m27s
	Line 2203:  *   .../20160101_135553.mp4: 30% /114.575M, 436.488k/s, 24m53s
	Line 2215:  *   .../20160101_135553.mp4: 30% /114.575M, 165.783k/s, 1h5m32s
	Line 2227:  *   .../20160101_135553.mp4: 30% /114.575M, 62.967k/s, 2h52m32s
	Line 2239:  *   .../20160101_135553.mp4: 30% /114.575M, 23.915k/s, 7h34m17s
	Line 2251:  *   .../20160101_135553.mp4: 30% /114.575M, 9.083k/s, 19h56m5s
	Line 2263:  *   .../20160101_135553.mp4: 30% /114.575M, 3.449k/s, 52h29m9s
	Line 2275:  *   .../20160101_135553.mp4: 30% /114.575M, 1.310k/s, 138h11m20s
	Line 2287:  *   .../20160101_135553.mp4: 30% /114.575M, 509/s, 363h50m6s
	Line 2300:  *   .../20160101_135553.mp4: 30% /114.575M, 193/s, 957h56m2s
	Line 2313:  *   .../20160101_135553.mp4: 30% /114.575M, 73/s, 2522h7m26s
	Line 2325:  *   .../20160101_135553.mp4: 30% /114.575M, 27/s, 6640h26m48s
	Line 2337:  *   .../20160101_135553.mp4: 30% /114.575M, 10/s, 17483h29m28s
	Line 2349:  *   .../20160101_135553.mp4: 30% /114.575M, 4/s, 46031h54m43s
	Line 2361:  *   .../20160101_135553.mp4: 30% /114.575M, 1/s, 121196h26m30s
	Line 2373:  *   .../20160101_135553.mp4: 34% /114.575M, 1.958M/s, 5m4s
	Line 2385:  *   .../20160101_135553.mp4: 34% /114.575M, 761.634k/s, 13m22s
	Line 2398:  *   .../20160101_135553.mp4: 34% /114.575M, 289.277k/s, 35m11s
	Line 2411:  *   .../20160101_135553.mp4: 34% /114.575M, 109.871k/s, 1h32m40s
	Line 2423:  *   .../20160101_135553.mp4: 34% /114.575M, 41.730k/s, 4h3m59s
	Line 2435:  *   .../20160101_135553.mp4: 34% /114.575M, 15.850k/s, 10h42m24s
	Line 2447:  *   .../20160101_135553.mp4: 34% /114.575M, 6.020k/s, 28h11m21s
	Line 2459:  *   .../20160101_135553.mp4: 34% /114.575M, 2.286k/s, 74h13m9s
	Line 2471:  *   .../20160101_135553.mp4: 39% /114.575M, 1.911M/s, 4m51s
	Line 2483:  *   .../20160101_135553.mp4: 39% /114.575M, 743.219k/s, 12m46s
	Line 2495:  *   .../20160101_135553.mp4: 39% /114.575M, 282.283k/s, 33m39s
	Line 2507:  *   .../20160101_135553.mp4: 39% /114.575M, 107.215k/s, 1h28m36s
	Line 2519:  *   .../20160101_135553.mp4: 39% /114.575M, 40.721k/s, 3h53m16s
	Line 2531:  *   .../20160101_135553.mp4: 39% /114.575M, 15.466k/s, 10h14m10s
	Line 2544:  *   .../20160101_135553.mp4: 39% /114.575M, 5.874k/s, 26h57m3s
	Line 2556:  *   .../20160101_135553.mp4: 39% /114.575M, 2.230k/s, 70h57m32s
	Line 2568:  *   .../20160101_135553.mp4: 39% /114.575M, 867/s, 186h49m34s
	Line 2580:  *   .../20160101_135553.mp4: 39% /114.575M, 329/s, 491h53m26s
	Line 2592:  *   .../20160101_135553.mp4: 43% /114.575M, 1.018M/s, 8m27s
	Line 2604:  *   .../20160101_135553.mp4: 43% /114.575M, 395.787k/s, 22m16s
	Line 2616:  *   .../20160101_135553.mp4: 43% /114.575M, 150.324k/s, 58m39s
	Line 2628:  *   .../20160101_135553.mp4: 43% /114.575M, 57.095k/s, 2h34m25s
	Line 2640:  *   .../20160101_135553.mp4: 43% /114.575M, 21.685k/s, 6h46m33s
	Line 2652:  *   .../20160101_135553.mp4: 43% /114.575M, 8.236k/s, 17h50m26s
	Line 2664:  *   .../20160101_135553.mp4: 43% /114.575M, 3.128k/s, 46h58m20s
	Line 2676:  *   .../20160101_135553.mp4: 43% /114.575M, 1.188k/s, 123h40m20s
	Line 2688:  *   .../20160101_135553.mp4: 43% /114.575M, 462/s, 325h36m52s
	Line 2700:  *   .../20160101_135553.mp4: 43% /114.575M, 175/s, 857h18m13s
	Line 2712:  *   .../20160101_135553.mp4: 43% /114.575M, 66/s, 2257h10m35s
	Line 2724:  *   .../20160101_135553.mp4: 43% /114.575M, 25/s, 5942h52m18s
	Line 2767:  *   .../20160101_135553.mp4: 43% /114.575M, 9/s, 15646h51m39s
	Line 2779:  *   .../20160101_135553.mp4: 48% /114.575M, 1.229M/s, 6m27s
	Line 2791:  *   .../20160101_135553.mp4: 48% /114.575M, 478.056k/s, 17m0s
	Line 2804:  *   .../20160101_135553.mp4: 52% /114.575M, 1.181M/s, 6m9s
	Line 2816:  *   .../20160101_135553.mp4: 52% /114.575M, 459.185k/s, 16m13s
	Line 2830:  *   .../20160101_135553.mp4: 52% /114.575M, 174.403k/s, 42m43s
	Line 2842:  *   .../20160101_135553.mp4: 52% /114.575M, 66.240k/s, 1h52m29s
	Line 2854:  *   .../20160101_135553.mp4: 52% /114.575M, 25.158k/s, 4h56m9s
	Line 2866:  *   .../20160101_135553.mp4: 52% /114.575M, 9.556k/s, 12h59m46s
	Line 2878:  *   .../20160101_135553.mp4: 52% /114.575M, 3.629k/s, 34h13m2s
	Line 2893:  *   .../20160101_135553.mp4: 52% /114.575M, 1.378k/s, 90h5m23s
	Line 2905:  *   .../20160101_135553.mp4: 52% /114.575M, 536/s, 237h11m44s
	Line 2917:  *   .../20160101_135553.mp4: 52% /114.575M, 203/s, 624h30m28s
	Line 2930:  *   .../20160101_135553.mp4: 52% /114.575M, 77/s, 1644h15m12s
	Line 2942:  *   .../20160101_135553.mp4: 52% /114.575M, 29/s, 4329h7m10s
	Line 2954:  *   .../20160101_135553.mp4: 52% /114.575M, 11/s, 11398h2m49s
	Line 2966:  *   .../20160101_135553.mp4: 52% /114.575M, 4/s, 30009h40m30s
	Line 2979:  *   .../20160101_135553.mp4: 52% /114.575M, 1/s, 79011h50m7s
	Line 2991:  *   .../20160101_135553.mp4: 52% /114.575M, 0/s, 208028h34m46s
	Line 3004:  *   .../20160101_135553.mp4: 52% /114.575M, 0/s, 547714h1m12s
	Line 3016:  *   .../20160101_135553.mp4: 52% /114.575M, 0/s, 1442064h35m13s
	Line 3028:  *   .../20160101_135553.mp4: 52% /114.575M, 0/s, 0s
	Line 3042:  *   .../20160101_135553.mp4: 56% /114.575M, 1.015M/s, 6m30s
	Line 3055:  *   .../20160101_135553.mp4: 56% /114.575M, 394.679k/s, 17m8s
	Line 3067:  *   .../20160101_135553.mp4: 61% /114.575M, 1.737M/s, 3m25s
	Line 3079:  *   .../20160101_135553.mp4: 61% /114.575M, 675.521k/s, 9m0s
	Line 3101:  *   .../20160101_135553.mp4: 61% /114.575M, 256.571k/s, 23m43s
	Line 3135:  *   .../20160101_135553.mp4: 61% /114.575M, 97.448k/s, 1h2m27s
	Line 3149:  *   .../20160101_135553.mp4: 61% /114.575M, 37.012k/s, 2h44m25s
	Line 3161:  *   .../20160101_135553.mp4: 61% /114.575M, 14.058k/s, 7h12m55s
	Line 3174:  *   .../20160101_135553.mp4: 61% /114.575M, 5.339k/s, 18h59m49s
	Line 3187:  *   .../20160101_135553.mp4: 61% /114.575M, 2.027k/s, 50h1m2s

I think this is the “real” statistic:

	Line 1898:  *   .../20160101_135553.mp4: 21% /114.575M, 9.725M/s, 1m13s
	Line 2099:  *   .../20160101_135553.mp4: 26% /114.575M, 2.322M/s, 4m51s
	Line 2191:  *   .../20160101_135553.mp4: 30% /114.575M, 1.122M/s, 9m27s
	Line 2373:  *   .../20160101_135553.mp4: 34% /114.575M, 1.958M/s, 5m4s
	Line 2471:  *   .../20160101_135553.mp4: 39% /114.575M, 1.911M/s, 4m51s
	Line 2592:  *   .../20160101_135553.mp4: 43% /114.575M, 1.018M/s, 8m27s
	Line 2779:  *   .../20160101_135553.mp4: 48% /114.575M, 1.229M/s, 6m27s
	Line 2804:  *   .../20160101_135553.mp4: 52% /114.575M, 1.181M/s, 6m9s
	Line 3042:  *   .../20160101_135553.mp4: 56% /114.575M, 1.015M/s, 6m30s
	Line 3067:  *   .../20160101_135553.mp4: 61% /114.575M, 1.737M/s, 3m25s

The question now is: is it possible to eliminate the intermediate lines?

That is very interesting…

How often were you printing the stats? By my calculation it’s every 15s - is that right?

98 samples from top to bottom of your example, which is 24.5 minutes to send (61%-21%)*114M = 45MB of data. That is only about 30k/s, which doesn’t seem right.
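
For anyone checking the arithmetic, a quick sketch (the 98 samples and the 15s interval come from the log above):

    package main

    import "fmt"

    func main() {
        const samples = 98                // stats lines, printed every 15s
        elapsed := samples * 15.0         // seconds: 1470s ≈ 24.5 minutes
        sent := (0.61 - 0.21) * 114.575e6 // bytes: progress went from 21% to 61%
        fmt.Printf("%.1f min, %.0f kB/s\n", elapsed/60, sent/elapsed/1000)
        // roughly 24.5 min at about 31 kB/s, which is what looks wrong
    }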

Can you examine a graph of your network interface? What speed does it think it is running at? Do you see peaks and troughs in it?

It seems to have worse behaviour than the old method of calculating it!

One of the problems is that rclone is measuring the reads into an internal buffer 5MB in size, but it doesn’t actually know when the data is sent (as that is under the control of the S3 library).

That’s right, see the parameters in the above post:

"--stats-unit=bits"
"--stats" "15s"

Specific graph for rclone.exe:

Exactly, I think that’s the problem.

I did a little test, uploading (sync) just one big file (a 2 GB VeraCrypt container) and comparing the statistics of the two versions.

Note:

  • in the tested connection, with only 1M/s of upload, 2 GB would take ~4h to send;
  • there was no other application using the connection
rclone: Version "v1.41-080-g3ef938eb"

2018-06-13 11:04:50:0292	23	 * local2:  1% /2G, 22.751M/s, 11m51s
2018-06-13 11:05:05:0200	32	 * local2:  1% /2G, 8.366M/s, 32m14s
2018-06-13 11:05:20:0153	41	 * local2:  1% /2G, 3.077M/s, 1h27m40s
2018-06-13 11:05:35:0163	50	 * local2:  1% /2G, 1.131M/s, 3h58m24s
2018-06-13 11:05:50:0224	59	 * local2:  1% /2G, 1.938M/s, 2h18m51s
2018-06-13 11:06:05:0157	68	 * local2:  1% /2G, 729.646k/s, 6h17m36s
2018-06-13 11:06:20:0158	77	 * local2:  1% /2G, 268.321k/s, 17h6m50s
2018-06-13 11:06:35:0291	86	 * local2:  1% /2G, 98.673k/s, 46h32m16s
2018-06-13 11:06:50:0163	95	 * local2:  1% /2G, 4.023M/s, 1h6m32s
2018-06-13 11:07:05:0335	104	 * local2:  1% /2G, 1.479M/s, 3h0m57s
2018-06-13 11:07:20:0160	113	 * local2:  1% /2G, 557.133k/s, 8h12m5s
2018-06-13 11:07:35:0277	122	 * local2:  1% /2G, 204.882k/s, 22h18m7s
2018-06-13 11:07:50:0160	131	 * local2:  1% /2G, 75.344k/s, 60h38m45s
2018-06-13 11:08:05:0282	140	 * local2:  2% /2G, 1.441M/s, 3h5m17s
2018-06-13 11:08:20:0213	149	 * local2:  2% /2G, 2.809M/s, 1h34m50s
2018-06-13 11:08:35:0168	158	 * local2:  2% /2G, 1.033M/s, 4h17m54s
2018-06-13 11:08:50:0163	167	 * local2:  2% /2G, 388.979k/s, 11h41m18s
2018-06-13 11:09:05:0279	176	 * local2:  2% /2G, 143.044k/s, 31h47m3s
2018-06-13 11:09:20:0338	185	 * local2:  2% /2G, 52.604k/s, 86h25m49s
2018-06-13 11:09:35:0291	194	 * local2:  2% /2G, 1.360M/s, 3h15m20s
2018-06-13 11:09:50:0133	203	 * local2:  2% /2G, 512.282k/s, 8h51m10s
2018-06-13 11:10:05:0147	212	 * local2:  2% /2G, 2.310M/s, 1h54m43s
2018-06-13 11:10:20:0150	221	 * local2:  2% /2G, 870.031k/s, 5h11m58s
2018-06-13 11:10:35:0163	230	 * local2:  2% /2G, 319.947k/s, 14h8m21s
2018-06-13 11:10:50:0124	239	 * local2:  3% /2G, 2.555M/s, 1h43m28s
2018-06-13 11:11:05:0124	248	 * local2:  3% /2G, 962.254k/s, 4h41m21s
2018-06-13 11:11:20:0326	257	 * local2:  3% /2G, 353.862k/s, 12h45m6s
rclone: Version "v1.41-102-g0e4eab1b-fix-stats"

2018-06-13 10:56:19:0630	58	 * local2:  1% /2G, 205.883k/s, 22h21m34s
2018-06-13 10:56:34:0522	67	 * local2:  1% /2G, 2.582M/s, 1h44m12s
2018-06-13 10:56:49:0470	76	 * local2:  1% /2G, 3.170M/s, 1h24m40s
2018-06-13 10:57:04:0636	85	 * local2:  1% /2G, 1.204M/s, 3h42m57s
2018-06-13 10:57:19:0528	94	 * local2:  1% /2G, 468.198k/s, 9h47m1s
2018-06-13 10:57:34:0583	103	 * local2:  1% /2G, 177.827k/s, 25h45m33s
2018-06-13 10:57:49:0495	112	 * local2:  1% /2G, 67.541k/s, 67h49m14s
2018-06-13 10:58:04:0404	121	 * local2:  1% /2G, 1.190M/s, 3h44m55s
2018-06-13 10:58:19:0530	130	 * local2:  1% /2G, 462.959k/s, 9h52m11s
2018-06-13 10:58:34:0465	139	 * local2:  1% /2G, 175.837k/s, 25h59m9s
2018-06-13 10:58:49:0583	148	 * local2:  1% /2G, 66.785k/s, 68h25m4s
2018-06-13 10:59:04:0596	157	 * local2:  2% /2G, 2.191M/s, 2h1m36s
2018-06-13 10:59:19:0392	166	 * local2:  2% /2G, 851.998k/s, 5h20m10s
2018-06-13 10:59:34:0624	175	 * local2:  2% /2G, 323.600k/s, 14h2m59s
2018-06-13 10:59:49:0405	184	 * local2:  2% /2G, 122.906k/s, 36h59m30s
2018-06-13 11:00:04:0624	193	 * local2:  2% /2G, 1.537M/s, 2h52m55s
2018-06-13 11:00:19:0501	202	 * local2:  2% /2G, 597.643k/s, 7h35m18s
2018-06-13 11:00:34:0414	211	 * local2:  2% /2G, 2.279M/s, 1h56m18s
2018-06-13 11:00:49:0448	220	 * local2:  2% /2G, 886.394k/s, 5h6m12s
2018-06-13 11:01:04:0462	229	 * local2:  2% /2G, 336.663k/s, 13h26m13s
2018-06-13 11:01:19:0503	238	 * local2:  3% /2G, 1.291M/s, 3h24m48s
2018-06-13 11:01:34:0537	247	 * local2:  3% /2G, 502.105k/s, 8h59m13s

They give different outputs, and to be honest I don’t think either of them is completely correct.

I keep getting the impression that the statistics work well for small/medium files, but not for big ones.

But it is something I have already gotten used to, and I don’t think it interferes with using rclone.

:wink:

The stats should sort of be eventually correct.

What rclone should really be doing is measuring the stats on the output of the buffer rather than the input. That is what it does for the other remotes, but with Wasabi rclone uses the Amazon SDK, which requires a block of memory as input, which means it can’t do that.
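
For context, the usual way to measure on the output side is to wrap the stream and count bytes as the uploader pulls them; a minimal sketch of that idea (not rclone's actual accounting type). With an SDK that wants a whole block of memory up front, the count instead jumps by the block size the moment rclone fills its buffer, which is the input-side measurement described above.

    package main

    import (
        "fmt"
        "io"
        "strings"
        "sync/atomic"
    )

    // countingReader counts bytes as they are read out of the wrapped
    // reader, i.e. at the point where the consumer actually pulls data.
    type countingReader struct {
        r io.Reader
        n int64
    }

    func (c *countingReader) Read(p []byte) (int, error) {
        n, err := c.r.Read(p)
        atomic.AddInt64(&c.n, int64(n))
        return n, err
    }

    func main() {
        cr := &countingReader{r: strings.NewReader("some file contents")}
        io.Copy(io.Discard, cr) // stand-in for the uploader pulling data
        fmt.Println("bytes accounted:", atomic.LoadInt64(&cr.n))
    }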

:slight_smile: Well, if you’d like me to look at it further then please make a new issue on GitHub so I won’t forget!

Thanks Nick, but I don’t think it’s necessary, we can use it perfectly the way it is. :+1:

Even more so if we consider that it is something specific to Wasabi / S3 / the Amazon SDK.
