As some of you know, I always update to the latest (Windows x64) build. So, yesterday I went with 5133, which includes changes to the bandwidth limiter (--bwlimit). My move command has been the same for a long time:
Most of those flags are RcloneBrowser defaults. The bug I'm seeing with build 5133 is that it sets the bandwidth limit at 15M, disregarding the flag. I tested this several times with different builds and found that it is only 5133 that does that. The previous build, 5128, does not have this issue.
I can confirm this for Windows; the offending commit is 463a18aa0713a55e12e4cb6d201144121fc60ba5 (found via git bisect). However, for me it is limited to around 8 MiB/s. Limiting works correctly only for limits <= 8 MiB/s.
Edit: For some reason, the max bandwidth limit is TokenBucketSlotTransportRx * 2 KiB. Maybe the smoothing is going overboard?
I can't replicate this directly under Linux; however, after reducing TokenBucketSlotTransportRx to 1k I can see the effect, just not as dramatically.
I think what is happening here is that the time to wait for a 4k block at 95 MB/s is too small. It is 41 µs, but I think the resolution of the Windows timer is coarser than that.
If it is limiting you to 8 MiB/s then that implies a timer resolution of more like 488 us or 0.5 ms.
@x0b and @VBB, can you recompile rclone and try to find the minimum values of these in fs/accounting/token_bucket.go that more or less hit the rate limit?
I think you'll probably need them at least 10x bigger.
I think the token bucket size might need to scale according to the bandwidth, but that is a bigger change than I'd like to put in just before a release, so I might just revert this commit.
On the download side TokenBucketSlotTransportRx seems to be linearly correlated with the effective download bandwidth.
| TokenBucketSlotTransport | 16 * 1024 | 32 * 1024 | 64 * 1024 | 128 * 1024 |
| --- | --- | --- | --- | --- |
| Rx (MiB/s) | 29.7 | 58.1 | 109.7 | 219.8 |
| Tx (MiB/s) | 30.8 | 61.3 | 92.1 | 161.3 |
Assuming a linear relationship, TokenBucketSlotTransportRx = 604 * 1024 would be required for 1 GiB/s. The numbers were measured on Windows 20H2 with an ancient (2013) CPU.
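For anyone who wants to redo the extrapolation, here is a minimal sketch that fits a least-squares line through the four Rx measurements and solves it for 1024 MiB/s. The post does not say how the 604 figure was obtained, so the choice of an ordinary least-squares fit is an assumption, and `fitLine` is an illustrative helper.

```go
package main

import "fmt"

// fitLine returns the ordinary least-squares slope and intercept for
// y = slope*x + intercept.
func fitLine(xs, ys []float64) (slope, intercept float64) {
	n := float64(len(xs))
	var sx, sy, sxx, sxy float64
	for i := range xs {
		sx += xs[i]
		sy += ys[i]
		sxx += xs[i] * xs[i]
		sxy += xs[i] * ys[i]
	}
	slope = (n*sxy - sx*sy) / (n*sxx - sx*sx)
	intercept = (sy - slope*sx) / n
	return slope, intercept
}

func main() {
	slot := []float64{16, 32, 64, 128}         // slot size in units of 1024
	rx := []float64{29.7, 58.1, 109.7, 219.8}  // measured Rx in MiB/s

	slope, intercept := fitLine(slot, rx)
	need := (1024 - intercept) / slope // 1 GiB/s = 1024 MiB/s
	fmt.Printf("slope %.3f, intercept %.2f, need about %.0f * 1024\n",
		slope, intercept, need)
}
```

With these four points the fit lands close to 604 * 1024. The Tx row is visibly less linear, so the same extrapolation would be less trustworthy there.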
Edit:
Added upload (Tx) data. For Tx, I used the internal statistics of rclone and copied a file from disk into RAM. This might explain the difference.
If it is limiting you to 8 MiB/s then that implies a timer resolution of more like 488 us or 0.5 ms.
@ncw
You are definitely on the right track with your timer resolution guess. I used a tool to read the timer resolution, and it was 1 ms. When I set the timer resolution to 0.5 ms, the minimum value, the effective bandwidth immediately doubled. I then checked the Windows energy statistics: the 1 ms is actually set by the rclone binary and would otherwise be 15.625 ms. (related discussion)
This seems to require a robust solution, meaning the commit needs to be reverted (or only applied to non-Windows) for 1.54.
For me it was capped at almost exactly 15MB/s. This is on a gig/gig connection and letting it run for a good twelve hours overnight. It started off around 6-7 MB/s and slowly made its way up to 15.