Performance with small files on B2 copy

What is the problem you are having with rclone?

Not a problem as such, but a performance issue

What is your rclone version (output from rclone version)

rclone v1.49.3

  • os/arch: windows/amd64
  • go version: go1.12.3

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Windows 2016

Which cloud storage system are you using? (eg Google Drive)

Backblaze B2

The command you were trying to run (eg rclone copy /tmp remote:tmp)

& 'rclone.exe' @('copy', "$Ruta\$FileName", "BackBlaze:$BucketName")

Hi all
Thanks to all the team who maintain this forum and this tool.
I've been given the job of uploading our enterprise backup to BackBlaze B2.
I've been testing three different tools (rclone, b2, and Cyberduck CLI) and I decided on rclone.
My data universe is about 4 TB, with file sizes ranging from roughly 300 MB to 600 GB.
My script sends one file at a time to the cloud...
With the biggest files, performance is great, about 80 Mbit/s :grinning:
But with the smaller ones (roughly 300 MB to 1 GB) performance is not good, about 30 Mbit/s :pensive:
What I have seen is that rclone opens more parallel connections for large files than for smaller ones; am I right?
If that is the case, and knowing that the number of parallel connections is decisive for good performance, how can I instruct rclone to open more connections even when the files are not very large?
Right now my PowerShell command is very simple and I'm using the default config:

& 'rclone.exe' @('copy', "$Ruta\$FileName", "BackBlaze:$BucketName")


If you increase --transfers then rclone will increase the number of chunks transferred at once per file.

You can also increase

  --b2-chunk-size SizeSuffix   Upload chunk size. Must fit in memory. (default 96M)

Note that chunks are held in memory so you'll need approx --transfers * --b2-chunk-size of memory.
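As a rough sanity check of that memory note (a sketch of the arithmetic, not rclone's exact accounting; the worker counts below are illustrative):

```python
MIB = 1024 * 1024

def chunk_buffer_memory(transfers: int, chunk_size_mib: int) -> int:
    """Approximate bytes buffered for in-flight B2 chunks: each of the
    --transfers workers holds one --b2-chunk-size chunk in memory."""
    return transfers * chunk_size_mib * MIB

# Defaults: 4 transfers x 96 MiB chunks ~= 384 MiB of buffers.
print(chunk_buffer_memory(4, 96) // MIB)   # 384

# 50 transfers at the default chunk size would need ~4800 MiB (~4.7 GiB).
print(chunk_buffer_memory(50, 96) // MIB)  # 4800
```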

Thank you, I will try it and I'll be back.

Hi @ncw. I was doing some tests with --transfers 50 in my upload line, then checking with the netstat command how many connections originated from the rclone process.
In the best case there were 30 open connections at a time for a 2.5 GB file, and for a 1 GB file there were around 10 connections at a time.
Sadly I could not verify that the --transfers parameter had any effect on the number of connections opened in parallel.
From my point of view, rclone "decides" how many connections to open based on the file size more than on any other parameter.
Then I went back to "Edit advanced config" for the given endpoint just in case I was missing something, but the only interesting setting there was "chunk_size".
So if there is another parameter to include, that would be great; otherwise I'm thinking of sending two or three files in parallel to maximize bandwidth usage.

Just thinking out loud: could it be that the --transfers parameter takes effect on sync but not on copy?

Well, you can't have more parallel uploads per file than file_size / chunk_size, if that is what you mean?
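That arithmetic explains the netstat counts above. A small illustrative calculation (the file sizes are assumptions chosen to match the observations in this thread):

```python
import math

MIB = 1024 * 1024
DEFAULT_CHUNK = 96 * MIB  # rclone's default --b2-chunk-size

def max_parallel_chunks(file_size: int, chunk_size: int) -> int:
    """A file splits into at most this many chunks, which caps how many
    uploads of that single file can run in parallel."""
    return math.ceil(file_size / chunk_size)

# ~1 GB file: only ~11 chunks, so --transfers 50 can't open more connections.
print(max_parallel_chunks(1024 * MIB, DEFAULT_CHUNK))  # 11

# ~2.5 GB file: ~27 chunks, close to the ~30 connections seen in netstat.
print(max_parallel_chunks(2560 * MIB, DEFAULT_CHUNK))  # 27
```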

Ok, I think I understand the behavior now, so my guess was correct.
Maybe I was expecting behavior similar to the b2 upload tool, where you can use the --threads parameter and it will open as many connections as threads, regardless of chunk size.
Evidently, if I want more connections then I have to reduce chunk_size.
Thank you.


Hi all
I'm updating this post to report that reducing the chunk_size greatly increased the bandwidth utilization.
Now I'm using --transfers 100 and a chunk_size of 15 MB; with this setup I get 100 connections in parallel and full utilization of the bandwidth.
Thanks again for helping me understand the behavior of these settings.
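For reference, the same per-file arithmetic with the smaller chunk size (illustrative file sizes, assuming 15M means 15 MiB):

```python
import math

MIB = 1024 * 1024

def chunks(file_size: int, chunk_size: int) -> int:
    """Number of parts a file splits into at a given chunk size."""
    return math.ceil(file_size / chunk_size)

# With 15 MiB chunks, a ~1 GB file splits into ~69 parts, so even a
# single smallish file can keep dozens of connections busy.
print(chunks(1024 * MIB, 15 * MIB))  # 69

# A ~300 MB file now yields 20 parts instead of 4 at the 96 MiB default.
print(chunks(300 * MIB, 15 * MIB))   # 20
print(chunks(300 * MIB, 96 * MIB))   # 4
```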

No worries! Larger chunks help with larger files I think, however the setting of --b2-chunk-size is largely historical. B2 used to only support 100 MB chunks (100E6 bytes), which is where the default of 96MiB comes from. I think chunks can be as small as 5 MB.

Do you think the default should change?

It is clear that increasing parallel connections increases throughput. No doubt.
What I think matters most is to maximize the number of connections. With that premise, reducing the chunk size worked great in my case (although the B2 documentation speaks of 100 MB, a size of 15 MB was accepted without problems).
My suggestion is to make this clear in the documentation.

I'd love to have a PR with doc updates, or maybe just posting suggestions here:

The correct file to patch would be:

Though maybe the help should be on the chunk_size parameter which is here

Many thanks.

Hi all
After a while working with this configuration very well, yesterday I faced a problem.
The B2 backend refused to upload some big files.

rclone.exe : 2019/10/06 19:26:32 ERROR : files01.vm-52863D2019-10-05T210044_70C4.vbk: Failed to copy: "files01.vm-52863D2019-10-05T210044_70C4.vbk" too big (298349092864 bytes) makes too many parts 18969 > 10000 - increase --b2-chunk-size

It seems that the number of parts a file can be chunked into can't exceed 10,000.
I mention it just in case the documentation does not reflect this limit (I didn't check). If it is already in the docs, please dismiss this.
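The numbers in the error message are easy to reproduce from that 10,000-part limit, and the same arithmetic gives the smallest chunk size that would work for a given file (a sketch; the byte count is taken from the log above):

```python
import math

MIB = 1024 * 1024
MAX_PARTS = 10000            # B2 large-file part limit from the error

file_size = 298349092864     # the ~278 GiB file from the log above
chunk_size = 15 * MIB

# Parts this chunk size would produce: over the limit, as reported.
print(math.ceil(file_size / chunk_size))  # 18969

# Smallest chunk size keeping the file within 10,000 parts: ~29 MiB,
# so rounding up to e.g. --b2-chunk-size 30M would be enough here.
min_chunk = math.ceil(file_size / MAX_PARTS)
print(math.ceil(min_chunk / MIB))         # 29
```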

Now I have a question: can the --b2-chunk-size parameter be set on the command line when invoking rclone, or only configured beforehand via rclone config?

You can increase the --b2-chunk-size on the command line which should hopefully solve the problem.

Beware that B2 chunks are buffered in memory.
