That is a very interesting analysis - thanks.
I expect that FileZilla does overlapping reads to get the performance up or uses bigger buffers.
I had a quick look at the code for the SFTP library - it appears that it does use overlapping reads if the read buffer is big enough.
The default read buffer is 32k. It is possible to increase that, but not all servers are guaranteed to support more than 32k. I wonder if FileZilla does that? I looked at FileZilla's source briefly, then I remembered why I'm not a C++ programmer any more.
Rclone uses 1 MB buffers to accept the reads from SFTP...
It might be worth experimenting with the --buffer-size parameter to see what difference that makes. Try 0 as a baseline too.