Max Pages on Linux with rClone/MergerFS

I have been trying all the commands with both the flags specified, and also with and without VFS writes enabled. It hasn't helped.

Writes Disabled: https://drive.google.com/file/d/1Et5GQi9OQe_itUlSzqzurJnomBQpRR6z/view

Writes Enabled: https://drive.google.com/file/d/1fX2sBDgJNAegg3AMDgq6zVKH9Whuu3oQ/view

That looks like Rcat is ignoring --ignore-checksum

I'll push a fix for that in a moment

That looks like the file gets copied to cache. It is the same length as the file in the object store, so rclone checksums the local and the remote before deciding whether to upload it. It looks like maybe it doesn't upload it - check the log!
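The decision described above can be sketched roughly as follows. This is an illustrative helper, not rclone's actual code (the real logic lives in rclone's operations package and uses whatever hash the backend supports): only when the sizes match does a checksum comparison decide whether the upload happens at all.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"fmt"
)

// shouldUpload sketches the decision: if the cached copy and the
// remote object are the same length, checksum both and only upload
// when the digests differ. (Hypothetical helper for illustration.)
func shouldUpload(localSize, remoteSize int64, local, remote []byte) bool {
	if localSize != remoteSize {
		return true // different lengths: always upload
	}
	localSum := md5.Sum(local)
	remoteSum := md5.Sum(remote)
	// equal size and equal checksum: nothing to do
	return !bytes.Equal(localSum[:], remoteSum[:])
}

func main() {
	data := []byte("same contents")
	// same size, same checksum: the upload gets skipped
	fmt.Println(shouldUpload(int64(len(data)), int64(len(data)), data, data))
}
```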

Amazing that you were able to figure that out just from a flamegraph, but, yes, that was the case. After deleting the file from the mount and copying a new one, the speeds have increased. It's not as good as mergerfs, but that's probably because of the double copying required. Once the Rcat bug gets fixed, it should be close to or equal to the performance of mergerfs.

Are there any other checksum algorithms available to use other than Md5? Just want to see if something else will be faster than Md5 when enabled.

Updated Profile with Write Caching Enabled: https://drive.google.com/file/d/1vb05hGLpEAz1ispZ8YXFqH0sMTTd-G6_/view

Doesn't seem to have any obvious bottleneck.

> Are there any other checksum algorithms available to use other than Md5? Just want to see if something else will be faster than Md5 when enabled.

crc32 is faster on a computational level; however, you are still limited by the IO read speed, which at that point makes it just about even across the board. Maybe a second or two of difference between them.

This actually seems like an interesting project... https://cyan4973.github.io/xxHash/

Some more benchmarks for reads on a local mount on an NVMe disk:

hanwen/go-fuse (mount2)

darthshadow@server:~/max-pages$ rm 10G.img && dd of=10G.img if=test-mount/10G.img count=10240 bs=1048576
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 4.56218 s, 2.4 GB/s
darthshadow@server:~/max-pages$ rm 10G.img && dd of=10G.img if=test-mount/10G.img count=10240 bs=1048576
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 4.56656 s, 2.4 GB/s
darthshadow@server:~/max-pages$ rm 10G.img && dd of=10G.img if=test-mount/10G.img count=10240 bs=1048576
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 4.78932 s, 2.2 GB/s

Flame Graph: https://drive.google.com/file/d/1Jm1wOO9jb5QWPwUvDtilISf4QliENHYU/view

bazil/fuse (mount)

darthshadow@server:~/max-pages$ rm 10G.img && dd of=10G.img if=test-mount/10G.img count=10240 bs=1048576
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 7.83715 s, 1.4 GB/s
darthshadow@server:~/max-pages$ rm 10G.img && dd of=10G.img if=test-mount/10G.img count=10240 bs=1048576
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 8.46278 s, 1.3 GB/s
darthshadow@server:~/max-pages$ rm 10G.img && dd of=10G.img if=test-mount/10G.img count=10240 bs=1048576
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 9.18021 s, 1.2 GB/s

Flame Graph: https://drive.google.com/file/d/1CStArTi9-1laoKYCWfYl2DeSQTzIzyeR/view

A significant amount of time is spent on GC, which probably explains the speed differences. Does anything obvious pop out to you, @ncw?

They have a surprising amount of info in them - a good visualization!

I've pushed a fix for that to master now.

Rclone supports a few algorithms... If you change the order of the hashes in fs/hash/hash.go then you can try CRC32 as the first one, which will be faster than MD5. I prefer MD5 for actual data integrity checks though!

The one avoidable part is the memmove bit at the start, which is memory copying...

mount2 looks pretty much as fast as it could be - nearly all the time is spent in syscalls.

mount is using a lot of time in the garbage collector. It could be the underlying library creating the garbage, or it could be the specific interface, so cmd/mount vs cmd/mount2. There is probably a way of profiling where the garbage comes from... Something like this, which profiles all the memory ever allocated:

go tool pprof -alloc_space -svg http://localhost:5572/debug/pprof/heap > heap.svg

I suspect it is probably the fuse library, as go-fuse does talk about "performance competitive with libfuse" on its front page as one of its features.

Sorry for the late reply.
Tested mount2 today with rclone v1.51.0-042-g219bd97e-beta.
df -h is still not working.

ls -ld /mnt/mount/temp
drwxrwx--- 1 rclone rclone 0 Nov 12  2018 /mnt/mount/temp

cd /mnt/mount/temp
-bash: cd: /mnt/mount/temp/: Permission denied

touch /mnt/rc1/temp/test.txt
*no errors*

rm /mnt/mount/temp/test.txt
rm: remove write-protected regular empty file '/mnt/mount/temp/test.txt'? y
*no errors after y*

I haven't had time to fix it yet!

Looks like you are right.

Memory Flamegraph: https://drive.google.com/file/d/1rEF5E974uyVbA-0XlzZpv1z-mq-Z9pyA/view
Memory Heap: https://drive.google.com/file/d/10n19adUrX0e3RzRoGQjKwhoKWEkmn4hH/view


I also tried with the latest master and the results are below:
Write Caching Enabled: https://drive.google.com/file/d/1vb05hGLpEAz1ispZ8YXFqH0sMTTd-G6_/view
Write Caching Disabled: https://drive.google.com/file/d/1Et5GQi9OQe_itUlSzqzurJnomBQpRR6z/view

Looks like we are good other than the memmoves which show up


I also tried out a couple of hashing algorithms and the results are below:

MD5: https://drive.google.com/file/d/1fQN_fhIb_CU7IrMDQjIoRkiW3i34PI9p/view
CRC32: https://drive.google.com/file/d/1w2B5pRO2sgyZP5_c4BBhCcHn0EuJRazG/view
XxHash64: https://drive.google.com/file/d/1nweOF12AyW4LQGSvrCJUTrrXQh4s6VNn/view

MD5 was the slowest and XxHash64 was the fastest, with CRC32 being just slightly slower than XxHash64.

Can we add an option to allow choosing the hash as a parameter? And what would be the implications of changing the hash for a backend? Will it affect existing files on the backend?

Nice! That reminded me that the bazil fuse has a pathological memory problem when you set the attr timeout to 0, which is probably related...

I made a fix to the accounting which implements WriterTo, which it passes on to the async reader - this could help here. I haven't merged it yet - it needs a bit more testing.

We are limited to which hashes the backends support. Most backends only support one hash, but some support more than one. I think google drive supports CRC32 but rclone doesn't implement it. The local backend can support any hash, but will use MD5 by default.

We could add (say) xxhash64 to the local backend - this would only get used for local -> local copies (since I don't think any other backends support it) but it would speed those up. I'm not keen on crc32 as an integrity check - it is better than nothing, but it only takes 3 bitflips to defeat it. I don't know the theoretical properties of xxHash but it will be limited by only being 64 bits...

We'd need to add xxhash64 as a supported hash function and put it first in the list - local -> local would use it automatically then.