Optimize Gdrive mount for fast lookups

What is the problem you are having with rclone?

I have many large databases (>100GB!) on Gdrive, mounted read-only with rclone. I need to randomly look up some data (for example, a 256KB chunk at a random offset in a file) and I need to do it as fast as possible (max 2 seconds allowed). How can I accomplish this with rclone? Which options are advised? Would the cache backend be useful at all? I'm already using my own client id.
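To sanity-check that latency target, one quick approach is to time a single 256 KiB read at a chunk-aligned offset with `dd`. A minimal sketch; `testfile.bin` is a local stand-in, and in a real test you would point `FILE` at a database file on the rclone mount:

```shell
# Time one 256 KiB read at a chunk-aligned offset.
# testfile.bin is a local stand-in; point FILE at a file on the
# rclone mount to measure real lookup latency.
FILE=testfile.bin
dd if=/dev/zero of="$FILE" bs=1M count=4 2>/dev/null
CHUNK=$((256 * 1024))
SKIP=7                       # chunk index; pick one at random in a real test
START=$(date +%s%N)
BYTES=$(dd if="$FILE" bs="$CHUNK" skip="$SKIP" count=1 2>/dev/null | wc -c)
END=$(date +%s%N)
echo "read $BYTES bytes in $(( (END - START) / 1000000 )) ms"
```

If the reported time on the mount stays comfortably under 2000 ms across many random offsets, the setup meets the target.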

What is your rclone version (output from rclone version)

rclone v1.45

  • os/arch: linux/arm
  • go version: go1.11.6
    I can update, or compile beta from scratch, no problem....

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Raspberry pi 4 8gb x64, so i can use 4 threads and max 6gb of ram.
I might install gentoo for max performance, idk :wink:

Which cloud storage system are you using?

Google Drive

Thank you!

hello and welcome to the forum,

that is a very old version of rclone, really need to update to latest stable v1.55.1.

really hard to give good advice, as you did not answer the questions from the help template.

I don't have logs and config because my question is about a system that does not exist yet and is going to be built in the next few days. You can think about a typical config for Google drive and a typical rclone mount cmdline.

i thought that meant you have a rclone command and config file?

that buggy, never-left-beta cache backend has been deprecated

i can make it up, the config looks like:

    type = drive
    client_id = lol
    client_secret = blabla
    scope = drive.readonly
    token = {"access_token":"blabla","token_type":"Bearer","refresh_token":"blabla","expiry":"2021-04-22T11:21:33.3364267+02:00"}
    root_folder_id = 1VK-blabla

And I guess the cmdline is like:

rclone mount RO:/ /home/pi/

What about vfs cache?

  1. pi4 - if using sd card and/or external usb, random access might be very slow. i find my pi4 very slow at many basic tasks.
  2. vfs cache - need to update rclone to latest stable
  3. gdrive has lots of throttling

every use case is different, test using a simple command and establish a baseline.
rclone mount RO: /home/pi -vv

then try rclone mount RO: /home/pi -vv --vfs-cache-mode=full
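One way to see what `--vfs-cache-mode=full` buys you is to time the same chunk read twice: on such a mount the second read should be served from the local cache dir. A rough sketch using a local file as a stand-in for a file under the mount point (paths here are hypothetical):

```shell
# Time the same 256 KiB chunk twice. On a mount with --vfs-cache-mode=full
# the second (warm) read is served from the cache dir and should be much
# faster. testfile.bin stands in for a file under the mount point.
dd if=/dev/zero of=testfile.bin bs=1M count=4 2>/dev/null
t() {  # print milliseconds taken to read one 256 KiB chunk at index 3
    S=$(date +%s%N)
    dd if=testfile.bin bs=256k skip=3 count=1 of=/dev/null 2>/dev/null
    E=$(date +%s%N)
    echo $(( (E - S) / 1000000 ))
}
COLD=$(t); WARM=$(t)
echo "cold: ${COLD} ms, warm: ${WARM} ms"
```

Comparing cold vs warm numbers against the baseline mount gives a quick read on whether the cache helps your access pattern.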

At the moment, the fastest config and cmdline I could come up with in my experiments was the following:

type = drive
scope = drive.readonly
v2_download_min_size = 0
pacer_min_sleep = 1ms
disable_http2 = false
rclone mount ro: db/ --vfs-read-chunk-size=64k --poll-interval=1h  --dir-cache-time=2h --buffer-size=0 --cache-dir /tmp/rclone --vfs-cache-mode full --no-checksum --no-modtime --read-only --vfs-read-wait 0 --max-read-ahead 0 --use-mmap --fast-list --checkers 2  --no-check-certificate  --multi-thread-cutoff 0 --multi-thread-streams 2 --vfs-cache-max-age 10000h -q --use-cookies

The logs are similar to this snippet (log taken when the chunk size was set to 32k):

ChunkedReader.Read at 31591882752 length 16384 chunkOffset 31591870464 chunkSize 32768
Read: read=16384, err=
Read: len=12288, offset=21037834240
ReadFileHandle.seek from 31591899136 to 21037834240 (fs.RangeSeeker)
ChunkedReader.RangeSeek from 31591899136 to 21037834240 length -1
ChunkedReader.Read at -1 length 12288 chunkOffset 21037834240 chunkSize 32768
ChunkedReader.openRange at 21037834240 length 32768
Read: read=12288, err=
Read: len=16384, offset=21037846528
ChunkedReader.Read at 21037846528 length 16384 chunkOffset 21037834240 chunkSize 32768
Read: read=16384, err=
Read: len=12288, offset=7004708864
ReadFileHandle.seek from 21037862912 to 7004708864 (fs.RangeSeeker)
ChunkedReader.RangeSeek from 21037862912 to 7004708864 length -1
ChunkedReader.Read at -1 length 12288 chunkOffset 7004708864 chunkSize 32768
ChunkedReader.openRange at 7004708864 length 32768
Read: read=12288, err=
Read: len=16384, offset=7004721152
ChunkedReader.Read at 7004721152 length 16384 chunkOffset 7004708864 chunkSize 32768
Read: read=16384, err=

rclone version is the latest, compiled from scratch (rclone v1.56.0-DEV).
Any advice to improve it further? /tmp is in RAM.

--fast-list does nothing on a mount.

as for RAM drive, not sure what would help, as the main delay is the slow network.

as a test,
i would get a free trial account at wasabi, an s3 clone known for hot storage.

thank you for the suggestion.
I did some profiling of the disk reads in the DB application, so I can rule out the app being slow: each read takes from 430 to 600ms, which I think is impressive, very fast, but unfortunately not fast enough for my use case (it needs to be at least 30% faster).
The Developer console says their APIs answer in 100ms. I don't know if that's the truth or not...
As for wasabi, I'm too lazy to reupload the files there :smiley: I will try to convince myself...

did you test and find that no buffer was faster than using the buffer?

Drive doesn't work for chia: the response times vary randomly, and you can hit the download limit easily, because when you download a small part of a file, google counts the whole file size as downloaded. With 100gb files you get limited fast

That's just not right.

With the one-week-old chiapos update, which introduces parallel reads for complete proofs, these are the complete proof lookup times.
As you can see, they are below 30 seconds, so a complete proof can be looked up within the timeframe, and a reward can be won.
My bet is that a custom implementation of GDrive APIs can further improve times. (difficult period for me so i can't work on this stuff at the moment)
A good question is: does a k=33 plot have 2x the lookups, or does it have the same number of lookups with 2x the size? If it's the latter, you can further improve your chances by making larger plots.
Another question is how will it go with pools. We will see..

Do you get that with the mount commands you have posted above? I have tried, but I get timeout for the challenges

Pool farming is working; one day I even got 100% successful points.
I'm not getting api limits, idk why you are.
I'm not getting api limits, idk why you are.
Good luck :slight_smile:

I see, I will try creating a pool plot. It wasn't an api limit, it was the official chia cli plot checker that wasn't working, but I will test it on an actual pool plot then. Many thanks

I was searching for an answer as well, as to whether the number of lookups changes with bigger plots.

Did you get any progress with this? I'm testing as well but getting times of 20-50 secs. I'm on dropbox.

Just test it. With debug options of rclone you can see how many reads are done and how big they are.
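For example, assuming you run the mount with `-vv` and a `--log-file` (the `rclone.log` name here is an assumption), you can count the chunked reads and the seeks straight from the log. The sample lines below mirror the log format shown earlier in the thread:

```shell
# Count reads and seeks in an rclone -vv debug log. rclone.log is an
# assumed --log-file name; the heredoc fakes two reads and one seek
# in the same format as the log snippet above.
cat > rclone.log <<'EOF'
ChunkedReader.Read at 21037846528 length 16384 chunkOffset 21037834240 chunkSize 32768
ReadFileHandle.seek from 21037862912 to 7004708864 (fs.RangeSeeker)
ChunkedReader.Read at 7004721152 length 16384 chunkOffset 7004708864 chunkSize 32768
EOF
READS=$(grep -c 'ChunkedReader.Read' rclone.log)
SEEKS=$(grep -c 'ReadFileHandle.seek' rclone.log)
echo "reads=$READS seeks=$SEEKS"
```

A high seek-to-read ratio means mostly random access, so a small `--vfs-read-chunk-size` should pay off; mostly sequential reads would favor larger chunks.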
