EDIT : I can confirm the new beta is working a lot better, it's not downloading like crazy anymore Very well done. And I guess "- Download to multiple places at once in the stream" this will do wonder for p2p stuff too.
Sorry forgot to mention that I took that out! I came to the conclusion that the way the internal buffering works the read ahead is effectively --buffer-size which we have a flag for already.
Great!
Thanks! And yes it should do, though I should say that I didn't put a limit on how many downloaders you could have at once, yet! The worst case would be the size of the file divided by the --buffer-size in use...
What it will fix is the N people streaming from the one file problem which has been mentioned to me several times.
Do you think it would be possible to have stats about read cache hit ? Like the % of chunck reads that were in the cache, this sort of things ? I know I ask a lot : o
So demanding I hadn't thought of making stats, I think it is a great idea. I'll put it on the list. It would probably be an rc command which could give you some stats on the vfs, maybe vfs/stats. Or maybe make it part of the general rclone stats - then it could come out in the prometheus stats.
That is great news! I've been testing it on my home internet which isn't the quickest and it is as fast as everything else. I should do some local testing too I think.
The cache is really working well, that and the fact that the downloaded data are more in line with what is needed. my queries per 100 seconds went from around 150-200 to 40-60, for the same usage.
Hello. Can someone confirm that --dir-cache-time is working for them with this beta ? I've set it at 168h, and this morning a lot of the cache was cleaned, with the messages in the rclone console that it was too old...
dir-cache-time controls the metadata caching not the file caching, you want these for the file cache
--vfs-cache-max-age duration Max age of objects in the cache. (default 1h0m0s)
--vfs-cache-max-size SizeSuffix Max total size of objects in the cache. (default off)
2020/06/11 10:42:57 ERROR : IO error: open RW handle failed to open cache file: vfs cache item: check object failed: vfs cache item: open truncate failed: vfs item truncate: failed to open cache file: Le chemin d’accès spécifié est introuvable. (meaning tthe specified path does not exist).
Now the path is pretty long, but i don't remember having this kind of error before.
If you want I can give you access to 1 gbps servers for testing!! That shouldn't slow you down lol
I'm really interested in trying this new cache mode... I just have some questions:
Does it handle the same file opened simultaneously very well?
It seems that when a file is opened, it needs to be first buffered to disk before giving data to the application, isn't possible to do it simultaneously ? Like read first chunk from remote and serve it right away to speed up start times, then start caching the next chunks in async mode
Would any of my current settings conflicts with whatever defaults you are using for this cache?
Do you think storing the cache in RAM would speed up things? Since it looks like a random read pattern, where some HDDs could struggle? i.e can the average HDD max 1 gbps out if you are doing random reads to small chunks of files?
I just had a look through your log a bit more carefully
The path in question is 199 bytes. Added to C:\Users\XXXXXX XXXXXXXX\AppData\Local\rclone\vfsMeta\Google-G-Suite-Crypt" that could make it longer than 260 bytes. Windows docs on path length.
In the local backend we are careful to use paths which look like this
To specify an extended-length path, use the "\?" prefix. For example, "\?\D:\very long path".
Which do not have the 260 character limit. So it looks like I need to do that in the vfs cache also.
I'll factor out the code out of the local backend to do this and it should fix this problem.
Have a go with this - it uses UNC paths on Windows so shouldn't have this problem. It is otherwise unchanged.
It writes each chunk to the disk then reads it back from the disk, however it should be in the OS cache at that point so be effectively instant. This could potentially be optimized further...
Let me know if it is a big problem - I don't think it should be.
Ah, you need --buffer-size to be set for the cache to be effective. I need to warn the user about this. I recommend the default of 16M to start with
Frequent files should get buffered by your OS so provided your working set < your RAM you should be OK. If your working set > your RAM and you are using HDD then nothing will save you except for SSDs!