Cache questions - keeping a local copy of data after it is moved

Hi all…

I have a quick question… I am working on creating something for a customer and just need a shove in the right direction.

I am trying to create a pseudo file repository for them using cloud storage. I have the bare bones working fine thanks to Rclone!! brilliant stuff…

basically I am looking to cache writes to the cloud <-- this is easy using the cache feature with the tmp upload… BUT it doesn’t achieve the endgame… as soon as the data is uploaded it is removed from the tmp upload area, but not chunked into the cache. I need new files to get uploaded ASAP (cache can do this) but then still be available in the local cache under whatever rules are set (size, time etc…). Is this possible??
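for reference, by “the cache feature with the tmp upload” I mean the cache backend wrapping the cloud remote, roughly like this (the remote names and paths here are just examples):

[gcache]
type = cache
remote = gdrive:Masters

rclone mount gcache: /data/masters \
  --cache-tmp-upload-path=/mnt/ssd/data/tmp_upload \
  --cache-tmp-wait-time=15s

writes land in the tmp upload path and get moved up to the cloud remote once they have sat unchanged for the wait time.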

I am playing with the VFS cache… but the only option that seems like it might work (just starting testing now) would be vfs-cache-mode=full (I have tested writes already and performance is horrid for this use case)…

any input would be welcome

Related GH Issue: https://github.com/ncw/rclone/issues/2700

definitely related… but that doesn’t seem to be going anywhere…

from my description I am basically trying to spin up a cloud gateway… so I need the current data (stuff written just now) to stay local and then age out of the cache per the normal rules, the same as if I had pulled the file down from the cloud.

in its current form a potentially 2-year-old file has faster access than a file written 10 minutes ago…

If I understand you correctly you could probably use:
https://rclone.org/rc/#cache-fetch-fetch-file-chunks
You probably already have some scripts running to manage your “cloud gateway”, so just tell it to cache the file as soon as the upload is finished via the rc command.
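For example, something like this (assuming the mount was started with --rc so the remote control API is listening, and with a made-up file path):

rclone rc cache/fetch chunks=0: file=path/to/just-uploaded-file

chunks takes Python-style slices per the docs, so 0: should pull every chunk of the file into the cache.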

Have a nice day!

This is probably the best solution, but you wouldn’t be able to tell when a file has finished uploading short of polling the tmp-upload-path every few minutes.
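If you did want to script that, a crude polling loop might look something like this (paths are placeholders, it assumes the mount is running with --rc, and it assumes the tmp upload area mirrors the remote layout):

#!/bin/sh
# watch the tmp upload area: when a file disappears it has finished
# uploading, so ask the cache backend to fetch it back onto disk
TMP=/mnt/ssd/data/tmp_upload   # whatever --cache-tmp-upload-path is set to
STATE=/tmp/tmp_upload.last
touch "$STATE"
while true; do
  find "$TMP" -type f | sort > /tmp/tmp_upload.now
  # files present on the last pass but gone now have finished uploading
  comm -23 "$STATE" /tmp/tmp_upload.now | while read -r f; do
    rel="${f#$TMP/}"   # path relative to the remote root
    rclone rc cache/fetch chunks=0: "file=$rel"
  done
  mv /tmp/tmp_upload.now "$STATE"
  sleep 60
done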

If you use --vfs-cache-mode writes then it will cache the uploaded file until --vfs-cache-max-age passes. I would have thought this would be what you want…
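e.g. a minimal version might look like this (remote name and paths are placeholders, 8760h being one year):

rclone mount remote: /mnt/point \
  --cache-dir=/mnt/ssd/cache \
  --vfs-cache-mode=writes \
  --vfs-cache-max-age=8760h

New files are uploaded as soon as they are closed, but the local copy stays in the cache directory until the max age expires.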

--vfs-cache-mode full will download any files to the local disk first.

Why are you finding the performance of --vfs-cache-mode writes bad?

I’ll test this… I didn’t have the --vfs-cache-max-age flag… but when I tested without it, as soon as the file was uploaded it wasn’t anywhere to be seen and had to be downloaded again…

and the performance issue comes from the writes… the files don’t show up until committed for some reason… I am still testing, so I will add the flag, test properly and report back

Can you share the full command you used?

@Animosity022

rclone mount \
  --config=/data/local/rclone/rclone.conf \
  --drive-chunk-size=64M \
  --allow-other \
  --cache-dir=/mnt/ssd/data/cache \
  --dir-cache-time=48h \
  --vfs-cache-max-age=8760h \
  --vfs-read-chunk-size=32M \
  --vfs-read-chunk-size-limit=512M \
  --vfs-cache-mode=full \
  --low-level-retries=20 \
  --buffer-size=0 \
  --log-level=DEBUG \
  --log-file=/data/logs/masters.log \
  --tpslimit=10 \
  gdrive:Masters /data/masters &

this is the current script I’m testing…

I think that looks pretty good in terms of what you are going for, but with full cache mode it’s going to download the whole file for every read, so there would be an appearance of latency if you worked with larger files, as it has to download each one all the way.

My testing currently shows that the data seems to stay in the cache after adding the --vfs-cache-max-age flag.

the performance issue I am talking about is more around write speed. The rig I am testing on is an ADSL service… only 1MB up… so when a file is written, it gets written to the cache quickly, but won’t show up until it is uploaded to the back end… this is the part I need to fix!!

cool… I’ll go back to “writes” and see what happens…
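(i.e. the same command as above, just with --vfs-cache-mode=writes instead of --vfs-cache-mode=full)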

same story, so that’s a good and a bad thing… it means that I can read files “faster” because they will stream down etc…

but when I write a largish file (in this instance, because of the slow uplink, even a 50MB file is large) the client stops responding until the file is written to the back end… I would love for it to commit to cache and then complete the write to the back end when it can… especially if I have multiple endpoints writing data simultaneously.

I am trying to avoid a union mount and moving the data via scripts every XX min
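(for clarity, that would be something like a local staging area swept by a cron job… paths here are made up:

*/15 * * * * rclone move /data/staging gdrive:Masters --min-age 5m

new files land on local disk instantly and anything that has stopped changing gets moved up to the cloud every 15 minutes)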

hi all… I guess I am trying to create a Google File Stream style service to run on Linux… current data is cached, new data gets cached straight away… (don’t worry about offline… that’s not a requirement)

You are in an odd spot, as the cache backend would kind of work, but it needs VFS writes on if you are actually using it as a mount.

Best bet is to wait for a few of the fixes to go in.

https://github.com/ncw/rclone/issues/2327

This is the one I think would help your situation.

thanks guys… sorry, I am still exploring this and looking at different options… if I get a working solution I will share it… but I will also track and watch!!