I would like to use rclone running on a Raspberry Pi as a backup solution. I want to use the serve command to serve a cloud storage to my network. Since my upload speed is pretty slow, the always-running Pi shall do the uploads to the cloud in the background while my devices can back up fast to the volume served by rclone.
What I want to know is how rclone behaves if my internet connection drops or the power goes out while files are not (yet) uploaded to the cloud.
From the perspective of the user (using the volume served by rclone) the backup would be complete (because it's already transmitted to the served volume) but how can I be sure that it eventually will be in the cloud even if the internet drops or the power goes out while the upload from the pi to the cloud is still running? Or in other words: Is the data persisted on the pi (even in a power-outage) until the upload is complete?
And related to that I have one more question: I will use restic on the clients to back up to the volume served by rclone. Restic has a feature to check the integrity of a backup. But if I use that feature in the configuration described above, how can I be sure that it's not using a state cached on the Pi (which may be intact but is not necessarily the same state as in the cloud)?
Which --vfs-cache-mode are you using? If you aren't using any flags then rclone won't cache data. If you are using --vfs-cache-mode writes or higher, then the latest beta will retry and re-upload files in the event of a power outage or an internet connection drop.
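As a rough sketch of what serving with write caching could look like: the protocol (SFTP, which restic can talk to), the remote name cloud:restic-repo, the cache directory, the port and the credentials are all placeholders assumed for illustration, not values from this thread. The script only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Placeholders, not values from this thread: adjust before use.
REMOTE="cloud:restic-repo"         # hypothetical rclone remote and path
CACHE_DIR="/mnt/ssd/rclone-cache"  # hypothetical cache location on the SSD

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Serve the remote over SFTP with write caching enabled, so files written
# by clients are stored under CACHE_DIR before rclone uploads them.
serve() {
    run rclone serve sftp "$REMOTE" \
        --addr :2022 \
        --user backup --pass changeme \
        --vfs-cache-mode writes \
        --cache-dir "$CACHE_DIR"
}

serve
```

Pointing --cache-dir at the SSD rather than the SD card is only a suggestion here; the idea is that files waiting for upload sit on the more robust disk.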
What storage are you using on the Pi? I find that SD cards are quite liable to corrupt in unclean power offs.
You could do a restic check direct to the archive stored on the cloud?
In fact I'd recommend you configure restic to use rclone and backup direct to the cloud - cut out the middle man of the storage server if you want ultimate reliability.
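A minimal sketch of checking the repository directly in the cloud through restic's rclone backend, bypassing the Pi's cache entirely. The repository location rclone:cloud:restic-repo is a hypothetical name; "cloud" stands for whatever rclone remote is configured on the machine running the check. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical repository: "cloud" is an rclone remote name, "restic-repo" a path on it.
REPO="rclone:cloud:restic-repo"

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

check_repo() {
    # Verify the repository structure as stored on the remote.
    run restic -r "$REPO" check
    # Additionally download and verify all pack data (slow on a slow link).
    run restic -r "$REPO" check --read-data
}

check_repo
```

Since the Pi is always on, it could run such a check itself, so the clients would not need to stay powered up for it.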
Currently I don't use anything, I'm still waiting for the hardware to arrive and will set everything up in the next few days. What do you mean by "won't cache data"? Does this mean that the upload would be piped directly to the cloud, so the upload to the Pi would only go as fast as my internet connection from the Pi to the cloud can handle? And can you estimate when the latest beta will be out of beta? Because if the current release doesn't support this I think I'll wait until the new release is out.
At least for the data I'll use an SSD since micro SD cards die pretty fast even without much write load.
Yes you're right. But if possible I would like to avoid that since I have the pi in between anyway (reasons see below). I guess my main problem is that I don't fully understand how caching exactly behaves in rclone.
There is the vfs caching (which is missing from the global flags page, is this intentional?) and there is the "normal" caching. What's the difference between them?
Also, how exactly does --vfs-cache-max-age work for uploads? Say I have --vfs-cache-mode=writes and --vfs-cache-max-age=1h set. Now I upload a large file to the volume served by the Pi, and this file takes longer than 1h to upload from the Pi to the cloud. What happens then? If rclone treats --vfs-cache-max-age as a hard limit, wouldn't this mean that rclone would delete the file from its cache before the upload to the cloud has finished?
Regarding the restic checks: If I have a short caching time (let's say 1h) and I do a restic check of my backup after that time has passed (so the read cache should be empty), do you see any possibility that the check is successful even though the data was not successfully stored in the cloud? In my understanding, at that point the Pi should have forgotten the data and should have to pull it from the cloud if restic requests it for the check, right?
Unfortunately I can't do this since some backups may need multiple days to upload to the cloud and I don't want to have the computer running 24/7. Also I'd like to keep the client side as simple as possible since there are multiple clients, some of them operated by people with a pretty bad relationship with computers.
Thank you in advance
P.S. rclone is one of those few pieces of software where I really have to smile when looking through the documentation, because in many cases it makes things possible that you couldn't even dream of before. So thank you for your great work.
The former is the "vfs" cache, which works at a higher level than the "cache" backend. The other --cache-* flags belong to the "cache" backend (except for --cache-dir, which affects everything).
If you do rclone help flags cache you'll see the two sections!
Usage:
rclone help flags [<regexp to match>] [flags]
Flags:
-h, --help help for flags
Global Flags:
--cache-dir string Directory rclone will use for caching. (default "/home/ncw/.cache/rclone")
Backend Flags:
--cache-chunk-clean-interval Duration How often should the cache perform cleanups of the chunk storage. (default 1m0s)
--cache-chunk-no-memory Disable the in-memory cache for storing chunks during streaming.
--cache-chunk-path string Directory to cache chunk files. (default "/home/ncw/.cache/rclone/cache-backend")
--cache-chunk-size SizeSuffix The size of a chunk (partial file data). (default 5M)
--cache-chunk-total-size SizeSuffix The total size that the chunks can take up on the local disk. (default 10G)
--cache-db-path string Directory to store file structure metadata DB. (default "/home/ncw/.cache/rclone/cache-backend")
--cache-db-purge Clear all the cached data for this remote on start.
--cache-db-wait-time Duration How long to wait for the DB to be available - 0 is unlimited (default 1s)
--cache-info-age Duration How long to cache file structure information (directory listings, file size, times etc). (default 6h0m0s)
--cache-plex-insecure string Skip all certificate verification when connecting to the Plex server
--cache-plex-password string The password of the Plex user (obscured)
--cache-plex-url string The URL of the Plex server
--cache-plex-username string The username of the Plex user
--cache-read-retries int How many times to retry a read from a cache storage. (default 10)
--cache-remote string Remote to cache.
--cache-rps int Limits the number of requests per second to the source FS (-1 to disable) (default -1)
--cache-tmp-upload-path string Directory to keep temporary files until they are uploaded.
--cache-tmp-wait-time Duration How long should files be stored in local cache before being uploaded (default 15s)
--cache-workers int How many workers should run in parallel to download chunks. (default 4)
--cache-writes Cache file data on writes through the FS
--union-cache-time int Cache time of usage and free space (in seconds). This option is only useful when a path preserving policy is used. (default 120)