I’m testing out using Rclone (along with GDrive) as a replacement for CrashPlan. Rclone suits my needs, as I am more interested in syncing/cloning local directories in the cloud than in a full-fledged and complicated backup system (multiple clients, snapshots, etc.). I really like the ability to mount backups, the ease of use, and the multi-platform support (as I may be using it on Mac, Linux, and Windows). In testing, I discovered that, by default, Rclone re-uploads any file whose filename or directory location has changed. I’ve also read that --track-renames and --checksum don’t work with encrypted remotes. Are there any workarounds or other options for this? This would be a big issue for me, as I often work with large files that move between directories, and I don’t want to have to re-upload them constantly.
I’ve also looked into Restic with Rclone as a backend. Restic’s big (and only) selling point is that it doesn’t have this problem, but apparently it cannot mount on Windows for restores. I’m not sold on Restic’s use of compressed zip files rather than a simple directory structure, and Restic is also more complicated overall. I would prefer to use Rclone by itself if possible; it just seems like there are far more possibilities for something to go wrong with Restic. Perhaps there are other backup tools that interface with Rclone and would better suit my needs?
The fundamental problem with crypted files is that in order for --track-renames to work, we need to be able to positively identify two files as identical. The foolproof and obvious way is to compare hashes, but while we have the hash of the encrypted file, the cloud drive can't possibly know the hash of the real file inside: it doesn't have our private key, and encrypting a file changes its hash. Comparing your unencrypted file to a crypted one would be meaningless, because the hashes wouldn't match even if the contents were identical.
This means that --track-renames works fine unencrypted->unencrypted, or crypted->crypted (with the same crypt key), but with any other combination we lack comparable hashes on one or both sides.
So the best solution is to avoid re-encryption if possible. For example, if you set up a local crypt remote (with the same key) and pull your files through it before you upload, then --track-renames will work fine, since we then have the same resulting hashes to compare on both sides. This is likely the best and most direct workaround for you, and it should be fairly convenient too. Just remember to avoid uploading the already-encrypted files through your existing crypt remote, or you will get them double-crypted. You will probably want to have two remotes for uploading: one for your regular use, and another with no encryption, meant specifically to handle these pre-encrypted files.
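As a rough sketch of that setup (the remote names, paths, and obscured passwords below are placeholders, not anything from this thread):

```
# rclone.conf -- hypothetical names. The two crypt remotes must share
# the exact same password/password2 so they produce identical ciphertext
# and so the cloud crypt remote can later decrypt the uploaded files.
[gdrive]
type = drive

[gdrive-crypt]
type = crypt
remote = gdrive:backup
password = <obscured-password>
password2 = <obscured-salt>

[local-crypt]
type = crypt
remote = /data/encrypted
password = <obscured-password>
password2 = <obscured-salt>

# Encrypt locally by copying plaintext through the local crypt remote:
#   rclone copy /data/plain local-crypt:
# Then upload the resulting encrypted files to the *raw* remote path
# (not through gdrive-crypt, or they would be encrypted a second time).
# The encrypted files are now identical bytes on both sides, so hashes
# match and renames/moves can be detected:
#   rclone sync /data/encrypted gdrive:backup --track-renames
```

You can still browse everything decrypted by reading it back through gdrive-crypt:, since it shares the key used for the local encryption.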
Let me know if I need to elaborate on how to do this and I will do my best to assist.
And if avoiding the issue isn't possible... then it may still be doable, but it will require new code and some careful thinking. If you are interested in delving deeper, I made a thread on exactly this topic and had a long conversation with NCW about how we might go about developing a feature that would allow you to bypass this restriction.
Thanks for your detailed reply! I appreciate the info. Your suggestion would require an additional local rclone store, correct? This means double the local disk space used. I don't always have such space available for everything I intend on backing up to the cloud, so unfortunately, I'm not sure this would work for me.
Perhaps Rclone can't do what I need right now? I'm also looking at Restic and Duplicati; each has its issues. I like Borg, but it doesn't support cloud backups, and I'm thinking that using it with an Rclone mount is probably not a good idea (slow, unstable)? Does anyone have experience with this? I tried Arq, and I don't mind it, but it isn't available on Linux. I've also read about some restore/corruption issues that give me pause.
I don't see why it would require double storage.
You can just store it once in encrypted form.
Any unencrypted access you need outside of backing up can be done on the fly via the local crypt remote. There is basically no performance penalty to this, just a fairly trivial amount of CPU cycles. Nothing to worry about unless your server is a micro-PC like a Raspberry Pi or something...
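For example (remote name and paths are hypothetical), you could keep only the encrypted copy on disk and mount the local crypt remote whenever you need plaintext access:

```
# Only /data/encrypted exists on disk; this presents a decrypted view
# of it at /mnt/plain, assuming a crypt remote named "local-crypt"
# whose backing path is /data/encrypted:
rclone mount local-crypt: /mnt/plain --vfs-cache-mode writes

# Files read or written under /mnt/plain are decrypted/encrypted
# on the fly; no second plaintext copy is ever stored on disk.
```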
I'm also in discussions with NCW about making this whole thing work automatically so --checksum and --track-renames would work natively between local unencrypted files and encrypted remote files. That would remove the need to use the local crypt remote as it would basically do something similar internally for you without having to worry about it.
We basically already understand what needs to be done, so it's "just" a matter of getting it implemented.
If you use the workaround for now you can migrate to this more convenient solution later if you feel you want to.
Regarding other backup and sync software: you can use basically any other software on an rclone mount if you use --vfs-cache-mode writes. The only thing you need to be aware of is that block-level access is not possible. Almost all cloud services can only operate on whole files, so you can't transmit changes as partial files at the block level. Or you could, but those changes would have to be stored as their own files. This is only really an issue if you frequently make small updates to large files, though. It's not really a limitation of rclone, but rather of the basic functionality of the cloud services themselves.
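A minimal sketch of that kind of mount, assuming a crypt remote named gdrive-crypt: and a mount point of /mnt/backup (both hypothetical):

```
rclone mount gdrive-crypt: /mnt/backup --vfs-cache-mode writes

# --vfs-cache-mode writes buffers written files in a local cache, so
# backup tools that seek around inside a file while writing it still
# work. Completed files are then uploaded whole, since the cloud side
# can't accept block-level partial updates.
```

Your backup tool would then just be pointed at /mnt/backup like any local directory.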
I wasn't thinking about storing the original files in an Rclone encrypted store, so thanks for clarifying. Something to consider. I'm glad to hear there should be native support for this coming. Hopefully the wait won't be too long!
I tried using Borg with an rclone mount with both "--vfs-cache-mode minimal" and "--vfs-cache-mode full" and kept getting errors and timeouts. Perhaps I am invoking this incorrectly, or there is something going on with Borg. I'll keep playing around with it.
NCW is likely the one who implements this so I can't promise anything on his behalf, but the current milestone for it is set for 1.49 stable (next stable version).
I haven't used Borg, but from the description, the chunking approach doesn't seem bad for this use. It will effectively give you fake block-level access, so you can do partial file updates anyway. I know other software can do similar things by saving only the changed bits as separate files and then reconstructing the latest version by layering them when read back. As long as the software operates at the file level and not the sub-file level, rclone should be fully compatible with it.
I'd recommend using cache mode writes. That will be fully compatible with all expected file operations.