Syncing directories to Azure Blob Archive Tier

GoingOffRoading · December 31, 2018, 6:33pm

Hey RClone forum… Perhaps you can help me answer a question I can’t seem to find documentation on.

I have a directory on my home NAS (FreeNAS server) of raw camera footage that I currently rclone (sync) to an Azure Blob (Cool Tier). The footage in the NAS directory is:

Constantly being added
Never modified
Rarely deleted (deletions are singular files like unneeded b-roll)
RCloned (sync) to Azure Blob as a backup of my NAS raw footage directory.
Currently sitting at 5TB and growing

Given that I’m using Azure Blob as a backup, I’d love to move my files from Blob Cool Tier to Blob Archive Tier as I personally have no need to read the data (outside of catastrophic failure of the NAS).

What is unclear to me is how rclone and Azure behave when syncing a NAS directory to Azure Blob when the files already uploaded are in Archive Tier. Azure’s documentation makes it very clear that Archive Tier data is unreadable but Archive Tier metadata is readable. However, I can’t find documentation on exactly what metadata is available, or whether it is the metadata rclone needs to do a sync. This leaves me unsure how rclone and Azure will behave when I sync my NAS directories to Azure (my greatest fear being waking up to a massive bill from Azure for hot reads of Archive Tier data).

Can anybody provide guidance here?

ncw · January 3, 2019, 11:23am

I believe that as far as rclone is concerned Archive blobs will appear like any other blobs. So the metadata rclone needs (checksum, last modified time and size) will be present.
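
To illustrate why listing metadata is enough, here is a minimal Python sketch of a metadata-only sync decision. It is not rclone's actual code: ObjectInfo, needs_transfer, and the one-second tolerance are hypothetical names and values chosen for illustration. The only point is that the decision uses size, modified time, and checksum, never the blob contents.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class ObjectInfo:
    """Listing metadata only; no file or blob content is read to build this."""
    size: int
    mod_time: datetime
    md5: Optional[str] = None  # hex digest, if the backend reports one


def needs_transfer(local: ObjectInfo,
                   remote: Optional[ObjectInfo],
                   use_checksum: bool = False,
                   tolerance: timedelta = timedelta(seconds=1)) -> bool:
    """Decide whether a file should be uploaded, using metadata alone.

    Simplified assumption: compare size first, then either the checksum
    (when requested and both sides have one) or the modified time.
    """
    if remote is None:
        return True  # nothing at the destination yet
    if local.size != remote.size:
        return True
    if use_checksum and local.md5 is not None and remote.md5 is not None:
        return local.md5 != remote.md5
    return abs(local.mod_time - remote.mod_time) > tolerance
```

In this sketch a file that was never modified locally compares equal on every field, so nothing is transferred and nothing in the archived blob has to be read.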

I suggest you perform a small experiment in a new bucket if you are worried about the costs.

Maybe we should update the docs a bit more?

GoingOffRoading · January 3, 2019, 7:30pm

I think where I am concerned (and I obviously need to do my own small-scale testing) is that I EXPECT rclone to use Azure’s checksum when validating the files already in Azure Blob (which the rclone documentation appears to support).

For some reason, I see a large spike in ‘Hot Block Writes’ every time rclone runs, even if I’m uploading a minimal amount of data. That makes me think rclone is doing file reads in Azure instead of using the existing file hashes to validate the directory.

Azure usage: http://prntscr.com/m2p8go

ncw · January 4, 2019, 5:01pm

Yes, rclone uses Azure's checksum for validation. You can see the checksums with rclone md5sum bucket:path.
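
If you want to cross-check one of those checksums against a local file, something like the following Python sketch should work. It assumes rclone md5sum prints the hash as hex, while Azure reports a blob's Content-MD5 property base64-encoded, so both forms are returned. The helper name local_md5 is made up for this example.

```python
import base64
import hashlib


def local_md5(path, chunk_size=1024 * 1024):
    """Return (hex, base64) MD5 digests of a local file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large video files are not loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    hex_form = digest.hexdigest()
    b64_form = base64.b64encode(digest.digest()).decode("ascii")
    return hex_form, b64_form
```

The hex form is the one to compare with the md5sum output; the base64 form is the one to compare with the Content-MD5 value Azure shows for the blob.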

I don't think reading the checksums is it. It might be something else though!

Can you post a log with -vv? That might shed some light.
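
If the sync runs from a script, a small wrapper along these lines could capture that log. This is only a sketch: the function names are hypothetical, and the source path, remote name, and log file name in the comment are placeholders, not values from this thread. The only rclone options used are -vv and --log-file.

```python
import subprocess


def build_debug_sync_command(source, dest, log_file):
    """Build an rclone sync command line with debug logging sent to a file."""
    return ["rclone", "sync", source, dest, "-vv", "--log-file", log_file]


def run_debug_sync(source, dest, log_file):
    """Run the sync and return rclone's exit code.

    Requires rclone on PATH and an already configured remote.
    Hypothetical usage:
        run_debug_sync("/path/to/footage", "remote:container", "rclone-debug.log")
    """
    command = build_debug_sync_command(source, dest, log_file)
    return subprocess.run(command, check=False).returncode
```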