--backup-dir pruning code

kai · January 16, 2024, 10:07am

Sharing some bash code for pruning date-stamped --backup-dir directories. You'll have to integrate the snippet below with your regular backup code. It should run immediately after every cron/periodic backup. Once it has stabilised it should just work in a first-in, first-out arrangement: it slowly converges to the target number of backups you want to keep and stays there.

While I think it's pretty resilient against over-purging, you'll need to decide whether you want to carry that risk yourself.
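For the pruning to find anything, the --backup-dir names written by the backup step have to match the pattern the script checks for. Here is a minimal sketch of building such a name, assuming the format implied by the regex as written further down (prefix, two-digit year, month, day, a literal T, then six digits of time). The `source_dir` and `remote1_location` values are placeholders, not anything from my setup, and the rclone command is only printed, not run.

```bash
#!/usr/bin/env bash
# Sketch only: build a --backup-dir name that the pruning regex should recognise.
auto_prefix='AUTO-SNAPPY-'              # same prefix as in the pruning script
stamp=$(date +%y-%m-%dT%H%M%S)          # format e.g. 24-01-16T100700 (two-digit year)
backup_name="${auto_prefix}${stamp}"

# Hypothetical locations; substitute your own source and remote.
source_dir='/path/to/source'
remote1_location='remote:backups/'      # trailing slash, since names are appended directly

# Printed rather than executed so the sketch is safe to run as-is.
echo rclone sync "$source_dir" "${remote1_location}current" \
    --backup-dir "${remote1_location}${backup_name}"
```

Because the fields go from most to least significant and are zero-padded, a plain text sort orders these names by date, which is what the pruning step relies on.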

# some header stuff you'll want at the top

# auto_prefix is used to identify cron generated backups vs. manual backups for pruning.  Manual backups are never pruned
# comment out below line for the manual version of the script
auto_prefix='AUTO-SNAPPY-'
# max number of backups to keep (minimum 1).  We will keep backups 1,2,3,...,max_keep.  Older backups are aged out.
# make sure max_keep is something reasonable.  It is possible to zero out the backups over time if max_keep is 0 or negative!!
max_keep=5

<do your backup thing with rclone sync --backup-dir, etc >

echo 'Script: aging out old backups'

# credit to various stackoverflow contributors for the general scheme
#
# approach here is to progressively filter out what shouldn't be purged, and end up with a target list from which a small number may be culled via purge
# this removes most of the dependency on time attributes and on exactly when the backups were taken, and substantially
# reduces the risk of over-purging files.

# setting things up
check_regex='^'$auto_prefix'\d{2}\-(0[1-9]|1[012])\-(0[1-9]|[12][0-9]|3[01])T\d{6}\/$'
echo 'Script: checking against regex     '$check_regex
tmpfile=$(mktemp)

# use rclone to pull top level directory, without recursion.  This should give us current + backup dirs + whatever else is in there
# there are some redundant flags c.f. regex to make sure output is good, because we're going to make purge decisions based on it
# if expanding to multiple remotes: MUST use one copy of the code for each remote.  Do not use the purge calcs from one remote on another!

sudo -u kai rclone lsf --dirs-only --dir-slash --exclude "current/" \
    --config='/home/'$owner'/.config/rclone/rclone.conf' \
    $remote1_location | \

# now pipe it to grep with a regex to select:
# 1. entries that have the $auto_prefix prefix.  current dir will not meet this criterion and is protected.
# 2. entries that conform to the dateTtime format.  current dir will not meet this criterion and is protected.
# 3. entries that end with a slash (i.e. are directories).
# What is selected *should* be a complete list of all non-current backups and nothing but that.  Manually generated backups, other directories and
# any files at the top dir level should not be selected by the regex.

grep -P "$check_regex" | \

# now pipe it to a reverse sort which orders the non-current backups in decreasing order by filename (which is the backup date).  i.e. most recent first
# this depends on the dateTtime format being sortable!

sort -r > $tmpfile

# determine the purge count using total # - max_keep.
# wc -l does a newline (\n) count of its input.  the input tmpfile should be clean as long as upstream is clean back to rclone lsf
# wc output is always 0 or greater; purge_count can come out zero or negative, which just means nothing gets purged
file_count=$(wc -l < $tmpfile)
purge_count=$(( $file_count - $max_keep ))

echo 'Script: file_count is '$file_count', max_keep is '$max_keep', purge_count is '$purge_count

# there are some protective measures here to prevent catastrophic results due to an abnormally high and positive purge_count
# purge_count's inputs may have unintended results, e.g. if max_keep was 0/negative/undefined for some reason
# because of above & other factors purge commands are not looped and only max 2 purges will be executed in a single invocation.
#
# purge_count = 1 means we're in steady state, so take last (oldest) entry and delete.
# purge_count > 1 means max_keep has been adjusted downwards in a live system.  2 purges are used to gradually bring us back to steady state over time.
# purge_count < 1 means we're not yet in steady state.  No purging needed.

# the first purge is if purge_count >= 1; needed in both cases of =1 and >1
if (( $purge_count >= 1 )); then
     echo 'Script: purging oldest backup'
     sudo -u kai rclone --fast-list --config='/home/'$owner'/.config/rclone/rclone.conf' purge $remote1_location$(tail -n 1 $tmpfile)
fi
# the second purge is only if purge_count > 1
if (( $purge_count > 1 )); then
# todo: add in some kind of warning/report that this is happening beyond the echo.  User should pay attention/be aware of this happening
     echo 'Script: purging second oldest backup'
     sudo -u kai rclone --fast-list --config='/home/'$owner'/.config/rclone/rclone.conf' purge $remote1_location$(tail -n 2 $tmpfile | head -n 1)
fi

rm $tmpfile
echo 'Script: done'
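
Before pointing this at a real remote, it may be worth rehearsing the selection logic against a fake listing. The sketch below reproduces the filter, reverse sort and purge-count steps without calling rclone at all. `pick_purge_targets` and `sample_listing` are hypothetical helpers made up for this illustration. It uses `grep -E` with `[0-9]` instead of `grep -P` with `\d`, assumes bash 4+ for `mapfile`, and assumes the prefix contains no regex metacharacters. It also caps the purge count at the number of matching entries, which the script above does not do.

```bash
#!/usr/bin/env bash
# Sketch: rehearse the filter / sort / purge-count steps on a fake listing.

# pick_purge_targets PREFIX MAX_KEEP   (hypothetical helper, listing on stdin)
# Prints the directories that would be purged, oldest first, never more than two.
pick_purge_targets() {
    local prefix=$1 max_keep=$2
    local regex="^${prefix}[0-9]{2}-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])T[0-9]{6}/\$"
    local -a candidates
    # keep only well-formed auto backups, newest first (same idea as sort -r above)
    mapfile -t candidates < <(grep -E -- "$regex" | sort -r)
    local total=${#candidates[@]}
    local purge_count=$(( total - max_keep ))
    # at most two purges per run, as in the script above
    if (( purge_count > 2 )); then purge_count=2; fi
    # extra guard (not in the original): never ask for more than actually matched
    if (( purge_count > total )); then purge_count=$total; fi
    local i
    for (( i = 1; i <= purge_count; i++ )); do
        printf '%s\n' "${candidates[total - i]}"
    done
    return 0
}

# Fake output of `rclone lsf --dirs-only --dir-slash`, plus entries that must be ignored.
sample_listing() {
    printf '%s\n' \
        'AUTO-SNAPPY-24-01-10T020000/' \
        'AUTO-SNAPPY-24-01-11T020000/' \
        'AUTO-SNAPPY-24-01-12T020000/' \
        'AUTO-SNAPPY-24-01-13T020000/' \
        'manual-before-upgrade/' \
        'AUTO-SNAPPY-24-13-01T020000/' \
        'notes.txt'
}

would_purge=$(sample_listing | pick_purge_targets 'AUTO-SNAPPY-' 3)
echo "would purge: $would_purge"   # prints: would purge: AUTO-SNAPPY-24-01-10T020000/
```

Four entries pass the filter (the manual directory, the month-13 name and the file do not), so with max_keep=3 only the oldest one is listed. Dropping max_keep to 1 lists the two oldest, and raising it to 5 lists nothing.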