Dirclean: delete oldest files until dir size is less than threshold

This is somewhat off-topic, but I imagine some rclone users may find this useful for their cache directory. (The idea of a cache directory was originally described here with acd_cli + UnionFS. I personally use rclone mount + mergerfs. This approach works reasonably well to cache new files locally. rclone copy then pushes the data to the cloud every night.)
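
The nightly push is just a cron job; as a sketch (the schedule, paths, and remote name here are placeholders, not my exact entry):

  # Hypothetical crontab entry: copy new local files to the cloud at 03:00
  # ("/mnt/local" and "gcrypt:media" are placeholder names)
  0 3 * * * rclone copy /mnt/local gcrypt:media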

To maximize the usage of my local cache directory, I wrote a script called dirclean. It recursively deletes files in a directory when the directory exceeds a given size threshold, removing the oldest files first until the directory is back under the threshold.

download dirclean

usage: dirclean.py [-h] [--delete] dir bytes

Recursively delete files until dir size is less than threshold size. Files
with oldest modification time are deleted first.

positional arguments:
  dir         root directory
  bytes       threshold size in bytes

optional arguments:
  -h, --help  show this help message and exit
  --delete    perform delete (default: print files to be deleted)

IMPORTANT! Always test without the --delete parameter first to confirm the delete behavior is what you expect. Also, the threshold is given in bytes, so be careful not to pass a value that is too low.
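
Under the hood, the idea is simple. A simplified shell equivalent of what the script does (a sketch only, not the actual Python; assumes GNU du/find and placeholder values):

  DIR=/mnt/cache       # placeholder path
  LIMIT=50000000000    # placeholder threshold in bytes (50 GB)
  # While the directory is over the threshold, remove the oldest file
  while [ "$(du -sb "$DIR" | cut -f1)" -gt "$LIMIT" ]; do
      # oldest file by modification time (GNU find)
      oldest=$(find "$DIR" -type f -printf '%T@ %p\n' | sort -n | head -n 1 | cut -d' ' -f2-)
      [ -n "$oldest" ] || break   # nothing left to delete
      rm -- "$oldest"
  done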

Previously, I cleaned my cache directory with a find command that deleted files older than n days. However, that approach was not ideal: it did not make full use of the space available for the cache. Hopefully this is useful to other people. PRs welcome!
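
(For reference, the old approach was essentially the following, where n = 14 days is just an arbitrary example:

  # Delete anything not modified in the last n days
  find /mnt/cache -type f -mtime +14 -delete

The problem is that a time-based cutoff is disconnected from actual disk usage: it can delete files while plenty of space remains, or leave the cache over capacity during busy periods.)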

Nice one! I note that some people have been using rclone move instead of rclone copy for this purpose, clearing the cache once the file has been uploaded.
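
i.e. something along these lines, which uploads and then removes the local copy in one step (placeholder paths and remote name):

  # Hypothetical: move instead of copy, so uploaded files leave the cache
  rclone move /mnt/cache gcrypt:media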

Thanks! Yes, rclone move is definitely useful in certain situations. However, with a merged rclone + local filesystem, my goal is to keep a local copy for as long as possible to increase the chance of a local read over a remote read. Cached writes are also possible with this strategy, but I usually write directly to the local filesystem instead of going through the merge layer. It's less complicated.

Initially, I had been using acd_cli + UnionFS + EncFS. The mounts would always become unstable after a while. After switching to rclone crypt/mount + mergerfs, I’ve been really impressed by the stability. Kudos!

Nice one, thank you! :)
Can you give us more info on what your setup looks like?
E.g. what does your folder structure look like?
How do you mount your mergerfs?

OK, I think I have a similar setup now:
My local storage and my Google Drive (encrypted and mounted via rclone crypt) are behind a mergerfs mount with the first-found policy.
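
For the record, the mount command looks roughly like this (placeholder paths, not my exact options; category.create=ff selects mergerfs's "first found" policy for new files):

  # Merge local storage and the rclone crypt mount into one tree;
  # new files land on the first branch where creation succeeds
  mergerfs -o defaults,allow_other,category.create=ff /mnt/local:/mnt/gdrive /mnt/merged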

What doesn’t work is editing existing files (on gdrive). For example, if I try to edit a TXT file with nano and then save, I get something like “illegal seek”.
Does it work for you? Can you try it?

rclone can’t open files for reading and writing at the same time, and it can’t seek in write-only files. To fix that, rclone would have to buffer files locally, which it may do one day.

Good to know.

But there is one ugly aspect of this issue:
when the “illegal seek” occurs, all the content of the “modified” file is lost after reopening it.

Hi aus, could you please share with us your setup details? By details I mean:

  • your directory structure
  • commands and/or fstab entries that you use for mounting: rclone, crypt (if you use it) and mergerfs
  • crontab entries for maintenance (move/copy/sync)

If you’ve already created a stable infrastructure, please don’t leave us reinventing the wheel :)

Any chance, until that day, for rclone to not make any changes, e.g. not delete the file but just return an error? See: Rclone mount crypt (amazon drive) files destroyed on edit/save