Extend glob-patterns for "not" matches without regex

It would be nice to be able to use rclone glob filtering to exclude everything that does NOT match a pattern. For example, given

A/
B/
C/

I would like to be able to say "NOT (/A/** OR /C/**)".
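
In other words, a single exclude rule along these lines (purely hypothetical syntax):

- NOT (/A/** OR /C/**)

Given the three directories above, that one rule would drop B/ outright while leaving A/ and C/ subject to whatever other filter rules follow.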

To help clarify, let me spell out what I think is currently missing (and maybe I am just wrong!)

Why not regex

Regex can absolutely do this, except that regex filters break the directory recursion optimization. At least in my use case, that makes it far too slow to use.

Docs:

If any regular expression filters are in use, then no directory recursion optimisation is possible, as rclone must check every path against the supplied regular expression(s).

Why not include A/ and C/

The reason not to just use filter rules like:

+ /A/**
+ /C/**
- **

is that rclone will then keep all of /A and all of /C regardless of any other filters. If I had no other filters this would work, but it is not general enough. It also requires knowing every directory in the remote apart from the ones you don't want.
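
For example (hypothetical extra rule), suppose I also wanted to drop a subdirectory of A. A combined rule set like

+ /A/**
+ /C/**
- /A/junk/**
- **

would not do it: rclone filter rules are first-match-wins, so everything under /A/junk/ matches + /A/** before the exclude rule is ever reached.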

Why not sync remote:A/ rather than remote:

The biggest reason is that I could have other anchored filters (e.g. - /A/sub/**) whose anchoring would then break.
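
Concretely, if filters.txt contains that rule and I run

rclone sync gdrive:A/ onedrive:A/ \
    --filter-from filters.txt

the rule is now anchored at gdrive:A/, so it refers to gdrive:A/A/sub/** rather than gdrive:A/sub/** and no longer excludes what it was meant to.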

The second reason is that you need to know the top-level directories and make a separate sync call for each one.

(This is actually my current approach, but it carries the risks above.)


Again, I may be mistaken about whether this is already possible, but the alternatives I laid out are the best I could come up with, and they are insufficient for the reasons above.

Thanks! I welcome all feedback!

Instead of words, maybe post a real-world example with real files, so there is no confusion.

rclone tree . 
/
├── A
│   └── A.txt
├── B
│   └── B.txt
└── C
    └── C.txt

If the above tree is not a good example, then post:

  • the simplest tree of the source.
  • the exact tree of the desired output.

If you put the other filters first then they will run first and take priority.
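
For example, with the hypothetical /A/junk/ rule from above, putting the exclude first keeps the rest of A/ and C/ but still drops /A/junk/:

- /A/junk/**
+ /A/**
+ /C/**
- **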

I'm not sure I understand what you are getting at though :frowning: Maybe an actual example would be helpful?

Let's say my normal sync job is as follows:

rclone sync gdrive: onedrive: \
    --filter-from filters.txt

and filters.txt has many anchored filters, some of them reaching into deep paths.
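
For the sake of the example, assume filters.txt looks something like this (contents entirely made up):

# keep my area, but not its cache
- /my/sub/dir/cache/**
+ /my/**
+ /photos/2023/**
- **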

Now suppose I just want to sync one (possibly deep) subdirectory. I can't just do:

rclone sync gdrive:my/sub/dir onedrive:my/sub/dir \
    --filter-from filters.txt

since filters.txt is now broken: its anchored filters are now rooted at gdrive:my/sub/dir rather than at gdrive:.

What I want is something like:

rclone sync gdrive: onedrive: \
    --filter "- <NOT> /my/sub/dir/**" \
    --filter-from filters.txt

so the first filter excludes everything except the deep subdirectory (and does so without losing the directory recursion optimization).

And again, I cannot do:

rclone sync gdrive: onedrive: \
    --filter "+ /my/sub/dir/**" \
    --filter-from filters.txt

because any rules in filters.txt that apply inside /my/sub/dir/ will never fire, since the leading positive filter matches those paths first.
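
With the made-up filters.txt above, for instance, the combined rule order (taking the flags in the order given) would be

+ /my/sub/dir/**
- /my/sub/dir/cache/**
+ /my/**
+ /photos/2023/**
- **

and the cache exclusion never fires: rclone stops at the first matching rule, and + /my/sub/dir/** matches everything under the subdirectory, cache included.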

I hope this makes more sense!

I see what you mean!

Correct

That is conceptually quite simple.

We currently have + and - rules. Perhaps we could add +! and -! rules which act on the inverse of the match.

So you'd have something like

rclone sync gdrive: onedrive: \
    --filter "-! /my/sub/dir/**" \
    --filter-from filters.txt

So -! would mean "exclude everything but" and +! would mean "include everything but".

I think it is always possible to re-arrange the rules so that -! and +! aren't needed, but in this specific case of combining rule sets they would be very useful.

Given that ! on its own isn't a useful regular expression, we could introduce this feature as {{!}} instead, to mean the match should be inverted. In that case your example would look like

rclone sync gdrive: onedrive: \
    --filter "- {{!}}/my/sub/dir/**" \
    --filter-from filters.txt

Which could conceivably be more useful elsewhere.

Is that the kind of thing you were thinking of?

Yes!

My only question is to verify that it wouldn't break the directory recursion optimization. Otherwise, I could just accomplish this with regex (I think).

I think either of the proposed syntaxes would work, though I prefer the former (-!).

Thanks for taking the time to understand! I am also glad I didn’t just miss it.