Implied include ** filter

What is the problem you are having with rclone?

No problem, just trying to understand implied --include ** filter with a view to clarifying documentation.

What is your rclone version (output from rclone version)

1.53.2

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Ubuntu 20.04 ARM64

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run

See below

The rclone config contents with secrets removed.

[edrclone]
type = drive
client_id = xxxxx.apps.googleusercontent.com
client_secret = xxxxx
scope = drive
token = {"access_token":"xxxxxx"}
root_folder_id = xxxxxx

A log from the command with the -vv flag

n/a


I have been looking at clarifying some points in the filter documentation page and got drawn into the implied + ** at the end of the rules in a file for --filter-from. That behaves as I expected. Any object / path that has not matched a rule is included. That works whether or not there are any + rules.

If there are no --include... options on the command line an --include '**' seems to be implied.

For instance with a test set in edrclone:test:

├── 1.docx
├── 2.docx
└── 3.docx

Listing without filters seems to use an implied --include '**':

ubuntu@ubuntu:~$ rclone ls edrclone:test
       -1 3.docx
       -1 2.docx
       -1 1.docx

With an --exclude... the implied include is still evident:

ubuntu@ubuntu:~$ rclone ls edrclone:test --exclude 3.docx
       -1 2.docx
       -1 1.docx

As soon as I add one (or more) --include... the implied --include '**' seems not to be applied:

ubuntu@ubuntu:~$ rclone ls edrclone:test --exclude 3.docx --include 1.docx
2020/11/04 12:06:00 ERROR : Using --filter is recommended instead of both --include and --exclude as the order they are parsed in is indeterminate
       -1 1.docx

In spite of that ominous warning, the --include... and --exclude... combinations I tried all seemed to follow predictably the rules for precedence of filter options set out in the existing documentation. The implied include is also not applied when there is no include / exclude clash, i.e. if there is just a single --include... option:

ubuntu@ubuntu:~$ rclone ls edrclone:test --include 1.docx
       -1 1.docx

I have no opinion about what the behaviour should be - merely trying to get it unambiguous in the documentation. Is it true to say that there is an implied final +** with --filter... and implied --include '**' on the command line but whenever there are one or more --include... on the command line those two implied includes do not apply?

(Adding an --include... to the command line also seems to negate the implied +** in a --filter-from but I have not provided the example here.)

(Behaviour seems to be consistent essentially between rclone 1.53.2 and 1.51.0)

From my understanding, you should not mix include and exclude.

If you want to see what the filters are doing, just dump the filters.

felix@gemini:~$ rclone lsf GD: --dump filters
--- start filters ---
--- File filter rules ---
--- Directory filter rules ---
--- end filters ---
blah
crypt/
hosts
test/

Yes that seems to be the general message, though behaviour turned out to be predictable in this simple case - not that I seek to change that advice.

Thanks, I had not thought about using --dump filters

ubuntu@ubuntu:~$ rclone ls edrclone:test --include 2.docx --filter-from z.txt --dump filters
--- start filters ---
--- File filter rules ---
+ (^|/)2\.docx$
+ (^|/)1[^/]*$
- (^|/)2[^/]*$
- ^.*$
--- Directory filter rules ---
+ ^.*$
- ^.*$
--- end filters ---
       -1 2.docx
       -1 1.docx
ubuntu@ubuntu:~$ rclone ls edrclone:test --filter-from z.txt --dump filters
--- start filters ---
--- File filter rules ---
+ (^|/)1[^/]*$
- (^|/)2[^/]*$
--- Directory filter rules ---
+ ^.*$
--- end filters ---
       -1 3.docx
       -1 1.docx

z.txt contains

+ 1*
- 2*

Ah, regex!

Why does adding --include 2.docx to the commandline add - ^.*$ to the file and directory rules?

In any event it is consistent with the behaviour I observed above.
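
As a rough check on the two listings above, here is a minimal Python sketch that replays the dumped file rules against the three test files. It assumes first-match-wins evaluation and that a path matching no rule is included; first_match is a made-up helper name, not anything from rclone.

```python
import re

def first_match(rules, path):
    """Return True if path is included: the first matching rule decides,
    and a path that matches no rule is included (assumed behaviour)."""
    for include, pattern in rules:
        if re.search(pattern, path):
            return include
    return True  # fell off the end of the list

# File filter rules copied from the two --dump filters listings above.
with_include = [
    (True,  r"(^|/)2\.docx$"),
    (True,  r"(^|/)1[^/]*$"),
    (False, r"(^|/)2[^/]*$"),
    (False, r"^.*$"),
]
filter_from_only = [
    (True,  r"(^|/)1[^/]*$"),
    (False, r"(^|/)2[^/]*$"),
]

files = ["1.docx", "2.docx", "3.docx"]
listed_with_include = [f for f in files if first_match(with_include, f)]
listed_filter_from_only = [f for f in files if first_match(filter_from_only, f)]
print(listed_with_include)      # ['1.docx', '2.docx']
print(listed_filter_from_only)  # ['1.docx', '3.docx']
```

The trailing - ^.*$ is what stops 3.docx from falling off the end of the list in the first case.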

I am making heavy weather of this. It may be that the concern about mixing in an --include... filter option applies not only to --exclude... but to any other filter option (i.e. --filter...).

Here is a bit of draft of my trying to make sense of the 'how filter rules are applied' section:

How filter rules are applied

Important: Avoid using --include or --include-from with any
other filter options. The results may not be what you expect. Instead
use a --filter... option.

Rclone filters are made up of one or more of the following options:

  • --include
  • --include-from
  • --exclude
  • --exclude-from
  • --filter
  • --filter-from
  • --filter-from-raw

There can be more than one instance of individual options.

Rclone internally uses a combined list of all the include and exclude
rules. The order in which rules are processed can influence the result
of the filter.

All options of the same type are processed together in the order
above, regardless of what order the different types of options are
included on the command line.

All --include options are processed first in the order they
appeared on the command line, then all --include-from options, etc.
Multiple instances of the same option type are processed from left
to right according to their position in the command line.

To mix up the order of processing includes and excludes use --filter...
options.

Within ...-from options, rules are processed from top to bottom
of the referenced file.

If there is an --include or --include-from option specified, rclone
implies a -** rule which it adds to the bottom of the internal rule
list. Specifying a + rule with a --filter... option does not imply
that rule.

Each object / path name passed through rclone is matched against the
combined filter list. At first match to a rule the object / path name
is included or excluded and no further filter rules are processed for
that object / path.

If rclone does not find a match, after testing against all rules
(including the implied rule if appropriate), the object / path name
is included.

Any object / path included at that stage is processed by the rclone
command.

To see the internal combined rule list, in regex form, for a command
add the --dump filters option.

This advice is due to a deficiency in rclone's argument parser rather than anything else.

The argument parser parses all the --include entries into a single list and all the --exclude entries in a separate list. Rclone then uses first the --include list (or maybe the other way round) then the --exclude list, which is very different to what the user might have intended if they had written --include x --exclude y --include z etc.
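
The grouping described above can be sketched in a few lines of Python. This is only an illustration: build_rules and TYPE_ORDER are hypothetical names, the includes-first order is an assumption (the post itself is unsure which list goes first), and the trailing "- **" stands in for the implied exclude that appears once any include is present.

```python
# Assumed processing order: all includes, then all excludes, then filters.
TYPE_ORDER = ["--include", "--exclude", "--filter"]

def build_rules(args):
    """args: list of (flag, value) pairs in command-line order.
    Returns the combined rule list as strings such as '+ x' or '- y'."""
    rules = []
    for flag in TYPE_ORDER:
        for f, value in args:        # left to right within one flag type
            if f != flag:
                continue
            if flag == "--include":
                rules.append("+ " + value)
            elif flag == "--exclude":
                rules.append("- " + value)
            else:                    # --filter values already carry + or -
                rules.append(value)
    if any(f == "--include" for f, _ in args):
        rules.append("- **")         # implied trailing exclude (assumed form)
    return rules

combined = build_rules([("--include", "x"), ("--exclude", "y"), ("--include", "z")])
print(combined)  # ['+ x', '+ z', '- y', '- **']
```

The interleaving the user typed (x, y, z) is lost: both includes end up ahead of the exclude.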

For example --include *.jpg --exclude *.doc will give the same result regardless of the order in which the --include and the --exclude are processed.

However --exclude "*.jpg" --include "/dir/**" will give different results depending on whether the --include or the --exclude is processed first. It could be processed as

- *.jpg
+ /dir/**
- *

Or

+ /dir/**
- *.jpg
- *

The first of those will include everything in "dir" except jpegs, whereas the second of those will include everything in "dir" regardless.
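
To make the difference concrete, here is a small Python sketch that evaluates both orderings with first-match-wins. The regexes are hand-written approximations of the three globs (an assumption, not rclone's actual translation), the final "- *" is treated as a catch-all, and the sample paths are invented for illustration.

```python
import re

# Hand-written regex stand-ins for the globs (assumptions).
JPG = r"(^|/)[^/]*\.jpg$"   # *.jpg
DIR = r"^dir/.*$"           # /dir/**
ANY = r"^.*$"               # final catch-all exclude

def included(rules, path):
    """First matching rule decides; no match means included."""
    for include, pattern in rules:
        if re.search(pattern, path):
            return include
    return True

exclude_first = [(False, JPG), (True, DIR), (False, ANY)]
include_first = [(True, DIR), (False, JPG), (False, ANY)]

paths = ["dir/a.jpg", "dir/b.txt", "other.txt"]  # made-up sample paths
kept_exclude_first = [p for p in paths if included(exclude_first, p)]
kept_include_first = [p for p in paths if included(include_first, p)]
print(kept_exclude_first)  # ['dir/b.txt']
print(kept_include_first)  # ['dir/a.jpg', 'dir/b.txt']
```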

You should only use one kind of flag (--include*, --exclude* or --filter*) on a given command line, but you can use as many flags of that kind as you like!

So for simple cases you might use --include x --include y or --exclude x --exclude y, for more complicated cases --filter "- x" --filter "+ x", and for even more complicated cases --filter-from filename.

Rclone builds a list of rules (which you can examine with -vv --dump filters) which are processed in order for every potential match.

If you write --include *.jpg we understand that to mean --filter "+ *.jpg" --filter "- *". If we didn't put that implicit exclude on the end then rclone would list everything, because falling off the end of the list is equivalent to matching the implied + ** at the end of the list.

So the default state for an empty list is to match everything because of the implied + ** at the end.
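
A short Python sketch of why the implicit exclude is needed. It assumes first-match-wins evaluation with "included" as the fall-through result; the regexes and file names are illustrative stand-ins, not taken from rclone.

```python
import re

def included(rules, path):
    """First matching rule decides; falling off the end includes the path,
    which has the same effect as a final '+ **'."""
    for include, pattern in rules:
        if re.search(pattern, path):
            return include
    return True

JPG = r"(^|/)[^/]*\.jpg$"   # assumed regex for *.jpg
ANY = r"^.*$"               # assumed regex for the trailing exclude

paths = ["a.jpg", "b.doc"]  # made-up file names
no_rules = [p for p in paths if included([], p)]
include_only = [p for p in paths if included([(True, JPG)], p)]
include_then_exclude = [p for p in paths if included([(True, JPG), (False, ANY)], p)]
print(no_rules)              # ['a.jpg', 'b.doc']
print(include_only)          # ['a.jpg', 'b.doc']
print(include_then_exclude)  # ['a.jpg']
```

Without the trailing exclude, the lone "+" rule changes nothing, since b.doc simply falls through and is included anyway.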

This should probably say something like: on any given command line you should only use either the --include flags or the --exclude flags or the --filter flags

There isn't a --filter-from-raw flag - you are thinking of --files-from and --files-from-raw

To be fair it does what the documentation says it does given the precedence of filter options. Maybe a 'feature' rather than 'deficiency'.

That came from the original documentation! Well spotted, it had been puzzling me. Do the --files-from and --files-from-raw options use the same filter mechanism - that is to say would it be dangerous to mix --exclude or --filter options with --files-from.. options?

That makes rclone filters a lot clearer to me.

OK sorry I found the documentation that this isn't an issue

The **filtering rules are ignored**

:slight_smile:

I think if you try to mix --files-from and other filters then rclone will give a fatal error:

2020/11/06 16:19:04 Failed to load filters: The usage of --files-from overrides all other filters, it should be used alone or with --files-from-raw

Yes, sorry, I had found that one in the docs and by testing and should have edited the Q out.

I did raise a Github issue to query the treatment of --include* or --exclude* with --filter* at https://github.com/rclone/rclone/issues/4741

