ylorn
June 30, 2023, 6:14pm
What is the problem you are having with rclone?
The rclone ls[f|d] --recursive command produces results that do not match the given regex pattern.
Take the following directory structure as an example:
test/
├── _A_
│   └── _a
│       └── a
├── _B_
│   └── b_
│       └── b
└── _C_
    └── _c_
        └── c
Running rclone lsf g:test --dirs-only --recursive --fast-list gives this raw result:
_B_/
_A_/
_C_/
_B_/b_/
_A_/_a/
_C_/_c_/
Now I want to list only the folders/subfolders whose names are surrounded by _, which means only the following should be included:
_B_/
_A_/
_C_/
_C_/_c_/
So I tried to apply filters to the original command with a regex pattern:
rclone lsf g:test --dirs-only --recursive --fast-list --include "{{^(_.+_/)+$}}"
Presumably the result should be only _A_, _B_, _C_ and _C_/_c_, but it STILL gives the same result as the original command without any filters.
How can I get the expected result?
Run the command 'rclone version' and share the full output of the command.
rclone v1.62.2
os/version: arch (64 bit)
os/kernel: 6.3.7-arch1-1 (x86_64)
os/type: linux
os/arch: amd64
go/version: go1.20.4
go/linking: dynamic
go/tags: none
Which cloud storage system are you using? (eg Google Drive)
Google Drive & Dropbox
As per the docs:
Here is how the {{regexp}} is transformed into a full regular expression to match the entire path:
{{regexp}} becomes (^|/)(regexp)$
/{{regexp}} becomes ^(regexp)$
In addition, your regexp "^(_.+_/)+$" does not do what you intended. Here are the results using https://regex101.com/
Using test folder with the same dirs as in your example:
rclone lsf test --recursive --include="{{(_.+_)}}/"
_A_/
_B_/
_C_/
_C_/_c_/
_B_/b_/
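To see why _B_/b_/ slips through, the documented transformation can be tried offline. This is only a sketch: it uses Python's re module, which is not the Go engine rclone uses but agrees with it for this simple pattern. The helper name expand is hypothetical, and placing the trailing / between the group and $ is an assumption about how the text after }} is handled.

```python
import re

def expand(regexp: str, anchored: bool, suffix: str = "") -> str:
    """Hypothetical helper mirroring the documented rule:
    {{regexp}}  -> (^|/)(regexp)$
    /{{regexp}} -> ^(regexp)$
    `suffix` is literal text after the closing braces (assumed to sit before $)."""
    prefix = "^" if anchored else "(^|/)"
    return f"{prefix}({regexp}){suffix}$"

# Directories from the example tree, written with a trailing slash as lsf prints them.
dirs = ["_A_/", "_B_/", "_C_/", "_C_/_c_/", "_B_/b_/", "_A_/_a/"]

pattern = expand("(_.+_)", anchored=False, suffix="/")
matched = [d for d in dirs if re.search(pattern, d)]

print(pattern)
print(matched)
```

Because .+ is greedy and may cross a /, the group matches the whole of _B_/b_ (it starts and ends with an underscore), so _B_/b_/ is kept while _A_/_a/ is not.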
ylorn
July 1, 2023, 6:39pm
Thank you for pointing out the automatic transformation of the regex in rclone filters. But I don't think --include="{{(_.+_)}}/" gives the intended result, since _B_/b_/ is included.
I tried this regex on https://regex101.com: ^(_[^\/]+?_\/)+$, and it seems to give me the expected 4 matches.
Presumably it fits the /{{regexp}} becomes ^(regexp)$ transformation, so I tried the filter --include="/{{(_[^\/]+?_\/)+}}" with ^ and $ stripped. But it still gives the unfiltered result:
❯ rclone lsf g:test --dirs-only --recursive --fast-list --include="/{{(_[^\/]+?_\/)+}}"
_B_/
_A_/
_C_/
_C_/_c_/
_B_/b_/
_A_/_a/
You are right. I think this will require checking the source code of the regex implementation in rclone filters.
ylorn
July 1, 2023, 6:58pm
Shall I open an issue on GitHub?
I would say yes. Maybe it is intended, but it definitely looks like it cannot handle groups in a regexp.
And your example is perfect as it is easy to replicate and test.
Rclone doesn't use those regular expressions, so testing against that site won't be very helpful.
You can see what's available here.
Rclone uses Go regular expressions, which are a little different.
Here we are :) I will definitely read it, but I doubt that Go decided to reinvent the wheel.
Either way, it should be documented or fixed. So IMO it is a useful exercise.
ylorn
July 1, 2023, 7:18pm
I think there is some confusion, as rclone filters work on files, not really on directories.
That regex won't work on directories.
[felix@gemini test]$ rclone lsf /home/felix/test --dirs-only --recursive --fast-list --include="/{{(_[^\/]+?_\/)+}}" --dump filters
2023/07/01 15:22:42 NOTICE: Automatically setting -vv as --dump is enabled
2023/07/01 15:22:42 INFO : Can't figure out directory filters from "/{{(_[^\\/]+?_\\/)+}}": looking in all directories
--- start filters ---
--- File filter rules ---
+ ^((_[^\/]+?_\/)+)$
- ^.*$
--- Directory filter rules ---
+ ^.*$
- ^.*$
So it dumps it out.
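The dump explains the unfiltered directory listing: the regex landed in the file rules, while the directory rules begin with + ^.*$. Below is a rough Python model of those two rule lists, assuming rules are tried in order and the first match decides. The function name allowed and the sample paths are illustrative, and Python's re stands in for Go's regexp package.

```python
import re

# Rules copied from the --dump filters output as (include?, regex), tried in order.
FILE_RULES = [(True, r"^((_[^\/]+?_\/)+)$"), (False, r"^.*$")]
DIR_RULES = [(True, r"^.*$"), (False, r"^.*$")]

def allowed(path, rules):
    """Return the verdict of the first rule whose regex matches the path."""
    for include, pattern in rules:
        if re.search(pattern, path):
            return include
    return True  # assumption: a path that matches no rule is included

# Paths from the example tree; directories carry a trailing slash.
dirs = ["_A_/", "_A_/_a/", "_B_/", "_B_/b_/", "_C_/", "_C_/_c_/"]
files = ["_A_/_a/a", "_B_/b_/b", "_C_/_c_/c"]

kept_dirs = [d for d in dirs if allowed(d, DIR_RULES)]
kept_files = [f for f in files if allowed(f, FILE_RULES)]

print(kept_dirs)   # every directory passes the catch-all + ^.*$
print(kept_files)  # no file path ends in "_/", so none passes
```

In this model the regex only ever sees file paths, which never end in a slash, so it excludes every file and has no say over which directories are listed.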
ylorn
July 1, 2023, 7:30pm
Removing the --dirs-only flag is fine with me; I am just using directories to demonstrate the issue. But even without that flag the result is still not what I expect:
raw listing without any filters:
❯ rclone lsf g:test --recursive --fast-list
_B_/
_A_/
_C_/
_C_/_c_/
_B_/b_/
_A_/_a/
_C_/_c_/c
_B_/b_/b
_A_/_a/a
adding regex filter --include="/{{(_[^/]+?_/)+}}":
❯ rclone lsf g:test --recursive --fast-list --include="/{{(_[^/]+?_/)+}}"
_B_/
_A_/
_C_/
_B_/b_/
_C_/_c_/
_A_/_a/
I'm having a hard time discerning from your output what is a file and what is a directory.
What is your goal?
Which files should be shown at the end?
I think it is irrelevant, as this is only an example where rclone regexps are not clear. A regexp is just a rule: once applied, it should bring some results. Does not? Why?
Because I cannot test what he's doing, so it's very relevant for me. Can you let the OP answer, please?
Thanks.
ylorn
July 1, 2023, 7:40pm
There are some folders that do not follow our naming guidelines, and we want to find and correct them. There are too many to find and rename manually, so I want to filter them out and rename them programmatically in batch.
kapitainsky:
Does not? Why?
And I've answered 'why' above.
I'm not sure if that's by design or a bug or whatnot, but if I can figure out the use case and the goal, I might be able to assist so I need a way to replicate it so I can test.
Even more specific from the docs:
When using pattern list syntax, if a pattern item contains either / or **, then rclone will not be able to imply a directory filter rule from this pattern list.
Link here -> Rclone Filtering
ylorn
July 1, 2023, 8:11pm
I figured out the problem: the filters we tried are not recognized as directory filters, hence they are not applied.
As per the docs:
Directory filter rules are defined with a closing / separator.
So we have to surround our regex filter with / at the outermost layer, as /{{.*}}/, for rclone to recognize it.
As a result, we need to compose the filter like this: --include="/{{(_[^\/]+_\/?)+}}/" so that rclone adds it to the directory filters:
2023/07/01 22:03:36 NOTICE: Automatically setting -vv as --dump is enabled
--- start filters ---
--- File filter rules ---
- ^.*$
--- Directory filter rules ---
+ ^((_[^\/]+_\/?)+)/$
- ^.*$
--- end filters ---
Therefore, the following command will give us the correct result:
❯ rclone lsf g:test --dirs-only --recursive --fast-list --include="/{{(_[^\/]+_\/?)+}}/"
_A_/
_B_/
_C_/
_C_/_c_/
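As a quick offline check, the directory rule printed in the dump can be run against the six directories of the example tree. This sketch uses Python's re module as a stand-in for Go's regexp package (the two behave the same for this pattern); the variable names are illustrative.

```python
import re

# Directory rule copied from the --dump filters output.
DIR_RULE = re.compile(r"^((_[^\/]+_\/?)+)/$")

# Directories from the example tree, with the trailing slash rclone appends.
dirs = ["_A_/", "_A_/_a/", "_B_/", "_B_/b_/", "_C_/", "_C_/_c_/"]

kept = [d for d in dirs if DIR_RULE.search(d)]
dropped = [d for d in dirs if not DIR_RULE.search(d)]

print(kept)
print(dropped)
```

Since [^\/]+ cannot cross a slash, every path component has to start and end with an underscore on its own. That is why _A_/_a/ and _B_/b_/ are dropped here, and those two are exactly the folders that break the naming rule.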
Very clever figuring it out. Thank you.
darthShadow
(Anagh Kumar Baranwal)
July 1, 2023, 10:39pm
Just for future reference: https://filterdemo.rclone.org/ is an excellent site for testing out filters. It also logs the file & directory filter rules to the browser console.