Help with filtering (excluding) syntax please

Hi,
I have been trying for hours. I read multiple entries on different forums about this. And I still cannot achieve the correct filtering. Particularly Virtualbox VMs, I gave up and tried different versions of trying to exclude this damn folder.

The idea is to back up the home folder to Google Drive, excluding the folders that are either too big or not important enough to back up.

I tried the general syntax:
-v --exclude "/.folder/**" for hidden folders
-v --exclude "folder/**" for non hidden folders

and for files:
-v --exclude ".file"

The command (1) I am trying:
rclone --delete-excluded sync -v -P /home/user/ khan:/xps-backup/00/home/ -v --exclude ".cache/**" -v --exclude ".Trash/**" -v --exclude ".dbus/**" -v --exclude ".local/share/Trash/**" -v --exclude ".aptitude/**" -v --exclude ".cddb/**" -v --exclude ".Private/**" -v --exclude ".thumbnails/**" -v --exclude "/vmware/**" -v --exclude ".vmware/**" -v --exclude ".steam/**" -v --exclude "/Virtualbox\ VMs/**" -v --exclude ".rvm/**" -v --exclude "/Trash/**" -v --exclude ".npm/**" -v --exclude ".DS_Store/**" -v --exclude "/thumb/**" -v --exclude "/Thumbs.db/**" -v --exclude ".gksu.lock/**" -v --exclude ".pulse/**" -v --exclude ".pulse-cookie" -v --exclude ".config/google-chrome/ShaderCache/**" -v --exclude ".config/google-chrome/Default/Local Storage/**" -v --exclude ".config/google-chrome/Default/Application Cache/**" -v --exclude ".config/google-chrome/Default/History Index */**" -v --exclude ".config/google-chrome/Default/Service Worker/CacheStorage/**" -v --exclude ".viminfo/**" -v --exclude ".config/VirtualBox/VBoxSVC.log/**" -v --exclude ".mozilla/**" -v --exclude ".scribus/**" -v --exclude ".steam/**" -v --exclude "/Downloads/sitesarchiveren/**" -v --exclude "Music/**" -v --exclude "Pictures/**" -v --exclude "windows95/**" -v --exclude "Steam/**" -v --exclude ".gvfs/**" -v --exclude ".npm/**" -v --exclude ".gimp-2.8/**" -v --exclude ".viminfo" -v --exclude ".wget-hsts" -v --exclude ".zsh_history" -v --exclude "/snap/**" -v --exclude ".shutter/**" -v --exclude ".bash_history" -v --exclude "Desktop/**" -v --exclude ".vmware/**" -v --exclude "vmware/**" -v --exclude "Virtualbox VMs/**" -v --exclude "/Virtualbox VMs/**" -v --exclude "snap/**" -v --exclude "Virtualbox VMs/" --dump filters

I know I can shorten everything. I tried, but it wasn't working, so I went back to the basic switches to debug it.

I've been messing around more with this long command, but here is the filter file, which did not really work either:
https://pastebin.com/GTYqbPBY
With this command (2):
sudo rclone sync -v -P --filter-from ~/Documents/scripts/filter.txt --dump filters /home/user google:/xps-backup/00/home/

Command (1) output:
https://pastebin.com/ijSiZkfD

Command (2) output:
https://pastebin.com/2gfPEqQ5

I tried both options multiple times. If anyone has any ideas about what to change, or can tell me what I am doing wrong, please let me know. Is it because I am using /home/user/ instead of /home/user as a source?

Regards,

I'd probably focus on the filter from which is the second one. It doesn't look like you have anything to back up included in the filter list.

Can you share the:
~/Documents/scripts/filter.txt

It's always better to use full paths rather than the ~ in things too.

Exactly what isn't working?

Your filter file looks pretty good :slight_smile: I would stick to using a filter-from file.

I think you want a + * at the bottom of the filter file just to be explicit about what happens to all the other files that didn't match anything so far.

Note that you don't want escaping in the filter from file.

- /Virtualbox VMs/**

Note that if you don't use a leading / rclone will exclude any directory called Pictures and its contents from anywhere in the file system hierarchy. That may be what you want I'm not sure.

- Pictures/**

Using rclone ls --filter-from file.txt /home/user/ is an easy way to test filters.

OK, so I clearly forgot to describe the problem. I was too frustrated, my bad, sorry.
The problem is that the filtering is not working: it is still uploading the folder Virtualbox VMs, among other files. I do not trust the filters I am using. So when I do rclone sync /home/user remote:/target/
and I want to filter out Pictures,
I need to put - "Pictures/**"
and - "/Pictures/**" won't work, as far as I understand?

Note that you don't want escaping in the filter from file.


- /Virtualbox VMs/**

I put that folder in the filter file in different ways and it was still being uploaded, as I could see in the verbose output and progress. Other things are still ending up on the remote as well.

Using  `rclone ls --filter-from file.txt /home/user/`  is an easy way to test filters.

This looks like a better way indeed, thank you

Further on this topic: imagine I get all the filters working like I want and there are still old files on the remote. Where do I put --delete-excluded?

The first one (no /) will filter out /home/user/Pictures, /home/user/subdir/Pictures and /home/user/sub/subdir/Pictures

whereas the second will only filter out /home/user/Pictures
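
A rough way to picture the difference is to translate each pattern into the kind of regular expression that rclone prints with --dump filters. The Python sketch below is a simplified model and not rclone's actual implementation: `glob_to_regex` and `excluded_by` are hypothetical helper names, only `*`, `**` and a leading `/` are handled, and it looks at file paths only.

```python
import re


def glob_to_regex(pattern):
    """Translate a tiny subset of rclone-style globs into a regex string.

    Assumption: only '*', '**' and a leading '/' are modelled here.
    """
    rooted = pattern.startswith("/")
    body = pattern[1:] if rooted else pattern
    parts = []
    i = 0
    while i < len(body):
        if body.startswith("**", i):
            parts.append(".*")      # '**' matches anything, including '/'
            i += 2
        elif body[i] == "*":
            parts.append("[^/]*")   # '*' stops at a directory separator
            i += 1
        else:
            parts.append(re.escape(body[i]))
            i += 1
    # A leading '/' pins the match to the root of the sync; without it the
    # pattern may start at the root or right after any '/' in the path.
    prefix = "^" if rooted else "(^|/)"
    return prefix + "".join(parts) + "$"


def excluded_by(pattern, paths):
    """Return the paths (relative to the sync root) that the pattern matches."""
    regex = re.compile(glob_to_regex(pattern))
    return [p for p in paths if regex.search(p)]


paths = [
    "Pictures/a.jpg",
    "subdir/Pictures/b.jpg",
    "sub/subdir/Pictures/c.jpg",
    "Documents/d.txt",
]
print(excluded_by("Pictures/**", paths))
print(excluded_by("/Pictures/**", paths))
```

In this model the unrooted pattern matches all three Pictures paths, while the rooted one matches only `Pictures/a.jpg`.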

Is that clearer?

Is there a folder called "/home/user/Virtualbox VMs" - that is what you are filtering out.

The --delete-excluded flag goes on the rclone sync command line, as in rclone sync --delete-excluded.

Do try with --dry-run first though!
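
If the sync is launched from a script, one way to keep the dry run as the default is to build the argument list in a small helper. This is only a sketch: `build_sync_command` is a hypothetical name, and the paths are the ones used elsewhere in this thread. Passing the arguments as a list also means no shell quoting is involved.

```python
import shlex


def build_sync_command(source, dest, filter_file, dry_run=True):
    """Build the argument list for an rclone sync with --delete-excluded."""
    cmd = ["rclone", "sync", "-v", "--delete-excluded",
           "--filter-from", filter_file]
    if dry_run:
        cmd.append("--dry-run")  # report what would change without changing it
    cmd += [source, dest]
    return cmd


cmd = build_sync_command(
    "/home/user/",
    "remote:/xps-backup/00/home/",
    "/home/user/Documents/scripts/filter.txt",
)
# Print a copy-pasteable version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Once the dry-run output looks right, call the helper again with `dry_run=False`.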

The first one (no /) will filter out  `/home/user/Pictures` ,  `/home/user/subdir/Pictures`  and  `/home/user/sub/subdir/Pictures`

whereas the second will only filter out  `/home/user/Pictures`

Is that clearer?

Yes

You are writing it now without the quotation marks. Are they not necessary in the filter file?

I'd probably focus on the filter from which is the second one. It doesn't look like you have anything to back up included in the filter list.

Haha I guess you are right, I ended up with a really cumbersome looking bash script, that was before I found that you can make a filter file. And after I stopped trusting the filter file.

Can you share the:
~/Documents/scripts/filter.txt

Thanks for the tip, using full paths now.
Sure: https://pastebin.com/GTYqbPBY

I think you want a + * at the bottom of the filter file just to be explicit about what happens to all the other files that didn't match anything so far.

Wait, what does that + * do at the bottom?
Sorry, yesterday was a crazy busy day on top of figuring out rclone. I am going over this again today with more sleep and less coffee.

That includes everything else that remains.

You don't need to do this as it is the default when rclone falls off the bottom of the list - the file gets included, but it is nice and explicit.
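
The "first match wins, otherwise include" behaviour can be sketched in a few lines of Python. This is an illustration rather than rclone's code: `is_included` is a hypothetical helper, and the two regexes are written in the form that rclone prints with --dump filters, assumed here to correspond to `- /VirtualBox VMs/**` followed by `+ *`.

```python
import re

# File rules in the form rclone prints with --dump filters.
# Assumption: these correspond to "- /VirtualBox VMs/**" and "+ *".
RULES = [
    ("-", re.compile(r"^VirtualBox VMs/.*$")),
    ("+", re.compile(r"(^|/)[^/]*$")),
]


def is_included(path, rules):
    """Apply rules top to bottom; the first rule that matches decides."""
    for sign, regex in rules:
        if regex.search(path):
            return sign == "+"
    # Fell off the bottom of the list: the file is included by default.
    return True


for path in ["other.txt", "VirtualBox VMs/vm.txt"]:
    print(path, "->", "included" if is_included(path, RULES) else "excluded")
```

Because the first matching rule decides, putting `+ *` above the exclude would let everything through, which is why it belongs at the bottom of the file.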

So I used rclone ls --filter-from on the filter file and it spit out everything it would exclude, looked good, I found Virtualbox VMs in there. So I started the sync and it completely ignored the filter for Virtualbox VMs, what am I doing wrong?

sudo rclone sync -v -P --delete-excluded --filter-from /home/user/Documents/scripts/filter.txt /home/user/ remote:/xps-backup/00/home/

Did you change this?

- /Virtualbox\ VMs/** #WTF YOU NEED RCLONE ESCAPING? NO ESCAPING?

Needs to be this without the \

- /Virtualbox VMs/**

Yes I changed it, it was in my filter file like this "- /Virtualbox VMs/** "

This is the file I used:
https://pastebin.com/a3EzMA5M

Ok this was a typo, the directory is called VirtualBox VMs and not Virtualbox VMs, embarrassing
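
Since a capitalisation slip like this is easy to miss, a small script can compare the rooted rules in a filter file against what is actually on disk. The sketch below is a hypothetical helper (`find_case_mismatches` is not part of rclone) and only checks the first path component of rules that start with `/` and contain no wildcards in that component.

```python
import os
import tempfile


def find_case_mismatches(filter_lines, root):
    """Report rooted rules whose top-level name is not an entry of root.

    Returns (name, candidates) pairs, where candidates are the entries of
    root that differ from name only by upper/lower case (possibly none).
    """
    entries = os.listdir(root)
    problems = []
    for line in filter_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        sign, _, pattern = line.partition(" ")
        pattern = pattern.strip()
        if sign not in ("+", "-") or not pattern.startswith("/"):
            continue  # only rooted include/exclude rules are checked
        name = pattern[1:].split("/")[0]
        if not name or any(ch in name for ch in "*?[{"):
            continue  # wildcards cannot be compared against the listing
        if name in entries:
            continue  # exact, case-sensitive match exists
        near = [e for e in entries if e.lower() == name.lower()]
        problems.append((name, near))
    return problems


# Demo on a throwaway directory that mimics the situation in this thread.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "VirtualBox VMs"))
    os.mkdir(os.path.join(root, "Pictures"))
    rules = ["- /Virtualbox VMs/**", "- /Pictures/**", "+ *"]
    for name, near in find_case_mismatches(rules, root):
        print("no entry named %r; similar entries: %r" % (name, near))
```

To use it on a real setup, read the lines of the filter file and pass the source directory of the sync as `root`.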

OK, it turns out you need to put - /'VirtualBox VMs'/** in the filter file and not - /VirtualBox VMs/**
It won't work otherwise. The same goes for the folder 'Calibre Library', for example.

I don't think that is correct... Here is an example

$ rclone ls src
        0 other.txt
        0 VirtualBox VMs/vm.txt

$ cat filter-file 
- /VirtualBox VMs/**
+ *

$ rclone ls src --filter-from filter-file 
        0 other.txt

$ cat filter-file2
- /'VirtualBox VMs'/**
+ *

$ rclone ls src --filter-from filter-file2 
        0 other.txt
        0 VirtualBox VMs/vm.txt
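
Keep in mind that rclone ls prints the files that pass the filters, so anything that shows up in the listing is something that would be transferred. The reason the quoted rule lets the directory through is that the quote characters are taken literally. A minimal Python check, using regexes in the form rclone prints with --dump filters (the unquoted one is an assumed equivalent of `- /VirtualBox VMs/**`):

```python
import re

# Regexes in the form rclone prints with --dump filters.
quoted = re.compile(r"^'VirtualBox VMs'/.*$")   # for: - /'VirtualBox VMs'/**
unquoted = re.compile(r"^VirtualBox VMs/.*$")   # assumed form for: - /VirtualBox VMs/**

path = "VirtualBox VMs/vm.txt"
print("quoted rule matches:  ", bool(quoted.search(path)))
print("unquoted rule matches:", bool(unquoted.search(path)))
```

The quoted rule would only match a directory whose name really contains the two apostrophes, so the exclude never fires for the real directory.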

I did rclone ls src --filter-from filter/file/location.txt | grep VMs
With this in the filter file /'VirtualBox VMs'/** and the output was this:
https://pastebin.com/u9Eee8wm
VirtualBox VMs was showing up, great!

So I went ahead with the sync, and the huge folder still ended up in the remote location. :frowning:

Can you share the contents of the filter file you used?

cat /home/user/Documents/scripts/filter.txt

Yes of course.

https://pastebin.com/Lx4mJFtN

What OS are you on?

I can see it working without single quotes and not working with single quotes.

[felix@gemini test]$ rclone ls /home/felix/test --filter-from /home/felix/filter.txt  --dump filters
--- start filters ---
--- File filter rules ---
- ^'VirtualBox VMs'/.*$
+ (^|/)[^/]*$
--- Directory filter rules ---
- ^'VirtualBox VMs'/.*$
+ ^.*$
--- end filters ---
      254 hosts
        0 VirtualBox VMs/win10.ovf
[felix@gemini test]$ vi ../filter.txt
[felix@gemini test]$
[felix@gemini test]$
[felix@gemini test]$ rclone ls /home/felix/test --filter-from /home/felix/filter.txt  --dump filters
--- start filters ---
--- File filter rules ---
- ^VirtualBox VMs.*$
+ (^|/)[^/]*$
--- Directory filter rules ---
- ^VirtualBox VMs.*$
+ ^.*$
--- end filters ---
      254 hosts

Ubuntu 18
5.0.0-27-generic #28~18.04.1-Ubuntu SMP Thu Aug 22 03:00:32 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

I did the rclone ls --filter-from file.txt /home/user/ etc command and it was not showing up when the folder was written without the quotes.
When I changed it to /'VirtualBox VMs'/** It showed up in the output so yeah I went ahead and ran the sync command.
I caved sir, I just added all the virtualbox file types to the file type exclusion list as you can see. And that also worked. Still do not understand what I am doing wrong though.

Do you think it has something to do with the shell I am using? zsh?

Can we take a step back and try to do a simple test to validate it includes it properly?

If you can make a new filter file like mine and just run that and share the output here like this:

[felix@gemini test]$ rclone -vv ls /home/felix/test --filter-from /home/felix/filter.txt --dump filters
--- start filters ---
--- File filter rules ---
+ ^VirtualBox VMs.*$
- (^|/)[^/]*$
--- Directory filter rules ---
+ ^VirtualBox VMs.*/$
+ ^VirtualBox VMs.*$
- ^.*$
--- end filters ---
2019/09/26 10:39:12 DEBUG : rclone: Version "v1.49.3" starting with parameters ["rclone" "-vv" "ls" "/home/felix/test" "--filter-from" "/home/felix/filter.txt" "--dump" "filters"]
2019/09/26 10:39:12 DEBUG : Using config file from "/opt/rclone/rclone.conf"
2019/09/26 10:39:12 DEBUG : hosts: Excluded
        0 VirtualBox VMs/win10.ovf
2019/09/26 10:39:12 DEBUG : 3 go routines active
2019/09/26 10:39:12 DEBUG : rclone: Version "v1.49.3" finishing with parameters ["rclone" "-vv" "ls" "/home/felix/test" "--filter-from" "/home/felix/filter.txt" "--dump" "filters"]
[felix@gemini test]$ cat ../filter.txt
+ /VirtualBox VMs**
#Add everything else
- *

That's a one line include for that directory and exclude everything else.