Rclone copy using a regular expression with --include to match multiple file names

What is the problem you are having with rclone?

How can we copy files with rclone using a regular expression?
Suppose we have a file named
CLIENT_DELTA.20210710.A901
There are multiple files here with a date and .A901 in the name.
Our requirement is to filter the files by the date "20210710" and by "A901" at the end of the file name.
How can we copy only the files that match from the remote folder, so that extra files are not copied to our server and do not choke the network?

What is your rclone version (output from rclone version)

rclone v1.56.0

Which cloud storage system are you using? (eg Google Drive)

SFTP on port 22, copying data from a remote Linux server with rclone copy

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copy --include=?20210710?A901 --sftp-host=remote_server :sftp:/incoming --sftp-user=minio --sftp-pass='' datasync --log-file=/var/log/rclone/rclone_pull.log -vvv -P

The rclone config contents with secrets removed.

No special config, as we are using rclone copy over SFTP on port 22.

A log from the command with the -vv flag

hi,
rclone should be able to do that as documented here

Sorry, but I still do not understand how to pick the correct expression for --filter or --include.

not sure i understand your example,
perhaps --include={*}.20210710.A901

if not, then post a few examples of filenames that you want to match

I want to match only the date and the trailing .A901, no matter what the file name starts with.

DWH.2MMO.BILLING_PARAMETERS.20170425.A901
DWH.2MMO.BILLING_PARAMETERS.20180723.A901
DWH.2MMO.CHARGE.20210916.A901
DWH.2MMO.CHARGE.20210917.A901
DWH.2MMO.SERVICE_AGREEMENT.19000101.A901
DWH.2PUA.CHARGE.20210916.A901
DWH.2PUA.CHARGE.20210917.A901
DWH.2PUA.SERVICE_AGREEMENT.19000101.A901
DWH.3MMC.BILLING_PARAMETERS.20180723.A901
DWH.3MMC.BILLING_PARAMETERS.20210415.A901
DWH.3MMC.CHARGE.20210917.A901
DWH.3MMC.SERVICE_AGREEMENT.19000101.A901
DWH.3PUC.CHARGE.20210916.A901
DWH.3PUC.CHARGE.20210917.A901
DWH.3PUC.SERVICE_AGREEMENT.19000101.A901
mkp_DWH.3MMC.BILLING_PARAMETERS.20180723.A901

try --include=*.{\d\d\d\d\d\d\d\d}.A901

I tried the command below, but it did not work:
rclone copy --include=*.{\20210916\20210917\20170425}.A901 --sftp-host=remote_server :sftp:/incoming/ --sftp-user=minio --sftp-pass='' /datasync/demo1 --log-file=/var/log/rclone/rclone_pull.log -vvv -P

  • your --include does not match mine?
    do you want to match specific dates or any date?

  • better to test using rclone ls, not rclone copy

  • also, when posting, post the command and entire output
    enclose it with three backticks so it looks like this

rclone ls D:\files\reg --include=*.{\d\d\d\d\d\d\d\d}.A901 -vv 
2021/10/05 12:12:57 DEBUG : Setting --config "C:\\data\\rclone\\scripts\\rclone.conf" from environment variable RCLONE_CONFIG="C:\\data\\rclone\\scripts\\rclone.conf"
2021/10/05 12:12:57 DEBUG : rclone: Version "v1.56.0" starting with parameters ["c:\\data\\rclone\\scripts\\rclone.exe" "ls" "D:\\files\\reg" "--include=*.{\\d\\d\\d\\d\\d\\d\\d\\d}.A901" "-vv"]
2021/10/05 12:12:57 DEBUG : Creating backend with remote "D:\\files\\reg"
2021/10/05 12:12:57 DEBUG : Using config file from "C:\\data\\rclone\\scripts\\rclone.conf"
2021/10/05 12:12:57 DEBUG : fs cache: renaming cache item "D:\\files\\reg" to be canonical "//?/D:/files/reg"
2021/10/05 12:12:57 DEBUG : CLIENT_DELTA.2021071_.A901: Excluded
2021/10/05 12:12:57 DEBUG : DWH.2MMO.BILLING_PARAMETERS.20170d42.A901: Excluded
2021/10/05 12:12:57 DEBUG : kindle-for-pc-1-17-44170.exe.md5: Excluded
2021/10/05 12:12:57 DEBUG : mkp_DWH.3MMC.BILLING_PARAMETERS.20180723.A902: Excluded
        1 CLIENT_DELTA.20210710.A901
        1 CLIENT_DELTA.20210711.A901
        1 CLIENT_DELTA.20210712.A901
        1 DWH.2MMO.BILLING_PARAMETERS.20170142.A901
        1 DWH.2MMO.BILLING_PARAMETERS.20170425.A901
        1 DWH.2PUA.SERVICE_AGREEMENT.19000101.A901

these are the files there
DWH.2MMO.SERVICE_AGREEMENT.20210702.A901
mkp_DWH.3MMC.BILLING_PARAMETERS.20180723.A901
DWH.3PUC.SERVICE_AGREEMENT.19000101.A901
DWH.3MMC.SERVICE_AGREEMENT.19000101.A901
DWH.3MMC.BILLING_PARAMETERS.20210415.A901
DWH.3MMC.BILLING_PARAMETERS.20180723.A901
DWH.2PUA.SERVICE_AGREEMENT.19000101.A901
DWH.2MMO.SERVICE_AGREEMENT.19000101.A901
DWH.2MMO.BILLING_PARAMETERS.20180723.A901
DWH.2MMO.BILLING_PARAMETERS.20170425.A901

I ran the command
rclone ls /opt/SP/minio/datasync/demo1 --include=.{\d\d\d\d\d\d\d\d}.A901 -vv
2021/10/05 18:20:03 DEBUG : rclone: Version "v1.56.0" starting with parameters ["rclone" "ls" "/opt/SP/minio/datasync/demo1" "--include=.{dddddddd}.A901" "-vv"]
2021/10/05 18:20:03 DEBUG : Creating backend with remote "/opt/SP/minio/datasync/demo1"
2021/10/05 18:20:03 DEBUG : Using config file from "/opt/SP/miniousr/.config/rclone/rclone.conf"
2021/10/05 18:20:03 DEBUG : DWH.2MMO.SERVICE_AGREEMENT.19000101.A901: Excluded
2021/10/05 18:20:03 DEBUG : DWH.2MMO.SERVICE_AGREEMENT.20210702.A901: Excluded
2021/10/05 18:20:03 DEBUG : DWH.2MMO.BILLING_PARAMETERS.20170425.A901: Excluded
2021/10/05 18:20:03 DEBUG : DWH.2PUA.SERVICE_AGREEMENT.19000101.A901: Excluded
2021/10/05 18:20:03 DEBUG : mkp_DWH.3MMC.BILLING_PARAMETERS.20180723.A901: Excluded
2021/10/05 18:20:03 DEBUG : DWH.3MMC.BILLING_PARAMETERS.20210415.A901: Excluded
2021/10/05 18:20:03 DEBUG : DWH.3PUC.SERVICE_AGREEMENT.19000101.A901: Excluded
2021/10/05 18:20:03 DEBUG : DWH.2MMO.BILLING_PARAMETERS.20180723.A901: Excluded
2021/10/05 18:20:03 DEBUG : DWH.3MMC.SERVICE_AGREEMENT.19000101.A901: Excluded
2021/10/05 18:20:03 DEBUG : DWH.3MMC.BILLING_PARAMETERS.20180723.A901: Excluded
2021/10/05 18:20:03 DEBUG : 2 go routines active

I ran this, and now it is working with rclone ls:
rclone ls /opt/SP/minio/datasync/demo1 --include=.{\19000101,\20180723,\20170425}.A901 -vv
2021/10/05 18:23:12 DEBUG : rclone: Version "v1.56.0" starting with parameters ["rclone" "ls" "/opt/SP/minio/datasync/demo1" "--include=.19000101.A901" "--include=.20180723.A901" "--include=.20170425.A901" "-vv"]
2021/10/05 18:23:12 DEBUG : Creating backend with remote "/opt/SP/minio/datasync/demo1"
2021/10/05 18:23:12 DEBUG : Using config file from "/opt/SP/miniousr/.config/rclone/rclone.conf"
2021/10/05 18:23:12 DEBUG : DWH.2MMO.SERVICE_AGREEMENT.20210702.A901: Excluded
2021/10/05 18:23:12 DEBUG : DWH.3MMC.BILLING_PARAMETERS.20210415.A901: Excluded
36827 DWH.2MMO.BILLING_PARAMETERS.20170425.A901
3719 DWH.2MMO.BILLING_PARAMETERS.20180723.A901
1106 DWH.2MMO.SERVICE_AGREEMENT.19000101.A901
1106 DWH.2PUA.SERVICE_AGREEMENT.19000101.A901
5207 DWH.3MMC.BILLING_PARAMETERS.20180723.A901
71573 DWH.3MMC.SERVICE_AGREEMENT.19000101.A901
5347 DWH.3PUC.SERVICE_AGREEMENT.19000101.A901
4091 mkp_DWH.3MMC.BILLING_PARAMETERS.20180723.A901
2021/10/05 18:23:12 DEBUG : 2 go routines active

Let me try with rclone copy.

He ran this filter:

--include=*.{\d\d\d\d\d\d\d\d}.A901

You ran this one:

--include=.{\d\d\d\d\d\d\d\d}.A901

These aren't the same.

You need to match his filter exactly, as you are missing the *.

Correct.
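
A side note on the failing Linux run: its debug line shows the filter arriving as {dddddddd}, with the backslashes gone, while the Windows run logged them intact. A POSIX shell removes an unquoted backslash in front of an ordinary character before rclone ever sees the argument, and bash additionally expands an unquoted {a,b,c} into separate words, which is why the working run logged three --include flags. A minimal sketch of the backslash effect, using printf as a stand-in for rclone so it can be run anywhere (the variable names are only illustrative):

```shell
#!/bin/sh
# printf '%s' prints an argument exactly as a program such as rclone would receive it.

# Unquoted, as typed in the failing run: the shell removes each backslash.
unquoted=$(printf '%s' --include=.{\d\d\d\d\d\d\d\d}.A901)

# Single-quoted: every character, including * and \, is passed through untouched.
quoted=$(printf '%s' '--include=*.{\d\d\d\d\d\d\d\d}.A901')

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

Wrapping the whole filter in single quotes on Linux should therefore hand rclone the same string that the Windows command line passed through.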

What if we want to copy every file whose name contains 'SERVICE_AGREEMENT' and exclude the rest?

Is there any way to copy the files with rclone copy /demo1/*SERVICE_AGREEMENT*.A901 to the dest folder, without using --include?

That is a very interesting answer and shouldn't work at all because rclone filter strings are file globs not regexps.

However it does work...

I think the reason it works is actually a bug. Rclone translates the filters into regular expressions (you can see them with -vv --dump filters), but it is dumping the \d straight into the regexp, which it shouldn't be doing: it should be escaping the \.

This means that any of the regexp \ chars will work in the glob, which certainly wasn't intended!

So I should probably either document this or fix it....

(I keep meaning to make an alternate syntax which can be a regexp for filters - my best idea for not breaking too much stuff is to allow regexp in the --filter family of commands, so instead of writing --filter + blah you'd write --filter r+ .*\d{5})
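
Since the \d behaviour is described above as accidental, a pattern built only from documented glob syntax may be the safer choice. Rclone's filter globs support ? for a single character and [0-9] style character classes, so the eight-digit date can be written without any backslashes. A minimal sketch that builds such a pattern, checks it locally against file names from this thread, and prints the rclone command rather than running it. The helper names date_glob and matches and the path /incoming are illustrative, and the local check assumes shell case patterns behave like rclone globs for plain file names without a / in them:

```shell
#!/bin/sh
# Build a glob for "<anything>.<8 digits>.A901" using only *, and [0-9] classes.
date_glob() {
    digits=''
    i=0
    while [ "$i" -lt 8 ]; do
        digits="${digits}[0-9]"
        i=$((i + 1))
    done
    printf '*.%s.A901' "$digits"
}

pattern=$(date_glob)

# Local dry check: an unquoted variable in a case pattern is matched as a glob.
matches() {
    case "$1" in
        $pattern) return 0 ;;
        *) return 1 ;;
    esac
}

for name in DWH.2MMO.CHARGE.20210916.A901 DWH.2MMO.BILLING_PARAMETERS.20170d42.A901; do
    if matches "$name"; then
        printf 'included: %s\n' "$name"
    else
        printf 'excluded: %s\n' "$name"
    fi
done

# Print the command for inspection; the single quotes keep the shell from touching the pattern.
printf "rclone ls /incoming --include '%s'\n" "$pattern"
```

If any character in the date position is acceptable, the shorter '*.????????.A901' is a looser alternative that uses the same documented syntax.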

Thank you Nick. Is there any way we can extract the file name, the last part, from this full path?

/stream4/interface1/in_folder/abc-interface-files.????????.txt

Is there any way to copy files matching this pattern via rclone? rclone copy only works when we give the full path and file name. How can I use the path and this file pattern for the copy, given that the path is being read from GitHub?

I am reading a config file from GitHub using curl, and it has this path in it:
"/opt/SP/edwdata/ab_data_mount/main/serial/VFDE/public/DWH_PUB/main/mozart/incoming/DWH.2MMO.SERVICE_AGREEMENT.????????.A901"
rclone is not able to copy directly with this pattern without using --include or --filter. The thing is, the file name and path can change every time. Is there any way we can extract the file name containing the ???? and put it into --include=DWH.2MMO.SERVICE_AGREEMENT.????????.A901, like this, to get the rclone copy job done?

Please help: is there any filter using grep or awk that can separate the path and the file pattern, so that they work with rclone copy using --include?

and the magic 8 ball says!

You'll have to pre-process the data.

You could do something like this assuming you had a file full of path names - this would return the file name

awk 'BEGIN { FS="/" } { print $NF }' < filez
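
The same split can also be done with shell parameter expansion, which keeps the directory part as well, so both halves can be handed to rclone copy. A minimal sketch using the example path from the question. The SFTP flags and the destination /datasync/demo1 are taken from the commands earlier in the thread, the variable names are illustrative, and the command is only printed, not run:

```shell
#!/bin/sh
# Path with a glob in the last component, as in the question above.
full='/stream4/interface1/in_folder/abc-interface-files.????????.txt'

dir=${full%/*}        # everything before the last "/"
pattern=${full##*/}   # everything after the last "/"

printf 'dir:     %s\n' "$dir"
printf 'pattern: %s\n' "$pattern"

# Print the resulting command for review instead of executing it.
printf 'rclone copy --sftp-host=remote_server --sftp-user=minio ":sftp:%s" /datasync/demo1 --include "%s"\n' "$dir" "$pattern"
```

Keeping "$pattern" quoted wherever it is used stops the local shell from expanding the ? characters against files in the current directory.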