Rclone - Transfer only a list of files from a specific folder

Hi,

I use Rclone for Google Drive. I have hundreds of files inside a specific folder (inside that folder there are thousands of files organized in several sub-folders). Then I have a list of specific files - let's say 200 files - I want to automatically download (not having to enter one by one). Does anybody know if there's an easy way to create a script to search inside that specific folder (and inside sub-folders of that folder) for those files and then download just those files?

The same thing for folders - I have hundreds of folders inside a specific folder (inside that folder there are other folders organized in several sub-folders). Then I have a list of specific folders - let's say 200 folders - I want to automatically download (not having to enter one by one). Does anybody know if there's an easy way to create a script to search inside that specific folder (and inside sub-folders of that folder) for those folders and then download just those folders?

Thanks,
Nuno

You can use filtering:

https://rclone.org/filtering/#files-from-read-list-of-source-file-names

If you can give an example of your use case, we can help out to define it.

Thank you very much for your help. I read the LS instructions but they don't mention ways to look for files inside folders and sub-folders (or how to limit the search to a given folder and all the contents inside it). Also, there's no mention of whether I can use LS to search for and download specific folders instead of files.

Please have a look at the image.

I want to create a script that I can use to point Rclone at /Main folder/ and do the following:

a) download the files from a list, let's say, download
file3.txt
file5.txt
file9.txt
file10.txt
file11.txt

(of course in a real world the folder tree will be more complex and I want to choose a bigger list of files, probably a list on a csv or txt)

b) Do the same, but download folders (and their contents)
12342223124.RDC
12342223126.RDC
22342223125.RDC
42223126.RDC
62342223125.RDC

(of course in a real-world the folder tree will be more complex and I want to choose a bigger list of folders, probably a list on a csv or txt)

You'd probably need to take a step back though.

If you can't identify logic or a method to programmatically identify files, filters are not the way to go as they require you to have a method to identify files.

If your system is a pick list of what you want, you do not have too many options other than using a file-based list to download.

In that example, how did you define those files to download?

Same here, you need to figure out if this is a random list based on what you want or some way to identify them.

You have many options with regular expressions, file times, sizes, etc.

Thanks. I know which files I want. The starting point is that I have a list of files (or directories) that I want to download and I know on which folder they are (although, inside that folder, they can be organized in several levels of sub-folders like in the example in the image).

What I'm looking for is a simple way - script - where:

  • I can tell Rclone where to look (so Rclone doesn't have to search all the cloud) - define the top folder where to search;
  • Send a list of files or folders I want to download - link to a simple .txt or .csv;
  • Define the folder on my computer where I want the files to be downloaded to.
  • Rclone runs the scripts and downloads the files or the folders that are listed on the .txt or .csv. Exact matches that rclone finds inside the specified folder.

Is this doable?

Thanks. My starting point is that I have a list of files / folders that I want Rclone to download. I want to search for and download specific files or folders (a list of files / folders). I don't want Rclone to create that list. I start with a list. I want rclone to find the items and download them. But because the list can be 1000 files, I want to automate that.

Is the assumption you have or know the full paths for the files?

You could use https://rclone.org/filtering/#files-from-read-list-of-source-file-names for the list. Or are you trying to search the cloud for a specific file?

So if I run back a use case with one file as an example.

file name is 12342223124.RDC
search remote and find file
download file

and basically iterate through that?

The problem with --files-from is that I know the top folder the files are in but not the full path.
I know that the files are somewhere in /Main Folder/ but don't know in which subfolder.

So from RClone example,

# comment
file1.jpg
subdir/file2.jpg

my List file is

# comment
file1.jpg
file2.jpg

But I want Rclone to find and copy file2.jpg inside the subdir.

That's my challenge.

The other question: can I use this "from-list" for folders, not just files?
12342223124.RDC --- is a folder

No, --files-from is for files only.

I'm not sure how that would work logically: if you have 4 file1.jpg in 4 different folders, how does it pick the right one?
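One way around both the unknown paths and the duplicate question, sketched here under assumptions: list the top folder recursively once (for example with `rclone lsf -R --files-only remote:Main`, where `remote:Main` is a placeholder), then match the bare names against that listing to get the relative paths that --files-from expects. The helper name `resolve_names` and the sample listing are made up for illustration. Names found more than once are reported instead of being picked silently.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def resolve_names(listing_lines, wanted_names):
    """Map wanted bare file names to the full relative paths found in a listing.

    Returns (unique_paths, ambiguous, missing):
      unique_paths - paths for names found exactly once, ready for --files-from
      ambiguous    - {name: [paths]} for names found in more than one folder
      missing      - names that do not appear in the listing at all
    """
    wanted = set(wanted_names)
    found = defaultdict(list)
    for line in listing_lines:
        path = line.strip()
        if not path:
            continue
        name = PurePosixPath(path).name  # last path element, i.e. the file name
        if name in wanted:
            found[name].append(path)
    unique_paths = sorted(p[0] for p in found.values() if len(p) == 1)
    ambiguous = {n: sorted(p) for n, p in found.items() if len(p) > 1}
    missing = sorted(wanted - set(found))
    return unique_paths, ambiguous, missing


# Hypothetical listing, one relative path per line, as a recursive lsf would print it.
listing = ["file1.jpg", "subdir/file2.jpg", "a/file3.jpg", "b/file3.jpg"]
wanted = ["file1.jpg", "file2.jpg", "file3.jpg", "file4.jpg"]

unique, ambiguous, missing = resolve_names(listing, wanted)
print("\n".join(unique))          # lines to save as the --files-from list
print("ambiguous:", ambiguous)    # needs a human decision
print("missing:", missing)
```

The printed unique paths can be saved to a text file and passed to --files-from; anything reported as ambiguous or missing has to be sorted out by hand first.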

You can use wildcards in the filter.

+ /directory/*/file.ext

* = any folder in between (not totally sure if it "digs" down, or if it only matches directories on the same level). You can experiment with the --dry-run flag.
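On the "does it dig down" question: as far as the filtering docs describe it, a single * stays within one path element, while ** also crosses directory separators. The snippet below is only a simplified model of those documented rules written for illustration (it is not rclone's code, and it ignores [...] classes, {...} alternatives and escapes); `pattern_to_regex` and `matches` are hypothetical helper names.

```python
import re


def pattern_to_regex(pattern):
    """Translate a glob pattern into a regex, using a simplified model of the documented rules."""
    anchored = pattern.startswith("/")  # a leading / ties the pattern to the root
    if anchored:
        pattern = pattern[1:]
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")       # ** may cross directory separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")    # * stays inside one path element
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")     # ? is one character, never a separator
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    body = "".join(parts)
    # Unanchored patterns match at the end of the path, on whole path elements.
    prefix = "^" if anchored else "(^|/)"
    return re.compile(prefix + body + "$")


def matches(pattern, path):
    return pattern_to_regex(pattern).search(path) is not None


print(matches("/directory/*/file.ext", "directory/a/file.ext"))
print(matches("/directory/*/file.ext", "directory/a/b/file.ext"))
print(matches("/directory/**/file.ext", "directory/a/b/file.ext"))
print(matches("file.jpg", "directory/file.jpg"))
```

Under this model, + /directory/*/file.ext only reaches one level down; going deeper needs ** or a pattern without the leading slash. A --dry-run against the real remote is still the way to confirm.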

Thanks. I will give the --dry-run flag a try.
@Animosity022 I know that there should only be one copy of the same file.

Is there any way to tell rclone to find and download full directories and not just files?

+ /path/to/dir/** will download/upload all inside the dir both files and directories.

See https://rclone.org/filtering/#filter-from-read-filtering-patterns-from-a-file
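For a list of folder names whose location under the top folder is unknown, the same idea can be written without the leading slash so the rule is not tied to the root. A rough sketch follows; `folder_rules` is a hypothetical helper, the names are taken from the example earlier in the thread, and it assumes the folder names contain none of the glob characters (* ? [ ] { } \), which would otherwise need escaping.

```python
def folder_rules(folder_names):
    """Build filter rules that include everything inside each named folder.

    Assumes plain folder names with no glob special characters.
    """
    rules = []
    for raw in folder_names:
        name = raw.strip().strip("/")
        if not name or name.startswith("#"):
            continue  # skip blank lines and comments in the input list
        # No leading slash: the folder may sit at any depth under the source root.
        rules.append("+ " + name + "/**")
    rules.append("- *")  # exclude everything that was not included above
    return rules


folders = ["12342223124.RDC", "12342223126.RDC", "# comment", ""]
rules = folder_rules(folders)
print("\n".join(rules))
```

Saved to a file (say folder-rules.txt, a made-up name), the rules would be used along the lines of `rclone copy remote:Main /local/dir --filter-from folder-rules.txt --dry-run -vv`. Drop --dry-run once the log shows the expected folders.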

Hi,

I'm still struggling with the --files-from option in Rclone. I tried different syntax options and nothing works.

In the example, if I point at the DemoFootage root directory to download, what do I need to list (what's the syntax) in the from-file.txt file to download all the files that include A004_C186_011278 in their file name? These files will be inside the /A004_C186_011278/ subdirectory of the DemoFootage root directory.

Also, how should I list files inside subdirectories?

For example, what if I point at the DemoFootage root directory and want to download A004_C186_011278_001.R3D, which is inside the subdirectory A004_C186_011278?

Nuno

Can you share what you tried?

It reads from the top down and the first match wins.

I'm a bit confused about how the --files-from ties in. If you have a --files-from list, wouldn't you use that and not a filter? If you can step through the logic of what you are trying to do, that would help me anyway.

Hi,

I'm trying to set up a workflow where I can select tens of video files (hundreds of gigabytes each) from a directory that has hundreds of video files, without having to download all of them. That way I can download only the files I need, let's say 1TB, and not ALL the files, which could be 20/30TB.

For a given TV show, I shoot for weeks and record hundreds of .mov files. Files are organized this way and saved in the cloud.

TV SHOW NAME
  /Day 1
    /Cam 1
      /Card1
      /Card2
    /Cam 2
      /Card1
      /Card2
  /Day 2
    /Cam 1
      /Card1
      /Card2
    /Cam 2
      /Card1
      /Card2
  ....
  /Day 35
    /Cam 1
      /Card1
      /Card2
    /Cam 2
      /Card1
      /Card2

As you can see, it's a complex hierarchy of folders and subfolders for archiving them.

Then, later, after the editing, I have a list of files that are the ones I need for finishing the episode, so I can assemble the final episode. I know the name of the files (I have the names of the files on a list - but not the path) but I don't know if they are inside /Day 1/Cam 1/Card 2/ or inside /Day 14/Cam 2/Card 4/. I just know that they are inside - somewhere - the TV SHOW NAME root directory.

And I want to download only those files from list in an automatic way. Not the full TV SHOW NAME root directory.

So I would like a syntax where, if I tell Rclone to download --files-from a list in a .txt file from the TV SHOW NAME root directory in the cloud, rclone will download each file whatever its path under that directory.

I tried:

filename.mov
/filename.mov
//filename.mov
/*/filename.mov

but unless I point at the specific folder where the file is, rclone doesn't go deeper into the subfolder path.

The rclone manual says that filtering should do this

file.jpg  - matches "file.jpg"
            - matches "directory/file.jpg"

But that's not the case. It only matches "file.jpg".
If I list filename.mov but the file is inside a /subdirectory/, rclone can't find it and download it. It only works if the file is in the root of the directory I select to download from.

I hope it's clear.

Would include be better? (two stars tells rclone to match even across directories)

--include=**/filename.mov

**/filename.mov doesn't work.

Rclone can't find the file if it's not in the root folder, i.e. if it's inside a subfolder.

Can you paste the full command you used with a -vv log?

root@s163042:~# rclone lsl robgs: --include=**/IMG_20200430_171631.jpg
  4467675 2020-04-30 17:16:33.824000000 Local-Pictures/2020/04/IMG_20200430_171631.jpg

I'm not using --include
I'm using --files-from using a txt file with several entries

**/file1.mov
**/file2.mov
etc.

Yes I know. That won't work and that's why I suggested a filter instead. For a --files-from you need a full path. For a filter, you can drop in partials.
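To make that concrete, here is a minimal sketch that turns a list of bare file names into an include file for --include-from, where (per the filtering docs quoted above) a pattern without a leading slash matches the name in any subdirectory. The helper names, the `remote:TV SHOW NAME` source and the local destination are placeholders, not anything from a real setup. The sketch only builds the command and does not run rclone.

```python
import re
import tempfile
from pathlib import Path


def escape_glob(name):
    """Backslash-escape characters that are special in filter patterns."""
    return re.sub(r"([*?\[\]{}\\])", r"\\\1", name)


def write_include_file(names, out_path):
    """Write one include pattern per wanted file name; return how many were written."""
    patterns = []
    for raw in names:
        name = raw.strip()
        if not name or name.startswith("#"):
            continue  # skip blank lines and comments in the input list
        # Bare name, no leading slash: meant to match at any depth under the source.
        patterns.append(escape_glob(name))
    Path(out_path).write_text("\n".join(patterns) + "\n", encoding="utf-8")
    return len(patterns)


def build_copy_command(source, dest, include_file, dry_run=True):
    """Build (but do not run) the rclone command as an argument list."""
    cmd = ["rclone", "copy", source, dest, "--include-from", str(include_file), "-vv"]
    if dry_run:
        cmd.append("--dry-run")  # only report what would be copied
    return cmd


names = ["# final conform list", "file1.mov", "file2.mov", ""]
include_file = Path(tempfile.gettempdir()) / "wanted-files.txt"
count = write_include_file(names, include_file)
cmd = build_copy_command("remote:TV SHOW NAME", "/path/to/local/dir", include_file)
print(count, "patterns written")
print(cmd)
```

The argument list can be handed to `subprocess.run(cmd, check=True)`. Keep in mind that with a filter rclone has to list the folder tree under the source to find the matches, and the copied files keep their relative subfolder paths at the destination.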