Sync folder exclude

Hello! New to this forum. I'm trying to sync my user folder to Drive, but I couldn't care less if temporary files are backed up. Is there a way to do a sync that ignores/deletes files within folders that have a specific string in their name, like "temp" or "cache"? Thanks for any assistance.

There sure is!

What you probably want to use is the filtering system.

In particular, you probably want to use --exclude [yourpattern]
For example, if you wanted to ignore all .txt files then it would be --exclude "*.txt"
They would then not be shown or registered on the remote, and sync would not upload them from the local side either. In other words, the filter applies on both sides.
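To make that concrete, a full command might look like this. The path and remote name here are made up, and the leading echo just prints the command so nothing runs by accident - drop the echo to run it for real:

```shell
# Made-up source path and remote name -- substitute your own.
# The leading echo only prints the command; remove it to actually sync.
echo rclone sync /home/me MyRemote:backup --exclude "*.txt"
```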

To know the exact command you need, I would need to know how these temporary files are named. Are we talking about Office-style temp files, something along the lines of ~$ at the start of the name? I don't want to make any wrong assumptions here, so please do specify :slight_smile:

There are lots of other ways to filter, so it's well worth skimming through the docs so you know what is possible for when you need something else later :slight_smile:

yeah, I looked at that but it's all a bit confusing honestly. I looked for anything about filtering folders, but all I found was info about filtering files. I even did a find, and the word "folder" itself doesn't exist. Let's just simplify things for your benefit: I want the sync to ignore folders named cache.

You can filter both on files and folders - as you wish.

I do agree that these patterns can be confusing (although this is technically glob syntax, which is a little simpler than regex). I will help make what you need as long as you can describe it (and make sure to test it - because I am not perfect at this either hehe).

Ok, sure. But we need to be more specific than this in a filter definition :slight_smile:
Do you want to ignore...

  • All files inside any folder named exactly "cache"?
    --exclude "cache/**"
  • All files inside any folder whose name starts with cache...?
    --exclude "cache*/**"
  • All files inside any folder whose name ends with ...cache?
    --exclude "*cache/**"
  • All files inside any folder with ...cache... anywhere in the name?
    --exclude "*cache*/**"
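
If it helps, you can get a feel for how these wildcards match folder names right in your shell. For simple patterns like these (single names, no slashes), shell globs behave like rclone's globs, so this is a rough local illustration:

```shell
#!/bin/sh
# Quick local illustration of how the wildcards match folder names.
# (Shell globs behave like rclone's globs for simple patterns like these.)
matches() {
  case "$2" in
    $1) echo "$2: match" ;;
    *)  echo "$2: no match" ;;
  esac
}

matches 'cache'   'cache'      # exact name        -> match
matches 'cache*'  'cache-old'  # starts with cache -> match
matches '*cache'  'my-cache'   # ends with cache   -> match
matches '*cache*' 'a_cache_b'  # contains cache    -> match
matches '*cache*' 'photos'     # no cache anywhere -> no match
```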

You can filter in even more elaborate ways, but this is a good start yea?
PLEASE NOTE: These are not tested, and I'm no expert at these definitions. You are responsible for testing that they work correctly. An easy way to test is to just do rclone ls /somepath with the filter, or rclone sync /somepath MyRemote:/somepath --dry-run -v. The latter command just simulates the sync (thanks to --dry-run), and -v will make it tell you everything it would copy, so you can check that it is filtering as intended.

Let me know if it looks like it's not working properly and I'll try to refine it if that is the case.

EDIT: Did a quick cursory test, and it seems to work as I intended. You will have to verify on your end though :slight_smile:

the only reason I'm trying to do it this way is because when the sync starts, it starts checking, showing status x/y, and y constantly goes up endlessly. I'm guessing it's from temp files being constantly added or removed.

Aaaah, well this is actually a totally different thing :slight_smile:

This is intentional behaviour from rclone - nothing wrong is happening. It just has a limit to how many files it keeps in its transfer list at a time, so if there are a LOT of files this list fills up, and rclone adds more once it has finished some. That makes it look like the transfer just keeps growing and growing, but it really isn't - rclone just isn't trying to look at the whole job at once.
I think the default is 5,000 or 10,000 - something like that.

If you want to override this and make the whole job fit in the transfer list at once (so rclone can immediately figure out how big the job is from the start) then just use --max-backlog 9999999
That should solve exactly the "problem" you are talking about.
Do note this costs a little more memory, but it's probably just a few dozen MB (depending on how many files). For reference, my drive of about 85,000 files takes about 150-200 MB to fully list in memory.
The limit is there just so a small system won't crash from running out of memory if you try to sync millions of files... :slight_smile:
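
So with made-up paths, the sync line would look something like this. The leading echo just prints the command as a preview; drop the echo to run it for real:

```shell
# Made-up path and remote -- substitute your own.
# --max-backlog 9999999 lets the whole job queue up at once, so the
# x/y counter shows the true total from the start (costs some memory).
echo rclone sync /home/me MyRemote:backup --max-backlog 9999999
```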

Lastly - what service provider do you use (for example Google Drive, Wasabi or OneDrive)?
Depending on which, I probably have a few more optimization tips for you.

I use G Suite with a client ID. It's about a 250 GB backup.

Then I would highly recommend using --fast-list in your sync command.
This may reduce the time it takes to do a full listing by up to 15x.
(It may appear to have "frozen" for a few seconds at the start when using this - that is normal - it is just collecting all the listings at once, much more efficiently.)
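
Putting the whole thread together, a sketch of the final command - the folder, the remote name, and the cache filter are assumptions carried over from earlier in the thread, the echo previews the command, and --dry-run keeps the first real run safe:

```shell
# Everything combined (made-up path/remote; remove the echo to run for real,
# and remove --dry-run once the preview output looks right):
echo rclone sync /home/me gdrive:backup \
  --exclude "cache/**" \
  --max-backlog 9999999 \
  --fast-list \
  --dry-run -v
```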

Thanks stigma, I will try those things when I get home


This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.