How to copy "new" synced files/directories to a separate directory on a remote drive and organize them with a bash script

Good Day!

I'm an rclone newbie and need professional advice on my task :slight_smile:

What is the problem you are having with rclone?

We sync SFTP and Google Drive. Rclone copy works well and there are no issues at all. Thank you guys for a nice tool!

I will try to describe my task in detail; it is similar to this request I found on your forum (Group files based on regex/custom code).

  1. We "copy" SFTP directory (source ) to a Google Drive Folder (remote).
  2. After , we need the new files to organize according to the filters (like, if filename is request and has more than 3 lines or XXX xharacters ... or if directory name contains DD/MM/YYYY) and place those files into the separate structured directory on the remote or Google Drive.

So, my question is: can I save the new files synced to Google Drive into separate folders for scripting while the rclone copy command is running? I didn't find an option in the manual to save the new files into multiple/separate destination folders during the sync. To be more clear: when I sync source:/folder1 to destination:/folder1, I also need to copy only the new files into destination:/folder3.
Or maybe rclone has a feature to put the files into a separate folder on the remote according to filters? Please advise me!

Run the command 'rclone version' and share the full output of the command.

rclone v1.60.0

  • os/version: ubuntu 22.04 (64 bit)
  • os/kernel: 5.15.0-1022-aws (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.19.2
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copy SFTPServer:/folder1 GDrive1:folder1 -P -v

Thanks in advance!

Hi Sergejs,

Here is what I would do if I wanted to sync my local Pictures folder into yearly ordered folders in my Google Drive:

rclone sync --include="2022*" /path/to/Pictures GDrive:Pictures/2022
rclone sync --include="2021*" /path/to/Pictures GDrive:Pictures/2021
rclone sync --include="2020*" /path/to/Pictures GDrive:Pictures/2020
...

My pictures are named something like this: YYYYMMDD_TTTTTTT.heic

I intentionally kept it very simple, but the approach can be extended with loops, regex filters etc. - as long as the number of different groupings isn't too high (the source directory needs to be scanned for each group).
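For example, a loop version could look like this (just a sketch, assuming the same YYYYMMDD naming scheme and paths as above):

for year in 2020 2021 2022; do
    rclone sync --include="${year}*" /path/to/Pictures "GDrive:Pictures/${year}"
done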

Perhaps you can do something similar?
How many different groups do you have?

Hi Ole,

Many thanks for your prompt reply and the provided solution. But I need to "chase two rabbits at once" :slight_smile:

  1. I use rclone copy to sync SFTP/folder1 to GDrive/folder1 - that's how I sync all files from SFTP to GDrive.
  2. I need only the newly uploaded files (from step 1) to be synced to GDrive/folder2, and in a structured way (depending on the line count in the file and on the filename).

I can run a bash script to organize the files/directories and later sync them to GDrive, but I can't understand how to accomplish the 2nd task: if I sync SFTP/folder1 to GDrive/folder1, how do I know which files/directories were synced, so that I can copy them to /local/path, launch the bash script that organizes the files in a structured way, and sync them later to GDrive/folder2?

I would like to sync SFTP/folder1 to GDrive/folder1 and at the same time save "the synced files/directories" in a separate folder GDrive/folder2 (like a parallel copy of the files). So for syncing I use 2 directories, but I also want to save the newly created/synced files to a separate directory. Is this possible with the rclone tool?

Hi Sergejs,

I think my solution solves your overall need (as I understand it) but using a different approach to get there.

Can you describe your overall situation and need without devising a specific way to solve it?

What do you have on the SFTP server: is it photos, documents, logs, ...? And how do they end up there?

How would you like it to be organized in the Google Drive (after all the processing is done)?

Can you give a small concrete example (with two files or so)?

Sure, Ole!

So the SFTP server contains folders named like DD/MM/YYYY and other folders with names like Request, In_file, Out_File, etc. Most files are in CSV format and some in PDF.

What I need :slight_smile:

  1. Sync all files from SFTP/FolderName to GDrive:/FolderName. This is used for archiving (because we use the rclone copy command, i.e. without file/folder deletion on the destination).
  2. We need to know which files were synced during the last sync and copy (using rclone copy) only those files to GDrive/Folder2 according to these rules:
    2.1. If the file name contains "request" and the line count in the file (wc -l < filename.csv) is more than 5 lines, move this file to GDrive/Folder2/Request/
    2.2. If the file name contains "adinfo" (the adinfo file format is Pattern1_adinfo_DDMMYYYY.csv), move this file to GDrive:/Folder2/Pattern1/adinfo/DDMMYYYY/Pattern1_adinfo_DDMMYYYY.csv. Here I need to extract Pattern1 and DDMMYYYY from the filename and create the directories on the destination.
    2.3. If a new folder is synced (during the first command) and this folder is located at GDrive:/Folder1/PreambulaDDMMYYYY, copy this folder to GDrive:/Folder2/Out_Folder/PreambulaDDMMYYYY

So, I understand how the 1st part of my task (the archiving) works, but I can't accomplish the 2nd part. I understand that I somehow need to know which files were synced when I launch the 1st command, and afterwards copy them into the separate folder and organize them.

I hope I was clear now :slight_smile: Apologies for confusing you

Thanks Sergejs!

Good explanation, pretty sure I got it right now :slight_smile:

I will make my answer a little brief and assume some script experience, so please ask if things are unclear.

I suggest you add --log-level=INFO --log-file=archive_sync.log to the sync to the archive; the log file will then, among other things, contain an INFO line for each file copied.

You will then be able to make a script that first extracts the new filenames (and their locations in the archive) using sed and a regex, something like this:
https://forum.rclone.org/t/how-to-extract-file-names-after-move-command-is-successfully-done/33725/3
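A minimal sketch of that extraction, assuming the default log line format where each copied file produces a line ending in ": Copied (new)":

# collect the path of every copied file from the sync log
sed -n 's/^.* INFO  : \(.*\): Copied .*$/\1/p' archive_sync.log > new_files.txt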

and then iterate over the list of new files to determine where to (optionally) place an extra (server side) copy using the following rules:

Here you also need (some of) the actual content of the file. You can either do an rclone copy from GDrive to a local temp folder and then do the wc -l, or it is perhaps possible and faster to do it directly on the output from rclone cat (perhaps using --head or --count).
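For example, counting the lines directly from the stream could look like this (a sketch; the path is just a placeholder):

# count the lines of a remote CSV without saving it to disk
lines=$(rclone cat "GDrive:/FolderName/some/folder/filename.csv" | wc -l)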

When you have decided what to do, you can do a server-side copy, something like this:

rclone copy GDrive:/FolderName/archive/some/folder/filename.csv GDrive:Folder2/Request

When you have extracted the pattern, then just do a server-side copy, something like this (note copyto rather than copy, since the destination here is a full file path rather than a directory):

rclone copyto GDrive:/FolderName/archive/some/folder/filename.csv GDrive:/Folder2/Pattern1/adinfo/DDMMYYYY/Pattern1_adinfo_DDMMYYYY.csv

Any non-existing higher-level directories will be created automatically.
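Extracting the two parts in bash could look like this (a sketch; the filename is just an example):

# split Pattern1_adinfo_DDMMYYYY.csv into its parts
f="Pattern1_adinfo_05112022.csv"
pattern="${f%%_*}"                        # everything before the first underscore -> Pattern1
stamp="${f##*_}"; stamp="${stamp%.csv}"   # everything after the last underscore, minus .csv -> 05112022
rclone copyto "GDrive:/FolderName/archive/some/folder/$f" "GDrive:/Folder2/$pattern/adinfo/$stamp/$f"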

Not sure I fully understand this, but perhaps it is just something like

rclone copy GDrive:/FolderName/PreambulaDDMMYYYY GDrive:/Folder2/Out_Folder/PreambulaDDMMYYYY

Hope I got it somewhat right and that (most of) the above makes sense to you :sweat_smile:

Great explanation, Ole!

Looks like the right way to me! Thank you for advising me.

That means I should run the bash script when syncing the archive. In my case the script should contain these command lines:

  1. rclone copy SFTP:/Folder1 GDrive:/Folder1 --log-level=INFO --log-file=archive_sync.log
  2. Extract the filenames/directories from the archive_sync.log file
  3, 4, 5, ... are "rclone copy ..." commands with the defined rules to a specific destination folder (adinfo, request, folders), depending on the files/folders found in archive_sync.log. A rough skeleton is sketched below.
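Putting it together, a rough skeleton of the script could look like this (a sketch only; the remote names, the log parsing and the rules are assumptions I still have to verify against my setup):

#!/usr/bin/env bash
LOG=archive_sync.log
: > "$LOG"   # start with an empty log; rclone appends to an existing log file

# 1. sync the archive and log every copied file
rclone copy SFTP:/Folder1 GDrive:/Folder1 --log-level=INFO --log-file="$LOG"

# 2. extract the newly copied files and apply the rules
sed -n 's/^.* INFO  : \(.*\): Copied .*$/\1/p' "$LOG" | while IFS= read -r f; do
  name=$(basename "$f")
  case "$name" in
    *request*)
      # rule 2.1: only request files with more than 5 lines
      lines=$(rclone cat "GDrive:/Folder1/$f" | wc -l)
      [ "$lines" -gt 5 ] && rclone copy "GDrive:/Folder1/$f" GDrive:/Folder2/Request
      ;;
    *_adinfo_*.csv)
      # rule 2.2: Pattern1_adinfo_DDMMYYYY.csv -> Folder2/Pattern1/adinfo/DDMMYYYY/
      pattern="${name%%_*}"
      stamp="${name##*_}"; stamp="${stamp%.csv}"
      rclone copy "GDrive:/Folder1/$f" "GDrive:/Folder2/$pattern/adinfo/$stamp/"
      ;;
  esac
  # rule 2.3 (Preambula folders) could be handled similarly by matching the directory part of "$f"
done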

You are my savior, Ole! I will keep you posted about the results :slight_smile:


Hi Ole,

Just for your info: after you advised me how to get the file list from the log file, I completed my task with the filtering using a bash script. All works great!!!

The only thing is that the rclone cat XXXX --tail command works slowly (approx. 6 sec for each file). It doesn't matter to me, because I don't have a huge file count during the last sync.

To the whole Rclone team: have an amazing weekend!
