GCS: auto-create subfolders based on file creation time?

What is the problem you are having with rclone?

I am regularly running 'rclone copy' to copy files from local folders to a Google Cloud Storage bucket.
Each local folder contains 1 week worth of specific log files (rolling delete after 1 week).

In the GCS bucket, I have a hierarchical folder structure similar to:
bucket/log_type/owner/date/

I supply the /log_type/owner prefixes manually as shown in the rclone command below. The date sub-folder should be computed automatically from the file creation date, and added as a prefix to each file encountered in the source directory. Is there such an option in rclone?

What is your rclone version (output from rclone version)

rclone v1.51.0

  • os/arch: linux/amd64
  • go version: go1.13.7

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Ubuntu 18, 64 bit

Which cloud storage system are you using? (eg Google Drive)

GCP Storage buckets

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone: Version "v1.51.0" starting with parameters ["rclone" "copy" "/my_source_dir" "my_remote:my_bucket/log_type/owner/" "-vv"]

hi, not sure if i understand what you want.

if you copy the entire directory into a new folder on the remote each time, you will end up with duplicate files in each new folder.

perhaps you can use sync and --backup-dir.
rclone sync /my_source_dir my_remote:my_bucket/pictures/current --backup-dir=my_remote:my_bucket/pictures/current_date/

and you can use --dry-run when testing.
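
for example, something like this (just a sketch, assuming a bash shell; $(date +%F) is the date the sync runs, not the creation date of each file):

# changed/deleted files get moved into a backup dir named after today's date
rclone sync /my_source_dir my_remote:my_bucket/pictures/current --backup-dir=my_remote:my_bucket/pictures/$(date +%F) --dry-run -vv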

Thank you for your input. My initial post was unclear, so I have rewritten it :slight_smile:

Not yet!

I'd probably use some variant of $(date) in the script that does the transfer. That doesn't quite get the creation date into the file name, but it is close...
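
Something like this, perhaps (just a sketch using the bucket layout from your post; $(date +%F) expands to e.g. 2020-03-15 at the time the script runs, not to each file's creation date):

# copy the source files into a date-named prefix under log_type/owner
rclone copy /my_source_dir my_remote:my_bucket/log_type/owner/$(date +%F)/ -vv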

sync and --backup-dir combined with a current date+time could be helpful to newbies.

and then there would be the issue of how to format date+time.
perhaps just use the format from the linux date command, or Go's time formatting.
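
a few possibilities from the linux date command, just as illustration:

date +%F          # 2020-03-15
date +%Y%m%d      # 20200315
date +%F_%H%M     # 2020-03-15_2359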

Thanks for your response!

My current hack is to sync every midnight with the --max-age 24h flag, to get the correct binning by date.
However, if a sync job fails, the failed data then needs to be resynced manually.
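
For reference, the hack is roughly this as a crontab entry (a sketch only, using copy as in the original command; it assumes files copied just after midnight belong in the previous day's folder, and % must be escaped as \% inside crontab):

# 00:05 every night: copy files modified in the last 24h into yesterday's date folder
5 0 * * * rclone copy /my_source_dir my_remote:my_bucket/log_type/owner/$(date -d yesterday +\%F)/ --max-age 24h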

As the perfect solution does not yet exist, I could envision having 7 sync jobs run one after the other, each filtering the source folder for one specific date and prefixing the matched files with that date. Since I have at most 1 week of data, that still seems doable.
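
A sketch of what I have in mind, assuming GNU date and that modification time is close enough to creation time (rclone's --min-age/--max-age filter on modification time):

# one copy per day of age: files roughly between i and i+1 days old go into that day's folder
for i in $(seq 0 6); do
  day=$(date -d "-${i} days" +%F)
  rclone copy /my_source_dir my_remote:my_bucket/log_type/owner/${day}/ --min-age ${i}d --max-age $((i+1))d -vv
done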

If I were doing this, I would rename the log files to include dates at the source - that will make syncing them easier if you miss a sync job, etc.

Good point, I think that's what I'll do.

Update for precision: you can't rename files so their names contain forward slashes, so I recreated the lowest level of the desired folder structure (the date folder) locally.
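
In case it helps someone else, the local restructuring is roughly this (a sketch, assuming bash and GNU date, using each file's modification time as a proxy for creation time; /my_staging_dir is a made-up name):

# sort files into per-date subfolders based on mtime, then one copy keeps the structure
for f in /my_source_dir/*; do
  [ -f "$f" ] || continue
  day=$(date -r "$f" +%F)
  mkdir -p "/my_staging_dir/${day}"
  cp "$f" "/my_staging_dir/${day}/"
done
rclone copy /my_staging_dir my_remote:my_bucket/log_type/owner/ -vv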

