Keeping track of successfully uploaded directories

What is the problem you are having with rclone?

I am using Windows task scheduler and a batch file to copy local folders to a remote S3 server every hour. These folders get created irregularly and need to be removed quite often to make space on the local machine.

One problem I am running into is that it can get very confusing to keep track of which folders have been transferred successfully and can be removed, as the logfile is many thousands of lines long.

My idea was to add a line to the end of the batch file that gets added to the log like:
"If there is no error message above, the following folders can be deleted: ..."
and then a dir command to list the folders in the target directory that were just copied.
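A minimal sketch of that idea, written as a shell script rather than a batch file (the paths, remote name, and log location are hypothetical). Instead of asking the user to scan the log for error messages, it uses rclone's exit code, which is non-zero if any transfer failed:

```shell
#!/usr/bin/env bash
# Hypothetical paths and remote name -- adjust to your setup.
SRC="/data/outbox"            # local folders being uploaded
DST="s3remote:bucket/outbox"  # S3-compatible remote (assumed name)
LOG="/var/log/rclone-copy.log"

copy_and_mark() {
  # rclone exits non-zero if any transfer failed, so the exit code
  # replaces the manual "no error message above" check.
  if rclone copy "$SRC" "$DST" --log-file="$LOG"; then
    {
      echo "No errors: the following folders can be deleted:"
      ls -1 "$SRC"
    } >>"$LOG"
  else
    echo "rclone reported errors; do NOT delete anything yet." >>"$LOG"
  fi
}
```

The same structure translates to a batch file with `if errorlevel 1` on Windows.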

The question:
Does a running rclone copy process also copy files that get created in the source folders after it starts?
If not, this would mess up my plan, as the user would feel safe to delete folders that have not been copied yet.
Is there a better way to keep track of successfully copied folders?

Run the command 'rclone version' and share the full output of the command.

rclone v1.62.2

  • os/version: Microsoft Windows 11 Pro 22H2 (64 bit)
  • os/kernel: 10.0.22621.3007 Build 22621.3007 (x86_64)
  • os/type: windows
  • os/arch: amd64
  • go/version: go1.20.2
  • go/linking: static
  • go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

Amazon S3-like

Wouldn't it be easier to use rclone move?

Or run an hourly rclone copy and then rclone move on the oldest folder if free space is below a certain threshold, repeating rclone move until you have whatever free space you need. It would only require a simple shell script.

This way you have a guarantee that nothing is missed and a maintenance-free solution.
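The hourly copy plus threshold-driven move could be sketched like this (the paths, remote name, and threshold are assumptions, and `free_gb` relies on GNU df):

```shell
#!/usr/bin/env bash
# Hypothetical paths, remote name, and threshold -- adjust to your setup.
SRC="/data/outbox"
DST="s3remote:bucket/outbox"
MIN_FREE_GB=100

free_gb() {
  # Free space (GiB) on the filesystem holding $SRC (GNU df).
  df -BG --output=avail "$SRC" | tail -1 | tr -dc '0-9'
}

hourly_job() {
  # 1. Hourly copy so the remote is always up to date.
  rclone copy "$SRC" "$DST"
  # 2. While disk space is low, move (upload + delete) the oldest folder.
  while [ "$(free_gb)" -lt "$MIN_FREE_GB" ]; do
    oldest=$(ls -1t "$SRC" | tail -1)   # least recently modified entry
    [ -n "$oldest" ] || break           # nothing left to move
    rclone move "$SRC/$oldest" "$DST/$oldest"
  done
}
```

The move only deletes local files that rclone has verified on the remote, so the "is it safe to delete?" question never reaches the user.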

Yet another approach would be to maintain an rclone mount with full VFS caching mode. This is where you would keep all your files. rclone mount would sync everything to S3 and keep the cache below the max cache size limit. As an added bonus it would give the end user access to all files - even historical ones.

Thank you for your in-depth answer.
To elaborate a little: I am running rclone on machines that process data.
The folders I am copying are continuously being written to and read by data-generation pipelines until processing is done.

If I rclone move while folders are still being processed by the pipeline, the pipeline will error, which I want to avoid.

I guess I could check by timestamp whether a folder has not been written to and then (re)move it. I just really want to avoid data loss, which is why I have been sticking to copy + manual checking for now.

What will happen in the move case? Rclone compares the folder to the remote (size and checksum), sees that they are identical, and then removes the local folder?

About the VFS caching mode: the remote is hundreds of TB in size, and I am not sure how that would work with the cache. I also want to avoid the local user removing something by accident.

Then maybe use age based filtering and only move files not changed in the specified period of time. So e.g. rclone move would only act on files not changed in the last 24h.

rclone move src dst --min-age 24h
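Before scheduling it, the selection can be previewed with --dry-run so nothing is uploaded or deleted until the filter looks right (paths and remote name below are placeholders):

```shell
rclone move /data/outbox s3remote:bucket/outbox --min-age 24h --delete-empty-src-dirs --dry-run
```

--delete-empty-src-dirs additionally cleans up source folders once everything in them has been moved; drop --dry-run to run it for real.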

That's a very good idea thank you.

I will test how long the processes take at maximum and set this with some buffer. Thank you!


This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.