Looking for some recommendations on a unique file transfer scenario. The source is an S3 bucket with date folders (e.g., "2020-04-28") and new files are written throughout the day, every day. The destination is an SFTP endpoint where new files are immediately moved to a different location when the transfer is complete. This means that the destination filesystem cannot be used to maintain the state of file transfers (i.e., blind copies). The goal is to periodically transfer new files to the destination, and provide a way to redo files that failed.
I can probably write a script to periodically "detect" new files, but does rclone provide a clever way to do this? Once a list of new files is generated, would using --files-from be an efficient way to transfer the files using rclone? How can I capture failures, so that I might manually redo them later?
When using SFTP as a destination, is it possible to transfer files without the directory? For example, if the source file is "2020-04-28/unique.jpg", then transfer to destination as "unique.jpg". All of my files have unique names, so there is no chance of conflict.
hello and welcome to the forum,
it would be helpful if you posted using the question template, as it asks you some basic questions.
like what is your operating system?
if you want to copy all the files from a folder and subfolders you can do this
i am no linux expert and there might be a better way
source=/mnt/c/path/to/local/folder
dest=remote:
rclone lsf --files-only -R "$source" > source.txt
while IFS= read -r f
do
    rclone copy "$source/$f" "$dest" -vv
done < source.txt
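On the --files-from question from the original post: rather than starting one rclone per file, the listing can be fed back into a single copy, since paths in the list file are interpreted relative to the source. A sketch using the same hypothetical paths as above (note this preserves the source directory structure rather than flattening it):

```
source=/mnt/c/path/to/local/folder
dest=remote:
rclone lsf --files-only -R "$source" > source.txt
rclone copy "$source" "$dest" --files-from source.txt -vv
```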
Not currently. If you search the forum and issues you'll see discussion of the --flatten flag which is what you'd need.
You'd probably need to fix that up in the copy phase - stealing @calisro's shell script
while IFS= read -r f
do
    rclone copyto "s3:bucket/$f" "sftp:$(basename "$f")" -vv
done < new-files
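Tying the pieces together, here is a sketch of the whole new-files/old-files cycle: diff two lsf snapshots to find what's new, flatten with basename, and log any failed paths so they can be retried later. The bucket name, remote names, and listing contents below are stand-ins, not your real setup.

```shell
# Stand-ins for two `rclone lsf --files-only -R s3:bucket` snapshots;
# a real run would write these listings with rclone instead of printf.
printf '2020-04-27/a.jpg\n2020-04-28/b.jpg\n' > old-files
printf '2020-04-27/a.jpg\n2020-04-28/b.jpg\n2020-04-28/c.jpg\n' > new-files

# Paths present now but not in the previous snapshot
sort old-files > old.sorted
sort new-files > new.sorted
comm -13 old.sorted new.sorted > added-files

: > failed-files
while IFS= read -r f
do
    # copy without the date folder; record the path on failure for a retry
    rclone copyto "s3:bucket/$f" "sftp:$(basename "$f")" -vv \
        || echo "$f" >> failed-files
done < added-files

# next run diffs against this snapshot
mv new-files old-files
```

A retry pass is then just the same loop reading failed-files instead of added-files.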
Note that it would be more efficient not to stop and start rclone lots of times, as it will do the SFTP negotiation each time. Using rclone rcd and rclone rc operations/copyfile would be more efficient, but that can be for phase 2!
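As a sketch of that phase-2 approach (flag and parameter names from the rc docs; the bucket and file paths are illustrative): start the daemon once, then submit copies against it, so the one SFTP connection the daemon holds is reused across calls.

```
rclone rcd --rc-no-auth &
rclone rc operations/copyfile \
    srcFs=s3:bucket srcRemote=2020-04-28/unique.jpg \
    dstFs=sftp: dstRemote=unique.jpg
```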
@ncw The strategy of capturing state using new-files/old-files will work for me, and I like the rclone rcd idea!
With respect to rclone rcd mode, does the daemon work on all async jobs in parallel, or can I control it so that only one job is worked at a time, but additional jobs are queued? Also, how do you control per job parallel transfers (i.e., --transfers) - would I use a options/set command? In my case, due to limited bandwidth, I would only want a fixed number of transfers to be active at once.
You can submit jobs synchronously or asynchronously - see the _async flag in the docs.
If you submit them asynchronously then the rc will not obey --transfers and will transfer as many jobs as you submit at once! Maybe it should obey --transfers - I'm not sure.
You'll have to control this yourself for the moment... You could always set --bwlimit to cap the total bandwidth used?
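For reference, a hedged sketch of that pattern (the jobid value is illustrative): submit with _async=true, then poll job/status for the result; --bwlimit is given when starting the daemon so it applies to everything it transfers.

```
rclone rcd --rc-no-auth --bwlimit 1M &
rclone rc operations/copyfile _async=true \
    srcFs=s3:bucket srcRemote=2020-04-28/unique.jpg \
    dstFs=sftp: dstRemote=unique.jpg
# returns a job id, e.g. {"jobid": 1}
rclone rc job/status jobid=1
```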