Sync against one remote but copy to another

I was brainstorming a way to expedite my upload to Google Drive and had the following idea. I have access to an internet connection with an upload speed over 25x faster than my home connection's. I would like to use rclone to determine what local data needs to be uploaded to Google Drive and then copy that data to a USB hard drive instead. Then I could take the USB disk to the site with the faster internet and use rclone copy to send it to Google Drive.

Ultimately this boils down to rclone running its initial scan against one remote (Google Drive) but performing the actual copy to another remote (USB storage). If this is already possible, I can’t figure out how. I am marking it as a feature request so ncw will hopefully chime in.

Thanks.

You could do this with a bit of scripting I think…

  1. First do a copy with --dry-run from local -> drive
  2. Parse the log file to make a list of files which need to be copied
  3. Use --files-from and this list of files to rclone copy from local -> usb
  4. When on faster internet do rclone copy from usb -> drive (no need for the --files-from at this point)

I think the only hard part is the “parse the log file” part. You could do that with grep and sed, or you could edit the list by hand in an editor, or 100 different ways depending on what you find easiest.
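
For instance, something along these lines might work (an untested sketch; the local path and remote name are placeholders, and check the exact NOTICE wording against your own dry-run output):

rclone copy --dry-run /path/to/local gdrive:backup 2>&1 \
  | grep 'Not copying as --dry-run' \
  | sed 's/.*NOTICE: \(.*\): Not copying as --dry-run.*/\1/' > files-from.txt

That leaves files-from.txt with one relative path per line, ready for step 3.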

Just remember that Google Drive limits your uploads to about 750 GB/day.

I’ve been chewing over an approach like that. It’s possible, but I’m just not confident enough in my own sed/grep voodoo. Could rclone have a parameter that would generate the file list for use with --files-from? That would be a lot cleaner and could be useful in a few use cases.

Thanks!

I didn’t know about the limit, boltn. Thanks for the heads up.

Here are the magic runes to process the log file:

rclone copy --dry-run /tmp/new/ /tmp/new2 2>&1 | perl -lne 'print $1 if /NOTICE: (.*?): Not copying as --dry-run/'
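
Redirect that output to a file (say files-from.txt) and the remaining steps are roughly as follows (the paths and remote name here are placeholders for your setup):

rclone copy --files-from files-from.txt /path/to/local /mnt/usb/staging
# later, on the faster connection
rclone copy /mnt/usb/staging gdrive:backup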

Easier than writing a new mode for rclone :wink:


Thanks, that’s much easier than I could have done it. I should be able to adapt it to Windows pretty easily. Don’t judge :smiley:

ncw,

I have different files with the same names but in different subdirectories. Doesn’t that mean that if any file sharing a name with another gets added to files-from.txt, then all of the files with that name would be copied in the subsequent jobs? It seems like files-from.txt would need to include path information to prevent this, and I don’t see a way to get paths from the dry-run output. Any ideas?

--files-from takes a list of paths rather than patterns, so it should do what you want.

I know --files-from takes paths. The part that concerns me is that the regex voodoo against the --dry-run output doesn’t capture any paths, so the resulting text file is just a list of filenames. Is it possible to modify the --dry-run output to include paths?

--files-from takes paths relative to the root. The log shows file names relative to the root, so I think it should all work out!
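
For example (a made-up file, just to illustrate the shape of the log line), a file in a subdirectory shows up in the dry-run output with its relative path:

2017/09/20 10:00:00 NOTICE: photos/2016/beach.jpg: Not copying as --dry-run

so the one-liner above would print photos/2016/beach.jpg, which is exactly the form --files-from expects.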

Hmm, for whatever reason I did not see any paths when reviewing the --dry-run log output. You are correct, of course. Sorry for the confusion. I was just trying to anticipate potential problems before they arose. This is working perfectly so far. Thanks for your help!

No worries!

Great and you are welcome!