Read Source and Destination Paths from a File for copyto and moveto commands

Hi, I am the research lead of the Wayback Machine at the Internet Archive. We at the Internet Archive love rclone for its built-in support of the Internet Archive as a storage system, and we use it in our tooling internally.

Recently I was trying to copy a large list of files from the Internet Archive's Petabox to AWS S3, where the source and destination directory structures and file names were slightly different. For that, I created a TAB-separated, two-column file mapping source paths to destination paths, like this:

$ cat src_dst_path_map.txt
ia:/item1/file1    aws:/bucket1/part1/item1_file1
ia:/item1/file2    aws:/bucket1/part1/item1_file2
ia:/item2/file3    aws:/bucket1/part1/item2_file3
ia:/item2/file4    aws:/bucket1/part2/item2_file4

Then I wrote a small script to consume this mapping file and run rclone copyto for each line:

$ cat rclone_map_copier.sh
#!/usr/bin/env bash

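# Read TAB-separated source/destination pairs and copy one file per line.
while IFS=$'\t' read -r src dst
do
  # Quote both paths so spaces and glob characters survive intact.
  rclone copyto --progress "$src" "$dst"
done < <(cat "$@")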

Finally, I ran it as follows:

$ ./rclone_map_copier.sh src_dst_path_map.txt

This approach works, but it spawns a new rclone process for each line in the mapping file, so every file copied pays the startup and teardown cost of a separate process.

I did see that rclone supports --files-from and --files-from-raw options, but they only support reading a list of source files, not their corresponding destinations.

It would be nice to either allow an optional second column in the input file of the --files-from/--files-from-raw options or introduce a new set of flags like --src-dst-map/--src-dst-map-raw that accept such a two-column input file.

PS: If a feature or workaround already exists to achieve the described objective more efficiently, I will be glad to learn about that.

welcome to the forum,

one option is to use the remote control api - https://rclone.org/rc/

operations/copyfile: Copy a file from source remote to destination remote

This takes the following parameters:

    srcFs - a remote name string e.g. "drive:" for the source, "/" for local filesystem
    srcRemote - a path within that remote e.g. "file.txt" for the source
    dstFs - a remote name string e.g. "drive2:" for the destination, "/" for local filesystem
    dstRemote - a path within that remote e.g. "file2.txt" for the destination
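
As a rough sketch of how the mapping file could drive operations/copyfile: the functions below split each "remote:path" entry into the Fs and Remote parameters listed above and pass them to `rclone rc`. This assumes an rc server is already running (for example one started with `rclone rcd`) and reachable with default settings, and that a leading "/" after the colon is not significant for these remotes. The names `copy_pair` and `copy_from_map` and the `RCLONE` override are hypothetical helpers for illustration, not part of rclone.

```shell
#!/usr/bin/env bash
# Sketch only: assumes an rc server is already running (e.g. `rclone rcd`)
# and that `rclone rc` can reach it with its default settings.

# Copy one file via operations/copyfile. "$RCLONE" is a hypothetical override
# (not an rclone feature) so the call can be dry-run with RCLONE=echo.
copy_pair() {
  local src=$1 dst=$2
  # Split "remote:path" at the first colon; the remote name keeps its colon.
  local src_fs="${src%%:*}:" src_path="${src#*:}"
  local dst_fs="${dst%%:*}:" dst_path="${dst#*:}"
  # Assumption: a leading "/" after the colon can be dropped for these remotes.
  ${RCLONE:-rclone} rc operations/copyfile \
    "srcFs=$src_fs" "srcRemote=${src_path#/}" \
    "dstFs=$dst_fs" "dstRemote=${dst_path#/}"
}

# Read a TAB-separated src/dst map and issue one rc call per line.
copy_from_map() {
  local src dst
  while IFS=$'\t' read -r src dst; do
    copy_pair "$src" "$dst"
  done < "$1"
}
```

Running `RCLONE=echo copy_from_map src_dst_path_map.txt` prints the rc calls without executing them. Note that each `rclone rc` call is still a separate client process, though the transfers themselves run inside the single long-lived server; posting the same parameters to the server's HTTP endpoint would avoid even that.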

there is a new command rclone convmv
"Convert file and directory names in place."
"The --name-transform flag is also available in sync, copy, and move."
