Does (or can) rclone run on both client AND server?

I have two Linux machines on a LAN. One (the "client") has an array I want to clone to an array on the other machine (the "server" and actually my local mirror). Do I run rclone as a daemon on the server?

It seems like that may not be a thing, as otherwise there would be options to connect to a listener by address:port or something.

With an rclone server daemon for the client to talk to directly, both sides could do the hashing/comparison at the same time, rather than the client having to do a full read of (in this case) 7 TB of server data to check hashes. (If that's how it works.) The closest back-end options I see are SFTP and local. (And "local" could presumably be a Samba mount. But either way would still seem to imply pulling all 7 TB of server data over to the client for hashing... again, if that's the way rclone copy works...)

As an example of the client/server model in this context: rsync. The client talks directly to a server daemon over ssh to compare hashes, when it's working in that mode.


What is your rclone version (output from rclone version)


Which OS you are using and how many bits (eg Windows 7, 64 bit)

Ubuntu 18.04 x64

Which cloud storage system are you using? (eg Google Drive)

Local Linux server, via ssh/sftp

You can use rclone serve, and there are a few backends you can serve out, listed here:

rclone can work exactly like that using the sftp backend.

You can also run an rclone server (rclone serve sftp/http/webdav/ftp), which you can configure to talk to the appropriate backend.

I'd start by using native ssh and the sftp backend and see how you get on with that.
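For reference, a minimal sketch of what that setup might look like. This is a config fragment, not a drop-in recipe: the remote name (`server`), hostname, user, key path, and array paths are all invented for illustration.

```ini
# ~/.config/rclone/rclone.conf -- hypothetical sftp remote
[server]
type = sftp
host = server.lan
user = backup
key_file = ~/.ssh/id_rsa
```

Then a sync would look something like `rclone sync /mnt/array server:/mnt/array --checksum` (paths again hypothetical).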

Thanks! That's interesting about rclone as a server.

Follow-up question: When rclone talks to an sftp (or http|webdav|ftp) backend, is that backend doing the hashing of file contents (in order to catch the case of renamed folders without having to copy the entire folder contents over again), or does the client still have to pull the entire target content over (either all up front, or as it goes per-file) in order to calculate hashes? I'm not aware of any way that, say, the gnu sftp server has any knowledge of file hashes, or any way to scan for and generate them server-side.

Or is rclone even designed to do that? ("Don't transfer the full contents of a folder that has only changed names on the source side; instead, intelligently compare checksums generated in parallel on both client and server.") It's unclear what the --track-renames flag does. The docs say it handles "renames", but renames of what? Only files? Or also folders high in the tree, with potentially many TB of data underneath? Does it also handle file and folder moves? (Which isn't necessarily the same logic as handling renames.)

Of course, I could set up multiple test scenarios to answer it, but that could easily be way more time-consuming than reading the entire rclone website! :slight_smile:

And if not (server doesn't perform checksumming in parallel with client), then the same question for the case where rclone is the backend... Is the server presumably smarter in that regard, and able to calculate checksums independently of the client?

Rsync does exactly that, for example, with the following options:

rsync -a --checksum --delete-after /CLIENT/ hostname:/SERVER/

That creates an exact, bit-for-bit, checksum-verified mirror of CLIENT on SERVER. It's very efficient because the client and server both calculate checksums independently, so only new & changed bits need to be transferred. But that completely falls apart if, for example, a CLIENT folder that's high in the tree gets renamed. That could easily (and in my case frequently would) result in multiple TBs having to be copied over as "new", when there may actually be zero bits of changed file content, other than a single renamed folder. (Rsync has some options that make it a tiny bit smarter about recognizing renames, but only in very narrow conditions that don't cover basic things like a high-level folder rename/move. There are also some clever scripting solutions that involve creating a full mirror of hardlinks in a hidden folder on both client and server, which is also not appropriate for my use case.)

What I'm trying to accomplish is exactly what the rsync "mirror" command above does, but without copying over potentially many TBs of "new" data just because a folder gets renamed or moved. (That would turn what should be an operation spanning seconds into days or weeks, and would also run out of server space before finishing, for example if many TBs of "new" [but actually redundant] files are copied over to a server that has a target of roughly constant 75% capacity. And if the ZFS dedup property is turned on to mitigate the intermediate space problem, the copy may now take literally months, at single-digit MB/s write speed.)
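If I'm reading the docs right, the rclone equivalent of that rsync invocation would be something along these lines. This is a command fragment with a placeholder remote name and placeholder paths, assuming an sftp remote called `server` is already configured:

```
rclone sync /CLIENT/ server:/SERVER/ --checksum --track-renames
```

Note that rclone sync deletes extraneous destination files by default (comparable to rsync's --delete), and --track-renames requires a hash type common to both sides, which the sftp backend provides.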


Yes, the hashing is controlled by the backend. In the case of the sftp backend, rclone will call md5sum over ssh to calculate the hash.
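So, conceptually, the server-side hashing amounts to something like the following command fragment (hostname and path invented for illustration), with the server doing the reading and rclone only receiving the resulting hash:

```
ssh server.lan md5sum /mnt/array/path/to/file
```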

I hope I answered your questions about --track-renames in the other thread so I won't repeat myself here.

rclone handles renames of files and moving of files into new directories with --track-renames. Directories are created as necessary in the process.
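To make that concrete, here's a toy sketch of how hash-based rename tracking can work in principle. This is only an illustration of the idea, not rclone's actual implementation; the function and variable names are made up.

```python
def plan_sync(src, dst):
    """Toy rename detection. src and dst map path -> content hash.

    A file whose hash already exists on the destination under a
    different path is planned as a server-side move (no data
    transfer); everything else is copied.
    Returns (moves, copies, deletes).
    """
    dst_by_hash = {h: p for p, h in dst.items()}
    moves, copies = [], []
    for path, h in src.items():
        if dst.get(path) == h:
            continue  # unchanged: same path, same hash
        if h in dst_by_hash and dst_by_hash[h] not in src:
            moves.append((dst_by_hash[h], path))  # rename/move detected
        else:
            copies.append(path)  # genuinely new or changed content
    moved_from = {old for old, _ in moves}
    deletes = [p for p in dst if p not in src and p not in moved_from]
    return moves, copies, deletes
```

With this model, renaming a high-level folder, e.g. `plan_sync({"new/big.bin": "h1"}, {"old/big.bin": "h1"})`, plans one move and zero copies, which is exactly the behavior wanted here: no data re-transfer for a pure rename.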


Bingo - the magic sauce I was missing! Thank you.

I'm guessing you take advantage of SSH multiplexing/controlmaster to accomplish that? (Not that that really matters. I'm just happy to know it's being done on the server, somehow some way!)
