Rclone check from within a script

detwiler · January 16, 2019, 7:18pm

I’m trying to figure out how to use rclone check from within a script. The first step of said script should be to determine if there are any changed files on the remote. If there are not, then exit. If so, proceed to copy and process. Superficially, “rclone check” seems to be the command that I am looking for. But it isn’t entirely clear to me how to use it in a conditional. Do I have to parse the logged output for a line containing the substring “0 differences found”? Or is there a cleaner way (a way to tell it to return only a 0 or 1 status, for example)?

Thanks,
Todd

calisro · January 17, 2019, 2:51pm

If it were me, I’d just run the sync itself, which copies the files only if they were changed, but obviously I don’t know your entire use case.

That being said, the easiest way would be to parse the output for

…
Failed to check: XXX differences found

detwiler · January 17, 2019, 4:55pm

Thanks for the suggestion. My use case includes triggering some lengthy processing. I don’t want to do that processing if nothing has changed. So, even if I go ahead and sync, I still want to know if anything is new before I waste cycles processing unchanged files. I’ve already written my script to parse the output to standard out and standard err for " 0 differences found". It works fine, but it feels like there should be a cleaner way than relying on the current format of the logging output.
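For future searchers, here is a minimal sketch of the output-parsing approach described above. The function name `has_differences` and its arguments are made up for illustration, and it relies entirely on the log line keeping the wording " 0 differences found" (with the leading space, so that "10 differences found" is not mistaken for zero).

```shell
#!/usr/bin/env bash
# Sketch only: decide "changed or not" by scraping the log output of rclone check.
# has_differences is a hypothetical helper name, not part of rclone.

# has_differences SRC DST
# Returns 0 (true) unless the captured output contains " 0 differences found".
has_differences() {
    local output
    # Merge stderr into stdout so the log lines are captured whichever
    # stream they are written to. "|| true" keeps a non-zero exit status
    # from aborting scripts that run under "set -e".
    output="$(rclone check "$1" "$2" 2>&1)" || true
    case "$output" in
        *" 0 differences found"*) return 1 ;;  # nothing changed
        *) return 0 ;;                          # differences, or some other failure
    esac
}
```

It could be used as `if has_differences ./latest FMATeamDrive:; then ...; fi`. Anything that prevents the expected line from appearing (including a failed run) is treated as "changed", which may or may not be what you want.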

calisro · January 17, 2019, 5:06pm

I don’t think the return codes will give you what you need but you could experiment.

https://rclone.org/docs/#exit-code

detwiler · January 17, 2019, 5:16pm

Thanks calisro. I suspect that the “check” command is exiting with a success code, whether or not there are changes to report. But I will run a test and see. Thanks again.

detwiler · January 17, 2019, 5:32pm

Interesting. I will include this info here for future searchers. A ‘check’ command that detects changes does indeed return a different exit code.

If the ‘check’ command finds no changes, it returns an exit code of 0 (“success”, according to rclone docs). If the execution finds file/directory differences, it returns an exit code of 1 (“syntax or usage error”).
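A minimal sketch of using that exit code in a conditional, based only on the behaviour observed above (0 for no differences, 1 for differences). The helper name `has_changes` is hypothetical. Since exit code 1 is documented as a general error, a non-zero status may also mean the command itself failed, so treat this as an approximation.

```shell
#!/usr/bin/env bash
# Sketch only: has_changes is a hypothetical helper name, not part of rclone.

# has_changes SRC DST
# Returns 0 (true) when rclone check exits non-zero, i.e. when differences
# were found OR when the check itself could not be completed.
has_changes() {
    ! rclone check "$1" "$2" >/dev/null 2>&1
}

# Example use (paths are placeholders):
#   if ! has_changes ./latest FMATeamDrive:; then
#       exit 0   # nothing to do
#   fi
```

The log output is discarded here because the decision rests on the exit status alone; drop the redirection if you still want the list of differing files in your cron mail.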

Animosity022 · January 17, 2019, 5:47pm

I’m not sure what you are trying to accomplish though.

If you wanted to check and run another command to address what was missing in the check, why not run one ‘sync’ to do both.

What are you trying to accomplish? Can you share your use case or the problem you are trying to solve? That helps explain the ‘why’, and I’m sure many folks can help on the ‘how’.

ncw · January 17, 2019, 7:20pm

That is what I'd expect. A failed check is an error... There should probably be a different error code for "check found differences" vs "the command went wrong"

For example from the grep man page

EXIT STATUS
       Normally the exit status is 0 if a line is selected, 1 if no lines were
       selected, and 2 if an error occurred.
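To illustrate why the separate codes matter, here is a small sketch of a script branching three ways on grep's exit status. The function name `classify_grep` is made up for the example. With rclone check as observed in this thread, the "differences" and "error" cases cannot be told apart this way.

```shell
#!/usr/bin/env bash
# Sketch only: classify_grep is a hypothetical helper name.

# classify_grep PATTERN FILE
# Prints "match", "no-match", or "error" depending on grep's exit status.
classify_grep() {
    local status=0
    # Capture the status explicitly so this also works under "set -e".
    grep -q -- "$1" "$2" 2>/dev/null || status=$?
    case $status in
        0) echo "match" ;;
        1) echo "no-match" ;;
        *) echo "error" ;;      # e.g. the file could not be read
    esac
}
```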

detwiler · January 17, 2019, 7:54pm

Thanks for the reply animosity022. As I mentioned above, and without getting too specific, I am simply trying to determine if any files have changed on the remote, before triggering further processing. If there are changes, my script will sync and then perform a whole processing pipeline. I am trying to avoid that workflow when there are no changes. So, I am using ‘check’ in a conditional. The script should simply exit if there are no changes. Even if I were to sync right away, I’d still need to know if there were any changes (to skip an expensive processing pipeline if there are not).

calisro · January 17, 2019, 7:57pm

I think what Animosity and I were alluding to was that, since you’re going to take the API hits doing the compare, why not just ‘sync’ and then, if it found differences, carry on with your processing. If it didn’t, then stop. It’s the same as check but you don’t need to ‘check’ for duplicates and then turn around and ‘redo’ that processing for the sync.

detwiler · January 17, 2019, 7:58pm

Thanks ncw, that reinforces for me that those are the exit status values that I can continue to expect. I will use those as my true/false values for determining if there are changes.
Cheers,
Todd

detwiler · January 17, 2019, 7:59pm

Do you know if I can expect the same exit status codes with sync that I observed for check?

calisro · January 17, 2019, 8:00pm

You’d have to test. I’m not sure. “Check” is a pretty expensive operation to run and then to turn around and run it again under a ‘sync’ though.

detwiler · January 17, 2019, 8:03pm

I did a quick test (actually with copy, not sync) and it looks like it exits with 0 either way. So, if I just sync from the start I don’t have any way to know if there were changes, is that correct?

calisro · January 17, 2019, 8:04pm

I think you're back to scraping the output. If you're just trying to watch 'drive' for changes, you could call the API directly though. Don't want things too complicated for you though.

detwiler · January 17, 2019, 8:04pm

I do want to say, thank you to everyone who has participated in this thread. You’ve helped me a great deal. If check is my only way to capture “has changed”, that is OK. Even if it is expensive, it allows me to skip a processing pipeline that is surely much more expensive.

Cheers

Animosity022 · January 17, 2019, 8:10pm

I’m still confused as to what you are really trying to get to.

If you have to run a check, that’s say 1 API hit if you are going for a single file.

If not changed, nothing happens.

If changed, you run a copy or sync, it has to check again if the file matches or doesn’t match and if it doesn’t match, you have to copy leading to 2 API hits as you get a check and a copy operation.

Therefore, if there are no changes, sync would be the same as a check. If you have to copy, it’s going to cost you three times as much, as you end up doing a check, another check, and then a copy.

felix@gemini:~$ rclone copy /etc/hosts GD:
felix@gemini:~$ rclone check /etc/hosts GD:
2019/01/17 15:08:10 NOTICE: Google drive root '': 0 differences found
felix@gemini:~$ rclone sync /etc/hosts GD:
felix@gemini:~$ rclone sync /etc/hosts GD: -vvv
2019/01/17 15:08:26 DEBUG : rclone: Version "v1.45" starting with parameters ["rclone" "sync" "/etc/hosts" "GD:" "-vvv"]
2019/01/17 15:08:26 DEBUG : Using config file from "/data/rclone/rclone.conf"
2019/01/17 15:08:26 INFO  : Google drive root '': Waiting for checks to finish
2019/01/17 15:08:26 DEBUG : hosts: Size and modification time the same (differ by -963.831µs, within tolerance 1ms)
2019/01/17 15:08:26 DEBUG : hosts: Unchanged skipping
2019/01/17 15:08:26 INFO  : Google drive root '': Waiting for transfers to finish
2019/01/17 15:08:26 INFO  : Waiting for deletions to finish
2019/01/17 15:08:26 INFO  :
Transferred:   	         0 / 0 Bytes, -, 0 Bytes/s, ETA -
Errors:                 0
Checks:                 1 / 1, 100%
Transferred:            0 / 0, -
Elapsed time:       300ms

2019/01/17 15:08:26 DEBUG : 4 go routines active
2019/01/17 15:08:26 DEBUG : rclone: Version "v1.45" finishing with parameters ["rclone" "sync" "/etc/hosts" "GD:" "-vvv"]

detwiler · January 17, 2019, 8:48pm

Animosity022, it isn’t the cost of the check that I am concerned about. If there are differences, I am going to spin up a whole sequence of actions (rebuild a database, extract some file content and push to another drive, generate some new documents and push those, etc.). I am trying to avoid doing those things when there are no changes.

I could use ‘copy’ (the command I am using so as not to delete anything on the target) instead of check, but I’d still need to determine if anything was updated. My current script parses the output of check looking for the substring " 0 differences found" as a way to get a boolean response to whether or not there are changes. But that relies on the stability of the logging message format. It turns out that I can use the exit status as a more controlled check. It appears that I cannot do the same with copy. Additionally, it doesn’t even appear as though I could parse the output for copy. See the copy command in this sequence: though it resolves the differences, it has no output.

[root@synapse fma]# rclone check ./latest FMATeamDrive:
2019/01/17 12:40:19 ERROR : pun_fma.owl: Sizes differ
2019/01/17 12:40:19 NOTICE: Google drive root '': 1 differences found
2019/01/17 12:40:19 Failed to check: 1 differences found
[root@synapse fma]# rclone copy ./latest FMATeamDrive:
[root@synapse fma]# rclone check ./latest FMATeamDrive:
2019/01/17 12:40:46 NOTICE: Google drive root '': 0 differences found

Animosity022 · January 17, 2019, 8:54pm

I’m just having a hard time following what you are doing.

A ‘check’ indicates you have a source location to compare against a target location.

If source doesn’t match target, you are doing a bunch of things in the middle to add to the source and copy it up?

detwiler · January 17, 2019, 9:09pm

Basically, yes. And I will amend my last message and say that I could parse the ‘copy’ command output if I use the verbose (-v) argument and look for non-zero values in this part of the log message: “Transferred: 1 / 1, 100%”.
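A rough sketch of that parsing, assuming the stats block looks like the v1.45 output posted earlier in this thread (a "Transferred:" line with "Bytes" followed by a second "Transferred:   N / M, ..." line for the file count). The function name `transferred_count` is made up, and the pattern would need adjusting if the log format differs.

```shell
#!/usr/bin/env bash
# Sketch only: transferred_count is a hypothetical helper name.

# transferred_count
# Reads rclone log text on stdin and prints N from the last
# "Transferred:   N / M, ..." line. The bytes line ("N / M Bytes, ...")
# is skipped because the pattern requires a comma right after M.
transferred_count() {
    sed -n -E 's#^Transferred:[[:space:]]+([0-9]+) / [0-9]+,.*#\1#p' | tail -n 1
}

# Example use (paths are placeholders):
#   count="$(rclone copy -v ./latest FMATeamDrive: 2>&1 | transferred_count)"
#   [ "${count:-0}" -gt 0 ] && echo "something was copied"
```

Like the "0 differences found" approach, this depends on the current wording and layout of the log output.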

Here is a bit more elaborate description of what I am trying to do. I have another user out there who is editing any of a number of files that look local to him (thanks to Google File Stream) but are actually in a Google Team Drive. Elsewhere, on a Linux server, I have a processing pipeline that needs to execute whenever any file has changed within that Team Drive (changes won’t happen often and the pipeline shouldn’t run if there are no changes). The script with the ‘check’ is running in a cron job. Each day (or whatever schedule I assign) it checks the drive to see if any files are updated and, if so, it launches the pipeline. I can process the log statements sent to standard out or standard err (by check or copy), but I was investigating whether there was a more stable flag to check (one that wouldn’t fail if the log message format is altered). With check there is an exit code that differs depending on whether there are or are not changes. The cost of the extra API hits is not of particular concern, as the whole thing runs in the background on a cron job, and that cost is minimal as compared to the cost of unnecessarily executing the pipeline.
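Put together, the cron script could look roughly like the sketch below. The function name `sync_and_process` and the pipeline command passed to it are placeholders, not anything rclone provides, and the source and destination arguments stand in for whichever direction you actually copy. It assumes the exit codes observed in this thread: 0 from check means no differences, and anything else is treated as "changed", which would also include a check that failed for other reasons.

```shell
#!/usr/bin/env bash
# Sketch only: sync_and_process and the pipeline command are hypothetical.

# sync_and_process SRC DST CMD [ARGS...]
# Runs CMD only when rclone check exits non-zero for SRC vs DST.
# Returns 0 if nothing changed, otherwise the status of the copy or of CMD.
sync_and_process() {
    local src="$1" dst="$2"
    shift 2
    if rclone check "$src" "$dst" >/dev/null 2>&1; then
        return 0                       # no differences: skip the pipeline
    fi
    # Differences reported (or check failed): bring DST up to date first.
    rclone copy "$src" "$dst" || return 1
    "$@"                               # caller-supplied pipeline command
}

# Example use from cron (paths and script name are placeholders):
#   sync_and_process ./latest FMATeamDrive: /path/to/run_pipeline.sh
```

Passing the pipeline as arguments keeps the gate separate from the expensive work, so the same wrapper can be reused on a different schedule or with a different command.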