Bisync question

What is the problem you are having with rclone?

Running rclone bisync without --resync flag after running rclone bisync with --resync flag results in errors. Why? Do I have to use --resync flag every time?

NOTE -- I made changes in the path1 directory before running rclone bisync again. Is that my problem? But I made no changes to at least one of the files listed in the error messages.

Run the command 'rclone version' and share the full output of the command.

Bill's Air:~ bmeacham$ rclone version
rclone v1.63.1
- os/version: darwin 10.14.6 (64 bit)
- os/kernel: 18.7.0 (x86_64)
- os/type: darwin
- os/arch: amd64
- go/version: go1.20.6
- go/linking: dynamic
- go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

Microsoft OneDrive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

Bill's Air:~ bmeacham$ rclone bisync  ~/OneDrive remote_onedrive:
2023/08/31 15:03:33 NOTICE: bisync is EXPERIMENTAL. Don't use in production!

<snip>

2023/08/31 15:07:17 ERROR : -          Path1 file not found in Path2       - Documents/!JOT/Book/images/300px-SchroŐądinger_cat.png

<snip>

2023/08/31 15:07:17 ERROR : Bisync critical error: path1 and path2 are out of sync, run --resync to recover
2023/08/31 15:07:17 ERROR : Bisync aborted. Must run --resync to recover.

The rclone config contents with secrets removed.

Bill's Air:~ bmeacham$ rclone config
Current remotes:

Name                 Type
====                 ====
remote_onedrive      onedrive


Bill's Air:~ bmeacham$ cat ~/.config/rclone/rclone.conf
[remote_onedrive]
type = onedrive
token = {"access_token":"<secret removed>","token_type":"Bearer","refresh_token":"<secret removed","expiry":"2023-08-31T15:48:33.975595-05:00"}
drive_id = 950df50906f3422c
drive_type = personal

A log from the command with the -vv flag

"The command" - which command? I assume you mean "The command you were trying to run." Please update this template to remove ambiguity.

Bill's Air:~ bmeacham$ rclone bisync  ~/OneDrive remote_onedrive: -vv
2023/08/31 15:34:26 DEBUG : rclone: Version "v1.63.1" starting with parameters ["rclone" "bisync" "/Users/bmeacham/OneDrive" "remote_onedrive:" "-vv"]
2023/08/31 15:34:26 DEBUG : Creating backend with remote "/Users/bmeacham/OneDrive"
2023/08/31 15:34:26 DEBUG : Using config file from "/Users/bmeacham/.config/rclone/rclone.conf"
2023/08/31 15:34:26 DEBUG : Creating backend with remote "remote_onedrive:"
2023/08/31 15:34:27 NOTICE: bisync is EXPERIMENTAL. Don't use in production!
2023/08/31 15:34:27 DEBUG : Lock file created: /Users/bmeacham/Library/Caches/rclone/bisync/Users_bmeacham_OneDrive..remote_onedrive_.lck
2023/08/31 15:34:27 INFO  : Synching Path1 "/Users/bmeacham/OneDrive/" with Path2 "remote_onedrive:/"
2023/08/31 15:34:27 DEBUG : Lock file removed: /Users/bmeacham/Library/Caches/rclone/bisync/Users_bmeacham_OneDrive..remote_onedrive_.lck
2023/08/31 15:34:27 ERROR : Bisync critical error: cannot find prior Path1 or Path2 listings, likely due to critical error on prior run
2023/08/31 15:34:27 ERROR : Bisync aborted. Must run --resync to recover.

It's very clear as above that it asks for the command causing an issue and the next line asks for a debug lof of the command. It's very clear as can't say I've ever seen someone say that is ambiguous before.

Sorry, I thought it might mean the "rclone config" that I ran to answer the question about rclone config contents.

Do you have a suggestion on wording that would alleviate the issue?

Sure. Spell it out:

" A log from the command you were trying to run with the -vv flag".

As a requirements analyst and tester I have learned never to underestimate the ability of a user to misunderstand what I think is crystal clear.

1 Like

While I am at it, here is an other suggestion for improvement. Change this:

The rclone config contents with secrets removed.

to this:

The contents of the rclone config file (rclone.conf) with secrets removed. Replace the line that begins with "token =" with "<token secrets removed>".

My best guess, according to your log, is that this file: Documents/!JOT/Book/images/300px-SchroŐądinger_cat.png was either added to Path1 or deleted from Path2 sometime during (not before) the approximately 4 minutes of your bisync run (from 2023/08/31 15:03:33 to 2023/08/31 15:07:17.)

If this is what happened, it is a known issue with the current version of bisync, and I'm working on a fix for it. Progress can be tracked in the following GitHub issues:

In the meantime, my recommendation is to use --check-sync=false and periodically do an independent integrity check as described here.

If you think this is NOT what happened, then I would be grateful for any other debug info you can provide (including, ideally, steps to reproduce the issue), so that I can look into this further and identify the cause.

Note that the error from your second log is a different error, and is expected:

2023/08/31 15:34:27 ERROR : Bisync critical error: cannot find prior Path1 or Path2 listings, likely due to critical error on prior run
2023/08/31 15:34:27 ERROR : Bisync aborted. Must run --resync to recover.

What this is saying is that because your prior run ended in a critical error, bisync refuses to run again until you run a --resync. This is a safety feature, to avoid making any further changes until the user has a chance to check things out and give the green light to proceed. Sometimes, running a --resync will be all that's needed to resolve things, but other times, you'll want to do some manual cleanup first, before running the --resync. Since bisync can't know which scenario it is, it takes the more cautious approach of asking the user to make the decision, rather than risk guessing incorrectly.

No. Generally, you should only include --resync in the following (rare) scenarios:

  1. It is your first bisync run (between these two paths)
  2. You've just made changes to your bisync settings (such as editing the contents of your --filters-file)
  3. There was an error on the prior run, and as a result, bisync now requires --resync to recover

The rest of the time, you should omit --resync. (Here's why.)

It shouldn't be. A path1 and path2 are out of sync error is only expected if changes were made during the run, not before it. So, if you're certain that nothing changed during the run, then more investigation is needed here.

Sorry, that's not what happened. That file has been just sitting there untouched on either side.

Well, rats, I wish I could, but it's lost in the mists of time at this point.

I know that's not very satisfactory. I will continue in another post shortly. I need to log on to another machine to verify what happens rather than relying on memory.

That is interesting, because the following line in your log:

2023/08/31 15:07:17 ERROR : -          Path1 file not found in Path2       - Documents/!JOT/Book/images/300px-SchroŐądinger_cat.png

indicates that bisync sees the file only in Path1, not Path2.

I notice that some parts of that log were redacted (replaced by <snip>) -- are you able to say whether there was another line similar to this one, but with the paths reversed? I ask that specifically because I notice that this particular filename has a special character in it (oŐą) and I'm wondering if somehow that character is being represented differently on the two remotes, thus causing bisync to think they are different files. It could have something to do with unicode normalization, especially since it looks like you are using a fairly old version of macOS, and Mac is notorious for unicode normalization issues. (FWIW, I'm a Mac user myself and haven't run into encoding issues with bisync, but I'm using the latest OS version and it's possible the behavior is different in yours.)

One thing that could still be helpful -- any chance you still have the .lst-err files from the failed run in your cache directory? The default location on mac is /Users/YOURUSERNAME/Library/Caches/rclone/bisync. They should still be there unless you deleted them. If you still have them, could you paste the line containing 300px-SchroŐądinger_cat.png from both files (.path1.lst-err and .path2.lst-err)? It may give us clues as to why bisync thinks they are different.

It could also be helpful if you could run rclone lsl on both the Path1 and Path2 versions of the problematic file, and paste what comes back. i.e.:

rclone lsl ~/OneDrive/Documents/!JOT/Book/images/300px-SchroŐądinger_cat.png
rclone lsl remote_onedrive:Documents/!JOT/Book/images/300px-SchroŐądinger_cat.png

Yes, I found the lst-err files. They are from today, not from a few days ago, but they show the relevant errors.

From Users_bmeacham_OneDrive..remote_onedrive_.path1.lst-err

"Documents/!JOT/Book/images/300px-SchroŐądinger_cat.png"
-   100349 md5:6afc2fef3fd544cb4c88771962b580e5 - 2007-10-02T13:13:58.000000000+0000 

From Users_bmeacham_OneDrive..remote_onedrive_.path2.lst-err

2007-09-04T11:40:54.000000000+0000 "Documents/!JOT/Book/images/300px-Schrödinger_cat.png"
-   100349 quickxor:7a1ee650d5a56f497d0260d33dc218cce3536a88 - 

OK, here you go.

Bill's Air:~ bmeacham$ rclone lsl ~/OneDrive/Documents/'!'JOT/Book/images/300px-SchroŐądinger_cat.png
    80263 2007-09-04 06:40:54.000000000 300px-SchroŐądinger_cat.png
Bill's Air:~ bmeacham$ rclone lsl remote_onedrive:Documents/'!'JOT/Book/images/300px-SchroŐądinger_cat.png
Bill's Air:~ bmeacham$ 

The second rclone lsl command gives no output. But the file is there as shown by doing lsl on the whole directory rather than just the file.

Bill's Air:~ bmeacham$ rclone lsl remote_onedrive:Documents/'!'JOT/Book/images/
     1201 2008-04-14 10:10:16.000000000 180px-Necker_cube.png
    80263 2007-09-04 06:40:54.000000000 300px-Schrödinger_cat.png
   100349 2007-10-02 08:13:58.000000000 800px-SynapseIllustration2.svg.png
   305906 2010-01-15 12:52:22.000000000 BlackSwan1.png
   110995 2010-01-15 15:35:31.000000000 BlackSwan1a.png
    31585 2010-03-19 10:57:03.000000000 BlackSwan1b.png
   391104 2010-01-15 12:59:08.000000000 BlackSwan2.png
    28852 2007-10-02 20:52:56.000000000 NB_Neuron.gif
    29900 2007-10-02 20:54:26.000000000 Neuron-no_labels.png
    26876 2007-10-02 21:14:00.000000000 Neuron_labels.PNG
     2698 2007-09-03 10:55:54.000000000 SchroedingersCat.jpg
  1454372 2009-07-27 10:25:01.000000000 double-slit_experiment_results-tanamura.jpg
   202051 2009-07-29 16:52:57.000000000 double-slit_experiment_results-tanamura_sm.jpg
     2869 2009-07-27 10:13:34.000000000 yin-yang.jpg
   155688 2009-07-29 16:26:41.000000000 yin-yang_big.jpg
     6954 2020-03-22 13:49:16.000000000 yin-yang_med.jpg
Bill's Air:~ bmeacham$

I don't know what's up with that. I hope this helps.

More info in the next post.

(Continued from previous post)

My question is, what constitutes first bisync run? First time ever, or first time since mounting the remote? My experiments lead me to believe it is the latter.

Let me give you some background. I have more than one machine I use for my wtiting and research, and I use OneDrive to keep a particular folder (directory) with subfolders in sync over all of them. On this Mac, which runs an old version of macOS because a very useful piece of 32-bit software won't run on newer versions, I do the following:

  1. Start out with remote OneDrive not mounted. At this point the folder where I will mount my remote (~/OneDriveRemote) is empty.

  2. Do some work, i.e. create, update and/or delete files on my Mac in folder ~/OneDrive

  3. Mount remote OneDrive into ~/OneDriveRemote. Now that folder is not empty.

3a. Run the following command:

rclone bisync ~/OneDrive remote_onedrive:

QUESTION - Do I need the --resync flag at this point or should I omit it?
ANSWER - Evidently I do need it, because if I omit it, I get errors:

Bill's Air:~ bmeacham$ rclone bisync  ~/OneDrive remote_onedrive:
2023/09/02 11:22:36 NOTICE: bisync is EXPERIMENTAL. Don't use in production!
2023/09/02 11:26:39 ERROR : -          Path1 file not found in Path2       - .DS_Store

<snip>

2023/09/02 11:26:39 ERROR : -          Path1 file not found in Path2       - Documents/!JOT/Book/images/300px-SchroŐądinger_cat.png

<snip>

3b. Okay, run the command with --resync flag

rclone bisync --resync ~/OneDrive remote_onedrive:

This time the run is successful. It terminates with no errors.

CONCLUSION: You need to use the --resync flag on the first run after mounting the remote.

  1. Do some work on my local drive in folder ~/OneDrive

5a. Run rclone bisync again without the --resync flag

QUESTION - Do I need the --resync flag at this point or should I omit it?
ANSWER - Evidently I do need it, because if I omit it, I get errors, pretty much the same as before (I have not done a detailed comparison):

Bill's Air:~ bmeacham$ rclone bisync  ~/OneDrive remote_onedrive:
2023/09/02 11:22:36 NOTICE: bisync is EXPERIMENTAL. Don't use in production!
2023/09/02 11:26:39 ERROR : -          Path1 file not found in Path2       - .DS_Store

<snip>

2023/09/02 11:26:39 ERROR : -          Path1 file not found in Path2       - Documents/!JOT/Book/images/300px-SchroŐądinger_cat.png

<snip>

5b. Okay, run the command with --resync flag

rclone bisync --resync ~/OneDrive remote_onedrive:

This time the run is successful. It terminates with no errors.

CONCLUSION: You need to use the --resync flag on subsequent runs as well as the first run after mounting the remote. This is contrary to the docs, which say "For successive sync runs, leave off the --resync flag."

  1. Unmount remote by terminating the Terminal session.

At this point the folder where I mount my remote (~/OneDriveRemote) is not empty.

Okay, that's all for now. It appears that my initial problem happened because I omitted the --resync flag. I hope I have given you enough to work with.

Thank you so much for this additional info -- it was very helpful and I believe I have now identified the cause, and steps to reproduce it. It is indeed a unicode normalization issue, as suspected.

In short: the filename on your Path1 (macOS local) uses NFD, and the filename on Path2 (onedrive) uses NFC. (This is why your lsl on Path2 came up empty -- there's no file with that exact name, just one that looks the same to us humans.)

Path1 (macOS local - NFD):
300px-SchroŐądinger_cat.png
33303070782d536368726fcc8864696e6765725f6361742e706e67

Path2 (onedrive - NFC):
300px-Schrödinger_cat.png
33303070782d53636872c3b664696e6765725f6361742e706e67

This only makes a difference for the oŐą character. If we strip away all other characters, we have:

Path1 (macOS local - NFD):
oŐą
6fcc88

Path2 (onedrive - NFC):
ö
c3b6

(As background context: most of the world uses NFC, but macOS prefers NFD for some reason, and will sometimes (but not always) convert NFC filenames to NFD invisibly.)

Rclone's copy and sync commands (including the ones used by bisync under-the-hood) automatically normalize to NFC, unless --no-unicode-normalization is specified. This is why you can usually rclone sync between mac and non-mac without issue -- rclone will recognize the NFC and NFD versions as the same filename.

The trouble is that there are three places where bisync does NOT do this:

  1. When building the list of files that need to be transferred during --resync
  2. When building the list of deltas during a non-resync, but only for so-called "Unusual sync checks"
  3. When comparing Path1 to Path2 during --check-sync

#1 and #2 end up being harmless, because although it will try to copy the NFC and NFD versions as different files, the underlying copy function it calls is smart enough to realize they are the same file. It will therefore leave the files alone (if already equal) or update one of them (if not equal), but never create a duplicate.

#3 is the problem you discovered. Because --check-sync never normalizes, it thinks the NFC and NFD versions are different files. It therefore thinks something went wrong during the bisync run, because at the end of the run the Path1 file list does not exactly match Path2. This is why it throws the error.

Here is a bare-minimum case to reproduce:

% rclone touch NFD/oŐą 
% rclone touch NFC/ö
% rclone bisync NFD NFC --resync
2023/09/02 18:23:08 NOTICE: bisync is EXPERIMENTAL. Don't use in production!
% rclone bisync NFD NFC         
2023/09/02 18:23:17 NOTICE: bisync is EXPERIMENTAL. Don't use in production!
2023/09/02 18:23:17 ERROR : -          Path1 file not found in Path2       - oŐą
2023/09/02 18:23:17 ERROR : -          Path2 file not found in Path1       - ö
2023/09/02 18:23:17 ERROR : Bisync critical error: path1 and path2 are out of sync, run --resync to recover
2023/09/02 18:23:17 ERROR : Bisync aborted. Must run --resync to recover.
% 

So, my opinion is that this is indeed a bisync bug (thank you for finding it!), and I will open an issue to track it. The long-term solution is probably that bisync should support normalization in the same manner as sync (i.e. always normalize when comparing between paths, unless --no-unicode-normalization has been specified.) In the mean time, as a short-term solution, I would recommend using --check-sync=false to bypass this issue.

Alternatively, you could convert all NFC filenames to NFD on both paths before using bisync for the first time. Assuming that Path1 is already NFD, and that Path1 and Path2 are already in sync, something like the following should do it (but I haven't actually tried it! Be sure to --dry-run first.)

rclone sync Path1 Path2 --no-unicode-normalization

A few other things worth mentioning in response to your comments:

Are you mounting the remote via rclone mount? If so, I'm not sure if I understand why you are also using bisync simultaneously. mount and bisync both do two-way sync, so it's typically only necessary to use one or the other, not both. And in fact, I'd recommend against using both, as it could potentially cause race conditions. You could find that they attempt to sync the same change at the same time, and whichever accomplishes it first confuses the other. This might well explain some of the other weird behavior you've been seeing.

To answer the question though: what I meant is "first time ever" (as between these exact paths), not "first time since mounting". (I must add: be sure to never bisync while unmounted -- it could cause serious data loss as bisync will think all the mount-side files have been deleted!)

Unless this is your "first bisync run" (as defined above), you should omit --resync here. The errors you cite are not caused by --resync, but by other things. The first error is caused by macOS automatically creating a .DS_Store resource fork file during your bisync run (this is the issue I mentioned in my first post). The second error is the NFC/NFD unicode normalization issue discussed above. Neither are directly caused by --resync (just masked by it.)

I see why that conclusion would seem logical, but it is correlation, not causation. It is not that the lack of --resync causes the problem, it's that once the problem exists, --resync corrects it (in the case of the .DS_Store error) or hides it (in the case of 300px-SchroŐądinger_cat.png.) It does so while also causing other problems (such as the potential for deleted files to get un-deleted), which is why it is not advisable to run --resync outside of the 3 specific scenarios I mentioned.

As above, I think this conclusion misidentifies the cause of the errors, and more importantly overlooks the dangers of using --resync on every run. To illustrate what I mean, try this simple experiment:

  1. Create a file named foo on one side.
  2. Run a --resync.
  3. Rename foo to bar on one side.
  4. Run a --resync.

RESULT: instead of both sides now having only bar, both sides now have BOTH foo AND bar. (Whereas, had you omitted --resync in Step 4, the rename would have worked as expected, resulting in only bar on both sides.)

Issue has been opened:

Glad to be of service. Thanks for your work on this project. Based on your comments I'll have some further questions, perhaps in another topic.

1 Like