I've been exploring bisync in depth over the past several months, and have come across a number of issues that I thought would be worth detailing here for the greater rclone community. Apologies for the length of this post -- rather than creating a separate post for each issue, I thought it might be more helpful to have everything in one place.
Let me start by saying that bisync (and rclone as a whole) is an amazing tool and I am very grateful for all of the thoughtful work that clearly went into its design. The following list is not meant as criticism but as a starting point for discussion about how to make the tool even better.
I've divided the list to try to distinguish between Suspected Bugs (things that are actually not functioning as the docs suggest they should) and Feature Requests (functioning as designed, but I wish they were different.)
Suspected Bugs
1. Dry runs are not completely dry
Consider the following scenario:
- User has a bisync job that runs hourly via cron and uses a `--filters-file`.
- User makes changes to the `--filters-file` and then runs `--resync --dry-run` to test the new filters (`--resync` is required after a filter change, as a safety feature).
- The results of the `--dry-run` are unexpected, so user decides to make more changes before proceeding with the 'wet' run.
- Before user has time to finish making and testing the further changes, the hourly cron job runs. The user expects that this run will simply fail, as a result of the above safety feature when bisync detects modified filters. But instead, bisync does NOT detect the modified filters, and proceeds with the new filters that the user had not intended to commit, causing potential data loss.
Had the user not run the `--dry-run`, the safety feature would have worked as expected, preventing disaster. But because of the `--dry-run`, a new .md5 hash file was created -- essentially 'committing' the new filters without having ever run a non-dry `--resync`.
Notably, listing files are appended with a `-dry` suffix during dry runs, to avoid them getting mixed up with the 'real' listings. But filter .md5 files have no such protection, and as a result, bisync can't tell the difference between a 'dry' .md5 file and the real thing.
To put this another way (in case the above was unclear): if `--resync` is required to commit a filter change, then `--resync --dry-run` should not be sufficient to commit a filter change (but it currently is).
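To make the sequence concrete, here is a minimal reproduction sketch, with placeholder paths and filter file (the commands are illustrative, not copied from my actual setup):

```
# 1. Establish a baseline with the old filters
rclone bisync Path1 Path2 --filters-file filters.txt --resync

# 2. Edit filters.txt, then test the change -- intended to be harmless
rclone bisync Path1 Path2 --filters-file filters.txt --resync --dry-run

# 3. The dry run above already regenerated the filters .md5 file, so this next
#    (e.g. scheduled) run no longer detects the filter change and proceeds with
#    the uncommitted filters instead of aborting
rclone bisync Path1 Path2 --filters-file filters.txt
```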
2. `--resync` deletes data, contrary to docs
The documentation for `--resync` states:
This will effectively make both Path1 and Path2 filesystems contain a matching superset of all files. Path2 files that do not exist in Path1 will be copied to Path1, and the process will then sync the Path1 tree to Path2.
However, there are at least two undocumented exceptions to this: empty folders and duplicate files (on backends that allow them) will both be deleted from Path2 (not just ignored), if they do not exist on Path1.
The docs indicate that a newer version of a file on one path will overwrite the older version on the other path, which makes sense. But it does not suggest that anything would get deleted (as opposed to just overwritten). It is not truly a "superset" if anything gets deleted.
I'm aware that rclone is object-based and does not treat folders the same way as files. However, in other rclone commands, this usually just results in empty folders being ignored -- not deleted. Furthermore, since `bisync` (unlike `copy` and `sync`) does not support `--create-empty-src-dirs`, we cannot get around this by including the empty folders in the copy from Path2 to Path1.
Lastly, to the extent that a user is aware of this issue and seeks to find out whether they're at risk before they do it, this is difficult because of the known issue concerning `--resync` dry runs. Essentially: it's very hard to tell the difference between the deletes that can be safely ignored and those that can't. A user might think they're safe, and then find out the hard way that they were wrong. (Not that anyone I know would make a mistake like that...)
For now, a possible workaround is the following sequence:
rclone dedupe --dedupe-mode interactive Path2
rclone copy Path2 Path1 --create-empty-src-dirs --filter-from /path/to/bisync-filters.txt
rclone bisync Path1 Path2 --filters-file /path/to/bisync-filters.txt --resync
Note that you would probably want to repeat this any time you edit your `--filters-file`.
3. `--check-access` doesn't always fail when it should
Consider the following scenario:
1. User intends to bisync `Path1/FolderA/FolderB` with `Path2/FolderA/FolderB`.
2. User places access check files at `Path1/FolderA/FolderB/RCLONE_TEST` and `Path2/FolderA/FolderB/RCLONE_TEST` as a safety measure, to ensure bisync won't run if it doesn't find matching check files in the same places:
Path1
└── FolderA
    └── FolderB
        └── RCLONE_TEST
Path2
└── FolderA
    └── FolderB
        └── RCLONE_TEST
3. User runs the following command, accidentally mistyping one of the paths:
rclone bisync Path1/FolderA/FolderB Path2/FolderA --check-access --resync
The access test does not prevent this, and the transfer proceeds, even though the paths have been mistyped and check files are not in the same places.
4. User runs a normal bisync, with the path still mistyped:
rclone bisync Path1/FolderA/FolderB Path2/FolderA --check-access
The access test still does not prevent this, and the transfer proceeds, even though the paths have been mistyped.
Why? Because access tests are not checked during `--resync`. Therefore, in step 3 above, bisync actually created two new `RCLONE_TEST` files, thereby helping to defeat its own safety switch in step 4. The mangled directory structure now looks like this:
Path1
└── FolderA
    └── FolderB
        ├── FolderB
        │   └── RCLONE_TEST
        └── RCLONE_TEST
Path2
└── FolderA
    ├── FolderB
    │   └── RCLONE_TEST
    └── RCLONE_TEST
Is this user error? Yes. But preventing accidental user error is one of the main reasons this feature exists. And I think many new users, even having read the docs, would reasonably expect that their inclusion of `--check-access` would prevent a mess such as this in steps 3 and 4 above.
It's worth noting that the following also would have succeeded in step 3, even though there would be no `RCLONE_TEST` file to be found anywhere in the second tree:
rclone bisync Path1/FolderA/FolderB Path2/FolderA/FolderC --check-access --resync
To the extent that not checking access tests during `--resync` was an intentional design choice (as the docs sort of imply but never totally spell out), I'm not sure I understand the rationale. If `--check-access` is intended as a protection against data loss, couldn't data loss just as easily happen during a `--resync`? Why should that be exempt? If the idea was for `--resync` to help us set the check file on the other side, we have plenty of better options for this, including `touch`, `copyto`, or simply running `--resync` without `--check-access` (which, while still not preventing the scenario above, would at least not give the user a false sense of security).
Another simple example to show the logical inconsistency -- imagine that we have not created any check files on either side, and then we run:
rclone bisync Path1 Path2 --check-access --resync
This succeeds. But then we run a normal bisync:
rclone bisync Path1 Path2 --check-access
This fails. (As it should! But then it begs the question... why was the `--resync` allowed to succeed?)
Possible solutions:
- Enforce `--check-access` during `--resync` (meaning the file must already exist on both sides)
- Prevent `--resync` from running if `--check-access` has also been included
(Personally, I prefer the first one, so that I don't have to remember to actively remove `--check-access` from my normal bisync command when adding `--resync`. I love that adding `--resync` is currently all I have to do!)
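For what it's worth, if the rationale for exempting `--resync` was to help users seed the check files in the first place, something like the following (paths are placeholders) already covers that, without weakening `--check-access`:

```
# Create the check files manually on both sides before the first --resync
rclone touch Path1/FolderA/FolderB/RCLONE_TEST
rclone touch Path2/FolderA/FolderB/RCLONE_TEST
```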
4. `--fast-list` is forced when unwanted
The docs indicate that rclone does NOT use `--fast-list` by default, unless the user specifically includes the `--fast-list` flag. However, `ListR` appears to be hard-coded on this line, meaning that bisync uses it regardless of whether you asked for it or not. In my case, it's significantly faster without `--fast-list`, so I don't want to use it -- but there's no obvious way to disable it. I did eventually find that `--disable ListR` seems to do the trick, but this isn't really documented anywhere, and it's also inconsistent with behavior in the rest of rclone (where `--fast-list` is `false` by default).
Another reason I prefer not to use `--fast-list` is because I have tons of empty directories (for intentional reasons described in more detail below), and as a result, I get lots of false positives as described in this issue when `--fast-list` is used.
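For anyone else hitting this, the workaround I landed on (found by trial and error rather than in the docs) looks like this:

```
# Prevent bisync from using the hard-coded ListR (i.e. --fast-list) behavior
rclone bisync Path1 Path2 --disable ListR
```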
5. Bisync reads files in excluded directories during delete operations
There seems to be an oversight in the `fastDelete` function which causes `files` to include every file in your entire remote. Not only is it not filtered for only the files queued for deletion, but it's also not even filtered for the eligible files specified by the user (in the `--filters-file` or otherwise). This means that even if you have a directory exclude rule, bisync will ignore it and loop through and evaluate every single file in that excluded directory. (I first noticed this because I use `--drive-skip-gdocs`, and with `-vv` I could see it skipping tons of individual gdocs in a folder that it wasn't supposed to be looking through. I also noticed that this happened only when there were deletions queued, and not when there were copies queued but no deletions.)
Unlike `fastDelete`, the `fastCopy` function right above it has code to (I think) filter for only the files we care about here. I am guessing that a similar filter was intended for `fastDelete`. I added it in an experimental fork I've been testing, and it seems to have solved the issue.
6. Deletes take several times longer than copies
The cause of this is the same as #5 above, but I'm including it separately as it's a distinct and not-insignificant symptom. I saw a massive performance improvement once the (ironically named) `fastDelete` function no longer had to loop through millions of irrelevant files to find one single deletion.
7. Overridden config can cause bisync critical error requiring `--resync`
When rclone detects an overridden config, it adds a suffix like `{ABCDE}` on the fly to the internal name of the remote. Bisync follows suit by including this suffix in its listing filenames. So far, so good. The problem is that this suffix does not necessarily persist from run to run, especially if different flags are provided. So if next time the suffix assigned is `{FGHIJ}`, bisync will get confused, because it's looking for a listing file with `{FGHIJ}`, when the file it wants has `{ABCDE}`. As a result, it throws `Bisync critical error: cannot find prior Path1 or Path2 listings, likely due to critical error on prior run` and refuses to run again until the user runs a `--resync`.
FWIW: my use case for overriding the config is that I want to `--drive-skip-gdocs` for some rclone commands (like `copy`/`sync`/`bisync`) but not others (like `ls`). So I don't want to just hard-code it in the config file.
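A rough sketch of the failure mode as I understand it (placeholder paths; the suffix value is just the example from above, and rclone derives the real one from the overridden values):

```
# Passing a backend flag on the command line overrides the stored config, so
# bisync writes its listing files under a name with a suffix like {ABCDE}
rclone bisync /local/GDrive gdrive: --drive-skip-gdocs

# A later run without the same override (or with different flags) looks for
# listings under a different name, can't find them, and demands a --resync
rclone bisync /local/GDrive gdrive:
```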
8. Documented empty directory workaround is incompatible with `--filters-file`
Bisync currently does not support copying of empty directories, and as a workaround, the docs suggest the following sequence:
rclone bisync PATH1 PATH2
rclone copy PATH1 PATH2 --filter "+ */" --filter "- **" --create-empty-src-dirs
rclone copy PATH2 PATH1 --filter "+ */" --filter "- **" --create-empty-src-dirs
However, this approach is fundamentally incompatible with using a bisync `--filters-file`. In other words, it's only really useful if you're bisyncing your entire remote. There is no warning about this in the docs, and if a new user were to try this recommended approach without scrutinizing it carefully, they could inadvertently create thousands of folders in directories they hadn't wanted to touch.
Feature Requests
(AKA: my subjective hopes and dreams for the future of Bisync.)
1. Identical files should be left alone, even if new/newer/changed on both sides
One of my biggest frustrations with bisync in its current form is that it will sometimes declare a change conflict where there actually isn't one, and then attempt to "fix" this by creating unnecessary duplicates, renaming both the duplicate and original in the process.
For example, say I add the same new `foo.jpg` file to both paths, and the size, modtime, and checksum are 100% identical on both sides. The next time I bisync, I will end up with two files, `foo.jpg..path1` and `foo.jpg..path2`, on both sides. What I would propose instead is that when bisync encounters one of these so-called "unusual sync checks", it should first check whether the files are identical. If they are, it should just skip them and move on.
To put this another way: if a file is currently identical on both sides, bisync should not care how the files became identical. It should not matter whether the files were synced via bisync vs. some other means. We should not demand that bisync be the only mechanism of syncing changes from one side to the other.
I implemented a basic version of this in my fork by doing a simple `Equal` check, and it seems to be working well. Among the problems it solves is the documented Renamed directories limitation -- I can now just rename the directory on both sides, without the need for a `--resync`, and bisync will naturally understand what happened on the next run. It is also now agnostic to how the files came to be identical, so I am free to go behind bisync's back and use `copy`, `sync`, or something else, if I want to.
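A hypothetical illustration of the kind of scenario I mean (placeholder paths; `cp -p` preserves the modtime so the two copies really are identical):

```
# Add the identical file to both sides, outside of bisync
cp -p ~/foo.jpg /local/Path1/foo.jpg
rclone copyto ~/foo.jpg gdrive:Path2/foo.jpg

# On the next run, bisync treats this as "new on both sides" and renames both copies
rclone bisync /local/Path1 gdrive:Path2
# Result today: foo.jpg..path1 and foo.jpg..path2 on both sides, even though
# size, modtime, and checksum already matched
```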
2. Bisync should be more resilient to self-correctable errors
The way that bisync is currently designed, pretty much any issue it encounters, no matter how small, will cause it to throw up its hands and say "HELP!" (i.e. refuse to run again until the user runs a `--resync`). It's being intentionally conservative for safety, which I appreciate, but in some cases it seems overly cautious, and makes it more difficult to rely on bisync as a scheduled background process (as I would prefer to), since I have to keep checking up on it and manually intervening (as even one errored bisync run will prevent all future bisync runs).
In particular, there are a number of issues that could be resolved by simply doing another sync, instead of aborting. Probably 9 times out of 10 that bisync asks me to intervene, all I'm doing is running `--resync` without having changed anything. In my opinion, it would be really useful to have a new flag to let users choose between the current behavior and essentially a "try harder" mode where bisync tries its best to recover and self-correct, and only requires `--resync` as a last resort when a human's involvement is absolutely necessary.
Some of the issues I'm talking about include:
- network interruptions
- "source file is being updated" errors
- filesystem changes during bisync operation (more on this below)
- Access test failures and missing listing files (assuming you've fixed the underlying issue, why should the following run need to be a `--resync` as opposed to just a normal bisync?)
What I'm looking for is something philosophically similar to the Dropbox or Google Drive desktop client, which mostly stays out of your way and does its thing in the background, and almost never requires user intervention. For my purposes, I have a fairly high tolerance for leaving the filesystem in an imperfect state temporarily at the end of a bisync run, with the hope of correcting any issues on the next scheduled run. But I also acknowledge that there could be other use cases that require an all-or-nothing standard on each sync, and so that's why I would propose a new flag for this (with the current behavior remaining the default.)
3. Bisync should create/delete empty directories as `sync` does, when `--create-empty-src-dirs` is passed
So, I'll be honest here -- what I really want is for rclone as a whole to treat folders the same way as files, and treat empty folders the same way as non-empty folders. (As rsync does.)
But, in the meantime, the way that `rclone sync --create-empty-src-dirs` currently works is usually good enough for my needs (lack of folder metadata support is a bummer, but at least empty folders are copied and deleted reliably). And I wish that `rclone bisync --create-empty-src-dirs` would behave in the same way. (Currently, it doesn't.)
This is one of the changes I made in my fork, as I have quite a lot of empty folders, and they are important to me. Among other reasons, I do a lot of music and video production, and I use apps like Apple Logic Pro, which creates a "project" for you with its own internal folder structure. A single Logic project can have hundreds or thousands of files and folders in it -- lots of moving pieces, and if it ever can't find one that it's expecting to be there, the project can become corrupted. I haven't actually tested whether missing empty folders will cause Logic problems (I'd rather not find out), but even if it doesn't, it seems plausible that some other app at some point will have a similar problem, and it just strikes me as asking for trouble to go around deleting folders that some program presumably put there for a reason. (For example, from what I understand, empty folders are essential to Cryptomator.) So bisync's lack of native support for them was a non-starter for me (and as noted above, the recommended workaround was also a non-starter because it can't be used with a `--filters-file`.)
It should be further noted that even if you don't need to use a `--filters-file`, the documented workaround still won't propagate deletions of empty folders, and `--remove-empty-dirs` is also not a good solution, because it will simply delete all empty folders (even the ones you wanted to keep).
TL;DR: empty folders are important to me, and `bisync`'s treatment of them should match that of `sync`.
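As a concrete point of comparison (placeholder paths; the second command is the behavior I'm wishing for, not something that works today):

```
# sync already handles empty directories well enough for my needs
rclone sync Path1 Path2 --create-empty-src-dirs

# what I'd like bisync to support, with the same create/delete semantics
rclone bisync Path1 Path2 --create-empty-src-dirs
```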
4. Listings should alternate between paths to minimize errors
When bisync builds the initial listings in the "checking for diffs" phase, it currently does a full scan of Path1, followed by a full scan of Path2. In other words, the order is this:
Path1/FileA.txt
Path1/FileB.txt
Path1/FileC.txt
Path2/FileA.txt
Path2/FileB.txt
Path2/FileC.txt
This might be fine for relatively small remotes, but it presents an inherent problem with larger ones, because it means a lot of time could pass between the time it checks a file on Path1 and the time it checks the corresponding file on Path2, and the file could have been edited or deleted in the meantime.
For example, suppose that Path1 has 500,000 files and that we accept the documented benchmark of listing 667 files per second (FWIW, I have not achieved anything close to this, but we'll accept it for the moment). After bisync checks `Path1/FileA.txt`, it will be another 12.5 minutes before it checks `Path2/FileA.txt`. 12.5 minutes is plenty of time for files to change, even for a lightly used filesystem. Now imagine it's 1 million files -- that would be at least 25 minutes. 5 million -- over 2 hours. (And again, that's regardless of whether any files changed since the last run.)
What I would propose instead is that the checking order should be changed so as to alternate between the two paths, with the goal of minimizing the amount of time between checks of corresponding files on opposite sides. For example:
Path1/FileA.txt
Path2/FileA.txt
Path1/FileB.txt
Path2/FileB.txt
Path1/FileC.txt
Path2/FileC.txt
Or, perhaps per-directory instead of per-file, to minimize API calls:
Path1/FolderX/FileA.txt
Path1/FolderX/FileB.txt
Path2/FolderX/FileA.txt
Path2/FolderX/FileB.txt
Path1/FolderY/FileA.txt
Path1/FolderY/FileB.txt
Path2/FolderY/FileA.txt
Path2/FolderY/FileB.txt
This seems similar to the way that `rclone check` already behaves (I haven't dug into the code to confirm) -- perhaps whatever it's doing there could be reused?
5. Final listings should be created from initial snapshot + deltas, not full re-scans, to avoid errors if files changed during sync
Given the number of files I'm syncing and the time that takes (as noted above), I very quickly realized that leaving out `--check-sync=false` would be impractical. Otherwise, the entire bisync would fail with a critical error if even one file was created/edited/deleted during the sync -- which could last minutes or hours. But, as a consequence, I've now introduced the possibility that files will be missed, and that Path1 and Path2 will become "out of sync". So to address that, I scripted a scheduled full-check with email notification to me if it detects any problems. But this is clunky, and I very much agree with the poster of this issue that it would be better to create the final listings by modifying the initial listings by the deltas, instead of doing a full re-scan of both paths.
Since the OP of that issue has not yet answered @ncw's question from last month, I will also add: no, this does not sort itself out if you run bisync again (but I agree that would be preferable). The critical error returned is `path1 and path2 are out of sync, run --resync to recover`, and from my understanding, this is by design.
Conceptually, what I'm proposing is similar to a "snapshot" model. We know what the state was at the start (the listings), and we know what we changed (the deltas). Anything new that happened after we took that listing snapshot, we don't care about right now -- we'll learn about it and deal with it on the next run, whenever that is.
6. `--ignore-checksum` should be split into two flags for separate purposes
Currently, `--ignore-checksum` controls BOTH of the following:
- whether checksums are considered when scanning for diffs*
- whether checksums are considered during the copy/sync operations that follow, if there ARE diffs
I would propose that these are different questions, and should be controlled with different flags. (Here's another user who seems to agree.) In my case, I would want to ignore checksums for #1, but not for #2 (because I have a lot of total files but very few of them change from sync to sync. But when they do, I want checksums used to ensure the integrity of the copy operations.)
*An aside about #1: my understanding is that currently, hashes do get computed and saved in the listings files, but they are not actually used when creating the deltas, as noted by the TODO in the code. Additionally, unless `--ignore-checksum` is passed, hashes do get computed for one path even when they are useless because there's no common hash with the other path -- for example, a crypt remote. Computing all those unnecessary hashes can take a lot of time (I was tempted to file this one in the "bugs" list).
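A hypothetical sketch of the split I'm proposing (the first flag name is a placeholder I made up, not a real rclone flag):

```
# Ignore checksums only when scanning for diffs (what I'd want)
rclone bisync Path1 Path2 --no-listing-checksum

# Ignore them during the copy/sync phase as well
rclone bisync Path1 Path2 --no-listing-checksum --ignore-checksum
```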
7. Bisync should be as fast as sync
Bisync is (anecdotally) several times slower than its unidirectional cousin, `sync`. This seems to be mostly attributable to the process of building full listings twice (before and after the transfers), both single-threaded (Path1 must finish before Path2 can start). This becomes more and more noticeable the more files you have -- to the point where I sometimes find myself "cheating" by running `sync` instead of `bisync` when I know I've only changed one side and I want to sync it quickly. I then let `bisync` discover what I did later, at the normally scheduled time (which I can do only because of the change I described in #1 to avoid auto-renaming identical files).
While I'm admittedly armchair quarterbacking a bit here, it seems that there's no fundamental reason that `bisync` would have to be slower than `sync`, by definition. After all, `sync` also must check the entire destination, in order to know what it needs to copy/delete (and as I posited above, I don't believe the second `bisync` listing is truly necessary, or desirable). I get that `sync` is stateless and `bisync` can't be, but I'm not sure why that should make much difference in terms of speed -- the step of loading the prior listing into memory takes relatively little time (the docs suggest ~30 sec for 1.96 million files). (And while we're on the subject, I thought I'd mention that the other bullet points here still have some "XXX" placeholders.)
This is to say: while I don't have an easy solution to propose for this one, it seems at least theoretically possible to redesign `bisync` to be as fast as `sync` (or at least as fast as: load prior listings + `sync` + save new listings).
8. Bisync should have built-in full-check and notification features to help with headless use
While some kind of native email-notification-on-error feature would probably be a useful thing for rclone in general (not just bisync), there are two things that make bisync different than other rclone commands:
- It's only really useful if you run it more than once
- It's stateful, and an error in a single run causes all future runs to fail (absent user intervention)
I would also wager a guess that a large percentage of its users run it as a background process, via scheduled cron or the like (more so than for other rclone commands).
For these reasons, it's more important than usual to know if something went wrong, and harder than usual to tell. A lot of users will probably find themselves (as I did) hacking their own script together to do a regular full-check and notify them of any errors it finds. Otherwise, you have to keep checking the logs regularly (and who wants to do that?) or risk not knowing about a job that failed and therefore caused all subsequent jobs to fail. It would be great if such a feature were built in (as an optional flag), rather than requiring each user to reinvent the wheel themselves.
By "full-check", what I mean is essentially an rclone check
(or cryptcheck
, as the case may be), with the same src/dest and filters as bisync, for the purpose of detecting whether Path1 and Path2 are out of sync (especially important given how many users are probably using --check-sync=false
, as described above.) This seems like essentially what --check-sync=only
aspires to be, but it is insufficient in its current form (for me, at least) because it only compares files by name, and not by size, modtime, or checksum. (check
is also multithreaded and has more robust output options.)
The notification doesn't necessarily have to be an email notification, but I propose email because it's platform-agnostic and pretty much everyone uses it, and probably checks it more frequently than their bisync logs folder. I realize the need for an SMTP server makes this tricky (new backend, maybe?), but by the same token, that's also what makes this a big ask for the casual user who just wants to use bisync in set-it-and-forget-it mode.
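To give a concrete (if rough) idea of the wheel-reinvention I mean, here is a sketch along the lines of the cron script I cobbled together -- the paths, filters file, and email address are placeholders, and it assumes a working `mail` command on the system:

```
#!/bin/sh
# Run a full rclone check with the same paths and filters as the bisync job,
# and email the log if any differences (or errors) are found.
LOG=/tmp/bisync-fullcheck.log
if ! rclone check /local/GDrive gdrive: \
      --filter-from /path/to/bisync-filters.txt \
      --log-file "$LOG" -v
then
  mail -s "bisync full-check found differences" me@example.com < "$LOG"
fi
```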
--
If you made it this far, thanks for reading, and I'd love to hear your thoughts!
--
Run the command 'rclone version' and share the full output of the command.
rclone v1.62.0-DEV
- os/version: darwin 13.3.1 (64 bit)
- os/kernel: 22.4.0 (arm64)
- os/type: darwin
- os/arch: arm64 (ARMv8 compatible)
- go/version: go1.20.1
- go/linking: dynamic
- go/tags: cmount
Which cloud storage system are you using? (eg Google Drive)
Google Drive
The command you were trying to run (eg `rclone copy /tmp remote:tmp`)
rclone bisync /Users/redacted/Rclone/Drives/GDrive gdrive_redacted: -MPc --drive-skip-gdocs --check-access --max-delete 10 --filters-file /Users/redacted/Rclone/Filters/bisync_gdrive_filters.txt -v --check-sync=false --no-cleanup --ignore-checksum --disable ListR --checkers=16 --drive-pacer-min-sleep=10ms
The rclone config contents with secrets removed.
[gdrive_redacted]
type = drive
client_id = redacted
client_secret = redacted
scope = drive
export_formats = url
token = redacted
team_drive =
skip_shortcuts = true
A log from the command with the `-vv` flag
I'm not sure if it's possible to provide a log that shows everything I talk about here, but here's a fairly typical one for me: https://pastebin.com/3tTvLbCS
(This is without my fix for the `fastDelete` issue described above. Note that it took 1 hour and 5 minutes to make one single deletion!)
If there's something else specific you want to see, let me know and I will try to capture it for you.