Bisync Bugs and Feature Requests

Sure thing. Sorry if this is more detail than you ever wanted about empty folders (lol), but I'm laying it all out here to make the case for why I consider them important and am reluctant to use a tool that doesn't support them. (And I totally acknowledge that others may disagree, including Git and Backblaze.)

Firstly, to reiterate my main concern:

A few other points worth noting:

  • There's an open issue for this, with multiple users reporting that bisync causes issues with Cryptomator (I'm not a Cryptomator user myself)

  • I'm primarily a Mac user, and macOS has a concept of "packages" where certain directories are essentially considered a "file" in Finder and for most other Mac purposes, but still considered a "folder" by rclone. For example, the Logic Pro projects mentioned above are packages using the .logicx extension (slightly off-topic, but: this is one reason I really want directory metadata support -- rclone currently cannot sync the modtime of Logic projects, even though macOS essentially treats them as files. Add me to the growing list of users willing to chip in to sponsor this feature!) Actually, the way I discovered Bug #2 in my original post was from doing a --resync and seeing some 'packages' get deleted, because rclone considered them to be empty directories.

  • There are some scenarios in which rclone itself requires the existence of empty folders. For example, the mount point for rclone mount (unless using --allow-non-empty which is not supported on Windows and usually a bad idea anyway), and bisync itself, which will accept an empty folder as a root path (on --resync only) but error out if it doesn't already exist (unlike sync which will create it for you on the fly.) So it strikes me as somewhat contradictory for rclone to take a position of "empty folders don't truly exist, but also they're so crucially important that we'll sometimes error without them". Obviously these particular examples are easily correctable and not a big deal, but my real point is that if rclone itself sometimes errors when it can't find a folder it expects to be there, other applications could too. This is why I'm not so quick to assume there won't be any consequences if I go about deleting willy-nilly the thousands of empty folders in my drive created by other applications over the course of decades. Will all those applications work just fine without them? Maybe. But how can I be sure? It's also kind of an impossible thing to test for, given 1.) how many different apps I use; and 2.) how many different operations I'd have to test in each app (what if the issue it causes is not immediately apparent?) And even if I could somehow test this and definitively prove that no harm is caused, any one of these apps could change their behavior in a future version, and by relying on a tool that discards empty folders, I could be doing damage for a while before I discover the issue.

  • rclone sync already supports --create-empty-src-dirs. In my opinion, the eventual goal should be for bisync to be a bidirectional version of sync, with full feature parity.

  • rsync supports empty directories (and directory metadata, for that matter).

  • Aside from my fears about breaking apps, I sometimes also use empty folders in my workflow as placeholders that will be filled with files later, but which I want to create earlier so that 1.) I can apply naming conventions to all the folders at the same time with a mass-rename, and 2.) so that it will be apparent later if I missed one. For example, at the end of working on an audio project, I like to export each individual audio track in several different formats that might be needed later (since I can't assume that the DAW and every plugin I used will still be around and usable 20 years from now). Let's say it's an album with 20 songs, each song has 30 tracks, and I want to save everything in 5 formats. I would first create a hierarchy of empty folders like this:

/
├── 01 Song1_Name
│   ├── 01 Song1_Name - Format1
│   ├── 01 Song1_Name - Format2
│   ├── 01 Song1_Name - Format3
│   ├── 01 Song1_Name - Format4
│   └── 01 Song1_Name - Format5
├── 02 Song2_Name
│   ├── 02 Song2_Name - Format1
│   ├── 02 Song2_Name - Format2
│   ├── 02 Song2_Name - Format3
│   ├── 02 Song2_Name - Format4
│   └── 02 Song2_Name - Format5
└── 03 Song3_Name
    ├── 03 Song3_Name - Format1
    ├── 03 Song3_Name - Format2
    ├── 03 Song3_Name - Format3
    ├── 03 Song3_Name - Format4
    └── 03 Song3_Name - Format5

(etc. to Song20)

And then I would go about filling each folder with files (one at a time) by exporting from my DAW (a long process that I might split up over several days, bisyncing at various points in between). If I were to create each folder later at the time of use, 1.) it would take longer (for lack of mass-rename), 2.) it would increase likelihood of making typos or other inconsistencies in my naming conventions, and 3.) I might not notice if I missed one (for example, if I forgot to export format #3 for song #16.) Whereas having the empty placeholder folders there serves as a virtual "checklist" -- it's immediately obvious if I missed one. (I typically zip up each folder when I'm done, and so a remaining folder or abnormally small zip file would stick out like a sore thumb.)
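For anyone who wants to see the "checklist" idea concretely, here is a minimal Python sketch of it. The function names (`make_placeholders`, `unfilled`) and the `track01.wav` file are made up for illustration; this is not anything rclone or my DAW provides, just one way the placeholder workflow could be scripted.

```python
from pathlib import Path
import tempfile


def make_placeholders(root, songs, formats):
    """Create one empty folder per (song, format) pair and return the folders."""
    created = []
    for number, name in enumerate(songs, start=1):
        song_dir = Path(root) / f"{number:02d} {name}"
        for fmt in formats:
            folder = song_dir / f"{song_dir.name} - {fmt}"
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created


def unfilled(root):
    """Return every folder under root that is still empty (the remaining checklist items)."""
    return sorted(
        p for p in Path(root).rglob("*") if p.is_dir() and not any(p.iterdir())
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        songs = [f"Song{i}_Name" for i in range(1, 4)]
        formats = [f"Format{i}" for i in range(1, 6)]
        make_placeholders(tmp, songs, formats)

        # Simulate exporting a single track into the first placeholder folder.
        first = Path(tmp) / "01 Song1_Name" / "01 Song1_Name - Format1"
        (first / "track01.wav").write_bytes(b"")

        # Everything printed here is an export that has not been done yet.
        for folder in unfilled(tmp):
            print(folder.relative_to(tmp))
```

With three songs and five formats, filling one folder leaves the other 14 listed. A sync tool that drops empty directories would erase exactly the information `unfilled` relies on.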

This is all to say: empty folders sometimes have a role in my workflow, and so I don't want to use a data sync tool that will consider them worthless and ignore/delete them. In my view, it's not truly a mirror unless my empty folders from Path1 exist on Path2 (and vice versa).

I appreciate the idea, but this seems less clean to me than my current solution. Some possible concerns:

  • If the process were to get interrupted, it would leave all the temp files there.
  • It could add a lot of time and API calls to each sync if doing this on a non-local remote (I have tens of thousands of empty folders in Google Drive, maybe more).
  • I'm a bit wary of writing temp files into application folders for the same reason I'm wary of deleting them -- how can I be sure it won't break something in some app I might use at some point? Basically my first principle with all of this is: if an app put the files and folders there itself, I'd better not assume that I can go in and mess with it without consequences.
  • It would look quite noisy on the Google Drive side to have thousands of temp files created and deleted constantly -- it would make it difficult to see which files are actually changing.
  • I'm also not sure how possible it would be to perfectly clean up after myself (restore to original state) on either side (for example, I'd imagine that the act of creating and deleting the temp file could cause some of the parent folder metadata to change.)
  • Potential for conflicts if bisyncing multiple machines at the same time (I currently bisync my Google Drive with my desktop and my laptop, for example.)

By contrast, my current solution essentially just makes the existing copyEmptySrcDirs parameter controllable by the user instead of hard-coded to false.

There's something I'm still having trouble wrapping my head around, and maybe it will make more sense to me once I actually try it out, but: if bidirectional syncing is inherently stateful, how can syncrclone sometimes still operate (safely) without knowing the state?

I think I understand the part about considering everything "new" when we don't know the prior state, and I get why that would be fine for the first run -- because the user has control over when that happens, and presumably they would not be running that first sync at a time when that would create a mess. But what about a random future run when the user isn't around and it can't find (or can't trust) the prior listings for whatever reason (like an error/interruption on the prior run)? If everything is considered "new" in this situation, how does it avoid erroneously merging the two sides together and causing deleted files to re-appear?
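To make the worry concrete, here is a toy Python sketch of the two situations. To be clear about the assumptions: this is not how bisync or syncrclone actually work (I haven't read syncrclone's code yet), the function names are hypothetical, and it only looks at whether a path exists, ignoring modtimes, sizes, and conflicts.

```python
def plan_with_state(prior, path1, path2):
    """Three-way comparison. `prior` is the set of paths both sides had after
    the last successful sync; `path1` and `path2` are the current listings."""
    return {
        "copy_to_path2": sorted(path1 - prior - path2),      # new on Path1
        "copy_to_path1": sorted(path2 - prior - path1),      # new on Path2
        "delete_on_path2": sorted((prior - path1) & path2),  # deleted on Path1
        "delete_on_path1": sorted((prior - path2) & path1),  # deleted on Path2
    }


def plan_without_state(path1, path2):
    """No usable prior listing: anything missing from one side can only be
    interpreted as 'new' on the other, so the result is a union merge."""
    return plan_with_state(set(), path1, path2)


if __name__ == "__main__":
    prior = {"a.txt", "b.txt"}
    path1 = {"a.txt"}                    # b.txt was deleted on Path1
    path2 = {"a.txt", "b.txt", "c.txt"}  # c.txt was added on Path2

    print(plan_with_state(prior, path1, path2))
    print(plan_without_state(path1, path2))
```

In this toy model, the run with state deletes `b.txt` on Path2 and copies only `c.txt` to Path1. The run without state schedules no deletions at all and copies both `b.txt` and `c.txt` to Path1, so the deleted file comes back. That resurrection is the case I'm asking about.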

Again, entirely possible that I'm just missing something and that my questions will answer themselves by just trying it out (which I plan to do!)

Very sorry to hear that. Hope he's ok. :frowning_face:

My sense is that it would not take all that much time/effort to patch a few bugs and get the current bisync into a more usable state, and so that is probably worth doing regardless. Whereas developing bisync2 or porting syncrclone would be a much bigger lift, and so maybe that should be more of a Phase 2 of this project.

I'm also thinking that a good next step for me would be to install syncrclone and try it out, so that I have a better understanding of how it works and I'm not wasting either of your time without having done my homework. :grinning: I'm sure it will also give me a better sense of the pros and cons between bisync's philosophy and syncrclone's.

I'm in the middle, on US Eastern Time (UTC-4), but I'm very often up and working at odd hours (like right now, haha), so I'm sure I can accommodate whatever's best for both of you. :grin: In the meantime, I will work on getting that PR together for you, and getting myself up to speed on syncrclone.