I created a library for a separate project I’m working on, but it could be used here too. It’s called Spectra, and it’s basically a mock filesystem for testing. It implements Go’s fs.FS interface: when you call its list-children method, it first checks its own spectra.db (a SQLite table) to see if the folder and its children already exist. If so, it serves them up; otherwise it generates them from the Spectra config parameters, like the minimum and maximum number of folders and files, plus a max depth so it doesn’t recurse infinitely.
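To make the lazy-generation idea concrete, here’s a minimal sketch of that get-or-generate flow. All the names here (Config, Store, ListChildren, and so on) are illustrative, not Spectra’s actual API, and an in-memory map stands in for the spectra.db table:

```go
package main

import (
	"fmt"
	"math/rand"
)

// Config mirrors the kinds of knobs described above. These names are
// illustrative, not Spectra's actual API.
type Config struct {
	MinDirs, MaxDirs   int
	MinFiles, MaxFiles int
	MaxDepth           int
}

// Node is one generated file or directory.
type Node struct {
	Name  string
	IsDir bool
}

// Store stands in for the spectra.db persistence layer; here it is
// just an in-memory map keyed by path.
type Store struct {
	cfg      Config
	children map[string][]Node
}

func NewStore(cfg Config) *Store {
	return &Store{cfg: cfg, children: make(map[string][]Node)}
}

// pathSeed derives a stable RNG seed from a path, so the same folder
// always generates the same children.
func pathSeed(path string) int64 {
	var h int64
	for _, c := range path {
		h = h*31 + int64(c)
	}
	return h
}

// ListChildren serves cached children if this path was listed before,
// otherwise generates them from the config and persists them.
func (s *Store) ListChildren(path string, depth int) []Node {
	if kids, ok := s.children[path]; ok {
		return kids // already "in the DB": serve as-is
	}
	rng := rand.New(rand.NewSource(pathSeed(path)))
	var kids []Node
	if depth < s.cfg.MaxDepth { // MaxDepth stops infinite recursion
		nDirs := s.cfg.MinDirs + rng.Intn(s.cfg.MaxDirs-s.cfg.MinDirs+1)
		for i := 0; i < nDirs; i++ {
			kids = append(kids, Node{Name: fmt.Sprintf("dir%03d", i), IsDir: true})
		}
	}
	nFiles := s.cfg.MinFiles + rng.Intn(s.cfg.MaxFiles-s.cfg.MinFiles+1)
	for i := 0; i < nFiles; i++ {
		kids = append(kids, Node{Name: fmt.Sprintf("file%03d.bin", i)})
	}
	s.children[path] = kids // persist so later calls see the same tree
	return kids
}

func main() {
	s := NewStore(Config{MinDirs: 1, MaxDirs: 3, MinFiles: 1, MaxFiles: 4, MaxDepth: 2})
	// prints true: repeat listings of the same path are consistent
	fmt.Println(len(s.ListChildren("/root", 0)) == len(s.ListChildren("/root", 0)))
}
```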
I built this as a testing harness for another migration tool I worked on, and it works very well. The DB backing is great for persistence but hurts performance; at some point I’d like to add a memory-only version of the library, but that can come later. Basically, it generates items for you to chew through. It’s all mostly garbage data, but it’s consistent garbage data! It can even generate file contents, though those are really just some bytes to checksum against when testing ‘copy’ operations between nodes.
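For the checksum part, the trick is just to derive the file bytes deterministically from something stable like the path, so source and destination always agree. A rough sketch under that assumption (hypothetical names again, not Spectra’s real API):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"math/rand"
)

// pathSeed derives a stable RNG seed from a path.
func pathSeed(path string) int64 {
	var h int64
	for _, c := range path {
		h = h*31 + int64(c)
	}
	return h
}

// fileBytes deterministically generates n pseudo-random bytes for a
// given path, so the "same file" always has the same content and
// therefore the same checksum. (Illustrative, not Spectra's API.)
func fileBytes(path string, n int) []byte {
	rng := rand.New(rand.NewSource(pathSeed(path)))
	buf := make([]byte, n)
	rng.Read(buf)
	return buf
}

// checksum is what a copy test would compare on source and destination.
func checksum(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

func main() {
	src := checksum(fileBytes("/a/file001.bin", 1024))
	dst := checksum(fileBytes("/a/file001.bin", 1024))
	// prints true: same path gives same bytes, so checksums match
	fmt.Println(src == dst)
}
```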
I don’t wanna spend too long in here talking about Spectra, since I already have READMEs and such for that. But I figured I’d run it by you guys in case you’d like it.
I already made the fork if you wanna test it out and take a look.
I was going to make the PR but saw that I should post in here first. I’ve already tested rclone on it: I had it chew through a migration of about 5M nodes (roughly 10TB worth of data) and it worked flawlessly, no issues.
I don’t foresee this being that useful to everyday users, but it could work well as a CI/CD or dev-testing tool if you wanna sandbox a migration without needing actual data to do so. Sort of a step up from typical unit-test mocks.
Here’s a link to the library directly if you wanna read more about it.
Rclone has some similar features built in. I don't know if you have come across the `rclone test makefiles` command? It makes randomly generated files in a filesystem hierarchy, with some control over the directory structure, sizes, and names.
I use it a lot for testing! If you want to generate really big files, you can use the --sparse flag so the files take up no disk space.
I understand Spectra does a similar thing, but the files are virtual (not created on the disk at all), and there is also the ability to write files. Is that right?
I took a quick look at your backend, nice work! I don't think we can merge something that depends on SQLite though, as it requires CGO, and we've decided not to allow CGO because it complicates the build too much. We do use bolt for db stuff though (which we package into lib/kv).
Good suggestion! I don’t like the CGO nonsense either, lol. I researched BoltDB and wrote the code to switch over to it from SQLite; it actually works better for Spectra’s use cases anyway.
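For anyone curious what the switch looks like conceptually: a bolt bucket is just a byte-key to byte-value store, so the SQLite lookup becomes a get-or-generate against a single bucket. Here’s a rough sketch with a plain map standing in for the bbolt bucket so the snippet stays self-contained; the KV interface and loadOrGenerate are illustrative names, not the fork’s actual code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// KV is the minimal surface a bolt bucket offers: byte keys mapped to
// byte values. In the real port this would be backed by bbolt (or
// rclone's lib/kv wrapper); a map stands in here for the sketch.
type KV interface {
	Get(key []byte) []byte
	Put(key, value []byte)
}

type memKV struct{ m map[string][]byte }

func newMemKV() *memKV                  { return &memKV{m: make(map[string][]byte)} }
func (kv *memKV) Get(key []byte) []byte { return kv.m[string(key)] }
func (kv *memKV) Put(key, value []byte) { kv.m[string(key)] = value }

// loadOrGenerate looks up a directory's children by path; on a miss it
// generates them, serializes to JSON, and persists them under the same
// key, so every later lookup returns the identical listing.
func loadOrGenerate(kv KV, path string, gen func() []string) []string {
	if raw := kv.Get([]byte(path)); raw != nil {
		var kids []string
		json.Unmarshal(raw, &kids)
		return kids // already persisted
	}
	kids := gen()
	raw, _ := json.Marshal(kids)
	kv.Put([]byte(path), raw)
	return kids
}

func main() {
	kv := newMemKV()
	first := loadOrGenerate(kv, "/root", func() []string { return []string{"dir000", "file000.bin"} })
	second := loadOrGenerate(kv, "/root", func() []string { return nil })
	// prints 2 2: the second call hits the persisted listing
	fmt.Println(len(first), len(second))
}
```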
There’s still a lot more I plan to add to Spectra as a standalone library, for example simulating rate limiting, network packet loss, jitter, and dropped connections, but for the most part it works as a basic traversal tool.
The main benefit over other testing scripts is that it doesn’t have to ACTUALLY generate millions of files and folders on a drive to simulate traversal between two drives, since it’s all mostly fake data (but persisted to a DB for consistency and resumption). So I think it works out better than a lot of the alternatives.
Anyway, I made the switch over to BoltDB and pushed my changes to the master branch of the fork if you wanna take a look sometime! I’m not in a major rush to get it merged, but figured I’d let you know sooner rather than later that the work was done.
You may want to run a few more tests and explore it more yourself before we set up the full PR. Otherwise, I’m good on my end at least.