Add flags to bundle/unbundle Git repos automatically

One issue I've found when syncing Git directories is that they include a lot of files, which seems to slow down syncing significantly. For now, I usually just exclude the .git directory altogether. However, this is less than ideal for local-only repos.

I'm sure I could find a way to bundle my Git folders automatically and unbundle them when I want to work on them, but it would be ideal if there were a flag that did this automatically... basically something that could compress/decompress Git repos on the fly. Maybe it could use a special file extension, similar to how --links works. There is probably also a native way to do this through go-git, so that you could unbundle them manually if needed.

See here for a related discussion on native submodule support in bundles:

https://lore.kernel.org/git/CAGZ79kbrmec=SDYShkRN0Bz_zuBJnbw7+obxMCezjEFW==OUJQ@mail.gmail.com/#t

I can't imagine I'm the only one with this challenge, so I'd love to hear others' input on how they've dealt with the issue, or find out whether this is a feature worth requesting.

This is an example of a more general problem - directories with lots and lots of files in them are very slow to sync...

I could imagine an rclone backend which (using some heuristic) used something like tar or zip to bundle up certain directories.

I'm not sure I understand your idea though, because I'm not familiar with git bundles.

I suppose that would make it more robust, though I'm not that familiar with how the backends work.

I found out about Git bundling when I was looking for possible solutions for the issue. It essentially bundles a Git repo into a single file:

https://git-scm.com/book/en/v2/Git-Tools-Bundling
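
For reference, here is a minimal sketch of that workflow (the repo paths are made up for illustration): git bundle create packs all refs and objects into one file, and cloning from that file restores a full repo.

```python
# Sketch of the git bundle workflow (paths are illustrative only).
import subprocess

# Pack every ref and its objects from ./myrepo into a single file.
subprocess.run(
    ["git", "bundle", "create", "../myrepo.bundle", "--all"],
    cwd="myrepo", check=True,
)

# Restore a full repo from that one file (works offline, no server needed).
subprocess.run(
    ["git", "clone", "myrepo.bundle", "myrepo-restored"],
    check=True,
)
```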

I couldn't care less whether it uses native bundling or a more robust backend option. I just hope there is a flag to unbundle or uncompress as well. If using tar compression, gzip might make the most sense since it is already supported by the compress remote, but I'm wondering whether Zstandard might be an option as well, or whether you've already done some benchmarks.
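
To illustrate what I mean by compressing just the Git data, here is a rough sketch using only the standard library (paths are made up, and this is not how the compress remote works internally). Zstandard would need a third-party package, which is partly why gzip seems like the easier default:

```python
# Rough sketch: pack only the .git directory into one gzip-compressed tar.
# The standard library tarfile module supports gzip/bz2/xz; Zstandard would
# require an extra dependency (e.g. the "zstandard" package).
import tarfile

with tarfile.open("myrepo-git.tar.gz", "w:gz") as tar:
    tar.add("myrepo/.git", arcname=".git")

# Unpacking it back next to the working tree:
with tarfile.open("myrepo-git.tar.gz", "r:gz") as tar:
    tar.extractall("myrepo")
```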

I'm wondering whether a --compress-* flag using the same structure as filtering (includes and excludes) would be too complex, and how naming would work... e.g. you have a directory named remote:bundle that you want bundled, but you already have an archive at remote:bundle.tar.gz. It would probably make the most sense to put the bundle inside the directory itself, such as remote:bundle/bundle.rclone.tar.gz.

I think for this particular use case, you'd probably be better off with a bit of scripting to bundle and unbundle the git repo.
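
Something along these lines, for example. This is only a rough sketch of what such a script could look like: it only preserves committed state (stashes and the index are lost), and the directory names are hypothetical.

```python
# Rough sketch of "a bit of scripting": swap each .git directory for a single
# bundle file before syncing, and rebuild .git from the bundle afterwards.
import os
import shutil
import subprocess

ROOT = "Projects"  # hypothetical tree that gets synced with rclone

def bundle_repos(root):
    for dirpath, dirnames, _ in os.walk(root):
        if ".git" in dirnames:
            subprocess.run(["git", "bundle", "create", "repo.bundle", "--all"],
                           cwd=dirpath, check=True)
            shutil.rmtree(os.path.join(dirpath, ".git"))
            dirnames.remove(".git")  # nothing left to walk into

def unbundle_repos(root):
    for dirpath, _, filenames in os.walk(root):
        if "repo.bundle" in filenames:
            # Clone the bundle into a temporary dir, then keep only its .git.
            # Note the new clone's "origin" still points at the bundle file.
            subprocess.run(["git", "clone", "repo.bundle", ".unbundle-tmp"],
                           cwd=dirpath, check=True)
            tmp = os.path.join(dirpath, ".unbundle-tmp")
            shutil.move(os.path.join(tmp, ".git"),
                        os.path.join(dirpath, ".git"))
            shutil.rmtree(tmp)
            os.remove(os.path.join(dirpath, "repo.bundle"))

if __name__ == "__main__":
    bundle_repos(ROOT)
    # ... run the rclone sync here ...
    unbundle_repos(ROOT)  # or run this on the machine that pulled the sync
```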

In the general case, syncing to a .tar file is very difficult, so an auto-bundling backend would be very hard without some hints from the user, I think.

Thanks for the update! I'm open to providing hints and can imagine a few ways of doing so, but understand your time is valuable. Unfortunately, I don't have the time or expertise right now to explore submitting a PR myself for something as challenging as this.

I would be happy to create a more general proposal for creating archives based on hints (I imagine something similar to --include for archiving and to how --links works for extraction), but you probably already have a better solution in mind if you find the time.

