A fork focusing on cloud storage services without public APIs

Hi Nick @ncw!

Congratulations on your success in creating and maintaining such a great project! I appreciate your enthusiasm as well as your patience over the years, and I decided to learn from you and start contributing to the open source community.

As a rclone user, I use rclone literally every day, and its performance and usability amaze me all the time. However, in mainland China most of the supported backends (except for the generic ones, like S3 and WebDAV) do not perform at their best because of network issues, and due to the closed nature of many cloud storage vendors in mainland China, it's a shame that we cannot use rclone to manage those services.

I noticed that there is a project called alist which is similar to rclone but supports many cloud storage services in China. As far as I know, many of the APIs it uses may have been obtained through packet capture or reverse engineering, which would be painful for you to maintain. From GitHub issues I also noticed that the rclone maintainers are unlikely to accept feature requests or PRs for services without public APIs, whether because of the maintenance burden or potential legal issues.

So I decided to fork rclone and try to port the alist storage backends to it, partly as a way to learn the Go programming language. I will create branches that should merge into upstream rclone with little or no change, and I will try to keep up with upstream changes (mainly for security reasons, though new features are also important). This post is mainly to inform you about the fork I made.

My GitHub account was flagged somehow a few days ago and my public profile is hidden, so for now I am using this Codeberg repository to host my source code. Feel free to check it out if you have some free time; I'm really looking forward to your suggestions (they will be a gift for Go newbies like me :smile:).

Have a nice day!

I just signed up to say thanks. I too live in China and experience slow backends due to the internet here.

It may be easier to do this (in terms of maintainability) via a plugin infrastructure for the backends:

Thank you for the praise :slight_smile:

Cloud services without public APIs are a pain to maintain. We have two at the moment: mega and mailru. I effectively maintain the mega backend and the library it depends on; mailru and the library it depends on are maintained by another rclone maintainer. It's a lot of work... I'd only accept backends made from a reverse engineered API if I was also getting an offer of maintenance for them!

What I would do is make an alist backend which drives the alist code directly. This means that the alist devs can continue maintaining all those lovely integrations and we only have to maintain the alist backend.

I wasn't really clear on what exactly alist does - it might be that those docs haven't been translated into English. How do you use it?

Man... I had thought that all of the non-generic backends (like Box) that rclone currently supports were officially supported by their vendors... You're right, maintaining those requires a lot of patience. Salute to you guys again!

Yes, alist does provide a way to interact with it through its API. The current alist upstream version is v3.x.x, and there are backends for both the current version and the older one (called alist_v3 and alist_v2 respectively; you can check the related code through the hyperlinks). However, I will explain later why I don't recommend doing it this way.

As far as I'm concerned, alist is a cloud storage management tool, but unlike rclone it is a web application built with Gin + SolidJS rather than a CLI application, so the way you use it differs a lot. For example, rclone and alist both support basic management actions on cloud storage, like mkdir, file/dir renaming, random reads through HTTP Range requests, etc. But when it comes to uploading, they differ a lot. If you want to upload data to an alist backend, you have two options (as far as I know):

  1. upload through the web UI.
  2. upload through a WebDAV client (alist exposes its storage over the WebDAV protocol; you may have seen port 5244 in rclone GitHub issues before, that is the default port alist listens on, and people use rclone as a WebDAV client against it; see the sketch after this list).
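
For the second option, a minimal sketch of what the rclone remote might look like, assuming a local alist instance on its default port 5244 with WebDAV served under /dav (the host, user and password here are placeholders, not anything from my fork):

```
[alist-webdav]
type = webdav
# placeholder URL: point it at your own alist instance
url = http://127.0.0.1:5244/dav
vendor = other
user = admin
# store the password obscured, e.g. via `rclone config` or `rclone obscure`
pass = <obscured password>
```

With such a remote, `rclone copy /some/local/dir alist-webdav:target` uploads through alist, but as noted below this path gives you no checksum verification.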

Neither of them provides data integrity checks. That is the most important reason why I decided to contribute directly to rclone instead of alist. Admittedly, rclone is a rather large project that has grown over the years, so alist might be a better starting point for a Go newbie like me, given its smaller code base and probably fewer language barriers when I have questions to ask (alist users are mostly Chinese, like me). But because of alist's web application nature and the WebDAV protocol itself, it simply has no way to provide file integrity checks like rclone's, and that means a lot to me. rclone also provides file dedupe, bisync, bandwidth limiting, VFS caching, etc., so I chose to use rclone as the framework.
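
To make the checksum point concrete: a native rclone backend can advertise and return hashes through rclone's standard backend interface, which the WebDAV route through alist cannot do in general. Here is a minimal sketch, using a hypothetical backend whose vendor API returns an MD5 for every file (the `quark` package name, the `Fs`/`Object` stubs and the `remoteMD5` field are placeholders for illustration, not code from my fork):

```go
package quark // hypothetical backend package, for illustration only

import (
	"context"

	"github.com/rclone/rclone/fs/hash"
)

// Fs and Object stand in for a real backend's types; remoteMD5 would be
// filled in from the vendor API's file metadata.
type Fs struct{}

type Object struct {
	remoteMD5 string
}

// Hashes tells rclone which checksum types the backend supports, so that
// `rclone copy`/`rclone check` can verify transfers end to end.
func (f *Fs) Hashes() hash.Set {
	return hash.Set(hash.MD5)
}

// Hash returns the stored checksum of one object; rclone compares it with
// the hash of the local file after a transfer.
func (o *Object) Hash(ctx context.Context, t hash.Type) (string, error) {
	if t != hash.MD5 {
		return "", hash.ErrUnsupported
	}
	return o.remoteMD5, nil
}
```

A WebDAV remote cannot offer this in general, because the protocol has no standard way to expose a file's checksum, so rclone has to fall back to size/modtime comparisons only.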

Both projects are my favourites, and I'm not saying one is better than the other; they actually focus on different use cases, so they can't really be compared. I think alist focuses more on "read". For example, it provides document previewing (docx, pdf, ...), external video playback (one-click export to VLC, IINA, ...), Markdown rendering and much more. In short, it extends the ways the original cloud storage can be used. Without alist, we might only be able to use the vendors' official clients for a limited number of operations, plus the telemetry and larger memory footprint that we don't want.

As for why I decided to make a fork, one of the reasons is that some of the backends I plan to support require special hacks. For example, in fs/fshttp/http.go I introduce a new method called NewClientWithCustomUserAgent which, as the name suggests, overrides the default rclone/<version> style User-Agent, because setting User-Agent in rest.Client does not work as expected. The custom User-Agent serves different purposes for different backends. Most of the time it only masquerades requests as if they came from the official clients: we don't know whether the vendors' risk control policies take these factors into account, and we don't want our users to get their accounts banned for this reason, so we try our best to avoid that. In some cases the change of User-Agent is mandatory, because some vendors' servers only accept requests from their official clients and block everything else. This kind of hack is not something a normal app would do, so I would rather keep you away from potential legal issues (I don't know if there are any, but better safe than sorry). I don't want additional features to interrupt your normal development process for non-technical reasons.
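
For reference, the general shape of such a helper could look like the sketch below. This is only my illustration of the idea, not the exact code in the fork; it assumes rclone's existing fshttp.NewClient(ctx) as the base client and wraps its transport so every request goes out with the chosen User-Agent:

```go
package fshttp // sketch: would sit next to rclone's existing fshttp code

import (
	"context"
	"net/http"
)

// userAgentTransport wraps another RoundTripper and forces a fixed
// User-Agent header on every outgoing request.
type userAgentTransport struct {
	base      http.RoundTripper
	userAgent string
}

func (t *userAgentTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// RoundTrippers must not modify the caller's request, so clone it
	// before touching the headers.
	r := req.Clone(req.Context())
	r.Header.Set("User-Agent", t.userAgent)
	return t.base.RoundTrip(r)
}

// NewClientWithCustomUserAgent returns an HTTP client whose requests carry
// the given User-Agent instead of the default "rclone/<version>" one.
func NewClientWithCustomUserAgent(ctx context.Context, userAgent string) *http.Client {
	client := NewClient(ctx) // rclone's standard client (proxy, timeouts, TLS options, ...)
	base := client.Transport
	if base == nil {
		base = http.DefaultTransport
	}
	client.Transport = &userAgentTransport{base: base, userAgent: userAgent}
	return client
}
```

A backend can then build its rest.Client on top of this http.Client, and the masquerading stays in one place instead of being scattered across the request code.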

Another reason is the language barrier. What I've done mostly benefits Chinese users, so when they encounter problems they may turn to the issue tracker for help. However, I wrote these lines in the forum mostly by myself with a little help from DeepL, and I cannot guarantee the same for those users. If they report directly to upstream, chances are the issue descriptions will contain a lot of Chinese, which would be a lot of pain for you maintainers. So I decided to act as a middle layer between upstream and Chinese users for the ease of communication: I'll write my comments on issues in both Chinese and English, which will be better for both sides. There may also be some culture-specific ways of communicating that foreigners find hard to understand, and that is where I can step in.

In short, I will try to keep communicating with upstream, keep my branches always ready to merge into upstream, and document them inline in English. For features that are more generic and will probably benefit other backends that upstream maintains, I will test them and send you a PR.

Sounds like a great project :+1: from me :slight_smile:

Any chance of supporting Baidu Pan (pan(dot)baidu(dot)com)?

It is already planned. Currently I've implemented read-only access to Quark drive; next I will try Aliyundrive and Baidunetdisk.

Hi Nick!

On second thought, I will take your advice and implement an alist backend to get all the goodies that alist offers. Implementing a single alist backend gives access to all the backends that alist supports, which is great. However, the fork will still be maintained, because alist does not do data integrity checks the way rclone does; to ensure data integrity, I think it is better to implement native rclone backends than to use alist as a middleware, even though that needs more effort.

I will send you a PR when my alist backend is ready.

Have a nice day!

Sounds good - thanks
