Brainstorming Rclone features

Nick et al.,
First off -- cool product. I have downloaded the current release.

Brainstorming idea

Is it in your business model to consider supporting versioned S3 repositories? IMHO, if S3 is the only provider that has versioned repos, and not all S3 vendors support them, the answer is NO. That is OK.

That said:
Justification: with versioning you could solve a myriad of backup issues.

  • With versioning, RClone mount could pick any point in time at which to mount the repository.

  • With versioning on, rclone needs no "changes". User mistakes that wipe out the destination S3 would be recoverable.

  • My S3 supplier is introducing S3 lifecycle support in the next quarter :slight_smile:

  • Using RClone instead of a full DB makes sense to me. I would rather see the S3 contain "clone" copies. If complex software is creating the backup files, it could fail at the most inopportune time.

In my case, here are some snippets:

// Load the SDK config pointing at a custom S3 endpoint (e.g. Wasabi).
cfg, err := config.LoadDefaultConfig(context.TODO(),
    config.WithCredentialsProvider(credsProvider),
    config.WithEndpointResolver(aws.EndpointResolverFunc(
        func(service, region string) (aws.Endpoint, error) {
            return aws.Endpoint{URL: endpoint}, nil
        })))
if err != nil {
    return err
}
client := s3.NewFromConfig(cfg)

// List every version of every object, including non-latest ones
// and delete markers.
respVers, err := client.ListObjectVersions(context.TODO(), inputVer)
if err != nil {
    return err
}
printVersionOutput(*respVers)

// printVersionOutput logs the contents of a ListObjectVersionsOutput.
func printVersionOutput(output s3.ListObjectVersionsOutput) {
    log.Info(fmt.Sprintf("Bucket Name:%s", fmtStringNil(output.Name)))
    log.Info(fmt.Sprintf("Delimiter:%s", fmtStringNil(output.Delimiter)))
    log.Info(fmt.Sprintf("CommonPrefixes len:%d", len(output.CommonPrefixes)))
    log.Info(fmt.Sprintf("DeleteMarker len:%d", len(output.DeleteMarkers)))
    log.Info(fmt.Sprintf("Versions len:%d", len(output.Versions)))

    for inx, version := range output.Versions {
        modTime := *version.LastModified
        log.Info(fmt.Sprintf("Inx:%d Key:%s Etag:%s Size:%d modTime:%s",
            inx,
            fmtStringNil(version.Key),
            fmtStringNil(version.ETag),
            version.Size,
            modTime.Format("Mon Jan 2 15:04:05"),
        ))
        log.Info(fmt.Sprintf("    VID:%s isLatest:%t\n",
            fmtStringNil(version.VersionId),
            version.IsLatest,
        ))
    }
}

This produces the following output for my 3-file repo with a couple of versions:

INFO 2021-04-19 20:39:33 Delimiter:** nil **
INFO 2021-04-19 20:39:33 CommonPrefixes len:0
INFO 2021-04-19 20:39:33 DeleteMarker len:0
INFO 2021-04-19 20:39:33 Versions len:5
INFO 2021-04-19 20:39:33 Inx:0 Key:Sample/ Etag:"d41d8cd98f00b204e9800998ecf8427e" Size:0 modTime:Wed Apr 14 02:51:21
INFO 2021-04-19 20:39:33 VID:001618368681481908872-JTm2grK7XL isLatest:true

INFO 2021-04-19 20:39:33 Inx:1 Key:Sample/File1.txt Etag:"e35ed419723a71491433dcb889ec51ac" Size:101 modTime:Wed Apr 14 03:05:01
INFO 2021-04-19 20:39:33 VID:001618369500551868972-X6Xt1f5ytm isLatest:true

INFO 2021-04-19 20:39:33 Inx:2 Key:Sample/File1.txt Etag:"e7343397015af875d08d84179a21b2cd" Size:72 modTime:Wed Apr 14 03:03:42
INFO 2021-04-19 20:39:33 VID:001618369421669596521-CqsklW8ze9 isLatest:false

INFO 2021-04-19 20:39:33 Inx:3 Key:Sample/File2.txt Etag:"434d2ff65cfc2cd4e0a626fce4af86a6" Size:101 modTime:Wed Apr 14 03:05:00
INFO 2021-04-19 20:39:33 VID:001618369500471942471-bzwaSklISn isLatest:true

INFO 2021-04-19 20:39:33 Inx:4 Key:Sample/File2.txt Etag:"87d66c059c531036dcc6e84660234cbb" Size:72 modTime:Wed Apr 14 03:03:42
INFO 2021-04-19 20:39:33 VID:001618369422282553279-LLjRXY25cJ isLatest:false

With this, it is easy to see how, using the modification times, one could point rclone mount and other tools at the repo to recover from it.

Just an idea, and yes, it's OK if this isn't your business model.

Allen Strand

If you take a look at the b2 docs you'll see it supports versions and rclone can cope with those.

Using the --b2-versions flag lets you see the old versions and access them.

Support very similar to this could be added to s3.

Is that a feature you'd like to see?

Would you be interested in adding it? Or maybe your company might be interested in sponsoring the work?

Nick,
Regarding below:

Possibly I could add it. As background, I spent my career designing, creating, directing, and selling software for large corporations, using a wide variety of languages :-). Now retired, I am transitioning from Java to Go. I am very impressed by the vision of the core Go team, many of whom helped create Java, and who refuse to make some of the same mistakes again, like inheritance and generic types :-). Even in retirement I am still keeping my fingers in the pot, so to speak.

Requirements and issues

  • I am rolling your RClone release into a Docker container with a YAML configuration that I can move to my Synology NAS.

  • I am not keen on a "backup" solution that obfuscates file names and makes it impossible to manually retrieve a file from my back-end. I have seen software out there that obfuscates the files yet presents them in a nice GUI, albeit only the command-line version is on GitHub ;-(

  • My backend is Wasabi. They don't yet have a way to prune a bucket and remove older versioned items programmatically (i.e. S3 Lifecycle), but they are working on it.

  • Backup should make an attempt to handle ransomware.

  • The ideal solution would be an RClone mount config that specifies the back-end snapshot time, and then have the Apple Finder browse that snapshot in read-only mode.

My Next Steps

  1. I will be talking with Wasabi sales about their vision. Maybe they would be willing to help fund some of this.
  2. Since they don't have lifecycle rules yet, I am building an RCPrune module to implement lifecycle myself.
  3. An offshoot of this may be a "batch" way to recover a snapshot at a point in time and fold that into my RCSync Docker image.
  4. Look at your B2 idea.

WHat is driving this

I need to backup my Synology and laptops to my S3 backend. I am interested in a method that facilitates a ransomware backup and other natural disasters. I am not trying to replicate Apple TimeMachine on Wasabi:-)

More later and thanks for taking the time to respond. I will give you more information as this unfolds.

Actually storing the revisions on S3 should be quite straight forward.

The (quite basic) --b2-versions flag should be quite easy to implement and if I was doing it I'd do that for --s3-versions first. The file name handling for --b2-versions is abstracted into a library already.

I think --b2-versions might be more useful than a snapshot at a given time as it shows all the old versions in a read only way. I guess snapshot at a given time would be useful for recovery from a given time. If we decide to do that then I'd want it to work for b2 also.

Rclone is one file to one cloud storage object which is limiting in some respects, but I really like it for data integrity purposes.

The b2 backend has a cleanup command to remove old versions - this should be implemented for the s3 backend too.

good point and very important.

these would be a really great set of features.

also, to present a point-in-time read-only snapshot of a main folder and a set of --backup-dir folders: pick a date+time, and have rclone show a complete set of files for data recovery.

Yes RClone MUST be consistent!! I am not willing to give up consistency. If we cannot do a feature with consistency then don’t do it.

I am focusing on disaster recovery:

  1. Software, human mistakes or hardware failure
  2. Ransomware or data breaches

The software/human/hardware issue is why I think about backup. Too often in my career, someone changed something and everything was hosed. The database could no longer be used. Whether management liked it or not, DevOps needed to roll back to a known good state. Everyone would accept a Friday-morning restore point if the change went in Friday evening. Alternatively, telling management that restore was impossible would be a dire career move.

Ransomware is now very political. To recover from that, DevOps must roll back to just before the attack with minimal additional data loss.

In my case, I want to make it easy for user mistake corrections. Example: accidentally deleting the entire tax folder before tax submission. Putting the system back to the last good backup is better than starting over.

Apple Time Machine does this well. You get a graphical browser, and when you find the date you "like" you can easily copy the files/folders somewhere. The RClone mount could be "configured" to do something similar with minimal effort. This view must be read-only.

Again, Nick and others get to decide whether this is in the RClone business plan. We must NOT do something that will cause more headaches than benefits.

My $.02.

Note that rclone can already achieve this with --backup-dir or --suffix for saving old versions of files.

  • Agreed this nicely solves an entire directory delete.
  • Agreed this is a great first step for a disaster situation.

Conceptually, the user could create a different backup-dir for each run of rclone and store old versions to their heart's content. This would be a tad difficult to sort out manually.

Also note -- I am not saying RClone is incomplete and must change. I am more interested in the RClone team carefully reviewing product-management features for consistency, then implementing to that product-management vision, which can change over time. FYI -- RClone continues to be one of those cool finds that I recommend.

I plan on using Wasabi S3 versioning. I will sort out point-in-time restore somehow if it is ever needed. I have used that feature a few times on my Mac, but normally a backup is more likely used to correct an idiot-user deleting something, or the total corruption of, say, a Go source file :slight_smile: Backup-dir solves one level of individual-file rollback.
