Support for returning an error from atexit handlers (and batch uploads)

We have implemented an experimental backend for rclone (remote is a digital preservation system, not open source yet) - and we have an unusual requirement: For "Put" operations, we need to gather all files first, before we can start uploading them, because we need to register a set of files first with the remote.

To implement this, we resorted to an "atexit" handler - that is, we "collect all Put operations" (e.g. from a large directory) first and then at "atexit" time do the actual upload; this works fine so far - however I found that it's not possible to return or set an error in an "atexit" handler.

Is that on purpose?

I'd imagine that atexit.Run could as well just return an error, e.g. here: rclone/cmd.go at c85fbebce6f7166350c79e11fae763c8264ef865 · rclone/rclone · GitHub?

In general, is there a better way to implement such kind of batch upload operations?

hello and welcome to the forum,

no idea is this is helpful but rclone can do batch uploads
https://rclone.org/dropbox/#batch-mode

Thanks for the hint! Yes, the dropbox backend gave us the inspiration to use atexit in the first place. It's similar, but not quite in that dropbox batcher runs continuously in the background - so a large number of "Put" ops maybe split up into a number of batches, whereas we need one large batch - which we only see when rclone is about to exit.

I think what you want to implement is the Shutdown method

// Shutdown the backend, closing any background tasks and any
// cached connections.
func (f *Fs) Shutdown(ctx context.Context) error {
        f.batcher.Shutdown()
        return nil
}

The atexit handler is more of a last resort thing, just in case the Shutdown handler didn't get called properly.

I'm not 100% happy with the lifecycle of backends - its evolved over the years and the lifecycle is a bit fuzzy.

I've written another backend (not released yet) which can upload to Zip files. This has exactly the same problem as the zip file needs to be finalized after the last file has been uploaded but if you don't finalize it you have a useless zip file.

1 Like

Thanks a lot for the context.

I tried to add a Shutdown method to the file system, but I am failing to see how (or when) this method is called in the regular application life cycle (also ensured _ fs.Shutdowner = (*Fs)(nil), added it to the features list, used Fill, etc. - but I cannot see it being called e.g. after a copy or sync operation - which is what I was expecting).

There must be something obvious I'm missing.

It might be that the call is missing! I'm not at my computer to grep the code right now but it should get called when then backends fall out of the cache.

Thanks again; maybe I am misunderstanding this a bit.

Shutdown is added in 11/2020, here: fs: add Shutdown optional method for backends · rclone/rclone@47aada1 · GitHub - various wrapper backends (like chunker, crypt, compress, ...) start to check for it and call Shutdown if the wrapped fs supports it. Looks all very good.

However, a non-wrapped fs that implements "Shutdown" never has this method called, unless it is wrapped.

For example; I added a dummy logging statement to Dropbox (rclone/dropbox.go at b9de37af80a6f794169a7029e460987149bcd28f · rclone/rclone · GitHub) and used a dropbox remote (w/o chunker, crypt, ...) and Shutdown is never called.

For our use case, it would be nice, if Shutdown would be called for both wrapped and unwrapped file systems - but not sure if that would follow the initial intention.

From a first glance, it seems that one way to add shutdown calls would be to add something like in most commands (not sure that's a good idea). Example from copy op:

cmd.CheckArgs(2, 2, command, args)
fsrc, srcFileName, fdst := cmd.NewFsSrcFileDst(args)
cmd.Run(true, true, command, func() error {
    ctx := context.Background()
    if srcFileName == "" {
        if err := sync.CopyDir(ctx, fdst, fsrc, createEmptySrcDirs); err != nil {
            return err
        }
    }
    err := operations.CopyFile(ctx, fdst, fsrc, srcFileName, srcFileName);
    if err != nil {
        return err
    }
    // Shutdown any remote, if it supports it.
    for _, f := range []fs.Fs{fsrc, fdst} {
        if f.Features().Shutdown != nil {
            if err := f.Features().Shutdown(ctx); err != nil {
                return err
            }
        }
    }
    return nil
})

I think there are two places Shutdown methods should be called

  • when a backend drops out of the cache (see fs/cache and lib/cache)
  • when rclone exits (so on Atexit handler)

The way this should work is we probably have just one routine which clears the backend cache, calling shutdown on each backend and returning any errors.

This could be called from cmd.Run itself which would be good so we can return the error back to the user.

It should also be called on Atexit as a last resort.

Shutdown needs to be called on each backend as it is removed from the cache due to normal cache expiry. Unfortunately this will require surgery to lib/cache also.

What do you think - sound OK?

Yes, this sounds good - thanks for considering this change.

As I see the surgery will most likely be around cache operations needing to return an error.

I started to implement a patch and will try to put it up on github shortly for review.

Great - look forward to it.

I put up a first PR regarding the first point:

when a backend drops out of the cache (see fs/cache and lib/cache)

For the record: The above PR (lib/cache: if cached value supports it, run Shutdown on removal by miku · Pull Request #6278 · rclone/rclone · GitHub) is merged (thanks!) and addressed the first requirement:

I think there are two places Shutdown methods should be called

  • when a backend drops out of the cache (see fs/cache and lib/cache)

I consider this question solved/closed, as this is a great step forward for implementing backends, that have extra work to do at shutdown time. The second idea:

  • when rclone exits (so on Atexit handler)

is still open (and from my perspective less urgent).

1 Like