Golang Filter for CopyDir

What is the problem you are having with rclone?

I am working from an older post to add filtering to the source remote when calling sync.CopyDir. The filter.Opt struct seems to have changed since that post, so I guessed that an include rule is now added through filter.RulesOpt.IncludeRule. My tests keep showing that no filtering happens.

I create the filter, put it into the context with filter.ReplaceConfig, and pass that context to CopyDir. I expect no files to be found or copied, but every file present gets copied.

Run the command 'rclone version' and share the full output of the command.

v1.62.2

Which cloud storage system are you using? (eg Google Drive)

The filtered source remote is SFTP while the destination is Google Cloud Storage.

The command you were trying to run (eg rclone copy /tmp remote:tmp)

	var DefaultOpt = filter.Opt{
		MinAge:  fs.DurationOff,
		MaxAge:  fs.DurationOff,
		MinSize: fs.SizeSuffix(-1),
		MaxSize: fs.SizeSuffix(-1),
		RulesOpt: filter.RulesOpt{
			IncludeRule: []string{"do-not-match-any-file.csv"},
		},
	}

	firmFilter, err := filter.NewFilter(&DefaultOpt)
	if err != nil {
		return nil, err
	}
	filteredCtx := filter.ReplaceConfig(ctx, firmFilter)

	// setup remotes fsource and fdest for the copy

	err = sync.CopyDir(filteredCtx, fdest, fsource, true)
	if err != nil {
		return nil, fmt.Errorf("could not copy dir: %w", err)
	}
	log.Printf("Done copying files.\n")

The rclone config contents with secrets removed.

#  SFTP Details
[sftp]
type = sftp
host = sftp-static.host.name
user = username
port = 22
key_file = sftp.pem
use_insecure_cipher = true
md5sum_command = none
sha1sum_command = none
shell_type = unix

# GCS Bucket
[gcs]
type = google cloud storage
project_number = 12345
service_account_file = gcs.sac
anonymous = false
object_acl = projectPrivate
bucket_acl = projectPrivate
location = us-east1


While I was isolating a standalone sample that others could run, the filtering started working. However, the source listing, which I use to find out which files are being copied, is still not filtered.

Questions:

  • Is there a way to make the fsource.List(filteredCtx, "") call return filtered results?
  • Is there a way to get the list of files touched by the CopyDir operation?

Updated: to note the filtering is working

Here is the standalone sample. Filtering works for me in CopyDir, but the source listing is not filtered.

package main

import (
	"context"
	"fmt"
	"log"

	_ "github.com/rclone/rclone/backend/googlecloudstorage"
	_ "github.com/rclone/rclone/backend/local"
	_ "github.com/rclone/rclone/backend/sftp"
	"github.com/rclone/rclone/fs"
	"github.com/rclone/rclone/fs/config"
	"github.com/rclone/rclone/fs/config/configfile"
	"github.com/rclone/rclone/fs/filter"
	"github.com/rclone/rclone/fs/sync"
)

func RcloneQuestion(rcloneConfigPath string, sourceRemote string, targetRemote string) ([]string, error) {
	ctx := context.Background()

	config.SetConfigPath(rcloneConfigPath)
	configfile.Install()

	var filterOpts = filter.DefaultOpt
	filterOpts.RulesOpt = filter.RulesOpt{
		IncludeRule: []string{"esb_*.csv"},
	}

	firmFilter, err := filter.NewFilter(&filterOpts)
	if err != nil {
		return nil, err
	}
	filteredCtx := filter.ReplaceConfig(ctx, firmFilter)

	fsource, err := fs.NewFs(filteredCtx, sourceRemote)
	if err != nil {
		return nil, fmt.Errorf("could not create source remote '%s': %w", sourceRemote, err)
	}

	fdest, err := fs.NewFs(filteredCtx, targetRemote)
	if err != nil {
		return nil, fmt.Errorf("could not create target remote '%s': %w", targetRemote, err)
	}

	entries, err := fsource.List(filteredCtx, "") // result entries are not filtered 
	if err != nil {
		return nil, fmt.Errorf("could not list source remote '%s': %w", sourceRemote, err)
	}

	log.Printf("Found %d files to copy.\n", len(entries))

	err = sync.CopyDir(filteredCtx, fdest, fsource, true)
	if err != nil {
		return nil, fmt.Errorf("could not copy source remote '%s' to target remote '%s': %w", sourceRemote, targetRemote, err)
	}
	log.Printf("Done copying files.\n")

	result := make([]string, len(entries))
	for i, entry := range entries {
		result[i] = entry.String()
	}
	return result, nil
}


The filtering is done in the walk layer, so use walk.List and friends if you want filtered listings.

With -v you'll see the files transferred and -vv you'll see all the files considered.

Thanks for the pointer. I ended up using walk.Walk to get the filtered source directory.

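// Requires the additional import "github.com/rclone/rclone/fs/walk".
func walkSourceRemote(ctx context.Context, fsource fs.Fs) ([]string, error) {

	result := make([]string, 0)

	// includeAll is false, so walk.Walk applies the filter stored in ctx.
	err := walk.Walk(ctx, fsource, "", false, 0, func(path string, entries fs.DirEntries, err error) error {
		// Collect the string form of every entry that passed the filter.
		for _, entry := range entries {
			result = append(result, entry.String())
		}
		// Returning a non-nil listing error stops the walk.
		return err
	})

	if err != nil {
		return result, fmt.Errorf("could not list filtered source remote directory: %w", err)
	}

	return result, nil
}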
