Behaviour of move with --ignore-existing flag

What is the problem you are having with rclone?

When moving files to Google Drive, I'd like to avoid uploading files which already exist on the remote. Thus I added the --ignore-existing flag to the move command. However, I'd also like this command to remove such duplicates from the source. It seems this was the default behaviour of --ignore-existing until this discussion: Unexpected dangerous behaviour with: move --immutable and --ignore-existing flags

Maybe I'm not aware of how to set this up correctly. Otherwise it would be great if there were a switch/command giving more control over how local files already existing on the remote are handled.

What I'm looking for is this:

    source  remote   --(rclone)-->   source  remote
    A       A                        -       A
    B       B                        -       B
    C       -                        -       C

Run the command 'rclone version' and share the full output of the command.

rclone v1.57.0

  • os/version: ubuntu 20.04 (64 bit)
  • os/kernel: 5.4.0 (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.17.2
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone move --drive-stop-on-upload-limit --drive-use-trash=false --verbose --no-traverse --no-update-modtime --transfers 3 --checkers 3 --contimeout 60s --timeout 300s --retries 3 --low-level-retries 10 --stats 3s -P --fast-list --ignore-existing --drive-chunk-size 256M

I think you want rclone move without the --ignore-existing flag. This checks to see if something exists in the destination, and if it does it deletes the source without moving it.
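If you want to check what plain move would decide before committing to it, --dry-run will log the transfers and deletions without actually doing them (the paths below are just placeholders):

    rclone move /path/to/local drive:files --dry-run -v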

Is that what you want?

Thanks for the reply!
Yes and no. Sorry, I should have been more specific regarding my motivation for using the flag :grimacing:
I added the --ignore-existing flag to avoid overwriting remote files that are larger than their local counterparts with the same name.
What I'm looking for is a way to make sure that only the largest available file version ends up on the remote, with all local variants removed. As google drive is fine with duplicates, it would also be ok when files with identical names but different sizes are stored on the remote. Come to think of it, I may be able to work around this by using --suffix and then sorting through files on the remote :thinking:
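Something like this is what I have in mind (paths and the .old suffix are just placeholders); as I understand it, --suffix renames the existing remote file in place before the new version is uploaded, so both copies end up on the remote:

    rclone move /path/to/local drive:files --suffix .old --dry-run -v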

I see!

You could probably do this with rclone dedupe.

Let's say your existing files are at drive:files.

  1. First start by uploading potential new files to a new directory

    rclone move /path/to/new/files drive:incoming

  2. Now use the Google Drive web frontend to move all the files from incoming to files, creating duplicate files in files

  3. Then use rclone dedupe to remove the duplicate names, choosing the largest one

    rclone dedupe --dedupe-mode largest drive:files

I think that procedure will work, with the unfortunate manual step in 2. However, rclone already has an internal function to do step 2, which isn't exposed on the command line but easily could be.
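If you want to preview step 3 first, --dry-run should work with dedupe as well (assuming the layout above with everything in drive:files):

    rclone dedupe --dedupe-mode largest --dry-run drive:files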

Using --suffix would work but then you'd have the different files with different suffixes which you'd need to sort out with a script.
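For example (assuming a .old suffix), you could list the renamed copies with their sizes and then delete the ones you don't want to keep:

    rclone lsf --include "*.old" --format "sp" drive:files
    rclone delete --include "*.old" drive:files --dry-run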

Thanks for the hints! I came up with another workaround:
By using --size-only --ignore-size, files should only be compared by name. Thus, larger versions already present on the remote are kept. Any smaller local files with the same name will be deleted without getting uploaded again. Naturally this only works if the larger version is stored on the remote.
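The full command would look roughly like this (paths are placeholders, with --dry-run for a first check): since files are matched by name alone, an existing remote copy counts as already transferred and the local one is deleted without being uploaded.

    rclone move /path/to/local drive:files --size-only --ignore-size --dry-run -v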

Nevertheless, having a flag/switch to control how files with identical names but different sizes are handled would be awesome. And bringing back the option to have --ignore-existing delete files in case they are already stored on the remote - for some reason "move" on its own doesn't quite seem to do this (maybe due to slightly different metadata, file dates, etc.).

