Can you disable copy (only delete files) when running sync?

What is the problem you are having with rclone?

I want files and directories on the destination to be deleted if they are not on the source. rclone sync does that, but I don't want files that are missing on the destination to be copied from the source. I am currently running this command: rclone sync source: dest: --delete-before --min-size 99P. According to the wiki page: --min-size SizeSuffix | Only transfer files bigger than this in KiB or suffix B|K|M|G|T|P (default off). So I would expect files to be deleted on the destination and only files bigger than 99 PB to be copied. However, --min-size also affects the files being deleted, so only files bigger than 99 PB get deleted. Is there a way around this?

Run the command 'rclone version' and share the full output of the command.

rclone v1.61.1

  • os/version: ubuntu 18.04 (64 bit)
  • os/kernel: 4.15.0-196-generic (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.19.4
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync source: dest: --delete-before --min-size 99P

The rclone config contents with secrets removed.

[mujaysen]
type = drive
client_id = REMOVED
client_secret = REMOVED
scope = drive
token = REMOVED
team_drive = REMOVED
root_folder_id =

[mikkermijay]
type = drive
client_id = REMOVED
client_secret = REMOVED
scope = drive
token = REMOVED
team_drive = REMOVED
root_folder_id =

Hi Michael,

Perhaps --max-transfer=1B can do the trick? (I haven't tried)

I think you can do it like this. First list the files in the source and destination.

rclone lsf --files-only -R source: | sort > src.txt
rclone lsf --files-only -R destination: | sort > dst.txt

What you want to do is remove files on the destination if they are not in the source.

You can find these with the comm tool

comm -13 src.txt dst.txt > to-delete.txt

When you are happy with the contents of to-delete.txt then

rclone delete --dry-run --files-from to-delete.txt destination:

Then remove --dry-run when confident.

Thank you for your responses

@Ole Thank you. That did the trick! As soon as it finds a file that needs to be copied, it stops. With --min-size it would (if it worked) go through all the files before stopping.

@ncw Thank you. I was thinking about doing something similar as this will stop immediately after deleting the files. I would just add --rmdirs to remove empty directories as well

Both solutions work. When working with many files and dirs, I think the second option is better, but with fewer files, it would be faster to just use --max-transfer=1B

Thanks, happy to hear, I like your idea and concept!

Here is the full command for anybody needing a similar trick:

rclone sync source: dest: --delete-before --max-transfer=1B

Okay, so there are at least 2 solutions to this problem

The first one, given by @Ole, will first find and delete files. Then it will look for files that need to be copied, and as soon as it finds one, the sync will stop. It is easy and reliable; we just add the flags --delete-before and --max-transfer=1B to our sync operation like this:

rclone sync source: dest: --delete-before --max-transfer=1B

--delete-before: Will delete files before copying any files
--max-transfer=1B: Will stop the operation after 1 byte has been transferred
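
If this command runs from a script or cron job, the early stop may show up as a non-zero exit status. Below is a minimal sketch of a wrapper that treats that case as success. It assumes exit code 8 is what rclone reports when the --max-transfer limit is reached (that is how the rclone documentation lists it, but it is worth confirming on your version). The function name delete_only_sync and the RCLONE override variable are hypothetical and only there for illustration.

```shell
# Hypothetical wrapper, not part of rclone.
# RCLONE can point at another binary; it defaults to `rclone`.
delete_only_sync() {
    rc=0
    "${RCLONE:-rclone}" sync "$1" "$2" --delete-before --max-transfer=1B || rc=$?

    # Assumption: 8 is the documented exit code for
    # "limit set by --max-transfer reached".
    if [ "$rc" -eq 8 ]; then
        echo "Stopped at the transfer limit; deletions ran first." >&2
        return 0
    fi

    # Any other status (including 0) is passed through unchanged.
    return "$rc"
}
```

Usage would be something like delete_only_sync source: dest: after the function has been defined or sourced.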

This also means that if you have many files, it will keep looking for a file that needs to be transferred, and only stop the operation once it finds one. So to actually ONLY remove files, and not spend time looking for a file that can be transferred, we can instead use the solution given by @ncw, which is a bit more complicated and only works on Linux (it might be possible to port it to Windows as well)

We will first need to save a sorted list of all files in source and destination like this:

rclone lsf --files-only -R source: | sort > src.txt
rclone lsf --files-only -R destination: | sort > dst.txt

rclone lsf: List objects and directories in easy to parse format
--files-only: Only list files
-R: Recursive
| sort > src.txt: Pipe to sort, and save as txt file (Only on Linux)

Next we will compare the 2 text files. We want to find lines in 'dst.txt' that are not in 'src.txt', and for that we can use comm (Linux) like this:

comm -13 src.txt dst.txt > to-delete.txt

-13: Return the files that are in the 'dst.txt' but not in the 'src.txt'
> to-delete.txt: Save the result in a txt file
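
To see what comm -13 does before pointing it at real listings, here is a small self-contained sketch with made-up file names standing in for the output of rclone lsf. Setting LC_ALL=C is a precaution I added so that sort and comm agree on ordering; it is not part of the original steps.

```shell
# Use one collation order for both sort and comm (precaution, an assumption of this sketch).
export LC_ALL=C

# Toy listings standing in for `rclone lsf --files-only -R` output.
workdir="$(mktemp -d)"
printf '%s\n' 'a.txt' 'docs/b.txt' 'docs/c.txt' | sort > "$workdir/src.txt"
printf '%s\n' 'a.txt' 'docs/b.txt' 'docs/old.txt' 'stale.bin' | sort > "$workdir/dst.txt"

# -1 hides lines found only in src.txt, -3 hides lines found in both,
# so only the lines unique to dst.txt remain.
comm -13 "$workdir/src.txt" "$workdir/dst.txt" > "$workdir/to-delete.txt"

cat "$workdir/to-delete.txt"
# Prints:
# docs/old.txt
# stale.bin
```

docs/c.txt exists only on the source side, so it is not listed; nothing is copied in this workflow.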

Now we will only need to delete the files in 'to-delete.txt' and to do that we use rclone delete with --files-from and --rmdirs flags like this:

rclone delete --dry-run --files-from to-delete.txt --rmdirs destination:

--dry-run: Test run. When satisfied, remove this flag to actually delete the files
--files-from to-delete.txt: Only delete files within 'to-delete.txt'
--rmdirs: Remove empty directories

These are not my own findings, and I would never have found this out without @Ole and @ncw - A big thank you to both of you

I have written the complete bash code based on the answer from @ncw

#!/bin/bash

# Where to save temp files? (Use $HOME: "~" is not expanded inside quotes)
temp_files_path="$HOME/temp_files"
mkdir -p "$temp_files_path"
# Source drive + path
source_path="source:my/path"
# Destination drive + path
destination_path="destination:my/path"

echo "Listing files on source (${source_path})"
rclone lsf --files-only -R "${source_path}" | sort > "$temp_files_path/src.txt"

echo "Listing files on destination (${destination_path})"
rclone lsf --files-only -R "${destination_path}" | sort > "$temp_files_path/dst.txt"

echo "Comparing source files with destination files"
comm -13 "$temp_files_path/src.txt" "$temp_files_path/dst.txt" > "$temp_files_path/to-delete.txt"

echo "Files to delete from ${destination_path}"
echo "-----------------------------"
cat "$temp_files_path/to-delete.txt"
echo "-----------------------------"
read -r -p "Press Enter to delete the above files on ${destination_path} (Ctrl+C to abort)"

echo "Deleting the files on ${destination_path}"
rclone delete --files-from "$temp_files_path/to-delete.txt" --rmdirs "${destination_path}"

# Clean up
echo "Removing temp dir"
rm -r "$temp_files_path"

echo "All done"
exit 0

You can just copy & paste the code, edit the 3 variables (temp_files_path, source_path, destination_path) to your needs and save it as delete.sh. Then you need to run chmod +x delete.sh before finally running the file with this command: ./delete.sh. The program will list the files to be deleted and wait for your confirmation before it starts deleting anything
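
One thing to watch for with this approach: because rclone lsf is piped straight into sort, a failed listing of the source (for example an expired token) leaves an empty src.txt, and every file on the destination then ends up in to-delete.txt. A possible hardening, sketched below, writes each listing to a file first and stops if either listing fails. The function name list_extras and the RCLONE override variable are hypothetical, and mktemp -d is my substitution for the fixed temp directory.

```shell
# Hypothetical helper, not part of rclone: print files that exist only on the destination.
# The body runs in a subshell, so its variables do not leak into the caller.
# RCLONE can point at another binary; it defaults to `rclone`.
list_extras() (
    tmp="$(mktemp -d)" || exit 1
    status=0

    # Save each listing before sorting so a failed listing is not hidden behind `sort`.
    "${RCLONE:-rclone}" lsf --files-only -R "$1" > "$tmp/src.raw" &&
        "${RCLONE:-rclone}" lsf --files-only -R "$2" > "$tmp/dst.raw" &&
        sort "$tmp/src.raw" > "$tmp/src.txt" &&
        sort "$tmp/dst.raw" > "$tmp/dst.txt" &&
        comm -13 "$tmp/src.txt" "$tmp/dst.txt" || status=$?

    # Always clean up the private temp dir, then report success or the first failure.
    rm -rf "$tmp"
    exit "$status"
)
```

For example, list_extras source:my/path destination:my/path > to-delete.txt only produces a usable list if both listings succeeded; the result can then be reviewed and passed to rclone delete --files-from as before.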

Great explanation, just want to explain how rclone sync is a bit more advanced than you seem to think and therefore typically will be the fastest.

rclone sync does essentially the same as this when using --delete-before, except that rclone sync:

  • uses the equivalent of rclone lsl rather than rclone lsf, so size and time are listed in the same sweep and determining transfers doesn't take extra time on Google Drive and on most other remotes.
  • lists source and destination in parallel to save time (best case halving the time)
  • sorts and compares file name, size and time per folder while waiting for the next folder list(s) from Google Drive.
  • deletes files in parallel with the listing, sorting and comparison of folders.
    You may therefore see the deletes complete before all folders have been listed/checked for possibly remaining deletes.
  • performs everything in RAM - no need for temporary files

rclone sync will therefore typically be able to complete in a time that is less than the sum of time used by just these two commands:

rclone lsf --files-only -R destination:
rclone delete --dry-run --files-from to-delete.txt --rmdirs destination:

This assumes the destination (Google Drive) is the slowest remote to list.

Thanks to Nick for this impressive and versatile sync engine!
