Flag for disabling concurrent checks & copy

When using rclone for local network syncs on directories with a lot of files (1 million+), copying and checking interfere with one another.

A check-only run (nothing or only small files have changed) takes about 30 minutes. Files copy over at 80 Mbyte/s. But as soon as both run at the same time, copying and checking slow down to a crawl (copying a 15 GB file takes over 8 hours).

Please add a flag to pause checks while files are being copied, or something similar.

What is the problem you are having with rclone?

Copy and check slow down to a crawl when they run simultaneously in a single rclone command over a directory with a huge number (1 million+) of files.

What is your rclone version (output from rclone version)

rclone 1.51.0

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Linux

Which cloud storage system are you using? (eg Google Drive)

Local Network

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync /local/folder /mounted/network/folder --transfers 1 --checkers 1 -v -P -l

A log from the command with the -vv flag (eg output from rclone -vv copy /tmp remote:tmp)

Irrelevant

hello and welcome to the forum,
i am not clear what the exact problem is?

are you running multiple rclone commands at the same time?

do you know about these flags?
https://rclone.org/docs/#transfers-n
https://rclone.org/docs/#checkers-n

you should have created a post using the question template and answer those questions, so we can help you.
i have re-posted those questions for you.

What is the problem you are having with rclone?

What is your rclone version (output from rclone version)

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Which cloud storage system are you using? (eg Google Drive)

The command you were trying to run (eg rclone copy /tmp remote:tmp)

A log from the command with the -vv flag (eg output from rclone -vv copy /tmp remote:tmp)

I edited my first post.

are you running multiple rclone commands at the same time?

It's a single rclone instance that syncs a local folder (1 million+ files) with a local network folder. But while it copies a larger file, rclone runs the checks in the background. These two operations interfere with one another because the read/write heads of the source and destination disks are all over the place.

If I run the command on the sub-folders with the larger files in them (under 10,000 files), the copy itself runs at 80 Mbyte/s. Then I re-run the command on the directory I want to be synced and it finishes the 1 million+ checks in about 30 minutes. But if I have a large file that needs to be transferred, rclone will eventually find said file and start the copy process while the checks keep running. The transfer speed drops to 200 Kbyte/s, because the heads of the disks constantly switch between reading/writing the file that needs to be transferred and checking the existing files.

do you know about these flags?
https://rclone.org/docs/#transfers-n
https://rclone.org/docs/#checkers-n

Both flags, transfers and checkers, are set to 1

i am not a linux user and i do not use rclone for local copying.

now that you have posted the needed info, i am sure someone else can help you.

perhaps use two rclone commands.
rclone sync --size-only source dest
rclone check source dest

You could potentially slow it down.

--tpslimit-burst --tpslimit

But 2 threads (1 checker and 1 transfer) would be odd to impact your disk reads so much.

Are you copying from an olde-fashioned spinning HDD?

This would be relatively easy, but it would involve buffering info about the files which need copying in memory. That isn't enormous, say 1k per file which needs to be transferred. This mode would need to do something when --max-backlog was reached - maybe start the transfers anyway.
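To make the idea above concrete, here is a minimal sketch of "buffer the files that need copying, and start anyway when the backlog is full". It is only an illustration in shell, not how rclone is or would be implemented; the function name defer_transfers and the early-batch/final-batch labels are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical names, not rclone's implementation.
# Reads paths that need transferring on stdin (one per line), as the
# "check" phase would discover them, and prints them in batches.
# A batch is released early whenever the backlog reaches max_backlog;
# whatever is left is released once the input (the check phase) ends.
defer_transfers() {
    max_backlog=$1
    count=0
    batch=""
    while IFS= read -r path; do
        # Append the path to the in-memory backlog.
        batch="${batch}${batch:+ }${path}"
        count=$((count + 1))
        if [ "$count" -ge "$max_backlog" ]; then
            # Backlog full: start these transfers before checking is done.
            printf 'early-batch: %s\n' "$batch"
            batch=""
            count=0
        fi
    done
    if [ "$count" -gt 0 ]; then
        # Checking finished: transfer the remaining backlog.
        printf 'final-batch: %s\n' "$batch"
    fi
}
```

With a backlog limit of 2 and three files to transfer, the first two would be released early and the third after the checks finish. At roughly 1k of buffered info per file, a backlog of a million pending files would be on the order of 1 GB of memory, which is why some limit is needed.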

Suggestions for names for the flag?

One thing I could do is make an output mode for rclone check which outputted files which needed transferring in a format suitable for the --files-from parameter.

You could then do something like

rclone check source dest --differing-files-output > filez
rclone copy source dest --files-from filez --no-traverse

If you were willing to do a bit of scripting you could munge the output of check as it is at the moment into a format suitable for --files-from - that might be useful as an experiment to see whether it is worthwhile implementing this output flag or the deferred-copy flag above.
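A rough sketch of that munging step might look like the following. It assumes that rclone check logs each differing or missing file as a line of the form "date time ERROR : path: reason" (for example "Sizes differ" or "File not in ..."); that format is an assumption here and should be verified against the output of your own rclone version. The function name check_log_to_files_from is made up for this example, and paths that themselves contain ": " may not be parsed correctly.

```shell
#!/usr/bin/env bash
# Sketch: turn `rclone check` log lines into a list usable with --files-from.
# ASSUMPTION: differing/missing files are logged as
#   "<date> <time> ERROR : <path>: <reason>"
# where <reason> ends in "differ" or starts with "File not in".
check_log_to_files_from() {
    # Keep only the per-file ERROR lines, strip the log prefix and the
    # trailing reason, and de-duplicate the resulting paths.
    grep -E ' ERROR : .+: (.+ differ|File not in .+)$' |
        sed -E -e 's/^.* ERROR : //' \
               -e 's/: ([^:]+ differ|File not in .*)$//' |
        sort -u
}

# Possible usage (rclone writes its log to stderr, hence 2>&1):
#   rclone check /local/folder /mounted/network/folder 2>&1 \
#       | check_log_to_files_from > filez
#   rclone copy /local/folder /mounted/network/folder --files-from filez --no-traverse
```

Note that files which exist only on the destination also show up as "File not in ..." lines, so the list may contain paths that copy cannot find on the source. Those would need filtering out if they matter for your run.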

I'm fine with scripting and this seems reasonable. I will give it a try.

Thanks

Are you copying from an olde-fashioned spinning HDD?

Yeah, 16 TB SSDs don't come cheap. :nerd_face:

Let me know if it helps. I'm open to adding one or more flags to rclone to help with this, but I'd like to know if it will be worthwhile first!

Personally i'd think it would be useful. It would also be useful to be able to pipe the list in with a dash for standard in.

rclone check source dest --differing-files-output | rclone copy source dest --files-from - --no-traverse

But that may be more tricky.

That is a nice idea! --files-from - would be great for scripting. Want to please make a new issue on github with that idea in? It shouldn't be too difficult.

I added it.

Also would have helped here:

@RadarOReily
These scripts are written for something else, but do pretty much what you are describing. There are a few variations of difflist here, all using rclone commands and --files-from.

Coincidentally I also wrote just yesterday a bash script for a friend who wanted to feed 1 folder or 1 file at a time. Here is the simple version:

#!/usr/bin/env bash
# USAGE ./rc_one sync src: dest:  <= change sync to copy/move as needed

action=$1
src=$2
dest=$3

# Run one rclone command per top-level directory, in reverse-sorted order.
while IFS= read -r name; do
    rclone "$action" "$src$name" "$dest$name" -vP
done < <(rclone tree -d -i --level 1 --full-path --noreport "$src" | sort -r)

Remove -d if you want to process files, not just folders. And adjust or remove --level n depending on your needs (in his case he wanted to process one folder at a time, one level down).

And of course add other rclone flags like -vP.

I'm curious ncw/calisro - would --fast-list and/or something like --max-backlog=5000000 not help with the interference he describes? Specifically, I'm curious if --fast-list is useful or not with doing this kind of checking.

Separately, we submitted this issue to github a year or so ago for a similar challenge, suggesting a flag for rclone check that outputs names only. After which with your help I created the difflist/diffmove scripts. :wink: I'll link that github issue if I can find it.

--fast-list can help, yes. However, rclone now uses it automatically in quite a few places; for instance, if you do rclone ls or rclone lsf -R you'll be using --fast-list if the backend supports it.

Ah yes I vaguely remember that! Do link it if you find it!
