Best options to copy hundreds of thousands of files using WebDAV

Hi

I'm trying to optimize copying hundreds of thousands of small files over the WebDAV protocol. The source is the WebDAV server and the destination is a local folder. Bandwidth doesn't seem to be a limiting factor.
The server seems to be IO limited.
I've tried different options, using --check-first and then --checkers=50, but with limited success.
It seems that scanning the source folder to check which files must be copied is so slow that the transfers can't be completed overnight. I can check around 20 GB in 60 minutes, so it will take 30 hours to check the full 600 GB, not counting the time to transfer the files.
Is there any optimization I could use to reduce this checking time?
Thanks
Fred.

Which WebDAV server software are you using?

You can use more --checkers, which will scan more directories at once, though I see you've tried that. If you think the server is IO limited, then using fewer checkers (the default is 8) might be a good idea, and --check-first is a good idea here too.
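
As a rough illustration of the flags mentioned above, here is a minimal Python sketch that assembles an `rclone copy` command line with `--check-first` and an adjustable `--checkers` value. The remote name `remote:files`, the destination `/data/backup`, and the helper `build_copy_command` are placeholders made up for this example, not anything from your setup.

```python
import shlex


def build_copy_command(source, dest, checkers=8, check_first=True):
    """Return the argument list for an rclone copy run.

    source and dest are placeholders; checkers=8 mirrors rclone's default.
    """
    cmd = ["rclone", "copy", source, dest, "--checkers", str(checkers)]
    if check_first:
        # Ask rclone to finish all the checks before starting any transfers.
        cmd.append("--check-first")
    return cmd


if __name__ == "__main__":
    # Hypothetical remote and local path, with fewer checkers than the default.
    cmd = build_copy_command("remote:files", "/data/backup", checkers=4)
    print(shlex.join(cmd))
```

The script only prints the command, so you can inspect it before running it by hand or passing the list to `subprocess.run`.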

Are you copying these files just once or repeatedly?

We use IIS with IT Hit WebDAV server (https://www.webdavsystem.com).
I will try with fewer checkers. I've tried --check-first, but it didn't change much.
The copy runs every night. Transferring should be fast since not many files change each day, but scanning the files is the real issue.

It does sound like you are IO limited. Has the server got HDD instead of SSD? Scanning directories on HDD is often quite time consuming.

It should have SSDs, but I'm wondering whether they are working correctly...

It might be interesting if you time rclone lsf -R remote: and rclone lsl remote: to see how long they take. The first does a quick scan of each filename, the second reads the size and date which can take longer. They should both be relatively quick...
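
If it helps, the two timings could be collected with something like the Python sketch below. It only wraps the two commands above with a wall-clock timer; `remote:` is the same placeholder as above, and the names `time_command` and `report`, along with the injectable `run` parameter, are my own inventions for the example.

```python
import subprocess
import time


def time_command(args, run=subprocess.run):
    """Run a command, discard its output, and return the elapsed seconds."""
    start = time.perf_counter()
    # stdout is discarded so that printing the listing does not skew the timing.
    run(args, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


def report():
    """Time the filename-only listing, then the size-and-date listing."""
    for args in (["rclone", "lsf", "-R", "remote:"],
                 ["rclone", "lsl", "remote:"]):
        elapsed = time_command(args)
        print(f"{' '.join(args)}: {elapsed:.1f} s")
```

Calling `report()` runs both listings one after the other and prints how long each took.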

I stopped rclone lsf -R remote: after it had been running for 2 hours, and it was far from finished. I would say it's slow...

Sounds like your WebDAV server is very slow if it takes 2h just to list the directories :(

Experiment with --checkers to find the value which is fastest for your server. You could try listing a subdirectory with a bit less stuff in it to do the experiment.
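
One way to run that experiment is sketched below in Python: it times a recursive listing of a smaller subdirectory for several `--checkers` values and reports the fastest. The path `remote:some/subdir`, the candidate values, and the function names are placeholders chosen for illustration; the timer is passed in as a parameter so the selection logic can be tried without a server.

```python
import subprocess
import time


def time_listing(target, checkers):
    """Time `rclone lsf -R` on target with the given --checkers value."""
    start = time.perf_counter()
    subprocess.run(
        ["rclone", "lsf", "-R", "--checkers", str(checkers), target],
        stdout=subprocess.DEVNULL,
        check=True,
    )
    return time.perf_counter() - start


def sweep_checkers(target, candidates, timer=time_listing):
    """Return (best_value, timings), where timings maps checkers -> seconds."""
    timings = {n: timer(target, n) for n in candidates}
    best = min(timings, key=timings.get)
    return best, timings


def main():
    # Hypothetical subdirectory and candidate values; adjust to taste.
    best, timings = sweep_checkers("remote:some/subdir", [1, 2, 4, 8, 16])
    for n, seconds in timings.items():
        print(f"--checkers {n}: {seconds:.1f} s")
    print(f"fastest: --checkers {best}")
```

Calling `main()` runs the sweep. Repeated listings of the same directory may benefit from caching on the server, so treat the numbers as a rough guide rather than a precise benchmark.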

@ncw my understanding is that rclone first scans all the files to check which ones should be transferred and then does the transfers.
Is there any option to check one file, transfer it, check the next one, transfer it, and so on?
I think that would make it slower overall, but it should help me.
Thanks

Rclone does the checking and transferring in parallel. --checkers sets the number of directories scanned in parallel, so you could try setting --checkers 1 to slow that part down.
