When copying or syncing a large number of files and directories, is there some way to control whether the transfers operate depth-first or breadth-first? Failing that, what is the expected traversal order currently?
In a depth-first sync I would expect files 4,5,6 and 10,11,12 to be transferred before exiting out of foo/bar and starting on foo/buzz.
In a breadth-first copy I would expect 4,5,6 and 7,8,9 to be transferred before 10,11,12.
I hope I'm explaining this clearly. I apologize if this is covered elsewhere; I did search the docs and the forum but was unsuccessful. Thank you for helping me learn more!
The answer is that the order isn't defined. By default rclone will run 8 (controlled by --checkers) directory traversals at once, so anything could happen! If you set --checkers 1 then you'll get a breadth-first copy (roughly) but even that can be disrupted by concurrency within rclone.
If you care about the order in which files get transferred, then investigate the --order-by flag. If you want perfect ordering, don't forget --check-first, which does the entire directory traversal before uploading any files.
Thanks Nick! I appreciate your explanation. Is it conceivable that --check-first --order-by name could provide a depth-first transfer? Naively I am assuming this could ensure sub-directories would be traversed before the next parent's sibling, etc.
You're right, it's not depth-first if we define that as transferring the deepest items first, but it seems at least defensibly "depth-first" as the traversal is deterministic and will stay confined to the parent until completion, right?
Thanks for the idea of contributing. My Go is a little amateurish but I'll have a noodle and see if I get to a PR.
To implement this you want to add a new case here, say "depth", and add a new sort function.
I'd sort by the number of / first and, if they are equal, by the full paths (the output of a.Src.Remote()), which will keep the sort order deterministic. You can count the number of / with strings.Count.
That should be enough to get you started. There are tests in pipe_test and docs in docs/content/docs.md but fundamentally it should be a 6-line patch to the code.