An hour to check 240,000 files sounds very slow, if you already synced everything once before and have fast server with a fast low latency internet connection. I can check 50,000 files in less than 2 minutes.
Has (almost) everything been synced before? What is your latency? What is your internet speed? How many folders do you have? How deep is your folder structure?
It's hard to re-execute these because it takes so long for it to finish. I'm an hour and 30minutes in and the -vv output is just displaying what I have screenshot. I'll need to wait for it to finish before I can add the --log-file file.
Every now and then it says this:
but then it just keeps going, with no additional details.
Gigabit line, Dell r720.
Everything has synced as far as I can tell. I started using --fast-list after the initial sync after reading it would increase performance where there where a huge number of directories.
I decided to cut the current process and start it again with out --fast-list and now it's actually displaying file names and displaying comparison information, whereas before with --fast-list all it said was what is in the screenshot (I solemnly swear)
Edit: Folder Count is 109,000
Edit: It's been 5 minutes and we are on the C's out of of about 400 parent folders A-Z with a fairly even distribution. A this rate it will still take an hour+. On the bright side there's actual feedback now.
Edit: Current Check rate out of 240k:
Transferred: 0 B / 0 B, -, 0 B/s, ETA -
Checks: 1804 / 1804, 100%
Elapsed time: 1m0.1s
Google Drive seems to have a limit on the number of directory listings you can make per second.
I have roughly 3,000 folders that I can check in a little less than 2 minutes. Your 109,000 folders would then take roughly 1 hour if checked at the same rate (ignoring differences in number of files etc.)
This and other similar limits are typical for cloud storage having a fixed price per GB such as Google Drive, OneDrive, Dropbox, … The (relatively) low price per GB comes with a (somewhat hidden) limitation on your check and update speeds.
I use these Google Drive tuning parameters: --checkers=16 --drive-pacer-min-sleep=10ms in my syncs.
I suggest you try them individually and together; they may improve your speed somewhat (but don’t expect miracles).
hi, this is what i had suggested, did you try both values?
and about that last run, there were two errors.
it could be that rclone sync will try the entire sync for each error.
need to look into the debug log for retries
Sorry was going to try that but wanted to figure out how long it takes without the --fast-list modifier so I could have reference point. It took 5.5 hours.
Adding the mods that @Ole mentioned it's now working at 5x increased speed. I will then add your suggested modifier so I have a good spread. Thanks your help!
sorry to double post, but I would like to additionally report that there appear to be no more pacer mentions after enacting @ole's suggested modifiers. They're completely gone, but it's still trucking away.
@phatmandrake Try searching/filtering/grepping your log file for "ERROR :" then you will see the name of the troubling file, the error message and 3 times "ERROR : Attempt x/3 failed with ...".
The 3 attempts correspond to the default value of --retries.
Adding --drive-pacer-burst=200 had no noticeable affect.
Transferred: 0 B / 0 B, -, 0 B/s, ETA -
Errors: 2 (retrying may help)
Checks: 723339 / 723339, 100%
Elapsed time: 53m58.3s
2022/03/17 20:41:41 DEBUG : 34 go routines active
2022/03/17 20:41:41 Failed to sync with 2 errors: last error was: directory not found
Errors
ERROR : <SNIP> error reading source directory: directory not found
ERROR : <SNIP> Failed to copy: failed to open source object: Access is denied.
...
...
2022/03/17 20:41:41 DEBUG : Google drive root '': Waiting for checks to finish
2022/03/17 20:41:41 DEBUG : Google drive root '': Waiting for transfers to finish
2022/03/17 20:41:41 ERROR : Google drive root '': not deleting files as there were IO errors
2022/03/17 20:41:41 ERROR : Google drive root '': not deleting directories as there were IO errors
2022/03/17 20:41:41 ERROR : Attempt 3/3 failed with 2 errors and: failed to open source object: Access is denied.
It does look like it attempts a full sync at least 2 times because of these errors. What do I do about that
Start Time: 19:47:42
First Retry: 20:05:44
Based on this it finished in 18 minutes, which is a lot better.
Edit: Honestly based off the quota information I'm nowhere near the max queries per 100 second maybe I can up the checkers ??
Seems to be an issue with your access rights to a local directory/folder. I assume you know the name and have redacted your post.
You basically have two options:
give yourself the necessary access rights
exclude the folder from the sync by adding --exclude=”your/folder/**”
Ad 2) you can test your filter with rclone lsl. Please consider a --dry-run when adding/changing the filter to your sync.
The quota only shows your limit on requests in general, there are also other (hidden) limits such as the number of directory listings you can do per second (which is considerably lower).
So yes perhaps, you may get better results with --checkers=32, but not necessarily it may also cause Google to throttle and thereby lower your speed. It is a delicate balance. I suggest you test and trim in small steps when we have fixed the above error message.
I can't statically exclude any directories unfortunately. Is there a way to tell rclone to just ignore any errors and continue with all operations anyway? I'd rather something be missing than for it to not delete any files or directories because 1 or 2 are inaccesible during a sync. There are dozens of people working on the directory tree at any point in time so access errors are bound to happen periodically.
How is that any different from what it's doing now. Based on the log: It makes it sound like it's already not deleting anything.
2022/03/17 20:41:41 ERROR : Google drive root '': not deleting files as there were IO errors
2022/03/17 20:41:41 ERROR : Google drive root '': not deleting directories as there were IO errors
Well, it depends on the probabilities of clean run.
If chances are close to 0% of a successful attempt then there is no difference, if it is 50% then 3 attempts in a row will theoretically give you a 87.5% chance of success.
Thus far the errors are based on a permission issue, which will occur periodically and for ever retry until resolved (The user that I'm syncing as has full admin access, but NTFS permissions can get messed up to the point even an admin can't touch the file until permissions are restored). I'm really concerned about why RClone decides that no files can be deleted if it has issue with just one
If I embed RClone into a script I can probably come up with a way of resolving a permission error and maybe doing a targeted sync on the single failure.
So to clarify there is no way to get rclone to delete all files except for ones related to the one it says it's having issues with? just trying to understand that behavior.