Just wanted to triple-check regarding a Dropbox -> Box copy: lots of data (many TB) including tens of thousands of small files, which is where in my testing it's getting hung up.
Try running with -vv and check to see if you are getting retries.
Dropbox is very fussy about transactions per second - most people recommend --tpslimit 12 which will limit you to 12 transactions per second - that might be 12 files per second. If you exceed the limits with dropbox it sends punitively long timeouts to rclone which slow things down much more than --tpslimit 12. Note --tpslimit will apply to source and destination so this might not be exactly the right number (you could try double).
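To make that concrete, an invocation along these lines is a reasonable starting point (the remote names `dropbox:` and `box:` and the paths are placeholders for your own configured remotes):

```shell
# Throttle rclone to ~12 transactions/second to stay under Dropbox's
# rate limits. Since --tpslimit covers source and destination combined,
# 24 may be worth trying if 12 proves too conservative.
rclone copy dropbox:src box:dst \
  --tpslimit 12 \
  -vv 2>&1 | tee copy.log   # -vv makes retries and pacer messages visible
```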
Note that all the clouds are bad at lots of small files as each one will need an HTTP roundtrip which can take 1s or more.
You can try upping transfers which should help - dropbox doesn't mind lots of open connections. Not sure about box.
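A sketch of raising the parallelism alongside the tps cap (the numbers here are starting points to tune, not values from this thread):

```shell
# More parallel streams help offset the per-file HTTP round-trip cost
# of lots of small files; --tpslimit still bounds the request rate.
rclone copy dropbox:src box:dst \
  --tpslimit 12 \
  --transfers 16 \
  --checkers 16 \
  -v
```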
Now, I'm pretty sure this is against best practices, but I decided to rclone mount both dropbox and box (--tpslimit 10 on box) and just rclone copy "local" to "local", and will see how it goes.
I've been testing with either box or dropbox or both mounted - it doesn't seem to help much (except that it builds up a local cache, which makes sense). Will continue testing before I go down too many rabbit holes.
I tried mounting both and started seeing file I/O errors and hangs - and no throttling errors in the mount log. But I ran it without debug, so I'm not sure yet what the issue is. Will do more testing. Too many factors.
I'm testing with the lower-end Box business account also, and I suspect they throttle that more than usual in some ways.
Going back to basics, re the original sanity check: is there anything I'm missing here in terms of comparing a massive number of files? In most cases (I'm doing many individual users) I know the initial sync isn't complete. I'm basically replicating the settings I'm using on a different Box account for local (SMB) to Box, and that seems to handle checking hundreds of thousands of files much faster. And DOWNLOADING from Dropbox shouldn't really throttle me - I know when I use mount it certainly fills up the Dropbox VFS cache quickly.
I don't want to tie anybody up chasing ghosts - I'm just wondering if it comes down to having a lower-end Box account.
EDIT: I should have said, I just switched to using --check-first with these transfers. I think it's possible that the tps limit is forcing checks and transfers to step on each other. Will update, thanks.
--check-first will lower the transactions needed at any one time, so that's a good idea.
I think it's probably a question of running the sync as many times as it takes to run clean. Hopefully, with --check-first, as the sync gets nearly complete it will finish off the last remaining files with no problems.
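The repeat-until-clean approach might look like this (remote names are placeholders):

```shell
# --check-first finishes all checking before any transfer starts, so
# checker traffic and transfer traffic don't compete for the same
# --tpslimit budget. Re-run until the sync completes with no errors.
rclone sync dropbox:src box:dst \
  --check-first \
  --tpslimit 12 \
  -v
```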
Quick update - decided to do a check with --missing-on-dst. Had it run for about 24 hours and it generated a file with 200k entries (which is hardly complete), which I then fed to --files-from-raw.
Doing the next "check" phase and had a thought - IS there a faster way that would actually work with (say) just diff etc., if I do a listing of everything in source and dest?
UPDATE: testing this (in my scenario, large Dropbox -> Box).
So far --files-from-raw seems to be cranking along OK - still getting pacer errors, but it SEEMS to be working faster than when I hardcode TPS limits.
And --no-traverse started the copy instantly - not sure if there is anything else to optimize the process.
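Putting those two flags together, the copy of the missing-files list would look roughly like this (remote names and the list filename are placeholders):

```shell
# missing.txt holds one path per line, e.g. as written by:
#   rclone check dropbox:src box:dst --missing-on-dst missing.txt
# --files-from-raw reads it verbatim (no filter-syntax interpretation),
# and --no-traverse skips listing the destination, so the copy starts
# immediately instead of after a full directory walk.
rclone copy dropbox:src box:dst \
  --files-from-raw missing.txt \
  --no-traverse \
  -v
```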
In parallel, I'm doing a list of --dropbox-impersonate user folders one by one into Box user folders.
And the same method of lsf + diff seems to work for getting the initial sync complete. Not sure if there is a better / faster rclone-specific mechanism to do this more efficiently - maybe sort the file list somehow (say, by size)? Dunno, just spitballing.
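A minimal, runnable sketch of the lsf + diff idea, using fake listings in place of the real rclone output (`comm` needs sorted input, hence the `sort`):

```shell
# Simulate two listings. In real use these would come from:
#   rclone lsf -R --files-only dropbox:users/alice | sort > src.txt
#   rclone lsf -R --files-only box:users/alice     | sort > dst.txt
printf 'docs/a.txt\ndocs/b.txt\npics/c.jpg\n' > src.txt
printf 'docs/b.txt\n'                         > dst.txt

# comm -23 keeps lines only in src.txt, i.e. files missing on the
# destination; -13 would show extras on the destination instead.
comm -23 src.txt dst.txt > missing.txt
cat missing.txt

# missing.txt can then be fed straight back into a copy, e.g.:
#   rclone copy dropbox:users/alice box:users/alice \
#     --files-from-raw missing.txt --no-traverse -v
```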
I might try the main copy using the two mounts again and see how that goes.