Max Efficiency When Downloading From Dropbox

I need to move ~30TB+ from Dropbox to S3. I’m running rclone v1.45 on a CentOS EC2 instance of type c5n.18xlarge, which AWS lists as having 100 Gbps of network throughput. I also enabled Transfer Acceleration on my S3 bucket, and rclone is configured to use that specific endpoint, with VPC endpoints defined. I launch rclone with this command:

rclone -v sync MBLive:/David\ Bitton/ MBLive_S3:/mblive-s3/  --stats-one-line -P --stats 2s

where the source is Dropbox and the destination is S3. The stats are showing 92.461 MBytes/s. a) How do I know which end of the sync is my limiting factor, and b) how can I improve on this? Thanks.
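For scale, here is a rough back-of-the-envelope check of those numbers. It assumes rclone’s "MBytes/s" means MiB/s (1,048,576 bytes per MByte) and treats the ~30TB as 30 × 10^12 bytes; both are assumptions, and the helper names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; not part of rclone.

# eta_days TOTAL_TB RATE_MIB_PER_S
# Days needed to move TOTAL_TB (decimal terabytes) at RATE_MIB_PER_S.
eta_days() {
    awk -v tb="$1" -v rate="$2" 'BEGIN {
        bytes = tb * 1e12          # assumed: decimal terabytes
        bps = rate * 1048576       # assumed: rate is in MiB/s
        printf "%.1f\n", bytes / bps / 86400
    }'
}

# link_percent RATE_MIB_PER_S LINK_GBPS
# Percentage of a LINK_GBPS link used by a transfer at RATE_MIB_PER_S.
link_percent() {
    awk -v rate="$1" -v gbps="$2" 'BEGIN {
        printf "%.2f\n", rate * 1048576 * 8 / (gbps * 1e9) * 100
    }'
}

eta_days 30 92.461
link_percent 92.461 100
```

Under those assumptions the two calls print about 3.6 (days for the full 30TB) and 0.78 (percent of the 100 Gbps link), so the instance’s network card is nowhere near saturated at this rate.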

That seems like a pretty decent rate.

You can always try to increase transfers/checkers to see if that helps getting more performance.
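As a sketch of what that could look like with the command from the first post: --transfers and --checkers are real rclone flags (defaults 4 and 8), but the sync_with wrapper below is only a hypothetical convenience, and any values you pass it are starting points to experiment with, not tuned recommendations.

```shell
#!/bin/sh
# sync_with TRANSFERS CHECKERS
# Hypothetical wrapper: runs the sync from the first post with a given
# number of parallel file transfers and parallel checkers.
# Set RCLONE=echo to print the arguments instead of running rclone.
sync_with() {
    "${RCLONE:-rclone}" sync "MBLive:/David Bitton/" "MBLive_S3:/mblive-s3/" \
        --transfers "$1" --checkers "$2" \
        --stats-one-line -P --stats 2s -v
}
```

For example, sync_with 16 32 would run with 16 transfers and 32 checkers. Raising the numbers in steps and watching the reported rate is one way to see where it stops scaling.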

Other than that, you can’t see either side so you wouldn’t be able to tell who’s at fault.

The only thing you can see is IO/traffic/etc on your host and try to look for any bottlenecks there.
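One way to watch the host side, sketched under the assumption of a Linux box where /proc/net/dev is available. The function names are made up, and the interface name you pass in (eth0 is only a placeholder) has to match what ip link shows on your instance.

```shell
#!/bin/sh
# rx_tx_bytes IFACE [FILE]
# Print "RX TX" cumulative byte counters for IFACE, read from
# /proc/net/dev (or from FILE, which makes the function easy to test).
rx_tx_bytes() {
    # Turn "eth0:" into "eth0 " so the name is always its own field.
    sed 's/:/ /' "${2:-/proc/net/dev}" |
        awk -v ifc="$1" '$1 == ifc { print $2, $10 }'
}

# mib_per_s BYTES_BEFORE BYTES_AFTER SECONDS
# Average rate in MiB/s between two counter samples.
mib_per_s() {
    awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN {
        printf "%.1f\n", (b - a) / s / 1048576
    }'
}
```

Sampling rx_tx_bytes twice, say ten seconds apart, and feeding the two RX values to mib_per_s gives the rate coming into the instance (the Dropbox side); the two TX values give the rate going out (the S3 side). That still won’t tell you which remote is throttling, but it does show whether the host’s own traffic matches what rclone reports.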

I’ll put money on the Dropbox download being the slower of the two operations. I also noticed that rclone is building the file list on the fly, so that after every file copied, the count of files to copy goes up. Is there a way to have rclone perform the scan beforehand? It seems like this added step is hurting performance. No?

For copy/sync, it looks ahead based on this:

 --max-backlog int                              Maximum number of objects in sync or check backlog. (default 10000)

You can bump that number up.

Will --fast-list benefit me?

I think so. I’m not 100% sure Dropbox can use it, but the S3 side does.

I restarted the sync and rclone copied the same files again. Wah?

If you run with -vv on your command line, it’ll tell you why it’s copying a file.
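On a large sync the debug output gets long, so it can help to write it to a file with --log-file and tally the decisions afterwards. A minimal sketch, assuming the DEBUG message texts are the ones rclone v1.47 prints in the example later in this thread ("Couldn't find file - need to transfer" and "Unchanged skipping"); other versions may word them differently, and the function name is made up.

```shell
#!/bin/sh
# summarize_log FILE
# Count how many files rclone decided to transfer as new versus skip as
# unchanged, based on the -vv DEBUG lines in FILE.
summarize_log() {
    awk '
        # "." stands in for the apostrophe so the pattern can live
        # inside a single-quoted awk program.
        /DEBUG : .*: Couldn.t find file - need to transfer/ { new++ }
        /DEBUG : .*: Unchanged skipping/                    { same++ }
        END { printf "new=%d unchanged=%d\n", new, same }
    ' "$1"
}
```

After a run with -vv --log-file sync.log, summarize_log sync.log gives the two counts. If "new" stays high on a second run over the same files, the per-file DEBUG lines in the log are where to look for the reason.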

No more info than I had before.

Oh, BTW, --fast-list throws an error:

centos:~/ $ rclone -vv sync MBLive:/David\ Bitton/ MBLive_S3:/  --stats-one-line -P --stats 2s  --max-backlog 100000 --fast-list
2019/04/21 02:17:11 DEBUG : rclone: Version "v1.45" starting with parameters ["rclone" "-vv" "sync" "MBLive:/David Bitton/" "MBLive_S3:/" "--stats-one-line" "-P" "--stats" "2s" "--max-backlog" "100000" "--fast-list"]
2019/04/21 02:17:11 DEBUG : Using config file from "/home/centos/.config/rclone/rclone.conf"
2019/04/21 02:17:12 DEBUG : Dropbox root 'David Bitton': Using root namespace "5362150896"
2019-04-21 02:17:13 ERROR : : error reading destination directory: bucket or container name is needed in remote
2019-04-21 02:17:13 INFO  : S3 bucket : Waiting for checks to finish
2019-04-21 02:17:13 INFO  : S3 bucket : Waiting for transfers to finish
2019-04-21 02:17:13 ERROR : S3 bucket : not deleting files as there were IO errors
2019-04-21 02:17:13 ERROR : S3 bucket : not deleting directories as there were IO errors
2019-04-21 02:17:13 ERROR : Attempt 1/3 failed with 1 errors and: not deleting files as there were IO errors
2019-04-21 02:17:13 ERROR : : error reading destination directory: bucket or container name is needed in remote
2019-04-21 02:17:13 INFO  : S3 bucket : Waiting for checks to finish
2019-04-21 02:17:13 INFO  : S3 bucket : Waiting for transfers to finish
2019-04-21 02:17:13 ERROR : S3 bucket : not deleting files as there were IO errors
2019-04-21 02:17:13 ERROR : S3 bucket : not deleting directories as there were IO errors
2019-04-21 02:17:13 ERROR : Attempt 2/3 failed with 1 errors and: not deleting files as there were IO errors
2019-04-21 02:17:14 ERROR : : error reading destination directory: bucket or container name is needed in remote
2019-04-21 02:17:14 INFO  : S3 bucket : Waiting for checks to finish
2019-04-21 02:17:14 INFO  : S3 bucket : Waiting for transfers to finish
2019-04-21 02:17:14 ERROR : S3 bucket : not deleting files as there were IO errors
2019-04-21 02:17:14 ERROR : S3 bucket : not deleting directories as there were IO errors
2019-04-21 02:17:14 ERROR : Attempt 3/3 failed with 1 errors and: not deleting files as there were IO errors
0 / 0 Bytes, -, 0 Bytes/s, ETA -
2019/04/21 02:17:14 Failed to sync: not deleting files as there were IO errors
centos:~/ $    

You are a few versions behind. You should grab 1.47.

https://rclone.org/downloads/

OK, new version but same error. I don’t specify the target bucket because it’s in the endpoint URL; that’s because it’s a Transfer Acceleration bucket. Hmm.

You need to run it with debug or -vv so you can see what it copies:

[felix@gemini ~]$ rclone copy /etc/hosts GD: -vv
2019/04/20 22:24:24 DEBUG : rclone: Version "v1.47.0" starting with parameters ["rclone" "copy" "/etc/hosts" "GD:" "-vv"]
2019/04/20 22:24:24 DEBUG : Using config file from "/opt/rclone/rclone.conf"
2019/04/20 22:24:24 DEBUG : hosts: Couldn't find file - need to transfer
2019/04/20 22:24:26 INFO  : hosts: Copied (new)
2019/04/20 22:24:26 INFO  :
Transferred:   	       205 / 205 Bytes, 100%, 120 Bytes/s, ETA 0s
Errors:                 0
Checks:                 0 / 0, -
Transferred:            1 / 1, 100%
Elapsed time:        1.6s

2019/04/20 22:24:26 DEBUG : 4 go routines active
2019/04/20 22:24:26 DEBUG : rclone: Version "v1.47.0" finishing with parameters ["rclone" "copy" "/etc/hosts" "GD:" "-vv"]
[felix@gemini ~]$ rclone copy /etc/hosts GD: -vv
2019/04/20 22:24:30 DEBUG : rclone: Version "v1.47.0" starting with parameters ["rclone" "copy" "/etc/hosts" "GD:" "-vv"]
2019/04/20 22:24:30 DEBUG : Using config file from "/opt/rclone/rclone.conf"
2019/04/20 22:24:30 DEBUG : hosts: Size and modification time the same (differ by -356.227µs, within tolerance 1ms)
2019/04/20 22:24:30 DEBUG : hosts: Unchanged skipping
2019/04/20 22:24:30 INFO  :
Transferred:   	         0 / 0 Bytes, -, 0 Bytes/s, ETA -
Errors:                 0
Checks:                 1 / 1, 100%
Transferred:            0 / 0, -
Elapsed time:       500ms

2019/04/20 22:24:30 DEBUG : 4 go routines active
2019/04/20 22:24:30 DEBUG : rclone: Version "v1.47.0" finishing with parameters ["rclone" "copy" "/etc/hosts" "GD:" "-vv"]
[felix@gemini ~]$

I am:

2019/04/21 02:26:44 DEBUG : rclone: Version "v1.47.0" starting with parameters ["rclone" "sync" "MBLive:/David Bitton/" "MBLive_S3:/" "--stats-one-line" "-P" "--stats" "2s" "--max-backlog" "100000" "--fast-list" "-vv"]

Can you share the whole log then? It shows why each file is copied; what you shared is only the part from when it was done copying.

OK, but now with the new version I’m getting this output (without --fast-list):

2019/04/21 02:29:15 DEBUG : rclone: Version "v1.47.0" starting with parameters ["rclone" "-vv" "sync" "MBLive:/David Bitton/" "MBLive_S3:/" "--stats-one-line" "-P" "--stats" "2s" "--max-backlog" "100000"]
2019/04/21 02:29:15 DEBUG : Using config file from "/home/centos/.config/rclone/rclone.conf"
2019/04/21 02:29:16 DEBUG : Dropbox root 'David Bitton': Using root namespace "5362150896"
panic: runtime error: comparing uncomparable type request.ErrInvalidParams

goroutine 158 [running]:
github.com/ncw/rclone/lib/errors.Walk(0x15e1160, 0xc000956a20, 0xc0006c75d0)
        /home/travis/gopath/src/github.com/ncw/rclone/lib/errors/errors.go:65 +0x1a9
github.com/ncw/rclone/lib/pacer.IsRetryAfter(0x15e1160, 0xc000956a20, 0x0, 0x0)
        /home/travis/gopath/src/github.com/ncw/rclone/lib/pacer/pacer.go:256 +0x7e
github.com/ncw/rclone/lib/pacer.(*S3).Calculate(0xc0009b2820, 0x0, 0x0, 0x15e1160, 0xc000956a20, 0xc0001b2400)
        /home/travis/gopath/src/github.com/ncw/rclone/lib/pacer/pacers.go:301 +0x39
github.com/ncw/rclone/fs.(*logCalculator).Calculate(0xc00099e200, 0x0, 0x0, 0x15e1160, 0xc000956a20, 0xc0009ae1e0)
        /home/travis/gopath/src/github.com/ncw/rclone/fs/fs.go:1144 +0x6b
github.com/ncw/rclone/lib/pacer.(*Pacer).endCall(0xc0009ae1e0, 0x0, 0x15e1160, 0xc000956a20)
        /home/travis/gopath/src/github.com/ncw/rclone/lib/pacer/pacer.go:188 +0xa1
github.com/ncw/rclone/lib/pacer.(*Pacer).call(0xc0009ae1e0, 0xc000578640, 0xa, 0x20, 0x123f460)
        /home/travis/gopath/src/github.com/ncw/rclone/lib/pacer/pacer.go:198 +0xc4
github.com/ncw/rclone/lib/pacer.(*Pacer).Call(0xc0009ae1e0, 0xc000578640, 0x40be29, 0xc0006c77e0)
        /home/travis/gopath/src/github.com/ncw/rclone/lib/pacer/pacer.go:216 +0x78
github.com/ncw/rclone/backend/s3.(*Fs).dirExists(0xc00021a160, 0x13f1c80, 0xc00021a298, 0xc0000f4700)
        /home/travis/gopath/src/github.com/ncw/rclone/backend/s3/s3.go:1420 +0xaf
github.com/ncw/rclone/backend/s3.(*Fs).Mkdir(0xc00021a160, 0x0, 0x0, 0x0, 0x0)
        /home/travis/gopath/src/github.com/ncw/rclone/backend/s3/s3.go:1443 +0x3ca
github.com/ncw/rclone/backend/s3.(*Object).Update(0xc0001cb3e0, 0x15debc0, 0xc00093e180, 0x7f284fb5fce0, 0xc00067c320, 0xc0003ba8d0, 0x1, 0x1, 0x40c6b8, 0x10)
        /home/travis/gopath/src/github.com/ncw/rclone/backend/s3/s3.go:1777 +0x6b
github.com/ncw/rclone/backend/s3.(*Fs).Put(0xc00021a160, 0x15debc0, 0xc00093e180, 0x7f284fb5fce0, 0xc00067c320, 0xc0003ba8d0, 0x1, 0x1, 0x40c6b8, 0x60, ...)
        /home/travis/gopath/src/github.com/ncw/rclone/backend/s3/s3.go:1405 +0x101
github.com/ncw/rclone/fs/operations.Copy(0x1614800, 0xc00021a160, 0x0, 0x0, 0xc000ae20ce, 0x21, 0x1613c00, 0xc00067c320, 0xc00011f8a8, 0xc00011f9b8, ...)
        /home/travis/gopath/src/github.com/ncw/rclone/fs/operations/operations.go:317 +0x14c4
github.com/ncw/rclone/fs/sync.(*syncCopyMove).pairCopyOrMove(0xc00098c8c0, 0xc0001d8600, 0x1614800, 0xc00021a160, 0xc00098c998)
        /home/travis/gopath/src/github.com/ncw/rclone/fs/sync/sync.go:295 +0x251
created by github.com/ncw/rclone/fs/sync.(*syncCopyMove).startTransfers
        /home/travis/gopath/src/github.com/ncw/rclone/fs/sync/sync.go:321 +0x9f
What does your rclone.conf look like, without passwords/keys?

[MBLive]
type = dropbox
token = {"access_token":"token","token_type":"bearer","expiry":"0001-01-01T00:00:00Z"}

[MBLive_S3]
type = s3
provider = AWS
env_auth = true
access_key_id = key
secret_access_key = secret
region = us-east-1
endpoint = mblive-s3.s3-accelerate.amazonaws.com

Can you run rclone lsf dropbox: -vv, and does that work?

yes:

centos:~/ $ rclone lsf MBLive: -vv
2019/04/21 22:37:23 DEBUG : rclone: Version "v1.47.0" starting with parameters ["rclone" "lsf" "MBLive:" "-vv"]
2019/04/21 22:37:23 DEBUG : Using config file from "/home/centos/.config/rclone/rclone.conf"
BRANDING/
David's Team Space To-Do List.url
Get Started with Dropbox.pdf
Graphics/
PRODUCTS/
Portfolio/
_not yet sorted_/
2019/04/21 22:37:24 DEBUG : 4 go routines active
2019/04/21 22:37:24 DEBUG : rclone: Version "v1.47.0" finishing with parameters ["rclone" "lsf" "MBLive:" "-vv"]