Box.com monthly API limits

Wait, what? Wouldn't crypt-->chunker-->box hide information about chunking by encrypting the part of the filename that has the word chunk in it?

In other words the exact opposite of what you just said? Or am I missing some key detail?

edit: My goal is, as you suggest, to conceal information about the chunking, but I'm a novice so I might be getting the order wrong.

BIG-file is what we start with.

Now:

chunker-->crypt-->box

BIG-file.part1, BIG-file.part2

23jkh3423jk4h23jk, 3423jkj234k23h4k23h

vs.

crypt-->chunker-->box

start again with BIG-file

lkjdal234klj234lk23j234k

lkjdal234klj234lk23j234k.part1, lkjdal234klj234lk23j234k.part2

I think it is a bit theoretical :) Both remote chains will work, but the first one gives away less about what is going on.

I don't think I understand your ---->, which might be my issue. Specifically, I must not understand the arrow notation.

I was thinking in my own notation. I should probably do:

crypt(chunker(box))

Because that would result in not revealing information about chunking, yes?

Using your convention I think:

chunker(crypt(box))

is better as it hides more.

But doesn't chunker(crypt(box)) show a large number of files with cleartext .rclone_chunk. in the name, whereas reversing it would encrypt the .rclone_chunk. part?

I feel like there must be something I am misunderstanding :frowning:

On the flip side I am not sure it matters as box.com gave an error when I tried to open the account. I wonder if the phone number should have - in it or if gmail email addresses are blocked or some such? Or heck, are they just full? Anyone sign up recently notice anything specific they had to do on the sign up page?

or in rclone lingo, what I mean by chunker(crypt(box)):

[box-remote]
type = box

[crypt-remote]
type = crypt
remote = box-remote:crypt-dir

[chunker-remote]
type = chunker
remote = crypt-remote:

The logic behind it is that the lowest layer (box) is simple storage and should not be aware of anything the upper layers do. Its only job is to store the blob/file it is given, nothing else, and to retrieve it when requested.
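To make the two orderings concrete, here is a toy Python model of the file names that would reach Box in each case. It is only a sketch under assumptions: `toy_encrypt` is a hypothetical stand-in (a truncated SHA-256 hex digest), not rclone's real filename encryption, and the `.rclone_chunk.NNN` suffix follows what I understand to be chunker's default name format.

```python
import hashlib


def toy_encrypt(name: str) -> str:
    """Hypothetical stand-in for crypt's filename encryption (NOT the real algorithm)."""
    return hashlib.sha256(name.encode()).hexdigest()[:24]


def chunk_names(name: str, parts: int) -> list[str]:
    """Mimic chunker's naming: one suffixed name per chunk (assumed default format)."""
    return [f"{name}.rclone_chunk.{i:03d}" for i in range(1, parts + 1)]


def chunker_over_crypt(name: str, parts: int) -> list[str]:
    """chunker remote wraps crypt remote: split first, then encrypt each chunk name."""
    return [toy_encrypt(n) for n in chunk_names(name, parts)]


def crypt_over_chunker(name: str, parts: int) -> list[str]:
    """crypt remote wraps chunker remote: encrypt the name first, then split."""
    return chunk_names(toy_encrypt(name), parts)


if __name__ == "__main__":
    # Names as Box would see them in each layout.
    print("chunker(crypt(box)):", chunker_over_crypt("BIG-file", 2))
    print("crypt(chunker(box)):", crypt_over_chunker("BIG-file", 2))
```

In this model the first layout hands Box only opaque names, while the second leaves the chunk suffix readable on every part.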

That is also what I mean by (). I don't understand, though, why having the crypt on the outside doesn't work best for hiding all the cleartext .rclone_chunk. parts.

If I wrap the box in a chunker, many files with .rclone_chunk. are created, and wrapping all of that in a crypt would then encrypt .rclone_chunk. Maybe your point is that it would encrypt each instance of .rclone_chunk. into the same thing and make the encryption weaker? I don't think that's right?

Although I guess what you're saying is... what matters isn't what I see on my end at the top... who cares if I can see a lot of cleartext .rclone_chunk.

What matters is what box.com would see, and going chunker(crypt(box)) ensures the final inside layer scrambles things from their viewpoint. Okay. That makes sense. Glad I understand now.

TLDR: I had it backwards, whoops.

Shame their webpage gave a generic "something went wrong" error message when I tried to sign up.

With chunker, can you still mount your cloud and see your files normally, or do they show up split? I've never seen or heard of chunker, so I'm interested in how this would work.

You are very right. Your way is perfectly valid and correct too.

My way just leaves box with nothing but encrypted blobs, whereas in your case you leave some info about chunking.

Does it matter? Depends. I like a clear cut between which layer does what, without leaking to another layer anything it does not need to do its job.

Hey, if possible, can you guide me where to check these api calls on google? I've tried looking but can't find anything related

This is a Box thread... what does google have to do with it?

Lol, I was writing in another thread and it got posted here

It isn't mentioned on the Box page. Does this mean box.com does not support utf-16 and therefore does not support base32768? Yes? I assume?

I was thinking base32768 might solve the case insensitivity issue better than say obfuscate filenames would. And with box strictly limited to path length 255 base32/64 might be untenable, hmm.
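For a rough sense of how much each encoding inflates a name, here is a small Python sketch. It rests on assumptions not stated in this thread: that crypt pads the plaintext name to a 16-byte block with PKCS#7-style padding before encoding, that the encodings carry 5, 6 and 15 bits per character, and that the 255 limit is counted in characters of a single name component rather than over the whole path. The function names are hypothetical.

```python
import math

BLOCK = 16  # assumed cipher block size in bytes


def padded_len(n_bytes: int) -> int:
    """Length after PKCS#7-style padding, which always adds 1 to 16 bytes."""
    return (n_bytes // BLOCK + 1) * BLOCK


def encoded_len(n_bytes: int, bits_per_char: int) -> int:
    """Characters needed to encode the padded name, ignoring '=' padding."""
    return math.ceil(padded_len(n_bytes) * 8 / bits_per_char)


def max_plain_len(limit: int, bits_per_char: int) -> int:
    """Longest plaintext name (in bytes) whose encoded form fits within `limit` characters."""
    n = 0
    while encoded_len(n + 1, bits_per_char) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    for label, bits in (("base32", 5), ("base64", 6), ("base32768", 15)):
        print(label, max_plain_len(255, bits))
```

Under these assumptions base32768 leaves far more room than base32 or base64, but only if the backend counts characters rather than bytes, which the thread does not settle for Box.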

Box supports base32768 and it has been tested, unless you can prove the opposite.

My experience with box.com api calls with very large amounts of data (over 100tb) really comes down to rate limits. Personally I use --tpslimit 12

rclone -v copy --no-check-dest --retries=1 --tpslimit 12 "source:folder" "chunkercryptbox:folder"

This results in me getting about 16-20MiB/s speed, is this typical for box.com? I can get 55-110MiB/s speed typically from dropbox or google.

I suppose the slowdown comes from the tps limit? It's also strange though as in the above command --transfers is set to 2, even though the default is 4. Does rclone intentionally swap from 4 transfers to 2 either due to seeing box or due to seeing the --tpslimit?

edit:
rclone -v copy --no-check-dest --retries=1 --tpslimit 12 --transfers 4 "source:folder" "cryptbox:folder"

Resulted in a speed of 23MiB/sec, this time on the server that is capped to 55MiB/sec but always gets exactly 55MiB/sec to google or dropbox, because it's a server that's intentionally throttled down to 55MiB/sec or whatever.

edit2: I figured out why --transfers 4 didn't help, this session is still operating as if --transfers 2 was in effect. So one of these flags is just flat out overwriting --transfers and I don't know how to change it to stop that behavior.

edit3:
rclone -v copy --no-check-dest --max-size 4.9G --retries=1 --transfers 4 "source:folder" "cryptbox:folder"

This results in --transfers 4 working, so --tpslimit was what was automatically ignoring --transfers. Unfortunately my speed is still pegged to 20-22MiB/sec (actually more like 24-26MiB/sec so maybe this alteration helped)

edit4: rclone -v copy --no-check-dest --max-size 4.9G --retries=1 --transfers 8 --drive-chunk-size 128M "source:folder" "cryptbox:folder"

This time the speed was 1MiB/sec. Conclusion: I went too far in the other direction :slight_smile:

edit5: Speed back at 24MiB/sec with
rclone -vv copy --no-check-dest --max-size 4.9G --retries=1 --transfers 4 --drive-chunk-size 32M --tpslimit 14 "source:folder" "cryptbox:folder"

The default chunk size in rclone is 8M and this flag tells it to use 32M. Strangely, the -vv log is referencing both sizes?

edit6: Same command flags. Alternate server. This time without any client_id or client_secret. All previous commands have used a client_id and client_secret. This server has theoretical speed limits around 200MiB/s. Result: 24MiB/sec

This is mildly interesting: it seems like no matter what I do, I get about 17-24MiB/s to box.com despite having much higher speeds to places like google or dropbox. But despite box.com being so slow, it doesn't seem to complain about how many different transfer sources I'm using at the same time. Although using multiple transfer sources like this is a giant pain, and it can't really be done afk. I'm using different subfolder targets here to ensure each transfer is not affected by the others in terms of the files it's choosing... hmm

I think box.com uses a mild api throttling system, as I've gone from 100MiB/s+20MiB/s+20MiB/s to 45+4+17 for no discernible reason.

Edit: And back up to 100+24+8, then back down to 40something+20something etc etc etc.

In theory this probably means there are ideal settings to ride the rail of exactly how much you can do on say a --tpslimit X before being throttled, but I've personally no clue where the line is, it doesn't seem to adapt particularly fast, maybe it's more like a minute average checked against the hour or some such.

Edit2: The machine throttled to 55MiB/s is the one with effectively unlimited bandwidth (300tb a month); the other machine can max out at 2 gigabits but only has 10tb free a month. Somehow the slow machine is often getting 10MiB/s and the fast machine is getting 60-90MiB/s. But the slow machine has access to a gigabit line as well, it's just throttled specifically to 55MiB/s, so it's odd that it's so vastly underperforming. I wonder if chunker reports bandwidth incorrectly somehow due to the server-side renaming it does?

Edit3: Total speed is now down to 22+3MiB/s. I cannot be sure if I'm hitting an api limitation, if the data type has changed, or, what I suspect might be most likely, if something about using chunker as an overlay is slowing things down. Or the chunker overlay is making the reported bandwidth counter inaccurate, because for 1 minute it was actually pegging me at 75MiB/s on a server that only has 55MiB/s maximum throughput. And that same command is still running but mostly around 3MiB/s (but I am not observant/diligent enough to check the reported bandwidth for every single minute in a day.)

All of this is very interesting, because, honestly, 22MiB/s is perfectly fine for my purposes after the data migration. But it's far too slow to complete the data migration fast enough to know, in less than 2 months, if I'm wasting my efforts and should consider other options.

Either chunker overlay is cutting my speed to box.com by 90% or increasing my tpslimit from 16 to 32 to 64 to 128 somehow has had the opposite of the desired effect. It's tough to tell because the readings are so wildly all over the place.

But I am really worried that chunker is the problem, worried because chunker is mandatory, but I think it all comes down to the tpslimit, and honestly the box.com api might need to be altered slightly to respond more dynamically to whatever signal is sent when they're throttling api calls. Hmm.

This feeling reminds me a lot of google drive's performance ~6 years ago, before anyone knew the settings to avoid their api throttling and whatnot.

For the time being, though: 16 transfers with tpslimit 128 gives 1MiB/s; 4 transfers with tpslimit 64 gives 27MiB/s.

I keep messing with the settings though because this server has 55MiB/s speed and another server can get 55MiB/s to box.com BLAHBLAHBLAH.

If anyone comes in here with stellar performance and clear explanation of configuration of setup, I'll stop rambling :slight_smile:

Edit: looks like when using chunker less is more. Fewer transfers is better, probably because the transfers are multiplied by the chunks. Going from 16 to 4 moved speed from 2MiB/s to 42MiB/s on the remote server that never broke 26MiB/s before.

That should've been a no brainer to me but I guess I just didn't do the math
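A back-of-the-envelope version of that math, as a Python sketch. All numbers are hypothetical: the 10 GiB file size, the 2 GiB chunk size, and the guess of 2 API calls per chunk (for example an upload plus a rename) are illustration values, not measurements from this thread, and the model ignores whatever throttling Box applies on its side.

```python
import math

GIB = 1024 ** 3


def chunks_per_file(file_bytes: int, chunk_bytes: int) -> int:
    """Number of chunks a file is split into (at least one)."""
    return max(1, math.ceil(file_bytes / chunk_bytes))


def transactions_per_batch(transfers: int, file_bytes: int,
                           chunk_bytes: int, calls_per_chunk: int) -> int:
    """API calls needed for one batch of `transfers` files in flight."""
    return transfers * chunks_per_file(file_bytes, chunk_bytes) * calls_per_chunk


def min_seconds_at_tpslimit(transactions: int, tpslimit: float) -> float:
    """Lower bound on the time just to issue the calls at a given --tpslimit."""
    return transactions / tpslimit


if __name__ == "__main__":
    for transfers in (16, 4):
        tx = transactions_per_batch(transfers, 10 * GIB, 2 * GIB, 2)
        secs = min_seconds_at_tpslimit(tx, 12)
        print(f"{transfers} transfers: {tx} calls, at least {secs:.1f}s at tpslimit 12")
```

The point of the sketch is only that the call count scales with transfers times chunks, so cutting transfers cuts the pressure on the rate limit by the same factor.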

gdrive was very popular for a long time, so it got a lot of attention and tweaking. Box is a rising star :) For sure its API implementation can be improved. There are many areas there that can be optimised. But as usual with an open source project: if you think it is important, DIY and post a PR, or wait for somebody else to do it. Maybe tomorrow, maybe never.

I just got kicked out by Box. They sent me an email, and since then I can't access my account or my data
