How to migrate 400TB+ from Google Drive (downloadQuotaExceeded)

What is the problem you are having with rclone?

I'm able to get 500-600MB/s but quickly hit downloadQuotaExceeded after about 1.5TB. This is from an education account. We're trying to help the user move to our research archive system, as Google Drive has limitations that are causing issues for them. Note that most files are large, multiple GB each.

Run the command 'rclone version' and share the full output of the command.

rclone v1.60.0
- os/version: redhat 7.9 (64 bit)
- os/kernel: 3.10.0-1160.36.2.el7.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.19.2
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone --drive-shared-with-me --transfers=32 -P sync source:/path/data data

The rclone config contents with secrets removed.

We did create a unique client_id and client_secret. I see it hit API errors after about 30 minutes / 1.5TB of transfer.

type = drive
client_id = <snip>
client_secret = <snip>
scope = drive.readonly
token = <snip>
team_drive = 

A log from the command with the -vv flag

Available on request, but the errors are messages about the API limit. Paths in the logs are considered private.

We are trying to migrate around 430TB of data out of Drive in our enterprise Google account (yes, it should not have been used that way). We are using rclone to help the group who shared the data in question with us, so we can download it to our local storage system. That system is about 8PB and can reach around 8GB/s. The host running rclone has a 40Gig network connection. Our WAN is also 40Gig.

We are getting over 500MB/s but hit the API quota quickly. Other than making several API keys and splitting up the transfer, what's a best practice for such a large transfer? It took years to build up this data, but we don't have years to move it out.

I can't really comment on how Google monitors your traffic. There are limits on Google (typically 10TB download and 750GB upload per 24 hours).
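To put that 10TB/day figure in rate terms (my arithmetic, not anything Google documents):

```shell
# 10 TB/day as a steady rate, in (decimal) MB/s:
QUOTA_BYTES=$((10 * 1000 * 1000 * 1000 * 1000))   # 10 TB
RATE_MBPS=$((QUOTA_BYTES / 86400 / 1000000))      # bytes/s -> MB/s
echo "${RATE_MBPS} MB/s"
# so something like rclone's --bwlimit 110M would pace a 24h run just under the quota
```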

Is there a reason you're running transfers so high? I'd suggest trying to take it down to 6.
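For reference, a lower-concurrency variant of the command from the original post; this is just a sketch using standard rclone flags, with the remote name taken from that command:

```shell
# Same sync with concurrency dropped to 6 parallel transfers, plus a cap
# on API transactions per second to ease pressure on Drive's pacer.
rclone sync -P \
  --drive-shared-with-me \
  --transfers 6 \
  --tpslimit 10 \
  source:/path/data data
```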

I believe there was a test where someone 'probed' a file via an rclone mount, and it counted as a full download (despite the file not actually being downloaded).

Unfortunately, there's no magic to this as you have to suffer through Google's limits.

When I migrated, it wasn't a shared drive and I was able to get around 10TB a day, which, as noted above, tends to be the download quota, although it's not documented. Google won't explain or tell you which limit you've hit, so since nothing is documented, you may have hit another limit entirely.

More API keys won't do anything, as you are not hitting an API limit. I raised a support ticket when I was migrating and was told nothing; other than waiting it out each day, that's all I got.

That's not right; that's been said a few times, tested, and disproved. There is 'some' limit per file as well, but it's also hard to prove, since Google Support won't tell you about that either. I disproved the probe claim many months back by just HEADing a large file a few thousand times via a script.

This often gets confused because, before rclone had chunked reading, a mount would fetch the full file each time, and thus the legend lives on; but that was many years ago.

OK, thanks for the feedback; it doesn't sound like there are a lot of options.
The transfer did resume after a day and got beyond 1.5TB; we'll see where it goes.

Does sync on Drive require downloading files again? Or can I force sync to compare on just size and existence (to pick up new files)? I'm worried about a run lasting that long; 43 days minimum, it appears.

So I would like to be able to run a final sync across everything without triggering new downloads just to calculate checksums/sizes.

Is this something to worry about?

When I migrated to Dropbox, I was on GSuite with no shared drives, and I could move ~10TB per user daily. I was using 2 users, so about 20TB per day.

Generally, sync won't copy the same file again. A log file would show why something is being recopied.
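On the checksum worry: Drive stores an MD5 for each file and rclone reads it from the file metadata, so a hash comparison shouldn't require re-downloading content. A sketch of how you might verify without pulling data again (remote name as in the original command):

```shell
# Cheapest verification: compare by size only.
rclone check --size-only source:/path/data data

# Hash verification: rclone compares local MD5s against the MD5
# Drive reports in metadata, so no file content is re-downloaded.
rclone check source:/path/data data
```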

Sync means the destination ends up looking identical to the source. If that's the goal, I'd test via --dry-run, ensure you get what you want, and then run without it, since sync deletes data on the destination. When I migrated, I did sync or copy as I was moving all my data anyway, so it was easier for my requirements.
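Concretely, a dry-run pass like this (same remote/paths as the original command) prints what would be copied or deleted without touching anything:

```shell
# Preview the sync plan; nothing is copied or deleted with --dry-run.
rclone sync --dry-run source:/path/data data
# If the plan looks right, run the same command again without --dry-run.
```

If you never want deletions on the destination, rclone copy is the safer verb, since it only adds and updates files.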

Ok great thanks for that info.

We have a tool we wrote around rclone that provides a point-and-click interface, but of our multiple PB in Google Drive, this one group had by far the most data and breaks the assumptions in the tool. We like what Drive provides; it's just a matter of getting people to use the appropriate tool and not have data held hostage when it's needed, as in this case. It was simply the wrong tool for what they are trying to do.

Great, flexible tool, by the way; it made this tractable. We used it for a large Box migration also.

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.