GDRIVE - how to move a 2.5TB file


#1

I’m on gigabit broadband and I’ve pushed 40TB to Team Drives, typically by setting “--max-transfer 748G” and cycling through multiple user credentials whenever the max-transfer is hit. To date that has worked well. However, I’ve come upon a 2.5TB file which is causing me problems. It actually took me several days to realize that my copy was effectively stalling on the same file, because my scheme only works if the file is smaller than 748G. I also vary the bandwidth by time of day using rc and core/bwlimit.

What’s the best way to deal with files which are larger than Google’s daily limit? In looking at the forums it seems like controlling the bandwidth is what many are using. Is that the preferred means for dealing with large files? Are there any other recommended approaches?
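The scheme described above can be sketched as a small helper that builds one `rclone copy` command per credential, plus the remote-control call used to change the bandwidth limit of a running transfer. This is only an illustration: the remote names and paths are hypothetical, each remote is assumed to be configured with a different user's credentials, and the `rc` call assumes the running rclone was started with `--rc`. The commands are printed rather than executed.

```python
import shlex


def build_copy_commands(source, dest_path, remotes, max_transfer="748G"):
    """Return one `rclone copy` argument list per remote (one remote per credential)."""
    commands = []
    for remote in remotes:
        commands.append([
            "rclone", "copy", source, f"{remote}:{dest_path}",
            "--max-transfer", max_transfer,  # stop this run before the daily limit
        ])
    return commands


def build_bwlimit_command(rate):
    """Return the remote-control call that changes the limit of a running rclone."""
    return ["rclone", "rc", "core/bwlimit", f"rate={rate}"]


if __name__ == "__main__":
    # Hypothetical remote names and paths, for illustration only.
    remotes = ["td_user1", "td_user2"]
    for cmd in build_copy_commands("/Volumes/backups", "archive", remotes):
        print(shlex.join(cmd))
    print(shlex.join(build_bwlimit_command("8.5M")))
```

Because every run stops at the same `--max-transfer` value, a single file larger than that value is never completed by any of the runs, which is the stall described above.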


#2

Not sure I can think of too many options:

  • bwlimit and let it go slowly
  • break it up into smaller pieces and upload them

In general dealing with a single object that big isn’t good for backup and restore as it isn’t very manageable.
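The second option, breaking the file into smaller pieces, could look roughly like the sketch below. It is not an rclone feature, just a plain-Python illustration; the `.partNNNN` naming, the function names and the buffer size are all assumptions made for the example.

```python
from pathlib import Path


def split_file(path, piece_bytes, buffer_bytes=8 * 1024 * 1024):
    """Split `path` into numbered pieces of at most `piece_bytes` each.

    Pieces are written next to the source as <name>.part0000, <name>.part0001, ...
    and the list of piece paths is returned in order.
    """
    src = Path(path)
    pieces = []
    with src.open("rb") as reader:
        index = 0
        while True:
            piece_path = src.with_name(f"{src.name}.part{index:04d}")
            written = 0
            with piece_path.open("wb") as writer:
                while written < piece_bytes:
                    chunk = reader.read(min(buffer_bytes, piece_bytes - written))
                    if not chunk:
                        break
                    writer.write(chunk)
                    written += len(chunk)
            if written == 0:
                piece_path.unlink()  # nothing left to read; drop the empty piece
                break
            pieces.append(piece_path)
            index += 1
    return pieces


def join_files(pieces, target, buffer_bytes=8 * 1024 * 1024):
    """Concatenate the pieces, in order, back into a single file at `target`."""
    with Path(target).open("wb") as writer:
        for piece in pieces:
            with Path(piece).open("rb") as reader:
                while chunk := reader.read(buffer_bytes):
                    writer.write(chunk)
```

For example, `split_file("backup.dmg", 700 * 1024**3)` would produce pieces that each stay under the daily limit. Note that this needs as much free local disk space as the original file, and a restore has to download every piece and join them again.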


#3

Thanks. The 2.5TB file is actually a .dmg which contains a .sparsebundle. Previously I had been in the habit of converting sparsebundles to dmg files because the sparsebundle is effectively a folder with the data stored as individual files of 8MB each. They were painfully slow (less than 3MB/s) to upload to GDRIVE with rclone. The dmg appears as a single file and rclone copy to GDRIVE was typically getting me 30-50MB/s.

Previously, the sparsebundles I had converted to dmg were smaller than the GDRIVE daily limit so I had not really focused on the fact that rclone cannot stop and continue a transfer mid-file. It was only when I had a file which was larger than the daily limit that I became aware of rclone’s behaviour being different from rsync’s in regard to stopping and continuing a transfer.

While the performance for sparsebundles was bad, the more critical limitation was that, with individual 8MB files, one can hit the Team Drive limit on the number of objects. You can work around that by making the sparsebundle smaller, but if it’s smaller it would also be less than the daily limit. My large sparsebundles are actually TimeMachine backups so I can’t really go in and break them up into a smaller number of files.

My suspicion is that bandwidth limiting is likely to be my best route if the copy is primarily an archive. In cases where I am using it as a backup then sparsebundle is superior because subsequent updates will be faster.


#4

Yes, if you use the bwlimit equivalent to 750GB/day, which is roughly --bwlimit 8.5M, then you should be able to upload the file eventually…
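The arithmetic behind that figure can be checked with a few lines. This is a rough sketch under two assumptions: that the `M` suffix of `--bwlimit` means MiB/s, and that it is unclear whether the daily limit is counted in decimal GB or binary GiB, so both are computed. The function names are made up for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MIB = 1024 ** 2


def bwlimit_for_daily_quota(quota_bytes):
    """Return the sustained rate in MiB/s that uploads exactly quota_bytes per day."""
    return quota_bytes / SECONDS_PER_DAY / MIB


def days_to_upload(file_bytes, rate_mib_per_s):
    """Return how many days a file takes at a constant rate in MiB/s."""
    return file_bytes / (rate_mib_per_s * MIB) / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"750 GB/day  -> {bwlimit_for_daily_quota(750 * 1000**3):.2f} MiB/s")
    print(f"750 GiB/day -> {bwlimit_for_daily_quota(750 * 1024**3):.2f} MiB/s")
    print(f"2.5 TB at 8.5 MiB/s -> {days_to_upload(2.5 * 1000**4, 8.5):.1f} days")
```

Under those assumptions 8.5M falls between the two computed rates, and a 2.5TB file takes a little over three days at that speed.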


#5

Google does not abort transfers when hitting the 750 GB limit. You can upload the file just fine. Just remove the max-transfer flag.


#6

I started using max-transfer due to 403 errors once I reached ~750GB. I thought the 403 error was most typically indicative of a 24-hour ban during which you couldn’t upload. That’s the whole reason I started cycling through multiple executions of the same rclone copy command but with different G-Suite user credentials for each invocation.

Was my understanding wrong? Has something changed?


#7

Once you have transferred 750 GB no new uploads will start for 24 hours. But running uploads are unaffected and will finish.


#8

Thanks. I didn’t realize that. I had previously run into 403 errors when copying directory trees and wasn’t focusing on this being a single file and on the daily limit not preemptively stopping a transfer. I just used rc to increase the bandwidth. I will confirm that it worked as you indicate.


#9

An update: as kuerbiskern suggested, the rclone copy will continue on a single file beyond 750GB in a day. However, it does appear that Google is doing some form of throttling.

Specifically, after the earlier post I raised my bwlimit to 100M. At that point, I had been running for 32 hours at 8.5MB/s and had transferred 877GB. Looking at the log, at 8.5MB/s rclone actually uploaded 672GB in the first 24 hours, and at the time I increased the bandwidth I was 219GB into the second day.

The change in bwlimit resulted in the transfer rate being consistently above 26MB/s until 1TB was transferred. Since then it appears some throttling has been in place whereby the rate decreases as the amount transferred increases. Currently, I am 46 hours into the transfer. It transferred 891GB in the last 14 hours, surpassing the 750GB/day rate I had self-imposed via bwlimit. I have transferred 1.76TB and am currently uploading at 11.1MB/s. In the last hour, the estimated time to completion has increased from 18 hours to 20 hours.

As long as nothing crazy happens with what I suspect is Google-imposed throttling, the upload should still complete faster than it would have with bwlimit set to 8.5M.
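A quick way to check that expectation is to compare the remaining time at the current throttled rate with the remaining time at 8.5MB/s, using the figures quoted in this post. The sketch below assumes decimal units (TB and MB/s) and a constant rate, neither of which is guaranteed if the throttling keeps tightening; the function name is made up for the example.

```python
def hours_remaining(total_tb, done_tb, rate_mb_per_s):
    """Hours left at a constant rate; sizes in decimal TB, rate in decimal MB/s."""
    remaining_mb = (total_tb - done_tb) * 1_000_000
    return remaining_mb / rate_mb_per_s / 3600


if __name__ == "__main__":
    total_tb, done_tb = 2.5, 1.76  # figures quoted in the post above
    for label, rate in [("throttled", 11.1), ("self-imposed", 8.5)]:
        hours = hours_remaining(total_tb, done_tb, rate)
        print(f"{label}: {hours:.1f} h at {rate} MB/s")
```

The throttled estimate lands inside the 18 to 20 hour window rclone was reporting, and stays ahead of the 8.5MB/s case as long as the rate does not drop below that figure.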


#10

An update. I’ve done two more of these large file transfers. The pattern is the same: somewhere above 750GB things keep running but slow down. Thus far, though, the rate seems to drop to something like 10MB/s, which is slightly more than 750GB in a day.

On one of the transfers I hit the following error:

Failed to copy: googleapi: got HTTP response code 410 with body: Service Temporarily Unavailable

That was when I was already being throttled by Google. It recovered gracefully and the rclone copy continued, but the bandwidth increased to its pre-throttling rate of ~30MB/s. It appears that whatever is doing the throttling was reset by the 410 error.