Rclone unable to download a 500 MB Google presentation from Google Drive

What is the problem you are having with rclone?

I am unable to download a 500 MB Google presentation from Google Drive and upload it using rclone -P copy.

Run the command 'rclone version' and share the full output of the command.

  • os/version: ubuntu 22.04 (64 bit)
  • os/kernel: 5.19.0-1022-aws (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.20.2
  • go/linking: static
  • go/tags: none

Are you on the latest version of rclone? You can validate by checking the version listed here: Rclone downloads
-->Nope

Which cloud storage system are you using? (eg Google Drive)

Downloading the files from Google Drive and uploading them to AWS S3.

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone -P  copy remote_g:opFolder_new --drive-chunk-size 32M remote_s3:drive2cloudbackup/hello2

The rclone config contents with secrets removed.

Current remotes:

Name                 Type
====                 ====
remote_g             drive
remote_s3            s3

A log from the command with the -vv flag

The following is the error I get:

Failed to copy: failed to open source object: open file failed: googleapi: got HTTP response code 413 with body: [HTML error page from docs.google.com, title "Page Not Found", visible text: "Sorry, unable to open the file at this time. Please check the address and try again."]

Can you please run the command with debug -vv and share the FULL output?

There are some size limits for google docs/presentations.

The error in your log is error 413 which is

413 Content Too Large 

which says the file is too large to download.

According to this page, "Google Docs Size Limitations | Workspace Tips", the limit is 100 MB, which seems quite small. This page agrees: "Re: 100 mb size limit google presentation - Google Cloud Community".

Can you download the file using the drive web interface?

No, I am not able to download it even from the web interface. So, are there alternatives for downloading such oversized presentations separately as PDF? One solution could be to run the copy command first to transfer all files from Google Drive to the AWS S3 bucket, including the normal-sized Google editor files and other binary files, and to report the files that give this 413 error in a log file. Then, in a second run, transfer only those errored files with the export format set to pdf. I noticed that these large Google editor files can be downloaded easily in PDF format.

You can tell rclone to export stuff as PDF with --drive-export-formats pdf. That might get you something usable?

Yes, you are right, but that will export each and every presentation and document as a PDF. Why disturb the small documents and presentations that were downloading smoothly as .docx and .pptx? So, is there a way for rclone to check whether a presentation cannot be downloaded as .pptx and, if so, download it as a PDF instead?

If not, is there a way for rclone to flag these large presentations in a log file, so that a second command could download only these errored files with --drive-export-formats pdf?

Also, it would be a quick job for you to add this feature, which would be something like:

if mimeType == "application/vnd.google-apps.presentation" {
    if resp.StatusCode == 413 {
        ext = "pdf"
    } else {
        ext = "pptx"
    }
} else if mimeType == "application/vnd.google-apps.document" {
    ext = "docx"
} else if mimeType == "application/vnd.google-apps.spreadsheet" {
    ext = "xlsx"
} else {
    ext = strings.ToLower(info.NameExt)
}

This if-else can be extended to Google documents, spreadsheets, etc. Can this be considered a feature request: if a Google editor file cannot be downloaded because of a 413 response code, it is downloaded as a PDF file instead? Otherwise, could you quickly point me to the file where I can make such edits in my forked repo?

This approach won't work because rclone needs to know the file extension before starting to copy the file.

What I would do is do an rclone copy or sync to download what you can without --drive-export-formats pdf

Then I would use rclone check --missing-on-dst missing-files.txt src: dst: to make a list of files missing on the destination.

I would then edit missing-files.txt to change all the extensions to pdf and save it as missing-files-pdf.txt

Then I would use rclone copy with --files-from missing-files-pdf.txt and --drive-export-formats pdf to download the missing files as pdf and fill in the gaps.
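
The steps above could be scripted roughly as follows. This is only a sketch: the remote names and paths are the ones from the original command, the function names to_pdf_list and backup_with_pdf_fallback are made up for illustration, and it assumes the missing files are listed with the default export extensions (.docx, .xlsx, .pptx). A genuine binary .pptx that failed for some other reason would also be renamed, so the list is worth reviewing by hand before the second pass.

```shell
#!/bin/sh
# Sketch only. Remote names come from the thread; function names are hypothetical.
SRC="remote_g:opFolder_new"
DST="remote_s3:drive2cloudbackup/hello2"

# Rewrite the default Office export extensions to .pdf.
# $1 = list produced by rclone check, $2 = rewritten list for --files-from.
# Lines with any other extension are copied through unchanged.
to_pdf_list() {
    sed -E 's/\.(docx|xlsx|pptx)$/.pdf/' "$1" > "$2"
}

backup_with_pdf_fallback() {
    # Pass 1: copy everything that exports normally.
    # Files that hit the 413 error fail here, so a non-zero exit is expected.
    rclone -P copy "$SRC" "$DST"

    # List the files that did not make it to the destination.
    # rclone check also exits non-zero when it finds differences.
    rclone check --missing-on-dst missing-files.txt "$SRC" "$DST"

    # Change the extensions so the names match the PDF export.
    to_pdf_list missing-files.txt missing-files-pdf.txt

    # Pass 2: fetch only the missing files, exported as PDF.
    rclone -P copy "$SRC" "$DST" \
        --files-from missing-files-pdf.txt \
        --drive-export-formats pdf
}
```

Nothing runs until backup_with_pdf_fallback is called, so to_pdf_list can be tried on a sample list first to confirm the rewritten names look right.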
