Download a file from Google Drive that requires authentication

I want to migrate data from my Google Drive account to another drive. The problem is that the account recently had its Drive API disabled and is restricted to its organization, so the only way I can get the file over to the other drive is to download it manually while I'm logged into that account; otherwise I run into a 403 error. (I even tried sharing the file by link, but that doesn't work either.)

I did some research and found that I can use "rclone copyurl" to download a file from a URL straight to a drive, but when I use the download link the site gave me, the client gets redirected to the login page. Is it possible to make rclone fill in the login information on that page automatically so it can reach the direct download link for that file? Thank you.

rclone copyurl expects the authentication to be working - it won't fill in a web form for you.

It might be that you could get the cookies out of a working download session with your browser and pass them into rclone copyurl using the --header flag.
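
Something like this might work (just a sketch - the URL, the destination and the cookie name/value are placeholders for your own):

rclone copyurl "https://DOWNLOAD_URL" remote:path/to/file.bin --header "Cookie: NAME=VALUE"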


I'm not sure how to do this. Could you walk me through it step by step, or link me to a guide on how to get the cookies out of a particular download session? Thanks!

Here is a nice tutorial on how to get the cookies


Alright, I now know which cookie works for the download, but I don't know how to pass it to the --header flag like you mentioned. It wants a string that looks like Content-Encoding: gzip. Do I have to make a file that contains the cookie information and put it inside the rclone folder?

Sorry if these questions cause you any inconvenience; I'm fairly new at this.

You want to add a flag which looks like --header "Cookie: COOKIECONTENTS". You can repeat that flag if you have more than one cookie (or other header) to add.

If I understand correctly, what I need to add is --header "Cookie: COOKIENAME=COOKIECONTENT; COOKIENAME2=COOKIECONTENT2;..." (Cookie - HTTP | MDN)
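
So if I got it right, the full command would look roughly like this (URL, remote and cookie values are placeholders):

rclone copyurl "https://DOWNLOAD_URL" mydrive:folder/file.bin --header "Cookie: COOKIENAME=COOKIECONTENT; COOKIENAME2=COOKIECONTENT2"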

After I added the cookie as you instructed, rclone no longer downloads the login page, but now it gets redirected continuously and stops the download after 10 redirects. This is the error it's giving me now:

2021-05-13 18:29:51 ERROR : Attempt 1/3 failed with 1 errors and: Get "https://docs.google.com/u/2/nonceSigner?nonce=q144j914oceco&continue=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx20905250000/08527663066856848882/08527663066856848882/1P7CtQFkhAbY5y2shTsMwSbUHoSL7gDqk?e%3Ddownload%26nonce%3D4gkkq2ae0np7g%26user%3D08527663066856848882%26authuser%3D2%26hash%3Dq310m65kgnculkir3nq10u8tvh4lqlvs&hash=7idnu63704r1mfbgu3m9l63foh7relo5": stopped after 10 redirects
2021-05-13 18:29:53 ERROR : Attempt 2/3 failed with 1 errors and: Get "https://docs.google.com/u/2/nonceSigner?nonce=uabiv6mk80j5g&continue=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx20905250000/08527663066856848882/08527663066856848882/1P7CtQFkhAbY5y2shTsMwSbUHoSL7gDqk?e%3Ddownload%26nonce%3D0niq8mfv1099m%26user%3D08527663066856848882%26authuser%3D2%26hash%3Dor7uv1g40rlpdrhpbq6fp5qth1gda0et&hash=5ql73vh2q2ke6mv7nmqsopnm9brcc99k": stopped after 10 redirects
2021-05-13 18:29:55 ERROR : Attempt 3/3 failed with 1 errors and: Get "https://docs.google.com/u/2/nonceSigner?nonce=2kub3hd7mad9o&continue=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx20905250000/08527663066856848882/08527663066856848882/1P7CtQFkhAbY5y2shTsMwSbUHoSL7gDqk?e%3Ddownload%26nonce%3Dnh4nhqcnjq44g%26user%3D08527663066856848882%26authuser%3D2%26hash%3Dsjjtmo3mf365lvbnmv2pca9eeblhvqd4&hash=evlj68hstp5qlv0en7eeq4ab5mc0giun": stopped after 10 redirects

Is there any way to tell rclone to follow the redirects all the way?

Looks like it is just being redirected to the same page - probably because it is missing something - try adding --use-cookies to the command line.
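
i.e. something like this (same placeholders as before):

rclone copyurl "https://DOWNLOAD_URL" remote:path/to/file.bin --use-cookies --header "Cookie: NAME=VALUE"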

Still ended up with the same error.

2021/05/15 01:35:34 Failed to copyurl: Get "https://doc-14-b8-docs.googleusercontent.com/docs/securesc/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx?e=download&nonce=xxxxxxxxxxx&user=08527xxxxxxxxx&authuser=2&hash=2gs0bk8pacbxxxxxxxxxxxx": stopped after 10 redirects

I expect you need some more cookies. Try in the browser again and see if there are any you missed?

I'm not sure how this auth flow works.

Note also that the cookies will stop working after a little while (1 hour maybe).

I've grabbed all the cookies showing in the browser (about 25 of the 34 that say "in use"). There are also entries like "Shared Workers", "Session Storage", "Indexed Database", etc., which don't display any content; I don't know how to add those in, or how important they are to the download.

Also, I don't think Google's cookies really expire that fast. If they had expired, wouldn't it return a 403 instead of redirecting me continuously?

Did that link come from the drive interface? How did you get it?

Maybe you'd be better off using google takeout to dump all the drive data?

Maybe you'd be better off using google takeout to dump all the drive data?

That was the first alternative I thought of for getting the file, but they had that disabled too.

Did that link come from the drive interface? How did you get it?

I logged into my Drive account and started downloading the file, then cancelled the download, opened the cookies and grabbed every cookie's name and contents. After I put them into rclone behind --header, I clicked the download button again to get a new link (the old link would start giving 403 after 3-5 minutes), then pasted the new link into rclone.

The Google Drive interface is a mass of JavaScript and I wouldn't be surprised if there were some more hoops to jump through other than just adding cookies.

If you use Google Chrome then you can find the request that downloaded the file and use "Copy as cURL" - it would be worth trying to see if you can get curl to work.

I'm not sure what to do here, so I apologize and please bear with me if I did anything wrong.
When I download the file, two requests pop up:

  • One for the dialog confirming the download, because the file is too large for the virus scan

  • One after clicking the "Download anyway" button and before the browser receives the file download

I copied the curl (cmd) of the latter one, removed all the carets (^), put everything on one line, changed -H into --header, then placed the download link and all the headers into the rclone copyurl command - and the download still ended up being stopped after 10 redirects. I had to remove one option and one header because I didn't know how to pass them in, and I don't think they were necessary anyway. They were:

  • -X "POST" ^
  • -H "sec-ch-ua: ^\^" Not A;Brand^\^";v=^\^"99^\^", ^\^"Chromium^\^";v=^\^"90^\^", ^\^"Google Chrome^\^";v=^\^"90^\^"" ^

As for the first curl command, I ran it through an online tool that converts it to an HTTP request. The result came back with the details of my file.
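
For reference, this is roughly how I translated each curl header into an rclone flag (the names and values here are just placeholders for the ones I copied):

curl:   -H "cookie: COOKIENAME=COOKIECONTENT"
rclone: --header "cookie: COOKIENAME=COOKIECONTENT"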

I tried this and I managed to download a file using a curl command like this.

If you added the right things then the same would probably work for you too.

$ curl 'https://doc-00-54-docs.googleusercontent.com/docs/securesc/vnjmc3opf376ui8kak0v67lxxxxxxxxx/q1bflrgl5ls942tqxxxxxxxxxxxxxxxx/162134xxxxxxx/173839140xxxxxxxxxxx/173839140xxxxxxxxxxx/1NDV_VKx3_Owc-WlfP6jexxxxxxxxxxxx?e=download&authuser=0' \
>   -H 'authority: doc-00-54-docs.googleusercontent.com' \
>   -H 'sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="90", "Google Chrome";v="90"' \
>   -H 'sec-ch-ua-mobile: ?0' \
>   -H 'upgrade-insecure-requests: 1' \
>   -H 'dnt: 1' \
>   -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.72 Safari/537.36' \
>   -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
>   -H 'x-client-data: CJC2yQEIorbJAQipncxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' \
>   -H 'sec-fetch-site: cross-site' \
>   -H 'sec-fetch-mode: navigate' \
>   -H 'sec-fetch-dest: iframe' \
>   -H 'accept-language: en-US,en;q=0.9' \
>   -H 'cookie: AUTH_k5e5ajmoxxxxxxxxxxxxxxxxxxxxxxxx=17383914xxxxxxxxxxxx|162134xxxxxxx|jv77jmjg0mnfo4inxxxxxxxxxxxxxxxx' \
>   --compressed

However that URL expired very quickly :frowning:

I think you might have to actually download the files using a browser, or maybe with a script and a headless chromium...

I figured it out! After I started downloading, I copied the curl command and modified all the header data so it would fit into rclone. Then I downloaded the file again so the link and the cookies got renewed (those two were the only things that changed between downloads), copied the new link in, updated the cookie header in the rclone command, and now it's working!
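
For anyone who finds this later, the working command looked roughly like this (the URL and cookie value are placeholders for the fresh values, mydrive: is just my remote, and the rest of the headers from the curl command were added the same way with more --header flags):

rclone copyurl "https://doc-xx-xx-docs.googleusercontent.com/docs/securesc/xxxxxxxx?e=download&authuser=2" \
  mydrive:path/to/file.bin \
  --header "cookie: AUTH_xxxxxxxx=xxxxxxxx" \
  --header "user-agent: Mozilla/5.0 ..."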

Thank you for your help!

Amazing persistence - well done :slight_smile:

Would be nice if rclone could read curl cookie-jar files

rclone ... --cookie-jar /path/to/cookies.txt

https://curl.se/docs/http-cookies.html
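
For reference, the cookie-jar files curl writes are plain text in the Netscape cookie file format - one cookie per tab-separated line (domain, include-subdomains flag, path, secure flag, expiry as a unix timestamp, name, value), something like:

# Netscape HTTP Cookie File
.google.com	TRUE	/	TRUE	1652688000	AUTH_xxxxxxxx	xxxxxxxxxxxx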

However this leaves open the question of passing the whole set of headers between programs.
Probably a flag similar to curl -H @headers.txt, e.g.

rclone ... --headers-from /path/to/headers.txt
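
where headers.txt would hold one header per line - the same format curl accepts with -H @headers.txt, as far as I can tell:

cookie: AUTH_xxxxxxxx=xxxxxxxxxxxx
user-agent: Mozilla/5.0 ...
accept-language: en-US,en;q=0.9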

I found this - don't know what format it uses