Google Drive to Shared Drives bulk move

Hi,

So I've been reading a lot of posts on the forum but couldn't quite find what I'm looking for. I hope you can help me with my fact-finding mission. Instead of creating multiple posts, I thought it would be good to put all my questions in one post so as not to pollute the forum.

I want to move a large collection of files from GDrive to SharedDrives. These are project folders that have hundreds of files/folders. Each project folder will have to be moved to an individual shared drive. I'm using Google Workspace accounts.

So the issues/questions I have are the following:

  1. How best to count files in the source (GDrive project folder) so I can stay within the Shared Drive limit of 400k files (does the limit apply to files only, or to folders as well?). Ideally I would like a total count for the project folder that I can export to csv/txt. I read something about an ls2 command, but after testing it doesn't seem to work (anymore).

  2. When moving files in the UI from GDrive to SDrive you're not allowed to move folders unless an admin enables this feature for you. Can the move command move folders?

  3. Since the files are already on GDrive I want to move them instead of copying them. In one of the posts I read that the 750GB/day upload limit does not apply to the move command; is that correct?

  4. If the move command is performed, will the URL-change behaviour be the same as in the Drive UI when moving files from GDrive to SDrive, i.e. folder URLs change, file URLs stay the same? And will the move have any impact on file-sharing permissions? I'm guessing not, but I'm not sure.

  5. Any tips on a correct/good migration command? For example, should I use --drive-server-side-across-configs=true?

My command will be pretty straightforward. Since I'm using a service account to store all that data, I will be impersonating that account to perform all the actions. I'm using rclone v1.52.3.

rclone --drive-impersonate accountname@companyx.com move gdrive:projectx\ sdriveprojectx:\ --progress --create-empty-src-dirs --verbose --log-file logfileupload.log

Any feedback is much appreciated!

rclone size remote:path/to/dir will count the files and directories for you recursively.
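
If you want the totals in a file for your csv/txt inventory, you can redirect the output, e.g. (reusing the remote/folder names from your post; check rclone size --help on your version to confirm the --json flag is available):

rclone size gdrive:projectx --json > projectx-size.txt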

Yes it can -- you will need --drive-server-side-across-configs and you will need to enable it however it gets enabled!

I don't think it does as all you are doing is changing some pointers in the storage backend rather than copying files. Though copying from gdrive to shareddrive may be different.

I think you are asking whether the folder or file IDs will change? I think that the file IDs will remain the same if you can move the folders. I think the folder IDs will remain the same too.

Yes you will need that if you want to move directories.

Do some small experiments first!
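
Putting those answers together, a sketch of your command might look like this (same remote names as in your post, with / instead of \ and the server-side flag added):

rclone --drive-impersonate accountname@companyx.com move gdrive:projectx sdriveprojectx: --drive-server-side-across-configs --progress --create-empty-src-dirs --verbose --log-file logfileupload.log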

Thanks a lot @ncw!

I will test everything and will report back.


So after some testing, this is what I have found:

  1. Moving files with --drive-server-side-across-configs works oddly if you don't also set server_side_across_configs = true in your rclone config file under your shared drive remote (see the config sketch after this list). If you don't include it, everything gets moved but folders that are not owned by the account you're impersonating are left behind. With server_side_across_configs included it all works (I would still do some checks and tests, as a few times files were copied instead of moved even though the move command was used).
  2. When moving files from Drive to Shared Drives, file Drive IDs don't change; folder Drive IDs, however, will change - as I expected.
  3. In my first few tries I kept getting the error fileWriterTeamDriveMoveInDisabled; it turned out to be related to a setting in the Google Admin console. If you encounter this issue you can fix it by going to the Shared Drive settings and, under Migration options, checking the box that allows editors to migrate files to shared drives (p.s. this is independent of the admin role you can set to allow a user to move any folder to a shared drive).
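
For reference, the shared drive remote's stanza in rclone.conf then ends up looking roughly like this (the drive ID is a placeholder):

[sdriveprojectx]
type = drive
team_drive = 0ABCdEfGhIjKlUk9PVA
server_side_across_configs = true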

I'm still having two more issues:

  1. I can't get rclone size to work - I tried rclone size gdrive:testupload1\ and rclone size gdrive:testupload1/to/dir, but both give the error "failed to size: directory not found" even though that folder is in the root of My Drive. I think I'm not using the command correctly.

  2. My second issue is that I want to move subfolders. From the example above: I have a sub-folder Folder A inside the folder Testupload1, which sits in the root of My Drive. How do I specify a command that only copies/moves the subfolder?

Your command looks OK, but note that you need / not \ in the remotes. Note also that Google Drive is case sensitive, so testupload1 is a different directory to Testupload1.

Here is an example

ncw@dogger:~$ rclone lsf TestDrive:
A1/
GDocs/
README.md
src/
test/
$ rclone size TestDrive:test
Total objects: 1004
Total size: 159.414 MBytes (167157637 Bytes)
$ rclone size TestDrive:test/test-2612
Total objects: 1
Total size: 64 MBytes (67108864 Bytes)

To copy a subfolder you specify the path to it and you specify the same path in the destination (assuming you want to keep that directory structure).

rclone copy TestDrive:test/test-2612 destination:test/test-2612 

This copies the contents of TestDrive:test/test-2612 to the folder destination:test/test-2612


So I was able to test rclone size path:\ but it still only shows the number of files and skips folders. So I tried it on a bigger folder that I know holds a lot of data (100GB); after waiting for 10 minutes I stopped the command, as this won't work for me.

So I found out you can display the number of files, folders, and their size by just using Drive File Stream (or Drive for Desktop, as it's called now). This is a much quicker way to do it and it gives a complete overview of files, folders and total size without using any bandwidth (you don't have to sync folders to your PC to do the sizing; it works with online-only files as well).

Did you want it to count folders? That would be useful I think.

I expect it would have worked eventually!
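
If you give it another go on a big folder, --fast-list might speed up the listing on drive at the cost of some memory (the folder name below is just an example):

rclone size gdrive:bigfolder --fast-list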

I believe it would! It's just that a TreeSize export provided an easier way, for me, to process the data.

So I'm playing around trying to get a sub-folder move going, but it doesn't seem to work the way I want it to. This is my command:

rclone --drive-impersonate serviceaccount@domain.com move "gdrive:Projects/Project A" "SharedDrive_ProjectA:Project A" --drive-server-side-across-configs --progress --create-empty-src-dirs --delete-empty-src-dirs --verbose --log-file log.txt

So I want to move individual project folders from the 'Projects' folder to individual shared drives. And after the move command is done, I want rclone to delete the then-empty 'Project ...' folder from 'Projects'.
When I run the above command, the contents of the Project A folder are moved instead of the Project A folder itself. The contents of the Project A folder get deleted from the source, but the Project A folder itself does not.

Testing this made me think of duplicate names in the folder structure: can rclone perform actions based on file/folder IDs instead of names? That would be so much better, as I have to build a huge path to get to the Projects folder (and it would also eliminate issues with duplicated names).

hi,
not sure if this is what you need
https://rclone.org/drive/#copyid
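
e.g. something like this (the file ID is made up, and I believe this backend command may need a newer rclone than your 1.52.3 - see the linked docs):

rclone backend copyid gdrive: 1AbCdEfGhIjKlMnOpQrSt "gdrive:Projects/Project A/"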

If you don't want this, then put another bit on the end of the destination with the new directory name you want the contents of Project A in.

There is copyid as linked above. You can also set root_folder_id in the config to set the ID of the root directory to start from.
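
For example, a remote pointing straight at your Projects folder might look like this in the config (remote name and ID are made-up placeholders - the real folder ID is visible in the folder's URL in the Drive UI):

[gdrive_projects]
type = drive
root_folder_id = 1AbCdEfGhIjKlMnOpQrSt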

You're right, that would be the only option. And I was able to get the full file paths of the Google Drive folders with TreeSize, so there shouldn't be an issue with duplicated folder names.

The only issue left that I'm facing is that I'm using --delete-empty-src-dirs, which doesn't seem to work. I'm passing this flag as part of the command I shared earlier; after the Project A content has been moved to a shared drive, I want the Project A folder in the source to be deleted. In the logs I'm not getting any errors, so... what could be the issue?

Just when you think you're finally done..

So I did a trial run on a google drive folder and I'm getting the following error now:
Couldn't move: googleapi: Error 403: Cannot move a file into a shared drive as a writer when the owner of the file is not a member of that shared drive., fileOwnerNotMemberOfTeamDrive

The account that's being impersonated for the migration has the Manager role on the shared drive and the Editor role on the source folder.

This is only active if rclone doesn't just move the whole directory, and I think you are just moving the whole directory, aren't you?

What are you expecting that flag to do?

It is complaining about the owner of the file, not the user running the impersonation. Is the owner of the file a member of the team drive?

I'm not sure what rclone sees as the entire directory? I have a parent and a child folder; I'm moving the contents of the child folder to a SD, and I want that child folder to be deleted after that.

You were right about owners not being present in the shared drive. Because I can move any file/folder to a SD from the UI, I assumed rclone would be able to do the same without me having to first gather info about all the file owners, etc. An added difficulty is that a lot of files were uploaded by external users, so even if I could get a list of the current in-domain owners, it still wouldn't move files owned by out-of-domain users. :frowning: I guess there's no other way to do this...?

Ah you mean delete the root folder of the transfer...

There was quite a long argument about this I seem to remember.

Rclone doesn't delete the root folder. Maybe it should with a flag?
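
In the meantime you could delete the now-empty source folder yourself in a second step, something like this (reusing your remote names from earlier):

rclone --drive-impersonate serviceaccount@domain.com rmdir "gdrive:Projects/Project A"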

You should be able to do a copy instead of a move which will involve download and re-upload. You could do the sync first then fill in any gaps with the copy.

It would be nice if it could; this is useful when working with a high volume of folders. I'm moving/copying/etc. about 3k root folders that in total have about 500k sub-folders.

Yes, I think my last option is to just do a copy. But as I understand it, that should not require download and re-upload if I'm doing it server-side with the flag, right?

Also, don't you mean copy and then sync to fill in the gaps, or am I missing something? My idea was to do a full copy first and then do a delta migration a few days before the actual go-live.

If you want to, please make a new issue about this on GitHub, then we'll have it as an official feature request!

I think I meant to write do the move first then fill in the gaps with copy... as the ones you can move will move across and the ones you can't you'll have to copy.

I think a server side copy will work but if it doesn't for some bizarre permissions reason (a possibility) then you can do download and upload.

If you want to leave a copy in the source then using sync repeatedly is probably the way to go. That will do a full copy the first time you run it then copy deltas. Beware sync will remove new stuff you upload to the destination to make it the same as the source. If you don't want that use copy instead of sync for first and subsequent syncs.
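
So a sketch of the whole sequence, with the remote names from earlier in the thread (impersonation and logging flags omitted):

# 1. move everything that can be moved server-side
rclone move "gdrive:Projects/Project A" "SharedDrive_ProjectA:Project A" --drive-server-side-across-configs
# 2. copy whatever the move couldn't take (files owned by non-members)
rclone copy "gdrive:Projects/Project A" "SharedDrive_ProjectA:Project A"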

I think I meant to write do the move first then fill in the gaps with copy ... as the ones you can move will move across and the ones you can't you'll have to copy.

This is great! I didn't realize it worked like that! I think this is the only way to cope with the issue I have.

So I'm running a test copy with server-side enabled on about 10GB of data, and what I noticed from my previous tests is that transfer speeds vary a lot. I've had situations where I copied 800MB at 7 MBytes/s, but now I'm stuck at around 2-2.4 MBytes/s. Is there a way to affect that speed? Am I doing something wrong?

my command:
rclone --drive-impersonate account@domain.com copy "gdrive:foldername" "shareddrive:foldername" --drive-server-side-across-configs --progress --create-empty-src-dirs --log-level DEBUG --log-file sd_foldername_log.txt

EDIT:
I just noticed the following in the debug log:

error reading source directory: list: failed to resolve shortcut: googleapi: Error 403: User Rate Limit Exceeded. Rate of requests for user exceed configured project quota. You may consider re-evaluating expected per-user traffic to the API and adjust project quota limits accordingly. You may monitor aggregate quota usage and adjust limits in the API Console: https://console.developers.google.com/apis/api/drive.googleapis.com/quotas?project=xxxx, userRateLimitExceeded

Looking at the GCP stats, I'm well below the daily and per-user quotas, at around 600.

Great

What you are doing looks OK

These errors do slow things down.
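
One thing worth trying if the 403s persist is capping the request rate with --tpslimit (10 transactions per second is a common starting point), e.g.:

rclone copy "gdrive:foldername" "shareddrive:foldername" --drive-server-side-across-configs --tpslimit 10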

Are you using your own client_id and client_secret?

Yes, I am. Is there a way to double-check this?