Transfer 2TB of data from a Google Drive Edu account into Google One

Whoops - as if I wasn't already dragging this longer than I should. Sorry!

I'm running this command now:

rclone check edu:source personal:dest --fast-list

Update
I have run the above and got a shorter (but still pretty long) list of NOTICE lines. Among them are 36 errors which look like this (I have replaced some sensitive info in the path with "sensitiveinfo"):

2022/11/14 15:32:47 ERROR : sensitiveinfo/Graphic Design/sensitiveinfo/sensitiveinfo/_collateral/1pager/About Us/8reasons/_assets/.DS_Store: file not in Google drive root 'dest'

In the same path, other files have been copied...

2022/11/14 15:32:47 NOTICE: sensitiveinfo/Graphic Design/sensitiveinfo/sensitiveinfo/_collateral/PPT/8me_content PPT/_assets/Team Circle/sensitiveinfo.png: Duplicate object found in source - ignoring

Not sure what the problem is at this point - I thought maybe it was the title of the file ".DS_Store", but other types of files have failed too, and some other ".DS_Store" files have been copied successfully.

that means a source file was not copied to the dest.
need to run the rclone copy again, but this time with --fast-list

gdrive allows for multiple files to have the same name in the same dir
local file system does not allow for multiple files to have the same name in the same dir.
that creates a problem for rclone.
need to remove the duplicate(s) from the source.
try https://rclone.org/commands/rclone_dedupe/
rclone dedupe edu:source --dry-run --fast-list
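
To see which names gdrive holds more than once before deduping, something like this might help. It is only a minimal sketch: duplicate_names is a made-up helper, not an rclone command, and it assumes each duplicated file shows up as a repeated line in the rclone lsf listing.

```shell
# Hypothetical helper: print every path that appears more than once on stdin.
duplicate_names() {
  sort | uniq -d
}

# Assumed usage with the remote from this thread (needs a configured rclone):
#   rclone lsf -R --files-only --fast-list edu:source | duplicate_names
```

An empty result would suggest that no duplicate names are left in the source.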

Got it! Just so I know what I am doing.

Step 1: run rclone dedupe edu:source --dry-run --fast-list to delete duplicate files in edu
Step 2: run rclone copy edu:source personal:dest --drive-server-side-across-configs --drive-stop-on-upload-limit --fast-list again

Thank you!

sure, you are welcome.

correct.


So, I have run this command (multiple times actually):

rclone dedupe edu:source --dry-run --fast-list

and "renamed" the files (they had the same name but they are actually different files).

And then ran this again (multiple times for good measure lol)

rclone copy edu:source personal:dest --drive-server-side-across-configs --drive-stop-on-upload-limit --fast-list

But it's not working - for some reason it doesn't see the "new" files - i.e. the ones with the same name that I have just amended - and still sees the previous ones, which look like duplicates...

Here's a few:

2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
2022/11/14 17:26:20 NOTICE: iPad/Procreate/Opera senza titolo.procreate: Duplicate object found in source - ignoring
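
A long run of identical notices like this can be grouped by path to see how many copies each name has. A rough sketch, assuming the log lines look exactly like the ones above; count_duplicate_notices is a hypothetical helper name, not part of rclone.

```shell
# Hypothetical helper: read rclone log lines on stdin and print
# "<count><TAB><path>" for every path reported as a duplicate in the source.
count_duplicate_notices() {
  awk '
    / NOTICE: .*: Duplicate object found in source/ {
      path = $0
      sub(/^.* NOTICE: /, "", path)                           # drop timestamp and level
      sub(/: Duplicate object found in source.*$/, "", path)  # drop the message
      count[path]++
    }
    END { for (p in count) printf "%d\t%s\n", count[p], p }
  ' | sort -rn
}

# Assumed usage (rclone writes its log to stderr by default, hence 2>&1):
#   rclone copy edu:source personal:dest --fast-list --dry-run 2>&1 | count_duplicate_notices
```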

All of these have been renamed (I can see them in my drive) to:

Opera senza titolo (1).procreate
Opera senza titolo (2).procreate
Opera senza titolo (3).procreate
etc..

Any thoughts?

please take a few minutes to read the documentation for each command and for each flag.

"Do a trial run with no permanent changes. Use this to see what rclone would do without actually doing it."

Important: Since this can cause data loss, test first with the --dry-run

Thank you! And sorry - I should have looked into that before asking any questions. I really apologize, I know you're taking a lot of your own time to help me and I should know better.

Unfortunately, I'm still a bit confused. If --dry-run does "a trial run with no permanent changes", how come it actually changed the name of my files on the edu when I ran rclone dedupe edu:source --dry-run --fast-list ?

And still can't wrap my head around the fact that rclone copy edu:source personal:dest --drive-server-side-across-configs --drive-stop-on-upload-limit --fast-list didn't work... It doesn't have "--dry-run" in it...

Dry run doesn't actually "do" anything: it executes no changes and only shows what "will" be done, not what "has" been done.

 rclone dedupe GD: --dry-run -vv
2022/11/14 13:32:10 DEBUG : Setting --config "/opt/rclone/rclone.conf" from environment variable RCLONE_CONFIG="/opt/rclone/rclone.conf"
2022/11/14 13:32:10 DEBUG : rclone: Version "v1.60.0" starting with parameters ["rclone" "dedupe" "GD:" "--dry-run" "-vv"]
2022/11/14 13:32:10 DEBUG : Creating backend with remote "GD:"
2022/11/14 13:32:10 DEBUG : Using config file from "/opt/rclone/rclone.conf"
2022/11/14 13:32:10 INFO  : Google drive root '': Looking for duplicate names using interactive mode.
2022/11/14 13:32:12 NOTICE: hosts: Found 2 files with duplicate names
2022/11/14 13:32:12 NOTICE: hosts: Deleting 1/2 identical duplicates (md5 9c62499226af0bc2827abded7cc62e91)
2022/11/14 13:32:12 NOTICE: hosts: Skipped delete as --dry-run is set (size 510)
2022/11/14 13:32:12 NOTICE: hosts: All duplicates removed
2022/11/14 13:32:12 DEBUG : 12 go routines active

The key there is this line:

2022/11/14 13:32:12 NOTICE: hosts: Skipped delete as --dry-run is set (size 510)

If I remove dry-run, it actually deletes the file as shown in the output below.

rclone dedupe GD: -v
2022/11/14 13:34:22 INFO  : Google drive root '': Looking for duplicate names using interactive mode.
2022/11/14 13:34:24 NOTICE: hosts: Found 2 files with duplicate names
2022/11/14 13:34:24 NOTICE: hosts: Deleting 1/2 identical duplicates (md5 9c62499226af0bc2827abded7cc62e91)
2022/11/14 13:34:24 INFO  : hosts: Deleted
2022/11/14 13:34:24 NOTICE: hosts: All duplicates removed

with --dry-run, rclone does not actually rename the files, just shows what would happen when you remove --dry-run
"Use this to see what rclone would do without actually doing it"
so if you want rclone to do the renaming, then remove --dry-run
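
The preview-then-apply flow could be wrapped up roughly like this. A sketch only: dedupe_with_preview is a hypothetical function name, and it assumes the non-interactive --dedupe-mode rename, which per the rclone docs removes identical files and then renames the rest to be different.

```shell
# Hypothetical wrapper: preview a rename-mode dedupe, then apply it only
# after an explicit "y".
dedupe_with_preview() {
  local remote="$1" answer
  # Preview: nothing on the remote is changed while --dry-run is set.
  rclone dedupe --dedupe-mode rename "$remote" --fast-list --dry-run || return 1
  printf 'Apply these changes to %s? [y/N] ' "$remote"
  read -r answer
  if [ "$answer" = "y" ]; then
    # Same command without --dry-run: this is the run that really changes files.
    rclone dedupe --dedupe-mode rename "$remote" --fast-list
  fi
}

# Assumed usage with the remote from this thread:
#   dedupe_with_preview edu:source
```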

let's deal with one issue at a time,
first, get the dedupe completed.


Thank you both!

So I was looking at the local mirror of the drive that was showing the below.

But actually - jojo you already kindly explained earlier how this behaves - I just didn't fully understand what you meant when you explained it :face_exhaling:

On the web version of the drive the file names are all the same, but on the local version they have (1), (2), (3), etc

Since the local version doesn't allow for multiple files to share the same name in the same dir.. I am assuming they got renamed automatically just locally... is that right?

Sorry, I feel like I'm catching up with old news, but now I understand what happened... Again, it's never my intention to be disrespectful to those who are kindly helping me - I just take so much time to understand these concepts. :pray:

Anyway! Now I should probably run this (and select "rename")

rclone dedupe edu:source  --fast-list

and then this

rclone copy edu:source personal:dest --drive-server-side-across-configs --drive-stop-on-upload-limit --fast-list

Is that right?

correct, run that command


Done! And double checked on the web version of Gdrive and they got renamed (just checking 1 folder, assuming it did the same to the rest). Now onto copying, with the below, is that right?

rclone copy edu:source personal:dest --drive-server-side-across-configs --drive-stop-on-upload-limit --fast-list

good,

yes, that is right


on second thought,

if you are running rclone copy ..., kill that command and run
rclone sync edu:source personal:dest --drive-server-side-across-configs --drive-stop-on-upload-limit --fast-list


Running the latter now... The first didn't do much unfortunately (seems like it didn't copy anything). Fingers crossed (and thank you a million!!!)

that is ok.

sorry for the confusion.
if the source changes, then better to run rclone sync, not rclone copy.

Gotcha - unfortunately still no changes though. It gives back a bunch of "NOTICE" ending in "Duplicate object found in destination - ignoring". I have refreshed the page in my personal account and I'm still stuck at 1.54 TB (edu is 1.8TB).

Should I run this again, do you think?

rclone check edu:source personal:dest --fast-list

I checked the "ipad" folder where the files we were checking were, and they have been copied. Maybe it needs some time to sync the size displayed in the web version of the drive?
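
Rather than relying on the size shown in the web UI, the check can be asked to write out exactly which source files are missing from the destination. A minimal sketch, assuming the remotes from this thread; list_missing_on_dest and the report path are made up for illustration, while --one-way and --missing-on-dst are documented rclone check flags.

```shell
# Hypothetical helper: write the paths present in the source but absent from
# the destination to a report file, then print how many there are.
list_missing_on_dest() {
  local src="$1" dst="$2" report="$3"
  : > "$report"   # make sure the report file exists and starts empty
  # --one-way only verifies that source files exist in the destination.
  # rclone check exits non-zero when it finds differences, so ignore the status.
  rclone check "$src" "$dst" --one-way --fast-list --missing-on-dst "$report" || true
  wc -l < "$report" | tr -d ' '
}

# Assumed usage:
#   list_missing_on_dest edu:source personal:dest missing.txt
```

A count of 0 would suggest everything in the source has a counterpart in the destination, whatever the web UI currently shows.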

then dedupe the dest.

  1. rclone dedupe personal:dest --fast-list --dry-run
  2. if the output looks ok, then
    rclone dedupe personal:dest --fast-list

this is why i do not use gdrive.
Sometimes, for no reason I've been able to track down, drive will duplicate a file that rclone uploads

I've legit never seen a duplicate ever in my use of Google Drive. I can never make it happen unless I log into the WebUI and do it on purpose.