[Google Drive] rclone copy missing files on destination drive (Team Drive A to Team Drive B)

What is the problem you are having with rclone?

I've been using rclone for about 2 years now (really love it!), but I have never been able to successfully copy all of my files from one Team Drive to another Team Drive.
I keep ending up about a hundred files short, and I don't even know which files are missing from the transfer.
I've repeated the exact same command more than 5 times and I'm still missing some files, despite rclone saying that everything has been copied over to the destination drive.

Run the command 'rclone version' and share the full output of the command.

rclone v1.58.1

  • os/version: Microsoft Windows 10 Pro 21H2 (64 bit)
  • os/kernel: 10.0.19044.1766 (x86_64)
  • os/type: windows
  • os/arch: amd64
  • go/version: go1.17.9
  • go/linking: dynamic
  • go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

Google Drive (a Team Drive to another Team Drive)

The command you were trying to run (eg rclone copy /tmp remote:tmp)

I wrote my own simple Python script around rclone to be able to rotate between some service accounts, but the command is basically as below

rclone copy Temporary: PrivateTD1: -P --check-first --fast-list --transfers 4 --checkers 6 --max-transfer 745G --create-empty-src-dirs --drive-acknowledge-abuse --drive-stop-on-upload-limit=true --drive-server-side-across-configs --log-file=log.txt --log-level DEBUG --drive-service-account-file=path_to_service_accounts.json
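
For illustration, the rotation idea boils down to something like the sketch below. This is a minimal sketch, not my actual script, and the folder name and file layout are assumptions (one JSON key file per service account):

# Minimal sketch of the service-account rotation -- hypothetical names, not the real script.
import subprocess
from pathlib import Path

SA_DIR = Path("service_accounts")  # assumption: one JSON key file per account

for sa_file in sorted(SA_DIR.glob("*.json")):
    result = subprocess.run([
        "rclone", "copy", "Temporary:", "PrivateTD1:",
        "--drive-service-account-file", str(sa_file),
        "--max-transfer", "745G",
        "--drive-stop-on-upload-limit",
        "--drive-server-side-across-configs",
        "-P",
    ])
    if result.returncode == 0:
        break  # copy finished cleanly, no need to rotate further
    # non-zero exit: the daily upload quota was likely hit, so try the next account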

The rclone config contents with secrets removed.

(Should I include the Team Drive ID and the Folder ID as well?)

[PrivateTD1]
type = drive
client_id = *****
client_secret = *****
scope = drive
token = *****
team_drive = *****
root_folder_id = 
service_account_file = 
server_side_across_configs = true

[PrivateTD2]
type = drive
client_id = *****
client_secret = *****
scope = drive
token = *****
team_drive = *****
root_folder_id = 
service_account_file = 
server_side_across_configs = true

[Temporary]
type = drive
client_id = *****
client_secret = *****
scope = drive
token = *****
team_drive = *****
root_folder_id = *****
service_account_file = 
server_side_across_configs = true

A log from the command with the -vv flag

I have removed the transfer logs because they contain filenames that may not be safe to show.
complete log in my github gist

2022/07/05 18:43:25 DEBUG : rclone: Version "v1.58.1" starting with parameters ["D:\\Utility\\rclone\\rclone" "copy" "Temporary:" "PrivateTD1:" "-P" "--check-first" "--fast-list" "--transfers" "4" "--checkers" "6" "--max-transfer" "745G" "--create-empty-src-dirs" "--drive-acknowledge-abuse" "--drive-stop-on-upload-limit=true" "--drive-server-side-across-configs" "--log-file=log.txt" "--log-level" "DEBUG" "--drive-service-account-file=path_to_service_accounts.json"]
2022/07/05 18:43:25 DEBUG : Creating backend with remote "Temporary:"
2022/07/05 18:43:25 DEBUG : Using config file from "D:\\Utility\\rclone\\rclone.conf"
2022/07/05 18:43:25 DEBUG : Temporary: detected overridden config - adding "{ZJse-}" suffix to name
2022/07/05 18:43:25 DEBUG : fs cache: renaming cache item "Temporary:" to be canonical "Temporary{ZJse-}:"
2022/07/05 18:43:25 DEBUG : Creating backend with remote "PrivateTD1:"
2022/07/05 18:43:25 DEBUG : PrivateTD1: detected overridden config - adding "{ZJse-}" suffix to name
2022/07/05 18:43:25 DEBUG : fs cache: renaming cache item "PrivateTD1:" to be canonical "PrivateTD1{ZJse-}:"
2022/07/05 18:43:25 INFO  : Google drive root '': Running all checks before starting transfers
2022/07/05 18:43:32 DEBUG : Google drive root '': Disabling ListR to work around bug in drive as multi listing (3) returned no entries
2022/07/05 18:43:32 DEBUG : Google drive root '': Recycled 3 entries
2022/07/05 18:43:33 DEBUG : Google drive root '': Re-enabling ListR as previous detection was in error

----- removed transfer logs -----

2022/07/05 18:43:43 INFO  : Google drive root '': Checks finished, now starting transfers
2022/07/05 18:43:43 DEBUG : Google drive root '': Waiting for transfers to finish
2022/07/05 18:43:43 INFO  : There was nothing to transfer
2022/07/05 18:43:43 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Checks:              9836 / 9836, 100%
Elapsed time:        18.8s

2022/07/05 18:43:43 DEBUG : 34 go routines active

Total files on each drive

Temporary (source)
----------
Total objects: 9.933k (9933)
Total size: 2.984 TiB (3281416834376 Byte)

PrivateTD1 (destination)
----------
Total objects: 9.837k (9837)
Total size: 2.985 TiB (3282195126567 Byte)

PrivateTD2 (destination)
----------
Total objects: 9.837k (9837)
Total size: 2.985 TiB (3282195126567 Byte)

Check result

rclone check Temporary: PrivateTD1: -P --check-first --fast-list --one-way --log-file=rclone_check.txt --log-level DEBUG

check log on github gist
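
Side note for anyone following along: rclone check can also write just the paths that are missing on the destination to a file, which would have answered my "which files are missing" question directly. Assuming I'm reading the flag docs right, something like:

rclone check Temporary: PrivateTD1: --one-way --missing-on-dst missing_on_dst.txt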

Dedupe result (--dry-run)

rclone dedupe Temporary: --dry-run -P --check-first --fast-list --log-file=rclone_dedupe.txt --log-level DEBUG

dedupe log on github gist

Without seeing a full log, it's nearly impossible to guess what's going on.

Some fixes currently in the beta may help, but that's just a guess.

Thank you for the fast reply, and I'm sorry for redacting the logs (which I know I shouldn't have done).
I've created a github gist containing the complete transfer logs here

Did you check for duplicates with rclone dedupe?

What do the missing files look like? Do they have anything in common?


I've added the check and dedupe results above.
Could you please take a look at them?

I had to add --dry-run while doing the dedupe, because I don't want to lose any files and I'm not sure which files might have duplicates; I haven't touched the drive for months, and some of the files haven't been touched in more than a year.

It's tough when posts are updated, as that makes the thread very hard to follow; it's best to just reply.

If you have duplicates, you'd have to decide what you want to keep or not keep. It's interactive, so it's up to you, as ncw can't decide what to keep for you.

rclone dedupe

Those duplicates would be the errors and why it doesn't match.
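
If you'd rather not answer the interactive prompts, dedupe also has non-interactive modes (skip, first, newest, oldest, rename and so on), so something along the lines of

rclone dedupe --dedupe-mode newest Temporary:

should keep the most recently modified copy of each duplicate. Which mode is safe depends on which copy you actually want to keep, so a --dry-run first is a good idea.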


You know what? I yolo'd it and ran dedupe on the source drive, as Nick suggested,
aaaaaaand... I seem to have 1:1 file counts, and even the total size matches the destination drives

thanks for helping me out! you guys are amazing!

2022/07/05 23:38:36 DEBUG : rclone: Version "v1.58.1" starting with parameters ["rclone" "size" "Temporary:" "-vv" "-P" "--fast-list" "--check-first"]
2022/07/05 23:38:36 DEBUG : Creating backend with remote "Temporary:"
2022/07/05 23:38:36 DEBUG : Using config file from "D:\\Utility\\rclone\\rclone.conf"
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:        15.5s
Total objects: 9.836k (9836)
2022/07/05 23:38:52 DEBUG : 20 go routines active

2022/07/05 23:38:52 DEBUG : rclone: Version "v1.58.1" starting with parameters ["rclone" "size" "PrivateTD1:" "-vv" "-P" "--fast-list" "--check-first"]
2022/07/05 23:38:52 DEBUG : Creating backend with remote "PrivateTD1:"
2022/07/05 23:38:52 DEBUG : Using config file from "D:\\Utility\\rclone\\rclone.conf"
2022-07-05 23:39:00 DEBUG : Google drive root '': Disabling ListR to work around bug in drive as multi listing (2) returned no entries
2022-07-05 23:39:00 DEBUG : Google drive root '': Recycled 2 entries
2022-07-05 23:39:01 DEBUG : Google drive root '': Re-enabling ListR as previous detection was in error
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:        15.3s
Total objects: 9.836k (9836)
2022/07/05 23:39:08 DEBUG : 18 go routines active

2022/07/05 23:39:08 DEBUG : rclone: Version "v1.58.1" starting with parameters ["rclone" "size" "PrivateTD2:" "-vv" "-P" "--fast-list" "--check-first"]
2022/07/05 23:39:08 DEBUG : Creating backend with remote "PrivateTD2:"
2022/07/05 23:39:08 DEBUG : Using config file from "D:\\Utility\\rclone\\rclone.conf"
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:        14.9s
Total objects: 9.836k (9836)
2022/07/05 23:39:23 DEBUG : 20 go routines active

Google Drive and a few other providers are annoying in that regard: any ordinary OS file system doesn't allow you to have duplicates, but Drive does, which is strange.

Happy you got it worked out!


I feel like these duplicates only appear after I use rclone, and I don't know what caused them.
Is it actually possible for rclone to cause duplicates?

"Possible" is quite a loophole of a word, so I'm sure it's possible, but it's generally unlikely.

I used Drive for years and never had a duplicate at all, but my use case is copying from a local source to a drive. You're perhaps more likely to get into duplicate situations copying drive to drive, as the rules are "loose".

Generally though, from what I've seen, rclone will error out on duplicates. Like in your logs:

2022/07/05 18:43:42 NOTICE: Anime/Super Cub: Duplicate directory found in source - ignoring
2022/07/05 18:43:42 NOTICE: 00TEMPORARY00/OLD FILES FROM BEFORE 2020-09-20/Shiro39 (Private)/Anime Vectos/Wallpaper/exit.png: Duplicate object found in source - ignoring
2022/07/05 18:43:42 NOTICE: 00TEMPORARY00/OLD FILES FROM BEFORE 2020-09-20/Shiro39 (Private)/Anime Vectos/Wallpaper/penguin.png: Duplicate object found in source - ignoring

I pulled out a few examples. How they got into the source would be a guessing game, as rclone generally can't copy duplicate files, so maybe something else did it?

I'm still not sure what caused the duplicates.
I pushed the files from an Ubuntu VPS, and there are no duplicates in the original files.
Then when I tried to copy the whole drive to another drive, I got duplicates.

this is still a guessing game for me, but I'm happy enough that my current issue is sorted out

anyway, thanks again!
I won't reply further so the thread can be closed soon (or maybe you can close it, if that's possible)

Rclone does make duplicates occasionally. I think it is to do with eventual consistency problems: rclone uploads some files, then lists the directory, and the files aren't there for a second or two, so they can end up being uploaded again. The duplicates are more common with Shared Drives than the main drive.

I suspect it is a google problem rather than an rclone problem, but either way, that is why rclone dedupe was born.


So yesterday I uploaded a file from my local storage to one of the Team Drives.
Next, I rclone synced this TD to another TD.

When I checked the number of files, there was a difference of 1 or 2 files.
Then I ran dedupe with --dry-run, and it turned out I did indeed have a duplicate.
But the weird thing is that the duplicate was of a file that has been sitting there for months, not of the newly uploaded file.

I had no more duplicates after I ran the dedupe as you suggested above, which resolved my issue in this topic.
So it's weird that I suddenly have a duplicate again, yet the duplicate wasn't of the file I uploaded locally.

Well, I guess I should always run dedupe after performing a copy / sync, be it a local upload or server-side...
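
If I do end up scripting that, it would probably be a thin wrapper along these lines (a rough sketch; the function name and the "newest" dedupe mode are just my own choices, not anything prescribed here):

# Rough sketch: sync, then clean up whatever duplicates the sync left behind.
import subprocess

def sync_and_dedupe(src: str, dst: str) -> None:
    subprocess.run(["rclone", "sync", src, dst,
                    "--check-first", "--fast-list"], check=True)
    # --dedupe-mode newest keeps the most recently modified copy of each duplicate
    subprocess.run(["rclone", "dedupe", "--dedupe-mode", "newest", dst],
                   check=True)

sync_and_dedupe("PrivateTD1:", "PrivateTD2:")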


I've never had rclone make a duplicate in my use; in my experience it's generally something else doing it. I'm not saying it's not possible, but what else is using the drives?

I don't think there's anything else using the drives other than rclone...

After I ran the dedupe as Nick suggested above, I no longer have any duplicates, and the number of files matches between drive A and B... exactly the same.

But then:
--> ran dedupe as Nick suggested above
--> no more duplicates at all on any of the drives, be happy
--> uploaded a file to one of the drives
--> rclone synced it to another drive
--> duplicate appeared out of nowhere

The solution to this... right, dedupe yet again.

You'd have to share a log file so we can see what's going on and where the issue is.

Was that via rclone? What command?

That's the log file we'd want to see with the specific file pointed out that made a duplicate.

Was that via rclone? What command?

rclone copy D:/file_name.ext DriveName: -vv -P --fast-list --check-first --transfers 1 --checkers 4
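
Next time I'll tack a log file onto that same command so I can actually capture what happens, presumably something like:

rclone copy D:/file_name.ext DriveName: -vv -P --fast-list --check-first --transfers 1 --checkers 4 --log-file=upload_log.txt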

That's the log file we'd want to see with the specific file pointed out that made a duplicate.

I'm so sorry, but I forgot to capture the log because I thought my issue had been solved by doing the dedupe.

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.