Destination local HDD size larger than Google Drive source?

What is the problem you are having with rclone?

Downloading Google Drive data to a hard disk, but the sizes do not match.

Run the command 'rclone version' and share the full output of the command.

rclone v1.64.2
- os/version: Microsoft Windows 11 Pro 21H2 (64 bit)
- os/kernel: 10.0.22000.2176 (x86_64)
- os/type: windows
- os/arch: amd64
- go/version: go1.21.3
- go/linking: static
- go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

I ran this and it completed

rclone copy gdcdhd: D:\GoogleDriveComputers\DadHardDrive -v --drive-acknowledge-abuse

but ran these and the sizes were different:

rclone size gdcdhd:
rclone size D:\GoogleDriveComputers\DadHardDrive

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[gdcdhd]
type = drive
client_id = XXX
client_secret = XXX
scope = drive
root_folder_id = XXX
token = XXX
team_drive =

A log from the command that you were trying to run with the -vv flag

rclone size gdcdhd:
Total objects: 27.155k (27155)
Total size: 175.069 GiB (187978423564 Byte)

rclone size D:\GoogleDriveComputers\DadHardDrive
Total objects: 27.158k (27158)
Total size: 175.553 GiB (188498255116 Byte)

Why is it that the destination has more files than the source?

Who knows....

Run

rclone check gdcdhd: D:\GoogleDriveComputers\DadHardDrive

to find out what these files are.

Maybe Windows creates some files for its own use? Maybe some files were at the destination before?

Also note that rclone copy does not delete anything that is already present at the destination. Maybe you should use rclone sync if you want to make the destination the same as the source?

I ran that and it found 3 files whose names had .partial in them. I thought rclone didn't keep partials, i.e. if a transfer was terminated halfway it would delete the entire file.

I created a new folder just before running rclone copy, so there shouldn't be anything in it. Is it OK to use rclone sync now after using rclone copy? Would it remove anything extra, or would it redownload everything again?

Good that you identified the culprit. .partial files are used by default on local storage nowadays. They can be disabled with the --inplace flag. Normally they should be deleted automatically - unless there was some problem.
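For example, assuming you wanted to repeat the same copy but without the .partial temporary files (at the cost of losing the safety of write-to-temp-then-rename), it could look something like this:

rclone copy gdcdhd: D:\GoogleDriveComputers\DadHardDrive -v --drive-acknowledge-abuse --inplace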

Yes, it is OK. Nothing will be downloaded again unless needed. Whatever is already on the destination will only be checked against the source, and only missing files will be downloaded. Anything not present at the source will be deleted. Put simply, sync will make sure that source == destination.

You can always run it first with the -vv --dry-run flags to test what rclone is going to do.
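With your paths that would be something like this - it only prints what would be copied or deleted, without changing anything:

rclone sync gdcdhd: D:\GoogleDriveComputers\DadHardDrive -vv --dry-run --drive-acknowledge-abuse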

I see. Thank you for the advice and your help. I guess I should have started with rclone sync instead huh...


Back again with another problem... I ran sync and the number of files and the sizes on the two sides didn't match. I narrowed it down to one folder, but I can't figure out which files are missing. I ran sync and check; it synced completely and the check reported 0 differences found, yet files are missing on the destination. I tried --differ and --missing-on-dst and it's still the same.

rclone size gdc:Shared/Personal
Total objects: 356 (356)
Total size: 434.150 MiB (455239094 Byte)
rclone size D:\GoogleDriveComputers\Shared\Personal
Total objects: 352 (352)
Total size: 433.886 MiB (454962504 Byte)

rclone check gdc:Shared/Personal D:\GoogleDriveComputers\Shared\Personal --missing-on-dst missingfiles -vv -P
...
snipped
...
2023/11/08 08:30:39 NOTICE: Local file system at //?/D:\GoogleDriveComputers\Shared\Personal: 0 differences found
2023/11/08 08:30:39 NOTICE: Local file system at //?/D:\GoogleDriveComputers\Shared\Personal: 352 matching files
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:               352 / 352, 100%
Elapsed time:        43.1s
2023/11/08 08:30:39 INFO  :
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:               352 / 352, 100%
Elapsed time:        43.1s

Given that rclone check confirms that both source and destination have the same number of files, this leads me to believe that the four "missing" objects are empty directories, which are not synced by default. Add --create-empty-src-dirs to your sync and try again.
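For that one folder it could look something like this (add --dry-run first if you want to see what it would do before it does it):

rclone sync gdc:Shared/Personal D:\GoogleDriveComputers\Shared\Personal --create-empty-src-dirs -vv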

And if there are still differences you can find them by running:

rclone lsf -R gdc:Shared/Personal | sort > source-sorted
rclone lsf -R D:\GoogleDriveComputers\Shared\Personal | sort > dest-sorted
comm -23 source-sorted dest-sorted

OK, I will try that... Also, do you know why the output from --differ doesn't get saved to a text file with the command below? I get a difference.txt in the folder where rclone is, but it always comes up empty.

rclone check gdc:Shared/Personal D:\GoogleDriveComputers\Shared\Personal --differ difference.txt

I have tested --differ myself and indeed it always comes up empty... Maybe something is broken :)

I think it can be reported on GitHub as a bug.

The first 2 lines worked, but the comm line doesn't:

'comm' is not recognized as an internal or external command,
operable program or batch file.

I ran the first 2, opened the results in Excel, and compared the differences. I'm guessing that's what that 3rd command is for? It turns out Google Drive can have duplicates of the same file with the same name in the same folder. I did not know that could happen. Each one only appears once in the mounted Google Drive folder. Anyway, I removed those, and there was still 1 extra file missing, but I think I got it close enough.

I read that rclone copy or rclone sync verifies each file by comparing the md5 hash against the source - is this correct? If it is, is it safe to assume that if the files were working for me on Google Drive, the copied/synced files will work as well? (Working as in not corrupt.) I did open some and they worked fine...

Ouch - you are on Windows. comm is a Linux command. But yes - anything will do to compare the two lists. Excel is handy too :)
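If you want a command-line equivalent on Windows, PowerShell's Compare-Object can do roughly what comm -23 does - something like this, assuming the two list files are in the current directory:

# lines present in source-sorted but not in dest-sorted
Compare-Object (Get-Content source-sorted) (Get-Content dest-sorted) | Where-Object SideIndicator -eq '<='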

The scourge of Google Drive - duplicates. BTW, you can clean them up using rclone - have a look at rclone dedupe. It was actually implemented because of Google Drive.
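For example, to see what dedupe would do without changing anything, and then to keep only the newest copy of each duplicate, it would look roughly like this (the path is just the folder where you saw the duplicates):

rclone dedupe --dedupe-mode newest gdc:Shared/Personal --dry-run
rclone dedupe --dedupe-mode newest gdc:Shared/Personal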

If both source and destination support hashes (which is the case in your sync), then when a new file is written its hash is checked:

$ rclone sync drive: . -vv
...
2023/11/08 15:18:11 DEBUG : test.txt: Need to transfer - File not found at Destination
2023/11/08 15:18:11 DEBUG : Google drive root 'test6': Waiting for checks to finish
2023/11/08 15:18:11 DEBUG : Google drive root 'test6': Waiting for transfers to finish
2023/11/08 15:18:15 DEBUG : test.txt: md5 = e1930b4927e6b6d92d120c7c1bba3421 OK
2023/11/08 15:18:15 INFO  : test.txt: Copied (new)
...

If you run sync again, the file is confirmed as identical based on size and mtime:

$ rclone sync drive: . -vv
...
2023/11/08 15:21:04 DEBUG : test.txt: Size and modification time the same (differ by -552.502µs, within tolerance 1ms)
2023/11/08 15:21:04 DEBUG : test.txt: Unchanged skipping
...

To force checksums to be used all the time, provide the --checksum flag:

$ rclone sync source: . --checksum -vv
...
2023/11/08 15:22:30 DEBUG : test.txt: md5 = e1930b4927e6b6d92d120c7c1bba3421 OK
2023/11/08 15:22:30 DEBUG : test.txt: Size and md5 of src and dst objects identical
2023/11/08 15:22:30 DEBUG : test.txt: Unchanged skipping
...

Thank you for your help. I really appreciate it. After I've run the md5 check one more time, I believe I am truly done copying my files over from Google Drive, successfully. I used Takeout before this and had 22 50GB files to download, and on my 20th it would not allow me to download because I had "downloaded it too many times". Anyway, thank you for your help again!

