Multiple directories created during upload to Google Drive

Hello Everyone,

When uploading files to Google Drive from parallel threads, we face an issue where multiple directories with the same name are created. For example, for an upload which just completed:

    # rclone ls gdrive: | grep GGG000S
         1094 GGG000S-20180823_122050/GGG000S.img.meta.enc
           63 GGG000S-20180823_122050/GGG000S.img.meta.md5
         1094 GGG000S-20180823_122050/GGG000S.img.meta.enc
           63 GGG000S-20180823_122050/GGG000S.img.meta.md5

    # rclone ls gdrive:GGG000S-20180823_122050/
    227352447 GGG000S.img.000001
         1089 GGG000S.img.enc
           82 GGG000S.img.map
           62 GGG000S.img.md5
        20480 GGG000S.img.meta.000001
           12 firstupload
         4096 job16.log

So two directories named GGG000S-20180823_122050 were created: some of the files are in the first directory and the rest are in a second directory with the same name.

One thing which seems to work is:

  • rclone mkdir
  • sleep 60
  • rclone copy gdrive:
  • Continue with regular uploads
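The steps above could be scripted roughly as follows. This is only a sketch: the function name `upload_with_precreated_dir` and the `SETTLE_SECONDS` variable are made up for illustration, and the 60 second wait is simply the value we have been using, not a documented guarantee.

```shell
#!/bin/sh
# Sketch of the workaround: create the directory once, wait, upload one file,
# and only then start the regular (parallel) uploads.
# upload_with_precreated_dir and SETTLE_SECONDS are hypothetical names.

upload_with_precreated_dir() {
    remote_dir=$1    # e.g. gdrive:GGG000S-20180823_122050
    first_file=$2    # a single local file to upload before the rest

    # Create the destination directory from one process only.
    rclone mkdir "$remote_dir" || return 1

    # Wait for the new directory to become visible to later requests.
    sleep "${SETTLE_SECONDS:-60}"

    # Upload one file into the directory before any parallel uploads start.
    rclone copy "$first_file" "$remote_dir" || return 1
}
```

It would be called once per target directory, for example `upload_with_precreated_dir gdrive:GGG000S-20180823_122050 /tmp/GGG000S-20180823_122050/firstupload`, before the parallel uploads are started.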

This behavior seems specific to Google Drive; for example, Google Cloud does not have this issue. It also does not seem to be caused by the parallel uploads, since we have seen this behavior even when we first upload a single file before uploading the others in parallel.

Is this a known issue with Google Drive?

Our rclone copy commands are issued as

rclone gdrive://

For example
rclone copy /tmp/GGG000S-20180823_122050/GGG000S.img.000001 gdrive:/GGG000S-20180823_122050/

Yes, this is a known issue with Google Drive - see https://github.com/ncw/rclone/issues/28

It has something to do with eventual consistency on Drive: one thread creates the directory, and another thread tries to read it, doesn’t see it because of eventual consistency, and creates another.

Because of this, rclone has the rclone dedupe command, which will fix duplicate files and directories for you with a variety of resolution strategies.
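As a rough sketch of how that might be run non-interactively: the wrapper below is hypothetical (the name `dedupe_dir` and the `DRY_RUN` variable are just for illustration), and `newest` is only one of the available resolution modes, so pick whichever suits your data. It defaults to `--dry-run` so you can see what would change first.

```shell
#!/bin/sh
# Hypothetical wrapper around "rclone dedupe".
# DRY_RUN=1 (the default here) only reports what would be done.

dedupe_dir() {
    remote_path=$1       # e.g. gdrive:GGG000S-20180823_122050
    mode=${2:-newest}    # resolution strategy for duplicate files

    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Preview only: nothing is changed on the remote.
        rclone dedupe --dry-run --dedupe-mode "$mode" "$remote_path"
    else
        # Actually resolve the duplicates using the chosen mode.
        rclone dedupe --dedupe-mode "$mode" "$remote_path"
    fi
}
```

For example, `dedupe_dir gdrive:` previews the whole remote, and `DRY_RUN=0 dedupe_dir gdrive:` applies the changes.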

I noticed that even Google Photos on Android creates duplicate images on Drive, so it isn’t just rclone!

I think this behaviour is worse on Team Drives - are you using one of those?

The issue is visible with any Google Drive.

Our problem is slightly different from issue #28, but most likely has the same cause. We do not see multiple instances of the same file, only of the directory, so some files are under one directory and the others are in another.

Is there something similar to a filesystem flush/sync which can help force synchronization?

Does rclone dedupe have an option which runs just the first pass, where it merges the directories, and then exits?

On Google Drive? I don’t think so, alas!

No, but it could quite easily. If you’d like to see that, please make a new issue on GitHub - maybe you could help implement it?