My stupidity or a bug?

I use Google Drive for work, but there is no native sync client for Linux. Given the stability and awesomeness of rclone, I recently decided to set up my own version.

I do not like working directly on the mount because, even with vfs-cache, programs sometimes choke when opening or saving documents, and the Cache frontend caused a hard lock a couple of times. I cannot take chances like that with my work.

I landed on using mergerfs to overlay a local cache on top of an rclone mount and a cron job to sync everything up daily. I really, really do not want it to touch the native Google Docs files, so I mounted it using this command (the product of trial and error):

/usr/local/bin/rclone mount gdrive: /mnt/rclone/gdrive --ask-password=false --allow-other --drive-skip-gdocs --buffer-size 256M --drive-chunk-size 32M --log-level INFO --log-file /var/log/gdrive.log --timeout 1h --umask 002 --vfs-cache-mode=writes

The meaty part of the sync script looks like this:

### Set Rclone defaults

logger "*** Copying ${CACHE} -> ${REMOTE} ***"
${RCLONE} copy "${CACHE}" "${REMOTE}"
logger "*** DONE ***"
logger "*** Copying ${REMOTE} -> ${CACHE} ***"
${RCLONE} copy "${REMOTE}" "${CACHE}" \
  --exclude='Backup/**'
logger "*** DONE ***"

When I tested it using rclone ls inside the script, it completely ignored the Google Docs files, as desired. However, after running the script I noticed that all of my (hundreds of) Google Docs/Sheets/Slides files now have .docx, .pptx, and .xlsx files alongside them.

I cannot be sure if the mount created them or the sync script, because I didn't notice until I went to attach a file to an email.

The expected behavior was that rclone, both in the mount and the sync script, would just ignore the Google Docs native file formats entirely...

Did I do something stupid here? Is there a bug with rclone?

I'm not really sure...

I would guess that at some point you ran the mount or sync without --drive-skip-gdocs, and the exported versions of the Google Docs (the ones with .xls etc) got copied then.

Maybe mergerfs cached the .xls or something like that - sorry I don't know much about how mergerfs works.

Either that or it is a bug in rclone. However I think the --drive-skip-gdocs flag and the env var are working according to my tests.
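For reference, rclone derives an environment variable from each long flag (strip the leading --, upper-case it, turn - into _, and prepend RCLONE_), so the env var mentioned above would be RCLONE_DRIVE_SKIP_GDOCS. A minimal sketch, assuming it lives in the "Set Rclone defaults" part of the script, which was not posted in full:

```shell
# Assumption: this goes in the "Set Rclone defaults" section of the sync script.
# It should have the same effect as passing --drive-skip-gdocs to every rclone call.
export RCLONE_DRIVE_SKIP_GDOCS=true

# Print the RCLONE_* settings in effect, so a run can be checked afterwards.
env | grep '^RCLONE_' | sort
```

Logging the settings at the top of each run would make it easier to rule out a run that happened without the flag.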


Ok, then at least I am using the flags correctly for the expected outcome (so the answer is likely my stupidity). I will keep the script as-is and have a look to see if new documents continue to create doppelgangers.

It looks like rclone dedupe --dedupe-mode=first will always delete the doppelganger since the Google Docs version shows up with a .docx/.pptx/.xlsx extension, but a -1 file size. So, in theory, I can run that against my entire drive and then sync it back to the cache to remove them locally... --dry-run isn't a huge help here because of the sheer number of files (very easy to miss one important exception) so I'm afraid to run it. But is that by design --- i.e., will --dedupe-mode=first reliably keep the native Docs version intact because of the negative file size?
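One way to make the review less error-prone might be to list the affected names mechanically instead of reading the dry-run output by eye. A rough sketch, assuming rclone ls prints a size column followed by the path, and that native Docs show a size of -1 as described above; find_doppelgangers is a made-up helper name, not part of rclone:

```shell
# Hypothetical helper: reads "size path" lines (as printed by rclone ls) on
# stdin and prints every path that appears both with a negative size
# (native Google Doc) and with a real size (exported copy).
find_doppelgangers() {
  awk '
    {
      size = $1 + 0
      # Everything after the size column is the path; it may contain spaces.
      path = $0
      sub(/^[ \t]*-?[0-9]+ /, "", path)
      if (size < 0) native[path] = 1
      else exported[path] = 1
    }
    END {
      for (p in native) if (p in exported) print p
    }
  ' | sort
}

# Usage (not run here; gdrive: is the remote name from the mount command):
#   rclone ls gdrive: | find_doppelgangers > doppelgangers.txt
```

The listing has to include the native Docs for this to work, so it would need to be run without --drive-skip-gdocs. Any name that shows up twice but is missing from the output is a duplicate of some other kind and probably deserves a manual look.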

I think what you really want is --dedupe-mode=smallest - I don't think you can rely on the order.

I've implemented that here - can you give it a go? (uploaded in 15-30 mins)

I tested it out on one sub-directory and it seems to do the trick --- thanks for rescuing me from my stupidity :slight_smile:
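For anyone following along, trying it on a single sub-directory first could look roughly like the sketch below. It assumes a build that already includes the new smallest mode; dedupe_smallest is a made-up wrapper name and gdrive:SomeFolder is a placeholder path:

```shell
# Reuse the RCLONE variable from the sync script, falling back to rclone on PATH.
RCLONE="${RCLONE:-rclone}"

# Hypothetical wrapper: first argument is the remote path to dedupe,
# any further arguments (e.g. --dry-run) are passed through to rclone.
dedupe_smallest() {
  target="$1"
  shift
  "${RCLONE}" dedupe --dedupe-mode smallest "$@" "${target}"
}

# Usage (not run here):
#   dedupe_smallest gdrive:SomeFolder --dry-run   # preview only
#   dedupe_smallest gdrive:SomeFolder             # actually remove the larger copies
```

Since the native Docs entry reports a size of -1, keeping the smallest file should leave it in place and remove the exported copy, regardless of listing order.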

No problems and good thinking using dedupe - not sure I would have thought of that!

I've merged dedupe smallest to master now which means it will be in the latest beta in 15-30 mins and released in v1.51

Thanks! I can confirm that it works in the latest beta, however FYI, it does not show up in the help:

      --dedupe-mode string   Dedupe mode interactive|skip|first|newest|oldest|rename. (default "interactive")
  -h, --help                 help for dedupe

Well spotted! I've added it in (and largest which was also missing).

