My stupidity or a bug?

I use Google Drive for work, but there is no native sync client for Linux. Given the stability and awesomeness of rclone, I recently decided to set up my own version.

I do not like working directly on the mount because, even with vfs-cache, programs sometimes choke when opening or saving documents, and the Cache frontend caused a hard lock a couple of times. I cannot take chances like that with my work.

I landed on using mergerfs to overlay a local cache on top of an rclone mount and a cron job to sync everything up daily. I really, really do not want it to touch the native Google Docs files, so I mounted it using this command (the product of trial and error):

/usr/local/bin/rclone mount gdrive: /mnt/rclone/gdrive --ask-password=false --allow-other --drive-skip-gdocs --buffer-size 256M --drive-chunk-size 32M --log-level INFO --log-file /var/log/gdrive.log --timeout 1h --umask 002 --vfs-cache-mode=writes

The meaty part of the sync script looks like this:

### Set Rclone defaults

logger "*** Copying ${CACHE} -> ${REMOTE} ***"
${RCLONE} copy "${CACHE}" "${REMOTE}"
logger "*** DONE ***"
logger "*** Copying ${REMOTE} -> ${CACHE} ***"
${RCLONE} copy "${REMOTE}" "${CACHE}" \
  --exclude='Backup/**'
logger "*** DONE ***"
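For readers following along, here is a minimal sketch of how the "Set Rclone defaults" step could guarantee that both copy directions skip native Google Docs. It is not the original script: the cache path is a made-up placeholder, and wrapping the commands in a function named sync_both_ways is only for illustration. It relies on rclone's documented behaviour of reading any flag from a matching RCLONE_* environment variable.

```shell
#!/usr/bin/env bash
# Sketch only. The cache path is a hypothetical placeholder; the remote
# name and rclone path are taken from the mount command above.
RCLONE="${RCLONE:-/usr/local/bin/rclone}"
CACHE="${CACHE:-/mnt/cache/gdrive}"   # hypothetical local cache directory
REMOTE="${REMOTE:-gdrive:}"

# rclone reads any flag from an RCLONE_* environment variable, so this
# should be equivalent to passing --drive-skip-gdocs on every call below.
export RCLONE_DRIVE_SKIP_GDOCS=true

sync_both_ways() {
  logger "*** Copying ${CACHE} -> ${REMOTE} ***"
  "${RCLONE}" copy "${CACHE}" "${REMOTE}"
  logger "*** DONE ***"

  logger "*** Copying ${REMOTE} -> ${CACHE} ***"
  # No trailing backslash after the last flag, otherwise the next
  # command is swallowed as extra arguments to rclone.
  "${RCLONE}" copy "${REMOTE}" "${CACHE}" \
    --exclude='Backup/**'
  logger "*** DONE ***"
}

# Usage (e.g. from cron): call sync_both_ways at the end of the script.
```

Setting the variable once at the top means a later edit to either copy command cannot accidentally drop the flag.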

When I tested it using rclone ls inside the script, it completely ignored the Google Docs files, as desired. However, after running the script I noticed that all (hundreds) of my Google Docs/Sheets/Slides files now have .docx, .pptx, and .xlsx files alongside them.

I cannot be sure if the mount created them or the sync script, because I didn't notice until I went to attach a file to an email.

The expected behavior was that rclone, both in the mount and the sync script, would just ignore the Google Docs native file formats entirely...

Did I do something stupid here? Is there a bug with rclone?

I'm not really sure...

I would guess that at some point you ran the mount or sync without --drive-skip-gdocs, and the exported versions of the Google Docs (the ones with .xls etc.) got copied then.

Maybe mergerfs cached the .xls or something like that - sorry I don't know much about how mergerfs works.

Either that or it is a bug in rclone. However I think the --drive-skip-gdocs flag and the env var are working according to my tests.


Ok, then at least I am using the flags correctly for the expected outcome (so the answer is likely my stupidity). I will keep the script as-is and have a look to see if new documents continue to create doppelgangers.

It looks like rclone dedupe --dedupe-mode=first will always delete the doppelganger, since the Google Docs version shows up with a .docx/.pptx/.xlsx extension but a -1 file size. So, in theory, I can run that against my entire drive and then sync it back to the cache to remove them locally... --dry-run isn't a huge help here because of the sheer number of files (it would be very easy to miss one important exception), so I'm afraid to run it. But is that by design? That is, will --dedupe-mode=first reliably keep the native Docs version intact because of the negative file size?
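Since a full --dry-run log is hard to review by eye, one alternative is to list only the paths that exist twice, once with size -1 and once with a real size. The sketch below assumes a listing in "size;path" form, which is what rclone lsf --format sp should produce, and that native Docs appear with size -1 as described above. The function name list_doppelgangers is made up for this example.

```shell
#!/usr/bin/env bash
# Reads "size;path" lines on stdin and prints each path that appears both
# with size -1 (native Google Doc) and with a real size (exported copy).
list_doppelgangers() {
  awk -F';' '
    {
      size = $1 + 0
      path = $0
      sub(/^[^;]*;/, "", path)      # drop the size field, keep the path
      if (size == -1) native[path] = 1
      else            exported[path] = 1
    }
    END {
      for (p in native) if (p in exported) print p
    }
  ' | sort
}

# Usage sketch (the directory name is a placeholder). Run it without
# --drive-skip-gdocs in effect, otherwise native Docs are not listed:
#   rclone lsf -R --files-only --format sp gdrive:SomeDir | list_doppelgangers
```

The resulting list is usually far shorter than the dry-run output and shows exactly which names a dedupe pass would have to decide between.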

I think what you really want is --dedupe-mode=smallest - I don't think you can rely on the order.

I've implemented that here - can you give it a go? (uploaded in 15-30 mins)


I tested it out on one sub-directory and it seems to do the trick. Thanks for rescuing me from my stupidity :)
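A per-directory test like the one described might look like the following sketch. The wrapper name dedupe_smallest and the directory name are placeholders, not anything from the thread. The explicit --drive-skip-gdocs=false is an assumption on my part: if the skip flag is set globally (for example through an environment variable), dedupe would presumably never see the native Docs and so would find no duplicates.

```shell
#!/usr/bin/env bash
RCLONE="${RCLONE:-/usr/local/bin/rclone}"

# dedupe_smallest REMOTE_PATH [extra rclone flags...]
# Keeps the smallest file of each duplicate set; a native Google Doc
# reports size -1, so it should win over any exported copy.
dedupe_smallest() {
  local target="$1"
  shift
  "${RCLONE}" dedupe --dedupe-mode smallest --drive-skip-gdocs=false "$@" "${target}"
}

# Usage sketch (the directory name is a placeholder):
#   dedupe_smallest "gdrive:SomeDir" --dry-run   # preview first
#   dedupe_smallest "gdrive:SomeDir"             # then for real
```

Working one sub-directory at a time keeps each dry-run short enough to read in full before anything is deleted.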

No problems and good thinking using dedupe - not sure I would have thought of that!

I've merged dedupe smallest to master now which means it will be in the latest beta in 15-30 mins and released in v1.51

Thanks! I can confirm that it works in the latest beta, however FYI, it does not show up in the help:

      --dedupe-mode string   Dedupe mode interactive|skip|first|newest|oldest|rename. (default "interactive")
  -h, --help                 help for dedupe

Well spotted! I've added it in (and largest which was also missing).
