Sync ONLY gdocs (as ".link.html") from Gdrive to local

Greetings!

As stated in the title, I am trying to sync only Gdocs (.link.html extension) from the cloud (GDrive) to the local storage.

To export Gdocs with the .link.html extension, I use the flag --drive-export-formats link.html.

However, I have been unable to find a way to include only GDocs in the sync process.

As far as I know, in this scenario, they cannot be filtered with --include, because Gdocs don't have a specific extension (or any extension whatsoever) while stored in GDrive.

Sorry for asking questions again, and thanks a lot for your attention.

rclone v1.53.2
Ubuntu 20.04 / 64-bits

You can probably filter them by size since Google docs appear as size -1

Try

--max-size 0

And see if that works

Hey there, sir Craig-Wood!

That was quite a fast answer.

I tried it, but, unfortunately, everything ended up being excluded.

So, zero transfers and 8 checks. I couldn't find out what was checked, though, but it surely was not the Gdocs.

This is/was the command: rclone sync gdrive: ~/A-Drive -i -P --max-size 0 --create-empty-src-dirs --backup-dir=A-Rcloneback-`date -I` --drive-export-formats link.html --log-file ~/a-rclone-log-info1 --log-level DEBUG
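
A minimal sketch of how the dated backup directory in the command above can be built, assuming GNU date (as shipped with Ubuntu), where -I prints the current date as YYYY-MM-DD. The variable name backup_dir is only illustrative.

```shell
# Build a dated directory name with command substitution.
# Assumption: GNU date, where -I prints the ISO 8601 date (YYYY-MM-DD).
backup_dir="A-Rcloneback-$(date -I)"

# Show the result; it could then be passed as --backup-dir="$backup_dir".
printf '%s\n' "$backup_dir"
```

Using $(...) instead of backticks avoids the formatting problem, since forum software often treats backticks as code markers.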

However, upon checking the debug log, it appears that every Gdocs file was listed with the .link.html extension.

Just a really wild guess, but maybe the conversion (from "Gdoc" to .link.html) takes place before the --max-size flag and, as such, the links are excluded, because they actually take up some bytes?

If this is the case, I could try setting a reasonably low --max-size. It could end up messing with some other files, though.

And thanks a lot for your attention, by the way!

PS: where could I report typos in the docs?

I think you are right. Using --max-size 0 would work without using --drive-export-formats link.html

Try using --max-size 500b which sets the max size to 500 bytes.
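
To compare the two size-based variants side by side, here is a rough sketch. The helper run and both function names are made up for illustration, and gdrive: is the remote name used earlier in the thread. By default the script only prints the rclone commands. Whether 500 bytes is enough for every .link.html file is an assumption that should be checked against a real listing.

```shell
# run: print the command by default; set RCLONE_EXECUTE=1 to really run it.
# Note that the printed form drops the shell quoting.
run() {
    if [ "${RCLONE_EXECUTE:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Variant A: no link.html export, so Google Docs still appear with size -1
# and pass a zero-byte ceiling (as do genuinely empty files).
list_gdocs_by_size() {
    run rclone ls "$1" --max-size 0
}

# Variant B: export as link.html and allow files up to 500 bytes.
# Any other small file on the remote would be listed as well.
list_small_links() {
    run rclone ls "$1" --drive-export-formats link.html --max-size 500b
}

list_gdocs_by_size gdrive:
list_small_links gdrive:
```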

Here are the results of my tests (item 3 contains the actual working command).

NOTE: To clear the clutter, I removed the flags -P --log-file ~/a-rclone-log-info --log-level DEBUG from the commands below, since I used these in all of them.

NOTE(2): Even though the documents were still Gdocs (not converted to anything), the log listed them as .docx, .xlsx, etc. when using ls.

=======================================

  1. rclone ls gdrive: --include "*.link.html"
  • Total files in the DEBUG log: 26,315

  • Every file shows up in the DEBUG log as "Excluded from sync (and deletion)";

  • As such (and as could be expected), Gdocs weren't included, since they still were not .link.html files;

  • Nothing displayed/printed in the terminal other than the elapsed time and the amount of transferred files (zero);

  2. rclone ls gdrive: --max-size 0
  • Total files in the log: 26,127 (188 fewer than in the log above);

  • 178 files printed in the terminal (not shown in the log): only Gdocs (listed as .docx, .xlsx, etc.) plus a single .png file (zero bytes; just a corrupted file);

  • So, compared with test 1, a total of 10 files were listed neither in the DEBUG log nor in the terminal;

  • Not sure if 178 is my actual amount of Gdocs.

  3. rclone sync gdrive: ~/A-Drive -n --create-empty-src-dirs --backup-dir=A-Rcloneback --drive-export-formats link.html --include "*.link.html"
  • I assume --create-empty-src-dirs is useless in this case, due to --include "*.link.html"

  • This seems to be the way to go. It sounds like my wild guess may be right: the Gdocs were converted to .link.html and only then was the --include flag applied, which would allow the expected sync;

  • "Skipped" shown in the log: 177 unchanged; 4 moves; 3 copies;

  • From the terminal: 183 checks; 2 deleted; 2 renamed; 3 transferred;

  • Not sure how to interpret all the numbers, since they don't seem to match each other's results (from items 1, 2 and 3)

=======================================

So, this may be the solution.

Thanks for your help!

If you want me to provide some extra information and/or test some other options, let me know!

Yes, it looks like that will work.

The gold-plated solution would be to have a new option, --drive-gdocs-only or similar. This wouldn't be very difficult if someone wanted to work on it, as there is already a skip gdocs option.

Yep, it worked!

That gold plated solution of yours would be nice, indeed. Unfortunately, I know almost nothing about programming ;-;

On the bright side, though: since I haven't been able to find a similar question elsewhere, I assume few people would actually use that flag.

Maybe including a topic in the documentation about it would be enough, for the time being.

Note for those who may eventually have the same question and stumble across this post: the solution is --drive-export-formats link.html --include "*.link.html"
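
For anyone who wants to try this out, a minimal sketch of the flag combination follows. The helper run and the function sync_gdocs_links are hypothetical names; gdrive: and ~/A-Drive are the remote and destination from this thread. The script prints the command unless RCLONE_EXECUTE=1 is set, and even then --dry-run keeps rclone from changing anything.

```shell
# run: print the command by default; set RCLONE_EXECUTE=1 to really run it.
# Note that the printed form drops the shell quoting.
run() {
    if [ "${RCLONE_EXECUTE:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

# $1 = source remote, $2 = local destination (both placeholders).
# Google Docs are exported as .link.html files, and the include
# pattern then keeps only those exported names.
sync_gdocs_links() {
    run rclone sync "$1" "$2" \
        --drive-export-formats link.html \
        --include "*.link.html" \
        --dry-run
}

sync_gdocs_links gdrive: "$HOME/A-Drive"
```

Once the dry run lists only the expected files, --dry-run can be removed from the function.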

That is a great solution and as efficient as any.

Thank you so much! I will write it down in a more organized fashion and choose it as the accepted answer.

PS: I randomly stumbled across your "Snake Puzzle Solver". That was quite a creative way to solve the puzzle haha

The solution is to use the flags --drive-export-formats link.html [1] AND --include "*.link.html" [2] (note the quotes, which should be included) alongside the desired operation (sync, in this case).

NOTES

  • You should avoid mixing any two of --include, --exclude or --filter flags in an rclone command. The results may not be what you expect. Instead, use a --filter... flag [3]
  • As always, test your command with the flags -n / --dry-run or -i / --interactive [4]

[1] Google drive
[2] Rclone Filtering
[3] Rclone Filtering
[4] Global Flags
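
If further rules are ever needed, the same selection can be sketched with --filter rules alone, which follows the first note above. This is an illustration rather than a tested command: the names run and sync_gdocs_links_filtered are made up, and the rule pair is assumed to behave like the single --include flag. Rules are applied in order, so the include rule comes first.

```shell
# run: print the command by default; set RCLONE_EXECUTE=1 to really run it.
# Note that the printed form drops the shell quoting.
run() {
    if [ "${RCLONE_EXECUTE:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

# The selection written as ordered filter rules:
#   "+ *.link.html"  keep the exported link files
#   "- *"            drop everything else
sync_gdocs_links_filtered() {
    run rclone sync "$1" "$2" \
        --drive-export-formats link.html \
        --filter "+ *.link.html" \
        --filter "- *" \
        --dry-run
}

sync_gdocs_links_filtered gdrive: "$HOME/A-Drive"
```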

Thanks for the writeup. (And the snake puzzle solver is just the way my brain works: once I have the idea that a computer could solve it, then that's it!)
