Use rclone on Google Colab

x0b · June 24, 2020, 9:50pm

I've often used rclone on Colab because the default Google Drive connector has a timeout problem.

Create a new code cell, paste the installation command and execute it.

! curl https://rclone.org/install.sh | sudo bash

Read the documentation at https://rclone.org/docs/ to understand how rclone works, in particular its config system.

If you have an existing rclone config, you can write the config file directly from a cell:

# -p creates missing parent directories and does not fail if the directory already exists
!mkdir -p /root/.config/rclone/
config = """
RCLONE_ENCRYPT_V0:
8nKbmBes1LhfViVPk1b4rfmXeDLwjOgFRCQSKQJARLt43kY6hyutSoKXZ+YelRSlNGStD1wNmA3scjoiDgdYEDpdx/DQsLUzytbwsOc3cbnZyWaywPvSFqkG
"""
# Write the config to the path rclone reads by default when running as root
with open('/root/.config/rclone/rclone.conf', 'w') as file:
  file.write(config)
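
The same step can be done without the shell call. This is only a minimal sketch: the helper name write_rclone_config is made up for illustration, and restricting the file to mode 0600 is my own assumption (the config may hold credentials), not something the cell above does.

```python
import os
from pathlib import Path


def write_rclone_config(config_text, config_path="/root/.config/rclone/rclone.conf"):
    """Write config_text to config_path, creating parent directories if needed.

    Hypothetical helper; the default path is the one used in the cell above.
    """
    path = Path(config_path)
    # Equivalent of `mkdir -p`: create parents, do not fail if they exist.
    path.parent.mkdir(parents=True, exist_ok=True)
    # Drop surrounding blank lines from the triple-quoted string.
    path.write_text(config_text.strip() + "\n")
    # Assumption: keep the file readable by the owner only.
    os.chmod(path, 0o600)
    return path
```

With the config string from above, the call would be write_rclone_config(config).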

Afterwards, you can call rclone like any other native command:

!rclone copy a b
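
If you would rather call rclone from Python than with the ! prefix (for example to get an exception when a transfer fails), something like the following should work. Treat it as a sketch: the function names are hypothetical, and the flag values in the usage note are placeholders. --bwlimit, --max-transfer and --dry-run are existing rclone flags.

```python
import subprocess


def rclone_copy_cmd(src, dst, bwlimit=None, max_transfer=None, dry_run=False):
    """Build the argument list for `rclone copy src dst` (hypothetical helper)."""
    cmd = ["rclone", "copy", src, dst]
    if bwlimit:
        # Limit bandwidth, e.g. "8M".
        cmd += ["--bwlimit", bwlimit]
    if max_transfer:
        # Cap the total amount transferred in this run, e.g. "10G".
        cmd += ["--max-transfer", max_transfer]
    if dry_run:
        # Show what would be copied without copying anything.
        cmd.append("--dry-run")
    return cmd


def run_rclone(cmd, runner=subprocess.run):
    """Run the command; check=True raises CalledProcessError on a non-zero exit code."""
    return runner(cmd, check=True)
```

For example, run_rclone(rclone_copy_cmd("a", "b", bwlimit="8M")) runs the same copy as above with a bandwidth limit.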

A few things from experience:

Keep an eye on the traffic - I've heard from people who had their accounts banned for excessive Colab usage. Personally, I have only received temporary GPU bans though, so who knows.
If you want persistence, do not use rclone mount. Instead, schedule repeated runs of rclone copy, which usually performs much better.
Colab resets and recycles your Docker instance when it is not in use (or after more than 12 hours). If you save your file system at the end of your notebook, you can restore it at the beginning of your notebook the next time you use it.
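
The last two points could be sketched roughly like this. It is not a tested recipe: the function names, the 10-minute interval, and the paths and remote name in the comments are all placeholders I made up, and packing the directory into a tar.gz before copying is just one possible way to "save your file system".

```python
import shutil
import threading


def run_periodically(action, interval_s, stop_event):
    """Call action() every interval_s seconds in a background thread until stop_event is set."""
    def loop():
        # Event.wait returns False on timeout and True once the event is set.
        while not stop_event.wait(interval_s):
            action()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread


def save_tree(src_dir, archive_base):
    """Pack src_dir into archive_base + '.tar.gz' and return the archive path."""
    return shutil.make_archive(archive_base, "gztar", root_dir=src_dir)


def restore_tree(archive_path, dst_dir):
    """Unpack an archive created by save_tree into dst_dir."""
    shutil.unpack_archive(archive_path, dst_dir)


# Hypothetical usage in a notebook (paths and remote name are placeholders):
#
#   import subprocess
#   stop = threading.Event()
#   run_periodically(
#       lambda: subprocess.run(["rclone", "copy", "/content/work", "remote:backup"]),
#       600,   # every 10 minutes
#       stop,
#   )
#   ...
#   stop.set()  # stop the background copies
#
#   archive = save_tree("/content/work", "/content/snapshot")   # end of the notebook
#   restore_tree("/content/snapshot.tar.gz", "/content/work")   # start of the next session
```

The background thread is a daemon, so it will not keep the runtime alive on its own. Copying the archive to and from the remote would still be done with rclone copy as shown earlier.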