How to optimize a union with a very large number of Google Drive remotes?

What is the problem you are having with rclone?

I'm working on a project that requires me to union 1200 Google Drive accounts.
Each account has its own service_account_file and root_folder_id defined, and every 100 accounts share the same client_id and client_secret. I find that copying or moving local files to this union incurs 4-5 minutes of overhead before the actual transfer starts. I tried --fast-list and it had little to no effect. Is there a way to optimize speed for a very large union like this one?

Run the command 'rclone version' and share the full output of the command.

rclone v1.63.1
- os/version: ubuntu 22.04 (64 bit)
- os/kernel: 5.15.0-75-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.20.6
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone move . drive-store:
rclone move . drive-store: --fast-list

The rclone config contents with secrets removed.

[drive0001]
type = drive
scope = drive
client_id = ########################################################################
client_secret = ###################################
root_folder_id = ###################
service_account_file = /root/.config/rclone/drive0001.json

###

[drive1200]
type = drive
scope = drive
client_id = ########################################################################
client_secret = ###################################
root_folder_id = ###################
service_account_file = /root/.config/rclone/drive1200.json

[drive-store]
type = union
upstreams = drive0001:store  ########## ########## ########## drive1200:store
action_policy = ff
create_policy = mfs
cache_time = 0

A log from the command that you were trying to run with the -vv flag

test_rclone_move_drive-store.txt - Pastebin.com
test_rclone_move_drive-store_fast-list.txt - Pastebin.com

Not sure anything can be done in your extreme case :)

With create_policy = mfs, rclone picks the remote with the most free space for every new file. That means it has to query all 1200 remotes, which obviously takes time.

For operations that transfer multiple files you should increase cache_time. I would raise it even beyond the default of 120s. Setting it to 0s means that free space checks happen for every new file instead of every few minutes.
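To make the effect of cache_time concrete, here is a rough Python sketch that counts free-space lookups under a simplified model. It is not rclone's implementation: the function name, the assumption of one lookup per upstream per refresh, and the file timings are all made up for illustration.

```python
# Toy model of an "mfs"-style choice with a free-space cache.
# NOT rclone's code; every name and number here is a hypothetical assumption.

def count_usage_queries(file_start_times, num_remotes, cache_time):
    """Count per-remote free-space lookups for a sequence of new files.

    file_start_times: seconds at which each new file is created (ascending).
    cache_time: seconds a cached free-space reading stays valid (0 = never cached).
    """
    queries = 0
    last_refresh = None  # time of the most recent full refresh
    for t in file_start_times:
        stale = (
            last_refresh is None
            or cache_time <= 0
            or (t - last_refresh) >= cache_time
        )
        if stale:
            queries += num_remotes  # assumed: one lookup per upstream
            last_refresh = t
    return queries


# Six files, one starting every 60 seconds, against 1200 upstreams.
starts = [0, 60, 120, 180, 240, 300]
print(count_usage_queries(starts, 1200, cache_time=0))    # 7200
print(count_usage_queries(starts, 1200, cache_time=120))  # 3600
print(count_usage_queries(starts, 1200, cache_time=600))  # 1200
```

In this model a longer cache_time only cuts the number of refreshes. The first refresh still touches every upstream, so some startup delay remains whatever the setting.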

Ahaha, yes, it's a little bit extreme.

Transferred:        2.930 GiB / 2.930 GiB, 100%, 19.853 MiB/s, ETA 0s
Transferred:            6 / 6, 100%
Elapsed time:      6m38.3s

Hmmm... last time I tried the default options for both create_policy and cache_time, it kept giving me this error: Failed to copy: googleapi: Error 403: The user's Drive storage quota has been exceeded., storageQuotaExceeded
But now that I've tried again, it behaves better, and I'm not sure what is different. One thing I am sure of is that this time I included root_folder_id for every remote.

I guess I'll have to experiment even more. Thanks!

Yes, but that was because you ran out of space on one of the remotes. The default epmfs and the mfs you used behave differently: epmfs only considers remotes where the path already exists, while mfs considers all of them. Now the same path was probably created on some other remote where there is still some space left.

It is impossible to advise which policies are correct without knowing what you want to achieve.

If you pick policies at random, the results might not be exactly what you want :)

My recommendation: test on a few remotes first, and only when you are happy with the results extend it to all 1200 remotes.
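One way to set up such a small test is to generate a reduced union section from the same remote naming scheme. The Python helper below is only a sketch: the section name drive-store-test and the function itself are hypothetical, and it assumes the remotes are named drive0001 to drive1200 with a store subfolder, as in the config above.

```python
def union_section(name, first, last, subdir="store", width=4):
    """Return an rclone config section for a union of driveNNNN remotes.

    name is a hypothetical section name; first/last are inclusive indices.
    """
    upstreams = " ".join(
        f"drive{i:0{width}d}:{subdir}" for i in range(first, last + 1)
    )
    return "\n".join([
        f"[{name}]",
        "type = union",
        f"upstreams = {upstreams}",
        "action_policy = ff",
        "create_policy = mfs",
    ])


# A three-remote union for experiments before scaling up to all 1200.
print(union_section("drive-store-test", 1, 3))
```

The printed section can be pasted into the rclone config next to the existing remotes. Growing the test is then a matter of changing the last index.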

It's simple, I want best performance from this union :grin:

Ah, I remember now: last time some of the accounts did not have the target directory yet. Now every account has the same folder structure.

I'm curious. Within that 120s, if a file is newly uploaded, will the cached free space count be reduced by that file's size?
If it is, then I understand that making the cache time longer will benefit my use case.
If not, then I think I will get a storageQuotaExceeded error every time one of the accounts' storage quotas is used up, and any other upload started alongside the last successful one will fail.

If you want performance, then you have to keep enough free space on every union member to absorb, e.g., 120s of writing.
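As a back-of-the-envelope check, the headroom needed per member can be estimated from throughput and cache_time. The sketch below assumes the worst case, where everything written during one cache window lands on the same upstream because its cached free space is stale. The function name is hypothetical, and the throughput is the 19.853 MiB/s reported in the transfer above.

```python
def headroom_gib(throughput_mib_s, cache_time_s):
    """Worst-case data written during one cache window, in GiB.

    Assumes all writes in the window go to a single upstream.
    """
    return throughput_mib_s * cache_time_s / 1024


print(round(headroom_gib(19.853, 120), 2))  # 2.33
print(round(headroom_gib(19.853, 300), 2))  # 5.82
```

So at that rate a 120s window needs roughly 2.3 GiB spare on every member, and the figure grows linearly with both cache_time and throughput.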
