What is the problem you are having with rclone?
Rclone produces a lot of duplicate folders at the Google Drive root.
I'm trying to sync folders from Box to Google Drive (a Google Workspace Shared Drive).
The sync works fine, but at some point it starts creating a lot of duplicate folders at the shared drive root.
I know this is a known issue, as the Google Drive API allows duplicate file names.
The documentation recommends using rclone dedupe to deal with it.
Unfortunately it does not suit my needs, as I have not been able to find a dedupe strategy that keeps the original folder, and in particular the original folder ID. My folder IDs are tracked in remote systems and used by APIs.
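To make the constraint concrete, here is a minimal sketch of the selection rule I would need. It assumes a listing parsed from `rclone lsjson` output in which each duplicate folder appears as its own entry with a distinct "ID" field. The function name `plan_dedupe` and the `tracked_ids` input (the set of folder IDs known to my remote systems) are hypothetical and not part of rclone.

```python
from collections import defaultdict


def plan_dedupe(entries, tracked_ids):
    """Decide which duplicate folder to keep for each duplicated path.

    entries: list of dicts parsed from `rclone lsjson` output; only the
             "Path", "IsDir" and "ID" fields are used.
    tracked_ids: set of folder IDs referenced by external systems
                 (hypothetical input, not something rclone knows about).

    Returns {path: {"keep": id_or_None, "others": [ids]}} for duplicated
    paths only. "keep" is None when zero or several duplicates are tracked,
    in which case the path needs a manual decision.
    """
    ids_by_path = defaultdict(list)
    for entry in entries:
        if entry.get("IsDir"):
            ids_by_path[entry["Path"]].append(entry["ID"])

    plan = {}
    for path, ids in ids_by_path.items():
        if len(ids) < 2:
            continue  # not duplicated
        tracked = [i for i in ids if i in tracked_ids]
        keep = tracked[0] if len(tracked) == 1 else None
        plan[path] = {"keep": keep, "others": [i for i in ids if i != keep]}
    return plan
```

The listing itself could come from something like `rclone lsjson --dirs-only -R SHARED_DRIVE_NAME:`. The sketch only produces a plan; merging the contents of the other folders into the kept one is the part I have not found a dedupe mode for.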
I've tried to nest the destination deeper, as some comments on the rclone forum suggested.
The folder at the root of the shared drive still gets duplicated.
Ex:
MySharedDrive:
  /folder1
    fileA
    fileB
  /folder1
    fileA
    fileB
MySharedDrive:
  /folder1
    /folder2
      fileA
      fileB
  /folder1
    /folder2
      fileA
      fileB
I have also tried to:
- use copy instead of sync
- play with the different options: --check-first, --checksum, creation date options, etc.
None of them seems to fix it.
Last hope: I will try to use the root_folder_id option with different nesting levels to see if it makes any difference.
I'm willing to invest some time to provide, if possible, a fix.
As I'm new to the rclone code base, could someone give me hints on where to start investigating?
The Google Drive V3 API states that the id of a file is writable.
Could we use that to generate a consistent ID and avoid duplicate folders?
Could this be a listing issue at the drive's root?
Any help is greatly appreciated.
Best regards
Run the command 'rclone version' and share the full output of the command
In production:
rclone v1.61.1
- os/version: debian 11.6 (64 bit)
- os/kernel: 6.1.13 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.19.4
- go/linking: static
- go/tags: none
Which cloud storage system are you using? (eg Google Drive)
- Google Drive with Google Workspace enterprise and some Shared drives
- Box
The command you were trying to run (eg rclone copy /tmp remote:tmp)
rclone sync 'box:/FOLDER_A' 'SHARED_DRIVE_NAME:/FOLDER_A' --config=rclone.conf --use-json-log --log-format="date,time,longfile" --tpslimit 10 --tpslimit-burst 15 --retries 5 --retries-sleep 1s --create-empty-src-dirs --fast-list --header="as-user: 123456789"
The as-user header is specific to Box authentication.
The rclone config contents with secrets removed.
[SHARED_DRIVE_NAME]
type = drive
service_account_file = .secrets/gcp_secret_key.json
team_drive = TEAM_DRIVE_ID
impersonate = some.google.workplace.domaine.user@google.fr
[box]
type = box
box_config_file = .secrets/box_sercret_key.json
token = {"access_token":"ACCESS_TOKEN","token_type":"bearer","expiry":"2023-03-08T14:58:44.980Z"}
box_sub_type = enterprise
A log from the command with the -vv flag
I have not been able to identify the part of the log where it starts creating duplicate folders.
If you have any insight into what I should be looking for, please let me know.
I had to redact all the remote, drive and file names, as they are personal information.
rclone sync 'box:/FOLDER_A' 'SHARED_DRIVE:/' --config=rclone.conf --tpslimit 10 --tpslimit-burst 15 --retries 5 --retries-sleep 1s --create-empty-src-dirs --fast-list --header="as-user: 14492321437" -vv
2023/03/07 19:02:13 INFO : Starting transaction limiter: max 10 transactions/s with burst 15
2023/03/07 19:02:13 DEBUG : rclone: Version "1.60.1" starting with parameters ["/nix/store/sy0dc99a1gshkpp96sqcabcy74gbgz4r-rclone-1.61.1/bin/rclone" "sync" "box:/FOLDER_A" "SHARED_DRIVE:/" "--config=rclone.conf" "--tpslimit" "10" "--tpslimit-burst" "15" "--retries" "5" "--retries-sleep" "1s" "--create-empty-src-dirs" "--fast-list" "--header=as-user: 123456789" "-vv"]
2023/03/07 19:02:13 DEBUG : Creating backend with remote "box:/FOLDER_A"
2023/03/07 19:02:13 DEBUG : Using config file from "/home/USER/.../rclone.conf"
2023/03/07 19:02:14 DEBUG : fs cache: renaming cache item "box:/FOLDER_A" to be canonical "box:FOLDER_A"
2023/03/07 19:02:14 DEBUG : Creating backend with remote "SHARED_DRIVE:/"
2023/03/07 19:02:15 DEBUG : fs cache: renaming cache item "SHARED_DRIVE:/" to be canonical "SHARED_DRIVE:"
2023/03/07 19:02:26 DEBUG : 05_FILE_NAME.pdf: Size and modification time the same (differ by 0s, within tolerance 1s)
2023/03/07 19:02:26 DEBUG : 05_FILE_NAME.pdf: Unchanged skipping
2023/03/07 19:02:32 INFO : FOLDER_B/04_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:32 INFO : FOLDER_B/0_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:33 INFO : FOLDER_B/01_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:33 INFO : FOLDER_B/03_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:36 INFO : FOLDER_B/05_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:37 INFO : FOLDER_B/05_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:38 INFO : FOLDER_B/06_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:38 INFO : FOLDER_B/06_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:42 INFO : FOLDER_B/06_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:42 INFO : FOLDER_B/06_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:43 INFO : FOLDER_B/06_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:43 INFO : FOLDER_B/06_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:46 INFO : FOLDER_B/06_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:48 INFO : FOLDER_B/06_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:48 INFO : FOLDER_B/06_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:49 INFO : FOLDER_B/07_FILE_NAME.pdf: Copied (new)
2023/03/07 19:02:52 INFO : FOLDER_B/08_FILE_NAME.pdf: Copied (new)
[...]
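One thing I could check in the unredacted log is whether the same destination path is reported as "Copied (new)" more than once in a single run. Below is a minimal sketch of that check. It assumes the plain-text log format shown above (not --use-json-log), and the function name `repeated_copies` is my own. In the redacted excerpt the repeated names may just be a side effect of the redaction, so this is only meaningful on the real log.

```python
import re
from collections import Counter

# Matches lines such as:
# 2023/03/07 19:02:36 INFO : FOLDER_B/05_FILE_NAME.pdf: Copied (new)
COPIED_RE = re.compile(r"^\S+ \S+ INFO\s*:\s*(?P<path>.+): Copied \(new\)\s*$")


def repeated_copies(log_lines):
    """Return {path: count} for paths logged as 'Copied (new)' more than once."""
    counts = Counter()
    for line in log_lines:
        match = COPIED_RE.match(line)
        if match:
            counts[match.group("path")] += 1
    return {path: n for path, n in counts.items() if n > 1}
```

A path that shows up here would suggest that two objects with the same name were written in one run, which narrows down the time window to inspect in the -vv output.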