No problem to report.
Curious if there is a way to list empty folders/directories recursively?
I know I can use rmdirs, but that method is extremely slow as it iterates them one at a time.
If I can somehow just get a list of empty folders, I can purge them 1,000 times faster using the Google Drive API.
I am familiar with the lsf command, but cannot see a way to list only empty folders.
I was hoping lsf would somehow provide this.
Is there any easy solution to do this?
I normally just do that on a mount.
find /path/to/dir -empty -type d -delete
and be done with it.
Ah, I forgot to say I am trying to get Google Drive folder IDs so I can run them in batches through the API.
The API is super fast; yesterday I was able to purge over 4 million files in a few hours.
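For illustration, batching those folder IDs could look something like this. The `batched` helper and the `delete_batch` call are assumptions of mine, not a real client method; the batch size of 100 reflects the commonly cited per-request call limit for Google API batch requests.

```python
def batched(ids, size=100):
    """Split a list of Drive folder IDs into batches for the API."""
    # yield successive slices of at most `size` IDs
    for i in range(0, len(ids), size):
        yield ids[i:i + size]

# Hypothetical usage: send each batch to the Drive API in one request.
# `delete_batch` is a placeholder, not a real library function.
# for batch in batched(folder_ids):
#     delete_batch(service, batch)
```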
ncw
(Nick Craig-Wood)
August 16, 2023, 6:08pm
4
You could do an rclone lsf -R
then write a little script to work out if there were files in any given directory. Probably only a few lines of python.
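That little script could be sketched like so, assuming the default `rclone lsf -R` output where directory entries carry a trailing `/`. It finds leaf directories only (no files and no subdirectories), which matches what `find -empty -type d` would report:

```python
def empty_dirs(lsf_lines):
    """Given the output lines of `rclone lsf -R remote:`, return the
    directories that contain no files and no subdirectories.
    Directory entries end with "/" in lsf output."""
    dirs = set()
    non_empty = set()
    for line in lsf_lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith("/"):
            dirs.add(line)
        # every ancestor of this entry contains something
        parts = line.rstrip("/").split("/")
        for i in range(1, len(parts)):
            non_empty.add("/".join(parts[:i]) + "/")
    return sorted(dirs - non_empty)
```

Feed it the captured output, e.g. `empty_dirs(open("listing.txt"))`, then resolve each path to its Drive folder ID for the API batches.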
If rmdirs deleted the files in parallel would that work better? Or is it the scanning that takes the time?
> If rmdirs deleted the files in parallel would that work better? Or is it the scanning that takes the time?
Yes, if we could do something like this it would be awesome
rclone rmdirs remote: --transfers 100 --drive-use-trash=false
My goal would be to remove as many empty directories as possible in parallel without having to send to trash
Would have to figure out how many calls I can do at one time per SA.
rmdirs is great, but I am trying to remove about 1 million empty folders scattered across hundreds of TDs.
Since each TD has its own SA users, I could easily batch-run rmdirs on each TD simultaneously, using the max calls per SA.
ncw
(Nick Craig-Wood)
August 17, 2023, 10:10am
6
Give this a try
v1.64.0-beta.7241.3b62a5242.fix-rmdirs-concurrent on branch fix-rmdirs-concurrent (uploaded in 15-30 mins)
The parameter to control how many get deleted at once is --checkers (not --transfers). So use --checkers 100 to delete 100 directories at once.
This deletes all the directories at a given level in the file system tree in parallel before deleting the next level up.
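That level-by-level strategy can be sketched in Python. This is an illustration of the idea, not rclone's actual implementation: `delete_fn` stands in for the real remote call and `checkers` for the `--checkers` value.

```python
from concurrent.futures import ThreadPoolExecutor

def delete_empty_dirs_levelwise(dirs, delete_fn, checkers=100):
    """Delete directories deepest-first, all directories at one depth
    in parallel, so a child is always removed before its parent."""
    by_depth = {}
    for d in dirs:
        # depth = number of "/" separators in the normalized path
        by_depth.setdefault(d.rstrip("/").count("/"), []).append(d)
    for depth in sorted(by_depth, reverse=True):
        with ThreadPoolExecutor(max_workers=checkers) as pool:
            # the pool drains one whole level before the next starts
            list(pool.map(delete_fn, by_depth[depth]))
```

The key property is the barrier between levels: parallelism only ever happens among siblings at the same depth, so ordering constraints between parent and child are preserved.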
Awesome, ty very much, will try it out in the morning when my head is on straight!
ncw
(Nick Craig-Wood)
August 17, 2023, 1:59pm
8
Oops, that one had a bug - try this one!
v1.64.0-beta.7241.6ab1e62e7.fix-rmdirs-concurrent on branch fix-rmdirs-concurrent (uploaded in 15-30 mins)
ncw
(Nick Craig-Wood)
August 17, 2023, 5:14pm
9
This is great!
I ran a few tests first on garbage folders I had made on several drives and it worked like a charm!
I also found I can run --checkers 200 without issue.
For those using Windows, here is the command I ran on 500 TDs at the same time using Windows Terminal.
With Terminal I opened 4 instances, each instance had 10 tabs, and 16 tiles per tab.
The command line I used was:
rclone -vP --drive-use-trash=false rmdirs REMOTE: --checkers 200 && cls
The && cls at the end means that if there were no errors (like a wrong config, etc.) it auto-clears the screen, so you know it finished without errors.
Windows NOTE:
Execute this command as Admin from a cmd prompt:
netsh int ipv4 set dynamicport tcp start=1025 num=64511
This will increase the number of simultaneous TCP connections you can run at the same time.
Without it you will get errors like this:
connectex: Only one usage of each socket address (protocol/network address/port) is normally permitted.
That only matters if you are running hundreds of remotes at the same time, though.
ncw
(Nick Craig-Wood)
August 18, 2023, 11:05am
11
Great - thanks for testing and the tips.
I've merged this to master now which means it will be in the latest beta in 15-30 minutes and released in v1.64
Can you look at the issue where lots of empty directories make --fast-list not work, and there is no override?
I can't run this command
rclone copy drive: cf: --transfers 25 -vP --stats 15s --fast-list --checkers 35 --size-only --multi-thread-streams 0 --no-traverse
Because rclone disables --fast-list (thinking it has hit the bug, since the directories are empty), Google Drive rate-limits it so much that it takes ~20 min for this folder.
There are ~5k subfolders, and they are empty because I now only use Google Drive for new files, but I need to keep the same folder layout as the other remote.
ncw
(Nick Craig-Wood)
August 19, 2023, 2:17pm
13
@random404
you can disable the feature like this
diff --git a/backend/drive/drive.go b/backend/drive/drive.go
index b85ce4a3c..0ddc92e5e 100644
--- a/backend/drive/drive.go
+++ b/backend/drive/drive.go
@@ -1891,7 +1891,7 @@ func (f *Fs) listRRunner(ctx context.Context, wg *sync.WaitGroup, in chan listRE
// drive where (A in parents) or (B in parents) returns nothing
// sometimes. See #3114, #4289 and
// https://issuetracker.google.com/issues/149522397
- if len(dirs) > 1 && !foundItems {
+ if false && len(dirs) > 1 && !foundItems {
if atomic.SwapInt32(&f.grouping, 1) != 1 {
fs.Debugf(f, "Disabling ListR to work around bug in drive as multi listing (%d) returned no entries", len(dirs))
}
Let me know if that works for you and I can make a flag.
It works!
I just tested a command and it took 4 min with no sign of even starting to transfer or check anything, just Google Drive limits.
Then I ran rclone with this change and the transfer took 29s to complete.
Please make this flag!!
Thanks as always and please check your inbox, I need to talk to you in private
For some reason this change even speeds up Dropbox transfers too, but I'm not sure why.
root@V01:~# rclone copy 1: dropbox: --transfers 25 -vP --stats 15s --fast-list --checkers 35 --size-only --multi-thread-streams 0 --no-traverse
2023-08-22 07:28:33 INFO : Signal received: interrupt
2023-08-22 07:28:33 INFO : Dropbox root 'qnlh8qpss1cebflf5r0f9hgn8c': Committing uploads - please wait...
2023-08-22 07:28:33 INFO : Exiting...
Transferred: 0 B / 0 B, -, 0 B/s, ETA -
Elapsed time: 2m43.2s
root@V01:~# ./rclone copy 1: dropbox: --transfers 25 -vP --stats 15s --fast-list --checkers 35 --size-only --multi-thread-streams 0 --no-traverse
2023-08-22 07:29:15 INFO : There was nothing to transfer
Transferred: 0 B / 0 B, -, 0 B/s, ETA -
Checks: 672 / 672, 100%
Elapsed time: 32.8s
2023/08/22 07:29:15 INFO :
Transferred: 0 B / 0 B, -, 0 B/s, ETA -
Checks: 672 / 672, 100%
Elapsed time: 32.8s
2023/08/22 07:29:15 INFO : Dropbox root 'qnlh8qpss1cebflf5r0f9hgn8c': Committing uploads - please wait...
ncw
(Nick Craig-Wood)
September 8, 2023, 8:20am
16
I stuck that into a flag which you'd use with --drive-fast-list-bug-fix=false - I think that should work for you.
v1.64.0-beta.7338.408f79920.fix-drive-list on branch fix-drive-list (uploaded in 15-30 mins)
system
(system)
Closed
October 8, 2023, 8:20am
17
This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.