Rclone just uses 4 processes

What is the problem you are having with rclone?

I am working with Rclone to list and copy millions of objects from Azure Blob Storage to AWS S3, but when I run the rclone lsl or rclone copy command I found that it creates 4 or 5 processes on my machine. Is there a way to increase the number of processes my instance will use?

The goal is to finish the job faster.

Run the command 'rclone version' and share the full output of the command.

rclone v1.57.0

  • os/version: amazon 2 (64 bit)
  • os/kernel: 4.14.252-195.483.amzn2.x86_64 (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.17.2
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Azure blob storage
AWS S3

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copyto azure:cars/ --max-age 2021-07-01 --min-age 2021-08-01 AWS:cars -P

The rclone config contents with secrets removed.

nothing

A log from the command with the -vv flag

2022/01/05 19:29:35 DEBUG : 1667238/1667238_10_n.jpg: Excluded from sync (and deletion)
2022/01/05 19:29:35 DEBUG : 1667238/1667238_10_s.jpg: Excluded from sync (and deletion)
2022/01/05 19:29:35 DEBUG : 1667238/1667238_10_t.jpg: Excluded from sync (and deletion)
2022/01/05 19:29:35 DEBUG : 1667238/1667238_10_v.jpg: Excluded from sync (and deletion)

I think you have threads and processes confused.

A single rclone command will spawn one process. That process can spawn more threads.

To increase parallelism, you can use:

      --checkers int                         Number of checkers to run in parallel (default 8)

That helps with checking files.

      --transfers int                        Number of file transfers to run in parallel (default 4)

That increases the number of transfers.

Increase to meet your needs.
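
As a minimal sketch, both flags can be raised on a copy like this. The values 16 and 32 in the usage comment are arbitrary assumptions, not recommendations, and `parallel_copy` and the `RCLONE` override are hypothetical helpers added here only so the command can be previewed before it is run.

```shell
# Hypothetical wrapper around "rclone copy" with explicit parallelism settings.
# Set RCLONE=echo to print the arguments instead of running rclone.
parallel_copy() {
    src=$1
    dst=$2
    transfers=$3
    checkers=$4
    "${RCLONE:-rclone}" copy "$src" "$dst" \
        --transfers "$transfers" --checkers "$checkers" -P
}

# Usage (remote names taken from the command earlier in this thread):
#   RCLONE=echo parallel_copy azure:cars AWS:cars 16 32   # preview only
#   parallel_copy azure:cars AWS:cars 16 32               # real run
```

Higher values only help until bandwidth or provider rate limits become the bottleneck, so it is worth raising them in steps.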

Why not include the config? I'm confused.

You have the azure remote configured.

And the AWS remote.

Are you not using an rclone.conf?

When I run my command with the flags --transfers 8 --checkers 12 added, I get the same number of threads.

image

How could I create more threads for my process?

You'd have to share what you run, how you are identifying 'threads', and a debug log.

You have a snippet of a log above, no command, output, a barely legible htop (I think) screenshot which might be reporting threads or not. I have no idea.

My base command of a lsl:

felix@gemini:~$ rclone lsl DB: >> /dev/null

Get process ID:

felix      47097   47049  7 15:10 pts/1    00:00:04 rclone lsl DB:

Run

top -H -p 47097

That lists out threads and without changing anything I get way more than 4 threads.

top - 15:10:29 up  7:27,  2 users,  load average: 0.76, 0.49, 0.66
Threads:  17 total,   0 running,  17 sleeping,   0 stopped,   0 zombie
%Cpu(s):  9.3 us,  1.4 sy,  0.4 ni, 87.5 id,  1.3 wa,  0.0 hi,  0.2 si,  0.0 st
MiB Mem :  31995.6 total,    279.6 free,   7029.9 used,  24686.1 buff/cache
MiB Swap:   8192.0 total,   8061.5 free,    130.5 used.  24408.1 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  47099 felix     20   0  748828  47848  29980 S   1.0   0.1   0:00.15 rclone
  47108 felix     20   0  748828  47848  29980 S   1.0   0.1   0:00.19 rclone
  47100 felix     20   0  748828  47848  29980 S   0.7   0.1   0:00.13 rclone
  47109 felix     20   0  748828  47848  29980 S   0.7   0.1   0:00.11 rclone
  47116 felix     20   0  748828  47848  29980 S   0.7   0.1   0:00.14 rclone
  47097 felix     20   0  748828  47848  29980 S   0.3   0.1   0:00.20 rclone
  47102 felix     20   0  748828  47848  29980 S   0.3   0.1   0:00.10 rclone
  47104 felix     20   0  748828  47848  29980 S   0.3   0.1   0:00.09 rclone
  47106 felix     20   0  748828  47848  29980 S   0.3   0.1   0:00.16 rclone
  47107 felix     20   0  748828  47848  29980 S   0.3   0.1   0:00.08 rclone
  47110 felix     20   0  748828  47848  29980 S   0.3   0.1   0:00.06 rclone
  47098 felix     20   0  748828  47848  29980 S   0.0   0.1   0:00.06 rclone
  47101 felix     20   0  748828  47848  29980 S   0.0   0.1   0:00.00 rclone
  47103 felix     20   0  748828  47848  29980 S   0.0   0.1   0:00.14 rclone
  47105 felix     20   0  748828  47848  29980 S   0.0   0.1   0:00.00 rclone
  47111 felix     20   0  748828  47848  29980 S   0.0   0.1   0:00.00 rclone
  47113 felix     20   0  748828  47848  29980 S   0.0   0.1   0:00.14 rclone
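
The same count can also be read without the interactive view. This is a rough sketch that assumes Linux with /proc mounted; `count_threads` is a hypothetical helper name, not something rclone provides.

```shell
# Count the threads of a process by listing its task directory.
# Assumes Linux with /proc mounted; count_threads is a hypothetical helper.
count_threads() {
    pid=$1
    if [ ! -d "/proc/$pid/task" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # Each entry under /proc/<pid>/task is one thread of that process.
    ls "/proc/$pid/task" | wc -l | tr -d ' '
}

# Example: thread count of the current shell.
# Substitute the rclone PID (47097 in the output above) to inspect rclone.
count_threads "$$"
```

The number will vary between runs and machines, so it is only useful as a rough observation.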

I've never seen a need to monitor 'threads' for rclone. It might be better to ask what problem you are trying to overcome.

My base command of a lsl:

rclone lsl AWS:cars >> /dev/null

process:

top -H -p 4395

image

My command with flags

rclone lsl AWS:cars --transfers 8 --checkers 12 >> /dev/null

process:

top -H -p 4383

image

Both commands have the same number of threads.

Again, "what is the problem you are trying to solve?"

I want to run my command, but I would like it to use more threads. In my case both commands use 5 threads, and I want to run my command with 10 or more threads.

I'm still not sure what you are asking.

Why do you want it to use more threads?

Perhaps let's take a step back, as I think there is still some confusion.

If you want more parallelism for a copy, that's the initial question you posed, which I answered.

If you are trying to get more parallelism for an "ls" command, there isn't any magic there: an API call happens, it asks for a directory, and it gets results.

If you split that into 100 "threads", it would be god awful slow (not that you can anyway).

So back to my question, what problem are we trying to solve?

Slow speeds on a copyto command? That's more checkers/transfers.

Understood. If I want to use the copy command, I would just need to add the --transfers and --checkers flags.

But I was wondering, why does your lsl command use more threads? I saw your output and you got 17 threads. How is that possible?

Yep and that'll give you performance improvements.

Trying to compare thread counts for a process across two different systems isn't a path I'd go down. I'd imagine my remote was doing some recursion, or I took the snapshot at a different point than you did and it was listing, printing out, and starting another directory or something.

Dropbox could be more chatty with DNS or something; there could be quite a number of different factors. If your goal is performance, just work through checkers/transfers, as you really can do nothing to influence the number of threads. That's well underneath the covers, other than increasing the flags I've shared.

I was testing the transfers and checkers flags, but I do not see good results.
This is my command:

rclone copyto azure:cars/ AWS:cars --checkers 12 --transfers 8 -P

process

image

My command continues using 5 threads.

You still seem to be stuck on threads :frowning:

If you want to troubleshoot performance issues, we need a debug log, and a full one.

What counts as not good results?
What performance are you expecting?
What is your internet speed rated at?
What are you trying to achieve?

Without specifics and debug logs, there is nothing to do. I can't stress enough, stop worrying about threads.
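
One way to gather those specifics is to time a run while writing a full debug log. The sketch below is only an illustration: `time_cmd` is a hypothetical helper, the log file names are placeholders, and it measures whole seconds of wall-clock time rather than throughput.

```shell
# Run a command and print how many whole seconds it took.
# time_cmd is a hypothetical helper; the command's own output is discarded.
time_cmd() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    echo "$((end - start))"
    return "$status"
}

# Usage (remote names from this thread; log file names are placeholders):
#   time_cmd rclone copy azure:cars AWS:cars --transfers 4 --checkers 8 -vv --log-file rclone-t4.log
#   time_cmd rclone copy azure:cars AWS:cars --transfers 8 --checkers 12 -vv --log-file rclone-t8.log
```

A second run against the same destination skips files that were already copied, so the timings are only comparable when each run works on a separate set of files. The log files written with -vv are what would be shared for troubleshooting.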
