Rclone rc High memory usage

Hi
I am using rclone rc version 1.49 to list files with a max-age of 3 days on a Linux machine and copy them into ObjectStore.
The application sends an rclone list command to a remote machine every 5 min. After processing the retrieved list of files, it sends an rclone rc sync/copy command to copy only the directories that we know do not exist in ObjectStore. The copy command runs only a couple of times per hour, and each directory is about 20 MB.
On every occasion the rclone-rc process was using a huge amount of system resources, particularly system memory and caches: well over 60% of system RAM (nearly 10 GB!), plus another 15 GB of swap space.

Any help debugging this memory leak would be highly appreciated.

That's an old version. You'd want to upgrade and check this link for helping debug a memory related issue:

https://rclone.org/rc/#debugging-rclone-with-pprof

Thanks for the reply. As we are in a prod env, changing the version is not that easy :wink:
Do you think upgrading rclone will resolve the issue? Actually, rclone on Windows has no memory issue. We only have the problem with the Linux one.
Also, I read about the mmap flag. Does the fix in the latest version have something to do with the mmap flag? Do we need to enable it to see the difference?

There have been a number of leaks fixed over the last releases so it's definitely likely!

While we are testing the new version of rclone, I have another question.
We have set JobExpireDuration to 1 day. Do you think this may contribute to such high memory usage?

I am getting an error while trying go pprof as described in the above link.
https://ip:port/debug/pprof/goroutine?debug=1: Get https://ip:port/debug/pprof/goroutine?debug=1: x509: cannot validate certificate for <ip> because it doesn't contain any IP SANs

How can I make it skip the certificate check?

Thanks

That’s something on your side as it’s just a public website link.

Are you using a proxy or something else in the middle?

The browser is probably trying to redirect the HTTP traffic to HTTPS, which won't work since rclone doesn't include any SSL certificate for the rc. You need to change it to the http variant, not the https variant.
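If the rc server is started with --rc-cert and --rc-key (as in the flags posted further down this thread), it does serve HTTPS, and the x509 error comes from the certificate having no IP SAN. In that case one option is to download the profile yourself with certificate checks disabled and then point `go tool pprof` at the saved file. The following is only a minimal Python sketch of that idea: the helper names are made up for illustration, and it assumes the pprof endpoints sit behind the same basic auth as the rest of the rc server. Skipping verification is only reasonable on a network you trust.

```python
import base64
import ssl
import urllib.request


def insecure_tls_context() -> ssl.SSLContext:
    """Return a TLS context that skips certificate and hostname checks."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # has to be disabled before verify_mode
    ctx.verify_mode = ssl.CERT_NONE  # accept self-signed / SAN-less certificates
    return ctx


def profile_request(base_url, user, password, profile="heap"):
    """Build a GET request for /debug/pprof/<profile> with basic auth.

    base_url, user and password are placeholders for your own rc address
    and credentials.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(f"{base_url.rstrip('/')}/debug/pprof/{profile}")
    req.add_header("Authorization", f"Basic {token}")
    return req


def save_profile(base_url, user, password, out_path, profile="heap"):
    """Download a profile and write it to out_path; returns the byte count."""
    req = profile_request(base_url, user, password, profile)
    with urllib.request.urlopen(req, context=insecure_tls_context(), timeout=30) as resp:
        data = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(data)
    return len(data)
```

Calling something like `save_profile("https://ip:port", "user", "pass", "heap.out")` and then running `go tool pprof -text heap.out` should give the same view as pointing pprof at the URL directly.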

How many jobs? I'd guess a job might take 1k of RAM to store.

Hi Nick
As soon as rclone starts running, it begins creating a job per second even if we do not send any rc command. I used job/status to see what these jobs are, and they look like core/stats output:

rclone rc job/status --json '{"jobid":27438}'  --rc-addr=http://user:pas@ip:1235
{
	"duration": 0.000009134,
	"endTime": "2020-02-04T13:04:51.7177Z",
	"error": "",
	"finished": true,
	"group": "job/27438",
	"id": 27438,
	"output": {
		"bytes": 0,
		"checks": 0,
		"deletes": 0,
		"elapsedTime": 0,
		"errors": 0,
		"fatalError": false,
		"retryError": false,
		"speed": 0,
		"transfers": 0
	},
	"startTime": "2020-02-04T13:04:51.717691Z",
	"success": true
}

rclone generates jobs like this every sec!!
Yesterday, after an hour of running rclone, I had about 1500 jobs and this number kept growing.
Now I have reduced JobExpireDuration to 2h and currently have only 170 jobs.
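As a rough sanity check on whether the retained jobs could explain the memory growth, the numbers in this thread can be plugged into a back-of-the-envelope estimate. This is only a sketch under two assumptions: the job rate stays at roughly 1500 per hour, and Nick's guess of about 1k of RAM per job is in the right ballpark. The function names are made up for illustration.

```python
def retained_jobs(jobs_per_hour: float, expire_hours: float) -> int:
    """Jobs kept in memory at steady state: new jobs arrive at a constant
    rate and each one is dropped after the expiry duration."""
    return round(jobs_per_hour * expire_hours)


def job_memory_mib(jobs: int, bytes_per_job: int = 1024) -> float:
    """Estimated memory for the retained jobs, in MiB.

    bytes_per_job defaults to the ~1k guess from the thread; it is an
    assumption, not a measured value.
    """
    return jobs * bytes_per_job / (1024 * 1024)


# Compare the original 1 day expiry with the reduced 2h expiry.
for expire_hours in (24, 2):
    jobs = retained_jobs(1500, expire_hours)
    print(f"expiry {expire_hours:>2}h: ~{jobs} jobs, ~{job_memory_mib(jobs):.1f} MiB")
```

Under those assumptions even the 1 day expiry only accounts for tens of MiB, which is far below the reported 10 GB, so the job list by itself is unlikely to be the whole story.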

But we still have the memory problem, and only on Linux machines: memory usage keeps growing linearly until rclone uses about 60% of memory (nearly 10 GB, plus another 15 GB of swap space) and the server crashes.
We upgraded rclone to v1.51, but it made no difference.

Can you please run with -vv and share the debug log? We can see what is adding the jobs.

Here is the log file from this command:

rclone rcd --rc-addr=ip:port --rc-user=user --rc-pass=pass -vv

2020/02/04 13:58:42 DEBUG : rc: "core/stats": with parameters map[]

2020/02/04 13:58:42 DEBUG : rc: "core/stats": reply map[bytes:0 checks:0 deletes:0 elapsedTime:0 errors:0 fatalError:false retryError:false speed:0 transfers:0]: <nil>

2020/02/04 13:58:42 DEBUG : rc: "core/stats": with parameters map[]

2020/02/04 13:58:42 DEBUG : rc: "core/stats": reply map[bytes:0 checks:0 deletes:0 elapsedTime:0 errors:0 fatalError:false retryError:false speed:0 transfers:0]: <nil>

2020/02/04 13:58:47 DEBUG : rc: "core/stats": with parameters map[]

2020/02/04 13:58:47 DEBUG : rc: "core/stats": reply map[bytes:0 checks:0 deletes:0 elapsedTime:0 errors:0 fatalError:false retryError:false speed:0 transfers:0]: <nil>

2020/02/04 13:58:47 DEBUG : rc: "core/stats": with parameters map[]

2020/02/04 13:58:47 DEBUG : rc: "core/stats": reply map[bytes:0 checks:0 deletes:0 elapsedTime:0 errors:0 fatalError:false retryError:false speed:0 transfers:0]: <nil>

2020/02/04 13:58:52 DEBUG : rc: "core/stats": with parameters map[]

2020/02/04 13:58:52 DEBUG : rc: "core/stats": reply map[bytes:0 checks:0 deletes:0 elapsedTime:0 errors:0 fatalError:false retryError:false speed:0 transfers:0]: <nil>

2020/02/04 13:58:52 DEBUG : rc: "core/stats": with parameters map[]

2020/02/04 13:58:52 DEBUG : rc: "core/stats": reply map[bytes:0 checks:0 deletes:0 elapsedTime:0 errors:0 fatalError:false retryError:false speed:0 transfers:0]: <nil>

2020/02/04 13:58:57 DEBUG : rc: "core/stats": with parameters map[]

2020/02/04 13:58:57 DEBUG : rc: "core/stats": reply map[bytes:0 checks:0 deletes:0 elapsedTime:0 errors:0 fatalError:false retryError:false speed:0 transfers:0]: <nil>

As you can see, it is just core/stats!
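To check the whole -vv log rather than an excerpt, a small script can count which rc methods are called and over what time span. This is a minimal sketch that assumes the log lines look exactly like the ones pasted above (`DEBUG : rc: "<method>": with parameters ...`); the function name is made up for illustration.

```python
import re
from datetime import datetime

# Matches the "with parameters" line that rclone logs for each incoming rc call.
CALL_RE = re.compile(
    r'^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) DEBUG : rc: "([^"]+)": with parameters'
)


def summarize_rc_calls(lines):
    """Return {method: {"calls": n, "span_seconds": s}} for the given log lines."""
    stamps_by_method = {}
    for line in lines:
        match = CALL_RE.match(line.strip())
        if not match:
            continue  # reply lines, blank lines, and anything else are skipped
        stamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
        stamps_by_method.setdefault(match.group(2), []).append(stamp)

    summary = {}
    for method, stamps in stamps_by_method.items():
        span = (max(stamps) - min(stamps)).total_seconds()
        summary[method] = {"calls": len(stamps), "span_seconds": span}
    return summary
```

Running it as `summarize_rc_calls(open("rclone.log"))` shows at a glance whether anything other than core/stats is reaching the server, and how often.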

Are you running something against it to cause that? Are you checking stats every 5 seconds?

Nothing at all. I just ran the rclone rcd command and am sitting and watching it :wink:

Something is hitting it and running the core/stats command though as shown from the logs. You aren't sure what's doing that?

I ran rclone rcd on 3 different machines for testing and got the same result. I am sure that I am not connecting to rclone remotely in any way.

That's probably step 1 as something is running core/stats every 5 seconds. That should not cause high memory usage (I tested as I ran an infinite loop and did a few thousand of them and memory didn't move).

You'd need to grab a debug log when the high memory usage is going on so we can see what is running.

This will take about a day to reach the crash point. I will set the log level and let you know when memory usage is high.
Thanks

Here is the rclone memory usage. The points where it drops to zero are the times that we restarted it.
As you can see, it just goes up. It is not even responsible for any heavy task: it just sends us a list of files within a specific age (about 10000 files) every 5 min and copies about 100 folders (20 MB each) over the course of a day.

After setting the job expire duration to 2h we still have an upward trend, but it is not as bad as it used to be.

Here are the flags that we set:

rclone rcd \
        --rc-htpasswd=rclone-rc.htpasswd \
        --rc-addr=${RCLONEIP}:5572 \
        --checkers=8 \
        --transfers=4 \
        --rc-cert=rclone-rc-certificate.pem \
        --rc-key=rclone-rc-key.pem \
        --log-file=rclone.log \
        --log-level=INFO \
        --update \
        --use-server-modtime

We also tried reducing --buffer-size and using the --use-mmap flag, but with no success.
What are your thoughts?

Thanks :slight_smile:

Can you share the debug log and the memory dump?

https://rclone.org/rc/#debugging-memory-use
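Alongside the heap dump, it can help to sample rclone's own view of its memory over time with the core/memstats rc call and compare it with what the OS reports. The following is a minimal sketch, not a definitive tool: the helper names are made up, authentication and TLS handling are left out for brevity (your setup uses both), and it assumes the reply contains a `HeapAlloc` field as in Go's runtime memory statistics.

```python
import json
import urllib.request


def memstats_request(base_url):
    """Build the POST request for the core/memstats rc call (no auth shown)."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/core/memstats",
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_memstats(base_url, timeout=10):
    """Call core/memstats and return the decoded JSON reply."""
    with urllib.request.urlopen(memstats_request(base_url), timeout=timeout) as resp:
        return json.load(resp)


def growth_bytes_per_hour(samples, field="HeapAlloc"):
    """Average growth of one memstats field between the first and last sample.

    samples is a list of (unix_seconds, memstats_dict) tuples, oldest first.
    """
    (t0, first), (t1, last) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing timestamps")
    return (last[field] - first[field]) * 3600 / (t1 - t0)
```

If the Go heap figure stays flat while the process size reported by the OS keeps climbing, that points away from a plain heap leak and is worth mentioning when you share the dump.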