Historically, sync memory usage was 40-50% of system memory. Now it hits 100% and rclone exits without finishing.
The system running rclone is dedicated to this function. It is Ubuntu 20.04 Server, no GUI, with 32 GiB of RAM.
The source has 30M files, which is quite large, but the sync has been working well for us for many months. We are not sure why we are seeing such high memory usage now.
We normally use 32 checkers, 32 transfers, and an Azure Blob upload concurrency of 32, but we scaled those down today to see if it would help. (It did not.)
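For context, the command is shaped roughly like this (remote names and paths below are placeholders, not our real ones):

rclone sync smb-source:share/data azureblob-dest:container \
    --checkers 32 \
    --transfers 32 \
    --azureblob-upload-concurrency 32 \
    --track-renames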
Run the command 'rclone version' and share the full output of the command.
rclone v1.66.0
os/version: ubuntu 20.04 (64 bit)
os/kernel: 5.4.0-182-generic (x86_64)
os/type: linux
os/arch: amd64
go/version: go1.22.1
go/linking: static
go/tags: none
Which cloud storage system are you using? (eg Google Drive)
Source - SMB
Destination - Azure Blob
The command you were trying to run (eg rclone copy /tmp remote:tmp)
We have tried reducing the checkers/transfers/concurrency values multiple times and rerun the command, but we still hit an out-of-memory condition. We got as far as:
Guessing it's the --track-renames option that is doing us in. But I don't really understand, since we've been using it for months. The number of objects has been slowly increasing, but nothing like 2x in the last 24 hours. So it's odd that memory usage has more than doubled in that timeframe.
The full debug log is too enormous to post (and most of it is successful processing), but I updated the original post with what I saw when rclone was being run by our script. The last line says "Killed" and rclone exits with error code 137. I'm guessing it was killed by the kernel/OS due to memory exhaustion.
Edit to add:
Now that I think about it, it's probably rclone that ended itself, since it set an exit code. The "Killed" message is probably because it was a background process (& at the end) and the script was doing a wait for it to finish. Also, I'm not sure off-hand what the exit code would be if the OS killed the process.
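For reference, the relevant part of our script looks roughly like this (paths and remote names are placeholders). If I understand the shell convention correctly, a process terminated by signal N is reported as 128 + N, so a kernel OOM kill (SIGKILL, signal 9) would also come back through wait as 137:

# simplified sketch of how the script launches rclone
rclone sync smb-source:share/data azureblob-dest:container --track-renames &
pid=$!
wait "$pid"
status=$?
echo "rclone exited with status $status"
# 137 = 128 + 9, i.e. consistent with SIGKILL from the kernel OOM killer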
Does that work if I'm not running rclone in "rc" mode?
And yeah, crossing some threshold seems logical on the surface, but we have only been increasing the file count incrementally over time, yet RAM usage suddenly more than doubled.
Edit to add: never mind, I looked and it does require "rc" mode. I'll have to rework some things to test in that mode. But thanks for the tip.
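The rework I have in mind looks roughly like this (default rc address assumed; remote names and paths are placeholders): start the sync with the remote control API enabled, then query memory statistics from another shell while it runs.

rclone sync smb-source:share/data azureblob-dest:container \
    --track-renames --rc --rc-addr localhost:5572 &

# from another shell, while the sync is running:
rclone rc core/memstats    # reports the Go runtime's memory statistics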
Anyone able to answer the above questions? I'm curious whether there is a way to project the memory requirements for the --track-renames option.
I do think this was the root issue - a high-level folder containing millions of files was possibly renamed, which ballooned the memory needed for tracking renames.
Not sure I follow. I would guess --dry-run will consume memory similarly to a production run, as far as the track-renames feature is concerned. So I would expect it to crash at some point as system memory is exhausted.
What I'm hoping to find out is some statistic like: for every 1,000 files renamed we need XX MB of RAM (or the like).
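As a rough back-of-envelope while waiting for a better answer: if one assumes on the order of 1 KiB of in-memory metadata per object rclone has to hold - an assumption (a ballpark often quoted for rclone keeping object listings in memory), not a documented number for --track-renames specifically - then:

1,000 tracked files     -> ~1 MiB
1,000,000 tracked files -> ~1 GiB

So a renamed top-level folder holding a few million files would plausibly add a few GiB on top of rclone's normal usage, which on a 32 GiB box could be enough to tip us over.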