I'm trying to copy 450k files from one S3-like service to another, but rclone memory consumption is getting it killed.
After about 10 minutes it's already consuming over 30GB of RAM, but has checked only about 60k files. My VPS has around 60GB of RAM.
I saw the response above and also tried it. It goes much slower, but with constant memory usage for about an hour, then memory starts to rise until the process gets killed. It seems that memory starts to rise when copying from a folder with about 40k files.
I found many posts in this forum complaining about this problem when dealing with many millions of files. This indeed can be a problem in the future for my use case, but is it expected to be a problem even for half a million files?
Maybe I'm missing something, but it should be possible to copy files with constant memory usage, no?
Run the command 'rclone version' and share the full output of the command.
rclone v1.60.0
os/version: arch (64 bit)
os/kernel: 5.18.12-arch1-1 (x86_64)
os/type: linux
os/arch: amd64
go/version: go1.19.2
go/linking: dynamic
go/tags: none
Which cloud storage system are you using? (eg Google Drive)
Contabo (Ceph) to IDrive E2.
The command you were trying to run (eg rclone copy /tmp remote:tmp)
It does use a consistent amount, as it's based on objects, so more objects == more memory. It's written up much better by ncw in a few of those links you've already shared, so I won't try to rehash it.
This sounds strange to me; that is more than 500KB of RAM per file. In the worst case I would expect 2KB of RAM per file. How did you determine these two numbers?
I checked rclone's RAM usage with Glances and top.
And rclone displayed the checked-files count in the terminal because of -P.
What is the highest number of objects (files and folders) in a folder?
It does use a consistent amount, as it's based on objects, so more objects == more memory. It's written up much better by ncw in a few of those links you've already shared, so I won't try to rehash it.
Sorry, I couldn't find the explanation.
I guess it does this to reduce remote requests, especially when doing a sync. But even then, it seems to me it could be done with minimal and constant memory usage. I guess I'll try to implement it in Python with Boto3 later; I'm already using that lib in my app anyway...
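Something along these lines is what I have in mind - just a rough sketch, and the endpoints, bucket names and profile names below are placeholders, not my real config:

import boto3

# Placeholders - substitute the real endpoints, buckets and credential profiles.
src = boto3.session.Session(profile_name="contabo").client(
    "s3", endpoint_url="https://SRC-ENDPOINT.example.com")
dst = boto3.session.Session(profile_name="idrive").client(
    "s3", endpoint_url="https://DST-ENDPOINT.example.com")

SRC_BUCKET, DST_BUCKET = "source-bucket", "dest-bucket"

# list_objects_v2 is paginated (1000 keys per page), so the full listing is
# never held in memory at once.
for page in src.get_paginator("list_objects_v2").paginate(Bucket=SRC_BUCKET):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        # Stream each object straight from source to destination;
        # upload_fileobj reads the body in chunks, so per-object memory
        # use stays roughly constant.
        body = src.get_object(Bucket=SRC_BUCKET, Key=key)["Body"]
        dst.upload_fileobj(body, DST_BUCKET, key)

Of course this skips the size/modtime comparison that rclone's sync does and doesn't preserve metadata or retry anything, but it should at least show whether a plain paginated copy stays flat on memory.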
This will be a bit tricky, because with 1 checker it takes over an hour to crash, but I'll try it later. I can say it uses lots of VIRT and RES, though; I didn't check SHR.
Nope, I do have all those files in a single folder. Is that bad?
This doesn't match my expectation and I (currently) cannot explain why this happens.
Here is a quick calculation to let you understand why I am puzzled:
Let's say rclone is checking the folder with 208,000 pdf files; it will then need to collect an entire listing from both remotes before comparing them. Let's say that is 430,000 objects in total, to make the calculation easy (and perhaps only half of this if you haven't transferred anything yet).
There is nothing else being checked concurrently (because checkers=1) and there are no transfers ongoing (because --dry-run). So the majority of the 43G of RAM used is being used by (something related to) the 430,000 objects.
That is, rclone uses about 100KB of RAM per object, where our usual rule of thumb is 1-2KB.
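Or, as a quick sanity check in Python with the rough numbers from this thread:

observed = 43e9      # ~43G of RAM reported by top during the --dry-run
objects  = 430_000   # assumed combined listing of both remotes
print(observed / objects)   # -> 100000.0, i.e. ~100KB per object
print(objects * 2_000)      # -> 860000000, i.e. under 1GB expected at 2KB per object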
I will now do some research and small scale tests to understand this better, and possibly call for more expertise.
One of the things I will investigate is --memprofile, but I haven't tried it before - feel free to try it.
This one was made without checkers=1. I'm not sure why it's "only" using 5GB. I'm running one with checkers=1 and I'll try to compare the value shown by top with the one in the profile.
The difference between the top value and the amount of RAM Go thinks it is using is normal; it is to do with memory fragmentation and memory which hasn't been released back to the OS yet.
This is direct from the AWS SDK.
I note you are using Ceph so I guess this could be some sort of compatibility issue?
Could you generate an svg from the memory profile and attach it? That has a lot more info in it.
Did you try any previous versions of rclone? It might be worth trying some older versions to see if they have the same problem - this will tell us whether it is a problem with a specific version of the SDK.
I looked through recent bugs in the SDK and I couldn't see any with memory issues.
You can also do this to see how much memory each object takes on average. Point it at a subdirectory that you know will finish.
$ rclone test memory s3:rclone
2022/10/25 15:41:53 NOTICE: 62 objects took 175248 bytes, 2826.6 bytes/object
2022/10/25 15:41:53 NOTICE: System memory changed from 35239176 to 35239176 bytes a change of 0 bytes
Hi! Sorry for the delay! I'm still waiting for confirmation from their support, but it seems IDrive E2 has a bug...
When listing files from that folder with over 80k files, it keeps listing the same files forever. That's why rclone size also doesn't work. The same problem happens with boto3. So I really think it's a problem in their API implementation, not in rclone.
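For anyone who wants to reproduce this with plain Boto3, something along these lines should show it (the endpoint, bucket and prefix are placeholders): it walks the pages by hand and stops as soon as a key or continuation token repeats.

import boto3

# Placeholders - point this at the IDrive e2 bucket/prefix that misbehaves.
s3 = boto3.client("s3", endpoint_url="https://DST-ENDPOINT.example.com")
BUCKET, PREFIX = "dest-bucket", "folder-with-80k-files/"

seen_keys, seen_tokens = set(), set()
kwargs = {"Bucket": BUCKET, "Prefix": PREFIX}
pages = 0

while True:
    resp = s3.list_objects_v2(**kwargs)
    pages += 1
    for obj in resp.get("Contents", []):
        if obj["Key"] in seen_keys:
            raise SystemExit(f"duplicate key after {pages} pages: {obj['Key']}")
        seen_keys.add(obj["Key"])
    if not resp.get("IsTruncated"):
        print(f"listing finished cleanly: {len(seen_keys)} keys in {pages} pages")
        break
    token = resp["NextContinuationToken"]
    if token in seen_tokens:
        raise SystemExit(f"continuation token repeated after {pages} pages")
    seen_tokens.add(token)
    kwargs["ContinuationToken"] = token

On a well-behaved bucket this finishes cleanly; with that folder the same keys just keep coming back, which matches what rclone size runs into.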