Memory leak when using `sync`/`lsf` with S3 and millions of files

What is the problem you are having with rclone?

Increasing memory usage when using sync/lsf with a directory that has millions of files.
I'm trying to follow the same steps here to transfer files from S3 to Wasabi, but the lsf command keeps running out of memory.

Run the command 'rclone version' and share the full output of the command.

$ rclone --version
rclone v1.59.1
- os/version: ubuntu 22.04 (64 bit)
- os/kernel: 5.15.0-1017-aws (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.18.5
- go/linking: static

Which cloud storage system are you using? (eg Google Drive)

AWS and Wasabi S3

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone lsf \
  --max-age 2022-08-17T17:09:00Z \
  --absolute \
  --transfers 16 \
  --no-gzip-encoding --ignore-checksum \
  --rc --rc-no-auth \
  --use-mmap --s3-memory-pool-use-mmap \
  aws:tarteel-session-data/sessions > object_names.txt

I've also tried the command with and without --use-mmap --s3-memory-pool-use-mmap, --transfers, and --no-gzip-encoding --ignore-checksum.
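One variant I haven't tried yet, so treat it as an untested sketch: GOGC is a standard Go runtime environment variable rather than an rclone flag, and lowering it should make the garbage collector run more often at the cost of some CPU. The value 20 below is just a guess:

GOGC=20 rclone lsf \
  --max-age 2022-08-17T17:09:00Z \
  --absolute \
  aws:tarteel-session-data/sessions > object_names.txt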

The rclone config contents with secrets removed.

[wasabi]
type = s3
provider = Wasabi
endpoint = s3.us-west-1.wasabisys.com
location_constraint = us-west-1
acl = private

[aws]
type = s3
provider = AWS
region = us-west-2
location_constraint = us-west-2

A log from the command with the -vv flag

2022/08/17 18:18:41 DEBUG : --max-age 1h9m41.985581351s to 2022-08-17 17:09:00.000015691 +0000 UTC m=-4181.968370533
2022/08/17 18:18:41 DEBUG : rclone: Version "v1.59.1" starting with parameters ["rclone" "lsf" "-vvv" "--max-age" "2022-08-17T17:09:00Z" "--absolute" "--transfers" "16" "--no-gzip-encoding" "--ignore-checksum" "--rc" "--rc-no-auth" "--use-mmap" "--s3-memory-pool-use-mmap" "aws:tarteel-session-data/sessions"]
2022/08/17 18:18:41 NOTICE: Serving remote control on http://localhost:5572/
2022/08/17 18:18:41 DEBUG : Creating backend with remote "aws:tarteel-session-data/sessions"
2022/08/17 18:18:41 DEBUG : Using config file from "/home/ubuntu/.config/rclone/rclone.conf"
2022/08/17 18:18:41 DEBUG : aws: detected overridden config - adding "{CDlB7}" suffix to name
2022/08/17 18:18:42 DEBUG : fs cache: renaming cache item "aws:tarteel-session-data/sessions" to be canonical "aws{CDlB7}:tarteel-session-data/sessions"

Here's a profile of what's going on:

github.com/aws/aws-sdk-go/private/protocol/xml/xmlutil.XMLToStruct and github.com/aws/aws-sdk-go/private/protocol/xml/xmlutil.(*XMLNode).findNamespaces are the two nodes with consistently increasing memory usage.

My machine has 32GB of RAM, and I always kill the command before it runs out of memory.
I can see there's a lot of downloading going on via btop, along with occasional CPU spikes (which is when I think the GC runs, but with no effect on memory — see the GC-trace sketch after the profile below).

go tool pprof -text http://127.0.0.1:5572/debug/pprof/heap
Fetching profile over HTTP from http://127.0.0.1:5572/debug/pprof/heap
Saved profile in /home/ubuntu/pprof/pprof.rclone.alloc_objects.alloc_space.inuse_objects.inuse_space.031.pb.gz
File: rclone
Type: inuse_space
Time: Aug 17, 2022 at 6:10pm (UTC)
Showing nodes accounting for 8985.20MB, 99.79% of 9004.31MB total
Dropped 100 nodes (cum <= 45.02MB)
      flat  flat%   sum%        cum   cum%
 6134.86MB 68.13% 68.13%  8350.46MB 92.74%  github.com/aws/aws-sdk-go/private/protocol/xml/xmlutil.XMLToStruct
 1872.59MB 20.80% 88.93%  1872.59MB 20.80%  github.com/aws/aws-sdk-go/private/protocol/xml/xmlutil.(*XMLNode).findNamespaces (inline)
  570.08MB  6.33% 95.26%   570.08MB  6.33%  github.com/rclone/rclone/backend/s3.(*Fs).newObjectWithInfo
  253.50MB  2.82% 98.08%   253.50MB  2.82%  encoding/xml.(*Decoder).name
      68MB  0.76% 98.83%       68MB  0.76%  encoding/xml.CharData.Copy (inline)
   64.66MB  0.72% 99.55%   634.74MB  7.05%  github.com/rclone/rclone/backend/s3.(*Fs).listDir.func1
   12.50MB  0.14% 99.69%      266MB  2.95%  encoding/xml.(*Decoder).rawToken
       9MB   0.1% 99.79%      275MB  3.05%  encoding/xml.(*Decoder).Token
         0     0% 99.79%   253.50MB  2.82%  encoding/xml.(*Decoder).nsname
         0     0% 99.79%  8358.46MB 92.83%  github.com/aws/aws-sdk-go/aws/request.(*HandlerList).Run
         0     0% 99.79%  8358.46MB 92.83%  github.com/aws/aws-sdk-go/aws/request.(*Request).Send
         0     0% 99.79%  8358.46MB 92.83%  github.com/aws/aws-sdk-go/aws/request.(*Request).sendRequest
         0     0% 99.79%  8358.46MB 92.83%  github.com/aws/aws-sdk-go/private/protocol/restxml.Unmarshal
         0     0% 99.79%  8358.46MB 92.83%  github.com/aws/aws-sdk-go/private/protocol/xml/xmlutil.UnmarshalXML
         0     0% 99.79%  8358.96MB 92.83%  github.com/aws/aws-sdk-go/service/s3.(*S3).ListObjectsV2WithContext
         0     0% 99.79%  8995.20MB 99.90%  github.com/rclone/rclone/backend/s3.(*Fs).List
         0     0% 99.79%   570.08MB  6.33%  github.com/rclone/rclone/backend/s3.(*Fs).itemToDirEntry
         0     0% 99.79%  8995.20MB 99.90%  github.com/rclone/rclone/backend/s3.(*Fs).list
         0     0% 99.79%  8358.96MB 92.83%  github.com/rclone/rclone/backend/s3.(*Fs).list.func1
         0     0% 99.79%  8995.20MB 99.90%  github.com/rclone/rclone/backend/s3.(*Fs).listDir
         0     0% 99.79%  8358.96MB 92.83%  github.com/rclone/rclone/fs.pacerInvoker
         0     0% 99.79%  8995.20MB 99.90%  github.com/rclone/rclone/fs/list.DirSorted
         0     0% 99.79%  8995.20MB 99.90%  github.com/rclone/rclone/fs/walk.walk.func2
         0     0% 99.79%  8358.96MB 92.83%  github.com/rclone/rclone/lib/pacer.(*Pacer).Call
         0     0% 99.79%  8358.96MB 92.83%  github.com/rclone/rclone/lib/pacer.(*Pacer).call
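To check whether those CPU spikes really are the GC (rather than just guessing), one option is the standard Go GC trace; this is a plain Go runtime setting, not anything rclone-specific, and I haven't run it here, so it's only a sketch:

GODEBUG=gctrace=1 rclone lsf \
  --max-age 2022-08-17T17:09:00Z \
  --absolute \
  aws:tarteel-session-data/sessions 2> gctrace.log > object_names.txt

Each collection writes a line to stderr with the heap size before and after, which would show whether the collector is running but simply can't free anything.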

I was hoping rclone would come to the rescue for my transfer, but it looks like I'm going to have to resort to just writing my own script or something...

See:

Yes, I read that, and it makes sense.
I used that to decide which instance size to use to run rclone (96vCPU, 512GB mem to play it safe :upside_down_face:.)

Unfortunately, even after leaving the lsf command running for a few hours, I don't get any results.
It was faster for me to write the Python script below and use boto3 to list all the keys, then download/transfer them, than it was to wait for rclone to finish.

For ~17M files it took about an hour (1:01:34), whereas rclone was still running...

import boto3
from tqdm import tqdm

# Estimate from the S3 console
NUM_OBJECTS = 17292115

def list_objects_parallel(bucket, prefix):
    objects = []
    pbar = tqdm(total=NUM_OBJECTS)
    s3_client = boto3.client("s3")
    paginator = s3_client.get_paginator('list_objects_v2')
    pagination_config = {
        # 1000 is the maximum page size the S3 API allows anyway
        "PageSize": 1000,
    }

    # Iterate over result pages; pages with no matches have no "Contents" key
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, PaginationConfig=pagination_config):
        results = [f"{key['Key']}\n" for key in page.get("Contents", [])]
        objects.extend(results)
        pbar.update(len(results))

    return objects


if __name__ == "__main__":
    bucket_name = "tarteel-session-data"
    key_prefix = "sessions/"

    all_keys = list_objects_parallel(bucket_name, key_prefix)
    with open("all_keys.txt", "w") as fd:
        fd.writelines(all_keys)

While rclone seems to be a very powerful tool, this looks like a big limitation that should probably be documented somewhere (single directories with millions of files basically don't work...)

Feel free to submit a pull request to update the documentation.

Sure!
A few questions though:

  1. Would this page be an appropriate place? Amazon S3
  2. Any insight into why this is happening (the large memory usage and freezing)?
    It doesn't have to be too technical, just a high-level explanation for my own reference so I can use this tool better in the future when needed.

In my readings I don't use S3 myself, but that tends to be where folks have large numbers of objects; it's the only place I've seen this come up.

A PR will allow some other folks to review and add comments as well, so it's always very much appreciated! @ncw is superb at that :slight_smile:

Also sharing the script I used to copy the files, in case someone finds it useful.

This took around 2 hours and 15 minutes for 15TB of data.

from concurrent.futures import ThreadPoolExecutor, as_completed
from functools import lru_cache, partial
from itertools import chain
import multiprocessing

import boto3
from botocore.errorfactory import ClientError
from tqdm import tqdm


@lru_cache()
def aws_s3_client():
    return boto3.client(
        "s3",
        region_name="us-west-2",
        aws_access_key_id="***",
        aws_secret_access_key="***",
    )


@lru_cache()
def wasabi_s3_client():
    return boto3.client(
        "s3",
        endpoint_url="https://s3.us-west-1.wasabisys.com",
        region_name="us-west-1",
        aws_access_key_id="***",
        aws_secret_access_key="***",
    )


def chunks(l, n):
    """Yield n number of striped chunks from l."""
    for i in range(0, n):
        yield l[i::n]


def move_file(key, bucket, src_client, dest_client):
    try:
        # Skip keys that already exist at the destination
        dest_client.head_object(Bucket=bucket, Key=key)
    except ClientError:
        # head_object failed, so the object is missing at the destination: copy it over
        original_object = src_client.get_object(
            Bucket=bucket,
            Key=key,
        )
        dest_client.put_object(
            Bucket=bucket, Key=key, Body=original_object["Body"].read()
        )


def process_chunk(chunk, bucket, max_workers=32):
    aws_s3 = aws_s3_client()
    wasabi_s3 = wasabi_s3_client()
    _move_file = partial(move_file, bucket=bucket, src_client=aws_s3, dest_client=wasabi_s3)
    failed_moves = []
    keys = chunk[0]
    position = chunk[1]

    with tqdm(total=len(keys), position=position) as pbar:
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            futures = {
                executor.submit(_move_file, key): key 
                for key in keys
            }
            for future in as_completed(futures):
                if future.exception():
                    failed_key = futures[future]
                    failed_moves.append(failed_key)
                    pbar.set_postfix_str(f"Failed: {failed_key}")
                pbar.update(1)
    return failed_moves

if __name__ == "__main__":
    bucket_name = "tarteel-session-data"
    # Adjust based on output from `nproc` and your CPU's threads per core (`lscpu`)
    num_proc = 126
    max_workers = 32
    filename = "all_keys.txt"

    with open(filename, "r") as fd:
        lines = fd.readlines()
    lines = [line.strip() for line in lines]
    line_chunks = list(chunks(lines, num_proc))
    # Add an index to help with positioning the progress bars
    idx = range(len(line_chunks))
    iterables = list(zip(line_chunks, idx))

    _process_chunk = partial(process_chunk, bucket=bucket_name, max_workers=max_workers)
    with multiprocessing.Pool(num_proc) as p:
        result = list(p.imap(_process_chunk, iterables))

    result = list(chain(*result))
    if len(result) > 0:
        failed_moves = [f"{item}\n" for item in result]
        with open("failed_keys.txt", "w") as fd:
            fd.writelines(failed_moves)
