Rclone takes 18 minutes to start copying then uses 8GB+ of memory

We currently use swift for production storage and s3 for backups. We are trying to use rclone to run nightly backups from swift to s3 but are running into high memory usage. Some of our servers are transferring tens of thousands of files. The rclone process starts, then does nothing for 18 minutes, at which point it is using up to 5GB of memory. Once the transfer starts we're seeing 8GB+ of memory use. I have reviewed some other posts about memory usage, and running with --cache-chunk-no-memory doesn't seem to help. I've even tried getting a list of files first and then transferring using --files-from, but memory usage still climbs to 8GB+.

Can you share what you are actually doing with the command?

Are you using a mount? A copy command?

This is a copy and we’re running it with the following parameters:

--ignore-checksum
--cache-workers=8
--transfers=24
--checkers=16
--cache-tmp-wait-time=10m
--buffer-size=0
--s3-upload-concurrency=1
--swift-chunk-size=100m
--s3-chunk-size=5m
--cache-chunk-no-memory
--ignore-size

And you are using the cache backend instead of just the VFS? I'd remove the cache backend, as I don't think this is a good use case for it. I forget whether --fast-list is supported on S3 or not, as my primary use is Google Drive.

I'm not sure why the memory usage would be so high; it might be related to the cache backend, so it may work a bit better without it as well.
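
If S3 does turn out to support --fast-list, that might be worth a test too, although --fast-list does the listing in fewer API calls at the cost of holding the whole listing in memory, so it may not help the memory side. A rough sketch (remote and bucket names here are placeholders):

rclone -v copy swift-remote:source-container s3-remote:dest-bucket --fast-list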

Sorry but I’m not completely familiar with some of the terminology. Are you suggesting I remove the cache-workers parameter? I’m currently running a test with:

--ignore-checksum
--transfers=24
--checkers=16
--buffer-size=0
--cache-chunk-no-memory
--ignore-size

By the way, I’m using rclone 1.44

What’s your rclone.conf look like (without the keys/passwords)?

/home/osi/.config/rclone/rclone.conf

[swift]
type = swift
env_auth = true
user =
key =
auth = http://192.168.100.50:5000/v2.0/
user_id =
domain =
tenant =
tenant_id =
tenant_domain =
region =
storage_url =
auth_token =
auth_version =
endpoint_type = internal

[s3-swift-backup]
type = s3
provider = AWS
env_auth = true
access_key_id =
secret_access_key =
region = eu-west-1
endpoint =
location_constraint = eu-west-1
acl = private
server_side_encryption =
storage_class =

Gotcha, so you are not using the cache backend at all (which is fine), which means none of those cache parameters are needed.

I'd hope someone who uses S3 can help out with performance tuning; my primary use is Google Drive, and it really isn't the same.

Can you share the full command line you are using please?

/usr/sbin/rclone -v copy swift:swift_bucket_target s3-swift-backup:s3_bucket_target --log-file=/home/oscar/rclone.log --ignore-checksum --cache-workers=8 --transfers=2 --cache-tmp-wait-time=10m --buffer-size=0 --s3-upload-concurrency=1 --swift-chunk-size=100m --s3-chunk-size=5m

We have tried transfers as high as 64, and the same with checkers, but the memory utilization is always the same.

I tried this with rclone 1.44 with the same memory results:

/usr/sbin/rclone144 -v copy swift:swift_bucket_target s3-swift-backup:s3_bucket_target --log-file=/home/oscar/rclone.log --ignore-checksum --transfers=24 --checkers=16 --buffer-size=0 --cache-chunk-no-memory --ignore-size

Hmm…

Can you run your command with the --rc flag, then follow the instructions here: https://rclone.org/rc/#debugging-rclone-with-pprof

And post the result of this here? That will tell us what is using the memory.

go tool pprof -web http://localhost:5572/debug/pprof/heap
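
In practice that looks something like the following: run your copy with --rc added to your usual flags, then query the heap profile from a second terminal while it is running (-text instead of -web works if the box has no GUI). Remote names here are placeholders:

rclone -v copy swift:source_container s3-backup:dest_bucket --rc [your usual flags]
go tool pprof -web http://localhost:5572/debug/pprof/heap
go tool pprof -text http://localhost:5572/debug/pprof/heap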

Are you using the cache backend?

The output is included below. At this time rclone has completed the copy and is now running checks. Running top on the system indicates rclone is using over 7GB of memory:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4819 pvs 20 0 9569268 7.291g 18484 S 13.3 9.3 8:30.12 rclone144

NOTE: I’m trying this with rclone 1.44

Fetching profile over HTTP from http://localhost:5572/debug/pprof/heap
Saved profile in /home/pvs/pprof/pprof.rclone144.alloc_objects.alloc_space.inuse_objects.inuse_space.155.pb.gz
File: rclone144
Type: inuse_space
Time: Nov 13, 2018 at 11:33am (UTC)
Showing nodes accounting for 3.35GB, 98.73% of 3.39GB total
Dropped 113 nodes (cum <= 0.02GB)
flat flat% sum% cum cum%
0.94GB 27.69% 27.69% 0.94GB 27.69% github.com/ncw/rclone/backend/swift.(*Fs).newObjectWithInfo
0.90GB 26.50% 54.19% 0.90GB 26.50% encoding/json.(*decodeState).literalStore
0.69GB 20.41% 74.60% 0.69GB 20.41% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/private/protocol/xml/xmlutil.XMLToStruct
0.58GB 17.02% 91.62% 0.58GB 17.02% github.com/ncw/rclone/backend/s3.(*Fs).newObjectWithInfo
0.24GB 7.10% 98.73% 0.24GB 7.10% github.com/ncw/rclone/fs/march.matchListings
0 0% 98.73% 0.90GB 26.50% encoding/json.(*Decoder).Decode
0 0% 98.73% 0.90GB 26.50% encoding/json.(*decodeState).array
0 0% 98.73% 0.90GB 26.50% encoding/json.(*decodeState).object
0 0% 98.73% 0.90GB 26.50% encoding/json.(*decodeState).unmarshal
0 0% 98.73% 0.90GB 26.50% encoding/json.(*decodeState).value
0 0% 98.73% 1.27GB 37.43% github.com/ncw/rclone/backend/s3.(*Fs).List
0 0% 98.73% 0.58GB 17.02% github.com/ncw/rclone/backend/s3.(*Fs).itemToDirEntry
0 0% 98.73% 1.27GB 37.43% github.com/ncw/rclone/backend/s3.(*Fs).list
0 0% 98.73% 0.69GB 20.41% github.com/ncw/rclone/backend/s3.(*Fs).list.func1
0 0% 98.73% 1.27GB 37.43% github.com/ncw/rclone/backend/s3.(*Fs).listDir
0 0% 98.73% 0.58GB 17.02% github.com/ncw/rclone/backend/s3.(*Fs).listDir.func1
0 0% 98.73% 0.02GB 0.55% github.com/ncw/rclone/backend/s3.(*Object).ModTime
0 0% 98.73% 0.02GB 0.55% github.com/ncw/rclone/backend/s3.(*Object).readMetaData
0 0% 98.73% 0.02GB 0.55% github.com/ncw/rclone/backend/s3.(*Object).readMetaData.func1
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/backend/swift.(*Fs).List
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/backend/swift.(*Fs).list
0 0% 98.73% 0.94GB 27.69% github.com/ncw/rclone/backend/swift.(*Fs).list.func1
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/backend/swift.(*Fs).listContainerRoot
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/backend/swift.(*Fs).listContainerRoot.func1
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/backend/swift.(*Fs).listDir
0 0% 98.73% 3.11GB 91.62% github.com/ncw/rclone/fs/list.DirSorted
0 0% 98.73% 0.24GB 7.13% github.com/ncw/rclone/fs/march.(*March).Run.func1
0 0% 98.73% 3.11GB 91.62% github.com/ncw/rclone/fs/march.(*March).makeListDir.func1
0 0% 98.73% 0.24GB 7.13% github.com/ncw/rclone/fs/march.(*March).processJob
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/fs/march.(*March).processJob.func1
0 0% 98.73% 1.27GB 37.43% github.com/ncw/rclone/fs/march.(*March).processJob.func2
0 0% 98.73% 0.02GB 0.72% github.com/ncw/rclone/fs/operations.Equal
0 0% 98.73% 0.02GB 0.72% github.com/ncw/rclone/fs/operations.NeedTransfer
0 0% 98.73% 0.02GB 0.72% github.com/ncw/rclone/fs/operations.equal
0 0% 98.73% 0.02GB 0.72% github.com/ncw/rclone/fs/sync.(*syncCopyMove).pairChecker
0 0% 98.73% 0.71GB 20.96% github.com/ncw/rclone/lib/pacer.(*Pacer).Call
0 0% 98.73% 0.71GB 20.96% github.com/ncw/rclone/lib/pacer.(*Pacer).call
0 0% 98.73% 0.71GB 20.90% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/aws/request.(*HandlerList).Run
0 0% 98.73% 0.71GB 20.90% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/aws/request.(*Request).Send
0 0% 98.73% 0.70GB 20.50% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/private/protocol/restxml.Unmarshal
0 0% 98.73% 0.69GB 20.41% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/private/protocol/xml/xmlutil.UnmarshalXML
0 0% 98.73% 0.02GB 0.55% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/service/s3.(*S3).HeadObject
0 0% 98.73% 0.69GB 20.41% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/service/s3.(*S3).ListObjects
0 0% 98.73% 0.90GB 26.50% github.com/ncw/rclone/vendor/github.com/ncw/swift.(*Connection).Objects
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/vendor/github.com/ncw/swift.(*Connection).ObjectsWalk
0 0% 98.73% 0.90GB 26.50% github.com/ncw/rclone/vendor/github.com/ncw/swift.readJson

As Animosity022 pointed out a few days ago, we're not using the cache backend.

According to the rclone log, at this point the copy of files is still going:

Fetching profile over HTTP from http://localhost:5572/debug/pprof/heap
Saved profile in /home/pvs/pprof/pprof.rclone144.alloc_objects.alloc_space.inuse_objects.inuse_space.121.pb.gz
File: rclone144
Type: inuse_space
Time: Nov 13, 2018 at 11:27am (UTC)
Showing nodes accounting for 3.23GB, 99.40% of 3.25GB total
Dropped 21 nodes (cum <= 0.02GB)
flat flat% sum% cum cum%
0.94GB 28.94% 28.94% 0.94GB 28.94% github.com/ncw/rclone/backend/swift.(*Fs).newObjectWithInfo
0.90GB 27.79% 56.72% 0.90GB 27.79% encoding/json.(*decodeState).literalStore
0.65GB 20.15% 76.87% 0.69GB 21.41% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/private/protocol/xml/xmlutil.XMLToStruct
0.51GB 15.57% 92.44% 0.51GB 15.57% github.com/ncw/rclone/backend/s3.(*Fs).newObjectWithInfo
0.10GB 3.04% 95.48% 0.60GB 18.61% github.com/ncw/rclone/backend/s3.(*Fs).listDir.func1
0.10GB 3.04% 98.53% 0.10GB 3.04% github.com/ncw/rclone/backend/swift.(*Fs).listDir.func1
0.02GB 0.48% 99.01% 0.02GB 0.56% encoding/xml.(*Decoder).rawToken
0.01GB 0.39% 99.40% 0.03GB 0.95% encoding/xml.(*Decoder).Token
0 0% 99.40% 0.90GB 27.79% encoding/json.(*Decoder).Decode
0 0% 99.40% 0.90GB 27.79% encoding/json.(*decodeState).array
0 0% 99.40% 0.90GB 27.79% encoding/json.(*decodeState).object
0 0% 99.40% 0.90GB 27.79% encoding/json.(*decodeState).unmarshal
0 0% 99.40% 0.90GB 27.79% encoding/json.(*decodeState).value
0 0% 99.40% 1.31GB 40.23% github.com/ncw/rclone/backend/s3.(*Fs).List
0 0% 99.40% 0.51GB 15.57% github.com/ncw/rclone/backend/s3.(*Fs).itemToDirEntry
0 0% 99.40% 1.31GB 40.23% github.com/ncw/rclone/backend/s3.(*Fs).list
0 0% 99.40% 0.70GB 21.62% github.com/ncw/rclone/backend/s3.(*Fs).list.func1
0 0% 99.40% 1.31GB 40.23% github.com/ncw/rclone/backend/s3.(*Fs).listDir
0 0% 99.40% 1.94GB 59.77% github.com/ncw/rclone/backend/swift.(*Fs).List
0 0% 99.40% 1.94GB 59.77% github.com/ncw/rclone/backend/swift.(*Fs).list
0 0% 99.40% 1.04GB 31.98% github.com/ncw/rclone/backend/swift.(*Fs).list.func1
0 0% 99.40% 1.94GB 59.77% github.com/ncw/rclone/backend/swift.(*Fs).listContainerRoot
0 0% 99.40% 1.94GB 59.77% github.com/ncw/rclone/backend/swift.(*Fs).listContainerRoot.func1
0 0% 99.40% 1.94GB 59.77% github.com/ncw/rclone/backend/swift.(*Fs).listDir
0 0% 99.40% 3.25GB 100% github.com/ncw/rclone/fs/list.DirSorted
0 0% 99.40% 3.25GB 100% github.com/ncw/rclone/fs/march.(*March).makeListDir.func1
0 0% 99.40% 1.94GB 59.77% github.com/ncw/rclone/fs/march.(*March).processJob.func1
0 0% 99.40% 1.31GB 40.23% github.com/ncw/rclone/fs/march.(*March).processJob.func2
0 0% 99.40% 0.70GB 21.62% github.com/ncw/rclone/lib/pacer.(*Pacer).Call
0 0% 99.40% 0.70GB 21.62% github.com/ncw/rclone/lib/pacer.(*Pacer).call
0 0% 99.40% 0.70GB 21.62% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/aws/request.(*HandlerList).Run
0 0% 99.40% 0.70GB 21.62% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/aws/request.(*Request).Send
0 0% 99.40% 0.70GB 21.62% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/private/protocol/restxml.Unmarshal
0 0% 99.40% 0.70GB 21.62% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/private/protocol/xml/xmlutil.UnmarshalXML
0 0% 99.40% 0.70GB 21.62% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/service/s3.(*S3).ListObjects
0 0% 99.40% 0.90GB 27.79% github.com/ncw/rclone/vendor/github.com/ncw/swift.(*Connection).Objects
0 0% 99.40% 1.94GB 59.77% github.com/ncw/rclone/vendor/github.com/ncw/swift.(*Connection).ObjectsWalk
0 0% 99.40% 0.90GB 27.79% github.com/ncw/rclone/vendor/github.com/ncw/swift.readJson

and now checks are being done:

Fetching profile over HTTP from http://localhost:5572/debug/pprof/heap
Saved profile in /home/pvs/pprof/pprof.rclone144.alloc_objects.alloc_space.inuse_objects.inuse_space.147.pb.gz
File: rclone144
Type: inuse_space
Time: Nov 13, 2018 at 11:32am (UTC)
Showing nodes accounting for 3.35GB, 98.73% of 3.39GB total
Dropped 113 nodes (cum <= 0.02GB)
flat flat% sum% cum cum%
0.94GB 27.69% 27.69% 0.94GB 27.69% github.com/ncw/rclone/backend/swift.(*Fs).newObjectWithInfo
0.90GB 26.50% 54.19% 0.90GB 26.50% encoding/json.(*decodeState).literalStore
0.69GB 20.41% 74.60% 0.69GB 20.41% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/private/protocol/xml/xmlutil.XMLToStruct
0.58GB 17.02% 91.62% 0.58GB 17.02% github.com/ncw/rclone/backend/s3.(*Fs).newObjectWithInfo
0.24GB 7.10% 98.73% 0.24GB 7.10% github.com/ncw/rclone/fs/march.matchListings
0 0% 98.73% 0.90GB 26.50% encoding/json.(*Decoder).Decode
0 0% 98.73% 0.90GB 26.50% encoding/json.(*decodeState).array
0 0% 98.73% 0.90GB 26.50% encoding/json.(*decodeState).object
0 0% 98.73% 0.90GB 26.50% encoding/json.(*decodeState).unmarshal
0 0% 98.73% 0.90GB 26.50% encoding/json.(*decodeState).value
0 0% 98.73% 1.27GB 37.43% github.com/ncw/rclone/backend/s3.(*Fs).List
0 0% 98.73% 0.58GB 17.02% github.com/ncw/rclone/backend/s3.(*Fs).itemToDirEntry
0 0% 98.73% 1.27GB 37.43% github.com/ncw/rclone/backend/s3.(*Fs).list
0 0% 98.73% 0.69GB 20.41% github.com/ncw/rclone/backend/s3.(*Fs).list.func1
0 0% 98.73% 1.27GB 37.43% github.com/ncw/rclone/backend/s3.(*Fs).listDir
0 0% 98.73% 0.58GB 17.02% github.com/ncw/rclone/backend/s3.(*Fs).listDir.func1
0 0% 98.73% 0.02GB 0.55% github.com/ncw/rclone/backend/s3.(*Object).ModTime
0 0% 98.73% 0.02GB 0.55% github.com/ncw/rclone/backend/s3.(*Object).readMetaData
0 0% 98.73% 0.02GB 0.55% github.com/ncw/rclone/backend/s3.(*Object).readMetaData.func1
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/backend/swift.(*Fs).List
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/backend/swift.(*Fs).list
0 0% 98.73% 0.94GB 27.69% github.com/ncw/rclone/backend/swift.(*Fs).list.func1
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/backend/swift.(*Fs).listContainerRoot
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/backend/swift.(*Fs).listContainerRoot.func1
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/backend/swift.(*Fs).listDir
0 0% 98.73% 3.11GB 91.62% github.com/ncw/rclone/fs/list.DirSorted
0 0% 98.73% 0.24GB 7.13% github.com/ncw/rclone/fs/march.(*March).Run.func1
0 0% 98.73% 3.11GB 91.62% github.com/ncw/rclone/fs/march.(*March).makeListDir.func1
0 0% 98.73% 0.24GB 7.13% github.com/ncw/rclone/fs/march.(*March).processJob
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/fs/march.(*March).processJob.func1
0 0% 98.73% 1.27GB 37.43% github.com/ncw/rclone/fs/march.(*March).processJob.func2
0 0% 98.73% 0.02GB 0.72% github.com/ncw/rclone/fs/operations.Equal
0 0% 98.73% 0.02GB 0.72% github.com/ncw/rclone/fs/operations.NeedTransfer
0 0% 98.73% 0.02GB 0.72% github.com/ncw/rclone/fs/operations.equal
0 0% 98.73% 0.02GB 0.72% github.com/ncw/rclone/fs/sync.(*syncCopyMove).pairChecker
0 0% 98.73% 0.71GB 20.96% github.com/ncw/rclone/lib/pacer.(*Pacer).Call
0 0% 98.73% 0.71GB 20.96% github.com/ncw/rclone/lib/pacer.(*Pacer).call
0 0% 98.73% 0.71GB 20.90% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/aws/request.(*HandlerList).Run
0 0% 98.73% 0.71GB 20.90% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/aws/request.(*Request).Send
0 0% 98.73% 0.70GB 20.50% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/private/protocol/restxml.Unmarshal
0 0% 98.73% 0.69GB 20.41% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/private/protocol/xml/xmlutil.UnmarshalXML
0 0% 98.73% 0.02GB 0.55% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/service/s3.(*S3).HeadObject
0 0% 98.73% 0.69GB 20.41% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/service/s3.(*S3).ListObjects
0 0% 98.73% 0.90GB 26.50% github.com/ncw/rclone/vendor/github.com/ncw/swift.(*Connection).Objects
0 0% 98.73% 1.84GB 54.19% github.com/ncw/rclone/vendor/github.com/ncw/swift.(*Connection).ObjectsWalk
0 0% 98.73% 0.90GB 26.50% github.com/ncw/rclone/vendor/github.com/ncw/swift.readJson

and at this point memory usage starts dropping:

Fetching profile over HTTP from http://localhost:5572/debug/pprof/heap
Saved profile in /home/pvs/pprof/pprof.rclone144.alloc_objects.alloc_space.inuse_objects.inuse_space.165.pb.gz
File: rclone144
Type: inuse_space
Time: Nov 13, 2018 at 11:35am (UTC)
Showing nodes accounting for 3.37GB, 98.98% of 3.41GB total
Dropped 89 nodes (cum <= 0.02GB)
flat flat% sum% cum cum%
0.94GB 27.56% 27.56% 0.94GB 27.56% github.com/ncw/rclone/backend/swift.(*Fs).newObjectWithInfo
0.90GB 26.35% 53.91% 0.90GB 26.35% encoding/json.(*decodeState).literalStore
0.69GB 20.26% 74.17% 0.69GB 20.26% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/private/protocol/xml/xmlutil.XMLToStruct
0.58GB 16.94% 91.10% 0.58GB 16.94% github.com/ncw/rclone/backend/s3.(*Fs).newObjectWithInfo
0.24GB 7.07% 98.17% 0.24GB 7.07% github.com/ncw/rclone/fs/march.matchListings
0.03GB 0.8% 98.98% 0.03GB 0.87% net/textproto.(*Reader).ReadMIMEHeader
0 0% 98.98% 0.90GB 26.35% encoding/json.(*Decoder).Decode
0 0% 98.98% 0.90GB 26.35% encoding/json.(*decodeState).array
0 0% 98.98% 0.90GB 26.35% encoding/json.(*decodeState).object
0 0% 98.98% 0.90GB 26.35% encoding/json.(*decodeState).unmarshal
0 0% 98.98% 0.90GB 26.35% encoding/json.(*decodeState).value
0 0% 98.98% 1.27GB 37.20% github.com/ncw/rclone/backend/s3.(*Fs).List
0 0% 98.98% 0.58GB 16.94% github.com/ncw/rclone/backend/s3.(*Fs).itemToDirEntry
0 0% 98.98% 1.27GB 37.20% github.com/ncw/rclone/backend/s3.(*Fs).list
0 0% 98.98% 0.69GB 20.26% github.com/ncw/rclone/backend/s3.(*Fs).list.func1
0 0% 98.98% 1.27GB 37.20% github.com/ncw/rclone/backend/s3.(*Fs).listDir
0 0% 98.98% 0.58GB 16.94% github.com/ncw/rclone/backend/s3.(*Fs).listDir.func1
0 0% 98.98% 1.84GB 53.91% github.com/ncw/rclone/backend/swift.(*Fs).List
0 0% 98.98% 1.84GB 53.91% github.com/ncw/rclone/backend/swift.(*Fs).list
0 0% 98.98% 0.94GB 27.56% github.com/ncw/rclone/backend/swift.(*Fs).list.func1
0 0% 98.98% 1.84GB 53.91% github.com/ncw/rclone/backend/swift.(*Fs).listContainerRoot
0 0% 98.98% 1.84GB 53.91% github.com/ncw/rclone/backend/swift.(*Fs).listContainerRoot.func1
0 0% 98.98% 1.84GB 53.91% github.com/ncw/rclone/backend/swift.(*Fs).listDir
0 0% 98.98% 3.10GB 91.10% github.com/ncw/rclone/fs/list.DirSorted
0 0% 98.98% 0.24GB 7.09% github.com/ncw/rclone/fs/march.(*March).Run.func1
0 0% 98.98% 3.10GB 91.10% github.com/ncw/rclone/fs/march.(*March).makeListDir.func1
0 0% 98.98% 0.24GB 7.09% github.com/ncw/rclone/fs/march.(*March).processJob
0 0% 98.98% 1.84GB 53.91% github.com/ncw/rclone/fs/march.(*March).processJob.func1
0 0% 98.98% 1.27GB 37.20% github.com/ncw/rclone/fs/march.(*March).processJob.func2
0 0% 98.98% 0.03GB 0.87% github.com/ncw/rclone/fs/operations.Equal
0 0% 98.98% 0.03GB 0.87% github.com/ncw/rclone/fs/operations.NeedTransfer
0 0% 98.98% 0.03GB 0.87% github.com/ncw/rclone/fs/operations.equal
0 0% 98.98% 0.03GB 0.87% github.com/ncw/rclone/fs/sync.(*syncCopyMove).pairChecker
0 0% 98.98% 0.71GB 20.72% github.com/ncw/rclone/lib/pacer.(*Pacer).Call
0 0% 98.98% 0.71GB 20.72% github.com/ncw/rclone/lib/pacer.(*Pacer).call
0 0% 98.98% 0.70GB 20.66% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/aws/request.(*HandlerList).Run
0 0% 98.98% 0.70GB 20.66% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/aws/request.(*Request).Send
0 0% 98.98% 0.69GB 20.30% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/private/protocol/restxml.Unmarshal
0 0% 98.98% 0.69GB 20.26% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/private/protocol/xml/xmlutil.UnmarshalXML
0 0% 98.98% 0.69GB 20.26% github.com/ncw/rclone/vendor/github.com/aws/aws-sdk-go/service/s3.(*S3).ListObjects
0 0% 98.98% 0.90GB 26.35% github.com/ncw/rclone/vendor/github.com/ncw/swift.(*Connection).Objects
0 0% 98.98% 1.84GB 53.91% github.com/ncw/rclone/vendor/github.com/ncw/swift.(*Connection).ObjectsWalk
0 0% 98.98% 0.90GB 26.35% github.com/ncw/rclone/vendor/github.com/ncw/swift.readJson
0 0% 98.98% 0.03GB 0.87% net/http.(*persistConn).readLoop
0 0% 98.98% 0.03GB 0.87% net/http.(*persistConn).readResponse
0 0% 98.98% 0.03GB 0.87% net/http.ReadResponse

According to the memory dumps, the in-use memory is a mixture of s3 objects and swift objects.

This can have a few different causes…

  1. You have a very large directory. Rclone needs to load all the objects in a directory into memory in order to sync it, so if you had a directory (or several) with 100,000 files in it (for example), that could explain it. I suspect this might be the case because of the very long pause at the start.

  2. Lots of objects are changing and rclone needs to delete objects at the destination. Rclone's default sync mode is --delete-after, which means it has to buffer the objects to be deleted until after the sync. You can fix this with --delete-during.

Can you see if setting --checkers 1 reduces the memory usage? This will slow the sync down, but will mean that it only has 1 directory in memory rather than 8 (the default).
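
For example, a sketch based on the command you posted earlier, with the cache flags dropped and the checkers dialled down:

/usr/sbin/rclone -v copy swift:swift_bucket_target s3-swift-backup:s3_bucket_target --checkers 1 --transfers=2 --buffer-size=0 --log-file=/home/oscar/rclone.log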

Setting checkers to 1 didn’t change the behavior. Some of our servers have hundreds of thousands of files in one directory (some even more) and we suspected rclone was loading the objects to sync. Unfortunately we can’t change the behavior of having everything in one directory.

I suspect that the time and the memory are probably caused by that one directory. Rclone will have to load s3 objects and swift objects for the source and destination directory.

Rclone needs a sub directory listing primitive to break the sync into smaller pieces…

If you try this with the latest beta you should find it doesn’t use so much memory due to a recent improvement.

Unfortunately that didn't help and memory usage was just as high. We used:

rclone v1.44-084-g3975d82b-beta
- os/arch: linux/amd64
- go version: go1.11.2

I guess it's the number of files we have in this directory, which, after taking a closer look, is over 6,000,000. One thing that still confuses me (if it's off topic we can skip it) is that in this particular case we transferred 315 files for a total of 47MB:

Transferred:       47.327M / 47.327 MBytes, 100%, 36.703 kBytes/s, ETA 0s
Errors:                 0
Checks:             11301 / 21310, 53%
Transferred:          315 / 315, 100%
Elapsed time:     22m0.3s

but it seems the final checks covered ALL the files, not just the ones that were transferred:

2018/11/15 01:20:09 INFO  : S3 bucket s3_bucket_target path xyz/: Waiting for transfers to finish
2018/11/15 01:20:09 INFO  : 
Transferred:       47.327M / 47.327 MBytes, 100%, 1.091 kBytes/s, ETA 0s
Errors:                 0
Checks:           6491886 / 6491886, 100%
Transferred:          315 / 315, 100%
Elapsed time:  12h20m8.5s

That should definitely have worked… I just tried it now on a directory with 500,000 files and with one file in files.txt.

Before, rclone used 970MB of RAM; with the latest beta it used 21MB of RAM.

How many files do you have in the file you are passing to --files-from, and what were the parameters you used with rclone?
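
For reference, my test was roughly of this shape (the remote and directory names here are made up):

echo "path/to/one-file.txt" > files.txt
rclone -v copy --files-from files.txt remote:dir-with-500k-files dest-remote:dir-with-500k-files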

That is going to take a lot of memory. The way rclone is written, the sync/copy/move routines load an entire directory into memory at a time.

The first set of stats shows that rclone has transferred 315 files and has so far checked 11,301 files to see if they need to be transferred.

The second set says that 315 files were transferred and rclone checked 6,491,886 files to make sure they didn't need to be transferred.
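
As a rough sanity check against the heap profiles above: roughly 1.8GB is held in Swift listing objects for about 6.5 million source files (1.8GB / 6.5 million is around 280 bytes per object), plus about 1.3GB for the corresponding S3 objects, which is consistent with rclone keeping an in-memory entry for every object in that one directory.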


I re-ran the process by first building the file list and then running the copy using the file list. Building the file list used at most 600MB of memory and completed in less than 5 minutes. The copy was run with:

/usr/sbin/rclonebeta -v copy --files-from /home/pvs/jnj-file-list swift:swift_bucket_target s3-swift-backup:s3_bucket_target --ignore-checksum --cache-workers=8 --transfers=2 --cache-tmp-wait-time=10m --buffer-size=0 --s3-upload-concurrency=1 --swift-chunk-size=100m --s3-chunk-size=5m --log-file=/home/pvs/automation/log/rclone.log

and it’s been running for more than 24 hours with this in the log:

2018/11/21 11:50:13 INFO :
Transferred: 0 / 0 Bytes, -, 0 Bytes/s, ETA -
Errors: 0
Checks: 0 / 0, -
Transferred: 0 / 0, -
Elapsed time: 25h58m12.1s

2018/11/21 11:51:13 INFO :
Transferred: 0 / 0 Bytes, -, 0 Bytes/s, ETA -
Errors: 0
Checks: 0 / 0, -
Transferred: 0 / 0, -
Elapsed time: 25h59m12.1s

2018/11/21 11:52:13 INFO :
Transferred: 0 / 0 Bytes, -, 0 Bytes/s, ETA -
Errors: 0
Checks: 0 / 0, -
Transferred: 0 / 0, -
Elapsed time: 26h0m12.1s
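
For context, a file list like the one used above can be produced with something along the lines of rclone lsf (this is only a sketch, not the exact command we used):

rclone lsf -R swift:swift_bucket_target > /home/pvs/jnj-file-list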

Ultimately I think we can’t get past having more than 6 million files in one directory and that’s a self-imposed problem.

Not sure how practical this is, but normally if I had that type of issue (cloud or local), I'd create the backups locally on the system by tarring things up, making the backup there, and copying it up to something else.

Cloud is an especially bad use case for copying a 6-million-file folder up, as that's normally a problem even locally in my experience.
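
Something along these lines, with placeholder paths:

# bundle the huge directory into a single archive locally (placeholder paths)
tar -czf /backup/xyz-files.tar.gz /data/xyz
# then upload the one archive instead of millions of individual objects
rclone -v copy /backup/xyz-files.tar.gz s3-swift-backup:s3_bucket_target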
