Possible memory leak for Azure blob storage

What is the problem you are having with rclone?

rclone constantly increasing memory consumption over time when uploading millions files to the Azure blob storage.
Statistics about uploaded data: folders~36715; num files=10215623; size=4.47 TB

Run the command 'rclone version' and share the full output of the command.

rclone v1.60.0
- os/version: centos 8 (64 bit)
- os/kernel: 4.18.0-301.1.el8.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.19.2
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Cloud storage - Azure blob

The command you were trying to run (eg rclone copy /tmp remote:tmp)

Folder is mounted using fstab fuse

rclonefs#azure2:vna /mnt/azure  fuse _netdev,config=/root/.config/rclone/rclone.conf,rc,allow-other,default-permissions,rw,max-read-ahead=4M,vfs-cache-mode=writes,vfs-cache-max-size=2G,azureblob-memory-pool-flush-time=10m0s,buffer-size=0,azureblob-memory-pool-use-mmap,use-mmap,dir-cache-time=2h,timeout=1h,cache-dir=/var/dcmsys/cache,uid=dcmsys,gid=dcmsys,cache-chunk-no-memory,size-only,allow-non-empty   0 0

The rclone config contents with secrets removed.

[azure2]
type = azureblob
account = vnatest3
key = secretkey==

Result of memory checking after 2 weeks of uploading:

Saved profile in /root/pprof/pprof.rclone.alloc_objects.alloc_space.inuse_objects.inuse_space.006.pb.gz
File: rclone
Type: inuse_space
Time: Dec 2, 2022 at 10:10am (PST)
Showing nodes accounting for 15060.62MB, 99.01% of 15211.03MB total
Dropped 122 nodes (cum <= 76.06MB)
      flat  flat%   sum%        cum   cum%
 3279.79MB 21.56% 21.56%  3441.29MB 22.62%  github.com/Azure/azure-storage-blob-go/azblob.BlobGetPropertiesResponse.NewMetadata
 2476.10MB 16.28% 37.84%  2476.10MB 16.28%  github.com/rclone/rclone/vfs/vfscache.newItem
 1916.35MB 12.60% 50.44%  1916.35MB 12.60%  github.com/rclone/rclone/vfs.newFile
 1297.24MB  8.53% 58.97%  1297.24MB  8.53%  github.com/rclone/rclone/vfs.(*Dir).addObject
 1250.65MB  8.22% 67.19%  1250.65MB  8.22%  path.Join
 1132.62MB  7.45% 74.64%  4815.47MB 31.66%  github.com/rclone/rclone/backend/azureblob.(*Fs).Put
  848.55MB  5.58% 80.21%   861.65MB  5.66%  bazil.org/fuse.(*Conn).ReadRequest
  842.03MB  5.54% 85.75%   842.03MB  5.54%  net/textproto.(*Reader).ReadMIMEHeader
  636.54MB  4.18% 89.93%  2292.81MB 15.07%  github.com/rclone/rclone/vfs.newRWFileHandle
  360.58MB  2.37% 92.30%   379.08MB  2.49%  github.com/Azure/azure-storage-blob-go/azblob.(*Metadata).UnmarshalXML
  262.51MB  1.73% 94.03%   262.51MB  1.73%  encoding/base64.(*Encoding).EncodeToString
     180MB  1.18% 95.21%      180MB  1.18%  strings.(*Builder).grow (inline)
     152MB     1% 96.21%      152MB     1%  github.com/rclone/rclone/vfs.(*File).addWriter
     151MB  0.99% 97.21%  7914.50MB 52.03%  github.com/rclone/rclone/cmd/mount.(*Dir).Create
  131.01MB  0.86% 98.07%   131.01MB  0.86%  encoding/xml.copyValue
  101.01MB  0.66% 98.73%   124.01MB  0.82%  github.com/rclone/rclone/backend/azureblob.(*Fs).newObjectWithInfo
   39.63MB  0.26% 98.99%  2515.73MB 16.54%  github.com/rclone/rclone/vfs/vfscache.(*Cache)._get
       3MB  0.02% 99.01%   527.09MB  3.47%  encoding/xml.(*Decoder).unmarshal
         0     0% 99.01%   861.65MB  5.66%  bazil.org/fuse/fs.(*Server).Serve
         0     0% 99.01%  8141.77MB 53.53%  bazil.org/fuse/fs.(*Server).Serve.func1
         0     0% 99.01%  8141.77MB 53.53%  bazil.org/fuse/fs.(*Server).handleRequest
         0     0% 99.01%  8141.77MB 53.53%  bazil.org/fuse/fs.(*Server).serve
         0     0% 99.01%   527.09MB  3.47%  encoding/xml.(*Decoder).Decode (inline)
         0     0% 99.01%   527.09MB  3.47%  encoding/xml.(*Decoder).DecodeElement
         0     0% 99.01%   413.58MB  2.72%  encoding/xml.(*Decoder).unmarshalInterface
         0     0% 99.01%   527.09MB  3.47%  encoding/xml.(*Decoder).unmarshalPath
         0     0% 99.01%   527.09MB  3.47%  encoding/xml.Unmarshal
         0     0% 99.01%   545.14MB  3.58%  github.com/Azure/azure-pipeline-go/pipeline.(*pipeline).Do
         0     0% 99.01%   545.14MB  3.58%  github.com/Azure/azure-pipeline-go/pipeline.PolicyFunc.Do
         0     0% 99.01%   535.59MB  3.52%  github.com/Azure/azure-storage-blob-go/azblob.(*SharedKeyCredential).New.func1
         0     0% 99.01%   526.59MB  3.46%  github.com/Azure/azure-storage-blob-go/azblob.ContainerURL.ListBlobsHierarchySegment
         0     0% 99.01%   545.14MB  3.58%  github.com/Azure/azure-storage-blob-go/azblob.NewRetryPolicyFactory.func1.1
         0     0% 99.01%   545.14MB  3.58%  github.com/Azure/azure-storage-blob-go/azblob.NewTelemetryPolicyFactory.func1.1
         0     0% 99.01%   545.14MB  3.58%  github.com/Azure/azure-storage-blob-go/azblob.NewUniqueRequestIDPolicyFactory.func1.1
         0     0% 99.01%   526.59MB  3.46%  github.com/Azure/azure-storage-blob-go/azblob.containerClient.ListBlobHierarchySegment
         0     0% 99.01%   527.09MB  3.47%  github.com/Azure/azure-storage-blob-go/azblob.containerClient.listBlobHierarchySegmentResponder
         0     0% 99.01%   535.59MB  3.52%  github.com/Azure/azure-storage-blob-go/azblob.responderPolicy.Do
         0     0% 99.01%   251.03MB  1.65%  github.com/rclone/rclone/backend/azureblob.(*Fs).List
         0     0% 99.01%   124.01MB  0.82%  github.com/rclone/rclone/backend/azureblob.(*Fs).itemToDirEntry
         0     0% 99.01%   611.60MB  4.02%  github.com/rclone/rclone/backend/azureblob.(*Fs).list
         0     0% 99.01%   526.59MB  3.46%  github.com/rclone/rclone/backend/azureblob.(*Fs).list.func1
         0     0% 99.01%   293.03MB  1.93%  github.com/rclone/rclone/backend/azureblob.(*Fs).listDir
         0     0% 99.01%   124.01MB  0.82%  github.com/rclone/rclone/backend/azureblob.(*Fs).listDir.func1
         0     0% 99.01%  3683.85MB 24.22%  github.com/rclone/rclone/backend/azureblob.(*Object).Update
         0     0% 99.01%  3680.80MB 24.20%  github.com/rclone/rclone/backend/azureblob.(*Object).decodeMetaDataFromPropertiesResponse
         0     0% 99.01%  3682.30MB 24.21%  github.com/rclone/rclone/backend/azureblob.(*Object).readMetaData
         0     0% 99.01%   124.01MB  0.82%  github.com/rclone/rclone/cmd/mount.(*Dir).ReadDirAll
         0     0% 99.01%   861.65MB  5.66%  github.com/rclone/rclone/cmd/mount.mount.func2
         0     0% 99.01%   512.14MB  3.37%  github.com/rclone/rclone/fs.pacerInvoker
         0     0% 99.01%   137.51MB   0.9%  github.com/rclone/rclone/fs/list.DirSorted
         0     0% 99.01%  4815.47MB 31.66%  github.com/rclone/rclone/fs/operations.Copy
         0     0% 99.01%   489.09MB  3.22%  github.com/rclone/rclone/lib/pacer.(*Pacer).Call
         0     0% 99.01%   509.14MB  3.35%  github.com/rclone/rclone/lib/pacer.(*Pacer).call
         0     0% 99.01%  1916.35MB 12.60%  github.com/rclone/rclone/vfs.(*Dir).Create
         0     0% 99.01%   124.01MB  0.82%  github.com/rclone/rclone/vfs.(*Dir).ReadDirAll
         0     0% 99.01%   131.01MB  0.86%  github.com/rclone/rclone/vfs.(*Dir)._readDir
         0     0% 99.01%  5847.15MB 38.44%  github.com/rclone/rclone/vfs.(*File).Open
         0     0% 99.01%  1248.15MB  8.21%  github.com/rclone/rclone/vfs.(*File).Path
         0     0% 99.01%  2292.81MB 15.07%  github.com/rclone/rclone/vfs.(*File).openRW
         0     0% 99.01%  1294.72MB  8.51%  github.com/rclone/rclone/vfs.(*RWFileHandle).Truncate
         0     0% 99.01%  1294.72MB  8.51%  github.com/rclone/rclone/vfs.(*RWFileHandle).openPending
         0     0% 99.01%  2370.20MB 15.58%  github.com/rclone/rclone/vfs/vfscache.(*Cache).Exists
         0     0% 99.01%   145.54MB  0.96%  github.com/rclone/rclone/vfs/vfscache.(*Cache).Item (inline)
         0     0% 99.01%  2515.73MB 16.54%  github.com/rclone/rclone/vfs/vfscache.(*Cache).get
         0     0% 99.01%  4815.97MB 31.66%  github.com/rclone/rclone/vfs/vfscache.(*Item).Close.func2
         0     0% 99.01%  4815.97MB 31.66%  github.com/rclone/rclone/vfs/vfscache.(*Item)._store
         0     0% 99.01%  4815.97MB 31.66%  github.com/rclone/rclone/vfs/vfscache.(*Item).store
         0     0% 99.01%  4815.97MB 31.66%  github.com/rclone/rclone/vfs/vfscache/writeback.(*WriteBack).upload
         0     0% 99.01%   842.03MB  5.54%  net/http.(*persistConn).readLoop
         0     0% 99.01%   842.03MB  5.54%  net/http.(*persistConn).readResponse
         0     0% 99.01%   842.03MB  5.54%  net/http.ReadResponse
         0     0% 99.01%      180MB  1.18%  strings.(*Builder).Grow (inline)
         0     0% 99.01%      180MB  1.18%  strings.ToLower

Binary file is here: Data package from December 2nd. - FileTransfer.io

Rclone consumed ~19Gb RAM at this point

For comparison, on GCP the same workload but rclone ate ~2GB RAM.

I see you are using a mount to upload things to azure.

This will use more memory than using rclone copy - I expect you have your reasons, but I thought I'd mention it!

Your trace shows that rclone's directory cache is caching the azure objects, I think.

Try running vfs/forget on the directory cache to clear it and release the memory.

This is a known problem (the directory cache growing without limit) and we are working on a solution right now!

I don't think it is azure specific.

ncw,
Actually, this stats data was extracted after "usr/bin/rclone rc vfs/forget" command executed. You can see the stats before I executed vfs/forget command below:

Saved profile in /root/pprof/pprof.rclone.alloc_objects.alloc_space.inuse_objects.inuse_space.006.pb.gz
File: rclone
Type: inuse_space
Time: Dec 2, 2022 at 9:57am (PST)
Showing nodes accounting for 15260.93MB, 98.43% of 15504.21MB total
Dropped 178 nodes (cum <= 77.52MB)
      flat  flat%   sum%        cum   cum%
 3278.79MB 21.15% 21.15%  3440.29MB 22.19%  github.com/Azure/azure-storage-blob-go/azblob.BlobGetPropertiesResponse.NewMetadata
 2475.10MB 15.96% 37.11%  2475.10MB 15.96%  github.com/rclone/rclone/vfs/vfscache.newItem
 1915.35MB 12.35% 49.47%  1915.35MB 12.35%  github.com/rclone/rclone/vfs.newFile
 1348.16MB  8.70% 58.16%  1348.16MB  8.70%  path.Join
 1301.43MB  8.39% 66.56%  1301.43MB  8.39%  github.com/rclone/rclone/vfs.(*Dir).addObject
 1132.62MB  7.31% 73.86%  4813.97MB 31.05%  github.com/rclone/rclone/backend/azureblob.(*Fs).Put
  851.05MB  5.49% 79.35%   851.62MB  5.49%  bazil.org/fuse.(*Conn).ReadRequest
  841.53MB  5.43% 84.78%   841.53MB  5.43%  net/textproto.(*Reader).ReadMIMEHeader
  636.04MB  4.10% 88.88%  2296.50MB 14.81%  github.com/rclone/rclone/vfs.newRWFileHandle
  334.07MB  2.15% 91.03%   354.07MB  2.28%  github.com/Azure/azure-storage-blob-go/azblob.(*Metadata).UnmarshalXML
  259.51MB  1.67% 92.71%   259.51MB  1.67%  encoding/base64.(*Encoding).EncodeToString
  178.50MB  1.15% 93.86%   178.50MB  1.15%  strings.(*Builder).grow (inline)
     152MB  0.98% 94.84%      152MB  0.98%  github.com/rclone/rclone/vfs.(*File).addWriter
     151MB  0.97% 95.81%  7915.69MB 51.06%  github.com/rclone/rclone/cmd/mount.(*Dir).Create
  120.01MB  0.77% 96.59%   120.01MB  0.77%  encoding/xml.copyValue
   92.51MB   0.6% 97.18%   112.51MB  0.73%  github.com/rclone/rclone/backend/azureblob.(*Fs).newObjectWithInfo
   80.06MB  0.52% 97.70%    80.06MB  0.52%  io.ReadAll
   39.63MB  0.26% 97.96%  2514.73MB 16.22%  github.com/rclone/rclone/vfs/vfscache.(*Cache)._get
      36MB  0.23% 98.19%    97.50MB  0.63%  encoding/xml.(*Decoder).Token
   27.50MB  0.18% 98.37%   633.72MB  4.09%  encoding/xml.(*Decoder).unmarshal
    4.50MB 0.029% 98.40%  8340.85MB 53.80%  bazil.org/fuse/fs.(*Server).handleRequest
    1.54MB 0.0099% 98.40%   283.62MB  1.83%  github.com/rclone/rclone/cmd/mount.(*Dir).ReadDirAll
    1.51MB 0.0098% 98.41%   114.03MB  0.74%  github.com/rclone/rclone/backend/azureblob.(*Fs).listDir.func1
       1MB 0.0064% 98.42%  8341.85MB 53.80%  bazil.org/fuse/fs.(*Server).serve
    0.50MB 0.0032% 98.42%   286.59MB  1.85%  github.com/rclone/rclone/vfs.(*Dir).ReadDirAll
    0.50MB 0.0032% 98.43%   852.12MB  5.50%  bazil.org/fuse/fs.(*Server).Serve
    0.50MB 0.0032% 98.43%  3687.80MB 23.79%  github.com/rclone/rclone/backend/azureblob.(*Object).readMetaData
         0     0% 98.43%  8341.85MB 53.80%  bazil.org/fuse/fs.(*Server).Serve.func1
         0     0% 98.43%   633.72MB  4.09%  encoding/xml.(*Decoder).Decode (inline)
         0     0% 98.43%   633.72MB  4.09%  encoding/xml.(*Decoder).DecodeElement
         0     0% 98.43%   478.58MB  3.09%  encoding/xml.(*Decoder).unmarshalInterface
         0     0% 98.43%   633.72MB  4.09%  encoding/xml.(*Decoder).unmarshalPath
         0     0% 98.43%   633.72MB  4.09%  encoding/xml.Unmarshal
         0     0% 98.43%   725.83MB  4.68%  github.com/Azure/azure-pipeline-go/pipeline.(*pipeline).Do
         0     0% 98.43%   728.83MB  4.70%  github.com/Azure/azure-pipeline-go/pipeline.PolicyFunc.Do
         0     0% 98.43%   124.50MB   0.8%  github.com/Azure/azure-storage-blob-go/azblob.(*BlobPropertiesInternal).UnmarshalXML
         0     0% 98.43%   720.78MB  4.65%  github.com/Azure/azure-storage-blob-go/azblob.(*SharedKeyCredential).New.func1
         0     0% 98.43%   708.28MB  4.57%  github.com/Azure/azure-storage-blob-go/azblob.ContainerURL.ListBlobsHierarchySegment
         0     0% 98.43%   728.83MB  4.70%  github.com/Azure/azure-storage-blob-go/azblob.NewRetryPolicyFactory.func1.1
         0     0% 98.43%   728.83MB  4.70%  github.com/Azure/azure-storage-blob-go/azblob.NewTelemetryPolicyFactory.func1.1
         0     0% 98.43%   728.83MB  4.70%  github.com/Azure/azure-storage-blob-go/azblob.NewUniqueRequestIDPolicyFactory.func1.1
         0     0% 98.43%   709.78MB  4.58%  github.com/Azure/azure-storage-blob-go/azblob.containerClient.ListBlobHierarchySegment
         0     0% 98.43%   713.78MB  4.60%  github.com/Azure/azure-storage-blob-go/azblob.containerClient.listBlobHierarchySegmentResponder
         0     0% 98.43%   720.78MB  4.65%  github.com/Azure/azure-storage-blob-go/azblob.responderPolicy.Do
         0     0% 98.43%   365.73MB  2.36%  github.com/rclone/rclone/backend/azureblob.(*Fs).List
         0     0% 98.43%   112.51MB  0.73%  github.com/rclone/rclone/backend/azureblob.(*Fs).itemToDirEntry
         0     0% 98.43%   703.30MB  4.54%  github.com/rclone/rclone/backend/azureblob.(*Fs).list
         0     0% 98.43%   692.78MB  4.47%  github.com/rclone/rclone/backend/azureblob.(*Fs).list.func1
         0     0% 98.43%   406.73MB  2.62%  github.com/rclone/rclone/backend/azureblob.(*Fs).listDir
         0     0% 98.43%  3681.35MB 23.74%  github.com/rclone/rclone/backend/azureblob.(*Object).Update
         0     0% 98.43%  3679.80MB 23.73%  github.com/rclone/rclone/backend/azureblob.(*Object).decodeMetaDataFromPropertiesResponse
         0     0% 98.43%   852.12MB  5.50%  github.com/rclone/rclone/cmd/mount.mount.func2
         0     0% 98.43%   665.33MB  4.29%  github.com/rclone/rclone/fs.pacerInvoker
         0     0% 98.43%   228.59MB  1.47%  github.com/rclone/rclone/fs/list.DirSorted
         0     0% 98.43%  4813.97MB 31.05%  github.com/rclone/rclone/fs/operations.Copy
         0     0% 98.43%   602.28MB  3.88%  github.com/rclone/rclone/lib/pacer.(*Pacer).Call
         0     0% 98.43%   638.33MB  4.12%  github.com/rclone/rclone/lib/pacer.(*Pacer).call
         0     0% 98.43%  1915.35MB 12.35%  github.com/rclone/rclone/vfs.(*Dir).Create
         0     0% 98.43%   219.61MB  1.42%  github.com/rclone/rclone/vfs.(*Dir)._readDir
         0     0% 98.43%  5849.34MB 37.73%  github.com/rclone/rclone/vfs.(*File).Open
         0     0% 98.43%  1328.66MB  8.57%  github.com/rclone/rclone/vfs.(*File).Path
         0     0% 98.43%  2296.50MB 14.81%  github.com/rclone/rclone/vfs.(*File).openRW
         0     0% 98.43%  1298.91MB  8.38%  github.com/rclone/rclone/vfs.(*RWFileHandle).Truncate
         0     0% 98.43%  1298.91MB  8.38%  github.com/rclone/rclone/vfs.(*RWFileHandle).openPending
         0     0% 98.43%    78.51MB  0.51%  github.com/rclone/rclone/vfs.Nodes.Less
         0     0% 98.43%  2369.20MB 15.28%  github.com/rclone/rclone/vfs/vfscache.(*Cache).Exists
         0     0% 98.43%   145.54MB  0.94%  github.com/rclone/rclone/vfs/vfscache.(*Cache).Item (inline)
         0     0% 98.43%  2514.73MB 16.22%  github.com/rclone/rclone/vfs/vfscache.(*Cache).get
         0     0% 98.43%  4813.97MB 31.05%  github.com/rclone/rclone/vfs/vfscache.(*Item).Close.func2
         0     0% 98.43%  4813.97MB 31.05%  github.com/rclone/rclone/vfs/vfscache.(*Item)._store
         0     0% 98.43%  4813.97MB 31.05%  github.com/rclone/rclone/vfs/vfscache.(*Item).store
         0     0% 98.43%  4813.97MB 31.05%  github.com/rclone/rclone/vfs/vfscache/writeback.(*WriteBack).upload
         0     0% 98.43%    80.06MB  0.52%  io/ioutil.ReadAll (inline)
         0     0% 98.43%   841.53MB  5.43%  net/http.(*persistConn).readLoop
         0     0% 98.43%   841.53MB  5.43%  net/http.(*persistConn).readResponse
         0     0% 98.43%   841.53MB  5.43%  net/http.ReadResponse
         0     0% 98.43%    78.51MB  0.51%  sort.Sort
         0     0% 98.43%    78.51MB  0.51%  sort.pdqsort
         0     0% 98.43%   178.50MB  1.15%  strings.(*Builder).Grow (inline)
         0     0% 98.43%   178.50MB  1.15%  strings.ToLower

Today I've started another session for comparison of RAM consumption on both GCP and Azure clouds (on recent rclone version) and will send you an updated metrics on Friday.
Best regards,
Raymond

You appear to have about 10,000,000 vfs entries. Does this seem plausible? Looking at your original post you do indeed have about 10,000,000 files.

Each of these is holding an fs.Object of maybe 1.5k in size leading to the 15GB memory usage we see.

image

So why does the memory usage remain after you run vfs/forget?

I know that bazil/fuse caches directory entries sometimes. Suspiciously it is holding on to 848M of RAM composed of 64Byte objects which is about 13M so about the right size to be hanging on to those vfs nodes.

image

Would it be easy for you to compile rclone yourself?

If you compile it with

go build -tags cmount

You'll get a version of rclone which has a new mount command rclone cmount. This interfaces directly with libfuse and doesn't do the caching. It is 100% compatible with rclone mount.

Can you try that?

If you can't build it yourself then I'll make a centos 8 VM and build it there for you.

PS You could also try --attr-timeout 0 - this isn't recommended in the docs though but it may work now.

In theory 0s should be the correct value for filesystems which can
change outside the control of the kernel. However this causes quite a
few problems such as
rclone using too much memory,
rclone not serving files to samba
and excessive time listing directories.

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.