VFS cache stale files in sftp mode

What is the problem you are having with rclone?

It appears that once the debug log shows 'cancelling writeback', the VFS cache stops reporting its per-minute 'cleaned' line, directory listings become stale, and uploads still start but get 'stuck' at 100%.

What is your rclone version (output from rclone version)

v1.55.1

Which OS you are using and how many bits (eg Windows 7, 64 bit)

The rclone/rclone:latest container on Ubuntu, 64 bit

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

Container runs with these flags:

rclone --config /config/rclone.conf serve sftp --authorized-keys=/config/authorized_keys --dir-cache-time=1000h --buffer-size=128M --addr=:2022 --log-level=DEBUG --log-file=/log/rclone.log --user-agent=mycustom1 --cache-dir=/data/cache --vfs-cache-mode=full --vfs-read-ahead=256M --vfs-cache-max-age=300h --vfs-cache-max-size=5G gdrive2-data:.users/brick

In normal operation the DEBUG log gets updated every minute with a 'vfs cache: cleaned:' line, so I can pinpoint roughly when the cache "stops working":

# grep uploading rclone.log | tail -5
2021/05/21 00:16:50 INFO  : vfs cache: cleaned: objects 144 (was 144) in use 17, to upload 6, uploading 2, total size 4.627G (was 4.627G)
2021/05/21 00:17:50 INFO  : vfs cache: cleaned: objects 191 (was 191) in use 11, to upload 2, uploading 0, total size 4.760G (was 4.760G)
2021/05/21 00:18:50 INFO  : vfs cache: cleaned: objects 241 (was 241) in use 17, to upload 4, uploading 3, total size 4.893G (was 4.893G)
2021/05/21 00:19:50 INFO  : vfs cache: cleaned: objects 247 (was 292) in use 17, to upload 8, uploading 0, total size 4.998G (was 5.026G)
2021/05/21 00:20:50 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fd71d6f01e399672770dca83aeed935e.png: vfs cache: cancelling writeback (uploading true) 0xc00564cf50 item 19325

It appears that the per-minute 'vfs cache: cleaned:' reporting stops once 'vfs cache: cancelling writeback' occurs.

Whilst in this state, on the client side (using an sftp client) some directories list and others do not:

$ sftp -P2022 server
Connected to server
sftp> ls
backedup-storage    backups             backup2             
sftp> ls -al
drwxr-xr-x    0 root     root            0 May 18 20:30 backedup-storage
drwxr-xr-x    0 root     root            0 May 18 20:35 backups
drwxr-xr-x    0 root     root            0 May 18 20:31 backups2
sftp> cd backups
sftp> ls -al
**times out**

Once I restart rclone (the container in my case), all the content not yet uploaded from the VFS cache gets uploaded and everything returns to normal, until 'vfs cache: cancelling writeback' appears in the logs again.

The rclone config contents with secrets removed.

[gdrive2]
type = drive
client_id = clientid_removed
client_secret = client_secret_removed
token = token_removed
root_folder_id = 0AEhS0wxLefMzk8PVA

[gdrive2-data]
type = crypt
remote = gdrive2:data
filename_encryption = standard
password = password_removed
password2 = password_removed

Whilst in the error state:

  • The sftp client can upload other files, however they remain stuck at '100%'; the logs then say:
2021/05/21 07:51:48 DEBUG : backedup-storage/Videos/VHS/Las_Loop_Modeparade.mkv(0xc003dcb780): writeAt: size=24114, off=10707628032
2021/05/21 07:51:48 DEBUG : backedup-storage/Videos/VHS/Las_Loop_Modeparade.mkv(0xc003dcb780): >_writeAt: n=24114, err=<nil>
2021/05/21 07:51:48 DEBUG : backedup-storage/Videos/VHS/Las_Loop_Modeparade.mkv(0xc003dcb780): close:
2021/05/21 07:51:48 DEBUG : backedup-storage/Videos/VHS/Las_Loop_Modeparade.mkv: vfs cache: setting modification time to 2021-05-21 07:51:48.308072334 +0000 UTC m=+57963.037840797
2021/05/21 07:51:48 INFO  : backedup-storage/Videos/VHS/Las_Loop_Modeparade.mkv: vfs cache: queuing for upload in 5s
2021/05/21 07:51:48 DEBUG : backedup-storage/Videos/VHS/Las_Loop_Modeparade.mkv(0xc003dcb780): >close: err=<nil>
  • These files (such as the above) do not remain open (checked with lsof), as expected
  • These files have the correct file size within the vfs cache-dir
  • MD5 sums of the file in the vfs cache-dir and of the source material also match
  • As mentioned, a simple rclone restart ensures this all gets read again and uploaded correctly, and all is normal until we hit the error condition again

Thank you!

I suspect something has deadlocked holding the directory mutex which is stopping things progressing...

Can you replicate this problem at will? If so how?

Can you make it go wrong again, then kill -QUIT rclone and collect the backtrace and post it here?
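
For reference, kill -QUIT works because the Go runtime responds to SIGQUIT by printing a stack trace for every goroutine before it exits - that dump is the backtrace I'm after. A minimal demo of the mechanism (just an illustration, nothing rclone-specific):

// sigquit_demo.go - run it, then `kill -QUIT <pid>` from another shell.
// The runtime prints "SIGQUIT: quit" followed by every goroutine's
// stack, then exits - the same kind of dump rclone will produce.
package main

import "time"

func main() {
	// A goroutine blocked forever - it shows up in the dump as
	// "chan receive", just as a stuck writeback cancel would.
	go func() {
		<-make(chan struct{})
	}()
	time.Sleep(time.Hour)
}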

Thanks @ncw.

Sadly I cannot replicate the problem at will, however it seems to occur every 24 hours or so.

I'll wait for the error condition to occur again and supply the backtrace, thanks!

Righty @ncw, it happened again so I could catch the backtrace, which is attached.

broken2.log.txt (475.4 KB)

For reference, it appears the VFS stopped working at:

# grep 'uploading' rclone.log-20210522 | tail -5
2021/05/21 22:21:20 INFO  : vfs cache: cleaned: objects 22 (was 148) in use 22, to upload 10, uploading 3, total size 25.876G (was 25.878G)
2021/05/21 22:22:20 INFO  : vfs cache: cleaned: objects 17 (was 134) in use 17, to upload 7, uploading 1, total size 26.002G (was 26.009G)
2021/05/21 22:23:20 INFO  : vfs cache: cleaned: objects 25 (was 123) in use 25, to upload 11, uploading 4, total size 26.126G (was 26.136G)
2021/05/21 22:24:20 INFO  : vfs cache: cleaned: objects 25 (was 161) in use 25, to upload 12, uploading 4, total size 26.255G (was 26.260G)
2021/05/21 22:25:20 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: vfs cache: cancelling writeback (uploading true) 0xc031e88cb0 item 56068

Thanks!

That is a very useful log - thank you.

This is where the problem happened:

1: chan receive [640 minutes] [Created by vfscache.New @ cache.go:144]
    writeback  writeback.go:385      (*WriteBack)._cancelUpload(*WriteBack(#90), *writeBackItem(#1153))
    writeback  writeback.go:291      (*WriteBack).Remove(*WriteBack(#90), Handle(0xdb04))
    vfscache   item.go:874           (*Item)._remove(*Item(#795), string(#9, len=34), 0x5a96)
    vfscache   item.go:923           (*Item).RemoveNotInUse(*Item(#795), 0x0, #859, 0x1, 0x5a96)
    vfscache   cache.go:502          (*Cache).removeNotInUse(*Cache(#92), *Item(#795), Duration(0x0))
    vfscache   cache.go:652          (*Cache).purgeOverQuota(*Cache(#92), 5368709120)
    vfscache   cache.go:679          (*Cache).clean(*Cache(#92), bool(#841))
    vfscache   cache.go:742          (*Cache).cleaner(#92, #26, #311)

So you are right it happened when cancelling the upload.

However there are two things I don't understand yet:

  1. This appears to be running the cache purge, i.e. cleaning your cache back down to size. However it shouldn't have been trying to cancel something which is being uploaded, as that item should be "dirty".

  2. This caused a deadlock somehow - the cancel for the upload never went through, so the above trace waits forever for it. There's a sketch of the sort of cycle I mean below.
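
To make point 2 concrete, here is a toy sketch of the kind of cycle I suspect - the names are illustrative, not rclone's actual code. The cleaner takes the item lock and then waits for the in-flight upload to acknowledge the cancel, while the uploader needs that same lock before it can acknowledge, so both wait forever. And since the cleaner is the goroutine that logs 'vfs cache: cleaned:' every minute, those lines stop too:

// deadlock_sketch.go - toy reproduction of the suspected lock cycle.
package main

import (
	"sync"
	"time"
)

type item struct {
	mu   sync.Mutex
	done chan struct{} // closed by the uploader when it finishes
}

// upload stands in for the writeback goroutine: after the transfer it
// must take the item lock to record completion before closing done.
func (it *item) upload() {
	time.Sleep(100 * time.Millisecond) // "uploading true"
	it.mu.Lock()                       // blocks: the cleaner holds this
	close(it.done)
	it.mu.Unlock()
}

func main() {
	it := &item{done: make(chan struct{})}
	go it.upload()

	// One cleaner pass decides to purge the item while it is still
	// uploading: take the lock, then wait for the cancel to complete.
	it.mu.Lock()
	<-it.done // "chan receive [640 minutes]" - waits forever
	it.mu.Unlock()
}

Run it and the Go runtime aborts with "fatal error: all goroutines are asleep - deadlock!". In the real process plenty of other goroutines keep running, so there is no fatal error - the cleaner just silently stops, which matches both the missing 'cleaned:' lines and the 640 minute chan receive in the trace.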

Have you got any more of that log? It would be really useful to have anything mentioning backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png - I'd like to know what state that file is in.

Can you open a new issue on GitHub about this also - I think this may take some time to fix!

Cheers @ncw !

As for more in the log about that file - yes sure:

# zgrep backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png *.gz
2021/05/21 22:24:32 DEBUG : /backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: OpenFile: flags=O_WRONLY|O_CREATE|O_TRUNC, perm=-rwxrwxrwx
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: Open: flags=O_WRONLY|O_CREATE|O_TRUNC
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: newRWFileHandle: 
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png(0xc00ae55880): openPending: 
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: vfs cache: truncate to size=0
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png(0xc00ae55880): >openPending: err=<nil>
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: >newRWFileHandle: err=<nil>
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: >Open: fd=backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png (rw), err=<nil>
2021/05/21 22:24:32 DEBUG : /backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: >OpenFile: fd=backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png (rw), err=<nil>
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png(0xc00ae55880): _writeAt: size=23190, off=0
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png(0xc00ae55880): >_writeAt: n=23190, err=<nil>
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png(0xc00ae55880): _writeAt: size=0, off=23190
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png(0xc00ae55880): >_writeAt: n=0, err=<nil>
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png(0xc00ae55880): close: 
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: vfs cache: setting modification time to 2021-05-21 22:24:32.428790008 +0000 UTC m=+37680.681425438
2021/05/21 22:24:32 INFO  : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: vfs cache: queuing for upload in 5s
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png(0xc00ae55880): >close: err=<nil>
2021/05/21 22:24:32 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: vfs cache: setting modification time to 2000-12-08 00:14:37 +0000 UTC
2021/05/21 22:24:33 DEBUG : serve sftp 102.65.40.41:41870->172.19.0.14:2022: exec command: binary = "md5sum", args = "backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png"
2021/05/21 22:24:33 DEBUG : serve sftp 102.65.40.41:41870->172.19.0.14:2022: command "md5sum backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png" failed with error: unexpected non file
2021/05/21 22:24:37 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: vfs cache: starting upload
2021/05/21 22:24:38 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: MD5 = dd26141b5189d019869390febcba7503 OK
2021/05/21 22:24:38 INFO  : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: Copied (new)
2021/05/21 22:24:38 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: vfs cache: fingerprint now "23190,2000-12-08 00:14:37 +0000 UTC"
2021/05/21 22:24:38 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: vfs cache: writeback object to VFS layer
2021/05/21 22:24:38 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: Not setting pending mod time 2000-12-08 00:14:37 +0000 UTC as it is already set
2021/05/21 22:25:20 DEBUG : vfs cache RemoveNotInUse (maxAge=1080000000000000, emptyOnly=false): item backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png not removed, freed 0 bytes
2021/05/21 22:25:20 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: vfs cache: cancelling writeback (uploading true) 0xc031e88cb0 item 56068
2021/05/21 22:25:20 DEBUG : backups/stasiewa/marais/.thumbnails/normal/fe1f1749570c2d75a34bd8af9646c0fb.png: vfs cache: cancelling upload

That 'unexpected non file' statement there is curious!

If it is of interest - the sending end (to the rclone sftp server) is also using rclone.

Example 3

Here's another example from today, where things stopped working at 14:10:46:

# grep uploading rclone.log | tail -5
2021/05/23 14:06:46 INFO  : vfs cache: cleaned: objects 12 (was 26) in use 12, to upload 3, uploading 0, total size 25.804G (was 25.804G)
2021/05/23 14:07:46 INFO  : vfs cache: cleaned: objects 13 (was 51) in use 13, to upload 3, uploading 1, total size 25.886G (was 25.936G)
2021/05/23 14:08:46 INFO  : vfs cache: cleaned: objects 16 (was 27) in use 16, to upload 6, uploading 1, total size 26.016G (was 26.016G)
2021/05/23 14:09:46 INFO  : vfs cache: cleaned: objects 9 (was 93) in use 9, to upload 0, uploading 0, total size 26.146G (was 26.147G)
2021/05/23 14:10:46 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: vfs cache: cancelling writeback (uploading true) 0xc0000af2d0 item 53703

Checking the specific filename:

# grep backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png rclone.log
2021/05/23 14:09:19 DEBUG : /backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: OpenFile: flags=O_WRONLY|O_CREATE|O_TRUNC, perm=-rwxrwxrwx
2021/05/23 14:10:37 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: Open: flags=O_WRONLY|O_CREATE|O_TRUNC
2021/05/23 14:10:37 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: newRWFileHandle: 
2021/05/23 14:10:37 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png(0xc0084452c0): openPending: 
2021/05/23 14:10:37 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: vfs cache: checking remote fingerprint "21132,2009-06-23 17:11:08 +0000 UTC" against cached fingerprint ""
2021/05/23 14:10:37 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: vfs cache: truncate to size=21132
2021/05/23 14:10:37 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png(0xc0084452c0): >openPending: err=<nil>
2021/05/23 14:10:37 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: vfs cache: truncate to size=0
2021/05/23 14:10:37 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: >newRWFileHandle: err=<nil>
2021/05/23 14:10:37 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: >Open: fd=backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png (rw), err=<nil>
2021/05/23 14:10:37 DEBUG : /backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: >OpenFile: fd=backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png (rw), err=<nil>
2021/05/23 14:10:37 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png(0xc0084452c0): _writeAt: size=21132, off=0
2021/05/23 14:10:37 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png(0xc0084452c0): >_writeAt: n=21132, err=<nil>
2021/05/23 14:10:38 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png(0xc0084452c0): _writeAt: size=0, off=21132
2021/05/23 14:10:38 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png(0xc0084452c0): >_writeAt: n=0, err=<nil>
2021/05/23 14:10:38 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png(0xc0084452c0): close: 
2021/05/23 14:10:38 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: vfs cache: setting modification time to 2021-05-23 14:10:37.925815988 +0000 UTC m=+56787.975779484
2021/05/23 14:10:38 INFO  : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: vfs cache: queuing for upload in 5s
2021/05/23 14:10:38 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png(0xc0084452c0): >close: err=<nil>
2021/05/23 14:10:43 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: vfs cache: starting upload
2021/05/23 14:10:44 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: MD5 = bdf42daa956cf81903e2aa0948c0ae76 OK
2021/05/23 14:10:44 INFO  : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: Copied (replaced existing)
2021/05/23 14:10:44 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: vfs cache: fingerprint now "21132,2021-05-23 14:10:37.925 +0000 UTC"
2021/05/23 14:10:44 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: vfs cache: writeback object to VFS layer
2021/05/23 14:10:46 DEBUG : vfs cache RemoveNotInUse (maxAge=1080000000000000, emptyOnly=false): item backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png not removed, freed 0 bytes
2021/05/23 14:10:46 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: vfs cache: cancelling writeback (uploading true) 0xc0000af2d0 item 53703
2021/05/23 14:10:46 DEBUG : backups/stasiewa/marais/.thumbnails/normal/00704371f1673ac2f91835f4c563d322.png: vfs cache: cancelling upload

So not the same MD5 issue as above.

Stack trace:
example3.log.txt (184.4 KB)

Thanks!

I will reply on the issue to this - thank you 🙂

Thanks! Give a shout if you need any more info!

