Rclone mount vfs azureblob

What is the problem you are having with rclone?

I am facing an issue with inaccessible data caused by "old" cache entries.
I think it happend during a rename of the folder by the application, but cannot proof it.

What is your rclone version (output from rclone version)

rclone v1.53.2

  • os/arch: linux/amd64
  • go version: go1.15.3

Which OS you are using and how many bits (eg Windows 7, 64 bit)

SLES 12 SP5 64bit

Which cloud storage system are you using? (eg Google Drive)

AzureBlob

The command you were trying to run (eg rclone copy /tmp remote:tmp)

/usr/bin/rclone mount --daemon --cache-dir /mnt/resource/rclonecache --track-renames --dir-cache-time 24h --attr-timeout 4m --vfs-write-back 5s --vfs-cache-mode full --transfers=16 --azureblob-upload-cutoff 32m --azureblob-chunk-size 32m --checksum --vfs-read-chunk-size 32m --vfs-read-chunk-size-limit 500m --multi-thread-cutoff=32m --multi-thread-streams=16 --vfs-cache-max-age 168h --vfs-cache-max-size 300g --allow-other --allow-non-empty --use-mmap --buffer-size 1

A log from the command with the -vv flag

vfs cache: too many errors 11/10: last error: vfs reader: failed to write to cache file: failed to open for download: -> github.com/Azure/azure-storage-blob-go/azblob.newStorageError, github.com/Azure/azure-storage-blob-go@v0.10.0/azblob/zc_storage_error.go:42
===== RESPONSE ERROR (ServiceCode=BlobNotFound) =====
Description=The specified blob does not exist.
RequestId:a3f6406b-201e-0103-684a-e41c96000000
Time:2021-01-06T16:40:35.1772218Z, Details:
Code: BlobNotFound
Authorization: REDACTED
User-Agent: [rclone/v1.53.2]
X-Ms-Client-Request-Id: [a46c1b50-8557-4f32-63cc-d252c648eca3]
X-Ms-Date: [Wed, 06 Jan 2021 16:40:35 GMT]
X-Ms-Range: [bytes=0-33554431]
X-Ms-Version: [2019-02-02]

RESPONSE Status: 404 The specified blob does not exist.
Content-Length: [215]
Content-Type: [application/xml]
Date: [Wed, 06 Jan 2021 16:40:34 GMT]
Server: [Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0]
X-Ms-Client-Request-Id: [a46c1b50-8557-4f32-63cc-d252c648eca3]
X-Ms-Error-Code: [BlobNotFound]
X-Ms-Request-Id: [a3f6406b-201e-0103-684a-e41c96000000]
X-Ms-Version: [2019-02-02]

Details

I have 2 directories on blob storage. Its cryptic so i just shorten the names:
old_1607547325_1607503210_277_727039C7-35B7-48EF-8BB8-D0C4A37F47B4
new_1607547325_1607503210_277_727039C7-35B7-48EF-8BB8-D0C4A37F47B4

The application did a rename from the old name to the new one. So i can find the files in the new directory on the blob storage.
But I cannot access the files through rclone.

The issue is, when i try to access the files through the rclone mount on new_1607547325_1607503210_277_727039C7-35B7-48EF-8BB8-D0C4A37F47B4
rclone tries to search in the old folder. The old folder still exists but does not have the files.

When i try to access the data I get the above error message with the folder old_1607547325_1607503210_277_727039C7-35B7-48EF-8BB8-D0C4A37F47B4
For sure it cannot find the files cause they are not at the blob storage.

Rclone seems to have a wrong "link". I want to access the files in the new folder and it redirects me to the old folder.

  1. Could it be related to the --track-renames flag? Does azure blob support this?
  2. Do i have the possibility to check the cache somehow, why it is pointing to the old directory?
  3. Can i purge the cache without restarting rclone?

So from my point of view, the copy/rename/transfer to blob worked properly, the data is consistent on the storage in the new directory, but rclone is still pointing to the old one, so i cannot access the data.

Unfortunatly that is happing not very often, so its hard to reproduce. I will try to setup a test machine anyhow to get some more details.

Hope you can point me to the right direction here.

Note that this says that you are happy for rclone to keep directories for 24 hours without re-reading them which might be part of the problem. If the directory got read then rclone will ignore external changes for 24 hours.

It might be worth trying the latest beta also.

Azure blob does support this, but not on rclone mount

Not easily

You can use the rc and vfs/refresh and vfs/forget

I only allow the system with rclone to connect to the blob. No external accesses are allowed and I do not run other processes on the server which might change beside of rclone.

OK, so i will disable that in the config

Unfortunatly I get the following error with both commands:
2021/01/07 19:05:59 Failed to rc: connection failed: Post "http://localhost:5572/vfs/forget": dial tcp [::1]:5572: connect: connection refused
So i need to set the rc flag for the daemon...

The error is still there so i do not want to restart rclone for troubleshooting reasons. Do I have another chance to clear the cache?

https://rclone.org/rc/#vfs-refresh
"If rclone is run with the --rc flag then it starts an http server which can be used to remote control rclone using its API."
so add --rc to the mount command.

and this will do a refresh
rclone rc vfs/refresh recursive=true

Try this (from the mount docs)

You can send a SIGHUP signal to rclone for it to flush all directory caches, regardless of how old they are. Assuming only one rclone instance is running, you can reset the cache like this:

kill -SIGHUP $(pidof rclone)

I did what you said, but rclone still points to the old directory. Hard to say what is happening here.

Try using the rc commands?

I restarted the rclone process and that solved the issue directly. There was something wrong in the cache, probably because of the track renames?
I will remove the track renames and add the rc flag for the next test.

I enabled rc testwise on a system and tried to refresh the cache with
rclone rc vfs/refresh recursive=true

After some time I get the error:
2021/01/13 10:29:01 Failed to rc: connection failed: Post "http://localhost:5572/vfs/refresh": net/http: timeout awaiting response headers

Is there any timeout flag i can define?

This is taking longer than 30 seconds which is the default timeout. If you add _async=true it will do it in the background and return immediately.

You can poll for the completion using the rc job interface if you want.

I can see the issue again. I enabled --rc on the machines so i can refresh the vfs cache but it does not care about it. It seems that there is another cache wich is not refreshed. I remove also the track-renames stuff and it still comes up again.

What I am doing:

ls /"new_path"/file.txt #this works

cat /"new_path"/file.txt

Input/output error

rclone log says:
vfs cache: failed to download: vfs reader: failed to write to cache file: failed to open for download: -> github.com/Azure/azure-storage-blob-go/azblob.newStorageError, github.com/Azure/azure-storage-blob-go@v0.10.0/azblob/zc_storage_error.go:42
===== RESPONSE ERROR (ServiceCode=BlobNotFound) =====
Description=The specified blob does not exist.
RequestId:20067330-a01e-0070-5e82-ee2a00000000
Time:2021-01-19T16:43:49.4167029Z, Details:
Code: BlobNotFound
GET https://"old_path"
Authorization: REDACTED
User-Agent: [rclone/v1.53.2]
X-Ms-Client-Request-Id: [e06e3116-fa47-4af5-75bc-ae11c4cdcdd7]
X-Ms-Date: [Tue, 19 Jan 2021 16:43:49 GMT]
X-Ms-Range: [bytes=0-650196]
X-Ms-Version: [2019-02-02]

RESPONSE Status: 404 The specified blob does not exist.
Content-Length: [215]
Content-Type: [application/xml]
Date: [Tue, 19 Jan 2021 16:43:48 GMT]
Server: [Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0]
X-Ms-Client-Request-Id: [e06e3116-fa47-4af5-75bc-ae11c4cdcdd7]
X-Ms-Error-Code: [BlobNotFound]
X-Ms-Request-Id: [20067330-a01e-0070-5e82-ee2a00000000]
X-Ms-Version: [2019-02-02]

rclone rc vfs/refresh dir="old_path"

"file does not exist"

rclone rc vfs/refresh dir="new_path"

"file does not exist"

rclone rc vfs/forget dir="old_path"

"forgotten": "old_path"

rclone rc vfs/forget dir="new_path"

"forgotten": "new_path"

But still when I want to access the file in the new path it says input/putput error. The directory structure is properly. It is in the new path, but when I want to access a file in the directory it tries to connect to the old location.

Do i miss something in my flags regarding a parallel caching mechanism?

Thanks

It would be worth trying the latest beta here as if you have external modifications of files not through rclone then you can get this symptom.

Only rclone is modifying things on the storage.There is no other access allowed except from this system with one rclone process running.
Any other ideas?

Try the beta.

If that doesn't work then the best thing would be to make a way for me to reproduce it, so a sequence of commands I can run to get the mount into the broken state.

Tried the latest beta. Still not working, but fortunatly I found a way to reproduce it. Hopefully you can fix it.

It seems to be an issue with the rename command. mv and cp are working properly.
To reproduce try this:

On rclone blob mount:

#mkdir rclonetest
#echo "this is a test" > rclonetest/test.txt
#cat rclonetest/test.txt
this is a test
#rename rclone rclone1 rclonetest
#ls
rclone1test
#cat rclone1test/test.txt
this is a test (works but logfile says that it uses the old name in the cache. Below log file. On blob I have one rclonetest without the file and one rclone1test with the file. So it just relies on the cache currently and also DO NOT delete the old directory on blob. Also a rm to the directory does not delete it on blob. So thats another issue)

After some time, I think after the cache gets cleared. I cannot access the file any more:
#cat rclone1test/test.txt
cat: rclone1test/test.txt: Input/output error

Unfortunatly these commands are not removing the cache at all. I can still see the stuff in the cache directory. Therefore I can not force the cache to be removed and reproduce it immediatly
rclone rc vfs/forget file=test/rclone1test/test.txt
rclone rc vfs/forget file=test/rclonetest/test.txt

Logfile:
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt: Open: flags=OpenReadOnly
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt: Open: flags=O_RDONLY
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt: newRWFileHandle:
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt: >newRWFileHandle: err=
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt: >Open: fd=test/rclonetest/test.txt (rw), err=
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt: >Open: fh=&{test/rclonetest/test.txt (rw)}, err=
2021/01/21 14:37:40 DEBUG : &{test/rclonetest/test.txt (rw)}: Read: len=4096, offset=0
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt(0xc000b78080): _readAt: size=4096, off=0
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt(0xc000b78080): openPending:
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt: vfs cache: checking remote fingerprint "15,2021-01-21 13:34:55.607814746 +0000 UTC,e19c1283c925b3206685ff522acfe3e6" against cached fingerprint "15,2021-01-21 13:34:55.607814746 +0000 UTC,e19c1283c925b3206685ff522acfe3e6"
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt: vfs cache: truncate to size=15
2021/01/21 14:37:40 DEBUG : test/rclone1test: Added virtual directory entry vAddFile: "test.txt"
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt(0xc000b78080): >openPending: err=
2021/01/21 14:37:40 DEBUG : vfs cache: looking for range={Pos:0 Size:15} in [{Pos:0 Size:15}] - present true
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt(0xc000b78080): >_readAt: n=15, err=EOF
2021/01/21 14:37:40 DEBUG : &{test/rclonetest/test.txt (rw)}: >Read: read=15, err=
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt: Attr:
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt: >Attr: a=valid=4m0s ino=0 size=15 mode=-rw-r--r--, err=
2021/01/21 14:37:40 DEBUG : &{test/rclonetest/test.txt (rw)}: Flush:
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt(0xc000b78080): RWFileHandle.Flush
2021/01/21 14:37:40 DEBUG : &{test/rclonetest/test.txt (rw)}: >Flush: err=
2021/01/21 14:37:40 DEBUG : &{test/rclonetest/test.txt (rw)}: Release:
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt(0xc000b78080): RWFileHandle.Release
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt(0xc000b78080): close:
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt: vfs cache: setting modification time to 2021-01-21 14:34:55.607814746 +0100 CET
2021/01/21 14:37:40 DEBUG : test/rclonetest/test.txt(0xc000b78080): >close: err=
2021/01/21 14:37:40 DEBUG : &{test/rclonetest/test.txt (rw)}: >Release: err=

2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt: Open: flags=OpenReadOnly
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt: Open: flags=O_RDONLY
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt: newRWFileHandle:
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt: >newRWFileHandle: err=
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt: >Open: fd=test/rclonetest/test.txt (rw), err=
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt: >Open: fh=&{test/rclonetest/test.txt (rw)}, err=
2021/01/21 14:39:00 DEBUG : &{test/rclonetest/test.txt (rw)}: Read: len=4096, offset=0
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt(0xc000b70380): _readAt: size=4096, off=0
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt(0xc000b70380): openPending:
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt: vfs cache: checking remote fingerprint "15,2021-01-21 13:34:55.607814746 +0000 UTC,e19c1283c925b3206685ff522acfe3e6" against cached fingerprint "15,2021-01-21 13:34:55.607814746 +0000 UTC,e19c1283c925b3206685ff522acfe3e6"
2021/01/21 14:39:00 ERROR : test/rclonetest/test.txt: vfs cache: detected external removal of cache file
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt: vfs cache: removed metadata from cache as cache file externally deleted
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt: vfs cache: truncate to size=15
2021/01/21 14:39:00 DEBUG : test/rclone1test: Added virtual directory entry vAddFile: "test.txt"
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt(0xc000b70380): >openPending: err=
2021/01/21 14:39:00 DEBUG : vfs cache: looking for range={Pos:0 Size:15} in [] - present false
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt: ChunkedReader.RangeSeek from -1 to 0 length -1
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt: ChunkedReader.Read at -1 length 32768 chunkOffset 0 chunkSize 16777216
2021/01/21 14:39:00 DEBUG : test/rclonetest/test.txt: ChunkedReader.openRange at 0 length 16777216
2021/01/21 14:39:00 INFO : test/rclonetest/test.txt: vfs cache: downloader: error count now 1: vfs reader: failed to write to cache file: failed to open for download: -> github.com/Azure/azure-storage-blob-go/azblob.newStorageError, github.com/Azure/azure-storage-blob-go@v0.12.0/azblob/zc_storage_error.go:42

2021/01/21 14:39:00 ERROR : test/rclonetest/test.txt: vfs cache: failed to download: vfs reader: failed to write to cache file: failed to open for download: -> github.com/Azure/azure-storage-blob-go/azblob.newStorageError, github.com/Azure/azure-storage-blob-go@v0.12.0/azblob/zc_storage_error.go:42
===== RESPONSE ERROR (ServiceCode=BlobNotFound) =====
Description=The specified blob does not exist.
RequestId:b3bf8046-901e-0073-43fa-ef2907000000
Time:2021-01-21T13:39:00.6662581Z, Details:
Code: BlobNotFound

I tried to replicate but I couldn't get the rename command to work

$ rename rclone rclone1 rclonetest
Bareword "rclone" not allowed while "strict subs" in use at (user-supplied code).

I'm trying

rename         1.10-1       all          Perl extension for renaming multiple files

Is that the correct command?

I am using it on SLES so I am not sure if its the same on your distribution.

It is this:
rename from util-linux 2.33.2

#zypper search util-linux
Refreshing service 'spacewalk'.
Loading repository data...
Reading installed packages...

S | Name | Summary | Type
--+--------------------+----------------------------------------+--------
i | util-linux | A collection of basic system utilities | package
| util-linux-lang | Languages for package util-linux | package
i | util-linux-systemd | A collection of basic system utilities | package

Additional Info:
The "link" from the old directory and files in the cache to the new filesystem structure is still working for the Metadata after some days. It only breaks for the File caching.
I can do "ls" to the files in the new directory and it routes it internally to the old folder and then to the new folder on blob. So everything is fine here. It also updates the Metacache.
But when I want to access the file it says the above errors cause it somehow lost the link between the old folder to the new folder on blob.

Sorry for the bad explanation.

Do you have any news here?

Ah, I've just figured out with a bit of grepping that it is called rename.ul in ubuntu

I haven't managed to replicate this with the latest beta.

Can you attach a complete log of the mount with -vv - I want to see exactly what happens around the time your run the rename command. I suspect something is going wrong with renaming the directory.

Note that renaming a directory on azureblob isn't straight forward - all the files within need to be renamed.