Rclone mount: Synchonous write transaction?

Hello i m testing rclone git master (latest tag is 1.54-stable, sha: 1406b6c3c9e21f59cba027101bfb2e014a21736e). My windows version is 20H2 x64, winfsp - 2020.2

I experience an issue when i can not get status for write operation onto rclone mount -ed virtual wile system.

Digging deeper i found that control to the system was returned after cmd/cmount.Write() exits, then cmd/cmount.Flush() is called and actual remote flushing (in my case backend/webdav.Update()) occurs. So if there is access violation on remote, client will not catch an error.

e.g. echo 1 > w:\test.txt return control before Flush starts...

Maybe it is supposed to be so, to increase write performance? But this look strange enought.

The command you were trying to run:

rclone mount :webdav: w: --webdav-url https://webdav:8800/

im looking that WinFSPy does not return control before cleanup: winfspy\plumbing\file_system_interface.py:_trampolin_fs_Cleanup() is called immediately after _trampolin_fs_Write()

So maybe there is something different in volume setup?

The way FUSE filesystem work is that they will try to return errors on Write and also on Flush but there is no way to return an error from Release - this is called by the kernel some time after the user has finished with the file.

From your description it sounds like some writeback caching is going on.

Can you do a simple example with rclone mount -vv --debug-fuse and post the log here? Then I will be able to see exactly what you mean.

There may well be mount options to control this -o flag or WinFSP options which you can pass through as well.

mistaken with git version, updated to actual master sha: d27c35ee4a9882d453dcda6420ac935d520eac93

so control was returned here:

Ah, I see what is going on. It is rclone doing the caching for small files.

Try setting --streaming-upload-cutoff 0

  --streaming-upload-cutoff SizeSuffix   Cutoff for switching to chunked upload if file size is unknown. Upload starts after reaching cutoff or when file ends. (default 100k)

i will do. I think it is cgofuse issue as i experience issue before rclone got control, i had played several -o s with no luck.

This will force rclone to open the file in the Write which should fix the problem for you I think.

I wonder if rclone should always be setting that with mount.

rclone.log (11.6 KB)

you may see time break before 14:27:47, that is i got control in the sHell...

nope that does not help

it probably i should try to reproduce filesystem mounting like WinFSPy do on rclone level...

ok first we should confirm that issue is really persists, @ncw may i hope that you will check that control was returned before Flush occur? It all sounds that it should not happed this way.
Then i would investigate the difference between cgofuse and WinFSPy mounts.

I just noticed this in your log

2021/04/25 14:27:37 DEBUG : webdav root '': Target remote doesn't support streaming uploads, creating temporary local FS to spool file

What that means is the webdav backend doesn't support unknown length uploads so rclone has to spool the file to disk. This means that the file won't be transmitted to the remote end until the Flush is called. However the flush should return any errors at that point.

You could try your test to a different backend (you can use the local disk) and see if you get the same result?

rclone2.log (16.1 KB) rclone.log (10.6 KB)

command: rclone mount :local:f: p: -vv --debug-fuse

shell output:

C:\Users\kalem>echo 1 > p:\test.txt

C:\Users\kalem>copy terra.ma p:\
Files copied:         1.

C:\Users\kalem\Desktop>

Destination is not writeable but operation succeeds. i also made time breaks (by breakpoints) before Flush started but operation is still succeeds without breakpoints.

So result is same even with local filesystem.

i also have unclear behavior with etalon winfsp memfs:

by default i have following calls by order (not sure if there are another important calls yet): Create, Cleanup, return control, Write

and passing -f option that corresponds to VolumeParams.FlushAndPurgeOnCleanup = 1;

i have: Create, Write, Cleanup, returning control, Close

I think i should check this volume parameter in cgofuse (that actually call fuse_main_real)...

Also i can not find place where Write operation return to WinFSP the STATUS_PENDING code. Current behaviour would be because of operation returns STATUS_SUCCESS code for asynchronous write...

Can you try your local test again with --streaming-upload-cutoff 0?

I can see it is buffering the file from the log.

rclone.log (11.7 KB)

Control is still returned before Flush starts...

If this was linux then an error reported by rclone on Flush would translate into an error return from the close() syscall.

2021/04/26 14:13:34 DEBUG : /test.txt: Flush: fh=0x0
2021/04/26 14:13:34 ERROR : test.txt: WriteFileHandle.New Rcat failed: Access is denied.
2021/04/26 14:13:34 ERROR : test.txt: WriteFileHandle.Flush error: Access is denied.
2021/04/26 14:13:34 ERROR : IO error: Access is denied.
2021/04/26 14:13:34 DEBUG : /test.txt: >Flush: errc=-5

Looking at the WinFSP debug it appears WinFSP calls both Flush and Release in one transaction which is reasonable, but crucially it seems to be ignoring the error code of the Flush

__debug_bin[TID=1164]: FFFF800C4D60EC30: >>Close 0000000000000000:000002A8BD34C270
2021/04/26 14:13:34 DEBUG : /test.txt: Flush: fh=0x0
2021/04/26 14:13:34 ERROR : test.txt: WriteFileHandle.New Rcat failed: Access is denied.
2021/04/26 14:13:34 ERROR : test.txt: WriteFileHandle.Flush error: Access is denied.
2021/04/26 14:13:34 ERROR : IO error: Access is denied.
2021/04/26 14:13:34 DEBUG : /test.txt: >Flush: errc=-5     <-------------- an error is returned by rclone here
2021/04/26 14:13:34 DEBUG : /test.txt: Release: fh=0x0
2021/04/26 14:13:34 DEBUG : test.txt: WriteFileHandle.Release nothing to do
2021/04/26 14:13:34 DEBUG : /test.txt: >Release: errc=0
__debug_bin[TID=1164]: FFFF800C4D60EC30: <<Close IoStatus=0[0] <------ but no error appears here?

So I wonder if this is a WinFSP FUSE layer bug?

@billziss-gh what do you think? Where did this error go?

not sure if i correctly explained the issue that i concerned with.

It does not matter if there is an error or not, but now control returned to the system immediately after Write operation finished so as control returned, there is not way to get status or measure time spent on operation. This would be inspected with debugger.

As far as i know, write operation should return 259/0x103 code (STATUS_PENDING) for asynchronous write that definitely rclone do, but i still can not find corresponding place to inspect neither in rclone or cgofuse.

I would guess that this is caused by a mis-match between the FUSE file system layer which was designed for Linux/Unix systems and WinFSP which has to adapt that into a Windows file system.