Dealing with `deletefile` when writing a backend from scratch

Ah...

It is relatively easy to add a new escape, eg

The test cases are automatically generated

You can leave this as a TODO for the moment if you want.

Now 5 tests remaining, except punctuation test which never pass.

https://gist.github.com/Lesmiscore/03bfcca4b3c30f82a4d6ab5d58bf77fb/raw/0e3f848980d0b8a7f7f959b63fe2d583ccc4cc1f/tests.log

Looking at this TestIntegration/FsMkdir/FsPutFiles/FsListDirFile2 looks like it is failing because of

    fstest.go:144: 
        	Error Trace:	fstest.go:144
        	            				fstest.go:150
        	            				fstests.go:205
        	            				fstests.go:222
        	            				fstests.go:870
        	Error:      	Should be true
        	Test:       	TestIntegration/FsMkdir/FsPutFiles
        	Messages:   	Internet Archive item lesmi-rclone-test path rclone-test-kunuran0sugecov3guwarep4/file name.txt: md5 hash incorrect - expecting "84eed0960840c442fd04a3528fb0de71" got "d2e2c5ba2830266db6494f6210530b1c"

I'd work on that first. Is the upload corrupted? Or is it caching of an old value? Or what?

That seems to be an old value, if the content has been overwritten throughout the tests. The /metadata/ endpoint, which List*() relies on, gives the expected value.

Did the object get updated but its hashes didn't? Or did the endpoint return stale data?

Test case and backend considered upload "completed" before actual data to be updated.
Currently the backend won't wait for the file to be processed. (it needs either new and old size and new and old hashes, as IA doesn't provide direct API to poll its status)

Rclone needs to be able to query the correct hashes as soon as the backend returns from the Update call, so you'll have to figure a work-around for that.

I just need to implement polling... which doesn't always work...
I'll try anyway.

Only one test remains! finally...

@ncw Now, the last test that was failing is PASSED (except another one which is known to not to pass).
Ready for review.

I do think semicolon problem should be in a separate PR or anything.

edit: all debugging println's are removed

Well done :slight_smile:

We can put the semicolon escape in another PR if you want, or I'll add the mechanism for it and you can rebase and use it.

I've set the CI off on the PR to see if it passes the tests, then I'll take a look at the code!

Can you please do this? I'll rebase it later on.

I note that you've got some lint stuff to fix up here: backend/internetarchive: add support for Internet Archive · rclone/rclone@0f02a32 · GitHub

Mostly about exported methods needing comments!

Note that you can run these checks locally using make check - you'll need to install golangci-lint which make build_dep will do for you.

I added a Semicolon encoding to master now - if you rebase on master you can pick that up and fix the last test!

Thank you! I confirmed the punctuation test passes.
I've rebased now. Let me know if it needs squashing.

=== RUN   TestIntegration
    fstests.go:418: Using remote "TestIA:lesmi-rclone-test/"
=== RUN   TestIntegration/FsMkdir
=== RUN   TestIntegration/FsMkdir/FsEncoding
=== RUN   TestIntegration/FsMkdir/FsEncoding/punctuation
    fstests.go:674: testing "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
--- PASS: TestIntegration (120.97s)
    --- PASS: TestIntegration/FsMkdir (120.08s)
        --- PASS: TestIntegration/FsMkdir/FsEncoding (119.23s)
            --- PASS: TestIntegration/FsMkdir/FsEncoding/punctuation (118.96s)
PASS
ok      command-line-arguments  120.982s

Before review or merge, let me test reducing encodings. Some of them may be removed. Gave up. Literally unpredictable.

=== RUN   TestIntegration
    fstests.go:418: Using remote "TestIA:lesmi-rclone-test/"
=== RUN   TestIntegration/FsMkdir
=== RUN   TestIntegration/FsMkdir/FsEncoding
=== RUN   TestIntegration/FsMkdir/FsEncoding/punctuation
    fstests.go:674: testing "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
    fstests.go:682: 
                Error Trace:    fstests.go:682
                Error:          Received unexpected error:
                                parse "https://s3.us.archive.org/lesmi-rclone-test/rclone-test-binurih9lohejar7lisumeb0/!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~/!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~": invalid URL escape "%&'"
                Test:           TestIntegration/FsMkdir/FsEncoding/punctuation
    fstest.go:299: Want: , Got: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~/!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ (100)
    fstest.go:300: Sleeping for 625ms for list eventual consistency: 1/20
    fstest.go:299: Want: , Got: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~/!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ (100)
    fstest.go:300: Sleeping for 781.25ms for list eventual consistency: 2/20
    fstest.go:299: Want: , Got: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~/!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ (100)
    fstest.go:300: Sleeping for 976.5625ms for list eventual consistency: 3/20
    fstest.go:299: Want: , Got: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~/!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ (100)
    fstest.go:300: Sleeping for 1.220703125s for list eventual consistency: 4/20
    fstest.go:299: Want: , Got: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~/!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ (100)
    fstest.go:300: Sleeping for 1.525878906s for list eventual consistency: 5/20

Great

Backend submissions are usually squashed into one commit. I don't think there is any extra stuff in yours is there?

This indicates something went wrong in the URLs...

This is really hard to get right, I agree and might be inconsistent in the server. Encoding the troublesome characters is fine.

Inspecting requests from Web UI suggests that if it's able to make a request with URL-encoded, like PUT /bucket-test%2Fhello%3Fworld%21.txt HTTP/1.0, it works. You can even include semicolons.
But rest module decodes it before making request, thus causing the problem.

diff --git a/backend/internetarchive/internetarchive.go b/backend/internetarchive/internetarchive.go
index 568f4a1..4b06559 100644
--- a/backend/internetarchive/internetarchive.go
+++ b/backend/internetarchive/internetarchive.go
@@ -709,7 +709,7 @@ func (o *Object) Update(ctx context.Context, in io.Reader, src fs.ObjectInfo, op
        var resp *http.Response
        opts := rest.Opts{
                Method:        "PUT",
-               Path:          path.Join("/", bucket, bucketPath),
+               Path:          fmt.Sprintf("/%s", url.QueryEscape(path.Join(bucket, bucketPath))),
                Body:          in,
                ContentLength: &size,
                ExtraHeaders:  headers,

Good. We should be able to support that.

Can you check with -vv --dump bodies to see what gets sent in the HTTP request.

I think it is http.NewRequestWithContext in the standard library which is doing the decoding because it calls net/url.Parse

See the url.URL docs

My reading of the code is that it should be storing the %encoded stuff in RawPath and using that in the request.

Maybe the problem is at the other end and doing that url.QueryEscape is what it needs? Looking at what rclone is sending with -vv --dump bodies should clear things up.

Okay, reduced encoding to minimum. Confirmed tests pass.

I thought I was using PathEncoding at that step... Confusing what was going on

Well done. I'll do a code review on the pull request now.

1 Like