Error copying HTTP directory with plus in name

Hello,

I suspect this is a bug: rclone is unable to sync a subdirectory which has a plus at the end of its name. For example, when using the following command:

rclone sync --http-url "https://archive.org/download/everdrivepack/MegaCD/" ":http:" "LOCAL PATH" --checkers 1 --transfers=1 --progress --delete-during --local-no-sparse --multi-thread-streams 0 --low-level-retries 32 --retries 10 --exclude "*zip/**" --exclude "*7z/**" --refresh-times -vv

It will not copy the contents of https://archive.org/download/everdrivepack/MegaCD/MegaSD%20MD%2B/

The debug error is

2020-06-19 13:12:59 ERROR : Attempt 10/10 failed with 1 errors and: error listing "MegaSD MD+/": directory not found

Thanks,

Mike
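
For readers following along, the link in the listing and the name in the error message appear to be the same string in two encodings. A minimal Python sketch of that mapping, using only the standard library (the variable names are illustrative, and this is not how rclone itself does it):

```python
from urllib.parse import quote, unquote

# Link as it appears in the URL quoted in the post above.
href = "MegaSD%20MD%2B/"

# Decoding gives the human-readable name that shows up in the error message.
name = unquote(href)

# Re-encoding with only "/" left untouched gives back the original link.
encoded = quote(name, safe="/")

print(name)     # MegaSD MD+/
print(encoded)  # MegaSD%20MD%2B/
```

So a client that lists the parent directory has to send the `%2B` form back to the server, not the decoded name.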

That looks more like the HTTP server translating the name when it displays it on the page.

This is the URL:

https://archive.org/download/everdrivepack/MegaCD/MegaSD%20MD%2B so if you hit that, it works fine:

rclone ls --http-url "https://archive.org/download/everdrivepack/MegaCD/MegaSD%20MD%2B/" ":http:"
201688633 Golden Axe Symphony MD+.zip
168143910 Michael Jackson's Moonwalker - Instrumental MD+.zip
198690759 Michael Jackson's Moonwalker - Vocals MD+.zip
700664999 OutRun Arcade OST & Pyron Color Restoration MD+.zip
700664999 OutRun Arranged MD+.zip
977954764 Streets of Rage 2 Arranged MD+.zip
886838420 Ys III - Oath in Felghana MD+.7z
2020/06/19 10:16:29 ERROR : Michael Jackson's Moonwalker - Instrumental MD+.zip: error listing: error listing "Michael Jackson's Moonwalker - Instrumental MD+.zip/": directory not found
2020/06/19 10:16:29 ERROR : OutRun Arranged MD+.zip: error listing: error listing "OutRun Arranged MD+.zip/": directory not found
2020/06/19 10:16:29 ERROR : Michael Jackson's Moonwalker - Vocals MD+.zip: error listing: error listing "Michael Jackson's Moonwalker - Vocals MD+.zip/": directory not found
2020/06/19 10:16:29 ERROR : OutRun Arcade OST & Pyron Color Restoration MD+.zip: error listing: error listing "OutRun Arcade OST & Pyron Color Restoration MD+.zip/": directory not found
2020/06/19 10:16:29 ERROR : Ys III - Oath in Felghana MD+.7z: error listing: error listing "Ys III - Oath in Felghana MD+.7z/": directory not found
2020/06/19 10:16:29 ERROR : Streets of Rage 2 Arranged MD+.zip: error listing: error listing "Streets of Rage 2 Arranged MD+.zip/": directory not found
2020/06/19 10:16:29 ERROR : Golden Axe Symphony MD+.zip: error listing: error listing "Golden Axe Symphony MD+.zip/": directory not found
2020/06/19 10:16:29 Failed to ls with 8 errors: last error was: error listing "Golden Axe Symphony MD+.zip/": directory not found
felix@gemini:/data/test$

But rclone also lists the files and then tries to descend into them as directories, so that HTTP server seems non-standard.

My copy works as well:

felix@gemini:/data/test$ rclone copy -vv --http-url "https://archive.org/download/everdrivepack/MegaCD/" ":http:" /data/test
2020/06/19 10:19:26 DEBUG : rclone: Version "v1.52.1" starting with parameters ["rclone" "copy" "-vv" "--http-url" "https://archive.org/download/everdrivepack/MegaCD/" ":http:" "/data/test"]
2020/06/19 10:19:26 DEBUG : Using config file from "/opt/rclone/rclone.conf"
2020/06/19 10:19:33 DEBUG : AX-101 (Japan).zip: Starting multi-thread copy with 2 parts of size 160.375M
2020/06/19 10:19:33 DEBUG : AX-101 (Japan).zip: multi-thread copy: stream 2/2 (168165376-336304390) size 160.350M starting
2020/06/19 10:19:33 DEBUG : AX-101 (Japan).zip: multi-thread copy: stream 1/2 (0-168165376) size 160.375M starting
2020/06/19 10:19:33 DEBUG : 3 Ninjas Kick Back (USA).zip: Sizes differ (src 281202948 vs dst 149225050)
2020/06/19 10:19:33 DEBUG : A-Rank Thunder - Tanjou-hen (Japan).zip: Sizes differ (src 383643211 vs dst 199786074)
2020/06/19 10:19:33 DEBUG : A-X-101 (USA) (RE).zip: Sizes differ (src 347563235 vs dst 0)
2020/06/19 10:19:33 DEBUG : AH3 - Thunderstrike (USA).zip: Sizes differ (src 364482435 vs dst 220135002)
2020/06/19 10:19:33 DEBUG : After Armageddon Gaiden - Majuu Toushouden Eclipse (Japan).zip: Starting multi-thread copy with 2 parts of size 251M
2020/06/19 10:19:33 DEBUG : After Armageddon Gaiden - Majuu Toushouden Eclipse (Japan).zip: multi-thread copy: stream 2/2 (263192576-526349370) size 250.966M starting
2020/06/19 10:19:33 DEBUG : After Armageddon Gaiden - Majuu Toushouden Eclipse (Japan).zip: multi-thread copy: stream 1/2 (0-263192576) size 251M starting

I would have thought the plus really should be encoded, though.

This has encoded parentheses and works:
rclone copy --http-url "https://archive.org/download/everdrivepack/MegaCD" ":http:" --include="3 Ninja*/**" "/tmp/" -vv

This has the plus and doesn't:
rclone copy --http-url "https://archive.org/download/everdrivepack/MegaCD" ":http:" --include="MegaSD MD+/**" "/tmp/" -vv

Although if I test it with caddy browse, it works, but caddy isn't encoding those in the HTML.

rclone copy  --http-url "https://xx.xx.org/pub/" ":http:"  "/tmp/"  -vv
2020/06/19 10:26:23 DEBUG : rclone: Version "v1.52.0-001-g1cceadaf-beta" starting with parameters ["rclone" "copy" "--http-url" "https://xx.xx.org/pub/" ":http:" "/tmp/" "-vv"]
2020/06/19 10:26:23 DEBUG : Using config file from "/home/xx/.rclone.conf"
2020/06/19 10:26:25 DEBUG : Local file system at /tmp/: Waiting for checks to finish
2020/06/19 10:26:25 DEBUG : Local file system at /tmp/: Waiting for transfers to finish
2020/06/19 10:26:25 DEBUG : test+/xx.mp4: Starting multi-thread copy with 2 parts of size 164.562M
2020/06/19 10:26:25 DEBUG : test+/xx.mp4: multi-thread copy: stream 2/2 (172556288-345077865) size 164.529M starting
2020/06/19 10:26:25 DEBUG : test+/xx.mp4: multi-thread copy: stream 1/2 (0-172556288) size 164.562M starting

But I'd think rclone would try to interpret %2B as a plus.

Those are actually directories... even with the extension.

Are they? I can click on the link and it downloads.

The 'view contents' link. Hover over it. They expand that into a directory. :laughing:

So it's both a link you can download and a directory? Isn't that like a duplicate that would cause confusion?

Yes. If they are both downloaded. :laughing:

rclone lsf   --http-url "https://archive.org/download/everdrivepack/MegaCD" ":http:" 2>&1  | grep -i Out
Heart of the Alien - Out of This World Parts I and II (USA) (RE).zip/
Heart of the Alien - Out of This World Parts I and II (USA) (RE).zip

I think the OP excludes the zip contents though, so I think it would be okay.
--exclude "*zip/**" --exclude "*7z/**"
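
To make the effect of those two excludes concrete, here is a rough Python sketch. It is not rclone's filter engine; it only approximates `*zip/**` and `*7z/**` on top-level listing entries by checking the suffix, and the `listing` contents are copied from the output and error message earlier in this thread:

```python
# Entries as `rclone lsf` printed them above, plus the directory from the
# original error message. Each archive appears as a file and as a directory.
listing = [
    "Heart of the Alien - Out of This World Parts I and II (USA) (RE).zip/",
    "Heart of the Alien - Out of This World Parts I and II (USA) (RE).zip",
    "MegaSD MD+/",
]

# Directory-style entries that the two --exclude patterns are meant to skip.
ARCHIVE_DIR_SUFFIXES = ("zip/", "7z/")


def keep(entry: str) -> bool:
    """Return True unless the entry is an archive exposed as a directory."""
    return not entry.endswith(ARCHIVE_DIR_SUFFIXES)


kept = [entry for entry in listing if keep(entry)]
for entry in kept:
    print(entry)
```

The archive file itself and the real `MegaSD MD+/` directory survive; only the "view contents" directory is dropped, which is why the duplicate should not cause trouble.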

It should be the job of the http server though to re-encode them. Kind of like how this works for the same file with parenthesis. Both work. Again testing on caddy.

xx@dell-rob:/tmp$ rclone lsl   --http-url "https://xx.xx.org/pub/test%28%29" ":http:"   -vv
2020/06/19 10:52:42 DEBUG : rclone: Version "v1.52.0-001-g1cceadaf-beta" starting with parameters ["rclone" "lsl" "--http-url" "https://xx.xx.org/pub/test%28%29" ":http:" "-vv"]
2020/06/19 10:52:42 DEBUG : Using config file from "/home/xx/.rclone.conf"
161139110 2020-06-10 13:42:46.000000000 xx.mp4
2020/06/19 10:52:42 DEBUG : 3 go routines active
xx@dell-rob:/tmp$ rclone lsl   --http-url "https://xx.xx.org/pub/test()" ":http:"   -vv
2020/06/19 10:52:53 DEBUG : rclone: Version "v1.52.0-001-g1cceadaf-beta" starting with parameters ["rclone" "lsl" "--http-url" "https://xx.xx.org/pub/test()" ":http:" "-vv"]
2020/06/19 10:52:53 DEBUG : Using config file from "/home/xx/.rclone.conf"
161139110 2020-06-10 13:42:46.000000000 xx.mp4
2020/06/19 10:52:53 DEBUG : 3 go routines active

If they are going to encode the plus sign, then they should decode it.

archive.org creates a link for each ZIP and 7z file showing the file contents; that's why I have the two excludes, and that part works correctly.

The site isn't handling the encoding of the plus sign.

Thank you! :slight_smile:

A + in a URL is commonly treated as encoding for a space, so you have to encode the + as %2B. The spaces can be encoded as %20 too, so using the URL the browser uses works:

rclone ls --http-url "https://archive.org/download/everdrivepack/MegaCD/MegaSD%20MD%2B/" :http:
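
The ambiguity around + can be seen with Python's standard library, which offers both decoding conventions. This is only an illustration of the two conventions; which one a given web server applies to paths is up to that server, and the sample string is made up for the demonstration:

```python
from urllib.parse import unquote, unquote_plus

# Hypothetical sample: a literal "+" followed later by an encoded one.
raw = "MegaSD+MD%2B"

# Path-style decoding: "+" is left alone, "%2B" becomes "+".
path_style = unquote(raw)

# Form-style decoding: "+" becomes a space, "%2B" still becomes "+".
form_style = unquote_plus(raw)

print(repr(path_style))  # 'MegaSD+MD+'
print(repr(form_style))  # 'MegaSD MD+'
```

Sending %2B avoids the question entirely, because both conventions decode it to a plus.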

Rclone could be slightly smarter here: a space is never a valid character in a URL, so it could encode that for you.
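
As a sketch of what that kind of automatic encoding might look like (in Python rather than rclone's Go, and with a hypothetical helper name `encode_url_path`), the path can be percent-encoded while leaving any existing %XX escapes untouched. Note that this sketch encodes the + as well as the space:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url_path(url: str) -> str:
    """Percent-encode the path of a URL, leaving existing %XX escapes alone."""
    parts = urlsplit(url)
    # "/" keeps the path structure; "%" is kept so an already-encoded URL
    # is not encoded a second time.
    path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


typed = "https://archive.org/download/everdrivepack/MegaCD/MegaSD MD+/"
print(encode_url_path(typed))
# https://archive.org/download/everdrivepack/MegaCD/MegaSD%20MD%2B/
```

The trade-off of treating "%" as safe is that a name containing a literal percent sign would be passed through unencoded, so this is only suitable when the input is either fully unencoded (and has no "%") or already encoded.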

I'll investigate further the original problem.

What version of rclone are you using? There have been various bugs in the http backend over the years!

rclone will download the directory if I specify it in the command line, but will not if I specify the parent directory. Anyway, it seems to be a webserver error.

If I enter this URL I get no error:
https://archive.org/download/everdrivepack/MegaCD/MegaSD%20MD%2B/

However, if I enter this URL in the browser or any command-line tool, I get an error:
https://archive.org/download/everdrivepack/MegaCD/MegaSD MD+/

But if I remove the last slash it will work (but redirect me to the backend webserver; this is probably some load balancer...), so this indeed seems like a web server error.
https://archive.org/download/everdrivepack/MegaCD/MegaSD MD+

I'm using the latest build (the one with the refresh-times option)

Strictly speaking this isn't a valid URL so what happens to it is probably undefined!
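
A quick way to see why: RFC 3986 only permits a fixed set of characters in a URI, and a raw space is not among them. The check below is a character-level sketch in Python, not a full URL validator, and the function name is made up for this example:

```python
import re

# Characters RFC 3986 allows in a URI: unreserved, reserved (gen-delims and
# sub-delims), plus "%" to introduce a percent-escape.
_ALLOWED = re.compile(r"[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]*")


def has_only_uri_characters(url: str) -> bool:
    """Return True if every character in url may appear in a URI."""
    return _ALLOWED.fullmatch(url) is not None


print(has_only_uri_characters(
    "https://archive.org/download/everdrivepack/MegaCD/MegaSD MD+/"))       # False
print(has_only_uri_characters(
    "https://archive.org/download/everdrivepack/MegaCD/MegaSD%20MD%2B/"))   # True
```

The first URL fails only because of the space; the literal + is a legal character, which fits with the behaviour being server-dependent.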
