HTTP not listing from "directory lister"

I am trying to list files from a website that uses “Directory Lister” to list files,
but sadly rclone isn’t able to list from it.

Would be really glad if anyone could help.

Using rclone version:
rclone v1.45

  • os/arch: windows/amd64
  • go version: go1.11

Looking at the directory lister example

This then links to files like

This won’t work with the current scheme as rclone is expecting the links to look like this

If you extract the names from the directory listings, then rclone can fetch the files for you or mount them or whatever…

eg put this into files.txt


The names can have directories in, but should start from the root


$ rclone --http-url --files-from files.txt lsf :http:

$ rclone -v --http-url --files-from files.txt copy :http: code-copy
2018/12/02 14:06:27 INFO  : Local file system at /tmp/code-copy: Waiting for checks to finish
2018/12/02 14:06:27 INFO  : Local file system at /tmp/code-copy: Waiting for transfers to finish
2018/12/02 14:06:27 INFO  : hello-world.css: Copied (new)
2018/12/02 14:06:27 INFO  : hello-world.c: Copied (new)
2018/12/02 14:06:27 INFO  : Copied (new)
2018/12/02 14:06:27 INFO  : hello-world.html: Copied (new)
2018/12/02 14:06:27 INFO  : 
Transferred:   	       659 / 659 Bytes, 100%, 447 Bytes/s, ETA 0s
Errors:                 0
Checks:                 0 / 0, -
Transferred:            4 / 4, 100%
Elapsed time:        1.4s

Hope that helps!


Thank you so much for your reply.

I had another question.

Can we skip the files that don’t exist on the http remote but are present in the --files-from files.txt?
I’ve extracted all of the files into a text file, but some of the files no longer exist on the remote.
So rclone doesn’t do anything and puts up this error:
Failed to lsf: Stat failed: failed to stat: HTTP Error 404: 404 Not Found

Can I suppress this error and still continue with the files that are present both on the remote and in files.txt?

I think rclone should be doing that already…

In fact it looks like a bug…

Try this (uploaded in 15-30 mins)

You are right,

This does work.

Thanks a lot.

Thanks for testing. I’ll merge that to the latest beta now - it will be there in 15-30 mins.


I’ve done just that: a scrapy spider crawls the website to list all the links and stores them in a file for the remote I am trying to fetch using rclone.
Both the crawler and rclone run on a schedule, once every 6 hours,
but rclone is taking very long to start the copy (about 2-4 hours), depending on the number of entries in the file passed to --files-from.

Without --files-from rclone performs much better.
Is it supposed to take this long to start?
(The file has around 4k-5k entries.)

Rclone version I’ve tried on:
rclone: Version “v1.45-031-ge7684b7e-beta”

Hmm, yes rclone is checking that each file in the --files-from list exists. However it does this in a very inefficient way: one at a time! Rclone should really be parallelising this with --checkers threads at once, like it does everything else.

This would be relatively easy to implement.
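As a rough illustration (in Python, not rclone's actual Go code) of what "checking with --checkers threads at once" means: stat the names concurrently and keep only the ones that exist. `stat_fn` here is a placeholder for whatever the backend uses to stat a file, not an rclone internal.

```python
from concurrent.futures import ThreadPoolExecutor

def filter_existing(stat_fn, names, checkers=8):
    """Stat every name concurrently (like --checkers parallel threads)
    and keep only the names that exist, preserving input order."""
    with ThreadPoolExecutor(max_workers=checkers) as pool:
        exists = list(pool.map(stat_fn, names))
    return [name for name, ok in zip(names, exists) if ok]
```

When each stat is a network round trip, 8 checkers cut the wall-clock time for a 4k-5k entry list to roughly an eighth of the sequential version.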

The code is here:

Can you please make a new issue on github about that and we can have a go at fixing it. Maybe you’d like to help?


I’ve created a new issue,

I don’t think I would be much help right now as I don’t know Go at all.
Currently I am learning the basics of it;
I’ll try to help when I am able to understand what’s going on in the project.

I’ve posted a beta in the issue for you to try :smile:

This issue isn’t a good one for people new to Go as anything involving concurrency is always difficult!


The latest Beta is showing as:

Isn’t it supposed to be:
or something with the version 033?

I might be mistaken.
I apologise if that’s the case.

I’ve encountered a new problem that I really need your help with.

There is an http remote which has many files that I need to copy into a Google Drive remote.
I’ve extracted all the links and stored them in a file named files.txt

files.txt is in this format:


Now the issue is that the server requires those jtokens to authenticate, and if I pass the entire link including the jtoken to a downloader (idm), then the file is downloaded.

Is there a way for rclone to be able to tackle this problem?

The numbers are commits since the version was made so I wouldn’t expect the number to be less.

The latest beta right now is

I don’t think you can do it with the http backend, but you can use rclone copyurl


So do I need to call

rclone copyurl remote:dir1/file1.txt

for each individual file?

I actually have a lot of files (~10k), 200-500 MB each, that need to be transferred, and this process will be very inefficient.

When I use rclone copyurl, --dry-run wasn’t working, and it copied the file anyway.

The speed was very low too:

Transferred:   	   28.406M / 210.926 MBytes, 13%, 339.306 kBytes/s, ETA 9m10s
Errors:                 0
Checks:                 0 / 0, -
Transferred:            0 / 1, 0%
Elapsed time:     1m25.7s

Using a downloader or a browser to download the file gives ~9-10 MBps; if I can achieve that, then transferring the files one by one might become feasible.

My rclone version is:

rclone v1.45-056-g95e52e1a-beta

  • os/arch: windows/amd64
  • go version: go1.11



PS: I really appreciate you taking the time to reply to my queries. Thanks for your help.

Can you please make a new issue on github about this!

What you can do is use the rclone API to call the copyurl

So you’d run an rclone server “rclone rcd --rc-no-auth” then in another window issue

rclone rc operations/copyurl fs=drive: remote=path/to/file.txt url=

This will stop rclone having to make and remake the drive remote.

You can also do them in parallel if you supply “_async=true” to the command. (You might want to pace them a little otherwise it will do all of them at once!)

I’ve created a new issue on github regarding this.

I am not sure I understand how to pace them.

Basically I have a text file with all the links, so do you mean I can run a python script for this,
and have it pass the commands to the rclone server?


Just put a sleep 1 between each command or something like that.
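For example, a minimal pacing script in Python (a sketch only: it assumes the rcd is running with --rc-no-auth on the default port 5572, and that the destination file name can be taken from the last path segment of each link — adapt both to your setup):

```python
import json
import time
import urllib.request

RC_URL = "http://localhost:5572/operations/copyurl"  # default rcd address

def build_payload(url, dest_fs="drive:", dest_dir="incoming"):
    """Build the rc parameters, deriving the destination file name
    from the last path segment of the link (query string stripped,
    so the jtoken doesn't end up in the file name)."""
    name = url.rstrip("/").split("/")[-1].split("?")[0]
    return {"fs": dest_fs, "remote": f"{dest_dir}/{name}", "url": url, "_async": True}

def submit_all(links, pause=1.0):
    """POST one copyurl job per link, sleeping between submissions
    so the rcd isn't handed all of them at once."""
    for link in links:
        data = json.dumps(build_payload(link)).encode()
        req = urllib.request.Request(
            RC_URL, data=data, headers={"Content-Type": "application/json"})
        urllib.request.urlopen(req)  # fire off the async job
        time.sleep(pause)            # pace the submissions
```

Read files.txt, pass its non-empty lines to `submit_all`, and adjust `pause` to however many concurrent jobs the server tolerates.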


Thank you very much, it is working.

I have written a script that does just that.
A few issues I am facing:

  • Extremely slow download rate for each file (10~15 KBytes/s)
  • copyurl copies content even if it already exists in the destination (the rcd server gets interrupted at times, so the script ends up re-copying files that were already copied)


It turns out that the problem I was facing before still persists, and rclone’s total bandwidth usage is around ~2 Mbits/s upload and ~2 Mbits/s download,
this is when I have 10-20 jobs running asynchronously at a time.

Is there a way to increase the download rate for each rclone copyurl?

Also, is it possible to check if the file exists in the destination for rclone copyurl?
Basically, whether the --ignore-existing flag can be used with this method.

I checked the logs: out of 200 files that the script tried copying using copyurl,
180 transfers terminated with this error, after rclone sent a chunk of data:

2019/01/04 21:48:18 DEBUG : dir/file1.txt: Sending chunk 0 length 8388608
2019/01/04 21:48:18 ERROR : dir/file1.txt: Post request put error: Post stream error: stream ID 763; PROTOCOL_ERROR

Can this error be fixed?

With a run time of 6 hours only ~20 files were transferred, with each transfer averaging 10~15 KBytes/sec. :sweat:
Is there anything I can do to improve this?

I really appreciate your help.

Can you try using copyurl to the local disk - how fast does it run then?

How big are the files you are copying?

I think the expectation here should be rclone does some sort of checking on the file, so if the remote file is the same length then it doesn’t copy it.

Implementing --ignore-existing is probably a good idea too.

That appears to be an HTTP2 error.

Can you run one copyurl (ideally with a small text file) which demonstrates the problem with -vv --dump bodies?

It would be helpful to have a log to look at (with -vv) of the transfers.

--ignore-existing doesn’t seem to work with rclone copyurl… :sos:
If it can work, I would be grateful if you could tell me how to send the --ignore-existing flag to the rcd. :raised_hands:t2:

About 200~300 MB each; some are larger, ~600 MB.

It has the same effect when saving to the local disk.
(The website seems to allow only ~150KB/s per connection, while idm manages to download from 16 connections, fetching many parts and later appending them.)
(The ~150KB/s gets distributed among all the files downloading at a time :cold_sweat:)
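For reference, the multi-connection trick idm uses is plain HTTP Range requests: split the content length into parts, fetch each "bytes=start-end" range on its own connection, and append the parts in order. A sketch of the idea (it assumes the server honours Range headers and that you already know the content length; the URL and part count are up to you):

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def split_ranges(size, parts):
    """Split a content length into (start, end) byte ranges, end inclusive,
    matching the HTTP 'Range: bytes=start-end' header format."""
    chunk = size // parts
    ranges = []
    for i in range(parts):
        start = i * chunk
        end = size - 1 if i == parts - 1 else start + chunk - 1
        ranges.append((start, end))
    return ranges

def fetch_part(url, start, end):
    """Fetch one byte range on its own connection."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def download(url, size, parts=16):
    """Fetch all parts in parallel and append them in order."""
    with ThreadPoolExecutor(max_workers=parts) as pool:
        chunks = pool.map(lambda r: fetch_part(url, *r), split_ranges(size, parts))
    return b"".join(chunks)
```

With a ~150KB/s per-connection cap, 16 parallel ranges is how a downloader gets to ~2.4 MB/s on a single file — rclone's copyurl uses one connection per file, which is why pacing more jobs doesn't help the per-file rate.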

I tried with a few text files, and the problem didn’t occur.
I think it only happens when I have scheduled too many async jobs on the rcd.

PS: still appreciate you taking the time to reply.:innocent: