HTTP remote without directory listings

Is it possible to use rclone copy against an HTTP server if I can generate a known file list to work from? All of the images are reachable via their URLs.

My file list looks like this:

cat filez

/d/a/i/d/e/i/5f37fe00-a79b-40ab-921d-eebb4077336b.jpeg

My config file looks like this:

type = google cloud storage
client_id =
client_secret =
project_number = 5299483XXXXX
service_account_file = ./tmpkey.json
object_acl = publicRead
bucket_acl = publicRead
location = us-central1
storage_class = REGIONAL

[TMP_HTTP]
type = http
url = http://localhost:5000/hostedImageRoot/

command:

rclone --config image-uploader.conf copy :http: files/ --files-from filez

I tried --files-from, but I'm getting:

2019/09/12 09:41:49 ERROR : Attempt 1/3 failed with 3 errors and: error listing "": failed to readDir: HTTP Error 403: 403 Forbidden
2019/09/12 09:41:49 ERROR : : error listing: error listing "": failed to readDir: HTTP Error 403: 403 Forbidden
2019/09/12 09:42:06 ERROR : : error reading source directory: error listing "": failed to readDir: HTTP Error 403: 403 Forbidden
2019/09/12 09:42:06 ERROR : Attempt 2/3 failed with 3 errors and: error listing "": failed to readDir: HTTP Error 403: 403 Forbidden
2019/09/12 09:42:06 ERROR : : error listing: error listing "": failed to readDir: HTTP Error 403: 403 Forbidden
2019/09/12 09:42:24 ERROR : : error reading source directory: error listing "": failed to readDir: HTTP Error 403: 403 Forbidden
2019/09/12 09:42:24 ERROR : Attempt 3/3 failed with 3 errors and: error listing "": failed to readDir: HTTP Error 403: 403 Forbidden
2019/09/12 09:42:24 Failed to copy with 3 errors: last error was: error listing "": failed to readDir: HTTP Error 403: 403 Forbidden

You'll need to use --no-traverse as well, to stop the directory listings.

https://rclone.org/docs/#no-traverse
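
As a sketch, using the TMP_HTTP remote from your config as the source (keeping your local files/ directory as the destination), the command would look something like:

rclone --config image-uploader.conf copy TMP_HTTP: files/ --files-from filez --no-traverse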

Thank you ncw,

As always I appreciate the help. I couldn't find that option anywhere! :slight_smile:

A question for you: we have many customers who store images in a shared HTTP directory. They recently disabled listings, but we can generate a file list of all of the images they have. What's troubling is that I see this in the documentation:

If you are only copying a small number of files (or are filtering most of the files) and/or have a large number of files on the destination then --no-traverse will stop rclone listing the destination and save time.

However, if you are copying a large number of files, especially if you are doing a copy where lots of the files under consideration haven’t changed and won’t need copying then you shouldn’t use --no-traverse.

Some of these users have a million-plus files, and we only want to upload new/missing files from local to the Google Cloud Storage remote.

Do you have any suggestions for keeping memory usage low and speeds high while still keeping the files in a somewhat synced state?

Matching on filename might work, or even file size, and I'm not sure whether checksums are supported on the HTTP side of the copy.

Using --no-traverse with the http backend doesn't really cost extra, as rclone has to do a HEAD request on each file anyway, even when doing listings.

So I wouldn't worry about it.

I think your --files-from + --no-traverse combination will work as efficiently as anything else!

With the http backend, rclone will check file size and modification date. Checksums aren't supported.
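
As a rough sketch of the upload side (the GCS remote name and bucket below are placeholders, since the GCS section name isn't shown in the config above; swap in a local path as the source if you stage the files locally first):

rclone --config image-uploader.conf copy TMP_HTTP: GCS:my-bucket --files-from filez --no-traverse

With this, rclone only considers the paths in filez, HEADs each one on the HTTP side for size and modtime, looks up the matching object in the bucket, and transfers only what is new or changed. If you really only want new/missing files and never want to re-copy changed ones, --ignore-existing skips anything already present on the destination without checking it further.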
