Add support for new kind of index in HTTP backend

What is the problem you are having with rclone?

I'm trying to use the site "https://proxy.hhindex2.workers.dev/" with the HTTP backend, but it seems rclone can't read this kind of index. @ncw, is it possible to add support for this index?

Run the command 'rclone version' and share the full output of the command.

rclone v1.57.0

  • os/version: Microsoft Windows 10 Pro 2009 (64 bit)
  • os/kernel: 10.0.22000.376 (x86_64)
  • os/type: windows
  • os/arch: amd64
  • go/version: go1.17.2
  • go/linking: dynamic
  • go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

HTTP

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone lsd Test:

The rclone config contents with secrets removed.

[Test]
type = http
url = https://proxy.hhindex2.workers.dev/

A log from the command with the -vv flag

PS C:\Users\anjum> rclone lsd Test: -vv
2022/01/19 19:05:34 DEBUG : rclone: Version "v1.57.0" starting with parameters ["C:\\rclone\\rclone.exe" "lsd" "Test:" "-vv"]
2022/01/19 19:05:34 DEBUG : Creating backend with remote "Test:"
2022/01/19 19:05:34 DEBUG : Using config file from "C:\\Users\\anjum\\AppData\\Roaming\\rclone\\rclone.conf"
2022/01/19 19:05:35 DEBUG : 3 go routines active

Sites that rely on JavaScript and other odd things won't work.

You can always file a feature request, but the backlog is huge and the value added here is minimal: it's a very complex thing to fix, and the fix would cover only one site, since every site that does this kind of thing is different.

hello,

there have been a few posts about this same issue, with websites whose URLs contain 0:
by design, these sites do not want users to be able to get a list of all files and then do massive downloads.

though you can write your own script to get a list of files and then run

rclone copy --http-url=https://proxy.hhindex2.workers.dev :http:0:/00.VHDX /home/user01/dest --files-from-raw=/home/user01/rclone/scripts/files.lst -v
INFO  : 00.VHDX: Copied (new)

Thanks for your suggestion. I'm not really a good script writer. If you have a template, can you share it? Also, what is the file format of the --files-from-raw file list?

It looks like the index is created from JavaScript. Here is what is in the page...

      <script>
          window.drive_names = JSON.parse('["Hash Hackers Pro BOT 01","Hash Hackers Temp Torrent Group (Regularly Removed Data)","HH_Anime.ANK-Raws","HH_AudioBooks","HH_Bollywood","HH_Books.Mags","HH_Business","HH_Calibre","HH_CalibreServers.01","HH_Cartoons","HH_Comics.0-Day.01","HH_Course.Categories","HH_Courses","HH_Courses.A-H","HH_Courses.I-Q","HH_Courses.R-Z","HH_Courses.SourceSites","HH_Courses.Udemy.01","HH_Courses-ArchiveSites","HH_Discography","HH_EDU.01","HH_EDU.02","HH_Leaks.Hacks","HH_Leaks.RansomGroups","HH_Library","HH_Music.FLAC","HH_MusicVideos","HH_PlexTV","HH_PornSites","HH_PornStars","HH_PornUNKNOWN","HH_PornZ.ASIAN-FILES","HH_Regional","HH_SiteClones","HH_TheOccult.click","HH_TV.Networks","HH_WebDev","HH.Anime.01","HH.Anime.02","HH.BDMV","HH.Comics.01","HH.Docus.01","HH.Foreign","HH.Games.01","HH.Movies","HH.Music.01","HH.Music.02","HH.Software","HH.Sports","HH.TV.Alpha","HH.TV.Episodes","HH.Z.COLLECTIONS.Resort","HH.Z.EnCodes","HH.Z.Movie.Packs","HH.Z.TO.BE.DELETED","HH.Z.UPLOADS","HH.Z.UPLOADS.FOREIGN","HH.ZZ.11","HH.ZZ.25","HH.ZZ.CLEAN.ME","HH.ZZ.CLEANER","HH.ZZ.CLEANER.FOLDERS","HH.ZZ.Movie.Cleaner","HH.ZZ.Music.Cleaner","HH.ZZ.SITES","HH.ZZ.SubFolders","HH.ZZ.TV.Cleaner","HH.ZZ.UNKNOWN","HH.ZZZ.Subs.Rename","HH.ZZZZ.CHINESE","HH.ZZZZ.Colab.Cloner","HHN_Collection.EDITH"]');
          window.UI = JSON.parse('{"theme":"slate","version":"2.1.2","logo_image":true,"logo_height":"","logo_width":"100px","favicon":"https://cdn.jsdelivr.net/npm/@googledrive/index@2.0.20/images/favicon.ico","logo_link_name":"https://cdn.jsdelivr.net/npm/@googledrive/index@2.0.20/images/bhadoo-cloud-logo-white.svg","fixed_header":false,"header_padding":"60","nav_link_1":"Home","nav_link_3":"Current Path","nav_link_4":"Contact","show_logout_button":true,"fixed_footer":false,"hide_footer":true,"header_style_class":"navbar-dark bg-primary","footer_style_class":"bg-primary","css_a_tag_color":"white","css_p_tag_color":"white","folder_text_color":"white","loading_spinner_class":"text-light","search_button_class":"btn btn-danger","path_nav_alert_class":"alert alert-primary","file_view_alert_class":"alert alert-danger","file_count_alert_class":"alert alert-secondary","contact_link":"https://telegram.dog/Telegram","copyright_year":"2050","company_name":"Bhadoo Cloud","company_link":"https://telegram.dog/Telegram","credit":true,"display_size":true,"display_time":false,"display_download":true,"disable_player":false,"custom_srt_lang":"","disable_video_download":false,"second_domain_for_dl":true,"downloaddomain":"https://stream.proxy.jan-2022-api.workers.dev","videodomain":"https://stream.proxy.december-2021-api-2.workers.dev","poster":"https://cdn.jsdelivr.net/npm/@googledrive/index@2.0.20/images/poster.jpg","audioposter":"https://cdn.jsdelivr.net/npm/@googledrive/index@2.0.20/images/music.jpg","jsdelivr_cdn_src":"https://cdn.jsdelivr.net/npm/@googledrive/index","render_head_md":true,"render_readme_md":true,"display_drive_link":false,"plyr_io_version":"3.6.4","plyr_io_video_resolution":"16:9","unauthorized_owner_link":"https://telegram.dog/Telegram","unauthorized_owner_email":"abuse@telegram.org","arc_code":"jfoY2h19","modified_function_no_use":false}');
      </script>

Rclone doesn't support running JavaScript to build the HTTP pages, so I don't think this is possible with the http backend.

It would be possible for rclone to use the Chrome DevTools interface, which I discovered recently, to drive a headless Chrome browser and render the pages first, but that is probably out of scope for the http backend.
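
To illustrate why the http backend sees nothing to list, here is a minimal Python sketch that pulls the drive names out of the inline script instead of looking for HTML links. The function name `extract_drive_names` and the abbreviated `SAMPLE_HTML` are made up for this example, and the regex assumes the JSON string contains no escaped quotes. It only recovers the top-level drive names quoted above; it does not list files inside them.

```python
import json
import re
from typing import List

# Abbreviated copy of the page source quoted above (three of the drive names).
SAMPLE_HTML = """
<script>
    window.drive_names = JSON.parse('["HH_Books.Mags","HH.Movies","HH.Software"]');
</script>
"""

# Assumption: the list is always written as JSON.parse('...') with no
# escaped single quotes or backslashes inside the string literal.
DRIVE_NAMES_RE = re.compile(
    r"window\.drive_names\s*=\s*JSON\.parse\('(.*?)'\);", re.DOTALL
)


def extract_drive_names(html: str) -> List[str]:
    """Return the list assigned to window.drive_names, or [] if absent."""
    match = DRIVE_NAMES_RE.search(html)
    if match is None:
        return []
    return json.loads(match.group(1))


if __name__ == "__main__":
    print(extract_drive_names(SAMPLE_HTML))
```

The page carries its index as data for a script rather than as `<a href>` links, which is what the http backend expects to parse.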

a simple list of files; each path is relative to the root path given, in this case 0:/00.VHDX

file.lst

Test-VHD.vhdx
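
As a rough template for the script side, the following Python sketch writes such a list, one path per line, in the plain format described above. The helper name `write_file_list` is hypothetical, and how you obtain the paths in the first place is left open, since that depends on the site.

```python
from pathlib import Path
from typing import Iterable


def write_file_list(paths: Iterable[str], list_file: str) -> int:
    """Write one path per line to list_file; return the number of lines written.

    Paths are written unchanged (they should already be relative to the
    root given on the rclone command line); empty entries are skipped.
    """
    lines = [p for p in paths if p]
    text = "".join(line + "\n" for line in lines)
    Path(list_file).write_text(text, encoding="utf-8")
    return len(lines)
```

Calling `write_file_list(["Test-VHD.vhdx"], "file.lst")` would reproduce the one-line list shown above, which can then be passed to rclone with --files-from-raw.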
