Archive.org (internetarchive) - Issue setting metadata when uploading

Hello all! Thanks for any help!

What is the problem you are having with rclone?

I'm having rclone upload .m4a audio files (I've tried .mp3, too - same issue) to archive.org. The files upload, but for some reason rclone won't set any of the metadata that I tell it to pass. I explain more in my log section.

Run the command 'rclone version' and share the full output of the command.

rclone v1.66.0

  • os/version: Microsoft Windows Server 2019 Standard 1809 (64 bit)
  • os/kernel: 10.0.17763.5329 (x86_64)
  • os/type: windows
  • os/arch: amd64
  • go/version: go1.22.1
  • go/linking: static
  • go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

Internet Archive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone.exe copy "E:\this_is_my_audio_file_1.m4a" "internetarchive_config:" --metadata --metadata-set creator="It's me" --metadata-set mediatype=audio --metadata-set collection="test collection" --metadata-set title="It's my title" --config "rcloneconf.conf"

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[internetarchive_config]
type = internetarchive
access_key_id = XXX
secret_access_key = XXX

A log from the command that you were trying to run with the -vv flag

I added --dump headers too because it shows the metadata values. You can see that rclone passes my supplied metadata as X-Amz-Filemeta-* headers, but I think they might need to be X-Archive-Meta-*. It also passes X-Archive-Meta-Mediatype: data, and the file's mediatype is set to data when it's uploaded, not audio like I specified. Nothing else that I specify (title, collection, creator) is set; the item just gets whatever archive.org defaults to. I'm sure I'm missing something, but I haven't found anything in the docs yet.

2024/03/22 17:53:10 DEBUG : MetadataUpload map[collection:test collection creator:It's me mediatype:audio title:It's my title]
2024/03/22 17:53:10 DEBUG : rclone: Version "v1.66.0" starting with parameters ["C:\\Program Files (x86)\\VideoLAN\\VLC\\rclone-v1.61.1-windows-amd64\\rclone.exe" "copy" "E:\\this_is_my_audio_file_1.m4a" "internetarchive_config:" "--metadata" "--metadata-set" "creator=It's me" "--metadata-set" "mediatype=audio" "--metadata-set" "collection=test collection" "--metadata-set" "title=It's my title" "--config" ".\\rclone-v1.61.1-windows-amd64\\rcloneconf.conf" "-vv" "--dump" "headers"]
2024/03/22 17:53:10 DEBUG : Creating backend with remote "E:\\this_is_my_audio_file_1.m4a"
2024/03/22 17:53:10 DEBUG : Using config file from "C:\\Program Files (x86)\\VideoLAN\\VLC\\rclone-v1.61.1-windows-amd64\\rcloneconf.conf"
2024/03/22 17:53:10 DEBUG : fs cache: adding new entry for parent of "E:\\this_is_my_audio_file_1.m4a", "//?/E:/"
2024/03/22 17:53:10 DEBUG : Creating backend with remote "internetarchive_config:"
2024/03/22 17:53:10 DEBUG : You have specified to dump information. Please be noted that the Accept-Encoding as shown may not be correct in the request and the response may not show Content-Encoding if the go standard libraries auto gzip encoding was in effect. In this case the body of the request will be gunzipped before showing it.
2024/03/22 17:53:10 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/03/22 17:53:10 DEBUG : HTTP REQUEST (req 0xc0008d0a20)
2024/03/22 17:53:10 DEBUG : GET /metadata/this_is_my_audio_file_1.m4a HTTP/1.1
Host: archive.org
User-Agent: rclone/v1.66.0
Authorization: XXXX
Accept-Encoding: gzip

2024/03/22 17:53:10 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/03/22 17:53:13 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/03/22 17:53:13 DEBUG : HTTP RESPONSE (req 0xc0008d0a20)
2024/03/22 17:53:13 DEBUG : HTTP/2.0 200 OK
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Fri, 22 Mar 2024 22:53:17 GMT
Referrer-Policy: no-referrer-when-downgrade
Server: nginx/1.25.1
Strict-Transport-Security: max-age=15724800
Vary: Accept-Encoding

2024/03/22 17:53:13 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/03/22 17:53:13 DEBUG : this_is_my_audio_file_1.m4a: Need to transfer - File not found at Destination
2024/03/22 17:53:13 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/03/22 17:53:13 DEBUG : HTTP REQUEST (req 0xc0008d1e60)
2024/03/22 17:53:13 DEBUG : PUT /this_is_my_audio_file_1.m4a HTTP/1.1
Host: s3.us.archive.org
User-Agent: rclone/v1.66.0
Content-Length: 4439991
Authorization: XXXX
X-Amz-Auto-Make-Bucket: 1
X-Amz-Filemeta-Atime: 2024-03-22T17:49:37.3861455-05:00
X-Amz-Filemeta-Btime: 2024-03-22T17:49:36.6350592-05:00
X-Amz-Filemeta-Collection: test collection
X-Amz-Filemeta-Creator: It's me
X-Amz-Filemeta-Mediatype: audio
X-Amz-Filemeta-Mode: 666
X-Amz-Filemeta-Rclone-Mtime: 2024-03-22T17:42:32.7018067-05:00
X-Amz-Filemeta-Rclone-Update-Track: kucahor5jipiwum5jamovic0lepemin5
X-Amz-Filemeta-Title: It's my title
X-Archive-Auto-Make-Bucket: 1
X-Archive-Cascade-Delete: 1
X-Archive-Keep-Old-Version: 0
X-Archive-Meta-Mediatype: data
X-Archive-Queue-Derive: 0
X-Archive-Size-Hint: 4439991
Accept-Encoding: gzip

2024/03/22 17:53:13 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/03/22 17:54:07 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/03/22 17:54:07 DEBUG : HTTP RESPONSE (req 0xc0008d1e60)
2024/03/22 17:54:07 DEBUG : HTTP/1.1 200 Ok
Connection: close
Content-Length: 0
Accept-Ranges: bytes
Access-Control-Allow-Headers: authorization,x-amz-acl,x-amz-auto-make-bucket,cache-control,x-requested-with,x-file-name,x-file-size,x-archive-ignore-preexisting-bucket,x-archive-interactive-priority,x-archive-meta-title,x-archive-meta-description,x-archive-meta-language,x-archive-meta-mediatype,x-archive-meta01-subject,x-archive-meta02-subject,x-archive-meta03-subject,x-archive-meta04-subject,x-archive-meta05-subject,x-archive-meta01-collection,x-archive-meta02-collection
Access-Control-Allow-Methods: GET,POST,PUT,DELETE
Access-Control-Allow-Origin: *
Date: Fri, 22 Mar 2024 22:53:18 GMT
Server: Apache/2.4.41 (Ubuntu)

2024/03/22 17:54:07 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/03/22 17:54:07 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/03/22 17:54:07 DEBUG : HTTP REQUEST (req 0xc000a12900)
2024/03/22 17:54:07 DEBUG : GET /metadata/this_is_my_audio_file_1.m4a HTTP/1.1
Host: archive.org
User-Agent: rclone/v1.66.0
Authorization: XXXX
Accept-Encoding: gzip

2024/03/22 17:54:07 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/03/22 17:54:07 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/03/22 17:54:07 DEBUG : HTTP RESPONSE (req 0xc000a12900)
2024/03/22 17:54:07 DEBUG : HTTP/2.0 200 OK
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Fri, 22 Mar 2024 22:54:11 GMT
Referrer-Policy: no-referrer-when-downgrade
Server: nginx/1.25.1
Strict-Transport-Security: max-age=15724800
Vary: Accept-Encoding

2024/03/22 17:54:07 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/03/22 17:54:07 DEBUG : this_is_my_audio_file_1.m4a: Dst hash empty - aborting Src hash check
2024/03/22 17:54:07 INFO  : this_is_my_audio_file_1.m4a: Copied (new)
2024/03/22 17:54:07 INFO  :
Transferred:        4.234 MiB / 4.234 MiB, 100%, 74.828 KiB/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed time:        57.2s

2024/03/22 17:54:07 DEBUG : 4 go routines active
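
To make the split in the PUT request above easier to see, here is a minimal Python sketch that groups header names by prefix. The group labels, and the idea that archive.org reads item metadata only from `X-Archive-Meta-*` headers, are assumptions based on the dump and not confirmed behaviour. `classify_headers` is a hypothetical helper, not part of rclone.

```python
# Hypothetical helper: group request headers from the dump by prefix.
# Assumption (not confirmed in this thread): archive.org takes item-level
# metadata from X-Archive-Meta-* headers, not from X-Amz-Filemeta-*.

def classify_headers(headers):
    """Return a dict mapping a group label to the sorted header names in it."""
    groups = {"item_meta": [], "file_meta": [], "other": []}
    for name in headers:
        lower = name.lower()
        if lower.startswith("x-archive-meta"):
            groups["item_meta"].append(name)
        elif lower.startswith(("x-amz-filemeta-", "x-archive-filemeta-")):
            groups["file_meta"].append(name)
        else:
            groups["other"].append(name)
    return {label: sorted(names) for label, names in groups.items()}


# A few headers copied from the PUT request in the log above.
put_headers = {
    "X-Amz-Filemeta-Collection": "test collection",
    "X-Amz-Filemeta-Creator": "It's me",
    "X-Amz-Filemeta-Mediatype": "audio",
    "X-Amz-Filemeta-Title": "It's my title",
    "X-Archive-Meta-Mediatype": "data",
    "X-Archive-Queue-Derive": "0",
}

for label, names in classify_headers(put_headers).items():
    print(label, names)
```

With these inputs, the only header in the `item_meta` group is `X-Archive-Meta-Mediatype`, whose value is `data`, which matches what ended up on the item.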

The X-Amz-* headers are Amazon S3 standard and should be supported by the Internet Archive.

Can you try setting your own metadata but omit the --metadata flag?

Thanks for replying! I ran it without --metadata. It's the same test file, I just gave it a different name. I tried to set the creator, mediatype, collection, and title again.

rclone.exe copy "E:\this_is_my_audio_file_3.m4a" "internetarchive_config:" --metadata-set creator="It's me" --metadata-set mediatype=audio --metadata-set collection="test collection" --metadata-set title="It's my title" --config "rcloneconf.conf" -vv --dump headers

Still no change. Here's the metadata page for that file: https://archive.org/metadata/this_is_my_audio_file_3.m4a

Here's the -vv --dump headers output:

2024/03/23 07:10:27 DEBUG : rclone: Version "v1.66.0" starting with parameters ["C:\\Program Files (x86)\\VideoLAN\\VLC\\rclone-v1.61.1-windows-amd64\\rclone.exe" "copy" "E:\\this_is_my_audio_file_3.m4a" "internetarchive_config:" "--metadata-set" "creator=It's me" "--metadata-set" "mediatype=audio" "--metadata-set" "collection=test collection" "--metadata-set" "title=It's my title" "--config" ".\\rclone-v1.61.1-windows-amd64\\rcloneconf.conf" "-vv" "--dump" "headers"]
2024/03/23 07:10:27 DEBUG : Creating backend with remote "E:\\this_is_my_audio_file_3.m4a"
2024/03/23 07:10:27 DEBUG : Using config file from "C:\\Program Files (x86)\\VideoLAN\\VLC\\rclone-v1.61.1-windows-amd64\\rcloneconf.conf"
2024/03/23 07:10:27 DEBUG : fs cache: adding new entry for parent of "E:\\this_is_my_audio_file_3.m4a", "//?/E:/"
2024/03/23 07:10:27 DEBUG : Creating backend with remote "internetarchive_config:"
2024/03/23 07:10:27 DEBUG : You have specified to dump information. Please be noted that the Accept-Encoding as shown may not be correct in the request and the response may not show Content-Encoding if the go standard libraries auto gzip encoding was in effect. In this case the body of the request will be gunzipped before showing it.
2024/03/23 07:10:27 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/03/23 07:10:27 DEBUG : HTTP REQUEST (req 0xc000994000)
2024/03/23 07:10:27 DEBUG : GET /metadata/this_is_my_audio_file_3.m4a HTTP/1.1
Host: archive.org
User-Agent: rclone/v1.66.0
Authorization: XXXX
Accept-Encoding: gzip

2024/03/23 07:10:27 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/03/23 07:10:31 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/03/23 07:10:31 DEBUG : HTTP RESPONSE (req 0xc000994000)
2024/03/23 07:10:31 DEBUG : HTTP/2.0 200 OK
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Sat, 23 Mar 2024 12:10:35 GMT
Referrer-Policy: no-referrer-when-downgrade
Server: nginx/1.25.1
Strict-Transport-Security: max-age=15724800
Vary: Accept-Encoding

2024/03/23 07:10:31 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/03/23 07:10:31 DEBUG : this_is_my_audio_file_3.m4a: Need to transfer - File not found at Destination
2024/03/23 07:10:31 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/03/23 07:10:31 DEBUG : HTTP REQUEST (req 0xc000994000)
2024/03/23 07:10:31 DEBUG : PUT /this_is_my_audio_file_3.m4a HTTP/1.1
Host: s3.us.archive.org
User-Agent: rclone/v1.66.0
Content-Length: 4439991
Authorization: XXXX
X-Amz-Auto-Make-Bucket: 1
X-Amz-Filemeta-Rclone-Mtime: 2024-03-22T17:42:32.7018067-05:00
X-Amz-Filemeta-Rclone-Update-Track: zupusuz0rizewef2qevewib2vemoxus1
X-Archive-Auto-Make-Bucket: 1
X-Archive-Cascade-Delete: 1
X-Archive-Keep-Old-Version: 0
X-Archive-Meta-Mediatype: data
X-Archive-Queue-Derive: 0
X-Archive-Size-Hint: 4439991
Accept-Encoding: gzip

2024/03/23 07:10:31 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/03/23 07:11:26 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/03/23 07:11:26 DEBUG : HTTP RESPONSE (req 0xc000994000)
2024/03/23 07:11:26 DEBUG : HTTP/1.1 200 Ok
Connection: close
Content-Length: 0
Accept-Ranges: bytes
Access-Control-Allow-Headers: authorization,x-amz-acl,x-amz-auto-make-bucket,cache-control,x-requested-with,x-file-name,x-file-size,x-archive-ignore-preexisting-bucket,x-archive-interactive-priority,x-archive-meta-title,x-archive-meta-description,x-archive-meta-language,x-archive-meta-mediatype,x-archive-meta01-subject,x-archive-meta02-subject,x-archive-meta03-subject,x-archive-meta04-subject,x-archive-meta05-subject,x-archive-meta01-collection,x-archive-meta02-collection
Access-Control-Allow-Methods: GET,POST,PUT,DELETE
Access-Control-Allow-Origin: *
Date: Sat, 23 Mar 2024 12:10:36 GMT
Server: Apache/2.4.41 (Ubuntu)

2024/03/23 07:11:27 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/03/23 07:11:27 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/03/23 07:11:27 DEBUG : HTTP REQUEST (req 0xc000994900)
2024/03/23 07:11:27 DEBUG : GET /metadata/this_is_my_audio_file_3.m4a HTTP/1.1
Host: archive.org
User-Agent: rclone/v1.66.0
Authorization: XXXX
Accept-Encoding: gzip

2024/03/23 07:11:27 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2024/03/23 07:11:27 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/03/23 07:11:27 DEBUG : HTTP RESPONSE (req 0xc000994900)
2024/03/23 07:11:27 DEBUG : HTTP/2.0 200 OK
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Sat, 23 Mar 2024 12:11:31 GMT
Referrer-Policy: no-referrer-when-downgrade
Server: nginx/1.25.1
Strict-Transport-Security: max-age=15724800
Vary: Accept-Encoding

2024/03/23 07:11:27 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2024/03/23 07:11:27 DEBUG : this_is_my_audio_file_3.m4a: Dst hash empty - aborting Src hash check
2024/03/23 07:11:27 INFO  : this_is_my_audio_file_3.m4a: Copied (new)
2024/03/23 07:11:27 INFO  :
Transferred:        4.234 MiB / 4.234 MiB, 100%, 69.523 KiB/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed time:       1m0.6s

2024/03/23 07:11:27 INFO  :
Transferred:        4.234 MiB / 4.234 MiB, 100%, 69.523 KiB/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed time:       1m0.6s

2024/03/23 07:11:27 DEBUG : 4 go routines active

It seems to be the same issue as, or a similar one to, this post: "How to add metadata in Internet Archive (Archive.org)?". That one was on a different OS and an older version of rclone.

But it looks like the OP stopped responding before it got resolved.

As in the thread you linked, everything looks OK, but indeed there is no metadata.

Maybe the original author of this remote could help us?

@Lesmiscore - could you please have a look?

Hmm, that gives me an idea. I wonder if these should be URL encoded. S3 doesn't appear to need it, but maybe the Internet Archive does.

I also wonder if the local backend metadata is confusing things.

Can you try copying the file from a backend which doesn't support metadata (choose one from here)?

And also try uploading with your metadata with nothing but A-Za-z characters so no spaces or punctuation?
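
As a rough illustration of the URL-encoding idea, the sketch below percent-encodes the metadata values from the original command using Python's standard `urllib.parse.quote`. Whether the Internet Archive actually expects encoded values is exactly the open question here, so treat this only as a way to produce test inputs. `encode_values` is a hypothetical helper.

```python
from urllib.parse import quote


def encode_values(metadata):
    """Percent-encode every value; keys are left unchanged."""
    # safe="" means "/" is encoded too, so only unreserved characters
    # and %XX escapes remain in the result.
    return {key: quote(value, safe="") for key, value in metadata.items()}


# Values taken from the original rclone command in this thread.
metadata = {
    "creator": "It's me",
    "collection": "test collection",
    "title": "It's my title",
}

print(encode_values(metadata))
```

Here the spaces become `%20` and the apostrophes become `%27`, so the values contain none of the characters that were suspected of causing trouble.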

I uploaded the file to ProtonDrive and then copied it from ProtonDrive.

rclone.exe copy protondrive_config:"audiofile6.m4a" "E:\" --config "rcloneconf.conf" -vv --dump headers

2024/03/24 09:56:07 DEBUG : Creating backend with remote "protondrive_config:audiofile6.m4a"
2024/03/24 09:56:07 DEBUG : Using config file from "C:\\Program Files (x86)\\VideoLAN\\VLC\\rclone-v1.61.1-windows-amd64\\rcloneconf.conf"
2024/03/24 09:56:07 DEBUG : proton drive root link ID 'audiofile6.m4a': Has cached credentials
2024/03/24 09:56:10 DEBUG : proton drive root link ID 'audiofile6.m4a': Used cached credential to initialize the ProtonDrive API
2024/03/24 09:56:12 DEBUG : fs cache: adding new entry for parent of "protondrive_config:audiofile6.m4a", "protondrive_config:"
2024/03/24 09:56:12 DEBUG : Creating backend with remote "E:\\"
2024/03/24 09:56:12 DEBUG : fs cache: renaming cache item "E:\\" to be canonical "//?/E:/"
2024/03/24 09:56:13 DEBUG : audiofile6.m4a: Need to transfer - File not found at Destination
2024/03/24 09:56:18 DEBUG : audiofile6.m4a: sha1 = 2e5d7600a7d5800abf717a6b196dfd9a572eb005 OK
2024/03/24 09:56:18 DEBUG : audiofile6.m4a.naburoj1.partial: renamed to: audiofile6.m4a
2024/03/24 09:56:18 INFO  : audiofile6.m4a: Copied (new)
2024/03/24 09:56:18 INFO  :
Transferred:        4.234 MiB / 4.234 MiB, 100%, 0 B/s, ETA -
Transferred:            1 / 1, 100%
Elapsed time:        11.1s

If I've misunderstood what you're looking for, let me know and I'll try something else!

Sure:

rclone.exe copy "E:\audiofile7.m4a" internetarchive_config: --metadata --metadata-set creator=Me --metadata-set mediatype=audio --metadata-set collection=TestCollection --metadata-set title=MyTitle --config "rcloneconf.conf"

No change, unfortunately: https://archive.org/metadata/audiofile7.m4a
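
To check results like this without reading the metadata page by hand, a minimal Python sketch could compare the expected fields against the `metadata` object in the JSON returned by the https://archive.org/metadata/ endpoint. The helper names are hypothetical, and the sample dictionary is made up for illustration so the sketch runs offline; it is not the real response for this item.

```python
import json
from urllib.request import urlopen


def fetch_item_metadata(identifier):
    """Download https://archive.org/metadata/<identifier> and return its "metadata" object."""
    with urlopen(f"https://archive.org/metadata/{identifier}") as response:
        return json.load(response).get("metadata", {})


def missing_fields(expected, actual):
    """Return the expected keys whose value is absent from, or different in, actual."""
    return sorted(key for key, value in expected.items() if actual.get(key) != value)


# Fields set with --metadata-set in the command above.
expected = {"mediatype": "audio", "title": "MyTitle", "creator": "Me"}

# Made-up sample standing in for a real response (assumption, not real data).
sample = {"identifier": "audiofile7.m4a", "mediatype": "data"}

print(missing_fields(expected, sample))
```

For a live check, `missing_fields(expected, fetch_item_metadata("audiofile7.m4a"))` would run the same comparison against the real item.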

Hi, I'm not sure it applies in your case, but you might test using --header-upload.

I couldn't get --header-upload to work, but --header is working like a champ!

rclone.exe copy "E:\audiofile12.m4a" internetarchive_config: --metadata --header "X-Archive-Meta-Mediatype: audio" --header "X-Archive-Meta-title: Hey This Works" --header "X-Archive-Meta-Creator: It's Me" --config "rcloneconf.conf" -vv --dump headers

https://archive.org/metadata/audiofile12.m4a

I'll play around with it more to get it to do what I need it to do. That was a good shout! Thanks!
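
For scripting this workaround across many files, a minimal Python sketch of building the `--header` arguments from a dictionary might look like the following. The `X-Archive-Meta-` prefix is taken from the working command above. `build_header_args` is a hypothetical helper, and the script only prints the argument list rather than running rclone.

```python
def build_header_args(metadata):
    """Return rclone arguments that pass each metadata pair as a --header flag."""
    args = []
    for key, value in metadata.items():
        args += ["--header", f"X-Archive-Meta-{key}: {value}"]
    return args


# Values from the working command above.
metadata = {"mediatype": "audio", "title": "Hey This Works", "creator": "It's Me"}

# File and remote names are the ones used in this thread.
command = ["rclone", "copy", r"E:\audiofile12.m4a", "internetarchive_config:"]
command += build_header_args(metadata)

print(command)
```

Passing a list like this to `subprocess.run` avoids shell quoting problems with values that contain apostrophes or spaces. Keep in mind that rclone's `--header` flag adds the header to every HTTP request it makes, not only to the upload.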

According to the docs I found, it should be X-Archive-Meta or x-amz-meta.

We seem to be using a whole lot of things which aren't that:

"x-archive-filemeta-sha1":    srcObj.sha1,
"x-archive-filemeta-md5":     srcObj.md5,
"x-archive-filemeta-crc32":   srcObj.crc32,
"x-archive-filemeta-size":    fmt.Sprint(srcObj.size),
"x-archive-filemeta-rclone-mtime":        srcObj.modTime.Format(time.RFC3339Nano),
"x-archive-filemeta-rclone-update-track": updateTracker,
"x-amz-filemeta-rclone-mtime":        modTime.Format(time.RFC3339Nano),
"x-amz-filemeta-rclone-update-track": updateTracker,
"x-amz-filemeta-%s", mk)] = mv

The string filemeta appears nowhere in those docs.

Can anyone find a better pointer into official docs?

@Lesmiscore can you throw any light?
