What is the problem you are having with rclone?
I am trying to understand the behavior of the wait_archive option with the internetarchive remote. Based on the documentation, my expectation is that rclone write operations will wait for archive book_op.php and archive.php operations to finish. However, I am not entirely sure what "wait" means in this context.
The specific situation that we are experiencing is with an rclone mount volume wherein newly written files (which may take hours to process) can cycle out of vfs cache due to normal turnover reads of other files in the same remote, before they become available for reads. Additionally, any vfs directory refresh while in flight will essentially remove these files as they are "not present" on the remote.
I realize that the internetarchive backend is... unique in several ways, which we are doing our best to accommodate. My initial approach was to apply the wait_archive parameter with some reasonable-ish values in hours (12h, for example); however, even setting it to 1h for a file that takes far more than that didn't seem to have any effect.
The ideal behavior for us would be that files are retained locally (in vfs cache) until they are confirmed as "committed" (aka in directory listing) on the remote, at which point they are treated as normal vfs entities. I realize that this can potentially cause irreconcilable cache contention if remote processing time is unbounded, but we are looking for best effort approximations here.
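To make the desired behavior concrete, here is a rough Python sketch of the retention rule we have in mind. It is not rclone code and does not reflect how the vfs cache is implemented: the PendingUploads class and its method names are hypothetical, and the remote listing is assumed to arrive as a plain iterable of file names. The max_wait_s bound is the "best effort" part, so that unbounded remote processing cannot pin files forever.

```python
import time
from typing import Callable, Dict, Iterable


class PendingUploads:
    """Hypothetical tracker for uploaded files that should stay in local cache."""

    def __init__(self, max_wait_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_wait_s = max_wait_s
        self._clock = clock
        # file name -> time after which we give up pinning it
        self._deadlines: Dict[str, float] = {}

    def add(self, name: str) -> None:
        """Pin a freshly uploaded file until it is listed or the wait expires."""
        self._deadlines[name] = self._clock() + self._max_wait_s

    def refresh(self, remote_listing: Iterable[str]) -> None:
        """Unpin files that now appear on the remote or whose wait has expired."""
        listed = set(remote_listing)
        now = self._clock()
        for name in list(self._deadlines):
            if name in listed or now >= self._deadlines[name]:
                del self._deadlines[name]

    def is_pinned(self, name: str) -> bool:
        """True while the file must not be evicted or dropped by a dir refresh."""
        return name in self._deadlines
```

In this model, cache eviction and directory refresh would both skip any file for which is_pinned returns True, and treat it as a normal vfs entry afterwards.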
To summarize, I'd like to understand:
- what does wait_archive actually do?
- are files in "wait" state treated in any special way by vfs?
- is there any way to avoid wiping local files in processing state by a vfs directory refresh or being forced out of cache?
Thank you!
Run the command 'rclone version' and share the full output of the command.
rclone v1.67.0
- os/version: ubuntu 22.04 (64 bit)
- os/kernel: 5.15.0-122-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.22.4
- go/linking: static
- go/tags: none
Which cloud storage system are you using? (eg Google Drive)
Internet Archive
The command you were trying to run (eg rclone copy /tmp remote:tmp)
typically used with rclone mount, more details above
Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.
[ia]
type = internetarchive
access_key_id = XXX
secret_access_key = XXX
wait_archive = 1h0m0s
description = backend for sector-store
headers = x-archive-queue-derive,0
[ia-fil-f02011071-sectors-all]
type = union
upstreams = ia:ia-fil-f02011071-sectors-0/data ia:ia-fil-f02011071-sectors-1/data ia:ia-fil-f02011071-sectors-2/data ia:ia-fil-f02011071-sectors-3/data ia:ia-fil-f02011071-sectors-4/data ia:ia-fil-f02011071-sectors-5/data ia:ia-fil-f02011071-sectors-6/data ia:ia-fil-f02011071-sectors-7/data ia:ia-fil-f02011071-sectors-8/data ia:ia-fil-f02011071-sectors-9/data ia:ia-fil-f02011071-sectors-10/data ia:ia-fil-f02011071-sectors-11/data ia:ia-fil-f02011071-sectors-12/data ia:ia-fil-f02011071-sectors-13/data (snip)
create_policy = eplus
description = main remote for storage operations
A log from the command that you were trying to run with the -vv flag
no relevant logs observed (part of the question)