What is the problem you are having with rclone?
We are happily using rclone to back up our (on-site) object storage to an off-site (Ceph-based) location.
For most of our buckets we haven't had any issues; however, for one of our logging buckets we have a relatively short expiry rule:
❯ mc ilm rule ls prod/infra-acpt-logging-loki-303 --json
{
  "status": "success",
  "target": "prod/infra-acpt-logging-loki-303",
  "config": {
    "Rules": [
      {
        "Expiration": {
          "ExpiredObjectDeleteMarker": true
        },
        "ID": "expire-after-1day",
        "NoncurrentVersionExpiration": {
          "NoncurrentDays": 1
        },
        "Status": "Enabled"
      }
    ]
  },
  "updatedAt": "2025-01-28T08:38:48Z"
}
The reason for this short expiry rule is that this is logging data, and the system itself (in this case LokiStack) has its own retention rules for when data can be removed (for us, 90 days), so it does not make sense to keep these objects any longer than that.
However, because of this rule it can sometimes happen that rclone lists files at the start of the job which are then deleted/expired while the job is still running. This results in a log entry like this:
2025/09/04 20:38:43 ERROR : infrastructure/1a6a923d9571e693/19744cc7ae7:197453bdba4:645b0269: Failed to copy: failed to open source object: operation error S3: GetObject, https response error StatusCode: 404, RequestID: 1862284EC7F6EE1E, HostID: 5d4e4d0f6fc859fe0f0c9ba35f218284c3f7dd583372659a5ce994e609e5dbc4, NoSuchKey:
When we check this particular object (via MinIO's mc), we see:
❯ mc ls prod/infra-acpt-logging-loki-303/infrastructure/1a6a923d9571e693/19744cc7ae7:197453bdba4:645b0269 --versions
[2025-09-04 20:37:45 CEST] 0B STANDARD 765269d4-a030-4727-b339-0573e65fbe75 v2 DEL 19744cc7ae7:197453bdba4:645b0269
[2025-06-06 14:34:05 CEST] 11KiB STANDARD 19260aa0-2d9c-47dc-ae57-3a85a9d981a8 v1 PUT 19744cc7ae7:197453bdba4:645b0269
It seems that the object was deleted just before rclone was about to transfer it. This has happened multiple times during this rclone job. Because the job is supposed to retry on error, it starts the sync over and then runs into other (newly) expired objects, effectively ending up in a loop. Because of this, the job was stuck for a long time, over the weekend.
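For now, the only workaround we could think of (untested, and it would only limit the damage rather than fix the race itself) is to bound the retries and the overall run time, roughly like this:
# Untested sketch: meant to stop the job looping all weekend, not to fix the race itself.
# --retries 1 disables the whole-sync retry loop; --max-duration caps a single run.
rclone sync source:"infra-acpt-logging-loki-303"/ target:"infra-acpt-logging-loki-303"/ \
  --retries 1 \
  --max-duration 6h \
  [... remaining flags as in the command below ...]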
Basically, our question is:
How can we make sure something like this won't happen? Are there any flags that should be enabled or disabled to prevent this behavior?
Run the command 'rclone version' and share the full output of the command.
rclone v1.70.2
- os/version: alpine 3.22.0 (64 bit)
- os/kernel: 6.8.0-79-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.24.4
- go/linking: static
- go/tags: none
Which cloud storage system are you using? (eg Google Drive)
MinIO to Ceph RGW
The command you were trying to run (eg rclone copy /tmp remote:tmp)
rclone sync --config /config/rclone.conf source:"infra-acpt-logging-loki-303"/ target:"infra-acpt-logging-loki-303"/ --retries=3 --low-level-retries 10 --log-level=NOTICE --use-mmap --list-cutoff=100000 --progress --stats 1m --stats-log-level=ERROR --metadata --transfers=50 --checkers=8 --checksum --s3-use-multipart-etag=true --multi-thread-cutoff=256Mi --s3-chunk-size=5Mi
Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.
We are using env variables to set it, but it basically looks like this:
[minio]
type = s3
provider = minio
access_key_id = xxx
secret_access_key = xxx
endpoint = xxx
region = ""
[ceph]
type = s3
provider = Ceph
access_key_id = xxx
secret_access_key = xxx
endpoint = xxx
sse_customer_algorithm = xxx
sse_customer_key_base64 = xxx
sse_customer_key_md5 = xxx
region = ""
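To be more precise, the env variables follow rclone's RCLONE_CONFIG_<REMOTE>_<OPTION> naming, roughly like the sketch below (placeholder values only):
# Sketch of the env-variable form of the same config; values are placeholders,
# the real secrets are injected by our deployment:
export RCLONE_CONFIG_MINIO_TYPE=s3
export RCLONE_CONFIG_MINIO_PROVIDER=minio
export RCLONE_CONFIG_MINIO_ACCESS_KEY_ID=xxx
export RCLONE_CONFIG_MINIO_SECRET_ACCESS_KEY=xxx
export RCLONE_CONFIG_MINIO_ENDPOINT=xxx
# ... and likewise RCLONE_CONFIG_CEPH_* for the Ceph remote, including the sse_customer_* options.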
A log from the command that you were trying to run with the -vv flag
[2025-09-04 20:10:03 CEST] INFO: START rclone sync from https://xxx.xxx.xxx.xxx/infra-acpt-logging-loki-303 to https://xxx.xxx.xxx/infra-acpt-logging-loki-303
[2025-09-04 20:10:03 CEST] INFO: Executing command: rclone sync --config /config/rclone.conf source:"infra-acpt-logging-loki-303"/ target:"infra-acpt-logging-loki-303"/ --retries=3 --low-level-retries 10 --log-level=NOTICE --use-mmap --list-cutoff=100000 --progress --stats 1m --stats-log-level=ERROR --metadata --transfers=50 --checkers=8 --checksum --s3-use-multipart-etag=true --multi-thread-cutoff=256Mi --s3-chunk-size=5Mi
...
[lots of listing]
...
2025/09/04 20:38:43 ERROR : infrastructure/1a6a923d9571e693/19744cc7ae7:197453bdba4:645b0269: Failed to copy: failed to open source object: operation error S3: GetObject, https response error StatusCode: 404, RequestID: 1862284EC7F6EE1E, HostID: 5d4e4d0f6fc859fe0f0c9ba35f218284c3f7dd583372659a5ce994e609e5dbc4, NoSuchKey:
...
[lots of transferring]
...
2025/09/06 10:21:38 NOTICE: Failed to sync with 13 errors: last error was: march failed with 12 error(s): first error: operation error S3: ListObjectsV2, exceeded maximum number of attempts, 10, https response error StatusCode: 0, RequestID: , HostID: , request send failed, Get "https://xxx.xxx.xxx.xxx/infra-acpt-logging-loki-303?delimiter=%2F&encoding-type=url&list-type=2&max-keys=1000&prefix=application%2F3364f26957d32e57%2F": net/http: timeout awaiting response headers
[2025-09-06 10:21:38 CEST] ERROR: rclone sync FAILED with return code 5. See https://rclone.org/docs/#exit-code
[2025-09-06 10:21:38 CEST] ERROR: FAILED rclone sync from https://xxx.xxx.xxx.xxx/infra-acpt-logging-loki-303 to https://xxx.xxx.xxx/infra-acpt-logging-loki-303
[2025-09-06 10:21:38 CEST] INFO: ZIPPING and UPLOADING report log file to https://xxx.xxx.xxx.xxx/infra-acpt-rclone-logging/infra-acpt-logging-loki-303
Indeed, the file was not synced to the target:
❯ aws s3api head-object \
--profile infra-acpt \
--bucket infra-acpt-logging-loki-303 \
--key infrastructure/1a6a923d9571e693/19744cc7ae7:197453bdba4:645b0269 \
--sse-customer-algorithm AES256 \
--sse-customer-key "$KEY_BASE64" \
--sse-customer-key-md5 "$MD5_DIGEST" \
--output json;
An error occurred (404) when calling the HeadObject operation: Not Found