Rclone should (optionally?) ignore 404 for DELETE

What is the problem you are having with rclone?

We are using rclone to sync between two swift clusters. Unfortunately, some of the containers have "ghost objects" in them: objects that appear in the container listing but do not in fact exist. This means that rclone will sometimes try to DELETE one of these objects, and the swift object store returns 404 in this case, even though the object has been successfully deleted (i.e. it no longer appears in the container listing). I'm not really convinced this is correct behaviour on swift's part, but when I reported it upstream they didn't seem to want to change its behaviour now.

So I'd ideally like an option for rclone to not treat 404 on DELETE as a failure.

Run the command 'rclone version' and share the full output of the command.

mvernon@ms-be2069:~$ rclone version
rclone v1.60.1-DEV
- os/version: debian 11.6 (64 bit)
- os/kernel: 5.10.0-19-amd64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.18.1
- go/linking: dynamic
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Openstack Swift

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync --checkers=64 --transfers=8 \
       --retries 1 --ignore-errors \
       --filter-from /etc/swift/swiftrepl_filters_nothumbs \
       --config /etc/swift/rclone.conf \
       --no-update-modtime \
       --use-mmap \
       --checksum \
       --swift-no-large-objects \
       "${local_dc}:" "${remote_dc}:"

The rclone config contents with secrets removed.

[eqiad]
type = swift
env_auth = false
user = redacted
key = also_redacted
auth = https://ms-fe.svc.eqiad.wmnet/auth/v1.0

[codfw]
type = swift
env_auth = false
user = redacted
key = furthermore_redacted
auth = https://ms-fe.svc.codfw.wmnet/auth/v1.0

A log from the command with the -vv flag

Not actually -vv, but these are syslog entries, and I think they are sufficient to demonstrate the issue.

Apr 10 10:23:23 ms-be2069 swift-rclone-sync[46647]: ERROR : wikipedia-commons-local-public.ad/a/ad/Nilip_Deb_Profile_01A.jpg: Couldn't delete: Object Not Found
[...]
Apr 10 10:23:26 ms-be2069 swift-rclone-sync[46647]: ERROR : wikipedia-en-local-public.97/9/97/Sportyak_II_on_1963_low_water_trip_GRCA_2901B.JPG: Couldn't delete: Object Not Found
Apr 10 10:23:26 ms-be2069 swift-rclone-sync[46647]: ERROR : Attempt 1/1 failed with 80 errors and: failed to delete 55 files
Apr 10 10:23:26 ms-be2069 swift-rclone-sync[46647]: Failed to sync with 80 errors: last error was: failed to delete 55 files
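
To see how many of the reported errors are these 404-on-delete cases, the per-object ERROR lines can be tallied with a short script. This is only a sketch: `tally_errors` is a hypothetical helper, and the pattern assumes the log format shown above and object paths that contain no whitespace.

```python
import re
from collections import Counter

# Matches per-object lines of the form "ERROR : <path>: <message>".
# Assumption: object paths contain no whitespace, so summary lines such as
# "ERROR : Attempt 1/1 failed with ..." are not matched.
ERROR_RE = re.compile(r"ERROR : (?P<path>\S+): (?P<msg>.+)$")


def tally_errors(lines):
    """Count per-object error messages, keyed by the message text."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line.rstrip("\n"))
        if match:
            counts[match.group("msg")] += 1
    return counts
```

Feeding it the syslog lines (for example from an open file object) gives a count per message, so "Couldn't delete: Object Not Found" can be separated from any other kind of failure.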

Thanks :slight_smile:

I've seen these errors on swift clusters before.

The problem is that deleting the object doesn't remove it from the directory listing, even though the object itself isn't found. So rclone will find it and try to delete it again next time too, won't it?

I guess we could do the 404 == OK behavior on delete on a flag?

As far as I've seen so far, deleting one of these objects does in fact remove it from the listing in future (you end up with consistent deleted objects in the container.db files). I've not yet found one that isn't deletable, thankfully!

Yes, a flag would be fine, thanks.

Ah that is good.

In that case no need for a flag.
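
For illustration, the behaviour being discussed (treating a 404 on DELETE as "already gone" rather than as a failure) could be sketched roughly as below. This is not rclone's actual code: `ObjectNotFound`, `delete_ignoring_missing` and the `delete` callable are hypothetical names standing in for whatever the storage client provides.

```python
import logging

log = logging.getLogger("swift-delete-sketch")


class ObjectNotFound(Exception):
    """Hypothetical stand-in for a 404 response from the object store."""


def delete_ignoring_missing(delete, name):
    """Call delete(name), treating a 404 as success.

    Returns True if the object was actually deleted, and False if it was
    already missing (a dangling listing entry). Any other error propagates.
    """
    try:
        delete(name)
    except ObjectNotFound:
        # Listed in the container but absent from the store:
        # there is nothing left to remove, so do not count it as a failure.
        log.warning("%s: dangling object - ignoring", name)
        return False
    return True
```

Only the not-found case is swallowed here; other errors (authentication failures, 5xx responses and so on) still surface, so a sync would still report real problems.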

Try this

v1.63.0-beta.6959.fe43c0020.fix-swift-delete on branch fix-swift-delete (uploaded in 15-30 mins)

Hi, I downloaded rclone-v1.63.0-beta.6959.fe43c0020.fix-swift-delete-linux-amd64.zip and tried it, but I think the behaviour has not changed:

mvernon@ms-be2069:~$ sudo /home/mvernon/rclone.fix-swift-delete sync --checkers=64 --transfers=8 \
       --retries 1 --ignore-errors \
       --filter-from /etc/swift/swiftrepl_filters_nothumbs \
       --config /etc/swift/rclone.conf \
       --no-update-modtime \
       --use-mmap \
       --checksum \
       --swift-no-large-objects \
"codfw:" "eqiad:"
2023/04/17 11:49:30 ERROR : wikipedia-commons-local-public.e4/e/e4/Taurie_Music-1.jpg: Failed to copy: failed to open source object: Object Not Found
2023/04/17 13:15:04 ERROR : wikipedia-en-local-public.1a/1/1a/Erskine_Minnesota_May_2007.JPG: Couldn't delete: Object Not Found
2023/04/17 13:15:04 ERROR : Attempt 1/1 failed with 3 errors and: failed to delete 1 files
2023/04/17 13:15:04 Failed to sync with 3 errors: last error was: failed to delete 1 files
mvernon@ms-be2069:~$ echo $?
1
mvernon@ms-be2069:~$ /home/mvernon/rclone.fix-swift-delete version
rclone v1.63.0-beta.6959.fe43c0020.fix-swift-delete
- os/version: debian 11.6 (64 bit)
- os/kernel: 5.10.0-19-amd64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.20.3
- go/linking: static
- go/tags: none

(complicated a little by the fact that there was an object that failed to copy, but if I read your patch correctly, I should have seen something like "Dangling object - ignoring:" rather than "Couldn't delete: Object Not Found"?)

Hi, I have managed to reproduce this without any failed copies to complicate matters:

mvernon@ms-be2069:~$ sudo /home/mvernon/rclone.fix-swift-delete sync --checkers=64 --transfers=8        --retries 1 --ignore-errors        --filter-from /etc/swift/swiftrepl_filters_nothumbs        --config /etc/swift/rclone.conf        --no-update-modtime        --use-mmap        --checksum        --swift-no-large-objects "codfw:" "eqiad:"
2023/04/19 13:25:16 ERROR : wikipedia-en-local-public.a8/a/a8/Pentagon_video_security.gif: Couldn't delete: Object Not Found
2023/04/19 13:25:17 ERROR : Attempt 1/1 failed with 2 errors and: failed to delete 1 files
2023/04/19 13:25:17 Failed to sync with 2 errors: last error was: failed to delete 1 files
mvernon@ms-be2069:~$ echo $?
1

So only a deletion failed, and we still get the non-zero exit status (and, I think, not the error message we were hoping for given your patch).

I looked at my code again, and I can see I'm testing for the wrong error :frowning:

Try this - hopefully it will work better!

v1.63.0-beta.6966.d0a6a71aa.fix-swift-delete on branch fix-swift-delete (uploaded in 15-30 mins)

Complicated by another missing object-to-copy, but this looks more hopeful:

mvernon@ms-be2069:~$ sudo /home/mvernon/rclone.fix-swift-delete sync --checkers=64 --transfers=8        --retries 1 --ignore-errors        --filter-from /etc/swift/swiftrepl_filters_nothumbs        --config /etc/swift/rclone.conf        --no-update-modtime        --use-mmap        --checksum        --swift-no-large-objects "codfw:" "eqiad:"
2023/04/21 19:41:07 ERROR : wikipedia-en-local-public.a8/a/a8/Pentagon_video_security.gif: Failed to copy: failed to open source object: Object Not Found
2023/04/21 20:41:33 ERROR : wikipedia-en-local-public.ce/c/ce/Merrick_and_his_wife.jpg: Dangling object - ignoring: Object Not Found
2023/04/21 20:41:33 ERROR : Attempt 1/1 failed with 1 errors and: failed to open source object: Object Not Found
2023/04/21 20:41:33 Failed to sync: failed to open source object: Object Not Found

That looks like it is working!

I've merged this to master now, which means it will be in the latest beta in 15-30 minutes and released in v1.63.

I've seen this on swift clusters too. Not sure we can do anything about it, but at least deleting the object will work now :crossed_fingers:
