S3 force in-place put/copy

What is the problem you are having with rclone?

I would like to force rclone to do an in-place copy of an existing object in S3.

Run the command 'rclone version' and share the full output of the command.

❯ rclone version
rclone v1.65.2

  • os/version: darwin 14.3.1 (64 bit)
  • os/kernel: 23.3.0 (arm64)
  • os/type: darwin
  • os/arch: arm64 (ARMv8 compatible)
  • go/version: go1.21.6
  • go/linking: dynamic
  • go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

S3 (IBM COS)

The command you were trying to run (eg rclone copy /tmp remote:tmp)

I am trying to do a simple one-object copy like this:

rclone copy "INSTANCE:bucket-a/9014" "INSTANCE:bucket-a/"

I have tried different combinations of these parameters:

--no-check-dest --no-traverse --ignore-existing --ignore-times  -vvv

which always results in

don't need to copy/move 9014, it is already at target location

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[INSTANCE]
type = s3
env_auth = false
provider = IBMCOS
access_key_id = XXX
secret_access_key = XXX
endpoint = s3.us-south.cloud-object-storage.appdomain.cloud
location_constraint =
server_side_encryption =
storage_class =

A log from the command that you were trying to run with the -vv flag

rclone copy "INSTANCE:bucket-a/9014" "INSTANCE:bucket-a/" --no-check-dest --no-traverse --ignore-existing --ignore-times  -vv
2024/02/20 13:10:35 DEBUG : rclone: Version "v1.65.2" starting with parameters ["rclone" "copy" "INSTANCE:bucket-a/9014" "INSTANCE:bucket-a/" "--no-check-dest" "--no-traverse" "--ignore-existing" "--ignore-times" "-vv"]
2024/02/20 13:10:35 DEBUG : Creating backend with remote "INSTANCE:bucket-a/9014"
2024/02/20 13:10:35 DEBUG : Using config file from "/Users/mpcarl/.config/rclone/rclone.conf"
2024/02/20 13:10:35 DEBUG : Resolving service "s3" region "us-east-1"
2024/02/20 13:10:35 DEBUG : fs cache: adding new entry for parent of "INSTANCE:bucket-a/9014", "INSTANCE:bucket-a"
2024/02/20 13:10:35 DEBUG : Creating backend with remote "INSTANCE:bucket-a/"
2024/02/20 13:10:35 DEBUG : Resolving service "s3" region "us-east-1"
2024/02/20 13:10:35 DEBUG : fs cache: renaming cache item "INSTANCE:bucket-a/" to be canonical "INSTANCE:bucket-a"
2024/02/20 13:10:35 DEBUG : S3 bucket bucket-a: don't need to copy/move 9014, it is already at target location
2024/02/20 13:10:35 INFO  :
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:         0.1s

2024/02/20 13:10:35 DEBUG : 7 go routines active

hi, so i tried to replicate your issue but could not. maybe my test is not correct.

rclone tree INSTANCE:bucket-a 
/
└── 9014
    └── file.txt
1 directories, 1 files

rclone copy INSTANCE:bucket-a/9014 INSTANCE:bucket-a -vv 
2024/02/20 14:52:05 DEBUG : rclone: Version "v1.65.2" starting with parameters ["c:\\data\\rclone\\rclone.exe" "copy" "INSTANCE:bucket-a/9014" "INSTANCE:bucket-a" "-vv"]
2024/02/20 14:52:05 DEBUG : Creating backend with remote "INSTANCE:bucket-a/9014"
2024/02/20 14:52:05 DEBUG : Using config file from "c:\\data\\rclone\\rclone.conf"
2024/02/20 14:52:05 DEBUG : Resolving service "s3" region "us-east-2"
2024/02/20 14:52:06 DEBUG : Creating backend with remote "INSTANCE:bucket-a"
2024/02/20 14:52:06 DEBUG : Resolving service "s3" region "us-east-2"
2024/02/20 14:52:06 DEBUG : file.txt: Need to transfer - File not found at Destination
2024/02/20 14:52:06 DEBUG : S3 bucket bucket-a: Waiting for checks to finish
2024/02/20 14:52:06 DEBUG : S3 bucket bucket-a: Waiting for transfers to finish
2024/02/20 14:52:06 DEBUG : file.txt: md5 = 9afcb2d16863f2df14342a4143c7e45d OK
2024/02/20 14:52:06 INFO  : file.txt: Copied (server-side copy)
2024/02/20 14:52:06 INFO  : 
Transferred:   	          4 B / 4 B, 100%, 0 B/s, ETA -
Transferred:            1 / 1, 100%
Server Side Copies:     1 @ 4 B
Elapsed time:         0.3s

rclone tree INSTANCE:bucket-a 
/
├── 9014
│   └── file.txt
└── file.txt
1 directories, 2 files

In my case there is no file.txt; the object I am trying to copy is itself named 9014. Your test also shows that rclone creates a new copy of the object rather than overwriting the existing object in place.

This is the equivalent awscli command, which seems to work as expected:

 aws s3api [...] copy-object --bucket bucket-a --key 9014 --copy-source bucket-a/9014 --metadata-directive REPLACE

ok, now, i can reproduce the issue.

can you explain the reason for running that command? perhaps there are workarounds.

here is a possible workaround.

use two remotes, each with the exact same config as INSTANCE - let's call them INSTANCE01 and INSTANCE02.
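
for reference, the two remotes can simply be duplicate sections in rclone.conf, each matching the INSTANCE config shared above (credentials redacted):

[INSTANCE01]
type = s3
provider = IBMCOS
access_key_id = XXX
secret_access_key = XXX
endpoint = s3.us-south.cloud-object-storage.appdomain.cloud

[INSTANCE02]
type = s3
provider = IBMCOS
access_key_id = XXX
secret_access_key = XXX
endpoint = s3.us-south.cloud-object-storage.appdomain.cloud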

rclone copy INSTANCE01:bucket-a/9014 INSTANCE02:bucket-a -vv --ignore-times --server-side-across-configs 
2024/02/20 17:32:38 DEBUG : rclone: Version "v1.65.2" starting with parameters ["c:\\data\\rclone\\rclone.exe" "copy" "INSTANCE01:bucket-a/9014" "INSTANCE02:bucket-a" "-vv" "--ignore-times" "--server-side-across-configs"]
2024/02/20 17:32:38 DEBUG : Creating backend with remote "INSTANCE01:bucket-a/9014"
2024/02/20 17:32:38 DEBUG : Using config file from "c:\\data\\rclone\\rclone.conf"
2024/02/20 17:32:38 DEBUG : fs cache: adding new entry for parent of "INSTANCE01:bucket-a/9014", "INSTANCE01:bucket-a"
2024/02/20 17:32:38 DEBUG : Creating backend with remote "INSTANCE02:bucket-a"
2024/02/20 17:32:38 DEBUG : 9014: Transferring unconditionally as --ignore-times is in use
2024/02/20 17:32:39 DEBUG : 9014: md5 = 9afcb2d16863f2df14342a4143c7e45d OK
2024/02/20 17:32:39 INFO  : 9014: Copied (server-side copy)

The reason for the in-place copy is that bucket-level policies can be set that are only applied when an object is uploaded. If I want to apply a new bucket-level policy to a large number of existing objects, I need to upload them again. To avoid all of the network transfer time, an in-place copy has the same effect on the objects as uploading them again.

I will try this action with two remotes and report back.

I was able to try this today. While it did replace the existing object, it did not do a server-side copy: it downloaded and re-uploaded the object. That is OK for a few small objects, but with millions of objects or a set of very large objects it would not work.
So it is still not working as an equivalent of the awscli command.

i posted a full debug log that did a server-side copy.

without a full debug log, it is hard to know what exactly is going on.
for a deeper look, post a debug log using --dump=headers
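
for example, reusing the two-remote workaround command from above:

rclone copy "INSTANCE01:bucket-a/9014" "INSTANCE02:bucket-a" --ignore-times --server-side-across-configs -vv --dump=headers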

It might be that rclone does not do what you want (though @asdffdsa's test shows that a server-side in-place copy is possible), but then you could use aws cli and rclone together.

I do something like this sometimes:

rclone lsf S3remote:bucket_name -R --files-only | xargs -P 16 -n 1 aws_script.sh

where aws_script.sh:

aws ... [your specific command] ... --key $1 ...

This will run 16 parallel aws cli jobs fed by rclone output. Depending on your system you can scale it up or down.
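
For this thread's use case, a concrete aws_script.sh could look something like this - a sketch built from the copy-object command posted earlier; the --endpoint-url value is an assumption taken from the config shared above:

#!/bin/sh
# In-place copy of one object, which re-applies upload-time bucket policies
# without transferring the data over the network.
# $1 is the object key passed in by xargs from the rclone lsf listing.
aws s3api copy-object \
  --endpoint-url https://s3.us-south.cloud-object-storage.appdomain.cloud \
  --bucket bucket-a \
  --key "$1" \
  --copy-source "bucket-a/$1" \
  --metadata-directive REPLACE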

Or, purely server side, use something like AWS S3 Batch Operations - you would have to investigate whether that is possible with IBM COS though; they might have a similar option.

If that is not fast enough then, worst case, you can write your own small program using a native S3 access library - Rust, Go, or whatever you prefer. rclone is fantastic, but sometimes it is not enough to solve a specific problem.

Thanks for the assistance on this topic.

also, the aws cli is really slow (it is written in Python and takes "ages" to initialise on every invocation) - when you have a few more files to process, I find it much better to use the minio mc cli, which is 10x+ faster.

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.