Remote directory is clobbered by "copyto" command without trailing slash

What is the problem you are having with rclone?

The copyto command clobbers remote directories if there is no trailing slash and the file being copied doesn't exist in the remote directory. I checked the rclone changelog, and I don't see any relevant bug fixes for this problem.

Basically, when a file with a matching name exists in the remote directory, the command functions as expected: it compares the hash and date and performs a copy only if a change is detected.

However, when no file with a matching name exists in the remote directory, the command compares the hash, size, and date of the remote directory itself (which it naturally finds to be different) and replaces the remote directory with a copy of the file. No warning is given about clobbering the remote directory.

Stranger still, if I delete the copied file and recreate the remote directory, all of its former contents are restored, which suggests the directory was never really purged from the server (thank goodness); only the link to it was removed.

I understand that copyto is a newer command than copy, but given this kind of undocumented behavior, it would appear to be still too experimental for use in a production environment.
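To make the distinction concrete: copy always treats the destination as a directory, while copyto treats the destination as the target file name. A minimal illustration using the paths from this report (the copyto form is the dangerous one here):

# copy: destination is treated as a directory; the file keeps its own name
rclone copy /pub/worlds/test/map.sqlite.gz remote:xlhost/testbak
# copyto: destination is treated as the target file name, so an existing
# directory named "testbak" can end up shadowed by the copied file
rclone copyto /pub/worlds/test/map.sqlite.gz remote:xlhost/testbak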

Run the command 'rclone version' and share the full output of the command.

rclone v1.49.2
- os/arch: linux/386
- go version: go1.12.9

Which cloud storage system are you using? (eg Google Drive)

PCloud

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copyto /pub/worlds/test/map.sqlite.gz remote:xlhost/testbak

The rclone config contents with secrets removed.

[remote]
type = pcloud
token = {"access_token":"REMOVED","token_type":"bearer","expiry":"0001-01-01T00:00:00Z"}

A log from the command with the -vv flag

2022/01/04 20:50:52 DEBUG : rclone: Version "v1.49.2" starting with parameters ["rclone" "-vv" "copyto" "map.sqlite.gz" "remote:xlhost/testbak"]
2022/01/04 20:50:52 DEBUG : Using config file from "/root/.config/rclone/rclone.conf"
2022/01/04 20:50:52 DEBUG : map.sqlite.gz: Sizes differ (src 2539 vs dst 15716352)
2022/01/04 20:50:53 DEBUG : map.sqlite.gz: MD5 = 4724a898fc69696f47a185e9f5ce7e50 OK
2022/01/04 20:50:53 INFO  : map.sqlite.gz: Copied (replaced existing)
2022/01/04 20:50:53 INFO  :
Transferred:        2.479k / 2.479 kBytes, 100%, 3.479 kBytes/s, ETA 0s
Errors:                 0
Checks:                 0 / 0, -
Transferred:            1 / 1, 100%
Elapsed time:       700ms

2022/01/04 20:50:53 DEBUG : 4 go routines active
2022/01/04 20:50:53 DEBUG : rclone: Version "v1.49.2" finishing with parameters ["rclone" "-vv" "copyto" "map.sqlite.gz" "remote:xlhost/testbak"]

Can you please test with the latest version? You have a very old version.

felix@gemini:~/test$ rclone copyto /etc/hosts /home/felix/test/dir -vvv
2022/01/04 22:45:32 DEBUG : Setting --config "/opt/rclone/rclone.conf" from environment variable RCLONE_CONFIG="/opt/rclone/rclone.conf"
2022/01/04 22:45:32 DEBUG : rclone: Version "v1.57.0" starting with parameters ["rclone" "copyto" "/etc/hosts" "/home/felix/test/dir" "-vvv"]
2022/01/04 22:45:32 DEBUG : Creating backend with remote "/etc/hosts"
2022/01/04 22:45:32 DEBUG : Using config file from "/opt/rclone/rclone.conf"
2022/01/04 22:45:32 DEBUG : fs cache: adding new entry for parent of "/etc/hosts", "/etc"
2022/01/04 22:45:32 DEBUG : Creating backend with remote "/home/felix/test/"
2022/01/04 22:45:32 ERROR : Attempt 1/3 failed with 1 errors and: is a directory not a file
2022/01/04 22:45:32 ERROR : Attempt 2/3 failed with 1 errors and: is a directory not a file
2022/01/04 22:45:32 ERROR : Attempt 3/3 failed with 1 errors and: is a directory not a file
2022/01/04 22:45:32 INFO  :
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Errors:                 1 (retrying may help)
Elapsed time:         0.0s

2022/01/04 22:45:32 DEBUG : 2 go routines active
2022/01/04 22:45:32 Failed to copyto: is a directory not a file

Thanks for the advice. I went ahead and updated rclone and performed another test, this time with a file named "map_meta.txt".

root:/pub/worlds/test% rclone --version
rclone v1.57.0
- os/version: centos 6.8 (64 bit)
- os/kernel: 2.6.32-642.11.1.el6.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.17.2
- go/linking: static
- go/tags: none

root:/pub/worlds/test% ls map_meta.txt
-rw-r--r-- 1 minetest minetest 2574 Jan  4 11:15 map_meta.txt

root:/pub/worlds/test% rclone -vv copyto map_meta.txt remote:xlhost/testbak
2022/01/04 23:02:01 DEBUG : rclone: Version "v1.57.0" starting with parameters ["rclone" "-vv" "copyto" "map_meta.txt" "remote:xlhost/testbak"]
2022/01/04 23:02:01 DEBUG : Creating backend with remote "map_meta.txt"
2022/01/04 23:02:01 DEBUG : Using config file from "/root/.config/rclone/rclone.conf"
2022/01/04 23:02:01 DEBUG : fs cache: adding new entry for parent of "map_meta.txt", "/pub/worlds/test"
2022/01/04 23:02:01 DEBUG : Creating backend with remote "remote:xlhost/"
2022/01/04 23:02:01 DEBUG : fs cache: renaming cache item "remote:xlhost/" to be canonical "remote:xlhost"
2022/01/04 23:02:01 DEBUG : map_meta.txt: Need to transfer - File not found at Destination
2022/01/04 23:02:02 DEBUG : map_meta.txt: md5 = 9f1a19df0625ad4a6572271342fccfa6 OK
2022/01/04 23:02:02 INFO  : map_meta.txt: Copied (new) to: testbak
2022/01/04 23:02:02 INFO  :
Transferred:        2.514 KiB / 2.514 KiB, 100%, 0 B/s, ETA -
Transferred:            1 / 1, 100%
Elapsed time:         0.9s

2022/01/04 23:02:02 DEBUG : 4 go routines active

root:/pub/worlds/test% rclone ls remote:xlhost | grep testbak
     2574 testbak
1617084343 testbak/mapBAK.sqlite.gz

As you can see, the remote directory and its contents are still there, yet a file has now been created with the exact same name. On a normal filesystem this is entirely impossible, but for some reason on cloud file storage, you can have a directory and a file with clashing names.
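One way to see both entries side by side (a sketch; I haven't verified the exact output on pcloud) is lsjson, which reports an IsDir flag for every entry:

# both a "testbak" file and a "testbak" directory should show up here,
# distinguishable only by their IsDir values
rclone lsjson remote:xlhost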

It's very concerning that such a serious bug has persisted for so many years, particularly since it stems from the simple oversight of using a directory name as the target for the copied file. Such a novice mistake can easily happen by using the copyto command when copy was clearly intended.

To further test this theory, I used FUSE to mount the remote filesystem and ran an ordinary ls shell command. Now only the copied file is shown, not the directory that was clobbered; so even though the directory and the file both still exist in the cloud, the directory is no longer accessible.

root:/pub/worlds/test% ls /mnt/pcloud/test*
-rw-r--r-- 1 root root 2574 Jan  4 11:15 /mnt/pcloud/testbak
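For reference, the mount was created with something along these lines (the exact flags are assumed and not important here):

# mount the whole remote in the background
rclone mount remote: /mnt/pcloud --daemon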

Also, it should be noted that the command you used in your example above doesn't reproduce the problem, because the destination is not cloud storage. This issue specifically occurs when the destination is a remote directory.

Dropbox:

felix@gemini:~$ rclone mkdir DB:dir
felix@gemini:~$ rclone copyto /etc/hosts DB:dir -vvv
2022/01/05 08:58:57 DEBUG : Setting --config "/opt/rclone/rclone.conf" from environment variable RCLONE_CONFIG="/opt/rclone/rclone.conf"
2022/01/05 08:58:57 DEBUG : rclone: Version "v1.57.0" starting with parameters ["rclone" "copyto" "/etc/hosts" "DB:dir" "-vvv"]
2022/01/05 08:58:57 DEBUG : Creating backend with remote "/etc/hosts"
2022/01/05 08:58:57 DEBUG : Using config file from "/opt/rclone/rclone.conf"
2022/01/05 08:58:57 DEBUG : fs cache: adding new entry for parent of "/etc/hosts", "/etc"
2022/01/05 08:58:57 DEBUG : Creating backend with remote "DB:"
2022/01/05 08:58:57 ERROR : Attempt 1/3 failed with 1 errors and: is a directory not a file
2022/01/05 08:58:58 ERROR : Attempt 2/3 failed with 1 errors and: is a directory not a file
2022/01/05 08:58:58 ERROR : Attempt 3/3 failed with 1 errors and: is a directory not a file
2022/01/05 08:58:58 INFO  :
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Errors:                 1 (retrying may help)
Elapsed time:         0.7s

2022/01/05 08:58:58 DEBUG : 6 go routines active
2022/01/05 08:58:58 Failed to copyto: is a directory not a file
2022/01/05 08:58:58 INFO  : Dropbox root '': Commiting uploads - please wait...
felix@gemini:~$

Certain cloud providers allow for duplicate names, so that's probably more the use case, as it isn't "all cloud providers".

I'd say it's probably a pcloud-only item:

etexter@seraphite rclone % rclone mkdir GD:dir
etexter@seraphite rclone % rclone copyto /etc/hosts GD:dir -vvv
2022/01/05 09:04:46 DEBUG : rclone: Version "v1.57.0" starting with parameters ["rclone" "copyto" "/etc/hosts" "GD:dir" "-vvv"]
2022/01/05 09:04:46 DEBUG : Creating backend with remote "/etc/hosts"
2022/01/05 09:04:46 DEBUG : Using config file from "/Users/etexter/.config/rclone/rclone.conf"
2022/01/05 09:04:46 DEBUG : fs cache: adding new entry for parent of "/etc/hosts", "/etc"
2022/01/05 09:04:46 DEBUG : Creating backend with remote "GD:"
2022/01/05 09:04:46 ERROR : Attempt 1/3 failed with 1 errors and: is a directory not a file
2022/01/05 09:04:47 ERROR : Attempt 2/3 failed with 1 errors and: is a directory not a file
2022/01/05 09:04:47 ERROR : Attempt 3/3 failed with 1 errors and: is a directory not a file
2022/01/05 09:04:47 INFO  :
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Errors:                 1 (retrying may help)
Elapsed time:         0.7s

2022/01/05 09:04:47 DEBUG : 6 go routines active
2022/01/05 09:04:47 Failed to copyto: is a directory not a file

I think we've validated that it's not happening on the larger providers and it seems a very one-off type situation, so it's best to get data rather than make larger claims.

If you want to log a bug on GitHub, link this post, and note that it is a pcloud-related item, that would be great. I don't have a pcloud account so I have no ability to test.

For the bug report, if you collect --dump headers,requests output, it will show what pcloud is spitting back.
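Something like this should capture the HTTP exchange (paths borrowed from your earlier test; adjust as needed):

# dump headers and request bodies into a log file for the bug report;
# rclone writes its log output to stderr
rclone copyto map_meta.txt remote:xlhost/testbak -vv --dump headers,requests --retries 1 2> dump.log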

Issues · rclone/rclone (github.com)

"Certain cloud providers allow for duplicate names so that's probably more the use case as it isn't "all cloud providers"."

If that is true, I would say that qualifies as a highly esoteric feature that no ordinary user would expect, much less want.

Even a quick Google search for "duplicate file names on cloud" turns up no relevant results. I can't imagine any real-world scenario where such behavior is actually desired.

I would much prefer that a warning always be produced when the target of the copyto command is a directory, and that a special option be required to override the behavior for cloud providers that allow duplicate names. This seems like a very simple test: just check whether the target is a directory.
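In the meantime, a shell-side guard is one possible workaround. A rough sketch, assuming the remote and names from this thread (lsf --dirs-only prints directories with a trailing slash):

# refuse to copyto when the destination name is already a directory
if rclone lsf remote:xlhost --dirs-only | grep -qx 'testbak/'; then
    echo "refusing: remote:xlhost/testbak is an existing directory" >&2
else
    rclone copyto map_meta.txt remote:xlhost/testbak
fi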

If it's best to get data rather than make larger claims, then why was it acceptable for you to make the larger claim that "certain cloud providers allow for duplicate names so that's probably more the use case as it isn't all cloud providers" -- even though it turns out it isn't certain cloud providers, but likely a one-off type situation?

So only I am held to that standard of having to get data before making larger claims, but not you? Strange how that works.

We have a list of the cloud providers if you want to check the 'duplicate file names' column:

Overview of cloud storage systems (rclone.org)

I didn't make a claim; I was sharing a data point. Link above.

Now I am even more confused. First you say it's certain cloud providers, then you switch to saying it's "a very one-off type situation." And next, you link to a list of cloud providers in which at least six allow duplicate names, which means it isn't even remotely a "very one-off type situation."

If it were truly a pcloud-only feature, I could see this being an edge case, but given that the feature is common enough to need documenting in a table, then yes, I think this is a serious bug -- and at minimum it deserves better documentation, such as a warning on the copyto command reference page:

Suggested edit: "Warning: Some cloud providers allow for duplicate files (link to list), so double-check the destination path to ensure that a file or directory with an identical name doesn't already exist, unless that is your intention."

I will go ahead and open an issue on GitHub, and note that I am using pcloud. Thanks for your suggestion.

I think you are combining a few things here, which is the confusing part.

There are a number of cloud providers that allow for duplicate file names. That's listed on the link here:

Overview of cloud storage systems (rclone.org)
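For remotes that do permit duplicates, rclone also ships a dedupe command for cleaning up name clashes after the fact. A minimal example (mode and path assumed; I haven't tested it against a file/directory clash like yours):

# rename clashing entries rather than deleting anything
rclone dedupe rename remote:xlhost -vv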

When I ran my test, you pointed out that I needed to test against a cloud provider, so my hypothesis was that the issue was related to duplicate file names.

I tested with Dropbox (does not allow duplicate files) and had no issue.

I tested with Google Drive (does allow duplicate files) and had no issue.

As we collect data, things get refined, and we chart new things to test and validate as we progress through problem solving.

I did not test every provider on that list as I don't have an account for every provider.

Based on the data present, all the evidence, and my experience so far, I made another hypothesis: that it seems to be a pcloud issue. Since I don't have a pcloud account, I can do no testing myself, so I asked you to open a bug report and requested some other information that, in my experience, will help identify the issue.

As the data shows, it might be a pcloud-only bug, so I'm not sure it makes sense to change that page yet, at least until we get more data on the issue, identify the root cause, and fix it or make any other changes.

In my opinion, it's technically doing exactly what it was asked to do (although I don't think it should; it should warn you or abort the process like it does for other remotes and local storage).

Wrong commands do have consequences and running the wrong thing can cause some damage.

I think we've thoroughly gone through this, and opening an issue/bug report would be your best course of action.

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.