Automatic backup when "failed to copy: is a directory not a file"

What is the problem you are having with rclone?

When syncing my Documents directory, I've been having some problems of this kind:

2022/04/03 15:47:20 ERROR : Code/C++/boost/libs/array/.git: Failed to copy: is a directory not a file
2022/04/03 15:47:20 ERROR : Code/C++/boost/tools/docca/.git: Failed to copy: is a directory not a file
2022/04/03 15:47:22 ERROR : Code/C++/boost/libs/utility/.git: Failed to copy: is a directory not a file
2022/04/03 15:47:22 ERROR : Google drive root 'Documents': not deleting files as there were IO errors
2022/04/03 15:47:22 ERROR : Google drive root 'Documents': not deleting directories as there were IO errors
2022/04/03 15:47:22 ERROR : Attempt 1/3 failed with 3 errors and: is a directory not a file

This is very similar to Rclone Failing to copy File because it thinks it is a directory, the difference being the use case: these files and directories are created automatically by git. `.git` is usually a file when you clone the repo, but it becomes a directory once you start collaborating on the repo.

  • Is there an option I can use to just remove the .git file from the remote and copy the new directory in this case?
  • Or is there an option to at least still delete the files when there are IO errors?

Just deleting the files from Google Drive manually is a poor solution in this case, because:

  • Git does this more or less automatically, so I would have to go to Google Drive manually every time I collaborate on a repo.
  • Forgetting to do that is dangerous, since the sync runs in the background as a cron job.
  • Not checking the cron log for a while would mean my files aren't being backed up properly during that time.
  • I wouldn't be able to trust these automatic backups as much.

Run the command 'rclone version' and share the full output of the command.

rclone v1.57.0
- os/version: ubuntu 21.10 (64 bit)
- os/kernel: 5.13.0-22-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.17.2
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Google Drive.

The command you were trying to run (eg rclone copy /tmp remote:tmp)

/usr/bin/rclone sync \
    --update \
    --fast-list \
    --transfers 30 \
    --checkers 120 \
    --contimeout 60s \
    --timeout 300s \
    --retries 3 \
    --low-level-retries 10 \
    --stats 1s \
    --skip-links \
    --log-level NOTICE \
    --delete-excluded "/home/alandefreitas/Documents" "remote:Documents" \
    --exclude "Code/**/node_modules/**" \
    --exclude "Computacao/**/node_modules/**" \
    --exclude "Code/C++/**/cmake-build-*/**" \
    --exclude "Code/C++/**/cpp_modules/**" \
    --exclude "Code/C++/**/cpp_modules" \
    --exclude "Code/C++/boost/boost/**" \
    --exclude "Code/Python/**/venv/**" \
    --exclude "Code/C++/**/bin.v2/**" \
    --exclude "Code/C++/boost/stage/**" \
    --exclude "Scripts/log/**" \
    --exclude  ".DS_Store" \
    --exclude "._.DS_Store" \
    --exclude "**/_MACOSX/**" \
    --exclude "_MACOSX/**" \
    --config /home/alandefreitas/.config/rclone/rclone.conf

The rclone config contents with secrets removed.

[remote]
type = drive
client_id = *********************
client_secret = *********************
scope = drive
token = *********************
team_drive = 

A log from the command with the -vv flag

// INFO: (Files copied correctly)
2022/04/03 15:47:20 ERROR : Code/C++/boost/libs/array/.git: Failed to copy: is a directory not a file
2022/04/03 15:47:20 ERROR : Code/C++/boost/tools/docca/.git: Failed to copy: is a directory not a file
2022/04/03 15:47:22 ERROR : Code/C++/boost/libs/utility/.git: Failed to copy: is a directory not a file
2022/04/03 15:47:22 ERROR : Google drive root 'Documents': not deleting files as there were IO errors
2022/04/03 15:47:22 ERROR : Google drive root 'Documents': not deleting directories as there were IO errors
2022/04/03 15:47:22 ERROR : Attempt 1/3 failed with 3 errors and: is a directory not a file

Are you saying on the destination it made a file when it should be a directory?

I tested locally and saw no issues.

Can you share a debug log?

No no. This is how it happens:

  1. git correctly created a file in a subdirectory
  2. rclone correctly copied the file to the remote when syncing the parent directory
  3. git correctly replaced the file with a directory at the same path
  4. rclone couldn't sync the parent directory anymore, because a file already existed at that location on the remote
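A minimal local sketch of that state change (assuming a submodule-style checkout, where git writes `.git` as a plain pointer file before later replacing it with a real directory):

```shell
# Step 1: git creates .git as a plain file (as it does for submodule checkouts)
mkdir -p repo/sub
echo "gitdir: ../../.git/modules/sub" > repo/sub/.git
test -f repo/sub/.git && echo ".git is a file"

# Step 3: git later replaces that file with a real directory
rm repo/sub/.git
mkdir repo/sub/.git
touch repo/sub/.git/HEAD
test -d repo/sub/.git && echo ".git is a directory"

# Step 4: syncing repo/ now has to replace a remote *file* named
# .git with a *directory*, which is where rclone errors out.
```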

The only solution I could find is to delete the files from google drive manually. However, this has the drawbacks I listed above.

That's why I'm asking if there's an option in rclone to automatically handle this, be it by replacing the file or at least ignoring the problem.

I'd imagine you can ignore it with an exclude. I'm not sure of the goal of syncing the repo anyway, so that's probably fine.
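If the repo metadata doesn't need to be backed up at all, the exclude-based workaround would look something like this, added to the existing flag list (rclone filter patterns without a leading / match at any depth):

```shell
--exclude ".git/**" \
--exclude ".git"
```

This sidesteps the error entirely, at the cost of not backing up any `.git` contents.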

All my .git are directories.

This seems to be a general issue that can be provoked by this sequence:

rclone mkdir anyRemote:testfolder
rclone touch anyRemote:testfolder/fileder
rclone sync anyRemote:testfolder anyRemote:testfolder2
rclone deletefile anyRemote:testfolder/fileder
rclone mkdir anyRemote:testfolder/fileder
rclone touch anyRemote:testfolder/fileder/file
rclone sync anyRemote:testfolder anyRemote:testfolder2

It seems like a bug to me.

The opposite sequence also causes an error:

rclone mkdir anyRemote:testfolder
rclone mkdir anyRemote:testfolder/fileder
rclone touch anyRemote:testfolder/fileder/file
rclone sync anyRemote:testfolder anyRemote:testfolder2
rclone purge anyRemote:testfolder/fileder
rclone touch anyRemote:testfolder/fileder
rclone sync anyRemote:testfolder anyRemote:testfolder2

It seems like sync (and possibly also copy) are unable to gracefully handle situations where a file becomes a folder and vice versa.

Short answer - use --delete-before

Long answer:

I usually ask what rsync does in this situation.

Initial sync

$ mkdir a b
$ echo hello >a/fileordir
$ rsync -av a/ b/
sending incremental file list
./
fileordir

sent 131 bytes  received 38 bytes  338.00 bytes/sec
total size is 6  speedup is 0.04
$ ls a b
a:
fileordir

b:
fileordir

Attempt to overwrite file with dir - success

$ rm a/fileordir 
$ mkdir a/fileordir 
$ touch a/fileordir/file.txt
$ rsync -av a/ b/
sending incremental file list
./
fileordir/
fileordir/file.txt

sent 166 bytes  received 46 bytes  424.00 bytes/sec
total size is 0  speedup is 0.00
$ tree a b
a
└── fileordir
    └── file.txt
b
└── fileordir
    └── file.txt

2 directories, 2 files

Attempt to overwrite dir with file - failure

$ rm -rf a/fileordir
$ echo helo > a/fileordir
$ rsync -av a/ b/
sending incremental file list
could not make way for new regular file: fileordir
cannot delete non-empty directory: fileordir
./

sent 78 bytes  received 64 bytes  284.00 bytes/sec
total size is 5  speedup is 0.04
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1330) [sender=3.2.3]

That kind of makes sense as overwriting one file is probably not a biggie, but overwriting a whole directory is. We could implement that in rclone and I'm pretty sure there is an issue about that already - yes here it is:

Thanks, very good explanation!

Would be nice, but seems tricky to combine with --track-renames (e.g. mv fileordir dir; touch fileordir) :wink:

It would be relatively easy to fix both cases:

  1. "overwrite file with directory"
  2. "overwrite directory with file"

We've seen that in case 1 rsync just overwrites the file. This is probably OK for rclone sync, but I'm not sure it is OK for rclone copy.

In case 2 we potentially lose an entire directory tree which seems less acceptable.

What I could do is add "try --delete-before" to the error message, as a hint at a workaround. This will only work with rclone sync though, not rclone copy.

What do you think?

Thanks. --delete-before is probably going to help here. I think it's the default option but it probably gets cancelled by --delete-excluded. In any case, this should solve the issue.

That kind of makes sense as overwriting one file is probably not a biggie, but overwriting a whole directory is

I agree that overwriting a whole directory is a big issue when it's wrong. But not being able to overwrite a whole directory when that is the right thing to do looks like a bigger problem. It's as if rclone makes the right thing impossible for fear of maybe doing the wrong thing.

:crossed_fingers:

--delete-after is the default option, as this is the safest in terms of data loss.

--delete-during is the most efficient, and --delete-before uses the least amount of storage on the remote - and it also fixes this problem!
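Applied to the cron job from earlier in the thread, the fix is a one-flag change; a sketch with most flags omitted for brevity (keep the excludes and performance flags from the original command):

```shell
# --delete-before removes remote files that no longer match the source
# (including the stale .git *file*) before any transfers start, so the
# new .git *directory* no longer collides with it.
/usr/bin/rclone sync \
    --delete-before \
    --update \
    --skip-links \
    --delete-excluded "/home/alandefreitas/Documents" "remote:Documents" \
    --config /home/alandefreitas/.config/rclone/rclone.conf
```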

One person's doing the right thing is another person's data loss! It is tradeoffs all the way down :wink:

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.