Rclone sync bug: files are being erroneously erased/moved on the remote, causing subsequent rclone syncs to copy them over and over again (Was: Repeated "rclone sync" commands from a read-only local FS copying the same files over and over again?!)

@atrocity that is very useful thank you! Can you send me the crypt mappings for Squeezebox Music/NotDTS/Ian Dury as well please, then I think I will have enough information to track down what is happening :slight_smile:

I don’t know if it’s related, but several attempts to get this only returned:
Failed to create file system for
"hubicrypt:WD6TBNAS01/Squeezebox Music/NotDTS/Ian Dury": failed to make remote
"hubic:default/Encrypted/dpu944o9e4q6khe1id5qe7g5hs/fs21bfj0b66sclla3eldm8amsmnljjsdv5km07a1uoaoop0llrvkg/9gchvbsi6baf2onkba2ijn5d38/rt8r5pnucq6qnon0kvj67cktfo"
to wrap: error authenticating swift connection: Get
https://api.hubic.com/1.0/account/credentials: invalid character '<' looking for beginning of value

But eventually it worked and I’ve placed the output at http://www.wywh.com/rclonelogs/IanDuryCrypt.txt

Thank you!

I’ve seen that before - I think it is the hubic server returning a 500 error… Not sure what I can do about that…
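For context, that particular message is Go’s JSON decoder failing on the `<` of an HTML error page where it expected a JSON body. A minimal standalone reproduction (not rclone code, just the standard library behaviour):

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// When the server returns an HTML error page (e.g. a 500) instead of
	// JSON, decoding the credentials response fails on the leading '<'.
	var v interface{}
	err := json.Unmarshal([]byte("<html>500 Internal Server Error</html>"), &v)
	fmt.Println(err) // invalid character '<' looking for beginning of value
}
```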

I’ve realised I need a bit more from you - I need the mapping of the “Ian Dury” directory itself too. I should have worked that out before, sorry!

Can you do an `rclone lsd "hubicrypt:WD6TBNAS01/Squeezebox Music/NotDTS" --crypt-show-mapping` which should show that please?

Here you go:
http://www.wywh.com/rclonelogs/Mapping.txt

Thank you!

@atrocity

Thank you for your excellent work making logs for me.

I finally tracked down a bug in the listing paging code. Swift delivers listings in pages of 1,000 items. What was happening is that if the last item on a page was a directory (in this case the encrypted “Ian Dury/”) then rclone removed the / from the name. However, that name without the slash was then used as the marker to request the next listing page, which caused the encrypted “Ian Dury/” to be returned again - hence the duplicate we’ve been seeing in the log. I’m pretty sure (but not 100% confident) that the same bug was causing the “Ian Hunter” directory to be skipped, so the fix should cure that too.
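To illustrate the mechanism, here is a minimal standalone Go sketch of Swift-style marker paging (not rclone’s actual code; the names and the page size of 3 are made up for brevity). The server returns names that sort strictly after the marker, so stripping the directory’s trailing / before reusing it as the marker makes the server serve that directory again:

```go
// A minimal sketch (not rclone's actual code) of Swift-style marker
// pagination: each request returns up to pageSize names sorting strictly
// after the marker, and the caller passes the last name of the previous
// page back as the next marker.
package main

import (
	"fmt"
	"strings"
)

// listPage simulates the server side of the listing API.
func listPage(all []string, marker string, pageSize int) []string {
	var page []string
	for _, name := range all {
		if name > marker {
			page = append(page, name)
			if len(page) == pageSize {
				break
			}
		}
	}
	return page
}

// listAll pages through the listing; with buggy=true it reproduces the
// bug described above.
func listAll(all []string, pageSize int, buggy bool) []string {
	var out []string
	marker := ""
	for {
		page := listPage(all, marker, pageSize)
		if len(page) == 0 {
			return out
		}
		marker = page[len(page)-1]
		if buggy {
			// The bug: the trailing / was stripped from a directory name
			// before it was reused as the marker. "ian-dury" sorts before
			// "ian-dury/", so the server returns "ian-dury/" again on the
			// next page - a duplicate. The fix is to keep the raw name as
			// the marker and strip the slash only for display.
			marker = strings.TrimSuffix(marker, "/")
		}
		out = append(out, page...)
	}
}

func main() {
	// Directories carry a trailing / in the raw Swift listing.
	objects := []string{"a.flac", "b.flac", "ian-dury/", "z.flac"}
	fmt.Println(listAll(objects, 3, true))  // [a.flac b.flac ian-dury/ ian-dury/ z.flac]
	fmt.Println(listAll(objects, 3, false)) // [a.flac b.flac ian-dury/ z.flac]
}
```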

Here is a new beta which has that bug fix - this should definitely fix the duplicates, and I’m hoping it will fix the missing directory too! Done in commit ce1b9a7d.

https://beta.rclone.org/v1.36-237-gce1b9a7d/ (uploaded in 15-30 mins)

Let me know how you get on :slight_smile:

Note that this is a fix for swift and hubic only, so it doesn’t fix @durval’s problems. However, I’m pretty sure that @durval’s problems are caused by duplicate files on drive, which the new beta will warn about.

I’m giving it a try right now with a dry run. The log so far looks completely rational, but it’s taking an abnormally long time because it looks like hubiC is being slow this morning (well, morning for me!). I probably shouldn’t have bothered with a dry run.

Once (if!) the dry run completes I’ll do a live backup and then another dry run and report back.

Thank you for all your hard work!

Everything looks good! The live backup transferred the files that had mistakenly been moved to the --backup-dir and the subsequent --dry-run didn’t find anything it needed to do.

Thank you for a great piece of software!

Excellent! Let me know if it does go wrong again.

That was quite a tricky bug to track down, and it has probably been in rclone since the very beginning!

You are welcome :slight_smile:

Folks,

Sorry for the long delay in getting back to you on this.

I just posted an update, see https://github.com/ncw/rclone/issues/1431#issuecomment-321244537

In short, it seems that @ncw was right, and my entire trouble was due to duplicated directories.

I asked two questions, repeating them here for the sake of completeness (and possibly more eyeballs):

  1. How to solve that problem? Delete those duplicated directories and then repeat the sync? Would a simple “rclone purge” on each one of them (on the remote side, of course) suffice?

  2. I understand that Google Drive produces these duplicated directories “semi-spontaneously” (i.e., independently of rclone), is that correct? In that case, would running `rclone lsd --max-depth 999 egd: | cut -b 44- | sort | uniq -c | sort -n | grep -v ' *1 '` periodically (say, once a week) be enough to detect them, so I could bring the above purgehammer to bear? (A rough sketch of this check follows below.)
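For illustration, here is a rough standalone Go version of the same duplicate-name check, assuming (like the `cut -b 44-` above) that the directory name starts at byte 44 of each `rclone lsd` line, which may vary between rclone versions; pipe the listing into it on stdin:

```go
// A rough sketch of the duplicate-directory check above: read
// `rclone lsd --max-depth 999 egd:` output from stdin and print any
// directory name that appears more than once. The byte offset of 44
// mirrors the `cut -b 44-` assumption and may need adjusting.
package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	counts := make(map[string]int)
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		line := scanner.Text()
		if len(line) < 44 {
			continue // too short to hold a name at the expected offset
		}
		counts[line[43:]]++ // bytes 44 onward are the directory name
	}
	for name, n := range counts {
		if n > 1 {
			fmt.Printf("%d x %s\n", n, name)
		}
	}
}
```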

Thanks in advance for your response(s).

Cheers,
– Durval.

I wrote this on the issue, but I’ll repeat it here…

How to solve that problem? Delete those duplicated directories and then repeat the sync? Would a simple “rclone purge” on each one of them (on the remote side, of course) suffice?

If you try the latest beta you’ll find that `rclone dedupe` now merges identical directories. Try it with `--dry-run` first as it is new code!

I understand that Google Drive produces these duplicated directories “semi-spontaneously”

Yes, it seems to be random. I suspect some sort of out-of-date caching in drive.

Running `rclone dedupe --dry-run` will also detect them quite well.

Hello Folks,

I just noticed I forgot to post the link to the github issue; here it is: https://github.com/ncw/rclone/issues/1431

Cheers,
– Durval.