Sync to B2 modification time issue (replacing existing identical files)

This may be a bug. Or me not understanding the tool well enough.

I seeded a backup (generated by BorgBackup) on B2 last week. Because this was a large transfer [and my home upstream bandwidth is abysmal], I first rcloned the files to my laptop and from work, rcloned it again to B2. That part went smoothly enough. I used the MacOS version of Rclone (v1.42) to accomplish this.

The problem manifested itself when I did my incremental backup on Borg and attempted to sync the result to B2. This was done from my backend server (Ubuntu, also v1.42 of Rclone). I used the following command:

rclone sync -v /path/to/borg/backup b2:bucket-name

I immediately started seeing these:

2018/07/28 10:10:51 INFO : README: Copied (replaced existing)
2018/07/28 10:10:51 INFO : config: Copied (replaced existing)
2018/07/28 10:10:52 INFO : data/0/10: Copied (replaced existing)
2018/07/28 10:10:52 INFO : data/0/100: Copied (replaced existing)

I was fairly confident some of these files had not changed since my initial seeding. I checked on backblaze.com and indeed, I had several versions of these files. The modified time (metadata on B2) and SHA1 were identical. It seems like those files should have been skipped during the sync process.

Workarounds:
I was able to get around the issue several ways using the following flags:

The --checksum flag
rclone sync -v --checksum /path/to/borg/backup b2:bucket-name
This made the sync operation much slower so it’s not a great work around…

By using the --update flag
rclone sync -v --update /path/to/borg/backup b2:bucket-name
This was faster than the --checksum flag above. It felt a little too open ended to me so I opted for the next one.

By using the --modify-window 1s flag
rclone sync -v --modify-window 1s /path/to/borg/backup b2:bucket-name
I tried smaller windows like 500ms but those didn’t work.

This is caused by the time precision on the linux machine being 1ns but on the mac only 1s.

Unfortunately b2 can’t update the timestamp of an existing object (it is part of the immutable object) so rclone has to upload it again to update the timestamp. -vv will give more detail on what is going on.

It looks like you’ve found all the work-arounds. I would use the --modify-window one, and 1s is the correct value since that is the precision of most macOS file systems.

Thanks for the insight. The behavior that I saw now makes perfect sense. Rcloning to and from my Mac stripped the exact modtime from the files.

Yes that is correct.

Do you know which file system you are running on your mac? I know that the old HFS+ file system only has 1second resolution but I think I read that there is a newer file system that mac’s use now.

Rclone dynamically detects the precision of the file system. It looks like the go runtime supports setting ns precision timestamps so if the filesystem supported ns precision it should have used it.

If you use ls -l --full-time on a Linux machine that will show you the full precision of the timestamps. I don’t think that is supported by the macOS ls though. You can use rclone to see what it thinks rclone lsl --max-depth 1 . too.

@ncw The filesystem on my mac is APFS.

Executing the command from my mac (the remote is a Linux system):
rclone lsl --max-depth 1 home:/zfs/backups/borg/pictures-backup/

   26 2018-07-25 22:49:31.000000000 README
  164 2018-07-25 22:49:31.000000000 config
191065 2018-07-28 12:18:18.000000000 hints.47875
10485898 2018-07-28 12:18:18.000000000 index.47875

Only yields second resolution.

From what I read online, APFS uses nanosecond based timestamps. I’m guessing that rclone doesn’t support APFS’s higher resolution timestamps yet?
Or perhaps a regression related to this:


What version of the go runtime is used in MacOS Rclone?

Never mind, the version of Go used is 1.10.1 and should include the fix to the issue above…

rclone --version
rclone v1.42

  • os/arch: darwin/amd64
  • go version: go1.10.1

It would probably be worth touching a file (or making a new file with touch) just to make sure as these files might have been copied from a FS with second precision?

Ah, I just remembered stat should show you the full time precision - what do you get if you stat those files?

I made an rclone beta compiled with a recent go beta compiler:

https://pub.rclone.org/rclone-v1.42-062-gf72eade7-macOS-go-tip.zip

Does that make any difference?

It might be that we’ve discovered a bug in the go runtime. Alas I don’t have a mac to test stuff myself though!

Running stat directly on my home Linux box:

andre@synapse:/zfs/backups/borg/pictures-backup$ stat README
File: ‘README’
Size: 26 Blocks: 9 IO Block: 512 regular file
Device: 26h/38d Inode: 18046 Links: 1
Access: (0600/-rw-------) Uid: ( 1000/ andre) Gid: ( 1000/ andre)
Access: 2018-07-29 21:29:19.527709755 -0600
Modify: 2018-07-25 22:49:31.185911462 -0600
Change: 2018-07-25 22:49:31.185911462 -0600

andre@synapse:/zfs/backups/borg/pictures-backup$ stat config
File: ‘config’
Size: 164 Blocks: 9 IO Block: 512 regular file
Device: 26h/38d Inode: 18048 Links: 1
Access: (0600/-rw-------) Uid: ( 1000/ andre) Gid: ( 1000/ andre)
Access: 2018-08-01 07:00:34.367660591 -0600
Modify: 2018-07-25 22:49:31.185911462 -0600
Change: 2018-07-25 22:49:31.185911462 -0600

Which indicates that the original files do have the ms resolution.

I’ve tried the dev version of rclone you provided on my mac:

./rclone --version
rclone v1.42-DEV

  • os/arch: darwin/amd64
  • go version: devel +97fa89497d Tue Jun 26 09:55:39 2018 +0000

./rclone lsl --max-depth 1 home:/zfs/backups/borg/pictures-backup/
26 2018-07-25 22:49:31.000000000 README
164 2018-07-25 22:49:31.000000000 config
191097 2018-08-01 06:59:48.000000000 hints.47885
10485898 2018-08-01 06:59:48.000000000 index.47885

Unfortunately, still not working…

Does stat work on the mac - I think it should?

Can you try this binary?

https://pub.rclone.org/chtimes.zip

It is a compiled version of the code in the chtimes issue

It will create a file called example in the current directory, so can you try it in an APFS directory?

It should print something like this

go version: go1.10.1
mtime after creation: 2018-08-01 20:56:14.053417532 +0100 BST
setting mtime to: 2017-10-31 20:04:35.123456789 +0000 GMT
mtime after set: 2017-10-31 20:04:35.123456789 +0000 GMT

Running chtime on mac:

./chtimes
go version: go1.10.1
mtime after creation: 2018-08-01 13:59:33.452570867 -0600 MDT
setting mtime to: 2017-10-31 14:04:35.123456789 -0600 MDT
mtime after set: 2017-10-31 14:04:35.123456789 -0600 MDT

OK so the reading and setting of ns precision timestamps is working. rclone uses the same API as that test program so it should work too. I’m not sure why not though. Is it something to do with the export of the APFS? I don’t really understand how your system is setup!

The stat command on the mac doesn’t output resolution greater than a second:

stat README
16777220 4316533406 -rw-r–r-- 1 turgeona staff 0 26 “Aug 1 13:57:04 2018” “Jul 25 22:49:31 2018” “Aug 1 13:57:03 2018” “Jul 25 22:49:31 2018” 4194304 8 0 README

Correct me if my thinking is wrong. The above command is ran from my mac and reading a remote (Linux) system. We verified that the remote system contains the ms resolution. How does the mac filesystem come into play during this command? Does rclone determine the resolution supported by the local filesystem and display the supported local resolution (even though the resolution on the remote is greater)?

Is this the code which checks for the precision?

I extracted the method into a small program and ran it on my box. I got 1ns precision.

What happens if the source (Linux) has ms precision and the destination has ns precision? I’m not familiar with the go language and I’m having a hard time reading through the code…

After digging a little more, I believe I understand the problem a little more clearly. My initial rclone command between my mac and my linux backend used the SFTP backend. According to this code:

rclone assumes the precision of the timestamps to be 1s when using sftp.

So it appears that the loss of timestamp precision did not come from the mac’s filesystem but rather from the SFTP backend. This also explains why the rclone ls command only displayed timestamps at 1s resolution.

The network file system will pass through the macOS VFS layer so if that doesn’t support ns timestamps then nothing will. However we’ve determined that it does.

For all operations except checking remote time vs local time, rclone will just read whatever the go runtime and hence the OS provides for the timestamp.

Rclone only uses the resolution when it is doing a sync and is trying to work out if the local file is different in time to remote time.

Yes that is right.

That is what I’d expect.

rclone will read the timestamps at full resolution, but it only uses the precision for the comparison as explained above.

The SFTP protocol only allows setting timestamps to the nearest 1s - you can see this in the RFC

https://tools.ietf.org/html/draft-ietf-secsh-filexfer-13#section-7.7

Ah ha! Mystery solved :slight_smile: I hope you’ve found it an interesting journey!

In https://github.com/ncw/rclone/issues/1505 we are discussing flags for allowing the refresh of the modification time. However B2 does not support setting modification times for existing objects (they are immutable) so this won’t help you with B2.