This may be a bug. Or me not understanding the tool well enough.
I seeded a backup (generated by BorgBackup) on B2 last week. Because this was a large transfer [and my home upstream bandwidth is abysmal], I first rcloned the files to my laptop and from work, rcloned it again to B2. That part went smoothly enough. I used the MacOS version of Rclone (v1.42) to accomplish this.
The problem manifested itself when I did my incremental backup on Borg and attempted to sync the result to B2. This was done from my backend server (Ubuntu, also v1.42 of Rclone). I used the following command:
2018/07/28 10:10:51 INFO : README: Copied (replaced existing)
2018/07/28 10:10:51 INFO : config: Copied (replaced existing)
2018/07/28 10:10:52 INFO : data/0/10: Copied (replaced existing)
2018/07/28 10:10:52 INFO : data/0/100: Copied (replaced existing)
I was fairly confident some of these files had not changed since my initial seeding. I checked on backblaze.com and indeed, I had several versions of these files. The modified time (metadata on B2) and SHA1 were identical. It seems like those files should have been skipped during the sync process.
Workarounds:
I was able to get around the issue several ways using the following flags:
The --checksum flag rclone sync -v --checksum /path/to/borg/backup b2:bucket-name
This made the sync operation much slower so it's not a great work around...
By using the --update flag rclone sync -v --update /path/to/borg/backup b2:bucket-name
This was faster than the --checksum flag above. It felt a little too open ended to me so I opted for the next one.
By using the --modify-window 1s flag rclone sync -v --modify-window 1s /path/to/borg/backup b2:bucket-name
I tried smaller windows like 500ms but those didn't work.
This is caused by the time precision on the linux machine being 1ns but on the mac only 1s.
Unfortunately b2 can’t update the timestamp of an existing object (it is part of the immutable object) so rclone has to upload it again to update the timestamp. -vv will give more detail on what is going on.
It looks like you’ve found all the work-arounds. I would use the --modify-window one, and 1s is the correct value since that is the precision of most macOS file systems.
Do you know which file system you are running on your mac? I know that the old HFS+ file system only has 1second resolution but I think I read that there is a newer file system that mac's use now.
If you use ls -l --full-time on a Linux machine that will show you the full precision of the timestamps. I don't think that is supported by the macOS ls though. You can use rclone to see what it thinks rclone lsl --max-depth 1 . too.
From what I read online, APFS uses nanosecond based timestamps. I'm guessing that rclone doesn't support APFS's higher resolution timestamps yet?
Or perhaps a regression related to this:
What version of the go runtime is used in MacOS Rclone?
It would probably be worth touching a file (or making a new file with touch) just to make sure as these files might have been copied from a FS with second precision?
Ah, I just remembered stat should show you the full time precision - what do you get if you stat those files?
I made an rclone beta compiled with a recent go beta compiler:
OK so the reading and setting of ns precision timestamps is working. rclone uses the same API as that test program so it should work too. I'm not sure why not though. Is it something to do with the export of the APFS? I don't really understand how your system is setup!
Correct me if my thinking is wrong. The above command is ran from my mac and reading a remote (Linux) system. We verified that the remote system contains the ms resolution. How does the mac filesystem come into play during this command? Does rclone determine the resolution supported by the local filesystem and display the supported local resolution (even though the resolution on the remote is greater)?
I extracted the method into a small program and ran it on my box. I got 1ns precision.
What happens if the source (Linux) has ms precision and the destination has ns precision? I’m not familiar with the go language and I’m having a hard time reading through the code…
After digging a little more, I believe I understand the problem a little more clearly. My initial rclone command between my mac and my linux backend used the SFTP backend. According to this code:
rclone assumes the precision of the timestamps to be 1s when using sftp.
So it appears that the loss of timestamp precision did not come from the mac’s filesystem but rather from the SFTP backend. This also explains why the rclone ls command only displayed timestamps at 1s resolution.
The network file system will pass through the macOS VFS layer so if that doesn't support ns timestamps then nothing will. However we've determined that it does.
For all operations except checking remote time vs local time, rclone will just read whatever the go runtime and hence the OS provides for the timestamp.
Rclone only uses the resolution when it is doing a sync and is trying to work out if the local file is different in time to remote time.
Yes that is right.
That is what I'd expect.
rclone will read the timestamps at full resolution, but it only uses the precision for the comparison as explained above.
The SFTP protocol only allows setting timestamps to the nearest 1s - you can see this in the RFC
Ah ha! Mystery solved I hope you've found it an interesting journey!