I have a previous backup on B2 made with rsync. I want to migrate to rclone and avoid re-uploading the data as much as possible. However, the --dry-run output from the sync command seems to indicate that it would update modification times on some files, but copy others, even though they haven't changed.
Using rclone v1.52.2 on an Ubuntu 16.04 LTS VM with 1.5 GB of RAM.
But why is it using checksums for only some of the files?
Also, ideally I'd like rclone to update the mod time on the remote according to the local files so that going forward I don't have to use --size-only and such.
After some more digging, it turns out that the previous backup did not keep the modification times on the files, but updated them to be the time of the backup. So on the remote end, the files have a newer timestamp than on the source.
I tested sync on 2 files, one 39M in size and one 140M. For the first one rclone simply updated the modification time on the B2 remote (setting it correctly to the older timestamp that the source file had), but for the second one it uploaded the file and also modified the time. Then I touched the large file and ran rclone again; this time it simply updated the modification time and did not re-upload the file.
Any idea why rclone isn't simply doing an "update the timestamp" operation for all the files?
andrei@docker-vm:/mnt/Photos/5DayDeal$ rclone -vv --fast-list --progress --b2-chunk-size=40M --b2-upload-cutoff=200M --transfers 16 --exclude .DS_Store sync pb-01-freebies-and-bundle-info.zip b2:qnap-media-sync/Photos/5DayDeal
2020/06/25 20:20:48 DEBUG : rclone: Version "v1.52.2" starting with parameters ["rclone" "-vv" "--fast-list" "--progress" "--b2-chunk-size=40M" "--b2-upload-cutoff=200M" "--transfers" "16" "--exclude" ".DS_Store" "sync" "pb-01-freebies-and-bundle-info.zip" "b2:qnap-media-sync/Photos/5DayDeal"]
2020/06/25 20:20:48 DEBUG : Using config file from "/home/andrei/.config/rclone/rclone.conf"
2020/06/25 20:20:48 DEBUG : fs cache: renaming cache item "pb-01-freebies-and-bundle-info.zip" to be canonical "/mnt/Photos/5DayDeal"
2020-06-25 20:20:48 DEBUG : pb-01-freebies-and-bundle-info.zip: Modification times differ by 7h14m55.3635648s: 2019-02-08 20:48:33.6364352 -0600 CST, 2019-02-09 10:03:29 +0000 UTC
2020-06-25 20:20:48 DEBUG : pb-01-freebies-and-bundle-info.zip: SHA-1 = 8af0d7ea0d28bf600f49fc252363d814bc04931a OK
2020-06-25 20:20:50 INFO : pb-01-freebies-and-bundle-info.zip: Updated modification time in destination
2020-06-25 20:20:50 DEBUG : pb-01-freebies-and-bundle-info.zip: Unchanged skipping
Transferred: 0 / 0 Bytes, -, 0 Bytes/s, ETA -
Checks: 1 / 1, 100%
Elapsed time: 0.0s
And the larger one.
andrei@docker-vm:/mnt/Photos/5DayDeal$ rclone -vv --fast-list --progress --b2-chunk-size=40M --b2-upload-cutoff=200M --transfers 16 --exclude .DS_Store sync pb-03-travel-pro-kit-viktor-elizarov.zip b2:qnap-media-sync/Photos/5DayDeal
2020/06/25 20:22:02 DEBUG : rclone: Version "v1.52.2" starting with parameters ["rclone" "-vv" "--fast-list" "--progress" "--b2-chunk-size=40M" "--b2-upload-cutoff=200M" "--transfers" "16" "--exclude" ".DS_Store" "sync" "pb-03-travel-pro-kit-viktor-elizarov.zip" "b2:qnap-media-sync/Photos/5DayDeal"]
2020/06/25 20:22:02 DEBUG : Using config file from "/home/andrei/.config/rclone/rclone.conf"
2020/06/25 20:22:02 DEBUG : fs cache: renaming cache item "pb-03-travel-pro-kit-viktor-elizarov.zip" to be canonical "/mnt/Photos/5DayDeal"
2020-06-25 20:22:02 DEBUG : pb-03-travel-pro-kit-viktor-elizarov.zip: Modification times differ by 7h14m50.3825645s: 2019-02-08 20:48:43.6174355 -0600 CST, 2019-02-09 10:03:34 +0000 UTC
2020-06-25 20:22:38 DEBUG : pb-03-travel-pro-kit-viktor-elizarov.zip: SHA-1 = d125385715ad983193804c53cb980102b53b359d OK
2020-06-25 20:22:38 INFO : pb-03-travel-pro-kit-viktor-elizarov.zip: Copied (replaced existing)
Transferred: 111.033M / 111.033 MBytes, 100%, 3.135 MBytes/s, ETA 0s
Transferred: 1 / 1, 100%
Elapsed time: 35.4s
So it looks like the smaller file had a SHA-1 checksum on the remote, but the larger one didn't.
I thought rclone wouldn't be using the checksums by default, see this in the docs:
-c, --checksum
Normally rclone will look at modification time and size of files to see if they are equal. If you set this flag then rclone will check the file hash and size to determine if files are equal.
Is there a way to force rclone to update the larger file's mod time?
I think the problem is how the files got uploaded when you used rsync. How did you upload them - to an rclone mount? With which version of rclone?
I don't think these files got checksums (checksums have to be added by the client for large files), so rclone is refusing to just set the modtime, as it isn't sure the files are identical.
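The behaviour described above can be sketched as a small decision function. This is only an illustration of the logic as explained in this thread, not rclone's actual implementation; the function name `decide_action` and its argument layout are made up for the example.

```shell
#!/bin/sh
# Hypothetical sketch of the decision described in this thread; NOT rclone's real code.
# Usage: decide_action SAME_SIZE SAME_MODTIME LOCAL_SHA1 REMOTE_SHA1
#   SAME_SIZE and SAME_MODTIME are "yes" or "no"; pass "" for a missing remote hash.
decide_action() {
    same_size=$1
    same_modtime=$2
    local_sha1=$3
    remote_sha1=$4

    if [ "$same_size" != yes ]; then
        echo upload            # sizes differ: the file has changed
    elif [ "$same_modtime" = yes ]; then
        echo skip              # size and modtime match: nothing to do
    elif [ -n "$remote_sha1" ] && [ "$local_sha1" = "$remote_sha1" ]; then
        echo update-modtime    # content proven identical, only fix the timestamp
    else
        echo upload            # no remote hash (or a mismatch): identity unproven
    fi
}

# The two files from the logs above: the small one has a remote SHA-1,
# the large one is assumed to have none.
decide_action yes no 8af0d7ea0d28bf600f49fc252363d814bc04931a 8af0d7ea0d28bf600f49fc252363d814bc04931a
decide_action yes no d125385715ad983193804c53cb980102b53b359d ""
```

With these inputs the first call prints `update-modtime` and the second prints `upload`, which matches what the two sync runs did.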
However if you try the latest beta then you can do a sync with
--refresh-times Refresh the modtime of remote files.
This will set the modtime even if the files don't have a checksum. I suggest you try it with --dry-run first, then on a few files, and then run it on everything. You won't need that flag again once the modtimes are synced.
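A cautious way to follow that advice might look like the sketch below. It assumes an rclone build that already has `--refresh-times` (a beta at the time of this thread). The helper name `refresh_modtimes` and the `RCLONE` override variable are made up for the example, and the paths are the ones from the logs above.

```shell
#!/bin/sh
# Hypothetical helper: runs `rclone sync --refresh-times` with any extra flags.
# RCLONE may point at a beta binary; it defaults to the rclone found in PATH.
# Usage: refresh_modtimes SRC DST [extra rclone flags...]
refresh_modtimes() {
    src=$1
    dst=$2
    shift 2
    "${RCLONE:-rclone}" sync --refresh-times "$@" "$src" "$dst"
}

# 1. Preview only; nothing is changed on the remote:
#      refresh_modtimes /mnt/Photos/5DayDeal b2:qnap-media-sync/Photos/5DayDeal --dry-run -v
# 2. If the preview looks right, run it for real:
#      refresh_modtimes /mnt/Photos/5DayDeal b2:qnap-media-sync/Photos/5DayDeal -v
```

Keeping `--dry-run` as an explicit extra argument makes it harder to run the real sync by accident.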
Actually, I was wrong: the initial backup was seeded with QNAP's HybridBackupSync tool, so I think you're right that it didn't add the checksums for large files.
I'll give --refresh-times a shot!
Can rclone add checksums to those large files too without re-uploading them, or is that impossible?
It is theoretically possible. You'd have to download the file to checksum it or get the checksum from a local copy. Rclone can't do it right now though, sorry!
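Getting the checksum from a local copy, as mentioned above, could look like this. It is only a sketch: it assumes GNU coreutils `sha1sum` is installed, and the helper name `local_sha1` is made up. It computes the hash locally and does not attach it to the file on B2.

```shell
#!/bin/sh
# Hypothetical helper: print only the SHA-1 hex digest of a local file.
local_sha1() {
    sha1sum "$1" | cut -d' ' -f1
}

# Example: hash one of the files from the logs above, if it exists on this machine.
f=/mnt/Photos/5DayDeal/pb-03-travel-pro-kit-viktor-elizarov.zip
if [ -f "$f" ]; then
    local_sha1 "$f"
fi
```

The digest can be compared by eye against the `SHA-1 = ... OK` line that rclone prints with -vv, as in the logs earlier in this thread.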