Atime/relatime support?

rucklex · July 27, 2023, 7:25pm

Hi there! I'm using Proxmox Backup Server to back up some VMs and would like to use an rclone mount as a datastore. I have that working. However, after 24h or so, garbage collection wipes out all chunks (the pieces, X MB each, that machine images are split into for deduplication), destroying every backup's data.

This turns out to be because stage 2 of garbage collection uses each chunk file's atime to decide when it was last in use. On the rclone mount it sees no updated atimes, so it deletes literally everything that wasn't created recently (including, I think, chunks that older backups were deduplicated against). Not great for backups :(. It seems that stage 1 enumerates every backup on record and then touches the atime of every 'in use' chunk. Kind of an interesting design, but seemingly incompatible with rclone.
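To make the two stages described above concrete, here is a minimal Python sketch of an atime-based mark-and-sweep. It is not PBS's actual code: the function names `mark` and `sweep`, the flat chunk directory, and the cutoff value are all assumptions made for illustration.

```python
import os
import time
from pathlib import Path
from typing import Iterable, List


def mark(chunk_dir: Path, referenced: Iterable[str], now: float) -> None:
    """Stage 1 (sketch): bump atime on every chunk a backup still references."""
    for name in referenced:
        path = chunk_dir / name
        if path.exists():
            mtime = path.stat().st_mtime
            os.utime(path, (now, mtime))  # change atime only, keep mtime


def sweep(chunk_dir: Path, cutoff: float) -> List[str]:
    """Stage 2 (sketch): delete every chunk whose atime is older than cutoff."""
    removed = []
    for path in sorted(chunk_dir.iterdir()):
        if path.is_file() and path.stat().st_atime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

If the filesystem silently drops the `os.utime` update in `mark`, every chunk older than the cutoff looks unused to `sweep` and is deleted, which matches the behaviour seen on the mount.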

Is there a way, say by wrapping the remote in crypt/compress/something, to keep the atime PBS requires? Or perhaps this could be a metadata field that's (optionally) turned on? I'm using B2 as my backend (which doesn't support metadata directly, though maybe via their S3 endpoint?), but wouldn't be opposed to wrapping everything in crypt. The backups are already encrypted by PBS, but a second layer wouldn't hurt.

Thanks

nb for others: I had to do one silly workaround, which is to create a .placeholder file in every directory PBS made in .chunks. PBS separates chunk files into prefixed directories, e.g. (.chunks/eb8c/eb8ca....), and when the cache expired the backup process would crash, because it expects those directories to always be there and won't recreate them.
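The placeholder workaround above could be scripted roughly like this. It is only a sketch: the function name `add_placeholders` is made up, and the path you pass in is assumed to be the `.chunks` directory of your own datastore.

```python
from pathlib import Path


def add_placeholders(chunks_root: Path, name: str = ".placeholder") -> int:
    """Create an empty marker file in each prefix directory under .chunks.

    Returns the number of marker files that were newly created.
    """
    created = 0
    for sub in chunks_root.iterdir():
        if sub.is_dir():
            marker = sub / name
            if not marker.exists():
                marker.touch()  # empty file, only there to keep the directory
                created += 1
    return created
```

Running it a second time creates nothing new, so it is safe to repeat after PBS adds directories.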

kapitainsky · July 27, 2023, 7:39pm

PBS is designed to work on local disk, ideally SSD. IMO it is a very bad idea to do what you are trying. An rclone mount is only the illusion of a local disk - it will never be perfect. On top of this, you will never be sure whether your cloud "backup" is consistent or not. One network glitch and what you have in the cloud may be useless.

What I do (I use PBS too): I run the Proxmox backup, and when it has finished I take a ZFS snapshot, then run restic (rclone is not a backup program at all) against that snapshot to create a cloud snapshot. Whether the upload has finished or not does not matter: I always have the last fully finished backup of a consistent PBS datastore state.
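The snapshot-then-restic routine described above might look something like the following sketch. The dataset name, mountpoint, snapshot name, and repository string are hypothetical, and restic is assumed to already have its repository password and backend credentials available in the environment.

```python
import subprocess
from typing import Callable, List


def plan_offsite_backup(dataset: str, mountpoint: str,
                        repo: str, snap_name: str) -> List[List[str]]:
    """Build the commands: freeze the datastore, upload it, drop the snapshot."""
    snapshot = f"{dataset}@{snap_name}"
    # ZFS exposes snapshots read-only under <mountpoint>/.zfs/snapshot/<name>.
    snap_path = f"{mountpoint}/.zfs/snapshot/{snap_name}"
    return [
        ["zfs", "snapshot", snapshot],
        ["restic", "-r", repo, "backup", snap_path],
        ["zfs", "destroy", snapshot],
    ]


def run_plan(commands: List[List[str]],
             runner: Callable = subprocess.run) -> None:
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)
```

For example, `run_plan(plan_offsite_backup("tank/pbs", "/tank/pbs", "b2:my-bucket:pbs", "offsite"))` would back up a frozen view of the datastore. If restic fails, the snapshot is left in place and has to be destroyed (or reused) by hand before the next run.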

rucklex · July 27, 2023, 8:04pm

Hmm, yeah I guess PBS doesn't verify the checksums of the data on disk does it? Just that the files exist?

Thanks for the restic pointer! I don't have a lot of extra local disk, but in this case I'll need to make it a priority then. I can start with the 'most important' ones first, though! If you have any tips/tricks for the restic direction with PBS too I'm all ears

kapitainsky · July 27, 2023, 8:15pm

define your goals:)

PBS is a very nice product thanks to its very close integration with PVE. I have a 2 TB disk for VMs, and PBS uses an 8 TB disk. It works beautifully when I have to roll back any VM. I run hourly live PBS backups.

The cloud part is more for disaster recovery, in case all my servers burn. I run it overnight, once a day, using restic to create the cloud backup.

I use ZFS everywhere, along with local snapshots. This is especially critical for anything cloud - you never know how long a sync will take. But if you run it against a snapshot, you do not care (disk space permitting, of course).

rucklex · July 27, 2023, 8:28pm

Ah I see now, so you snapshot PBS' datastore itself when all is finished then ship that to the cloud. That makes sense.

I guess due to deduplication at its core I could crank that up as well once I expand local storage to cover most everything.

And yeah, just in case of catastrophic failure is probably where it's going to end up for me too now.

Initially, I was hoping to send once-a-day VM backups off to B2 in case the world burned here and I lost my (RAIDed) discs (RAID isn't backup!) to hardware failure, theft, fire, etc., but also in case of any accidents.

Just sort of testing the waters. I did manage to recover from an 'everything died' event in the past using my simple rclone sync job from PBS' local storage to B2, but was wondering if I could do more wild (or expanded) backups with just B2, as the VMs have outgrown the little PBS I have. But as you said, not the best idea.

Hopefully PBS implements their postponed Object Storage backend proposal at some point as well!

kapitainsky · July 27, 2023, 8:32pm

Yes, but it was purely by luck that your sync was successful - we all know that disasters come in pairs :) And a simple sync is an invitation to a perfect storm. What I mean is that from the moment a sync starts until it finishes, you do not have a cloud backup at all, because it is in an undefined state. This is especially true for something like PBS - it is like a Russian matryoshka: one program's data wrapped inside another program's format. If one layer breaks, it is dominoes - everything becomes useless.

A good solution should account for problems at any stage. A simple sync does not - it assumes that it always succeeds.

rucklex · July 27, 2023, 8:41pm

I'm very glad I was lucky then! Thank you, will look into restic going forward

kapitainsky · July 27, 2023, 9:05pm

And do not forget the "classic" example of sync vs. backup with versioning. Some evil people encrypt your data to ransom you (or you just delete something by mistake). You run your next sync - now the whole catastrophe is synced. There is no way back. But everything is in sync :) This was definitely not the goal.
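The difference can be shown with a toy model. This is purely illustrative: plain dicts stand in for files, and `sync` and `VersionedBackup` are made-up names, not how rclone or restic work internally.

```python
from typing import Dict, List


def sync(source: Dict[str, str]) -> Dict[str, str]:
    """A mirror: the destination becomes an exact copy of the current source."""
    return dict(source)


class VersionedBackup:
    """Keeps every past state instead of overwriting the previous one."""

    def __init__(self) -> None:
        self.snapshots: List[Dict[str, str]] = []

    def backup(self, source: Dict[str, str]) -> None:
        self.snapshots.append(dict(source))

    def restore(self, index: int) -> Dict[str, str]:
        return dict(self.snapshots[index])
```

After the source is encrypted or damaged, one more `sync` leaves the mirror with only the bad copy, while `restore(0)` on the versioned backup still returns the earlier good state.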
