Notes. This uses OS-level exclusive access to the file.
This is a very very common strategy on Linux even for professionally coded software (like Linux itself), but it works equally well on Windows.
The benefit of this is that the lock stays active for as long as the script runs, but it releases as soon as the script stops for ANY reason, including a crash or a process kill. Thus it is failsafe. Any other process hitting an active write lock will just fail (and retry later if it is scheduled to do so). If you want that script to "wait in line" then you can code a simple loop for it, but that's usually not necessary unless you are looking for some kind of efficiency optimization.
Also note - the file-lock here is on the script itself (denoted by %~f0, which means "the full pathname of this script"). It does not have to be this file (but that is often convenient and what you need). If you need multiple different scripts to never run at the same time then use a separate file somewhere that will then be the common lock for all the scripts in that group. Any file in any permanent location will do. I don't think it needs to exist beforehand... but only 90% sure on that.
Ask if there is anything you want me to elaborate on
::Before we start, get a file-lock on this script to prevent concurrent instances
call :getLock
exit /b

:getLock
:: The CALL will fail if another process already has a write lock on the script
call :main 9>>"%~f0"
exit /b

:main
:: Body of script goes here. Only one process can ever get here
nice script, thanks,
One more note:
The reason there are 2 functions here rather than one (the first one seems pretty redundant, right?) is that it works around a batch-specific quirk that may cause the lock to fail in certain circumstances if you omit it (i.e. the Linux version can be even simpler). This double-call method makes it safe in batch in all circumstances. If you replicate this in python (which should work just fine) I don't think you will need it.
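For anyone who wants to try the Python route mentioned above, here is a minimal sketch. It is not the same mechanism as the batch redirect trick: it assumes the standard library's fcntl.flock on Linux/macOS and msvcrt.locking on Windows. The names acquire_lock and wait_for_lock are made up for illustration. As with the batch version, the OS drops the lock when the process exits, so there is nothing to clean up after a crash.

```python
import os
import time


def acquire_lock(path):
    """Try to take an exclusive lock on path without waiting.

    Returns the open file object (keep it open to hold the lock),
    or None if another holder already has the lock.
    """
    handle = open(path, "a+")  # "a+" creates the file if it does not exist yet
    try:
        if os.name == "nt":
            import msvcrt
            handle.seek(0)
            # Lock the first byte; raises OSError if it is already locked
            msvcrt.locking(handle.fileno(), msvcrt.LK_NBLCK, 1)
        else:
            import fcntl
            # Non-blocking exclusive lock; raises OSError if already held
            fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


def wait_for_lock(path, attempts=10, delay=1.0):
    """The "wait in line" variant: retry a few times before giving up."""
    for _ in range(attempts):
        handle = acquire_lock(path)
        if handle is not None:
            return handle
        time.sleep(delay)
    return None
```

Typical use would be to call acquire_lock on a lock file at the very top of the script, exit right away if it returns None, and otherwise keep the returned file object open until the job is done.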
well, since i use vss for the source file and date.time including seconds, i do not need the file lock.
EDIT: Script updated to 1.3 to correct the 4 issues mentioned by asdfdsa later down. Now 1.3 for more fixes and improvements...
Well, since you knew the magic word (BOTH of them in fact!!) then I must oblige...
Here you go, all done! (now where do I send the bill? )
Script has been tested, but of course I recommend you also test it thoroughly yourself and see that everything works as you expect before putting it into a live environment. Also check that you feel the formatting of time/date is to your liking. Note that " : " (normally in time formats) is an illegal character on google drive and most filesystems so I had to replace that. I opted for human readability over easy computer parsing.
Makes a "mirror backup" on the cloud of the folder you designate as the source. This includes deletion of files, i.e. if you delete something locally then it gets removed from the "current backup" also.
However, these files are not lost. Instead of being deleted or overwritten they are moved to the "archive" and timestamped. Thus you have a full "back in time" archive for all file revisions ever made in case you should need to recover anything - either due to malware, errors, mistakes or dead harddrives.
The script does not clean up old archive data automatically, so you may want to remove the oldest stuff manually once a year or something (but it is not really required when your storage is unlimited, up to you...).
The system is efficient and does not require any duplicate data to be stored.
New entries in the archive only happen if anything actually changed. Otherwise nothing needs to be done and rclone will just exit after checking everything is up to date.
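The whole mirror-plus-archive behaviour described above comes down to one rclone sync call with --backup-dir pointing at a timestamped archive folder. As a rough illustration only, here is how that call could be assembled from Python. The function names, the example paths and the use of "/" as a separator are assumptions for the sketch; the flags themselves (sync, --backup-dir, --log-file, --log-level) are the ones the batch script in this post uses.

```python
import subprocess


def build_sync_command(rclone, source, dest, archive, stamp, log_dir,
                       log_level="INFO", extra_flags=()):
    """Assemble the rclone sync call as an argument list."""
    return [
        rclone, "sync", source, dest,
        *extra_flags,
        # Files that sync would delete or overwrite in dest are moved here instead
        f"--backup-dir={archive}/{stamp}",
        # One log file per run, named after the same timestamp
        f"--log-file={log_dir}/{stamp}.log",
        f"--log-level={log_level}",
    ]


def run_sync(command):
    """Run rclone and treat exit code 0 as success, like the script does."""
    return subprocess.run(command).returncode == 0
```

Passing the arguments as a list (rather than one long string) avoids quoting trouble when a path contains spaces.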
- Change the settings so they are correct for you (ask if needed but I commented it pretty robustly I think)
- Use task scheduler to schedule the script to run as often as you wish. For example once every hour, but that is totally up to you. There is little to no waste of computer resources to run it often if you wish. Ask if you need help with that.
If the files you want to backup&archive include files with weird or strict permissions that your own user can not normally access then you may want to run the script as SYSTEM account instead of yours. However, if you do then we may need to check that rclone uses the right config. I don't expect this will be needed, but I mention it for the sake of completeness.
Google drive has a maximum upload limit of 750GB/day. That means that you may initially run into that limit and probably need several days to get up to speed. From there it should not be a problem as you most likely won't add or change 750GB per day. Hitting the upload limit is the only circumstance under which rclone will fail to perform the backup&archive (which there is not much we can do about really...)
It won't require any restart or interaction on your part though. It will keep running as scheduled even if that happened.
@asdffdsa Feel free to give this a once-over to see if I missed anything important
:: Archivesync v1.3 by Stigma, credit to asdfdsa(Jojo) for pseudocode assistance, debugging and emotional support ;)
@echo off

::Before we start, get a file-lock on this script to prevent concurrent instances
call :getLock
exit /b

:getLock
:: The CALL will fail if another process already has a write lock on the script
call :main 9>>"%~f0"
exit /b

:main
:: Body of script goes here. Only one process can ever get here

:: --------------SETTINGS START ----------------

:: Set this to the folder where rclone.exe is located on your system
set "rclonepath=C:\rclone"

:: Set this to the folder (or driveletter) you want to protect with backup and archive
set "sourcepath=F:\testsource"

:: Set this to the folder (usually on a remote) you want to save the backup and archive to (does not necessarily have to be an rclone remote)
set "destpath=TD5C1:\scripttest\CurrentBackup"

:: Set this to the folder (usually on a remote) which will contain old "deleted" or "overwritten" revisions of files. I suggest keeping it next to your backup folder but it could be put anywhere.
set "archivepath=TD5C1:\scripttest\archive"

:: Set this path to where you want the logfile to be made.
set "logfilepath=F:\logfiles"

:: Set the detail of logging - from least verbose to most: ERROR or NOTICE or INFO or DEBUG (default, NOTICE, is usually sufficient)
:: see documentation for more info : https://rclone.org/docs/#log-level-level
set "loglevel=INFO"

:: Set any other non-essential flags you want rclone to use. Can leave as empty set "flags=" if you want none
set "flags=--fast-list --progress --drive-chunk-size 64M"

::------------------SETTINGS END------------------

::----------------MAIN SCRIPT START --------------

::Various timestamp formatting in separate function (you can change timestamp there if you want).
call :FORMATTIME

::Make the logfile directory if it doesn't already exist
if not exist "%logfilepath%\" mkdir "%logfilepath%"

echo Archivesync is starting, stand by ...
echo rclone is gathering listing from the destination. This may take a minute if there is a lot of data.
echo:

:: Now let us sync. This makes a mirror of sourcepath to destpath (including removing files if required), and any files that get "overwritten" or "deleted" as a
:: result from destpath, will be moved into archive and timestamped instead - effectively creating a full archive of all revisions of files you have ever had.
"%rclonepath%\rclone" sync "%sourcepath%" "%destpath%" %flags% --backup-dir="%archivepath%\%date%" --log-file="%logfilepath%\%date%.log" --log-level=%loglevel%

echo:

::If exit code of above command was anything but normal, display an error and exit with a non-zero code
if not %ERRORLEVEL% equ 0 (
    echo rclone reported an error during the sync. Check that settings are correct. Check rclone logfile for specific info about the error.
    exit /b 1
) else (
    echo Sync completed successfully!
    exit /b 0
)

::----------------MAIN SCRIPT END -----------------

::--------------HELPER FUNCTIONS START--------------

:FORMATTIME
for /f "usebackq skip=1 tokens=1-6" %%g in (`wmic Path Win32_LocalTime Get Day^,Hour^,Minute^,Month^,Second^,Year ^| findstr /r /v "^$"`) do (
    set day=00%%g
    set hours=00%%h
    set minutes=00%%i
    set month=00%%j
    set seconds=00%%k
    set year=%%l
)
set month=%month:~-2%
set day=%day:~-2%
set hh=%hours:~-2%
set mm=%minutes:~-2%
set ss=%seconds:~-2%

:: This can be easily modified to your liking if you prefer another timestamp format (for archive and logs) - credit to asdfdsa(Jojo)
set date=%year%.%month%.%day%_%hh%.%mm%
exit /b 0

::----------------HELPER FUNCTIONS END--------------
I want to thank each and every one of you for helping me with this.
You guys and gals / they / them's or whatever, are awesome people.
Now I've got something to work with, or at least a framework to build from.
Hopefully, with my limited knowledge of scripting, I'll learn something in the process without sporting the "Mr. Clean" look from pulling my hair out.
Also, if you get the commercial version of your backup going, let me know, and I'll purchase it. Hopefully, you'll sell me a lifetime license.
Currently, the data I am planning on backing up is only 75 gb. So, I won't have to worry about the google backup limit.
I do have another PC I want to backup, that has about 4 TB of data on it, mainly cad files and architectural drawings, revit files, and other boring stuff that sucks up gobs of storage space. Is there a way to rate limit the rclone process in this case? I think the upload is like 10 to 20 mbps. I don't think I can push up more than about 300 gigs a day, if memory serves me correctly, so it may be a non issue.
My Plex server... dear god. It will be easier to just re-download and sync from my seedbox and then using my vps to push up to my rclone drive, than attempting to move 20TB of "files" from my connection to a crypted rclone drive. I'm not really THAT worried about files I can simply download again. I'm more worried about work files I actually spent time creating.
hah! thanks for that. Maybe you could betatest for us or something? Have a chat with @arnoldvilleneuve about that if interested. Testers are kind of something we will need very soon. We already have a working prototype going, but we have lots more planned in terms of alternate deployment options too.
As for scripting, feel free to ask - or just request a change if there's something you need. I have plenty of hair left so I can save you some of yours
You can run the same script there. Just make a second location for the second PC in the settings.
You can limit rclone's speed with --bwlimit 10M (megabytes/sec, not mbit). Setting it to approx 8.5M will make it never hit the limit even with 24/7 uploading going on. Unfortunately we still lack a truly elegant way of detecting and dealing with hitting the quota limit. rclone will just keep trying and kind of stall, not that it matters since it won't be able to upload more anyway. It will work again as soon as the quota resets (sometime around midnight, but it varies from server to server).
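For the curious, this is roughly where the 8.5M figure comes from. The sketch below just divides the 750GB daily quota by the seconds in a day. It assumes plain decimal units throughout (1 GB = 1000 MB); how exactly rclone and Google count a megabyte (1000 vs 1024 based) can shift the result by a few percent, so treat it as a ballpark. The function names are made up for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def max_rate_mb_per_s(daily_quota_gb):
    """Highest sustained rate (MB/s) that stays within a daily quota given in GB."""
    return daily_quota_gb * 1000 / SECONDS_PER_DAY


def daily_volume_gb(rate_mb_per_s):
    """How many GB get uploaded in 24 hours at a constant rate in MB/s."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1000


print(round(max_rate_mb_per_s(750), 2))  # ceiling for a 750 GB/day quota
print(round(daily_volume_gb(8.5), 1))    # volume per day at 8.5 MB/s
print(round(daily_volume_gb(10), 1))     # volume per day at 10 MB/s
```

With these assumptions the ceiling works out to about 8.68 MB/s, so 8.5 stays under the quota (about 734 GB/day) while 10 would go over it (864 GB/day).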
If you want to mass-transfer easily and your local bandwidth is not quite suitable for that much data - I'd recommend using a Google Cloud microinstance VM (Linux based) to do the job for you. Importing data to google is free, and the free-use limits are more than enough for the task, so you can actually do it free (or spend a couple of dollars on a windows server for a week or two if you feel more comfortable in that environment). It's quite cheap really... even if you go above free-use limits. Besides, you get 300USD in free credits to spend the first year so yea...
On a windows box you could literally just run the same script there and leave it running for 27 days - and those 20TB will be done on their own... but of course it's not exactly hard to write a much simpler Linux script for a simple mass-transfer if needed.
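A quick back-of-the-envelope check of the numbers in this thread, as a small sketch. It assumes decimal units (1 TB = 1000 GB, 1 GB = 1000 MB), a link that is fully used around the clock, and no protocol overhead, so real transfers will be somewhat slower. The function names are invented for the example.

```python
import math


def days_at_daily_cap(total_gb, cap_gb_per_day=750):
    """Whole days needed when the only limit is a daily upload quota."""
    return math.ceil(total_gb / cap_gb_per_day)


def days_at_link_speed(total_gb, link_mbit_per_s):
    """Days needed when the only limit is the upload link speed (in Mbit/s)."""
    mb_per_s = link_mbit_per_s / 8          # 8 bits per byte
    seconds = total_gb * 1000 / mb_per_s    # GB -> MB, then divide by MB/s
    return seconds / 86400


print(days_at_daily_cap(20000))                 # 20 TB from a fast server
print(round(days_at_link_speed(4000, 10), 1))   # 4 TB over a 10 Mbit/s uplink
print(round(days_at_link_speed(4000, 20), 1))   # 4 TB over a 20 Mbit/s uplink
```

So the 20TB job is about 27 days at the quota, while the 4TB machine on a 10 to 20 Mbit/s uplink is limited by the link itself (roughly 19 to 37 days) and never gets near the 750GB/day cap.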
This sounds like the perfect job for Restic. I don’t have access to my configs right at this moment. But you would be able to setup pretty easily. It will do a snapshot of the data at the OS level. Which once the first backup is done is very quick. I backup around 30g of data from my VPS to pcloud every few hours and it only takes minutes.
This would achieve what you are looking for (dated backup etc) and you could even increase the frequency. I use it to backup a folder that contains applications, configs etc. So if I have a catastrophic failure of my VPS I can just download a copy!! I keep multiple hourly copies for a short time, then dailies and also weeklies. My retention periods are shortish as that is my requirement.... but easily tweaked to hold data for weeks, months, years!!
You can mount the backups as a local mount if you need to restore an individual file, or you can just stream it all out!! The data is in a protected vault so don’t lose that key!!
I am happy to do a quick write up on what I do if people are interested?? It has saved me a couple of times from ID 10 T errors!!
there is a possible bug, you do not set the filename.
and in the sync command you have
add a variable for flags
set flags=--fast-list -P --drive-chunk-size 64M
i format the date as yyyymmdd, makes it easier to sort in file manager.
add the time and date to the filename of the log file, so each run has it own log file.
it seems that restic is only on version 0.9.5, which is beta correct?
there are many other options that are more mature.
imho, i would not trust backing up my critical file servers using such software.
Good eye my man (or monkey as it may be)
- The unset filename is a straight up mistake - oopsie
- The flags variable: I guess... nice to have maybe, easy to make
- The yyyymmdd date format: Yea I did consider this briefly. Re-thinking about it I think you are right that it would sort a long archive much better that way. Shame that this also makes the date kind of unintuitive. Maybe I should make it display ****y.**m.**d---**h-**m to retain good human readability? (i.e. add a letter identifier to each, reducing confusion). You and me would not be stumped by this, but it's not very intuitive for less technical users if the order is suddenly opposite to the normal. Adding those letter identifiers should not impact sorting as they would be static across the board.
- A separate timestamped log file for each run: Yes, this is a no-brainer improvement.
@Charles_Hayes I will see about making an improvement to this within a day or two - then I will ping you to notify. If you are already using the script your old archive timestamps won't match the new ones (which is not an issue aside from it being messier). Since you won't have that many yet you could probably just manually rename them if that bothers you...
well, even the monkey in me can parse 20191104.1231
the main thing is to have the folder and log files sorted consistently.
03.01.2019 would be listed next to 03.01.2020 and that is confusing and not good human readability.
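To make the sorting point concrete, here is a small Python sketch. It formats the same three moments year-first (the year.month.day_hh.mm layout the script uses) and day-first, then sorts the resulting names as plain text the way a file manager would. The function name archive_stamp and the sample dates are made up for the example.

```python
from datetime import datetime


def archive_stamp(moment):
    """Year-first timestamp with no ':' so it is a legal folder name."""
    return moment.strftime("%Y.%m.%d_%H.%M")


moments = [
    datetime(2020, 1, 3, 9, 0),
    datetime(2019, 11, 4, 12, 31),
    datetime(2019, 1, 3, 18, 5),
]

# Sorting the names as plain text, like a file manager does
year_first = sorted(archive_stamp(m) for m in moments)
day_first = sorted(m.strftime("%d.%m.%Y") for m in moments)

print(year_first)
print(day_first)
```

The year-first names come out in true chronological order, while the day-first names put 03.01.2019 and 03.01.2020 next to each other, ahead of 04.11.2019.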
Agreed, and also that's probably a good compromise at the end there.
@Charles_Hayes As the testing end-user here, feel free to chime in with your opinion. Do you think that is easily readable?
I just got a chance to "play" with this script.
I could not figure out how to get it to work, without mounting the drive.
Maybe I wasn't holding my mouth right?
Anyway, with some lite tweaking, and some cutting and pasting, I got it to backup.
As far as human readable, the way the directory structure is forming on my backup is a folder, in this format, in the archive folder.
Th 11 ---> 07--->2019.20.0.0.1
Yeah, I can make it out, and what it is doing.
I'm using a windows server, not a linux server.
That's the reason I didn't devote a lot of time or attention to restic to begin with, to be honest.
I've had some weird issues with linux inside of windows using the subsystem... and the machine I am backing up is relatively low on ram... 4 gbs total. So, I don't want to toss a gig or so at a vm / subsystem on this machine, it just isn't the ideal solution in my case.
I honestly have 50, 4tb drives, with "crap" I've accumulated over the years, that I'd like to push up. I just know that, if backblaze was any indication... it would take 3 years to transfer 20 tb... let alone 200TB.
yeah, i agree about restic and linux subsystem.
as a windows user, did i mention how to use VSS with rclone.
if you have any questions about veeam and rclone, let me know your questions.
i use veeam agent and veeam backup and replication, free community edition.
the quickest way to wisdom is to learn from the mistakes of others!
save time, learn from my mistakes...
if you want fast uploads for a good price, check out wasabi, get a free trial and make sure to use their new us-east-2 location.
i have a fios gigabit connection and it gets great upload speeds.
I had a "nightmare" scenario with Veeam, ages ago, where the hard drive you were restoring to, had to be equal to, or greater than, the storage of the original drive, not the original data.
I had 4, 1 TB drives, in raid 10, so 2TB total storage, 150 gigs of data... that was originally on a 4TB hard drive. Needless to say, the backup refused to push back to the "smaller" drive, even though the space was more than adequate. This was at like 3 am, after a malware infection at a business client, and my first taste of "Veeam" as I was doing this as a favor for a friend. Odd thing is, there's nobody with 4 or 8 tb hard drives laying around at 3am when you need them.
So, before we go into tutorials, does it still have this limitation?