[web] Automated Website Backup?

Started by
4 comments, last by leiavoia 15 years, 2 months ago
I'm currently planning a new website, and I'm trying to decide on a good way to regularly back it up. The backup will need to handle regular files as well as MySQL databases. It will also need to be automated, because if it involves manual steps I'll keep forgetting to do them. My current idea is to automate a backup to my home computer (an iMac running Mac OS X) and into my local backup system. I'm thinking something along these lines:
  1. Write some kind of cron job script to make daily copies of all the files and database tables, and store them in a backup directory outside the public site (sketched below).
  2. When my home computer is on, it runs its own cron job that regularly checks whether a backup is available (or simply assumes one will be ready), connects automatically through some kind of file transfer protocol, and recreates a copy of the backup directory here.
  3. Once the directory is here on my computer, it will automatically get backed up by my own system (I've got an Apple Time Machine system and do weekly backups of the repository)
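For step 1, the cron script could be as simple as something like this (paths, database name, and credentials are placeholders; --single-transaction assumes InnoDB tables):

```bash
#!/bin/sh
# Daily server-side backup sketch. Paths, database name, and credentials
# are placeholders. Run from cron, e.g.: 30 3 * * * /home/me/bin/backup.sh
DATE=$(date +%Y-%m-%d)
BACKUP_DIR="/home/me/backups/$DATE"   # outside the public web root
mkdir -p "$BACKUP_DIR"

# Snapshot the site files.
tar -czf "$BACKUP_DIR/site-files.tar.gz" /var/www/mysite

# Dump the database. --single-transaction gives a consistent dump of
# InnoDB tables without locking them.
mysqldump --single-transaction -u backupuser -p'secret' mydb \
    > "$BACKUP_DIR/mydb.sql"
```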
The tricky parts I haven't yet decided on are:
  • I'll need to keep multiple backups; otherwise, if something gets corrupted, the corruption will likely be backed up before I get a chance to fix it. This might mean making multiple directories named by date.
  • The files won't change very often, so I probably only need to back up the changes, for example with rsync. However, I need the website in a form I can easily restore, so I need a way of keeping snapshots of the whole file system (see the rsync sketch after this list).
  • The database tables, however, are likely to change. If the SQL dump is a single file, it will always differ from the last one, and it's likely to become the largest file after a while. Sending the whole file over the internet every time seems like overkill, so I'd prefer to send some kind of diff.
  • My bandwidth isn't the best, so I'd prefer to send only a small amount of data at a time. Sending diffs instead of everything, and compressing them, would help. Encrypting the transfer would be nice too.
  • Copying files across is all very well and good, but if I don't have them in a form that I can restore from then they're useless. Ideally I could recreate a copy of my website here locally for testing.
  • My home computer isn't going to be on all the time, so I can't rely on it being on at any specific time.
  • I'm sure there's a system out there that already does something like this, but I don't know what it is!
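Something like this is what I'm imagining for the home-machine side (host and paths are placeholders): rsync's --link-dest option makes dated snapshots where unchanged files are hard links to the previous day's directory, so each snapshot is a full restorable copy but only changes cross the wire.

```bash
#!/bin/sh
# Pull a dated snapshot from the web host (sketch; host and paths are
# placeholders). Unchanged files are hard-linked to yesterday's snapshot,
# so each directory is a full copy but only changed files use bandwidth
# or disk. Transfers run over ssh, and -z compresses them.
TODAY=$(date +%Y-%m-%d)
YESTERDAY=$(date -v-1d +%Y-%m-%d)   # BSD date (Mac OS X); GNU: date -d yesterday
rsync -az --delete \
    --link-dest="/Volumes/Backup/site/$YESTERDAY" \
    me@myhost.example.com:backups/ \
    "/Volumes/Backup/site/$TODAY/"
```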
The other alternative would be to get some kind of online backup service separate from my web host, and send regular copies across to them. I'm sure there are services out there that do this. The only downside is that this method wouldn't leave me a local copy, which would be useful for testing, but I could use it in tandem with the first method if I wanted to be really safe. My queries:
  • Is there some kind of library out there that can deal with this for me, or will I have to roll my own?
  • If I'm writing this myself, what do you recommend for the tricky parts? (dealing with multiple backups, sending small amounts of data, reconstructing restorable copies)
  • Or is there a good online service that could deal with this for me?
Thanks in advance!
I would:
  1. Set up an SVN server on my local machine, with an identity certificate if the data to be stored is critical.
  2. Set up a cron job on the server, which uses rsync locally to move changed files into an SVN checkout directory. It should also use mysqldump with appropriate options to generate a diff-friendly dump of the database there (see the sketch after this list).
  3. Set up a cron job on my local machine, which pings a certain URL on the server. This lets the server know the address of the local machine, and thus of the SVN server.
  4. In response to the URL ping, the server commits the SVN backup directory. You now have the latest backup stored in your SVN repository.
  5. Use the SVN administration tools to erase revisions that are too old, to keep disk usage down.
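The "diff-friendly" part mostly means one INSERT per row, so small data changes produce small deltas in the repository. A sketch of the server-side cron job from step 2 (paths, database name, and credentials are placeholders):

```bash
#!/bin/sh
# Server-side cron job (sketch): refresh an SVN working copy with the
# latest site files and a diff-friendly database dump. Paths, database
# name, and credentials are placeholders. The commit itself happens
# later, when the home machine pings (step 4).
WC=/home/me/backup-wc   # SVN working copy, outside the web root

# Mirror the site files into the working copy, leaving .svn dirs alone.
rsync -a --delete --exclude='.svn' /var/www/mysite/ "$WC/site/"

# --skip-extended-insert writes one INSERT per row, so unchanged rows
# produce unchanged lines and SVN only stores the delta.
mysqldump --skip-extended-insert --single-transaction \
    -u backupuser -p'secret' mydb > "$WC/mydb.sql"

# Schedule any new or deleted files for the next commit.
svn add --force "$WC" > /dev/null
svn status "$WC" | awk '/^!/ {print $2}' | xargs -r svn rm
```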
That's not a bad idea! This is essentially a repository, just configured slightly differently from what I'm used to seeing.

The only downside of that configuration is that there's only one copy on the remote server. If I don't pull that copy down to my local computer (say I go on holiday for a week), it will get wiped by the next update.

Maybe I could mirror the SVN repository on both the local and remote servers? I don't know if that's possible. Or I could have the repository on the remote server, check out a copy here locally, and then that copy would get backed up by my local systems. I'd have to think about it a bit, but it's an interesting solution.
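Apparently svnsync can mirror one repository into another, so something like this might work for keeping a local mirror (URLs and paths are placeholders):

```bash
# One-time setup on the iMac: create an empty mirror repository and point
# svnsync at the remote one (URLs and paths are placeholders).
svnadmin create /Users/me/svn-mirror
# svnsync needs to set revision properties, which requires a
# pre-revprop-change hook that allows it.
printf '#!/bin/sh\nexit 0\n' > /Users/me/svn-mirror/hooks/pre-revprop-change
chmod +x /Users/me/svn-mirror/hooks/pre-revprop-change
svnsync init file:///Users/me/svn-mirror http://myhost.example.com/svn/backup

# Then, whenever this machine is on, pull any new revisions across.
svnsync sync file:///Users/me/svn-mirror
```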
You could also back up to a different server instead of to your home PC. I have broadband with a fixed IP address at home, so I put my development server at home. It simply uses rsnapshot to create backups of the live server. Since both servers run 24/7 I don't have to worry about checking dates or connections. Just put rsnapshot in a cron job and sleep happily :-)
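A minimal rsnapshot.conf is only a few lines; something like this (host and paths are placeholders, and note that rsnapshot insists on tabs, not spaces, between fields):

```
# Minimal rsnapshot.conf sketch (host and paths are placeholders).
# rsnapshot requires tabs, not spaces, between fields.
snapshot_root	/var/backups/snapshots/

# Keep the last 7 daily and 4 weekly snapshots; older ones rotate away.
retain	daily	7
retain	weekly	4

# What to back up from the live server (over ssh).
backup	me@myhost.example.com:/var/www/mysite/	live/
backup	me@myhost.example.com:/home/me/db-dumps/	live/
```

Then cron just runs "rsnapshot daily" every night and "rsnapshot weekly" once a week.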


Ah, thanks! rsnapshot looks exactly like the sort of thing I need. I'll put it on my list of things to tinker with.
I would actually recommend against the SVN idea unless you enjoy wasting time and losing hair.

I have something similar set up at work using SVN, rsync, and a small handful of test, demo, and production servers. It gets a bit messy.

The biggest issue is that SVN leaves a cute little hidden .svn directory in every single directory under version control. If you put that on production servers, you get all those stupid little things up there as well. You could block access to them or something, but it gets messy, like I said. You could also use SVN's export function to publish a clean copy without the .svn directories, but that was ugly too (I can't remember why I didn't like that idea; it's been a while). And finally, you have to put everything under version control and jump through all the add/commit hoops.

For your project I would just use rsync. Rsync has a little-used --backup option that, combined with --backup-dir, stores dated copies of any files that change during a transfer. If you ever need to go back in time it's not too hard, but you won't be able to "roll back" like with SVN. Normally that's not an issue, because you are just moving files around, not developing new features.
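Something like this, for example (host and paths are placeholders): the current mirror stays restorable as-is, and anything that changed or disappeared gets tucked into a dated directory instead of being lost.

```bash
#!/bin/sh
# Mirror the live site, keeping dated copies of anything that changed
# (sketch; host and paths are placeholders).
DATE=$(date +%Y-%m-%d)
rsync -az --delete \
    --backup --backup-dir="/backups/changed-$DATE" \
    me@myhost.example.com:/var/www/mysite/ \
    /backups/current/
```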

In fact, there are some simple web-based publishing scripts (in Perl and such) out there that let you select the files you want to move, and then you just click GO and it runs. I've used them before, so I know they exist!
