Rsnapshot backup system for remote network backup

Background/Overview

I have been interested in using rsync and hard links to back up my computer systems for a long time, along the lines of Mike Rubel’s idea. I dabbled here and there, but never finished putting something together. Well, I finally got it done. I used a number of different sites for help, and even made a couple of posts to the Rsnapshot website.

The backup was for both my personal storage sites and for my church. The requirements were:

  1. Ability to back up systems located on a remote network over a non-secure network
  2. Automatic backup that requires no interaction/reminders from the user except to leave the computer powered on!
  3. Provide the ability to recover files that have changed over a long period of time (upwards of 12 months)
  4. Minimize the amount of storage space needed to store files.
  5. I would really like to have a push model from the client

A number of open source or very inexpensive programs exist, but I ran into a few issues. All of the solutions that I looked at had one major issue: either they could not operate over a remote network or they didn’t have a client for Windows. If someone has a suggestion of something that might work, please let me know.
I broke the solution down into 3 parts.

  1. Server
  2. Client
  3. Connection and Automation

Server

My “server” in both cases was an old machine that had been upgraded with a relatively large 300GB hard drive. Each had a 1GHz processor and 384MB of RAM. They had recently been taken out of use when they were replaced by new PCs at my home and the church.

When I started this project, my exposure to Linux had been limited to some quality time with IPCop and a few other purpose-built distros. For this, however, I needed something a little more full featured, and on a friend’s recommendation I chose Ubuntu. Being new to Linux, I downloaded the desktop edition since it included a GUI, but after this experience I don’t think I would if I were to start all over again.

There are a number of install instructions for Ubuntu on the web, and the install itself is rather straightforward. After you have installed the OS, you need to install Rsnapshot. It might be available under the Add/Remove section, but you might also have to install it via apt-get (sudo apt-get install rsnapshot). Good instructions for this are available on the Rsnapshot website.

After you have completed the install of rsnapshot, you need to find the rsnapshot.conf file. It is likely located at /etc/rsnapshot.conf.

TIP: Something I found a little frustrating throughout the first hour or two is that this file is owned by the root user, which means you cannot edit it as a normal user. I used the following command to change the owner to my user. From a terminal:

sudo chown BACKUPUSER  /etc/rsnapshot.conf

At this point you need to edit the rsnapshot.conf file. A really great resource for configuring it is here. It explains most of the options pretty well. A couple of them have hidden gems that you can use to make your life easier. I will explain them below.

snapshot_root: can be anywhere on the machine, but it has to be writable by the user running the process. I decided not to run Rsnapshot as root, so I made sure the BACKUPUSER had permission to write to a directory called /backup.
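A quick way to confirm the permission before the first run is to test the directory while logged in as the backup user. This is only a sketch: check_snapshot_root is a made-up helper name, not something rsnapshot provides.

```shell
#!/bin/bash
# check_snapshot_root is a hypothetical helper, not an rsnapshot command.
# It reports whether the current user can write to the given directory.
check_snapshot_root() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable by $(id -un): $dir"
        return 1
    fi
    echo "ok: $dir"
}

# Example (run as BACKUPUSER):
#   check_snapshot_root /backup
# If the directory is missing, something like this should create it:
#   sudo mkdir /backup && sudo chown BACKUPUSER /backup
```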

verbose and loglevel: It can be very helpful to set these to 4 or 5 when you first run; it makes it a lot easier to see what is happening if you do have errors. You may want to set them to lower levels for a production setup to save disk space.

rsync_short_args: I decided to use the -az options to take advantage of compression since network bandwidth was more important to me than processor utilization.
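Put together, the settings above might look like the lines this sketch prints. rsnapshot expects a tab, not spaces, between a parameter and its value, which is easy to get wrong in an editor, so the helper (a hypothetical conf_line, not part of rsnapshot) makes the tab explicit. The values are the ones discussed above; the trailing slash on the directory simply follows the convention of the stock config file.

```shell
#!/bin/bash
# conf_line is a hypothetical helper: it joins its arguments with tabs,
# which is the separator rsnapshot.conf expects.
conf_line() {
    local IFS=$'\t'
    printf '%s\n' "$*"
}

# Print the settings discussed in the text, one per line.
print_settings() {
    conf_line snapshot_root /backup/
    conf_line verbose 4
    conf_line loglevel 4
    conf_line rsync_short_args -az
}

print_settings
```

The output can be compared against the matching lines in rsnapshot.conf, or pasted over them.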

Finally you have to configure the clients. I tried to connect directly via ssh and kind of got it to work, but found that errors in Cygwin made things less than reliable. This was confirmed by a very quick response on the Rsnapshot mailing list. (This list is very helpful, but be sure to read the archives first; there is a lot of good information in them.)

So the approach I took was to port forward. This took me a while to figure out, but again I found a good resource on the web here. It will make more sense in the Connection and Automation section, but I can provide a quick explanation of what is going on here.

To run reliably, we connect to a port on the server that has been forwarded to an rsync daemon running on the client. Thus to back up PC1, port 4651 on the server was forwarded to port 873 on PC1. Then each rsync module on that machine was backed up to a folder in the backup directory. I chose the ports based on convenience and used a numbering scheme where the IP of the machine in the remote network indicated the port that would be used (i.e. xxx.xxx.xxx.51 gets backed up via port 4651).
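The numbering scheme is simple enough to write down as a one-line calculation. The sketch below assumes the rule is "4600 plus the last octet of the client's address", which matches the .51 to 4651 example; port_for_ip is a made-up name. Note that under this assumption an octet of 100 or more lands outside the 46xx range (for example .153 gives 4753), which still works as a port but no longer reads as neatly.

```shell
#!/bin/bash
# port_for_ip is a hypothetical helper illustrating the numbering scheme:
# local port = 4600 + last octet of the client's LAN address.
port_for_ip() {
    local last_octet="${1##*.}"
    # 10# forces base 10 so an octet written as "08" is not read as octal
    echo $((4600 + 10#$last_octet))
}

port_for_ip 192.168.0.51   # prints 4651
```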

###############################
### BACKUP POINTS / SCRIPTS ###
###############################

#Backup of PC1
backup    rsync://127.0.0.1:4651/quickbooks/    PC1/quickbooks
backup    rsync://127.0.0.1:4651/staff/    PC1/staff

#Backup of PC2
backup    rsync://127.0.0.1:4652/d/    PC2/d/
backup    rsync://127.0.0.1:4652/desktop/    PC2/desktop/

At this point I would recommend running rsnapshot -t daily (assuming your most frequent backup is daily). The error messages are pretty helpful and will tell you if you have made any mistakes.
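rsnapshot also has a configtest command that checks the syntax of the config file, and it can be worth running that before the dry run. A small wrapper, sketched here under the made-up name preflight, runs both in order and stops at the first failure.

```shell
#!/bin/bash
# preflight is a hypothetical wrapper: syntax-check the config first,
# then do a dry run (-t prints the commands without executing them).
preflight() {
    local interval="${1:-daily}"
    if ! command -v rsnapshot >/dev/null 2>&1; then
        echo "rsnapshot not found in PATH" >&2
        return 127
    fi
    rsnapshot configtest && rsnapshot -t "$interval"
}

# Example:
#   preflight daily
```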

Setup SSH connections on Server

To allow the server to connect to the remote machines, you need to set it up so the server can access them without a password. I actually used two separate sites to set this up. The first made it really easy for a beginner, with step-by-step instructions. I didn’t use step 2 for reasons explained below, but you can use it if you are able.

The second site I used was here. I started at the section labeled Making A SSH Host Configuration Entry. You will edit a file called config in your .ssh directory. It can likely be found at /home/BACKUPUSER/.ssh/config. I have provided a sample of my file below to demonstrate:

Host PC1
Hostname server1.dyndns.org
User SvcwRsync
Port 4653
HostKeyAlias Port4653
LocalForward 4653 localhost:873
CheckHostIP no
Host PC2
Hostname server1.dyndns.org
User SvcwRsync
Port 4660
HostKeyAlias Port4660
LocalForward 4660 localhost:873
CheckHostIP no

This file allows you to connect to remote machines by simply typing the Host name in the command line!

Tip: All the machines I back up are behind a single real IP address. To get this to work I had to add the HostKeyAlias and CheckHostIP no parameters. I am not 100% sure of the security implications, but it worked.
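As far as the ssh_config man page describes them, HostKeyAlias makes ssh store and look up the host key under the alias instead of the real hostname (so several clients behind one hostname do not collide in known_hosts), and CheckHostIP no skips the extra check of the IP address against known_hosts. Since every entry follows the same pattern, adding another client is mostly a matter of filling in three values. The generator below is only a sketch; ssh_host_entry is a made-up name, and the user name and rsync port are copied from the sample file above.

```shell
#!/bin/bash
# ssh_host_entry is a hypothetical generator for the stanza format shown
# above. Arguments: host alias, public hostname, port opened on the firewall.
ssh_host_entry() {
    local name="$1" hostname="$2" port="$3"
    cat <<EOF
Host $name
Hostname $hostname
User SvcwRsync
Port $port
HostKeyAlias Port$port
LocalForward $port localhost:873
CheckHostIP no
EOF
}

# Example: append an entry for a third client (values are placeholders)
#   ssh_host_entry PC3 server1.dyndns.org 4661 >> ~/.ssh/config
```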

I lost the links that helped me figure it out, but most of the information came from the SSH man page.

After you have done all of this, you are ready to set up your individual clients.

Client

I tried a couple of different things, but what ended up working best for me was to have rsync running on the local machines. This was facilitated by the program cwRsync from ITeF!x. I wrote some quick directions as I was doing this to make sure I didn’t make a mistake on the tenth machine.

1. Install cwRsync Server

  • Don’t forget to check the SSH option

2. Set up Rsync and SSH

  • Set the OpenSSH and Rsync servers to auto start as services in Windows. This can be found via Start > Control Panel, then Administrative Tools, then Services.

3. Update authorized_keys and authorized_keys2

  • I was never able to figure out how to log in via ssh with a password. There seemed to be some password issue, even when using the password cwRsync gave at install. This was a problem for others as reported in the forums, and at the time I did the work no resolution was posted. If you can get this to work, try step 2 in the tutorial from Cory, but you can work around it just fine.
  • From your server, copy /home/BACKUPUSER/.ssh/authorized_keys and authorized_keys2 to a thumb drive or disk to install on all the clients. I made a quick edit to the files to add from="XXX.XXX.XXX.XXX" in front of each key. This means that the ssh server on the clients should only accept connections from that address. This could be an issue if you have a non-static IP address, but mine is “sticky” and hasn’t changed in over 18 months. I can always update it if needed.
  • Copy and paste the information in the authorized_keys and authorized_keys2 files to the files with the same names in the C:\Program Files\cwRsyncServer\var\SvcwRsync\.ssh directory.
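The from= edit can be done by hand, but with several keys or several clients a small filter saves retyping. The sketch below assumes each key sits on its own line and that plain ASCII double quotes are used around the address (sshd will not understand curly quotes). restrict_keys is a made-up name.

```shell
#!/bin/bash
# restrict_keys is a hypothetical filter: it prefixes every key line read
# from stdin with a from="..." option, so that sshd only accepts that key
# from the given address. Blank lines and comments pass through unchanged.
restrict_keys() {
    local allowed_ip="$1" line
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            ''|'#'*) printf '%s\n' "$line" ;;
            *)       printf 'from="%s" %s\n' "$allowed_ip" "$line" ;;
        esac
    done
}

# Example (XXX.XXX.XXX.XXX stands for the backup server's public address):
#   restrict_keys XXX.XXX.XXX.XXX < authorized_keys > authorized_keys.restricted
```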

4. Prepare rsyncd.conf file

  • Don’t forget to back up desktops and email! Especially when you have Windows computers, files can be
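A starting point for the rsync daemon configuration might look like the file this sketch writes. The module names match the backup points used on the server earlier (quickbooks and staff), but the Windows paths are placeholders and the hosts allow line is an assumption: it relies on connections arriving through the SSH tunnel and therefore appearing to come from the local machine. write_rsyncd_conf is a made-up helper; the text between the EOF markers can just as well be pasted into the file by hand.

```shell
#!/bin/bash
# write_rsyncd_conf is a hypothetical helper that writes a sample daemon
# config to the file named in $1. Paths below are assumed, not the author's.
write_rsyncd_conf() {
    cat > "$1" <<'EOF'
use chroot = false
strict modes = false
# only connections arriving via the SSH tunnel (local) may reach the daemon
hosts allow = 127.0.0.1

[quickbooks]
path = /cygdrive/c/QuickBooks
read only = true

[staff]
path = /cygdrive/c/Staff
read only = true
EOF
}

# Example:
#   write_rsyncd_conf rsyncd.conf
```

Each [name] section is one rsync module, and the name is what appears after the port in the rsync:// URLs in rsnapshot.conf.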

5. Secure SSH Server

  • To make it more difficult to break into the remote machines, I changed the sshd_config file located in C:\Program Files\cwRsyncServer\etc. I uncommented and changed the line # PasswordAuthentication yes to PasswordAuthentication no. From my understanding this should make it so the SSH server on the client doesn’t accept logins based on passwords.
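If a Cygwin or Linux shell is at hand, the same edit can be scripted so it is applied identically on every client. This is a sketch, with disable_password_auth as a made-up name; it rewrites the setting whether or not the line is commented out, and the ssh service most likely needs a restart before the change takes effect.

```shell
#!/bin/bash
# disable_password_auth is a hypothetical helper: it rewrites any
# "PasswordAuthentication <value>" line (commented out or not) to "no".
disable_password_auth() {
    local file="$1"
    sed 's/^[#[:space:]]*PasswordAuthentication[[:space:]]\{1,\}[a-z]*[[:space:]]*$/PasswordAuthentication no/' \
        "$file" > "$file.new" && mv "$file.new" "$file"
}

# Example (keep a copy of the original first):
#   cp sshd_config sshd_config.orig
#   disable_password_auth sshd_config
```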

6. Firewall Configuration

  • Depending on your installed OS you may need to configure the local firewall to enable the SSH connections. I almost went crazy with problems related to SSH connections being refused, until I opened up the port on the local firewall and it worked just fine.

Connection and Automation

I took advice from this page when trying to figure out how to automate everything. I missed the cmd_preexec and cmd_postexec options in rsnapshot.conf when I started, but I imagine the script described below could be split into two pieces and run by these options.

FILE: BACKUPPLAN

#!/bin/bash
# Usage: BackupPlan INTERVAL   (daily, weekly or monthly)

# Get ProcessID Function
GetPid()
{
# get process IDs of the running ssh tunnels, if any
PID=$(ps x | grep "ssh -f -N" | grep -v grep | awk '{print $1}')
}

#Connection to PC1 (PC1 is a Host entry in ~/.ssh/config)
ssh -f -N PC1

#Connection to PC2
ssh -f -N PC2

#Run rsnapshot program to backup!
rsnapshot "$1"

#Kill all ssh tunnels
GetPid
# $PID is unquoted on purpose: it may hold several process IDs
if [ -n "$PID" ]; then
kill $PID
fi

This script establishes the connections, kicks off rsnapshot, and then closes the connections. It works pretty well for me, and I just schedule it via crontab. Below you can see the crontab settings.
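One possible refinement, since clients are sometimes powered off: check that the local end of each tunnel is actually listening before rsnapshot starts. The sketch below is not part of the script above; wait_for_port is a made-up helper that relies on bash's /dev/tcp redirection (so it needs bash rather than plain sh), and it only proves that the ssh tunnel is up, not that the rsync daemon on the client is answering.

```shell
#!/bin/bash
# wait_for_port is a hypothetical helper: it retries once per second until
# something accepts TCP connections on host:port, or the attempts run out.
wait_for_port() {
    local host="$1" port="$2" tries="${3:-10}"
    while [ "$tries" -gt 0 ]; do
        # The subshell opens (and on exit closes) a test connection.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Example, between opening the tunnels and starting rsnapshot:
#   wait_for_port 127.0.0.1 4653 || echo "tunnel to PC1 is down" >&2
```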

# m h  dom mon dow   command
0 2 * * * /bin/bash /home/backup/BackupPlan daily >> /home/backup/backup.log
0 18 * * 6 /bin/bash /home/backup/BackupPlan weekly >> /home/elmills/backup.log
0 21 2 * * /bin/bash /home/backup/BackupPlan monthly >> /home/elmills/backup.log

Firewall

You have to set up your firewall to forward the various ports that you open to the respective machines. For example, you can forward port 4653 on your firewall to port 22 on machine 192.168.0.53. I use IPCop and just edited this via the GUI. One thing I did to add a fair amount of security was to set it so it only forwards the ports if the connection comes from the correct location (i.e. the right IP address). I have “pretty static” connections that only change once or twice a year, so I just manually update this value when they change. If the machine that is making the requests is behind a firewall whose address changes regularly, you may consider changing some of the security settings in the ssh server on the client.
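For anyone without IPCop, roughly the same forward can be expressed as an iptables rule on a plain Linux router. This is an assumption about an equivalent setup, not what the IPCop GUI generates. The helper below (print_forward_rule, a made-up name) only prints the command so it can be reviewed first; 203.0.113.10 is a documentation address standing in for the backup server's public IP.

```shell
#!/bin/bash
# print_forward_rule is a hypothetical helper that only prints the iptables
# command; nothing is changed on the machine that runs it.
# Arguments: allowed source IP, external port, internal client IP.
print_forward_rule() {
    local src="$1" ext_port="$2" client="$3"
    echo "iptables -t nat -A PREROUTING -p tcp -s $src --dport $ext_port -j DNAT --to-destination $client:22"
}

print_forward_rule 203.0.113.10 4653 192.168.0.53
```

Depending on the router's default policy, a matching rule in the FORWARD chain may also be needed to let the redirected traffic through.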

Rsnapgraph

I also set up Rsnapgraph to create some nice visual representations of the usage. I simply added the line rsnapgraph $1 as the last line of the script described above, so the program runs each time rsnapshot is run.

Conclusion

This system seems to work pretty well. I have trained all the users to leave their machines on overnight so the backup happens. A few minor issues with permissions and other items have sprung up, but otherwise it has been a pretty good solution so far.

The only challenge I have now is figuring out how to back up my one user with a laptop. They move around and are on different networks all the time. If anyone has suggestions, please let me know.
