Update to Malware Protecting Script

In an attempt to be a little more friendly in terms of bandwidth to the strapped folk over at Malware Domains, I’ve retooled the script I wrote about in the post To Be Protecting Against the Malware.

I’ve added some lines to take advantage of remote zipped files (.zip), which will help them by reducing the number of bits we’re pulling from them.

I’ve added some lines to copy the downloaded malware zones file to other servers behind my firewall, which will help them by not making individual connections from each server to pull the files. I just set up a cron job on each internal “slave” server to bounce named every morning, timed to run after this process is complete.

Here’s the updated code. It is, as is my wont, rather verbose. It is considerably more verbose than other examples out there that take care of this same problem, but as I said, such is my wont.

The URLS array is filled with fake hosts right now b/c the zipped format is still in testing. When the folk at malwaredomains.com think it’s ready for public consumption, I’ll put the real hosts back in.

Also, it’s relatively untested, and I expect to be tweaking it. Use at your own risk.

On the “master” server, I’m using this…

#!/usr/local/bin/bash

# To know where script is running
HOSTNAME=$( hostname )

# To put file where named can see it
BLACKHOLEDIR=/var/named/etc/namedb/blackhole

# To name file so we know what named seeing
TMPZONEFILE=tmp.malwaredomains.zones
ZONEFILE=malwaredomains.zones
ZONEFILEBACKUP=malwaredomains.zones.bak

# To get updated file from remote server
URLGRABBER=/usr/local/bin/curl
USERAGENT="Malware Domain Grabber ( ${HOSTNAME}; unix; BASH )/0.1"

# To keep quiet while am getting file
URLGRABBEROPTS="-s -f"

# To know where file is hosted
URLS=( host1 host2 host3 host4 )
TIMESTAMPFILE=timestamp

# To know how to decompress the file
UNZIPCMD=/usr/local/bin/unzip
UNZIPOPTS="-o -qq"

# To copy files to other servers so that we are only
# pulling the files once, though we have multiple
# DNS servers in house

HOSTS=( host1 host2 host3 host4 )

# MOUNTCMD: The mount command
MOUNTCMD=/sbin/mount
UMOUNTCMD=/sbin/umount

# FSTYPE: The filesystem type of the mounted partition
FSTYPE=nfs

# MOUNTDIR: The directory that the dumps will be written to
MOUNTDIR=/mnt

# To control bind
NAMEDCMD="/usr/sbin/rndc reload"

#==============================================================

# Get start time so we can know how long this thing runs
START=$( date +%s )

# Make our working directory the location of the blackhole files
cd ${BLACKHOLEDIR}

# Copy the current timestamp file to ${TIMESTAMPFILE}.old so we can
# make a comparison between what we have and what's out there now.
if [ -f ${BLACKHOLEDIR}/${TIMESTAMPFILE} ]; then
	cp ${BLACKHOLEDIR}/${TIMESTAMPFILE} ${BLACKHOLEDIR}/${TIMESTAMPFILE}.old
fi

# Attempt to download the timestamp file and zone file from each mirror.
# Break out of the loop at the first successful download of a zone file,
# otherwise, try each one in turn

# Assume there are no updates available
NEW=0

for URL in "${URLS[@]}"; do
	echo "Attempting to download from ${URL}"
	echo " Checking timestamps..."
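	# Double quotes so ${USERAGENT} is actually expanded
	${URLGRABBER} ${URLGRABBEROPTS} -A "${USERAGENT}" -o ${BLACKHOLEDIR}/${TIMESTAMPFILE}.zip ${URL}/${TIMESTAMPFILE}.zip
	# Save the exit code now; the test below would overwrite $?
	RC=$?

	if [ ${RC} -ne 0 ]; then
	echo "  ... timestamp download from ${URL} failed! Code: ${RC}"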
	# Move on to next URL so we keep the timestamp/zonefile pair intact
	continue
	else
	if [ -f ${BLACKHOLEDIR}/${TIMESTAMPFILE} ]; then
		# Unzip the new timestamp file over the old one
		${UNZIPCMD} ${UNZIPOPTS} ${BLACKHOLEDIR}/${TIMESTAMPFILE}.zip

		# Do a little cleanup
		rm -f ${BLACKHOLEDIR}/${TIMESTAMPFILE}.zip

		OLDTIMESTAMP=$( cat ${BLACKHOLEDIR}/${TIMESTAMPFILE}.old )
		NEWTIMESTAMP=$( cat ${BLACKHOLEDIR}/${TIMESTAMPFILE} )

		if [ ${OLDTIMESTAMP} -ge ${NEWTIMESTAMP} ]; then
			echo " ... no new updates."
			# No new updates on this server... but how well are the various mirrors
			# kept in sync?  Let's try the others. This is a tiny transfer, and it's
			# only once a day, so it's pretty cheap.
			continue
		fi
	else
		# Timestamp file does not exist. Create it.
		${UNZIPCMD} ${UNZIPOPTS} ${BLACKHOLEDIR}/${TIMESTAMPFILE}.zip
		rm ${BLACKHOLEDIR}/${TIMESTAMPFILE}.zip
	fi
	fi

	# Backup and copy file to final location for named to find
	# (via "include" directive in named.conf)
	echo "Backing up zone file"
	cp ${BLACKHOLEDIR}/${ZONEFILE} ${BLACKHOLEDIR}/${ZONEFILEBACKUP}

	echo "Retrieving new zone file from ${URL}..."
	${URLGRABBER} ${URLGRABBEROPTS} -o ${BLACKHOLEDIR}/${ZONEFILE}.zip ${URL}/${ZONEFILE}.zip
	# Save the exit code now; the test below would overwrite $?
	RC=$?

	if [ ${RC} -ne 0 ]; then
		echo "  ... zonefile download from ${URL} failed!  Code: ${RC}"
		# Oops.  Try the next server.  If this is the last, then ${NEW} is still
		# set to 0, and we'll be done. Better luck tomorrow...
		continue
	else
		# We have a new timestamp, and were able to download the zone file from
		# the same server we downloaded the timestamp from.  Set ${NEW} to 1 and
		# get out of the loop. No need to check further.

		echo "Unzipping new zone file..."
		if [ -f ${ZONEFILE}.zip ]; then
			${UNZIPCMD} ${UNZIPOPTS} ${BLACKHOLEDIR}/${ZONEFILE}.zip
			rm ${ZONEFILE}.zip
			# Rename the zone file temporarily to allow sed to work on it later and,
			# in that process, rename it back to the name that named knows.
			mv ${ZONEFILE} ${TMPZONEFILE}
		else
			echo "No new zone file..."
			exit
		fi

		NEW=1
		break
	fi
done

# If ${NEW} hasn't been set, then we either error'd out of all servers, or there are no
# new files. Either way, we're done.
if [ ${NEW} == 0 ]; then
	exit 1
else
	# Disable name checking for only those domains with underscores,
	# so we don't have to turn off name checking globally.

	SEARCH='_'
	FIND='blockeddomain.hosts";};'
	REPLACE='blockeddomain.hosts"; check-names ignore;};'

	# Get a count of the zones from the last update
	OLDZONECOUNT=$( cat ${BLACKHOLEDIR}/${ZONEFILEBACKUP}|grep "^zone"|wc -l )

	echo "Disabling checking on domains with underscores"
	sed "/${SEARCH}/ s/${FIND}/${REPLACE}/g" ${BLACKHOLEDIR}/${TMPZONEFILE} > ${BLACKHOLEDIR}/${ZONEFILE}
	rm -f ${BLACKHOLEDIR}/${TMPZONEFILE}

	# Get a count of the zones from the current update
	NEWZONECOUNT=$( cat ${BLACKHOLEDIR}/${ZONEFILE}|grep "^zone"|wc -l )
	echo "${OLDZONECOUNT} Previous Zones"
	echo "${NEWZONECOUNT} Current Zones"

	echo "Reloading named"
	${NAMEDCMD}

	if [ $? -ne 0 ]; then
		echo "  ... failed! Restoring zone file"
		cp ${BLACKHOLEDIR}/${ZONEFILEBACKUP} ${BLACKHOLEDIR}/${ZONEFILE}

		echo "Reloading old zones in named"
		${NAMEDCMD}

		if [ $? -ne 0 ]; then
			echo "  ... failed again!! You'll want to see to that."
		fi
	fi

	echo "Copying files to other internal network servers..."

	for HOST in "${HOSTS[@]}"; do
	DUMPDEVICE=${HOST}:${BLACKHOLEDIR}
	MOUNTRESULTS=$( ${MOUNTCMD} | grep "${DUMPDEVICE} on ${MOUNTDIR}" )

	if [ "${MOUNTRESULTS}" == "" ]; then
		echo ""
		echo "Mounting ${DUMPDEVICE} on ${MOUNTDIR}"
		${MOUNTCMD} -t ${FSTYPE} ${DUMPDEVICE} ${MOUNTDIR}
		if [ $? -ne 0 ]; then
			echo " ... failed. Files will not be copied."
			continue
		else
			echo " ... succeeded"
		fi
	else
		echo "${HOSTNAME}:${DUMPDEVICE} already mounted on ${MOUNTDIR}"
	fi

	# Copy the files to ${MOUNTDIR} as a temporary file. On the remote server,
	# we'll manage bouncing named if necessary.
	echo ""
	echo "Copying ${BLACKHOLEDIR}/${ZONEFILE} to ${TMPZONEFILE}"
	cp ${BLACKHOLEDIR}/${ZONEFILE} ${MOUNTDIR}/${TMPZONEFILE}
	if [ $? -ne 0 ]; then
		echo "... Failed to copy ${ZONEFILE}! You might want to see to that."
	fi
	# Umount the backup filesystem
	echo ""
	echo "Unmounting ${MOUNTDIR}"
	${UMOUNTCMD} ${MOUNTDIR}
	if [ $? -ne 0 ]; then
		echo " ... failed. You might want to see to that."
	else
		echo " ... succeeded"
	fi
	done

	END=$( date +%s )
	RUNTIME=$(( ${END} - ${START} ))
	H=$(( ${RUNTIME}/3600 ))
	M=$(( ( ${RUNTIME}/60 ) % 60 ))
	S=$(( ${RUNTIME} % 60 ))

	echo "Malware zonefile download on ${HOSTNAME} complete in"
	echo "${H} hrs, ${M} mins and ${S} secs (${RUNTIME} secs)"

	exit
fi
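
The update check in the loop above comes down to comparing two integer timestamps. Here is a minimal sketch of that comparison as a standalone function, assuming (as the script does) that each timestamp file holds a single integer. The name `needs_update` is hypothetical, and treating a missing or empty old timestamp as "update needed" is an assumption for illustration, not something the script above does.

```shell
# needs_update OLDFILE NEWFILE
# Returns 0 (true) when NEWFILE holds a larger integer timestamp than
# OLDFILE, or when OLDFILE is missing or empty. Returns non-zero otherwise.
needs_update() {
    old_ts=""
    new_ts=""
    if [ -f "$1" ]; then old_ts=$( tr -d '[:space:]' < "$1" ); fi
    if [ -f "$2" ]; then new_ts=$( tr -d '[:space:]' < "$2" ); fi

    # Without a new timestamp there is nothing to compare against
    [ -n "${new_ts}" ] || return 1

    # No old timestamp: assume we have never downloaded anything
    [ -n "${old_ts}" ] || return 0

    [ "${new_ts}" -gt "${old_ts}" ]
}
```

It would be called as `if needs_update timestamp.old timestamp; then ...`, which also avoids the "unary operator expected" error that an empty timestamp file triggers in an unquoted `[ ... -ge ... ]` test.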

On the “slave” servers, I’m using this…

#!/usr/local/bin/bash

# To put file where named can see it
BLACKHOLEDIR=/var/named/etc/namedb/blackhole
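ZONEFILE=malwaredomains.zones
# The backup name is used below, so it has to be defined here too
ZONEFILEBACKUP=malwaredomains.zones.bak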
TMPZONEFILE=tmp.malwaredomains.zones

# To control bind
NAMEDCMD="/usr/sbin/rndc reload"

if [ -f ${BLACKHOLEDIR}/${TMPZONEFILE} ]; then
	echo "New zone file exists..."
	# Rename the zone file to back it up
	echo "Backing up current zone file."
	mv ${BLACKHOLEDIR}/${ZONEFILE} ${BLACKHOLEDIR}/${ZONEFILEBACKUP}
	# Rename the tmp file to the name the daemon can find
	echo "Replacing it with the new zone file and removing the temp file."
	mv ${BLACKHOLEDIR}/${TMPZONEFILE} ${BLACKHOLEDIR}/${ZONEFILE}

	# Reload named.
	${NAMEDCMD}

	if [ $? -ne 0 ]; then
		echo "    ... failed! Restoring zone file"
		cp ${BLACKHOLEDIR}/${ZONEFILEBACKUP} ${BLACKHOLEDIR}/${ZONEFILE}

		echo "Reloading old zones in named"
		${NAMEDCMD}

		if [ $? -ne 0 ]; then
			echo "    ... failed again!! You'll want to see to that."
		fi
	fi
else
	echo "No update.  Quitting..."
fi
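
The swap-and-rollback logic above can also be written as one function, which makes it easy to try in a scratch directory before pointing it at a live named. This is a sketch, not the script I run: the function name `install_zone_file` and its return codes are hypothetical, and the reload command is passed in as arguments so that something harmless like `true` or `false` can stand in for `rndc reload` while testing.

```shell
# install_zone_file DIR RELOAD_COMMAND [ARGS...]
# Swaps tmp.malwaredomains.zones into place inside DIR, runs the reload
# command, and restores the backup if the reload fails.
# Returns 0 on success, 1 if there was no new file or a copy failed,
# and 2 if the reload failed and the old file was restored.
install_zone_file() {
    dir=$1
    shift
    zone="${dir}/malwaredomains.zones"
    tmp="${dir}/tmp.malwaredomains.zones"
    backup="${dir}/malwaredomains.zones.bak"

    [ -f "${tmp}" ] || return 1

    # Back up the current file if there is one (a first run has none)
    if [ -f "${zone}" ]; then
        cp "${zone}" "${backup}" || return 1
    fi
    mv "${tmp}" "${zone}" || return 1

    if ! "$@"; then
        # Reload failed: put the old file back and try once more
        if [ -f "${backup}" ]; then
            cp "${backup}" "${zone}"
        fi
        "$@" || echo "    ... reload failed again!!" >&2
        return 2
    fi
    return 0
}
```

On a real server the call would look like `install_zone_file /var/named/etc/namedb/blackhole /usr/sbin/rndc reload`.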

To Be Protecting Against The Malware

Last night, my wife called me into the office with an alarming “It says it’s infected with malware!” Needless to say (and yet I’m going to say it anyway) I hurried into the room to see what the hullabaloo was all about.

Sure enough, there was a window exclaiming the existence of not one or two, but quite a few malware infections.

It fooled her, and damn if that stupid pop-up didn’t nearly fool me too! Truth be told, it did, if only for a second. Those fake malware warning pop-ups that serve up malware themselves are clever.

It got me to thinking.

Then Osama bin Laden was shot in the head, and malware peddlers started leveraging our insatiable appetite for news about it (the sick bastards).

That got me thinking more.

It reminded me of the malware peddlers that took advantage of the quake in Japan recently. Now those are some seriously sick bastards.

Those events all in quick succession and all that thinking led me to this.

A little ditty that downloads the bind formatted zone file from MalwareDomains.com, moves it to where Named can see it, and reloads Named zone files if the download is complete. I’d verify the file if they provided an md5 of the zones file. But they don’t. Not that I could find, anyway.

I don’t even begin to hope to eliminate the risk of malware infected sites, but I think this is a positive step towards cutting off malware source domains which might, in turn, help against sites on legitimate domains that happen to be infected. As of today, May 3rd, 2011, there are nearly 10,000 domains in the latest file. That has to be nearly all of them.

Right?

I’ll try it out for a while and see what happens.

BTW, this only works if you’re running your own DNS. If not, you’re at the mercy of your ISP or whatever DNS you choose to use. There are plenty of options out there, and they’re not all horrible.

First, the script, which pulls down the latest malware domains zones file from malwaredomains.com, fixes some problems with underscores in the subdomains, copies the fixed zones file to the named chroot, and reloads the named configs.

#!/usr/local/bin/bash

# To know where script is running
HOSTNAME=$( hostname )

# To put file where named can see it
NAMEDDIR=/var/named/etc/namedb

# To name file so we know what named seeing
ZONEFILE=malwaredomains.zones

# To have a file for sed to work on
TMPZONEFILE=tmp.malwaredomains.zones

# To get updated file from remote server
URLGRABBER=/usr/local/bin/curl

# To keep quiet while am getting file
URLGRABBEROPTS="-s -S"

# To know where file is hosted
#URL=http://www.malwaredomains.com/files/spywaredomains.zones
URL=http://mirror1.malwaredomains.com/files/malwaredomains.zones

# To control bind
NAMEDCMD="/usr/sbin/rndc reload"

#==============================================================

# Get start time so we can know how long
START=$( date +%s )

# Get directory we're running from
SCRIPTDIR=$( dirname $0 )

cd ${SCRIPTDIR}
if [ $? -ne 0 ]; then
    echo "ERROR: Unable to cd to ${SCRIPTDIR}! AbOrTinG!!"
    exit 1
fi

# If we were executed like "./whatever.sh" - set SCRIPTDIR to the pwd
if [ "${SCRIPTDIR}" == "." ]; then
    SCRIPTDIR=$( pwd )
fi

echo "Script is running from ${SCRIPTDIR}"

# Download the zones file in bind format to a temporary location.
# We don't want to overwrite what we already have until we're sure
# the download worked

echo "Downloading file from ${URL}"
${URLGRABBER} ${URLGRABBEROPTS} -o ${SCRIPTDIR}/${TMPZONEFILE} ${URL}
# Save the exit code now; the test below would overwrite $?
RC=$?

# Check for errors.  If the file downloaded, then move on, but if not
# we don't want to reload named without the previously updated
# malware domain list

if [ ${RC} -ne 0 ]; then
    echo "    ... download failed! Error: ${RC}"
    exit 1
else
    # Disable name checking for only those domains with underscores,
    # so we don't have to turn off name checking globally.
    SEARCH='_'
    FIND=';};'
    REPLACE='; check-names ignore;};'

    echo "Disabling checking on domains with underscores"
    sed "/${SEARCH}/ s/${FIND}/${REPLACE}/g" ${SCRIPTDIR}/${TMPZONEFILE} > ${SCRIPTDIR}/${ZONEFILE}

    # Get a count of the zones from the last update
    OLDZONECOUNT=$( cat ${NAMEDDIR}/${ZONEFILE}|grep "^zone"|wc -l )

    # Copy file to final location for named to find
    # (via "include" directive in named.conf)
    echo "Copying file from ${SCRIPTDIR} to ${NAMEDDIR}"
    cp ${SCRIPTDIR}/${ZONEFILE} ${NAMEDDIR}

    if [ $? -ne 0 ]; then
        echo "    ... failed! AbOrTinG!!"
        exit 1
    fi

    echo "Reloading zones in named"
    ${NAMEDCMD}

    if [ $? -ne 0 ]; then
        echo "    ... failed! You'll want to see to that."
    fi

    # Get a count of the zones from the current update
    NEWZONECOUNT=$( cat ${NAMEDDIR}/${ZONEFILE}|grep "^zone"|wc -l )
    echo "${OLDZONECOUNT} Previous Zones"
    echo "${NEWZONECOUNT} Current Zones"
fi

END=$( date +%s )
RUNTIME=$(( ${END} - ${START} ))
H=$(( ${RUNTIME}/3600 ))
M=$(( ( ${RUNTIME}/60 ) % 60 ))
S=$(( ${RUNTIME} % 60 ))

echo "Malware zonefile download on ${HOSTNAME} complete in"
echo "${H} hrs, ${M} mins and ${S} secs (${RUNTIME} secs)"
exit 0
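
To see what the underscore fix does on its own, here is the same sed expression applied to two made-up zone lines. The zone names and the exact layout of each line are assumptions for illustration, so the real file may format its entries differently. The function name `fix_underscores` is hypothetical.

```shell
# fix_underscores: read zone statements on stdin and append
# "check-names ignore;" to any line that contains an underscore.
fix_underscores() {
    sed '/_/ s/;};/; check-names ignore;};/g'
}

# Two hypothetical entries; only the first one has an underscore
printf '%s\n' \
    'zone "bad_host.example.com" {type master; file "/etc/namedb/blockeddomain.hosts";};' \
    'zone "badhost.example.com" {type master; file "/etc/namedb/blockeddomain.hosts";};' \
    | fix_underscores
```

Only the first line comes out with `check-names ignore;` added before the closing `};`. Note that the match is on the whole line, so an underscore anywhere in it (a file path, for instance) would trigger the change too.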

Then, the cron job to update the list on a daily basis:

35 0 * * * /root/bin/malwaredomains/malwaredomains.sh 2>&1 | mail -E -s "Malware Domain Named Update" me@here.com

Then, the blackhole host file that all those zones in the malwaredomains.com download refer to. Careful with this one, and you’ll want to replace the domains with something a little more relevant:

$TTL    86400           ;one day
@ IN SOA ns0.example.net. hostmaster.example.net. (
        2011050100  ; serial number YYYYMMDDNN
        28800       ; refresh 8 hours
        7200        ; retry 2 hours
        864000      ; expire 10 days
        86400       ; min ttl 1 day
)
        NS      ns0.example.net.
        NS      ns1.example.net.
        A       127.0.0.1
*   IN  A    127.0.0.1

Finally, the line in the named.conf file (in my case, in the internal view) to call on the recently downloaded zones file:

include "/etc/namedb/malwaredomains.zones";

That should do it!

This is what I receive in my inbox after every update (daily for me):

Script is running from /root/bin/malwaredomains
Downloading file from http://mirror1.malwaredomains.com/files/malwaredomains.zones
Disabling checking on domains with underscores
Copying file from /root/bin/malwaredomains to /var/named/etc/namedb
Reloading zones in named server
reload successful
   10116 Previous Zones
   10116 Current Zones
Malware zonefile download on [hostname] complete in
0 hrs, 0 mins and 2 secs (2 secs)
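
The leading spaces in front of the zone counts in that report most likely come from `wc -l`, which pads its output on BSD systems. A small sketch of the same count using `grep -c`, which prints a bare number; the function name `count_zones` is hypothetical, and printing 0 for a missing file is an assumption to cover the very first run.

```shell
# count_zones FILE
# Prints the number of lines in FILE that start with "zone".
# Prints 0 if FILE does not exist yet.
count_zones() {
    if [ -f "$1" ]; then
        # grep -c exits non-zero when the count is 0, so mask that
        grep -c '^zone' "$1" || true
    else
        echo 0
    fi
}
```

In the script it would replace the `cat | grep | wc -l` pipeline, for example `OLDZONECOUNT=$( count_zones "${NAMEDDIR}/${ZONEFILE}" )`.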

Gas Mileage TinyApp

If this keeps up, I’m going to have to put together a dedicated page for web projects. I hope it does keep up. I really love doing it…

My latest project following the FreeBSD backup script (bash), a host of system admin scripts too small to bear mentioning (bash, tcsh, perl), and the twice-built RAID crash victimized wine database (PHP, Smarty & MySQL) (in addition to the many projects no longer online), is a gas mileage tracker. It’s also written in PHP, delivered through Smarty, and backed by MySQL, but now with delicious pChart.

I originally used GoogleDocs and a GoogleForm to record the data, and I have data going back a few years. That worked well enough and did most of what I wanted it to do, but it didn’t do everything. Now that I have my own server(s) up and running again, I have the luxury of being dissatisfied.

What it Didn’t Do

  • Provide meaningful feedback after submitting the form. I want to know what was submitted immediately on the “Form Submission Successful” screen, and if applicable, how it relates to information previously submitted. That’s just good UI feedback that it’s missing.
  • Allow me to maintain and build upon my scripting/database chops. Working with GoogleDocs is easy, fairly extensible, and has the benefit of having a preexisting world class infrastructure (and all that entails) and environment in which to work. However, I’m a bit of a maverick in these things, and for my own projects I want things set up the way I want them. Their infrastructure and environment doesn’t allow me to build on the skills I want to build on. I have my own environment. I want to take advantage of it.
  • Allow me to maintain and build upon my UI design chops. I’ve always loved the UI design aspect of building web apps. There are a few things you can do with Google, but I felt too constrained by their system, and wanted, again, things the way I wanted them.

It wasn’t all bad, though. Setting it up in GoogleDocs did effectively (if not intentionally) serve as a sort of rough draft for rebuilding it on my own server.

What it Did Do

  • Gave me a firm sense of what I wanted, and what I didn’t want, if not a solid workflow to follow.
  • Gave me a good sense of the basic information I wanted to collect, which was then easy to translate to MySQL tables.
  • The visualizations in the form of charts and graphs were good enough that I decided I couldn’t do without them, which gave me the desire to research PHP graphing/charting libraries. I settled on pChart.

Now, I have a small web app that lets me…

  • Track the gas mileage I’m getting for my car and how many miles I’m getting per tank.
  • Track how the price of gas is changing. Always up, but still…
  • Track how much I’m spending per mile on gas.
  • Track how often I fill the tank.
  • Keep tabs on how often I’ve changed the oil (I change it myself… a pox on paying someone else to do it), and how long until I need to change it again.

Soon, it will let me…

  • Track how many miles I drive over time, per week, month, year, etc.
  • Track how long between major service visits I don’t care to do (tires, brakes, etc.)
  • Implement multiple vehicles in the hopes that I can get my wife on board and using it.
  • Authenticate access so I don’t have to worry about the data being mangled by miscreants and malcontents. Obfuscation and low-profile domains will only work for so long (stay away, Noah! :) )…
  • Better UI flow… it’s good enough for me, but it’s rough. It’s fine on the desktop, but I need to clean it up so it works on the mobile platforms better.

Here are the charts I’m generating. I know, the gas mileage isn’t that hot. But the payments are $0, so there’s that.





rc.d != magick

There are certain things over which I want tight-fisted control, and other things over which I want neither control nor intimate knowledge.

When it comes to keeping my FreeBSD systems up to date, I relinquish control for the most part and let the ports system do the work, with portmaster and portaudit running the show for me. Sure, I run them manually when I need to based on nightly portsnap runs, with a close eye on what’s going on, but I let them take the reins of upgrading and auditing.

But when it comes to Apache and supporting modules (php, mod_perl), I want to do things the way I want to do them, not the way the ports system wants to do them. I want to compile them myself, with the options I want, and put the whole thing where I want it. I’m sure the ports system allows for that, but I’ve not dug in deep enough to figure it out yet.

That’s worked for me. For the most part.

The one part that hasn’t worked for me has been getting Apache to start at system startup as part of the rc.d framework. That is, until today. I finally hunkered down and figured it out. It’s nowhere near as magickal or mysterious as I initially thought.

Here’s what’s in my rc.conf file:

httpd_enable="YES"
httpd_flags="-k start"

Here’s what’s in my /usr/local/etc/rc.d/httpd startup script:

#!/bin/sh

# PROVIDE: httpd
# REQUIRE: NETWORKING SERVERS DAEMON LOGIN
# KEYWORD: shutdown

. /etc/rc.subr

name="httpd"
basedir="/home/www"
rcvar=`set_rcvar`
command="${basedir}/bin/${name}"
extra_commands="config"

pidfile="${basedir}/logs/${name}.pid"
required_files="${basedir}/conf/${name}.conf"

start_precmd="${name}_prestart"
config_cmd="${name}_config"

httpd_prestart() {
	if [ -f ${pidfile} ]; then
		echo "${pidfile} exists.  Deleting..."
		rm -f ${pidfile}
	fi
}

httpd_config() {
	echo "Apache configtest..."
	${command} -t
}

load_rc_config ${name}
run_rc_command "$1"

It took a few reboots of my dev server to get it right (that’s one reason one has a dev server), but it’s working like a champ now, and I don’t have to worry about manually starting Apache anymore.

That said, I’m sure there are improvements that could be made, and I welcome any suggestions (do I need all those requires? I don’t know… but it works to wait for them).
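
One possible refinement, offered as a sketch rather than a tested replacement: the prestart hook above deletes the pid file unconditionally. The hypothetical helper below only reports a pid file as stale when no process with that pid is running. It relies on `kill -0`, which sends no signal and only checks that the process exists and can be signalled, so it assumes it is run as root the way rc scripts are.

```shell
# pidfile_is_stale PIDFILE
# Returns 0 if PIDFILE exists but names no running process (safe to
# delete), and 1 if the file is missing or its process is still alive.
pidfile_is_stale() {
    [ -f "$1" ] || return 1
    pid=$( head -n 1 "$1" | tr -cd '0-9' )
    # An empty or non-numeric pid file is treated as stale
    [ -n "${pid}" ] || return 0
    if kill -0 "${pid}" 2>/dev/null; then
        return 1
    fi
    return 0
}
```

Inside `httpd_prestart` the check would become `if pidfile_is_stale "${pidfile}"; then rm -f "${pidfile}"; fi`.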

To make things a little easier on me I’ve set up the following aliases (and these date back many many years when I was fussing with Apache and conf files on an hourly basis):

alias apstart /usr/local/etc/rc.d/httpd start
alias apstop /usr/local/etc/rc.d/httpd stop
alias aprestart /usr/local/etc/rc.d/httpd restart
alias apconfig /usr/local/etc/rc.d/httpd config
alias aptest "/usr/local/etc/rc.d/httpd status; ps aux | grep httpd | grep -v grep"

Somewhere in my archives, I have an httpd.conf file written entirely in perl… maybe I’ll dig that out some day, just to see it again…

But first, I want to get all this working in jails.

Near Native FreeBSD Full and Incremental Backups to a Removable USB Storage Drive

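UPDATE 2011/03/09 – I updated the code to back up to an NFS mount, and to include the “-h 0” flag so that dump honors nodump flags at every dump level, including level 0 (by default they are only honored at level 1 and above). That was causing me serious problems.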

Summary

I’ve given quite a bit of thought to backup procedures at home since my FreeBSD 8.1 box dropped my mirrored filesystem. The signs of impending apocalypse were there, I just didn’t pay them proper heed. Fortunately, all of my data was salvaged; unfortunately, I lost all the custom PHP code I wrote over the last 6 months, my wordpress themes, plugins and modifications, and everything else that actually DID anything with all that data. So, while I’ve been rewriting that, I’ve been giving equal, if not more attention to backing it up. I’ll catch up again, but before I do that, I’ll make sure I won’t fall behind again.

I did a few searches for FreeBSD backup solutions, and rolled my own little backup script using dump. It was decent, but it didn’t do everything I wanted as well as I wanted it to. Every night was a full backup, and there were no incrementals. I had to implement some pretty inelegant code to accomplish a couple things simply b/c I didn’t know how else to do it. So I kept looking and eventually zeroed in on David Andrzejewski’s work. He clearly states what he put out there is a use-at-your-own-risk kind of script. I took it anyway as a starting block, and fleshed it out for my own purposes.

My requirements were similar to his, with the exception that I don’t have a cloud based storage account at the time of this writing and instead will be using a removable USB connected storage drive.

Project Goals

  • Run with native or easily accessible tools.
  • Full off-system backup of entire system once a week.
  • Incremental off-system backup of entire system nightly.
  • Separate off-system backups of individual critical files to make future restores easier.
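
The weekly-full, nightly-incremental schedule can be driven by the day of the week. A sketch under assumptions: the full (level 0) dump runs on Sunday and each following night steps up one level, so every incremental picks up the changes since the night before. That is one common arrangement, assumed here for illustration. The function name `dump_level_for` is hypothetical.

```shell
# dump_level_for DAYNUM
# DAYNUM is the day of the week as printed by `date +%w` (0 = Sunday).
# Prints 0 (full dump) for Sunday and 1-6 for Monday through Saturday.
dump_level_for() {
    case "$1" in
        [0-6]) echo "$1" ;;
        *) echo "dump_level_for: bad day number: $1" >&2
           return 1 ;;
    esac
}
```

A nightly cron wrapper could then compute `LEVEL=$( dump_level_for "$( date +%w )" )` and `DAY=$( date +%a )` and pass both to the backup script as its two arguments.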

Future Goals

  • Play with ${DUMPCACHE} to see how it affects the time to execute in my environment. Drop it back to 8MB for a week. Ramp it up to 64MB for a week. Recommended is 32MB, but it’s a party! Let’s see what happens.
  • Continue monitoring and fine tuning the hardware, OS environment and script to ensure maximum performance and stability. I haven’t recompiled a kernel in a while, maybe I’ll see about that.
  • These are “as money allows” goals. I’m sure my wife is getting tired of me spending money on hardware. Then again, she does appreciate that I have a hobby that keeps me off the streets and out of the brothels.
    • Continue looking for consumer level, but sufficiently robust NAS solutions featuring RAID5 redundancy and access via secure and/or open protocols (ssh, smb, rsync, etc.) to replace (or augment) the removable drives I’m using now. No Windows-Only solutions please.
    • Evaluate cloud based storage for off-site backups. I’m looking at SpiderOak right now at the recommendation of a friend. I like their zero-knowledge solution and pricing, but more research is required. We’ll have upwards of 500GB of storage requirements, so we’ll have to weigh the monetary costs of cloud storage and bandwidth usage carefully against the risk of my solution failing when (!if) I need to restore. For the moment, I’m relatively comfortable with dumping the filesystems to removable drives, and keeping certain ultra-critical bits of recovery text (bsdlabels, fstabs, choice config files, etc.) in Google Docs.

My Environment

Two physically identical servers built from the ground up running FreeBSD 8.1. Each system houses a 150GB system drive (/dev/ad4s1) and a 500GB data/storage drive (/dev/ad6s1), and runs with 2GB of RAM.

I have /, /usr and /var mounted individually on the 150GB drive, and /home (containing /users and /www) mounted on the 500GB drive. I thought about getting separate drives for /www and /home, but decided I didn’t want to deal with planning for storage allocation. Instead I created /home/www for web files, and /home/users for user accounts. It’s not exactly standard, but it’s not unprecedented, and I make it work.

Off my “production” server I’ve hung a 2.5″ 320GB USB2.0 removable drive. Off my “development” server I’ve hung a 2.5″ 100GB USB2.0 removable drive. I’ll adjust the size of the drives as needed. That’s just what I had on hand. Both were UFS formatted using fdisk.

During the backup job, those drives are mounted at /backup. The rest of the time they’re plugged in, but not mounted.
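
Checking whether the drive is already mounted is a matter of searching the output of `mount` for a "device on directory" pair. A minimal sketch, with a hypothetical function name `is_mounted`, that reads the `mount` output on stdin so it can be tried against sample text. It uses `grep -F` so that dots in a host or device name are matched literally instead of as regular expression wildcards.

```shell
# is_mounted DEVICE DIR
# Reads `mount` output on stdin and returns 0 if DEVICE is mounted on DIR.
# The trailing space keeps /backup from also matching /backup2.
is_mounted() {
    grep -qF -- "$1 on $2 "
}
```

Usage would be along the lines of `if mount | is_mounted /dev/da0s1 /backup; then ...`, where the device name is only an example.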

The Script

You’re welcome to this, but be warned, if it borks up your machine, destroys your pr0n collection, or sends terrifying space monkeys into your engine room, don’t blame me. Use at your own risk. There, now that I’m all disclaimed…

#!/usr/local/bin/bash

# Much appreciation to David Andrzejewski, and the work he started at
# http://www.davidandrzejewski.com/2010/03/01/freebsd-backup-using-dump-and-duplicity/
# I'm sure his current script/processes far outstrip this, but this
# is my (f)stab at it

# Version: 0.5
# * Provides 1 set of full backups and 6 associated incrementals
# * Backup files stored on mounted USB drive only

# I would like to see...
# * Writing to NAS with RAID5 and standard access (ssh, SMB, etc.)
# * Retrieve the cloud based storage interaction I stripped out

# DUMPLVL: provided via command line argument ${1}
# WEEKDAY: provided via command line argument ${2}

# HOSTNAME: The host being backed up. Used in informational messages
HOSTNAME=$( hostname )

# FSLIST: The list of file systems that will be dumped along with the
# name of the dump. Example: /dev/ad4s1a=root will dump the /dev/ad4s1a
# volume and name it DDD.root.levelN.dump where "N" is the dump level
# and "DDD" is the weekday
FSLIST="/dev/ad4s1a=root /dev/ad4s1d=var /dev/ad4s1f=usr /dev/ad6s1d=home"

# BSDLABEL_PARTITIONS: The list of partitions to run `bsdlabel` on
# This will be saved in the backup directory during runtime as
# ${WEEKDAY}.bsdlabel_${PARTITION}.txt
BSDLABEL_PARTITIONS="ad4s1 ad6s1"

# DUMPDEVICE: The location the files will be dumped to
DUMPDEVICE=sosaria:/home/dumps/${HOSTNAME}

# DUMPDIR: The directory that the dumps will be written to
DUMPDIR=/backup

# STAGINGDIR: The directory where dumps are stored before being written
# to ${DUMPDIR}
STAGINGDIR=/home/dumps/stage

# ARCHIVEDIR: The local directory dumps are stored after being written
# to ${DUMPDIR}
ARCHIVEDIR=/home/dumps/${HOSTNAME}

# NODUMP_DIRS: List of directories to set the nodump flag
NODUMP_DIRS="/usr/ports /usr/obj /usr/src /home/www/logs /home/www/src /home/dumps"

# DUMPCACHE: The amount of memory to give dump
DUMPCACHE=32

# DUMPFLAGS: The flags to feed dump
DUMPFLAGS="uanL -h 0 -f"

# FSTYPE: The filesystem type of the mounted partition
FSTYPE=nfs

# These should be standard

# BSDLABELCMD: The bsdlabel command
BSDLABELCMD=/sbin/bsdlabel

# DUMPCMD: The dump command
DUMPCMD=/sbin/dump

# MOUNTCMD: The mount command
MOUNTCMD=/sbin/mount

# UMOUNTCMD: The umount command
UMOUNTCMD=/sbin/umount

##---------------------------------------------------------------------
# Shouldn't have to edit anything below here

# Get the start time so we can gauge how long this is taking. Useful in
# tweaking ${DUMPCACHE}
START=$( date +%s )

# Get the directory we're running from
SCRIPTDIR=$( dirname $0 )

cd ${SCRIPTDIR}
if [ $? -ne 0 ]; then
       echo "ERROR: Unable to cd to ${SCRIPTDIR}! Aborting!"
       exit 1
fi

# If we were executed like "./whatever.sh" - set SCRIPTDIR to the pwd
if [ "${SCRIPTDIR}" == "." ]; then
       SCRIPTDIR=$( pwd )
fi

echo "Script is running from ${SCRIPTDIR}"

# Check the command line to make sure we have what we need from it
# First check for the dump level
if [ "${1}" == "" ]; then
       echo "Must specify dump level. Aborting!"
       exit 1
else
       DUMPLVL=${1}
fi

# Sanity check
if [ "${DUMPLVL}" == "" ]; then
       echo "ERROR: For some reason DUMPLVL never got set! Aborting!"
       exit 1
fi

# Then get the weekday name off the command line
if [ "${2}" == "" ]; then
       echo "Must specify weekday name. Aborting!"
       exit 1
else
       WEEKDAY=${2}
fi

# Sanity check
if [ "${WEEKDAY}" == "" ]; then
       echo "ERROR: For some reason WEEKDAY never got set! Aborting!!"
       exit 1
fi

# Create the flag file so we can't run more than one instance
if [ -f "${SCRIPTDIR}/myself.flg" ]; then
       echo "Script running?! ${SCRIPTDIR}/myself.flg exists! Aborting!"
       exit 1
else
       echo "Touching myself at ${SCRIPTDIR}/myself.flg"
       touch ${SCRIPTDIR}/myself.flg
fi

# Check for the existence of ${STAGINGDIR}
if [ ! -d "${STAGINGDIR}" ]; then
       mkdir ${STAGINGDIR}
       if [ $? -ne 0 ]; then
               echo "Could not create ${STAGINGDIR}!  Aborting!"
               echo "Removing ${SCRIPTDIR}/myself.flg"
               rm -f ${SCRIPTDIR}/myself.flg
               exit 1
       fi
fi

# Check for the existence of ${ARCHIVEDIR}
if [ ! -d "${ARCHIVEDIR}" ]; then
       mkdir ${ARCHIVEDIR}
       if [ $? -ne 0 ]; then
               echo "Could not create ${ARCHIVEDIR}!  Aborting!"
               echo "Removing ${SCRIPTDIR}/myself.flg"
               rm -f ${SCRIPTDIR}/myself.flg
               exit 1
       fi
fi

echo ""
for DIR in ${NODUMP_DIRS}; do
       echo "Setting nodump on ${DIR}"
       chflags -R nodump ${DIR}
done

echo ""
echo "Dump Level: ${DUMPLVL}"

# Preserve a copy of root's crontab (/root/crontab is
# manually created with `crontab -l > ~/crontab` after every change)
echo ""
echo "Copying /root/crontab to ${STAGINGDIR}/${WEEKDAY}.root_crontab"
cp -f /root/crontab ${STAGINGDIR}/${WEEKDAY}.root_crontab

# Preserve a copy of fstab
echo "Copying fstab to ${STAGINGDIR}/${WEEKDAY}.fstab.txt"
cp -f /etc/fstab ${STAGINGDIR}/${WEEKDAY}.fstab.txt

# Preserve a week's worth of bsdlabel copies for each partition
for PARTITION in ${BSDLABEL_PARTITIONS}; do
       echo "Writing bsdlabel for ${PARTITION} -> ${STAGINGDIR}/${WEEKDAY}.bsdlabel_${PARTITION}.txt"
       ${BSDLABELCMD} ${PARTITION} > ${STAGINGDIR}/${WEEKDAY}.bsdlabel_${PARTITION}.txt
done

# Dump the filesystems!
for FSITEM in ${FSLIST}; do
       # Get the devicename
       FS=$( echo ${FSITEM} | awk -F= '{ print $1 }' )
       # Get the filesystem name
       NAME=$( echo ${FSITEM} | awk -F= '{ print $2 }' )
       DUMPFILE=${WEEKDAY}.${NAME}.level${DUMPLVL}.dump
       echo ""
       echo "Dumping ${FS} to ${STAGINGDIR}/${DUMPFILE} at dump level ${DUMPLVL}"
       echo ""
       echo "${DUMPCMD} -C${DUMPCACHE} -${DUMPLVL}${DUMPFLAGS} ${STAGINGDIR}/${DUMPFILE} ${FS}"
       ${DUMPCMD} -C${DUMPCACHE} -${DUMPLVL}${DUMPFLAGS} ${STAGINGDIR}/${DUMPFILE} ${FS}
done

# Test for an existing backup device mount and either use the existing
# mountpoint or mount our backup directory

MOUNTRESULTS=$( ${MOUNTCMD} | grep "${DUMPDEVICE} on ${DUMPDIR}" )

if [ "${MOUNTRESULTS}" == "" ]; then
       echo ""
       echo "Mounting ${DUMPDEVICE} on ${DUMPDIR}"
       ${MOUNTCMD} -t ${FSTYPE} ${DUMPDEVICE} ${DUMPDIR}
       if [ $? -ne 0 ]; then
               echo "  ... failed. Aborting!"
               echo "Removing ${SCRIPTDIR}/myself.flg"
               rm -f ${SCRIPTDIR}/myself.flg
               exit 1
       else
               echo "  ... succeeded"
       fi
else
       echo "${HOSTNAME}:${DUMPDEVICE} already mounted on ${DUMPDIR}"
fi

# Copy the files to ${DUMPDIR} and archive them to ${ARCHIVEDIR}
cd ${STAGINGDIR}
echo ""
for FILE in *; do
       echo "Copying ${FILE} to ${DUMPDIR}"
       cp ${FILE} ${DUMPDIR}/${FILE}
       if [ $? -ne 0 ]; then
               echo "... Failed to copy ${FILE}! You might want to see to that."
       else
               echo "Moving ${FILE} to ${ARCHIVEDIR}"
               mv ${FILE} ${ARCHIVEDIR}/${FILE}
       fi
done

# Get a snapshot of how the dump directory looks for verification
echo ""
echo "Recent Additions to ${DUMPDIR}:"
echo ""
ls -lt ${DUMPDIR} | tail -n +2 | head -n 8

# Unmount the backup filesystem
echo ""
echo "Unmounting ${DUMPDIR}"
${UMOUNTCMD} ${DUMPDIR}
if [ $? -ne 0 ]; then
       echo "  ... failed. You might want to see to that."
else
       echo "  ... succeeded"
fi

# Clear the running flag
echo ""
echo "Removing ${SCRIPTDIR}/myself.flg"
rm -f ${SCRIPTDIR}/myself.flg
if [ -f "${SCRIPTDIR}/myself.flg" ]; then
       echo "  ... failed. You might want to see to that."
else
       echo "  ... succeeded"
fi

echo ""
echo "Backup of ${HOSTNAME} Complete"

END=$( date +%s )
RUNTIME=$(( ${END} - ${START} ))
H=$(( ${RUNTIME}/3600 ))
M=$(( ( ${RUNTIME}/60 ) % 60 ))
S=$(( ${RUNTIME} % 60 ))

echo "It took ${H} hrs, ${M} mins and ${S} secs with -C${DUMPCACHE} (${RUNTIME} secs)"
exit 0
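
The script above removes the flag file by hand on every abort path, which is easy to miss when a new abort path gets added. As a minimal sketch of an alternative, assuming a hypothetical helper named run_with_flag that is not part of the script above, an EXIT trap in a subshell can do the cleanup however the wrapped command ends:

```shell
# Hypothetical helper, not part of the script above: hold the flag file
# while a command runs, and remove it however that command ends.
run_with_flag() {
        flag="${1}"
        shift
        if [ -f "${flag}" ]; then
                echo "Script running?! ${flag} exists! Aborting!"
                return 1
        fi
        (
                touch "${flag}" || exit 1
                # Fires when this subshell exits, on success or failure
                trap 'rm -f "${flag}"' EXIT
                "$@"
                exit $?
        )
}
```

It could be called as run_with_flag "${SCRIPTDIR}/myself.flg" do_backup, where do_backup is a hypothetical function wrapping the backup steps. Like the original check, the test-then-touch is not atomic, so it guards against overlapping cron runs rather than true races.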

The Crontab

Here’s how I’ve set up my crontab. Like Mr. Andrzejewski, I opted to keep the specifics regarding the type of backup and the day it’s run in cron, rather than build it into the script. While it does make for a slightly longer crontab, it simplifies the logic in the script considerably. At the end of the day, I just feel better about telling the script what kind of backup to run (full or incremental), and the weekday name to embed in the resulting filenames, rather than letting it determine it itself. It’s a control thing.

# Daily Backups of filesystems
# Full backups on Sunday. Incremental backups on all other days.
30 0 * * 0 /root/bin/backup/backup_script.sh 0 Sun 2>&1 | mail -s "System Backup" dvicci
30 0 * * 1 /root/bin/backup/backup_script.sh 1 Mon 2>&1 | mail -s "System Backup" dvicci
30 0 * * 2 /root/bin/backup/backup_script.sh 1 Tue 2>&1 | mail -s "System Backup" dvicci
30 0 * * 3 /root/bin/backup/backup_script.sh 1 Wed 2>&1 | mail -s "System Backup" dvicci
30 0 * * 4 /root/bin/backup/backup_script.sh 1 Thu 2>&1 | mail -s "System Backup" dvicci
30 0 * * 5 /root/bin/backup/backup_script.sh 1 Fri 2>&1 | mail -s "System Backup" dvicci
30 0 * * 6 /root/bin/backup/backup_script.sh 1 Sat 2>&1 | mail -s "System Backup" dvicci
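
Since cron is trusted to hand over both values, the script only checks that they are non-empty. A stricter check might look like the sketch below. The helper names (valid_dump_level, valid_weekday, check_args) are hypothetical, and the accepted values assume dump levels 0 through 9 and the three-letter weekday names used in the crontab above.

```shell
# Hypothetical helper: accept a single digit 0-9 as the dump level
valid_dump_level() {
        case "${1}" in
                [0-9]) return 0 ;;
                *)     return 1 ;;
        esac
}

# Hypothetical helper: accept only the weekday names used in the crontab
valid_weekday() {
        case "${1}" in
                Sun|Mon|Tue|Wed|Thu|Fri|Sat) return 0 ;;
                *)                           return 1 ;;
        esac
}

# Hypothetical helper: validate both arguments, complaining about the first bad one
check_args() {
        valid_dump_level "${1}" || { echo "Dump level must be 0-9. Aborting!"; return 1; }
        valid_weekday "${2}" || { echo "Weekday must be Sun..Sat. Aborting!"; return 1; }
}
```

In the script this would be called as check_args "${1}" "${2}" || exit 1, before the flag file is created.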

The end result, come Sunday morning, is a list of files looking something like this. Sort to taste.

backup/Sat.usr.level1.dump
backup/Sat.var.level1.dump
backup/Sat.root.level1.dump
backup/Sat.fstab.txt
backup/Sat.bsdlabel_ad6s1.txt
backup/Sat.bsdlabel_ad4s1.txt
backup/Sat.root_crontab.txt
backup/Fri.home.level1.dump
backup/Fri.usr.level1.dump
backup/Fri.var.level1.dump
backup/Fri.root.level1.dump
backup/Fri.fstab.txt
backup/Fri.bsdlabel_ad6s1.txt
backup/Fri.bsdlabel_ad4s1.txt
backup/Fri.root_crontab.txt
backup/Thu.home.level1.dump
backup/Thu.usr.level1.dump
backup/Thu.var.level1.dump
backup/Thu.root.level1.dump
backup/Thu.fstab.txt
backup/Thu.bsdlabel_ad6s1.txt
backup/Thu.bsdlabel_ad4s1.txt
backup/Thu.root_crontab.txt
backup/Wed.home.level1.dump
backup/Wed.usr.level1.dump
backup/Wed.var.level1.dump
backup/Wed.root.level1.dump
backup/Wed.fstab.txt
backup/Wed.bsdlabel_ad6s1.txt
backup/Wed.bsdlabel_ad4s1.txt
backup/Wed.root_crontab.txt
backup/Tue.home.level1.dump
backup/Tue.usr.level1.dump
backup/Tue.var.level1.dump
backup/Tue.root.level1.dump
backup/Tue.fstab.txt
backup/Tue.bsdlabel_ad6s1.txt
backup/Tue.bsdlabel_ad4s1.txt
backup/Tue.root_crontab.txt
backup/Mon.home.level1.dump
backup/Mon.usr.level1.dump
backup/Mon.var.level1.dump
backup/Mon.root.level1.dump
backup/Mon.fstab.txt
backup/Mon.bsdlabel_ad6s1.txt
backup/Mon.bsdlabel_ad4s1.txt
backup/Mon.root_crontab.txt
backup/Sun.home.level0.dump
backup/Sun.usr.level0.dump
backup/Sun.var.level0.dump
backup/Sun.root.level0.dump
backup/Sun.fstab.txt
backup/Sun.bsdlabel_ad6s1.txt
backup/Sun.bsdlabel_ad4s1.txt
backup/Sun.root_crontab.txt
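
Every weekday dump is level 1, so, assuming standard dump(8) semantics (a level 1 dump holds everything changed since the last level 0), restoring a filesystem as of a given day should take just two files: Sunday's level 0 and that day's level 1. A sketch with a hypothetical helper, restore_chain, that lists them in restore order:

```shell
# Hypothetical helper: print, in restore order, the dump files needed to
# bring filesystem NAME back to its state on WEEKDAY.
# Assumes the ${WEEKDAY}.${NAME}.level${DUMPLVL}.dump naming scheme and
# the Sunday-full, level-1-otherwise schedule from the crontab.
restore_chain() {
        name="${1}"
        weekday="${2}"
        echo "Sun.${name}.level0.dump"
        if [ "${weekday}" != "Sun" ]; then
                echo "${weekday}.${name}.level1.dump"
        fi
}

# Lists Sun.home.level0.dump, then Thu.home.level1.dump
restore_chain home Thu
```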

Viva la Vino!

Over the weekend (and one of the reasons I didn’t get to post about Saturday’s ride until today) I rolled a quick wine database app for my wife and me. I’d been toying with the idea in the *very* back of my mind for a little while, and when Jami said “That would be really cool!” Saturday night, I understood how I would spend my Sunday.

It’s sad, really, how much PHP I’d forgotten. I used it about 10 years ago to write a quick and dirty rough draft for a mod_perl-driven LAN party organizing site that helped us, get this, organize LAN parties. The people involved in those awesome times were spread between Lawrence, KS and Kansas City, and we wanted an online RSVP system for our weekend frag-fests. Before it fizzled out due to people moving away, moving on, and generally growing out of it, it was a very nice community blog with event organization at its core. Hewn from scratch, it was also a great vehicle of personal growth in all things web design for yours truly.

The rough draft for that site was the last time I’d used PHP. It’s amazing how much one forgets. In building this little ditty, I had to relearn such language-specific basics as assignments and conditionals, to say nothing of deeply nested hashes and objects (all arrays in PHP are, apparently, associative – who knew?). Most of the concepts are fresh from my work (at work) with PowerShell, but the syntax, naturally, differs.

The engine, if I can be so bold, is still rough. In fact, it’s a tangled mess of burnt spaghetti code sticking to the bottom of the hard drive that doesn’t deserve the name “code”, but it works.

Our requirements are pretty simple:

  1. Must work easily in a mobile browser for updating while we’re out with nothing but our smart phones.
  2. Must have a rudimentary rating system so we know which wines we’ve liked, and which we haven’t.
  3. Wines must each be sufficiently described so we know what they are:
    • Vintage (year)
    • Varietal(s) (grapes)
    • Winery (producer)
    • Region (geographic location of winery)
    • My Wife’s Opinion
    • My Opinion
    • Comments (general notes to jog our memories)
  4. Must be able to search on a variety of fields seamlessly and simply.

And that’s pretty much it. Everything else is gravy.

While it’s fully functional and in working order now, like I said, it’s a hideous, ugly mess behind the scenes. From here, I want to clean it up, separate logic from presentation as much as feasible, make fuller use of objects and classes, incorporate more best practices, and generally learn as much as I can from it.

It’s really been a lot of fun!

This is the search form. It’s the header, basically, and appears at the top of the screen no matter what. Very simple.

This is the form for adding new wines to the database, available via the “Add” button at the top of every page. Again, very simple.

This is the form for updating wines that already exist in the database. I just call the add form code with the right values to populate the form. Code reuse is a beautiful thing.

These are the results when you search for something. I like the pretty. You gots yer yellers, yer reds, yer sparkleys, and yer pinks. Sum’pin’ for everyone!