Cool things I’ve gotten to play with at my startup

For the last year or so I’ve been working on a tech startup with some friends. In doing so, I’ve gotten to work with some pretty cool stuff, and I thought I’d make a list of it. Basically, I wanted to extol the virtues of working on a startup as a great way to get real-life practice projects – I have every expectation that we will have at least some success, but even if it ends up being a failure, here are some of the projects that I’ve gotten to work on:

  • wrote a spidering application
  • set up a mysql cluster
  • set up zimbra
  • researched several virtualization options
  • set up apt-proxies
  • wrote a custom smtp daemon / parser
  • learned a boatload about bayesian analysis and other pattern recognition and predictive tools
  • set up joomla
  • set up drupal
  • set up a linux natting/routing firewall
  • wrote project plans
  • managed & motivated
  • learned how to incorporate
  • set up dnsmasq for dhcp/dns masquerading
  • set up bind dns & replication
  • learned how to motivate people and lead weekly conference calls
  • worked on project management
  • marketing
  • sales
  • set up ldap authentication
  • set up openNAS

Now – I’ve done a bunch of these things before in previous jobs, but it’s still good practice and there were several I hadn’t played with before. It’s a great opportunity to learn, and even if it doesn’t end up succeeding, the time I’ve put in will not have been wasted. I’ve become a better programmer, a better leader, and I’ve had a lot of fun doing it.

If you are interested in keeping track of morale at your company or on your project, or in tracking how positive people are about your brand or a search term, we’re looking for beta customers. Feel free to ping me @nickbernstein on twitter if you think you might be interested.


A letter to my congresswoman

I thought I would put up a letter I recently wrote to my congresswoman, Maxine Waters, regarding the recent information that has come to light about torture under the Bush administration.

I am writing you today regarding the recent information about the torture which has taken place in the name of the American people. The atrocities, a word I do not use lightly, which have taken place *must* be investigated, and this needs to be done in a manner that is as fast and impartial as possible. I believe that there needs to be an investigation done not by Americans but by an international third party such as the UN. This cannot be perceived as a partisan attack against Republicans; it is too important for that, but an investigation and prosecution of this torture must be undertaken. Our country is nothing without the ideals and principles under which it was formed, and this must be dealt with.

Thank you for your time,
Nicholas Bernstein

If you are not familiar, here is some background information:

To balance it out, here is a video of baby pigs being cute:


A quick perl script for used car research on craigslist

So my 2001 VW Jetta is getting a bit up there in the miles – it’s about 95k at the moment, and while this isn’t too much for a VW, I’ve been wanting to get a new car and I figure I should get rid of it while I can still feel good about selling it to someone else. Besides, I want to get a convertible – I live in southern California, and if it’s not the appropriate climate for one, I don’t know what is.

Initially I went to CarMax and they offered me $2000. Yeeps! I was shocked. Could it really be worth that little? Kelley Blue Book said it was worth at least $4k. So I thought I would test the open market and write a quick perl script to give me the average price for an item on craigslist. Here’s how it works, and the code is below. It should be really easy to modify for anyone who could use something like this:

./ 2001+jetta
lowest:         1200
highest:        9999
Average:        6134.48214285714

Another example, where I search for a Porsche Boxster:

nick:~$ ./ porsche+boxster
lowest:         7600
highest:        33100
Average:        16865.6341463415

Anyway the code is below, and I’ll put a link to the actual perl script:

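#!/usr/bin/perl
use strict;
use warnings;

# Base search URL. The value was lost from the original post, so fill it in here.
my $wget = '';
my $query = shift @ARGV or die "usage: $0 search+terms\n";
$wget .= $query;
print $wget . "\n";
my $html = `wget -q -O - '$wget'`;

my ($lowest, $highest);
my ($amt, $count) = (0, 0);

# Walk every whitespace-separated token and keep the ones that look like prices
my @words = split(' ', $html);
foreach my $word (@words) {
    if ( $word =~ m/^\$/ ) {
        $word =~ s/(\$|,)//g;              # strip the dollar sign and commas
        if ( $word =~ m/^\d+$/ ) {
            $lowest  = $word if !defined $lowest  || $word < $lowest;
            $highest = $word if !defined $highest || $word > $highest;
            $amt += $word;
            $count++;                      # needed for the average below
        }
    }
}

die "no prices found\n" unless $count;
my $average = $amt / $count;
print "lowest:\t\t$lowest\nhighest:\t$highest\n";
print "Average:\t" . $average . "\n";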

Here’s a link to the actual script you can download. If you find it useful, let me know.


Playing with Bayes

For the last day or so I’ve been playing with moving over my simple word-count-analysis of blogs to actually creating a database with manually ranked training data and extrapolating from that. There were some hiccups and I’ve still got to go back and replace a lot of code, but it’s effectively categorizing new blog entries based on previous rankings. YAY! I’ve been using the perl Algorithm::NaiveBayes cpan module, and it’s pretty great – although the documentation is really poor. The main thing to get is that it returns a hash reference, which means you end up referring to your result as something like:

print "Sport's ranking: \t ${$result}{sports} \n" ;

Which, let’s face it, is kinda ugly (the arrow form, $result->{sports}, means the same thing and reads a bit better). It should really be better documented, though. Aside from that, as long as you get the back-end math, you’re pretty OK. Just because you’re doing AI stuff doesn’t mean that you’re automatically familiar with how perl handles references to hashes, though. One of my to-do items is to go back and update the perldoc on it. Anyway, with that in effect I’ve gone and updated the database and I’m now able to get positivity over time. This means I’m actually getting closer to building an internet happiness index, and predicting how “happy” the internet is as a whole. The next steps are:

  • incorporate new bayes functions into existing codebase
  • add more sources to the rss feedlist scripts
  • optimize the blogparser
  • put a nice (fusioncharts?) front end together
  • add more hardware for doing the categorization
  • get more people to do more training data
  • ???
  • profit!

Actually, the “???” is pretty well defined, but honestly, this project will have been fun even if it doesn’t make a dime. Anyway, one step closer.
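
For anyone who wants to see the back-end math, here is a minimal sketch of naive Bayes text categorization in Python. It is not the Algorithm::NaiveBayes code; the class name, the whitespace tokenizer, and the add-one smoothing are my own assumptions for illustration. Like the module, it hands back a mapping of category to score.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Toy multinomial naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> training docs seen
        self.vocab = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def predict(self, text):
        """Return {label: probability} for the given text."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in self.doc_counts:
            # Prior: how common this label was in the training data
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                # Add-one smoothing so unseen words do not zero out the score
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            log_scores[label] = score
        # Turn log scores into probabilities that sum to 1
        top = max(log_scores.values())
        scaled = {label: math.exp(s - top) for label, s in log_scores.items()}
        norm = sum(scaled.values())
        return {label: value / norm for label, value in scaled.items()}
```

Train it with a few hand-ranked entries per category, then call predict() on a new blog entry and read the score for the category you care about.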


Dear Bestbuy, why don’t you value me as a customer?

On my birthday, I drove from one store to another in the Boston area trying to find a Dell Mini 9 netbook. I eventually did: the last one they had, which was an open box return, mid wipe. I purchased it @ full price, and even had to sign something saying I was OK with it not having the OS/Drivers installed.

I was fine with this; I was planning on installing Ubuntu on it anyway. The problem was that w/in a few days the “p” and “o” keys had gone dead. No problem, I think, I’ll just take it back. I had left the box at my GF’s house, so I asked her to ship it to my apartment. Once I got home to California and had the box, I took it back to my closest Best Buy in El Segundo, CA, and they refused to do an exchange. An exchange. I don’t want my money back, or anything fancy – the keys don’t work, I just wanted to swap it out, no data transfer even, I wiped it for them. I cajoled, I bargained, I tried to explain what an example of bad customer service this was, but to no avail. The manager wouldn’t even come out to see me. ( !! )

In the last year I have purchased (off the top of my head):

  • a laptop (alum. macbook)
  • several mice
  • multiple laptop cases
  • two network routers
  • several video adaptors
  • a printer
  • two 160g usb hd drives
  • 1 terabyte usb hard drive
  • several video games
  • ethernet cables out the wazoo
  • a gigabit switch
  • a non-gigabit switch
  • extended power strips

I’m not going to enumerate everything I’ve purchased, but I buy a *lot* of computer equipment. I can’t imagine why Best Buy would have a policy in place that would be so strict that it would willingly lose a customer over an *exchange* – I mean, this is exactly the type of service that drove CompUSA out of business. I spent a bunch of money; value me as a customer. If anyone from Best Buy reads this, this is the resolution I would like: contact the Best Buy in El Segundo, and ask them to exchange my netbook. That’s it. Like for like. Otherwise, there’s a Fry’s Electronics a block away, and I’m sure they’d be happy to take my money.

-Nick Bernstein.


An alternative approach to snaprestore rollbacks for virus outbreaks


Figure 1


Netapp’s snap restore product is a fantastic tool in a storage admin’s arsenal. It works well. It’s fast, and it doesn’t need to “restore” data, it just makes a previous snapshot the active file system. That said, I keep seeing the same scenario put forth by netapp and various other folks as a use case for snaprestore, and it’s one where snaprestore would be my second choice. The scenario is this: We’re taking hourly snapshots over the course of a day. A virus breaks out. The files on our cifs shares have been compromised. We’re sure that our current data is infected, and we know that at least some of the data was infected an hour ago. Two hours ago is uncertain, but we are sure that the virus hadn’t broken out three hours ago, so we know the file system at that point is clean. The recommendation you always hear is to use snap restore to roll back the file system to three hours ago, and you’re now clean. Unfortunately, as a side effect, we’ve lost all the data written after that point – which, I should mention, is the intended result: if we hadn’t, our files would still be infected, right? My solution is to instead clone the volume using a really neat tool called flexclone.

some snapshot basics

I’d like to suggest an alternate method, but before I get into it, it probably makes sense to talk about how snapshots work. The idea is very simple. Each file on a file system has an inode. An inode contains information about a file – the owner, when it was created, etc. If you’re on a windows box, and you view the “properties” of a file, you’re seeing the information contained in an inode. A hard drive is made up of blocks of data, and another thing that an inode does is point to the blocks that make up the data for a given file. If a file is big, the inode won’t actually point to the data blocks themselves; it will point to indirect blocks which then point to data blocks ( inode -> indirect level 1 -> data ). If a file is really big we can have multiple levels of indirection, like in figure 1. A snapshot is, for all intents and purposes, a copy of just that top level inode block. This block then contains pointers to the indirect blocks, and those indirect blocks point to the data blocks. ( Is this fun yet? ) Once we’ve made a copy of the inode (parent), the indirect and data blocks (children) are effectively frozen, since a block cannot be changed unless its parent, or parents (original inode and now the snapshot), agree. The way we handle modifications to files in the netapp world is to write those changes to a special reserved section of the volume called the “snapshot reserve” (apt) and to update the original inode to point to those blocks. It’s pretty slick, really.


In order to understand how this alternative solution works, we need to understand how flexclone works. Flexclone – to oversimplify things – is a writeable snapshot. Actually, it’s not that much of an oversimplification. If we go back to our snapshot basics, where we talked about how each file has an inode which points to data blocks, I should mention that under the hood, everything on a filesystem is actually just, well, a file… even directories. A directory is just a file that contains a list of file names and where the inodes for those files are; that’s the data in that “file’s” data blocks. The top of any volume (think of this as a drive) is a directory, and that directory, in turn, has an inode. This one we call the “root inode”. When we take a snapshot of a volume, this is the inode we’re basing our snapshot off of. So, what we end up with is: [ root inode ] [ snapshot copy of root inode ]. What we can do is add one more copy to make a “clone” of the volume. This leaves us with: [ root inode ] [ snapshot copy of inode ] [ clone copy of inode ]. Now we can still write to our normal volume by writing to that snapshot reserve that we talked about, and what we do when we create a flex clone is associate a *new* snapshot reserve with the clone. The cool thing is that initially the clone takes up no space. Ok, that’s a lie, it takes up a couple of kilobytes, but we’re talking storage systems here; a couple of kilobytes is nothing. This is really useful for things like backing up a database, or doing QA – there’s a whole bunch of use cases, such as, oh, avoiding the unnecessary data loss of using snap restore in the event of a virus outbreak.
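
If it helps to see the pointer-copying idea in code, here is a toy model in Python. It is only a sketch of the concept described above, not how the filer actually lays out blocks; the BlockStore and Volume classes and their method names are hypothetical. A root inode is modelled as a dict of file name to block id, a snapshot is a frozen copy of that dict, and a clone is a new writable volume that starts from a snapshot and shares the same blocks.

```python
class BlockStore:
    """Shared pool of data blocks; a block is never overwritten once written."""

    def __init__(self):
        self.blocks = []

    def allocate(self, data):
        self.blocks.append(data)
        return len(self.blocks) - 1  # block id


class Volume:
    def __init__(self, store=None, root=None):
        self.store = store if store is not None else BlockStore()
        self.root = dict(root) if root is not None else {}  # active root inode
        self.snapshots = {}  # snapshot name -> frozen copy of a root inode

    def write(self, name, data):
        # Changes go to a fresh block; only the active root inode is repointed
        self.root[name] = self.store.allocate(data)

    def read(self, name, snap=None):
        root = self.root if snap is None else self.snapshots[snap]
        return self.store.blocks[root[name]]

    def snapshot(self, label):
        # Copies pointers only, never the data blocks themselves
        self.snapshots[label] = dict(self.root)

    def clone(self, label):
        # A writable snapshot: new root inode, same underlying blocks
        return Volume(self.store, self.snapshots[label])
```

Write a file, snapshot, overwrite the file, then clone from the snapshot: the clone reads the old contents, the original volume reads the new ones, and no data block was copied along the way.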

the actual solution

Ok, now that we’ve brushed up on our netapp basics, let’s review our problem:

  • we’ve taken hourly snapshots
  • our active file system is infected with a virus, we’ll assume the volume name is vol1
  • so is our previous hourly snapshot and probably the one before that
  • the normal recommended solution would be to snap restore to hourly.2 (three hours previous) before the virus broke out

Using this method would mean that we would lose three hours worth of data. That’s not good. Preventing data loss is the whole job of a storage admin, as far as I’m concerned. So, an alternate method:

  1. take cifs offline, using the cifs terminate command
  2. on your admin host copy the file /vol/vol0/etc/cifsconfig_share.cfg -> cifsconfig_share.cfg.YYMMDD.pre-virus.bak
  3. make a clone of vol1 called “vol1_clone” with the command: vol clone create vol1_clone -b vol1 hourly.2
  4. open /vol/vol0/etc/cifsconfig_share.cfg and do a find and replace, changing every instance of vol1 to vol1_clone
  5. run the command cifs restart
  6. run the command cifs shares -add oldvol1 /vol/vol1
  7. lock it down with cifs access oldvol1 <your user> "Full Control"

So what did this do?

We quarantined the file system by stopping cifs and removing client access. Then we created a new clone of the volume that had been infected, based off of the snapshot we knew was clean. We updated all of our shares using a quick find and replace, giving our users access to their data so they can get back to work. We also exposed our old volume via a new cifs share, which we’ve restricted to only our user, who theoretically knows better than to muck with infected files. So why did we do all this work? Later, when new virus definitions come out, we can scan that volume overnight and have our anti-virus program go clean up our mucky infected files. Once we do, we can open up the share, send out an email, and give our users the ability to recover any important documents that had been created during those three hours, saving extra work, and potentially saving compliance headaches.
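
The find and replace in step 4 is the one place where a sloppy edit can bite you, so here is a minimal sketch of it in Python. It assumes the share definitions in the config file reference paths of the form /vol/vol1/...; the function name repoint_shares and the sample lines in the usage note are made up for illustration. Note that it is deliberately narrower than "every instance of vol1": it only touches whole volume names under /vol/, so a volume called vol10 is left alone and running it twice does no harm.

```python
import re


def repoint_shares(config_text, old_vol, new_vol):
    """Rewrite /vol/<old_vol> paths to /vol/<new_vol> in share config text."""
    # The lookahead stops "vol1" from matching inside "vol10" or "vol1_clone"
    pattern = re.compile(r"/vol/" + re.escape(old_vol) + r"(?![A-Za-z0-9_])")
    return pattern.sub(lambda match: "/vol/" + new_vol, config_text)
```

Feeding it a line such as cifs shares -add "users" "/vol/vol1/users" with old_vol "vol1" and new_vol "vol1_clone" gives back the same line pointing at /vol/vol1_clone/users. Keep the backup from step 2 around either way.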

There is a final part of this, in which we split off the clone using the vol clone split start command. I would note – you will (at least temporarily) need to have enough free space on your aggregate to contain both of the volumes, w/o any space savings. Once your split is finished, you can do a vol offline vol1 ; vol destroy vol1 to get rid of the old volume and free up that space.

I think this is a better solution to the problem, and one that’s much more elegant than using a snap restore. I’d love to hear any feedback or improvements to this process, so if you found this useful, or can think of a better way, please let me know.


Teachin’ ain’t easy.

I am very, very jet lagged. I’m not entirely sure what it is, but there’s something that knocks you on your ass when you travel in the middle of the week. I think it’s that everyone’s a *little* tired on Mondays so the world feels like it’s running at a slightly slower pace. Oh well. Short week anyway, and I’m off to Boston on Friday, which is a much shorter flight than heading back to LA. 

I think part of it is the fact that this is a new course I’m teaching this week – I taught the previous version, but the slides have all been changed and the content moved around. I try to get into a flow with the classes I teach, and build up entertaining stories and adages and whatnot- try to be the kind of teacher I liked when I was in school- and it takes a while to get that going with new materials, I think. Maybe I’m just hyper sensitive, but the first time I teach a new class, I never feel like I’m doing it well enough, even if it seems to be going well. I’ve kept myself up re-reading the slides and my previous notes and trying to see if there are any issues by re-doing the labs, but I probably won’t like where it’s at until right before this class gets retired and a new version comes out. Oh well – I guess if it felt perfect the first time through, I’d be bored out of my mind. I really do like teaching; it’s probably one of the most fun jobs I’ve had.

It’s also fun coming up with notes for the classes. I tend to go a little nuts with supplemental materials, but I think it’s a fun thing to do. Right now I’m on a career development kick- I try to find supplemental things that tie into the classes but also help the students do things like put together Business Continuance Plans and pull together requirements for other groups – get visibility w/in an organization and whatnot.

Anyway, time for sleep. Hopefully tomorrow I’ll be awake enough to go try and find some cuban food. mmm…. plantains… 🙂


Heading to Ft. Lauderdale

I’m blatantly ignoring the fact that I need to get up early tomorrow to fly to Ft. Lauderdale, and I’m not packed. I’m looking forward to it. Depending on how busy I am, there’s a chance I may get to see my grandmother, who’s awesome, and makes delicious pasta, which tastes unlike any other “gravy” I’ve ever had. SO good.

They re-did all the courseware for Data ONTAP 7.3, and while it’s better than the previous versions, it’s a HUGE hassle to try and memorize the slide decks and labs for the new courses. Especially since I’ve been studying like a mad-man for the LPI certification so I could pass it by the time the Ubuntu train the trainer sessions get going. FYI – I see almost no value in the LPI certification. It’s a requirement to be an Ubuntu trainer, and I love the idea of teaching people linux, so it’s a necessary evil, but in my ten or so years of system administration, I don’t think I’ve ever said, shit – someone tell me what the IRQ for ttyS0 is – and no looking at google! Google exists – this idea of memorizing things for the sake of passing tests is just silly. I’d much rather have lab-based testing. Install the OS. Better yet, set up a tftp boot server, and get an install going that way. Troubleshoot this card not working… use whatever fricken resource you want, hell, call support… just get it done in 30 minutes. Someone needs to make a certification which is 100% lab based and takes like 40 hours, but you can do it all from home via the web. Aaaaanyway. I digress.

It will be nice to be in Ft. Lauderdale though. I really like that town, and I’ve had a couple students from there in other classes – all of whom I’ve liked, oddly, so maybe I’ll see some familiar faces. Plus – CUBAN food. Cuban food is fucking delicious, and outside of Cuba, you can’t get better Cuban food than in Florida. Feh. I should really pack though; I’m off to Boston the week afterwards, and then to DC (?) for the Ubuntu TTT session, which means flipflops and overcoats. 🙂


I don’t want to take my shoes off.

I’ve thought about all the different things that I’d like to see from an Obama Administration, and I think that’s the one I’d like to see first. This fixation on (false) security at airports is one of the most visible changes we’ve seen with the Bush administration. I’d like to not have to take my shoes off anymore: I think that would be a tangible sign of “going back to normal.” I’d want the rest of course, repealing the Patriot Act, an end to warrantless wiretapping, getting out of Iraq and eventually Afghanistan, but initially, just initially, let’s stop with the politics of fear, and let old ladies keep their dignity at the airport.


Ops 101: Change Control


A while back I decided I wanted to put down on paper some of the lessons I learned working as a systems admin, including in an enterprise environment with thousands of servers. Some lessons I learned in the small shops, some I learned at bigger shops. This is going to be the first in my “Ops 101” series, part of a broader series of essays about lessons I’ve learned.

The Processes

Process. Procedures. We all hate doing them. Without them, however, all the other stuff never gets done. Sure – you’ll start off documenting the changes you make, and checking in that script, but unless there’s a process in place and accountability to back that process up, a couple of months down the line you’re going to be asking, “wait, where does that script run again?” and “that was upgraded? Since when?!” – without good process, you’ll lose all of those good habits that you’ll thank god you have when things start to break.

Change Management

Change management is actually a very simple thing: What. When. Why. Where. How. Specifically: What do we want to change? Why do we want to change it? When do we want to make this change? How do we want to make this change? And where do we want to make it? It’s that simple. The hard part is getting people to do it. The benefits are huge, however. I can’t count the number of times we’ve found an issue, started tracking it back, dug through the logs and found that it happened yesterday at three pm. Four out of five times you can go back and see that some app was deployed the previous day at about the same time, and now the problem is easy to solve. In addition, no matter how good the documentation you’re going to get from dev is (and let’s face it, in services… usually it ain’t so good), it is worth all the hassle in the world to be able to look up how you did something a year ago when it randomly comes up again.
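
That "track it back to a deployment" step is easy to automate once every change is logged with a timestamp. Here is a minimal sketch in Python, assuming each logged change is a dict with hypothetical "what" and "when" keys exported from the ticketing system; the function name changes_near and the two hour default window are my own choices for illustration.

```python
from datetime import datetime, timedelta


def changes_near(changes, incident_time, window=timedelta(hours=2)):
    """Return the logged changes made within `window` of the incident."""
    return [
        change for change in changes
        if abs(change["when"] - incident_time) <= window
    ]
```

Hand it the change log and the time the problem first showed up in the logs, and the short list it returns is where you start looking.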

Change management basics

The following are the things you will need to have a successful change management system:

  • A meeting. Sorry: you’ll just have to live with it. 30 minutes a day won’t kill you
  • A ticketing system
  • Signoff from the ops team that they will use it 100% of the time
  • Signoff from the rest of the company that they will only escalate things using the ticketing system

The changes

Changes fall into three basic buckets: Emergency Change Requests (ECR), (Scheduled) Change Requests (CR), and Standard Operating Procedures (SOP). These should be fairly self explanatory: ECRs happen in emergencies. If an army of zombies breaks into the office, your first thought should rightly be: how do I deal with the zombies. Afterwards, providing you live, you would create an ECR. This will enable the next guy to do a quick search for “zombies” in the ticketing system and see that there is an emergency shotgun hidden behind the UPS in the server room, thus not having to go through the harrowing ordeal of sacrificing all of those sales guys before remembering it was there. He could, instead, just sacrifice *some* of the sales guys.

CRs will come up a lot of different ways. Client Services will request things. Deployments will need to be done. Sysadmins will think of better ways to do things. Change happens. Anything you don’t need to do *right now* goes into the CR bucket. You look at these in your change meeting, decide when and if they should be executed and if the process of the change can be improved. Then you dole them out to your various system admins to do the actual work.

SOPs are the basic “can you run that script that you put together that fixes the mailserver again” type stuff. You do it often enough that it’s “no big deal”. The ticket is just there a) for tracking purposes and b) so you can look it up when the guy who runs the script is out. These will eventually be a good chunk of the stuff in your wiki, but more on that later.
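
To make the buckets concrete, here is a minimal sketch in Python of what a ticket carrying the What/When/Why/Where/How questions might look like, plus a rule of thumb for picking the bucket. The ChangeTicket class, its field names, and the classify function are all hypothetical; no particular ticketing system is assumed.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class ChangeType(Enum):
    ECR = "Emergency Change Request"
    CR = "Change Request"
    SOP = "Standard Operating Procedure"


@dataclass
class ChangeTicket:
    what: str
    why: str
    when: datetime
    where: str
    how: str
    change_type: ChangeType = ChangeType.CR

    def needs_review_meeting(self):
        # ECRs are documented after the fact and SOPs are routine,
        # so only scheduled CRs wait for the change meeting
        return self.change_type is ChangeType.CR


def classify(is_emergency, has_documented_procedure):
    """Pick a bucket: emergencies first, then routine work, else a scheduled CR."""
    if is_emergency:
        return ChangeType.ECR
    if has_documented_procedure:
        return ChangeType.SOP
    return ChangeType.CR
```

The zombie attack is classify(True, False); rerunning the mailserver fix script is classify(False, True); everything else lands in the CR pile for the meeting.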

The Change Review Meeting

ECRs generally get phone approval from someone in management and then are documented afterward. They generally are followed by a meeting to explain what the heck happened, and a formal root cause analysis.

The agenda is simple: is it approved? Who’s doing it? When do we want to do it? Next. This should be a quick meeting. It won’t be. Who should be there: Senior Systems Admin, Director of Operations and a representative from the other teams. Dev & CS, at least, should have a seat at the table; others are probably optional.

The Software

There are many out there, and as you grow you might want to look at purchasing a commercial ticketing system or making modifications to your existing one, but given the pricetag of free and how widely used it is, I’d recommend RT. It’s simple, it’s good and it works.


RPM Install page

Current version-release: 3.4.5-2

Summary: This aims to be the solution to an easy RPM install of RT on RHEL4/CentOS4. (Although these packages have been reported to run under Fedora Core 4, it seems that they have their own now; see the section below.)


Old releases: rt-3.0.10-3 still available under 3.0.x directory.

WARNING: These packages were built on the assumption that SELinux is turned off (*Any help making them support both modes would be great!*).

Package Description


It was built with mysql and apache2/modperl2 (2.0.1). It has no patches at the moment, but might need some to correct known problems. To see details at any time, do:

rpm -qp --changelog rt-<version>-<release>.noarch.rpm


rt-mail-dispatcher: This is a setup for an RT mail dispatcher using sendmail and procmail. It is based on the assumption that you use one domain for all your RT queues, e.g.
This allows you to set up queues in RT, using the following convention syntax:




without having to reconfigure your mail settings every time.
‘postmaster’ is reserved to be RFC822 compliant, and should be set up correctly; it defaults to user postmaster. You can always change it to be an RT queue as well.

Installation Notes

With yum

RT’s three step install procedure:

  1. Download the file:
  2. Copy it to /etc/yum.repos.d/ or append it to /etc/yum.conf:

cat rt-3.4.x.repo >> /etc/yum.conf

  3. Then type, as ‘root’:

yum install rt rt-mail-dispatcher


You’ll have rt installed in no time… then all you have to do is configure a few settings as the messages suggest.  

Note: Depending upon which Perl modules you had installed in the past, you may have to update before installing via yum. If a whole lot of dependency errors display when you run yum install, then type the following:


yum install rt rt-mail-dispatcher

Without yum

Just download everything to a directory and do:

rpm -Uvh *.rpm


Post Installation

A user pointed out to me that he was in such a hurry to try it out that he lost the messages that appeared after install. He also suggested I create a file with those messages inside. Meanwhile, here they are:

  • rt

/etc/rt/ /etc/rt/

generate an editable site config file.

You must now configure RT by editing /etc/rt/ and

You will definitely need to set RT’s database password before continuing.

Not doing so could be very

After that, you need to initialize RT’s database by running

--action init --dba root

If something goes wrong you can always drop everything, by executing

--action drop --dba root

  • rt-mail-dispatcher

You must now configure some things by editing /var/rt/home/.procmailrc,

read /usr/share/doc/rt-3.4.5/README.mail-dispatcher.

