I’ve thought about all the different things that I’d like to see from an Obama administration, and I think that’s the one I’d like to see first. This fixation on (false) security at airports is one of the most visible changes we’ve seen under the Bush administration. I’d like to not have to take my shoes off anymore: I think that would be a tangible sign of “going back to normal.” I’d want the rest, of course: repealing the Patriot Act, an end to warrantless wiretapping, getting out of Iraq and eventually Afghanistan. But initially, just initially, let’s stop with the politics of fear, and let old ladies keep their dignity at the airport.
A while back I decided I wanted to put down on paper some of the lessons I learned working as a systems admin, including time in an enterprise environment with thousands of servers. Some lessons I learned in the small shops, some I learned at bigger shops. This is going to be the first in my “Ops 101” series, and a broader series of essays about lessons I’ve learned.
Process. Procedures. We all hate doing them. Without them, however, all the other stuff never gets done. Sure, you’ll start off documenting the changes you make and checking in that script, but unless there’s a process in place and accountability to back that process up, a couple of months down the line you’re going to be asking, “Wait, where does that script run again?” and “That was upgraded? Since when?!” Without good process, you’ll lose all of those good habits that you’ll thank god you have when things start to break.
Change management basics
Here is everything you need for a successful change management system:
- A meeting. Sorry: you’ll just have to live with it. 30 minutes a day won’t kill you.
- A ticketing system
- Signoff from the ops team that they will use it 100% of the time
- Signoff from the rest of the company that they will only escalate things using the ticketing system
Changes fall into three basic buckets: Emergency Change Requests (ECR), (Scheduled) Change Requests (CR), and Standard Operating Procedures (SOP). These should be fairly self-explanatory: ECRs happen in emergencies. If an army of zombies breaks into the office, your first thought should rightly be: how do I deal with the zombies? Afterwards, providing you live, you would create an ECR. This will enable the next guy to do a quick search for “zombies” in the ticketing system and see that there is an emergency shotgun hidden behind the UPS in the server room, thus not having to go through the harrowing ordeal of sacrificing all of those sales guys before remembering it was there. He could, instead, just sacrifice *some* of the sales guys.
CRs will come up in a lot of different ways. Client Services will request things. Deployments will need to be done. Sysadmins will think of better ways to do things. Change happens. Anything you don’t need to do *right now* goes into the CR bucket. You look at these in your change meeting, decide if and when they should be executed, and whether the process of the change can be improved. Then you dole them out to your various system admins to do the actual work.
SOPs are the basic “can you run that script that you put together that fixes the mailserver again?” type stuff. You do it often enough that it’s “no big deal”. The ticket is just there a) for tracking purposes and b) so you can look it up when the guy who runs the script is out. These will eventually be a good chunk of the stuff in your wiki, but more on that later.
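If it helps to see the three buckets as a decision rule, here is a minimal Python sketch. It is only an illustration of the sorting described above: the `Ticket` fields and the `classify` function are names I made up, not part of any real ticketing system.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeType(Enum):
    ECR = "Emergency Change Request"
    CR = "Change Request"
    SOP = "Standard Operating Procedure"


@dataclass
class Ticket:
    # Hypothetical fields, chosen only to mirror the three buckets.
    summary: str
    is_emergency: bool = False         # has to be dealt with right now
    has_known_procedure: bool = False  # routine, already scripted or documented


def classify(ticket: Ticket) -> ChangeType:
    """Sort a ticket into one of the three change buckets."""
    if ticket.is_emergency:
        return ChangeType.ECR  # act first, document afterwards
    if ticket.has_known_procedure:
        return ChangeType.SOP  # ticket exists for tracking and lookup
    return ChangeType.CR       # goes to the change review meeting


zombies = Ticket("Zombies in the office", is_emergency=True)
mail_fix = Ticket("Run the mailserver fix script", has_known_procedure=True)
deploy = Ticket("Deploy new release for Client Services")
```

Emergencies are checked first on purpose: even a well-rehearsed procedure gets treated as an ECR when it has to happen immediately.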
The Change Review Meeting
ECRs generally get phone approval from someone in management and then are documented afterward. They generally are followed by a meeting to explain what the heck happened, and a formal root cause analysis.
The agenda is simple: Is it approved? Who’s doing it? When do we want to do it? Next. This should be a quick meeting. It won’t be. Who should be there: the Senior Systems Admin, the Director of Operations, and a representative from each of the other teams. Dev and CS, at least, should have a seat at the table; others are probably optional.
There are many ticketing systems out there, and as you grow you might want to look at purchasing a commercial one or making modifications to your existing one, but given the price tag of free and how widely used it is, I’d recommend RT [i] from bestpractical.com. It’s simple, it’s good and it works.
RPM Install page [ii]
Current version-release: 3.4.5-2
Summary: This aims to be the solution to an easy RPM install of RT on RHEL4/CentOS4. (Although these packages have been reported to run under Fedora Core 4, it seems that it has its own now; see the section below.)
Old releases: rt-3.0.10-3 still available under 3.0.x directory.
WARNING: These packages were built on the assumption that SELinux is turned off (*any help on making them support both modes would be great!*).
rt: It was built with mysql and apache2/modperl2 (2.0.1). It has no patches at the moment, but might need some to correct known problems. To see details at any time, run:
rpm -qp --changelog rt-<version>-<release>.noarch.rpm
rt-mail-dispatcher: This is a setup for an RT mail dispatcher using sendmail and procmail. It is based on the assumption that you use one domain for all your RT queues, e.g. @rt.yourdomain.com.
This allows you to setup queues in RT, using the following convention syntax:
without having to reconfigure your mail settings every time.
‘postmaster’ is reserved to be RFC822 compliant, and should be set up correctly; it defaults to user postmaster. You can always change it to be an RT queue as well.
With yum (http://linux.duke.edu/projects/yum/download.ptml)
RT’s three step install procedure:
- Download the file: http://campus.fct.unl.pt/paulomatos/rt/repository/3.4.x/rt-3.4.x.repo
- Copy it to /etc/yum.repos.d/ or append it to /etc/yum.conf:
cat rt-3.4.x.repo >> /etc/yum.conf
- yum install rt rt-mail-dispatcher
Note: Depending upon which Perl modules you had installed in the past, you may have to update before installing via yum. If a whole lot of dependency errors display when you run yum install, then type the following:
yum update
yum install rt rt-mail-dispatcher
Just download everything to a directory and do:
rpm -Uvh *.rpm
A user pointed out to me that he was in such a hurry to try it out he lost the messages that appeared after install. He also suggested I create a file with those messages inside. Meanwhile here they are:
generate an editable site config file.
You must now configure RT by editing /etc/rt/RT_SiteConfig.pm and
you will definitely need to set RT’s database password before continuing.
Not doing so could be very
After that, you need to initialize RT’s database by running
If something goes wrong you can always drop everything, by executing
You must now configure some things by editing /var/rt/home/.procmailrc,
i – http://bestpractical.com/
I didn’t think I’d ever say that. I’m a long-time Apple user, dating all the way back to pre-Performa Macs – before the iPod, before OS X, before Macs were cool again. I’ve been using them for a long time. And as much as I hate to say it, Apple’s increase in market share is ruining their customer support. In the life of my current Mac – a 2GHz white MacBook – I’ve had three terrible customer service experiences, all in the span of a year.
The first experience was after I bought my laptop. I bought it at a store called Fry’s Electronics. Fry’s Electronics is a terrible place, and I should have known better. The laptop had been returned by someone else and they had put it back on the shelf. I didn’t notice at first, but over the first couple of days I realized that someone had been using it, and that there was a hairline crack in the case on the edge near the trackpad. Unfortunately, I had bought it just prior to moving, and was unable to go back to the store where I had bought it. No problem, I thought, I’ll just go to the Apple store: Apple’s customer service is awesome; they’ll take care of it.
When I got to the Apple store, they said they wouldn’t fix it because I bought it somewhere else. “It’s a week and a half old,” I said, “and the store where I got it is in Seattle!” I was hoping that being in Los Angeles might take care of it. “Deal with Fry’s” was the only answer I got.
Next I bought a printer online, via the Apple store. Everything went well, until – shit! – I picked the wrong address. I had bought my dad an iPod for his birthday and his address was the default. Since we have the same last name, I didn’t notice the difference until the confirmation email popped into my inbox a few minutes later. I tried to change it via the site, but couldn’t, so I called Apple tech support. They were closed. I called the next morning as soon as the lines opened and asked to cancel my order. The rep was nice, but told me, “You can’t, it’s already in the warehouse.” Looking online, I could see the status showed it unshipped. “We can’t cancel it.” I ended up having to have my parents refuse the order (twice), pay the extra shipping, and ultimately “deal with it,” which seems to be the Apple customer service mantra of late.
My most recent poor customer service experience occurred today. The MacBook Air’s SSD capacity has been increased to 128GB. I travel a lot and weight matters. I had pretty much decided to shell out the three grand and buy one. To make the price tag a little more palatable, I thought I could sell my current MacBook to recoup some of the expense. It’s still got that crack though, so I felt guilty about passing it on in less than perfect condition. I went to the Apple store and they told me I’d have to wait for a “genius” to look at it, which would be a forty-five minute wait. “I’d like to just buy the part, please,” I said, being very technical and more than comfortable fixing a broken case. “You can’t buy those,” the genius replied. Showing him a link to a site where they can be purchased, I asked again if I could buy it. Again he refused. “You can wait, or you can go to an authorized Apple repair shop. You’re not allowed to open it up.” That’s the point where I left. I spent $1000 on this laptop. I own it. I can do whatever I damn well please with it, even if that’s chucking it into a wall. I don’t have an extended warranty, so it’s none of their business what I do with it. I just want the part. In addition, I bought the original iPhone, for the original price. I bought the new iPhone and gave my old one to my sister. I bought my parents a new iMac – top of the line configuration – and I was, until about 20 minutes ago, going to spend another $3k on a MacBook Air.
I’m a technical instructor. I talk to a lot of people. I’ve turned countless students on to the Mac by showing how it’s got a real Unix subsystem and I can still use my Microsoft apps. I’m pretty sure I won’t be doing that any more. As nice as their devices are, their customer service experience is so bad that I can’t in good conscience recommend their products any more.
First and foremost, MMS should have been a standard feature in the original iPhone release. This is a given. The lack of it is all the worse because it’s such a basic feature, and it requires responding to the inevitable question, “Did you get that?” with “Well, yes, but I have to log into the site and I don’t want to do it on the phone, ’cause I can’t cut and paste and…” This reminds us that we don’t have cut and paste and how damn annoying that is.
If you haven’t seen it before, when someone sends you a picture message on the iphone you get a message saying, “Someone has sent you a picture, go to www.viewmymessage.com and login with the username and password: blah and blah” The usernames they pick are randomly generated strings of characters and not easy to remember, and really, this is an unacceptable solution.
Fix: Change the message. Instead of a username and password, give a direct link such as http://viewmymessage.com/?u=xxxxxx&p=yyyyyyyy – same information, slightly different formatting, problem solved.
Dear AT&T, this is such a simple, simple problem to solve, and one with a lot of visibility. Please fix it.
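For the curious, building that kind of link takes a couple of lines. This is just a sketch in Python using the standard library; the parameter names `u` and `p` come from the example link in this post, and I have no idea what the real site actually accepts, so treat them as placeholders.

```python
from urllib.parse import urlencode


def direct_link(username: str, password: str) -> str:
    """Build a single clickable link carrying both credentials.

    The "u" and "p" parameter names are assumptions taken from the
    example in this post, not a documented interface.
    """
    # urlencode escapes any characters that are not safe in a query string.
    query = urlencode({"u": username, "p": password})
    return f"http://viewmymessage.com/?{query}"


link = direct_link("xxxxxx", "yyyyyyyy")
# link == "http://viewmymessage.com/?u=xxxxxx&p=yyyyyyyy"
```

Since the credentials are already being sent in the clear in the text message, putting them in the link does not give away anything the message was not already giving away.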
The Kindle is a great idea. It’s a small, single-purpose device that replaces the book. Apparently the user interface isn’t all that great, but the screen, which looks much more like an Etch A Sketch than an LCD display, is so readable that users have been overjoyed with the device despite the UI. Portable books, easy readability, a library in your hand, digital distribution: what’s not to love, right?
DRM, or Digital Rights Management, is crap. It’s what says that you don’t OWN the song you bought on iTunes, you own a LICENSE to play it. This is why MP3 is actually winning. Those services like Yahoo Music and Rhapsody (which I like) are going under. Yahoo just announced that as of the end of this month, songs bought at the Yahoo music store will no longer be able to re-up their licenses. Your music will keep playing, but only if you keep it on the same computer. Ouch. This sucks, but you can work around it. It’s only music, right? Imagine if this happened with books. Maybe books are a little too important not to own. Right now it’s not a big deal, but imagine the Kindle became the basis of future generations of book readers, and we essentially stopped making paper books at some point due to environmental factors and whatnot. Now imagine ::poof:: Amazon goes under. After two hundred and fifty years of profitability they’re closing the doors and shutting down. And, oh, your books aren’t going to work anymore.
I generally don’t consider myself too much of a conspiracy guy. I believe in privacy and the Fourth Amendment, but I don’t go all crazy about it. That said, having my book phone home freaks me out, and it will phone home. That’s what DRM does; it won’t play unless it’s authorized, so something needs to check if it’s authorized to be played by you on your device. Things have gotten a lot more Orwellian since 9/11 in this country, and there was already that uproar soon after the Patriot Act about the government being able to pull people’s library records. Fortunately, some librarians who did their colleagues proud refused and made a stink about it. Big corporations probably aren’t going to be nearly as concerned, as the phone companies so nicely showed us.
Can’t loan your books
“Dude, you’ve got to read this!” Isn’t that one of the best parts of reading? Passing on the gems that you’ve found? I mean, literally, passing them on? You can’t do that with a Kindle: it’s not legal. You can’t sell your “used books” when you’re done. You don’t own them.
So to summarize:
- DRM is scary. It’s scary on music and movies, but it’s INCREDIBLY scary when it’s books.
- DRM phones home to licensing servers.
- DRM means you can’t pass on your books when you’re done… you don’t own them anymore.
Here’s what I hope happens – we take the screen and put it on tablet PCs, or phones. The One Laptop per Child program has a dual screen mode which in its low power version looks very similar to the Kindle. Once it’s on the computer, let’s just buy fricken PDFs and be done with it.
Normally when discussing space and space exploration, the arguments in favor of it cite the expected: research and development has a trickle down effect to the free market, giving us inventions like velcro, microwave ovens, and mattresses.
The Human Disaster Recovery Plan
The fact of the matter is, at some point, the world is going to blow up. OK, maybe it won’t actually blow up. We could just be hit by an asteroid and have a nuclear winter killing off our entire species, or experience a deadly plague which exterminates our species, or some nut in Israel or Pakistan launches a nuke… at some point humans on Earth will be wiped out. I’ve been thinking about this a lot lately. Every single time I do, I keep coming to the same conclusion: we need a “disaster recovery plan”.
For those of you unfamiliar with the concept, a disaster recovery plan is something companies do to guard their assets. In its simplest form, it’s a backup copy of important documents like your birth certificate kept in a safety deposit box in the bank. At the other end of the spectrum is “Iron Mountain”. Iron Mountain was the bomb shelter of all bomb shelters. Out in Nevada (I think) the US government did a geological survey to find a mountain made mostly of iron ore. This mountain was hollowed out and inside there were shelters for important people to be evacuated to, and tons and tons of books, computer tapes – everything one would need to restart US civilization in the case of a nuclear war. A more normal case is to have two smaller offices instead of one, or for Google to have servers located in multiple areas of the country in case there is an earthquake or something.
We need a disaster recovery plan in case we lose Earth. This may seem like a crazy concept, but it’s a natural phenomenon. If you look at how distributed and isolated pockets of humanity were from each other before we got high-speed transportation, it mimicked this fairly closely. Take the Black Plague for instance. Half of Europe was wiped out by it. Five in ten people died. Think about that for a minute. Think of the last ten people you’ve talked to on the phone – half of them would have died if it had happened now instead of then. In fact, it was one of the reasons the “dark ages” lasted so long – before the bubonic plague, Europe was getting close to getting back to an era of civilization – the plague set it back one hundred years. If you look at the rest of the world, however, it was relatively unaffected – the Middle East was untouched, as were the Americas. The Mayans and Incas were happily chugging along while Arabs were evolving math and science, and Europeans were dying by the score.
So a disaster recovery plan is a good thing, but what exactly should we be doing? We need to colonize. We need to embark on two fronts, exploration and terraforming. What in God’s name is terraforming? We’ll get to that, but first, let’s look at exploration. Currently, the best we can do is launch antiquated rockets, each of which costs millions of dollars to put into space. Crazy talk. We need a full-fledged space station capable of building spaceships. The reason we need a space station is that the big cost is getting off the planet. Gravity. If you’re already in space, your cost of takeoff is zero. All you need to do is point yourself in the right direction, fire off $1 of propane and you’re on your way. Granted, you won’t get there very quickly, but the point is – since there is little to no resistance in space, you don’t need a lot of energy to move heavy things. We could build a spaceship the size of a city and run the propulsion off solar cells. If we were to build a spaceship big enough to comfortably cart several thousand folks around space looking for Earth 2, we couldn’t even get it into space. If we bring up one piece at a time and put it together up there – way easier.
The other option is terraforming. Terraforming is taking a planet which cannot currently support life and turning it into one that can. It’s going to Mars and installing solar generators to melt the frozen carbon dioxide and water on the polar caps, then dumping algae into the oceans, and eventually plants and trees and small furry animals. The most optimistic estimates of how long this would take are about three hundred years. The folks on the other side of the spectrum say millions of years. Does that mean we shouldn’t do it? Hells no.
If you look back a few centuries to the Gothic churches of Europe, they took generations to build. It was incredibly rare for the initial architect to be alive when the church was finished. An interesting thing happened though – they rarely stuck to the original plan. As time passed, they figured out better ways of doing things: maybe they figured out the walls would be stronger if they used flying buttresses, or they figured out a more efficient way of building arches that saved time and used fewer resources – these new innovations would be incorporated into the church’s design, and it would end up getting done quicker, or would become bigger and more impressive. I think the same thing would happen if we began terraforming Mars. Advances in technology would speed the process up. Even if it doesn’t get sped up, three hundred years, or even a thousand years, isn’t that long in the big picture. Some of those processes would likely even work their way back to Earth, potentially helping fix the problems we have here. Heck, even if it doesn’t work, that doesn’t mean we shouldn’t try – that’s like saying, “Eh, we’ll never cure cancer, screw it, let’s not try.” If we try to do something next to impossible, there is a chance, however small, that we will succeed. If we don’t try, there’s no chance at all. Heck, the odds have to be better than a guy who looks like Lyle Lovett marrying Julia Roberts, but that happened.
The real problem we face is the lack of political will and money to spend. The Iraq war is currently costing the US $177 million per day. We’ve already, as of Sept. 30, 2007, spent over $400 billion on the Iraq war. NASA’s current budget for 2007 is $16.8 billion. By contrast, the X Prize, a privately funded contest, offered $10 million to anyone who could build a private spaceship, fly it to the edge of space, come back down, and do it again within two weeks. Scaled Composites, a private company backed by Microsoft co-founder Paul Allen, achieved this in 2004. After the win, Virgin licensed the technology, and we should be seeing flights into space in the next few years. If we took $75 billion of the dollars we spent on the Iraq war and gave it to NASA, and took $25 billion and gave it as prizes to encourage private companies to invest in terraforming or space exploration on a large scale, I have no doubt that we could be on the road to accomplishing this in the next ten years. I’m not saying we would have this licked in ten years, but I’m sure we could have interplanetary spaceships going back and forth to Mars and winging around the galaxy.
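To put those numbers side by side, here is a quick back-of-the-envelope calculation in Python. It uses only the figures quoted above ($177 million per day, NASA’s $16.8 billion budget, and the proposed $75 billion plus $25 billion); the function name is made up for illustration.

```python
IRAQ_COST_PER_DAY = 177e6   # dollars per day, as quoted above
NASA_BUDGET_2007 = 16.8e9   # dollars
PROPOSED_SPACE_FUND = 75e9 + 25e9  # NASA share plus prize money


def days_of_war_spending(amount: float) -> float:
    """How many days of Iraq war spending a dollar amount represents."""
    return amount / IRAQ_COST_PER_DAY


nasa_days = days_of_war_spending(NASA_BUDGET_2007)
fund_days = days_of_war_spending(PROPOSED_SPACE_FUND)
print(f"NASA's yearly budget is about {nasa_days:.0f} days of war spending")
print(f"The proposed $100 billion is about {fund_days:.0f} days of war spending")
```

In other words, at the quoted rate NASA’s whole yearly budget is roughly three months of the war, and the entire proposal is a bit over a year and a half of it.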
The idea that we can survive another thousand years without some meathead launching a nuke, or global warming, or some uber-virus wiping us out seems like an incredibly scary bet to make. I don’t think there’s any good reason to make that bet. We need to secure another planet for human beings to live on, and I think if we as a species dedicate our resources to doing so, there’s nothing we can’t do. That includes living on multiple planets throughout the universe.
What’s the point of all this? Just because every news station in the country and all the politicians have already decided the issues that are important (healthcare, Iraq, etc.), it doesn’t mean that they’re the only issues that actually matter. Think about the big picture and the issues that are important to you, and whatever candidate you support, support one who believes in and supports science… it works, bitches.
You can find out more about presidential candidates’ positions here; the list of issues is growing: http://2decide.com/table.htm
I was recently reading a blog by David Hitz and he raised an interesting question about where we were in regards to server virtualization:
Something has been bugging me about the market share numbers for server virtualization. Is the trend just getting started, or is it almost finished? The numbers I’ve seen say that under 10% of all X86 servers have been virtualized – maybe 7-8%. By that measure, the trend of converting physical servers into virtual ones seems to be quite early.
I’ve got to say, I agree, but with a caveat. It’s actually not the beginning of the trend in general, it’s the beginning of the second wave. When you look at systems holistically, when you get into enterprise applications or services, the applications far exceed the scope of a single physical machine. When you’re looking at pools of externalized storage married with blade servers and virtual machines, what we really have are new, open, mainframes.
If we look at a description of why people are still using mainframes, from an article over at serverwatch.com, we see a list of advantages that sounds eerily familiar:
Logical partitions, known as LPARs, can be used to run multiple operating systems, including the z/OS, z/OS.e, OS/390, Linux on zSeries, z/VM, TPF, VSE/ESA, zVSE and zTPF. All major databases and enterprise transaction processing environments run on the new mainframes, including CICS, IMS, WebSphere Application Server, DB2, and Oracle.
Sounds sorta like virtualization, doesn’t it? You centralize and thin provision, picking up resources where you need them by using the unused resources of other servers. In addition, you don’t need to have two of every single server. We’ve effectively been doing RAID 1 for mission-critical servers, having a “db1” and “db2” for those mission-critical database servers. Now, instead, if a blade dies, the server gets transparently moved to another. This was the same idea that drove the adoption of the mainframe. You just need to have enough extra capacity for the physical needs (CPU, RAM, disk), not for the logical ones (“second webserver”, “second database server”). It’s also the same argument for SAN and NAS if you think about it.
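The difference between “two of everything” and “enough spare physical capacity” is easy to see with a little arithmetic. Here is a rough Python sketch; the function names and the idea of counting capacity in whole hosts are my own simplifications, not anything from a real capacity-planning tool.

```python
import math


def hosts_with_dedicated_standbys(services: int) -> int:
    """The db1/db2 approach: every service gets its own standby box."""
    return services * 2


def hosts_with_shared_pool(services: int, services_per_host: int,
                           spare_hosts: int = 1) -> int:
    """The virtualized approach: pack services onto hosts, then keep
    a few spare hosts to absorb whatever was running on a dead blade."""
    return math.ceil(services / services_per_host) + spare_hosts


# Ten services, four virtual machines per blade, one spare blade.
paired = hosts_with_dedicated_standbys(10)   # 20 physical servers
pooled = hosts_with_shared_pool(10, 4)       # 3 blades + 1 spare = 4
```

Even with only one service per host, the pooled approach needs 11 machines instead of 20, because the spare covers any single failure rather than one specific server.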
A happy accident for netapp:
I meet a lot of NetApp customers, and in fact, NetApp is a big driver of the virtualization trend among their customers. The way they do flex cloning and read caching was a *very* happy accident in regards to virtualization on their devices. The performance is amazing; if you put your VMs on a FlexClone or use their deduplication stuff, it ends up having incredibly high cache hit rates. Also, the blocks are all pointing at the same places, so the disk utilization is small. It’s cool stuff they’re doing there.
The swing of the pendulum:
As for the overall trend, I think we’ll see a move towards more and more virtualization. Eventually, we’ll hit a point where we end up centralizing and VMs are the norm, and we’ll start creeping back in the opposite direction. Single-server computing resources by that time will be incredible and we’ll see the pendulum swing the other way. For right now though, I think we’re going to see a rebirth of the mainframe, under the alias of virtualization, and as for me, I’m a pretty big fan of that idea.
How a delayed flight can make a customer for life:
I’m sitting at JFK airport in New York as I write this, waiting on a flight that was supposed to leave at 9:30pm. It, instead, is leaving at 12:30am. This is a crappy situation, but it’s not how a company deals with a customer when things are going great that counts, it’s how a company deals with its customers when things go awry. Air travel is unpredictable. They need to deal with weather, and let’s face it: no matter what we have waiting for us on the other end, the most important thing is to get there safe. We all, as customers, understand that bad weather happens, and it’s not American Airlines’ fault.
That said, the girl working at the ticket booth has been making announcements about the time being pushed back. If, instead, she had walked over to the seating areas and addressed us in small groups, it would have been much more personal and it would have come across less as a company, and more as a sympathetic face. Even better would be if they had a small budget set aside for food and drinks. Let’s compare and contrast:
Announcement #1 via the intercom:
“Ladies and Gentlemen, the aircraft for flight 4750 has not yet left Boston, and is now expected to arrive at 11:20, pushing back our departure time to midnight. Flight 4721 should be arriving at 11:35 and is expected to depart at 12:30. If there are any further changes, we’ll make another announcement.”
Announcement #2, in person, walking over:
“Ladies and Gentlemen, I’m really sorry, but we’ve got some more delays. I know it’s been a long wait, so we’re going to put out a table with coffee, and some snacks. The aircraft for flight 4750 has not yet left Boston, and is now expected to arrive at 11:20, pushing back our departure time to midnight. Flight 4721 should be arriving at 11:35 and is expected to depart at 12:30. I know it’s been a long wait, and we’ll let you know as soon as we know anything more.”
Use what you’ve got:
The worst part of the whole situation is that all it would take is a little initiative. Airlines already buy coffee in bulk. Airlines already have snacks. The only requirement is the industrial coffee machines and some of those magic creamers that don’t go bad at room temperature (how do those things work?!). I understand that most people choose air travel on discount websites based on the lowest fare, but if it was well known that when your flight was delayed you’d at least get a cup of coffee and some snacks, that might sway some of those choices.
I just finished reading The Alchemist, by Paulo Coelho. It wasn’t bad. Most notably, two cute girls stopped and commented about what a good book it was, which alone may be a reason to read it. A surly pizza guy over at Hogan’s Hero’s (best named sandwich shop ever) also struck up a conversation about it, but I think that was more because he wanted to tell me he had just read his first book a month ago (The Diary of Anne Frank).
The Alchemist is a young man’s spiritual journey. A shepherd goes on a quest to find treasure and learns valuable life lessons along the way. The book is an easy read, and short: you can get through it in a few hours, or a day if you don’t have the chance to read it all at once. It’s a little heavy-handed with its lessons, but it’s of that genre, and I suppose that’s just a given. It reminds me of one of the books my high school art teacher would have handed out as if he were giving away secrets. It’s certainly a “feel good” book, advising following your dreams and not being distracted by naysayers.
It’s not the greatest book that I’ve read, but it’s not bad. It’s a little oversimplified, but it does actually have a decent moral – fear of failure is one of the biggest obstacles to happiness, and considering how many people work jobs they hate, maybe it’s a decent message. It’s a good bedtime story if you’re looking for something to read to a kid.