Postmortems: they ain’t just for dead folks

If you’ve seen a cop drama in the last twenty years, odds are high you’ve watched a scene where the detectives and the coroner stand over a body on a stainless steel table.  Give us the straight scoop, doc, how’d he catch it?, they say (at least if the show is in black and white, anyway).  You think I’m a miracle worker? retorts the coroner.  Come back tomorrow and I might – might be able to fill you in.

The TV never shows what the coroner actually *does* – rarely anyway – because A) it’s really gross, and B) it’s more entertaining to watch the detectives chase someone down.  But in the real world, the job the coroner does is critical to any crime investigation.  How was the person killed?  How tall was the killer?  What hand did they use?  What was the weapon?  Etc, etc.

Now, failure in the IT world can be pretty ugly.  Email went down.  Phones went down.  Files went missing.  Revisiting those wounds isn’t always an easy thing to do.  But if your IT group is healthy, and firing on all cylinders, project and incident postmortems should be a regular part of your routine.  There are, however, a few rules to follow.

Number One: let some time pass.  This is tricky, because you have to walk a fine line: allow enough time for emotions to settle and for all parties (IT and non-IT) to return to normalcy, but stay close enough to the event that everyone’s memory of what happened is still clear.  If you’re going to err, err on the side of waiting longer for cooler heads to prevail.  By nature, a postmortem of a negative event has the potential to become emotional.  Keep the personalities involved calm.  This isn’t an opportunity to beat up on the guy who accidentally pulled the plug out of the wall.  Use the postmortem as a learning event.

Number Two: have a set procedure, and follow it.  What exactly happened?  What could we have done to prevent it?  What process needs adjustment, if any?  Do the costs of implementing preventative steps outweigh the costs of the event happening again?  Maintaining a format allows people to know what to expect, and maintains a sense of impartiality.
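The cost question in that list is the one most likely to get hand-waved, so it helps to put rough numbers on it.  Here is a minimal sketch in Python, assuming you can estimate what one incident costs, how often you expect it in a year, and what prevention costs per year.  The function name and every figure are made up for illustration; plug in your own.

```python
def prevention_pays_off(incident_cost, incidents_per_year, prevention_cost_per_year):
    """Return True when the expected yearly loss exceeds the yearly cost of prevention."""
    expected_yearly_loss = incident_cost * incidents_per_year
    return expected_yearly_loss > prevention_cost_per_year


# Hypothetical numbers: a $20,000 outage expected once every two years,
# against a fix that costs $6,000 a year to keep in place.
print(prevention_pays_off(20_000, 0.5, 6_000))
```

It is deliberately crude.  The point is to get the estimate written down in the postmortem so it can be argued about, not to pretend the numbers are precise.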

Number Three: maintain accountability.  This isn’t always possible, but don’t use the group setting of a postmortem to hide the fact that a specific problem, or error, occurred.  This is especially critical when a postmortem includes another business unit.  If it is the IT Group’s fault, take responsibility.  Sugarcoating, or passing the buck, won’t help you get to that Trusted Partner status.  This must be managed; postmortems are not a place for people to beat up on someone.

Number Four: follow up.  If the postmortem conclusion has a specific action item (e.g., move patching from weekly to monthly), do it.  If the p-m process becomes a show-and-tell display rather than a meaningful process, people will quickly learn to not approach it seriously.

Number Five – and perhaps the most important: do postmortems after every project, even successful ones.  For successful projects, the tone and approach are different; more of a “what could we have done better” attitude.  These are also a chance to give credit where credit is due.

If you think about it, the postmortem is really just an exercise that increases communication levels, be they internal to IT or with a business partner.   Of course, it is difficult to conduct these when you’re running around putting out fires or running from one project into the other.  As with all processes that are higher up Donahue’s Hierarchical Pyramid of IT needs, they rely on the fact that the lower levels are being adequately met.  But postmortems are a key indicator that an IT group is in a good place.

Gratuitous Thanksgiving Day Post

Ok.  It looks at the calendar.  It smells the turkey topping on the Domino’s pizza.  It pulls out the Underwood typewriter.  It writes the Gratuitous Thanksgiving Day Post.

So, once a year, I have to be thankful?  I can manage that.  Do I have to *pay* anyone?  No?  Perfect.  Are my thanks an indicator of future success or benevolence? No?  Perfect.  Will people read this and think how humble and great a person I am? Yes? Perfect.  Let’s get this over with, then.

I should thank WordPress for giving me the platform to write this with, but then I’d get into this loop/chain/black hole of thanks that includes AT&T for the connection, SoCal Edison for the power, my mother and father for not cutting off my fingers as a child, the United States for my unalienable rights, and the Creator for what passes as a marginal IQ, so I won’t thank them at all.

I *am* thankful for Google.  Sweet, clean Google.  Savior of a thousand technical problems, high and low.  I’ve said a hundred times (mostly to the same people, over and over again ad nauseam) that the key to being a good tech is being a good Googler.  My “thanking chain” logic says I should really be thanking the folks that screwed something up before I did, and took the effort to post about it, and even more so the people who posted the fix, but that’s too much effort.  I will thank whoever the hell Boolean is, though.

I *am* thankful for the Internet.  O Deliverer of Porn, Solver of Problems, Bringer of Reddit.  Our calendar in a thousand years should reference B.I. and A.I.  I shop for toasters by including the term “wifi”.  I rage if the TV I want doesn’t have Netflix embedded, even though my Blu-ray and my a/v receiver already do.  I’m not thanking Al Gore, I’m not thanking DARPA.  I will thank the “series of tubes” guy because he made it easy to explain the Internet to my parents.

I *am* thankful for Microsoft.  Yes, they gave us BOB, and WinME, and Windows 8, but they also gave us Windows 95, Windows XP, Windows 7, Server 2008, Exchange, Outlook, and Minesweeper.  Yes, it’s easy to kvetch about the Evil Empire, but I’m glad we have, for all intents and purposes, a single vendor for desktop OSes.

I *am* thankful for Mac Fanatics.  They make me feel grounded.

I *am* thankful for tech support in the United States.  My English is worse than my Canadian, French people think I’m speaking Spanish and Spanish speakers think I’m speaking German.  So I’m behind the 8-ball (il ocho oeuf) to begin with, and I don’t need Sachin Tendulkar’s second cousin telling me that his name is really Fred.

I *am* thankful for the career mistakes I’ve made, at least the ones that didn’t get me fired.  Mistakes – be they technical or managerial or operational – still provide the best learning experiences.  And boy, have I learned a lot.

That’s it.  I can’t think of a single other thing I’m thankful for.  How about you?

Sometimes, you just need a win

For better or for worse, success in IT is perversely  measured; the height of success is the height of ubiquity.  When Lord Grantham reaches in his coat pocket for his smoking-pipe, he doesn’t even think about it; but Bates anticipated the need earlier in the day and placed it there.  Sorry, I’ve been watching too much Downton Abbey for my own good.

IT is like that ubiquitous butler.  Things run smoothly *not just because*, but because of time and effort put in by silent stewards.  And in IT, our presence is most often requested when something is amiss.  We can work fiendishly to make sure that email server has 99.9999 percent uptime, but when it goes down on a Tuesday at 2 pm, watch out!
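For a sense of how little slack an uptime number like that leaves, the arithmetic is quick.  A small sketch, assuming a plain 365-day year and no allowance for planned maintenance; the function name is mine, not any standard.

```python
def downtime_seconds_per_year(uptime_percent):
    """Seconds of downtime allowed in a 365-day year at the given uptime percentage."""
    seconds_per_year = 365 * 24 * 60 * 60
    return seconds_per_year * (1 - uptime_percent / 100)


for pct in (99.9, 99.99, 99.9999):
    allowed = downtime_seconds_per_year(pct)
    print(f"{pct}% uptime allows about {allowed:,.0f} seconds of downtime a year")
```

At six nines the budget works out to roughly half a minute for the whole year, which is why nobody notices the butler until that Tuesday at 2 pm.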

This is not a complaint, or a whinge, as the Earl would say.  It’s just how things are, and those in IT accept it as the nature of things.  People come to us when things don’t work; we are, by circumstance, associated with problems, and problem-solving.

All that said – that constant refrain of “when I’m doing my job really well, no one notices, and when something goes wrong, I’m the first they call” – while accepted – can be a negative over time if it is the *only* form of relationship between a firm and its IT staff.  Ubiquitous service is its own reward, but even the Butler gets to suit up for the cricket team and have a day in the spotlight.

Stop with the freaking analogies, you’re thinking, and speak plainly, man!  Sure.  Every once in a while, the firm needs to either introduce new technology, or upgrade a process, during which IT gets a “win” under its belt.  These actions – outside the realm of day-to-day maintenance – reinforce the perception that IT is good for the firm, and allow IT to continue along the path to Trusted Partner.

While a recent example of this for a client was a whole new phone system, these projects don’t need to be expensive.  A new open-source IM solution, or a firm-wide open-source wiki, can do the job.  Usually the best ideas for Wins inside an organization come from the users or business unit heads.  It’s all about communication and creatively solving problems – the same as our ubiquitous butler, but on a macro scale rather than micro.  And it lets the users know – hey, you’re not just the monkey who swaps out my toner cartridge.

Fighting for these kinds of projects will keep IT morale up, will keep the staff engaged, and will help improve firm-IT relations.  They’re a key ingredient to having a happy, healthy IT group.

IT 2.0

In many ways, we in the IT industry are pretty lucky to be right here, right now.  We’re present at the birth of an industry that will forever change the world (and though I’m prone to hyperbole, that’s an understatement).  Two hundred, three hundred years from now, history will look back at the ’90s/aughts as when everything *changed*.

We’re kind of like the first employees of Edison Electric, or Bell Telegraph and Telephone, or Wright Brothers Airlines.  We’re able to witness, first-hand, the birth and first tentative steps of an industry.  And while we may *perceive* tech to have been around forever, to have made great and purposeful strides, the reality is that 25 years is nothing.  In the great scheme of things, nothing at all.

We’re only now emerging from the embryonic stage, and the trademark for that is the passing of the 1.0 pioneers.  Gates has retired.  Jobs is dead.  Ellison is active, but nearing 70.  Philippe Kahn now dabbles.  Steven Sinofsky just left Microsoft.  These are huge, huge names – the Railroad Barons of our century – and they’re exiting the stage.  It’s as good a time as any to mark the end of 1.0 and the start of 2.0.

So what lies ahead?  Perhaps more importantly, *who* lies ahead?  Will the rules – as ethereal as they may be – that governed IT 1.0 still apply to 2.0?  Will the cyclical nature (client-server, server-client, back to client-server) continue?  Will it still be a rule that marketplace dominance (WordPerfect, AOL, MySpace) means very little?  What’s the long term impact of everything and everyone being constantly connected?  What are the cultural and infrastructure implications of emerging tech – self-driving cars as a single example?

More importantly – we have our first generation that grew up with the internet starting to hit the streets.  You’ve heard the generalizations – email is too slow, everything txt, 2.0 rules, 1.0 is horrible, why can’t i schedule my toaster online?

You know, from experience, that there’s a general correlation: the easier something is for the user, the more effort needs to happen behind the scenes.  It’s a ratio that’s consistent in infrastructure design, application design, user interfaces, etc., etc.  And with the user *expectation* quickly becoming, “it needs to be easy, regardless of what I’m trying to do” – it means that more and more effort – more thought – needs to go into providing that sense of ease.

There’s an unspoken reality that goes with the saturation of IT into people’s personal/home lives, and the incredible effort consumer product manufacturers put into things being “easy”.  With IT 1.0, people were just happy to have the tech, and would put up with large headaches to get that functionality.  Not with IT 2.0.   IT is no longer the domain of the business world, and the bar is being raised – and not by us.  Like the industry itself, it’s time to change.  To step up our game.  To start casting aside old mentalities, old rules.  Time to shift.

The psychology of conservatives and liberals in IT

Like many others, I’ve been pretty glued to this election.  I’m lucky enough to have a wide variety of friends, some for Obama, some for Romney, and a few waiting quietly in their cabins with a shotgun and a stack of gold coins.

The idea that personalities drive political affiliation has fascinated me, and the mother lode of this idea was a TED talk given by Jonathan Haidt.  He studied – across cultures – moral motivators for conservatives and liberals, and the results were interesting.  I highly recommend you watch it here.

It got me to thinking – how does “conservative” and “liberal” thinking impact IT culture?  I’ve always thought that the IT culture within a firm is driven solely by the firm’s culture overall – i.e., conservative firm, conservative IT; but looking back, that’s not always the case.  Strong leadership *within* IT can allow for a culture shift within the group.

So what does “conservative” vis-a-vis IT mean?  Or liberal?  Top of mind, you start thinking conservative = locked down, rules, tickets, defined processes.  Liberal, you start thinking huge mailboxes, BYOD, BYOS(oftware).  Is it that cut and dried?

To me, the Architecture industry is fascinating; you have extremely creative individuals, but at some point, they’re tasked with building a business, instead of designing an opera house.  How does that work successfully?  And if you take it a step further and say that creativity loosely equals liberalism, and business loosely equals conservatism…

Conflict occurs when the two sides rub against each other.  It is much easier to run a conservative IT shop within a conservative firm, and liberal within liberal.  The biggest conflicts I’ve seen in firms, in general, happen when expectations don’t match.  And that most often happens with “conservative” IT groups in “liberal” firms, and vice versa.  You can lead a group a bit differently than the overall culture, but a significant difference is going to mean near-constant friction with users and other business groups.

If you watched Haidt’s video to the end, his takeaway point was this:  we need a healthy mix of conservatism AND liberalism.  The two function best when they’re together.  And the worst thing we can do is to become technical ideologues, rejecting “conservative” ideas or “liberal” solutions simply because of their label, and not on their own merits.

GO VOTE!!

The incredibly important difference between disaster recovery and business continuity.

Rather than writing a semi-political blog that would immediately alienate 50% of my readership (yes, 1 is 50% of 2), I’ll try for something more topical, since it seems (hopes?) like the worst of Sandy has passed.

We throw around terms in IT a bit like candy, here and there.  And sometimes, like in a See’s box of chocolates, we’re not always sure what all those terms mean.   Well, *we* think we know what it means, but someone else might take a bite and think just a wee bit differently.  And one of those terms that seems to have the most variance (besides “the cloud”) might be disaster recovery.

I think it was my second or third year at PBM when a salesman asked me to go on a client visit with him.  Typical call, new client, no footprint, potentially pretty big.  We sit down, go through the tea ceremony particulars, and the CIO says: “We need a disaster recovery plan.  Do you know enough about DR to write one up?”

I looked at the salesman, I looked back at the client, and simply said “no”.  Needless to say, this was a *very* short meeting.  The salesman, fortunately, was a class individual, and understood we shouldn’t be reaching past our comfort zones.  But he still asked, “We do backups and backup solutions for almost all our customers… what’s the difference?”

And that is the crux of the issue between backups, disaster recovery, and business continuity.  There are – to my chocolate-focused mind – clear lines that separate the three.  They are tiers, the lowest being backups, the highest BC – but any discussion of a higher tier must involve understanding of the lower ones.

When you’re having the Disaster Recovery conversation with the CIO/CFO/CEO, it will help to use specific examples of outages, for instance:

  1. A project folder is accidentally erased.
  2. Our file server bursts into flames.
  3. We lose Internet and WAN for 3 days.
  4. And my favorite, a meteor hits the Boise office.

As we’ve learned from Sandy – there’s another one – loss of physical access to an office for a week or more.  The data is there, it’s fine, but the office isn’t accessible, or doesn’t have power.

Each of those situations needs to be defined with a “What do we do if…” and time estimates for retrieving data.  What I have found, over the years, is that the concern level of executives *plummets* after they are told that the project data is being secured offsite.  That seems to be the golden threshold for their comfort level.
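One way to keep that conversation honest is to write the scenarios down next to their answers and see which ones come up empty.  Below is a minimal sketch in Python; the `Scenario` class, the responses, and the hour estimates are all placeholders invented for illustration, not recommendations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scenario:
    name: str
    response: Optional[str] = None            # the "What do we do if..." answer
    hours_to_recover: Optional[float] = None  # made-up estimate, not a benchmark


SCENARIOS = [
    Scenario("Project folder accidentally erased", "Restore from backup", 1),
    Scenario("File server bursts into flames", "Rebuild and restore from offsite copy", 48),
    Scenario("Internet and WAN down for 3 days"),
    Scenario("Meteor hits the Boise office"),
    Scenario("Office inaccessible for a week or more"),
]


def gaps(scenarios):
    """Return the scenarios that still lack a response or a time estimate."""
    return [s for s in scenarios if s.response is None or s.hours_to_recover is None]


for s in gaps(SCENARIOS):
    print(f"No plan yet: {s.name}")
```

In this invented example the backup-flavored scenarios have answers and the rest do not, which is about where most of these executive conversations stall.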

The problem is – having that raw data offsite, safe and secure, doesn’t mean anything if you don’t have computers, software and connectivity.  Let’s say we’re doing real-time replication offsite, nearby.  The office becomes inaccessible, or is destroyed (water/fire/etc).  You now have 100 Architects sitting at home, and a server with all your project data in a Colo.  Now what?

This is the step that few want to think through, and even fewer want to spend money on, because it *does* take an investment.

At this point, you have several options.  You can source out, beforehand, on-demand office space, and computer rental firms.    There are firms that will create business continuity offices that are in railway freight containers.   Most AEC firms don’t have those kinds of resources, however.

The good news is, if you have a colo now, or are partnering with an MSP/VAR who has one, you’re halfway there.  The easiest solution to put into place has three parts:

1) Fast internet connectivity and VPN services (hardware, carrier-provided, whatever)

2) Virtualization infrastructure to bring up servers and roles as needed

3) Desktop (or at least application) virtualization software and hardware

These three components can make your IT department true heroes in the case of an extended outage.  If you had an office in New Jersey right now, you could say “hey, here’s the VPN or Citrix URL – just log in from your home PC and go to work”.

The other benefit is that you can dual-role many of these items.  They don’t need to be in a closet waiting for some disaster to happen – you can proactively use the remote desktop or remote application services for mobile or low-bandwidth users.

Selling this to executives, once you’re past the offsite backup line, is difficult.  Done properly, it will take a significant investment of time, money, and people.  These are the same executives who don’t think twice about signing the check for car insurance.  It is IT’s job to properly present the potential issues, the solutions, the costs involved – and then allow them to make an informed decision.  But the discussion should not, cannot end with “the data is offsite, so we’re safe”.