Sunday, May 29, 2011

No Killer App


It’s the easy question that has no easy answer that is quite often the most fascinating to ponder.  As the owner of two iPads, it was natural for me to get the “why should I buy an iPad?” question.  I would describe the iPad in glowing detail: lightweight, instant-on, all-day battery, touch interface and a really cool smart cover.  Lots of apps in the App Store, many of them free.  But the answers never really seemed to satisfy, so I reflected on why that might be the case.

Really ground-breaking technology always seems to deliver on something new, something so compelling that almost all people see it as a break-through and they want it badly.  Mainframes had back-office accounting apps.  PCs had VisiCalc, the first PC-based spreadsheet program.  Smart-phones have email and texting.  These “killer apps” drove the technology into an increasing number of people’s hands.  And new eco-systems grew up around these new platforms, propelling the technology world to new heights.

So the real iPad question I was being asked is: “What’s the killer app?”.  The answer: “There isn’t one.”

If not, then why all the excitement?  Why do iPads fly off the shelves?  In my opinion, it’s the new “killer experience”.  All those things I was describing had to do with how it felt to use an iPad and how different and exciting it was to use, not at all with what I did with it.  It’s similar to my first HD TV.  It didn’t enable me to watch more TV; cable TV did that.  But I watch HD content almost exclusively because of the awesome experience.  I imagine 3D TV will be the same in 5-10 years.

I suggest approaching iPads not as a new way of doing old things, and not by looking for the application that everyone is clamoring to get.  Look for opportunities to completely blow up or dramatically revise what you’re doing today.  People can now carry a “computer” everywhere, all day, and interact immediately.  We’re not used to thinking that way.  Changing your mindset is the place to start.

Friday, February 25, 2011

I've Gone Mac

A couple of months ago I bought a Mac Mini out of total frustration with the time it took to maintain, boot up, debug and protect my Windows machines. I thought it would be a steep learning curve and that I could write a long blog on that frustrating process with some useful tips and tricks. Sadly, from the blog's perspective, and happily, from my personal perspective, that's simply not the case.

The Mac Mini is a little box, about the size of a fat, small, square frisbee. It makes no noise, is cold to the touch, boots in about one minute and has a small selection of ports to attach devices. In my case I have an HDMI-attached monitor, and the keyboard and mouse are Bluetooth. The mouse is a Magic Mouse, and that alone is worth the price of admission. I spend most of my time in a web browser and the Magic Mouse makes scrolling, zooming and previous-page navigation so much easier.

The defining change in going Mac is the absence of the constant interaction that Windows demands. Applications install with one drag. System updates take one click, and so far have not required a reboot and do not interfere with normal operation. Installing the printer took zero of anything. It takes a little getting used to. Actually, very little, once I realized that Apple takes a minimalistic approach to asking for anything. Quite the refreshing change.

I've loaded a few applications, my beloved Google Chrome browser, DVD converter and Skype being at the top of the list. Most everything else I use comes with it, such as iTunes, iPhoto and iMovie. Haven't felt the need for any anti-bad-guy stuff, although I'll be researching that in the coming months.

And perhaps best of all, no bloatware. No 30-day trials of anything. Annoying the customer doesn't appear to be in Apple's DNA.

Thank you.

Wednesday, February 2, 2011

The Root Root Cause


In a previous article titled "The Perfect Problem", I discussed finding ways to improve outage recovery time by looking at all of the operational aspects surrounding the problem. Now it's time to take a deeper look at the problem itself and see if there are clues to other, perhaps more profound, issues. I call this looking for the "root root cause" since most root cause analysis efforts don't look for a deeper meaning or really find the true source. Root causes like "the disk filled up" or "a table needed to be reorganized" are frequently as far as it goes. Their root root cause might be something like "no one was watching when Fred was on vacation".

I'll use three real examples, one each for people, process and technology, to demonstrate what you should be looking to uncover.

The most common fault of a root cause analysis is stopping too soon, generally before identifying the individual who made the change that resulted in the outage. I understand that people can be very sensitive to being called out, but how can we truly improve until we know where the actual problem started? The goal is not a witch hunt, but to help the person improve. It may take a few brave souls to step up and say "I made the mistake. Here's where I went wrong". The first clue that you're in this situation is the verbiage being used. Personally I like "post-mortem", with the clear meaning that we had a death to our service and we're taking this very seriously. If you're using wording like "post-incident review", you may be in trouble. It sounds more like "we have to do this, we really don't want anyone's feelings hurt, and let's just sweep this under the rug".

One of my favorite examples of a process problem was many years ago when my boss made me responsible for the pocket-sized corporate phone directory that had just been horribly misprinted. The root cause was determined to be a mistake made by the printing company. Digging deeper, the root root cause was a process that avoided both work and blame for the prior organization. The directory was a mess, even before the fatal distribution. A new team was formed with a different mission: publish the best directory we can. No more CYA, just the absolute best we can do. By recognizing the real problem and fixing that upfront, the team went on to change the format, the paper used, the binding, and just about everything else. They met with the printer and found a sure-fire way to avoid printing errors. They met with other local companies and brought back fresh ideas. They met over lunch and looked for errors, even going as far as calling Hawaii to find a mistake and have it corrected. They were rewarded with more positive feedback than they ever imagined. All by starting with an approach 180 degrees opposite and finding their own way.

Technology breaks and technology has bugs. But technology can also be put together in ways, particularly over time, that end up having negative effects much greater than expected. Such was the case with our building's local area network in the 1990s. We had an outage due to a hosed-up network switch that had user PCs attached. But what was puzzling was why it affected a half-dozen production servers in the computer room. The answer, its root root cause, was a network architecture that resembled a balloon that was getting bigger and bigger. One problem caused the balloon to pop. So the answer was a new design that would grow horizontally and isolate different services while still allowing them to communicate. The new design housed servers, PCs, the Internet and the wide-area network in different "towers" connected by routers that prevented many of the issues inherent in the "balloon". Almost by magic, the network was more reliable, faster and able to change quicker with less disruption. And we spent less money by using far cheaper equipment. All this by identifying and solving the real problem.

As you might guess, this can take a lot of work, finding and fixing the root root causes.

But your customers will love it.


Tuesday, January 18, 2011

It Wasn't Scary


Tablets are all the rage these days.  Seems like a year ago (it was) that the pundits were predicting the market failure of the iPad.  It was just an oversized iPod Touch, wasn't anything more than an interesting toy, and certainly wasn't going to be of interest to corporate types.  How could they be so wrong?  Every other tablet introduced crashed and burned.  They missed the biggest selling feature this time, staring right at them, that wasn't there before.

It wasn't scary.

Take the average person and put them in front of a computer.  Most are scared to death and refuse to touch the keyboard for fear that they might break something.  Put an iPad in their hands and seconds later they're tapping and sliding and laughing.  Most will play for several minutes, ignoring guests and its rightful owner, as they discover this and that cool feature.  I've seen construction workers waiting to take their turn and grandmas sliding pictures with their pinkies.  And smiling all the while.

Why such a difference?  Computers and iPads both have processors, memory, an operating system and icons.  Under the covers, they are basically the same.  But we humans fear complex things and proceed with caution, using our basic survival skills that serve us well, day in and day out.

The typical computer is a big machine and goes through several minutes of whirring and clicking before it's ready to use.  It has a mouse that moves a pointer and two buttons that do different things in different situations.  It has a keyboard with somewhere around one hundred keys, many of which have multiple purposes invoked by holding shift, alt, ctrl, fn or a small four-part flag.  It most likely has a dozen or more lights and a dozen or so ports of different shapes and sizes.  You need to patch, you need A/V, you need anti-this and anti-that.  And most of all, you need to be frightened.

Contrast that with the iPad, which is just shy of the size of a piece of 8.5 by 11 inch paper, weighs in at 1.5 pounds, turns on instantly, has four buttons, each of which you can figure out in less than a second, and uses the same connector as your iPhone and iPods.  You turn, it turns.  You touch, it reacts.  You can't open it up, you don't need anti-anything and it has a nice "upgrade all" feature.  You bring yours, I'll bring mine.  Let's do coffee and a game.  And not be scared.

That's my belief in what is driving consumers and business people to adopt tablets at a record pace.  You can't employ the typical fear, uncertainty and doubt to slow this down. 

We're not scared anymore.








Monday, December 27, 2010

The Perfect Problem

Problems are a fact of life, certainly if you live in the perfectly bit-oriented world of computing where all those bits must line up in their proper place, billions of them, to be processed at rates exceeding a billion times per second.  It seems on the surface to be an impossible task, but that's the comforting life for IT professionals, who relish keeping the computing universe in alignment twenty-four hours a day, seven days a week.  And when the inevitable problem occurs, it gets diagnosed and repaired at dizzying speed.  Then it's back to programming more billions of, hopefully but rarely, perfect bits.

But the above is only a fraction of the whole story, and in many cases is the shortest portion, at least from the viewpoint of the user whose service was interrupted for hours or days.  Yet this fraction is typically the focus of the root cause analysis, which is aimed at avoiding the problem in the future.  And hence a larger opportunity to improve service is wasted.  To get at the bigger picture, begin using the concept I call "The Perfect Problem".

What is a Perfect Problem?  This will vary somewhat depending on your service agreements, but the general idea covers all the other "stuff" surrounding the technical problem.  Simply put, a Perfect Problem is reported, dispatched, escalated, recovered and communicated in the way you designed.  It's all the operational aspects that can consume 90-99% of the actual time a service was unavailable.  It's all about asking questions, a lot of questions, and getting straight answers so everyone gets better.  Some examples of the type of questions you need to ask include:

  • Reported - Did the automated monitoring tools or computer operators see the problem before users started calling the help desk?  Could they be improved to shave minutes or possibly hours from the overall duration of the outage?
  • Dispatched - Did the ticket get assigned to the correct group?  Were the right people paged?  Was the ticket picked up within the specified time frame?  Did the ticket get bounced back and forth trying to find a home?  Was all the information gathered to that point included to avoid wasting time asking for it a second or third time?
  • Escalated - Per procedure, was the problem escalated to a problem coordinator within the proper time period?  Was management made aware of high severity problems?  Were tickets opened with vendors escalated properly?
  • Recovered - After the service was fixed, did it come back up as quickly as usual, or were other steps needed?  Was all the operational start-up documentation accurate?  Could improvements to the documentation make it more clear and less error prone?  Are there ways to speed up restarting the application?  How did users know the service was available again?
  • Communications - Was a message recorded for the automated help line letting users know that you know about the problem?  Did that message get updated on schedule?  Did it include all the information that should have been there?  Were key business groups made aware of the right problems?  Was the CIO notified about the critical systems she or he cares the most about?

In other words, a Perfect Problem went exactly like you planned it to go.  Nothing more, nothing less.
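
To put a number on how much of an outage is spent on the surrounding process rather than the technical fix, here is a minimal Python sketch.  The milestone names, the sample timestamps and the choice of "escalated -> fixed" as the technical repair window are all assumptions made for illustration; substitute whatever your own ticketing system records.

```python
from datetime import datetime

# Hypothetical milestones for one outage, in the order they should occur.
MILESTONES = ["started", "reported", "dispatched", "escalated", "fixed", "recovered"]


def phase_minutes(timeline):
    """Minutes elapsed between each pair of consecutive milestones."""
    result = {}
    for earlier, later in zip(MILESTONES, MILESTONES[1:]):
        delta = timeline[later] - timeline[earlier]
        result[f"{earlier} -> {later}"] = delta.total_seconds() / 60
    return result


def operational_share(timeline, repair_phase="escalated -> fixed"):
    """Fraction of the outage spent on everything except the technical repair.

    Treating one phase as "the repair" is an assumption of this sketch.
    """
    minutes = phase_minutes(timeline)
    total = sum(minutes.values())
    return (total - minutes[repair_phase]) / total


# Made-up sample data: a three-hour outage with a fifteen-minute fix.
example = {
    "started": datetime(2010, 12, 27, 9, 0),
    "reported": datetime(2010, 12, 27, 9, 40),
    "dispatched": datetime(2010, 12, 27, 10, 10),
    "escalated": datetime(2010, 12, 27, 11, 0),
    "fixed": datetime(2010, 12, 27, 11, 15),
    "recovered": datetime(2010, 12, 27, 12, 0),
}

for phase, minutes in phase_minutes(example).items():
    print(f"{phase}: {minutes:.0f} min")
print(f"Operational share: {operational_share(example):.0%}")
```

Laying the phases out this way makes it easy to see which of the five areas above deserves the first round of questions.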

To capture improvements that make future problems flow properly, each identified deficiency needs to generate an appropriate improvement task with a clear description and owner.  Each task remains open, monitored, prioritized and managed until resolved, and open items are regularly reviewed by management.
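
As a rough illustration of that tracking, the sketch below models an improvement task and produces the list of open items for a management review.  The field names, the priority scale (1 is most urgent) and the sample tasks are hypothetical; any ticketing or spreadsheet tool could hold the same information.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ImprovementTask:
    description: str        # clear statement of the deficiency to fix
    owner: str              # the one person accountable for closing it
    phase: str              # e.g. "Reported", "Dispatched", "Escalated"
    priority: int           # assumption: 1 = most urgent
    opened: date
    resolved: bool = False


def open_items(tasks):
    """Unresolved tasks, most urgent first, oldest first within a priority."""
    pending = (t for t in tasks if not t.resolved)
    return sorted(pending, key=lambda t: (t.priority, t.opened))


# Made-up sample tasks for illustration only.
tasks = [
    ImprovementTask("Update help line message template", "Pat", "Communications", 2, date(2010, 12, 1)),
    ImprovementTask("Add disk-space alert to monitoring", "Lee", "Reported", 1, date(2010, 12, 10)),
    ImprovementTask("Correct start-up documentation", "Sam", "Recovered", 1, date(2010, 11, 20), resolved=True),
]

for task in open_items(tasks):
    print(f"P{task.priority} {task.phase}: {task.description} (owner: {task.owner})")
```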

In my experience the longest outages are not typically caused by unusually difficult technical problems, but deficiencies in executing the surrounding processes.  And that's totally within our control to improve upon.

Thursday, December 23, 2010

N-1-1


We've all heard of 9-1-1, the single number to reach emergency services, which began way back in 1968.  But did you know that other countries use 1-1-2, 9-9-9 and a host of other numbers to reach various services?  For example, Brazil uses 1-9-0 to reach the police, 1-9-2 to contact medical services and 1-9-3 to find the fire department, and a handful of other sequences for more specialized services.  That would be too much for me to remember, particularly in an emergency situation.

You're also likely familiar with 4-1-1, the short-cut to reach directory services, which can add a significant amount to your monthly phone bill if you're particularly lazy or forgetful.  But did you know that free alternatives exist?  Jingle Networks' 1-800-FREE-411, Microsoft's 1-800-BING-411 and Verizon's 1-800-THE-INFO are provided free of charge, although some are advertising supported.  On the plus side, driving directions, sports, weather and other features may be available.  You won't talk to a human being, but with the recent advances in speech recognition, the computer is most likely going to get your request correct more often than not.

Now to the services you may not be familiar with, mostly depending on what part of the country you live in.

2-1-1 is reserved for community services such as affordable housing, homelessness, drug and alcohol programs and suicide prevention.  Many of these 2-1-1 services are run by a local United Way agency, as they are in the Dayton, Ohio area.   They also have a web site and a toll-free number to reach agencies when outside the local 2-1-1 dialling area.  More information on their service, HelpLink, is at www.dayton-unitedway.org/help.php.

3-1-1 is the non-emergency version of 9-1-1, but is available in only a couple dozen large metropolitan areas.  Its purpose is to easily connect residents to city services and information.  Columbus is currently the only city in Ohio with 3-1-1 service, and its web site is at 311.columbus.gov.  Examples of the many services offered include requesting a bulk trash pickup, reporting issues with street lights or pot holes, reporting an abandoned car or complaining about a barking dog.

5-1-1 gets you connected to traffic information and covers a large percentage of the United States; a coverage map can be found at www.fhwa.dot.gov/trafficinfo/511.htm.  The only service available in Ohio serves the Cincinnati/Northern Kentucky region with an effort called ARTIMIS, which stands for "The Advanced Regional Traffic Interactive Management & Information System".  More information on the services offered is located at www.artimis.org.  You can also try out the service for yourself by dialling 1-513-333-3333.

7-1-1 is the Telephone Relay Service and is provided nationwide for the Deaf and Hard of Hearing with more details provided at 711service.com.  

6-1-1 is used to contact your phone provider.  The informational web site, www.dial611.com, provides more information.  Most locations should have this service, and it should be free.

8-1-1 is the nationwide "call before you dig" number, which seeks to protect our underground infrastructure.  Its web site is www.call811.com.

A summary chart of all N-1-1 numbers is available at www.nanpa.com/number_resource_info/n11_codes.html.
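
For quick reference, the codes described in this post can be collected into a small lookup table.  This is just a sketch in Python; the `describe` helper and the wording of each description are my own shorthand, not an official listing.

```python
# N-1-1 codes as described in this post (descriptions are shorthand).
N11_SERVICES = {
    "211": "Community services",
    "311": "Non-emergency city services",
    "411": "Directory services",
    "511": "Traffic information",
    "611": "Phone provider",
    "711": "Telephone Relay Service",
    "811": "Call before you dig",
    "911": "Emergency services",
}


def describe(dialed):
    """Look up a dialed code, ignoring dashes and spaces."""
    digits = "".join(ch for ch in dialed if ch.isdigit())
    return N11_SERVICES.get(digits, "Not an N-1-1 code covered here")


print(describe("5-1-1"))
print(describe("1-9-0"))
```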

Friday, April 16, 2010

Web Debugging with Fiddler

In a previous blog I mentioned the free Wireshark utility, which has been my number one debugging tool for several years.  Being able to see everything coming into and out of a PC, and having enough network background to glean the important details, has served me well.  During a recent problem I needed to debug a web application that only ran using encrypted (https/ssl) communications.  Wireshark was able to show me what was happening with session setups and encryption exchanges, but all the application data was just a garble of meaningless characters.  Enter Fiddler, a free tool provided by Microsoft.

Fiddler is a web debugging proxy that captures all http, and optionally https, data from any web application that can point to your loopback address (127.0.0.1) on port 8888 (the default).  The interface displays a list of request/response pairs and their status codes on the left side of the screen and a detailed breakdown of the selected request and its response on the right.  There are multiple views available to visualize each request and response.  I find the Statistics view useful to see the number of bytes sent and received and the response time, and the Inspectors (Raw View) to see the gory details.  Other frequently accessed views are the Inspectors (ImageView), Filters and Timeline.  Third-party developers add more features to Fiddler, and I've installed neXpert to generate detailed reports with suggestions for improvements, Watcher to detect potential security issues and JavaScript Formatter to make JavaScript easier to read.
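
Any program that lets you set a proxy can be pointed at Fiddler the same way.  As a minimal sketch, here is how a Python script might do it with the standard library's urllib, assuming Fiddler is running locally on its default port.  The function name is my own, and for https traffic the script would presumably also need to trust Fiddler's certificate once decryption is turned on.

```python
import urllib.request

# Assumes Fiddler is listening on the loopback address at its default port.
FIDDLER_PROXY = "http://127.0.0.1:8888"


def build_fiddler_opener(proxy_url=FIDDLER_PROXY):
    """Return a urllib opener that sends http and https requests via the proxy."""
    proxies = {"http": proxy_url, "https": proxy_url}
    handler = urllib.request.ProxyHandler(proxies)
    return urllib.request.build_opener(handler)


opener = build_fiddler_opener()

# With Fiddler running, a request like this should appear in its session list:
#     opener.open("http://example.com/").read()
```

Building a separate opener, rather than changing the system-wide proxy settings, keeps the capture limited to the one script being debugged.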

Fiddler can be launched via its shortcut or from Internet Explorer using the Tools ... Fiddler2 option.  Fiddler will automatically change Internet Explorer's proxy settings, which also affects all other applications, like the Google Chrome browser, that use the same settings.  Mozilla's Firefox, which does not use IE's proxy, can be controlled via an option installed in the lower right-hand corner.

Capturing and displaying encrypted data requires changing some default options, located under Tools ... Fiddler Options ... HTTPS tab.   I suggest you read the information found at the "Learn more about HTTPS Traffic decryption and certificate errors" link.

Fiddler has a number of features, aimed more at professional developers and testers, that go way beyond simply displaying data.  You can set breakpoints and even fiddle (hence the name) with the data and inject your own.  But just seeing what's really going on "behind-the-scenes" can be eye-opening.

For more details, instructional videos and download, visit www.fiddler.com.