Sunday, December 22, 2013

Redundancy Options


Computer systems can be architected to provide redundancy and recovery using a variety of technologies, such as Microsoft Server clustering, IBM DB2 data sharing and Cisco’s Hot Standby Router Protocol (HSRP).  The discussion can get very confusing very fast, so a while back I made up a few terms that describe what each approach ends up delivering to the user, which is all that really matters.

The first term is Failover, which provides a rapid switch from a failing primary service to a ready-to-go secondary service.  Failover solutions result in the user experiencing an unusually long response time and possibly the failure of their current transaction, but the user is still connected and is not required to log back on.  Failover solutions, in my experience, work only about 50 percent of the time, for two reasons.  First, most Failover solutions are architected using an active-passive rather than an active-active design.  This typically results in the passive side not being used for months before it’s called to active duty, and for a variety of reasons it doesn’t cleanly accept the Failover.  The second reason is the lack of a clear, hard failure.  Failover tends to work well when the primary fails hard, such as a total hardware failure.  Failover tends to work poorly when only a portion of the primary experiences problems.  Either the Failover doesn’t get initiated at all or only a portion starts to move.  In either case you don’t get the result you need.

The second term is Fallover, as in “you fall over and get back up”, and results in the user being disconnected from the service and having to log back in again.  For example, an SAP ERP implementation typically has several application servers, and a web application has several web servers, any of which can provide service to the user.  Which one the user gets connected to is decided at login time, but in the case of that server’s failure, the user simply logs in again and a different, working server is selected.  Fallover tends to work very well because it’s a much simpler solution than Failover and less costly.  Failover usually involves twice the expense to build a fully capable secondary.  Fallover typically involves buying just one extra server, adding perhaps 10% to the total cost.

The third term is Findover which, like Fallover, is a made-up word chosen so the series of terms is easy to remember.  Findover solutions involve finding a secondary service that provides exactly the same thing as the primary.  A list of Domain Name System (DNS) servers provides a type of Findover.  If a PC or server can’t contact the first DNS server in the list, it tries the next one, and repeats the process until it either contacts an active server or runs out of options.  IBM Lotus Notes servers can be configured to continuously replicate data between each other, and if one goes down, the Lotus Notes PC client software will automatically find one of the other replicas.
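The DNS-style Findover described above can be sketched in a few lines of Python.  This is only an illustration of the "try the next one in the list" idea, not how any particular resolver is implemented; the server addresses and the stand-in health check are hypothetical.

```python
from typing import Callable, Optional, Sequence


def findover(servers: Sequence[str],
             try_server: Callable[[str], bool]) -> Optional[str]:
    """Return the first server in the list that responds, or None if all fail."""
    for server in servers:
        if try_server(server):
            return server
    return None  # ran out of options


# Hypothetical addresses and a stand-in health check, for illustration only.
dns_servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
servers_that_are_up = {"10.0.0.2", "10.0.0.3"}

chosen = findover(dns_servers, lambda s: s in servers_that_are_up)
print(chosen)
```

In this sketch the first server is down, so the client quietly settles on the second one without the user having to do anything.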

Failover, Fallover and Findover.  Hopefully an easy-to-remember list of options.

And a colleague of mine made up a fourth, self-explanatory term to describe the lack of a recovery option.

Bendover.

Say no more.

Friday, November 29, 2013

The Future of Information Technology


For a time it appeared that the Information Technology profession was dwindling, being reduced to working for a large outsourcer or technology vendor.  Off-shoring was all the rage and IT jobs appeared to be permanently lost to cheaper labor.  All of that might continue in a substantial way, but the rise of Cloud, Mobile, Social and Big Data, while still in their infancy, portends an exciting future, albeit too fast for some and too scary for others.

But why are these four forces unlike other over-hyped buzzwords?  For the same reason the Internet has become much more than its hype of the mid-1990s.  The Internet was simply a common communications protocol that allowed anywhere-to-anywhere connectivity, yet like the railroad, and then the highway, it transformed where we live, where we work and all that we can experience.  The Internet, the railroad and the highway all democratized movement.  They laid the foundation for huge numbers of innovations.  They were the things necessary to build our next way of living.

The Cloud, which is simply computing power, both vast and affordable, is starkly different from traditional server farms, where a fixed amount is purchased and paid for up front.  The Cloud allows for experimentation, for short-duration projects with vast requirements (e.g. hundreds of millions of visitors to the three-week Olympics web site) and for workloads needing thousands of servers for a few minutes.  The Cloud is similar to electricity and gasoline: ubiquitous, low-cost, multi-purpose energy.  Electricity and gasoline did not change the world overnight, but enabled innovations like the light bulb and the automobile, which, in time, changed our everyday life.

Mobile is taking that computing power and connectivity with you wherever you go.  I liken this to the automobile and the airplane, which allowed our physical bodies to go places in minutes or hours, and at a far lower cost than their predecessors.  Mobile is in its infancy.  Sure, we've had laptops and cell phones for most of our working lives, but laptops had limited connectivity and cell phones had limited applications.  That changed with the iPhone and its App Store, a short five years ago.  And while our personal lives may have changed significantly, Mobile is just beginning to change our work lives.  We currently have business processes built on the old computer-on-a-desk model and a large investment in those systems.  As our imaginations begin grasping how we can blow up the old rules, just like cars and planes changed our view of the bottlenecks of distance, we can expect that how we work will change dramatically.

Social is about staying connected with more family, friends, customers or business partners (hundreds, thousands or even millions of them) than was ever before possible.  How many classmates from grade school do you still have any relationship with?  High school?  College?  At most, probably a few, unless you recently finished, or are still in, school.  Previous generations had to write letters or make expensive phone calls.  It took a lot of time to share information on a one-to-one basis.  You might even add a photo to your letter to describe a particularly striking vacation spot.  But more likely your friends became the group you physically interacted with, and if you moved to another city, most of those friends dropped off and were replaced with new friends at your new physical location.  Social technologies like Facebook, Twitter and text messaging allow you to remain connected and engaged with far less effort than before.  This may be the one that ultimately changes the world on a greater scale, just like the creation of language and the telephone brought the world closer together.

Big Data is about being able to quickly process vast amounts of data, and it could have an impact similar to that of the printing press and paper-making, which allowed for the storage and retrieval of vast amounts of human knowledge, but which are still gated by our human limitations.  We can now store, process and harvest insights from the information produced by our computer systems, medical sensors, manufacturing equipment, tweets, posts and many other sources.  And it will take big thinkers to gain new knowledge from our Big Data.  Perhaps that’s I.T.’s true future.  I think it should be.

Autos replaced horses, electricity replaced candles, the telephone replaced the Pony Express and books replaced scrolls.  The future of I.T. looks brighter than ever.  

But it will be hardly recognizable.

Should be loads of fun.

Thursday, December 27, 2012

Google BigQuery


During the recent Google I/O 2012 conference I watched one of the keynote sessions from the comfort of my favorite web browser and was introduced to their BigQuery service, which is the public version of Google Dremel, their internal tool for analyzing large datasets.  I was intrigued by the demonstrations on a dataset of 137 million records with query response times in the 3-5 second range.  But was this like the tomato-slicing machines hawked on television that work great for their well-practiced spokesperson, but do a better job of making tomato juice in my kitchen?  If a performance difference of a few orders of magnitude was real, it could be a great benefit, and since the cost to try it out amounted to pocket change, I decided to see for myself.

First a little background on the three key differences between using BigQuery and the familiar relational database technology.  BigQuery uses a table scan for everything.  There are no indexes or other mechanisms to write data to disk in a manner that may help later retrieval.  It’s the three key differences that make those table scans happen with great speed.

The first difference is using a column-oriented database approach, which simply is writing a table to disk column by column instead of row by row.  Row by row is great for finding one or a few rows, as is typically needed for executing transactions, but would require reading the entire table to read a single column.  By storing the data column by column, an analytic query can just read the columns requested, greatly reducing the amount of data that needs to be processed.
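To make the row-versus-column idea concrete, here is a toy sketch in Python.  It only illustrates the layout; it says nothing about how BigQuery actually stores data on disk, and the table, column names and values are made up.

```python
# Row-oriented layout: each record is kept together, so reading one
# column means touching every field of every row.
rows = [
    {"id": 1, "region": "East", "amount": 120.0},
    {"id": 2, "region": "West", "amount": 75.5},
    {"id": 3, "region": "East", "amount": 210.25},
]

# Column-oriented layout: each column is kept together as its own list.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An analytic query such as "total amount" reads only the one column it needs.
total_amount = sum(columns["amount"])
print(total_amount)
```

Note how the "region" column ends up as a run of similar values sitting side by side, which is the kind of data that tends to compress well.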

The second difference is a high degree of compression.  Since the data in a column is the same type and frequently contains large amounts of duplicates, it’s much more likely to compress well, quite often in the 10-to-1 range.  So for example, say we have a 100GB table with 100 equally-sized columns and 10-to-1 compression and we run a query retrieving 5 columns.  Instead of reading 100GB we read just 500MB, a considerable improvement.
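The arithmetic in that example can be checked with a few lines of Python.  The figures come straight from the example above; treating 1GB as 1000MB is an assumption made to match the round numbers.

```python
table_size_gb = 100      # total table size
column_count = 100       # equally-sized columns
columns_read = 5         # columns the query touches
compression_ratio = 10   # 10-to-1 compression

# Only the requested columns are read: 5 of 100 columns is 5GB uncompressed.
uncompressed_gb = table_size_gb * columns_read / column_count

# Compression shrinks what actually comes off disk.
scanned_gb = uncompressed_gb / compression_ratio
scanned_mb = scanned_gb * 1000  # assuming 1GB = 1000MB

print(uncompressed_gb, scanned_gb, scanned_mb)
```

The two savings multiply: a 20-to-1 reduction from reading fewer columns times a 10-to-1 reduction from compression gives a 200-to-1 reduction overall.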

The third difference is the number of servers that participate in the query.  While Google doesn’t comment on how many servers a query will be spread across, and it likely will vary with the size of the table queried and other factors, they use enough that the resulting response time stays so fast that people are motivated to use it a lot.  It’s a simple equation.  The more you use, the more money they make, and the faster it performs, the more you’re likely to use.

For my test case I had 87,232,116 records consisting of 139 columns, for a total of about 45GB of data.  I’m not saying this is “big data”, but it’s large enough to be interesting, and querying it had never been attempted before due to performance concerns.  I compressed the data into gzip (.gz) files no larger than 1GB each, uploaded them to Google Cloud Storage and imported them into BigQuery using their Python-based BQ command line tool.  There are a few other setup steps that preceded this, and the data was already in a compatible form, pipe-delimited.  Then using the BigQuery web browser interface (bigquery.cloud.google.com) I ran several dozen queries, none of which took more than 5 seconds to complete.  I also downloaded their Excel add-in, which allows queries to be executed from inside a spreadsheet, with equally impressive results.

The cost to use BigQuery is straightforward.  Twelve cents ($0.12) per month per GB stored and three and a half cents ($0.035) per GB scanned.  The first 100GB scanned per month is free.  So my testing cost $5.40, all in storage costs.  Not really a bank breaker.
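As a rough check on that figure, here is the pricing expressed as a small Python function.  The prices are the ones quoted above and were current when this was written; the function name is mine, and the 80GB scanned is a hypothetical amount chosen only to stay under the free allowance.

```python
STORAGE_PRICE_PER_GB_MONTH = 0.12   # dollars per GB stored per month
SCAN_PRICE_PER_GB = 0.035           # dollars per GB scanned
FREE_SCAN_GB_PER_MONTH = 100        # first 100GB scanned each month is free


def monthly_cost(stored_gb: float, scanned_gb: float) -> float:
    """Estimate one month's bill from GB stored and GB scanned."""
    storage = stored_gb * STORAGE_PRICE_PER_GB_MONTH
    billable_scan_gb = max(0, scanned_gb - FREE_SCAN_GB_PER_MONTH)
    return storage + billable_scan_gb * SCAN_PRICE_PER_GB


# 45GB stored; the scanned amount is an assumed value under the free tier.
cost = monthly_cost(stored_gb=45, scanned_gb=80)
print(f"${cost:.2f}")
```

With everything scanned falling inside the free 100GB, the whole bill is 45GB of storage at twelve cents each.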

Tuesday, October 9, 2012

The Outcome-Value Statement Revisited



A while back I wrote a blog on a method I use to gain clarity on requirements by describing an Outcome, which is simply what you are trying to accomplish without describing how you plan to get there.  These Outcomes are what drive Values, which are changes in cost or revenue, service provided or risk present.  When that's clearly understood, then the best Projects, Requirements or Activities that lead to the Outcome can be determined.  As I describe this framework to people, I find that I need to provide examples that are more easily understood.  To that end I turn to the game of golf.

I find golf more of a mental game than a physical game, which is good for me since I'm not particularly physically gifted (or maybe not at all) nor a natural born golfer.  But I can hit a variety of shots: high, low, fade, draw, etc.  And this is simply a matter of physics.  A golf ball only responds to the limited number of forces you can impart on it, and after it leaves your club head, gravity, air and the ground take over.  The ball will react to where on the club face you strike it, and to the forces of direction, velocity and acceleration in each of the x-, y- and z-axes.  So you only have ten Outcomes to consider when making a shot: the strike point plus three forces in each of three axes.  When you achieve those Outcomes, your ball will very predictably go exactly where, and how, you wanted.  It doesn’t matter if you’re Jack Nicklaus, Happy Gilmore or a pudgy fifty-something.  The golf ball isn’t looking at you.  It just responds to the club.

Now let's turn to the shot you're attempting to hit.  This is the Value.  Let's say you're on the tee of a long, dog leg left par 5.  A long tee shot, shaped right to left, would be of Value.  So you swing hard and hope your normal slice magically disappears this one time.  As is typical, when you don't want a slice, you get a bigger one.  Now you're in the woods, and if you're lucky you find your ball and chip it back in the fairway.  And swear next time you'll tee off with an iron instead.  Or maybe you just stop at the swearing.

What was missing from this errant tee shot was a description of the Outcome you needed in order to achieve your Value.  Hitting a long draw requires some specific Outcomes.  The most important Outcomes in this case are (1) hitting the ball on the sweet spot, (2) having the club head square at impact, (3) the club head having a high velocity, (4) the club head moving from left to right relative to the ball and (5) the club head having a moderate rate of acceleration.  These five Outcomes will cause the ball to fly a long way due to having the proper trajectory, a large amount of energy and some counter-clockwise spin (as viewed from above the ball).  How you achieve these Outcomes is irrelevant.  You can hit the ball with a tin can, a baseball bat or a golf club and the ball will fly exactly the same way, given the same Outcome is achieved.

Armed with this viewpoint, you then can begin to try to figure out how you can accomplish this Outcome, or maybe just come to the conclusion you just can't hit that shot no matter how hard you try.  If you can't hit the ball far enough, having a successful draw on the ball simply puts you behind some trees with no direct second shot.  If you just can't hit a draw to save your life, a long tee shot puts you through the fairway and no better off.  For example, I really struggle to hit a fade, which for me, a right-handed golfer, means the ball goes left to right.  I know that's because I stand farther from the ball than most people, and that makes it next to impossible to come across the ball from right to left and cause the Outcome, a clockwise spin, that I know I need.  So I don't try to hit that shot.

Another example occurred a number of years back.  Towards the end of the round, one of my playing partners was about forty yards behind a fairly tall tree.  He elected to chip back out to the fairway.  I told him that the tree wasn’t really in the way.  To demonstrate, I dropped an extra ball and launched an 8-iron over the top of the tree.  He was amazed, mainly because he noticed that I normally have a lower trajectory on my iron shots.  He asked me how I did it.  I responded “I accelerated through the ball”.  That causes the ball to stay in contact with the club head a little longer, which in turn causes the ball to roll up the club a bit more and get more benefit from the loft of the 8-iron.  That’s the Outcome I wanted.  How did I do it?  A shorter back swing and a longer follow through.  For me, that combination results in the needed acceleration.  But again, the Outcome matters, not how you get there.

So in golf, I suggest studying the physics of the game first and clearly understanding what makes the ball do what it does.  Only then begin figuring out the hows, like the grip, elbow and stance.  Getting a firm grip on Outcomes will make you more effective, at work or on the golf course.

Tuesday, October 2, 2012

Do More With Less


If I hear another executive spout that we need to “do more with less”, I might just scream.  Apparently they don’t read their audience’s reactions, which range from rolling eyes to demoralization to demonization.  It says to the audience that management can’t figure it out, so we’ll be cutting spending and working longer hours.  Or perhaps, like in the movie The Ten Commandments, we’ll force the slaves to make bricks without straw.  I’ve never seen it inspire the troops or become the rallying cry.

But what are they trying to say?  We have to become more productive.  That simple.  Not any more enlightening perhaps, but at least it’s a better starting point for a conversation.  And it’s a lot less threatening, so perhaps some folks will start to engage to figure out how to measure productivity and improve upon it.  I’ve been around long enough to see a few ways that might help you in figuring your plan out.

Back in the early 1980s, our company had a small round of layoffs, and our department had to reduce its staff by seven people, roughly a 5% decline.  That certainly fits the “with less” side, but it would result in doing “a little less”.  So we used this opportunity, with the company’s approval, to reduce staff by fifteen and then hired eight new people.  So the bottom 10% were let go and a more talented 5% added.  At the end of the process we were more productive.

Information Technology has a built-in advantage in that hardware and telecommunication costs decrease year-over-year at roughly the pace described by Moore’s law, which was an observation made by Gordon Moore, Intel’s co-founder.  He thought the number of transistors on an integrated circuit would double every two years for the next ten years.  It turns out that prediction has held for the last forty years without any obvious end in sight.  This results in an exponential curve of productivity and the built-in IT advantage, at least if you’re prepared to take advantage of it.  I ran a network group for several years at a company roughly the size of my current company.  Over the twenty-year span from then to now, the network budget decreased about 80%, and where a T-1 line (1.5 Mbps) was considered top-of-the-line, we now routinely deploy lines 10 to 30 times faster.  Phone calls used to be $0.25 per minute; now they are under $0.02 per minute.  Showing improved productivity as the network group manager was a pretty easy task, and it funded other parts of the IT group to tackle new projects.
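To put numbers on that exponential curve, here is a quick back-of-the-envelope calculation in Python.  It uses only the figures quoted above; the variable names are mine, and the phone rate is taken at exactly $0.02 even though the real figure is "under" that.

```python
# Doubling every two years, sustained for forty years.
years = 40
doubling_period_years = 2
doublings = years // doubling_period_years
growth_factor = 2 ** doublings

# Lines 10 to 30 times faster than a T-1 at roughly 1.5 Mbps.
t1_mbps = 1.5
slowest_new_line_mbps = 10 * t1_mbps
fastest_new_line_mbps = 30 * t1_mbps

# Per-minute phone cost falling from $0.25 to $0.02.
old_rate, new_rate = 0.25, 0.02
phone_cost_drop = (old_rate - new_rate) / old_rate

print(doublings, growth_factor)
print(slowest_new_line_mbps, fastest_new_line_mbps)
print(f"{phone_cost_drop:.0%}")
```

Twenty doublings is a factor of roughly a million, which is why even a shrinking budget can buy a great deal more capacity each year.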

My advice for improving productivity is to start by throwing out costly, time-consuming, lower-value work and taking some of that savings to fund higher-value projects, particularly those projects that need early funding to create lower costs in the future.  You’ll have to have a very good understanding of your costs, be willing to change anything, break off from traditional vendor relationships and invest in choices that prepare you for lower costs later.  But that can be a lot of fun.  A lot more than making bricks with no straw.

Friday, September 14, 2012

True Collaboration


The finest example of true collaboration I had the pleasure to witness took place during an SAP project years back.  It started out simply enough when I needed to produce some statistics from SAP but lacked the location of each user.  We needed to get that data into each SAP user’s profile from some external source.  A simple problem statement, yet lacking a simple answer.

The solution presented itself through a most unlikely collaboration source: a simple email stream.  It started with an idea that would work, but at a fairly steep cost, in the six-figure range.  That first email was sent to about a dozen people.  A short time later, someone else improved on the first idea.  Then another, and then another.  Somewhere in the middle of the ten or so emails that eventually became part of a stream of ideas, I improved on the idea.  And then my idea was further improved.  At the end of the stream, the final idea would take a couple hours of time and no further outlay of dollars.  I sat amazed at this string of creativity and the fantastic solution.

Then I took a step back and realized how fantastic and unique this was from a people standpoint.  And how the credit for this belonged not just to the final idea’s creator, but to everyone involved.

The person submitting the original idea took what most people would consider a bold and probably an unwise risk.  But it took some uncommon bravery to write down a well thought out idea for a group of bright people to critique.  That bravery cannot be overstated, and even though we all knew each other pretty well, it can still be a risky thing to do, particularly to one’s own ego.  But without that start, it’s likely the problem would not have been solved.  He deserved a special thank you and a nice chunk of the credit.

Then there was the group of people, myself included, that incrementally improved on the first idea.  And although we didn’t find the final solution either, we kept the energy alive and the ideas flowing.  Each of us deserves some of the credit for getting to the final solution.

The person with the final solution certainly deserves their share of credit.  They designed a very elegant solution that was quick to implement and at only the cost of a couple hours of time.  

Bravery, energy and ideas are the lifeblood of collaboration, not cool social media tools.  Start and end with people.  Give credit to everyone that participates.  The rest (tools) will take care of itself.

Monday, July 9, 2012

5-50-500-5000

I mostly hear complaints about Change Management.  Too much paperwork, too many meetings and the process slows everything down.  All true, and if the Sarbanes-Oxley legislation didn't exist, I believe most IT departments would have abandoned the process years ago.  And that would have been an absolute shame.

A decent Change Management process is trying to tell you how risky the change you're making is in terms of the impact to the business if things go wrong.  It's also trying to match the level of testing, backout planning, etc. to mitigate that risk.  And the underlying cause of the pain in many instances is that we've designed our technology solutions as "big bang" implementations.  Have you noticed how big Internet and mobile device companies implement their changes?  They typically have a beta program which engages risk-tolerant people first.  When a few cycles of this pass and the known bugs are worked out, they begin a slow trickle of upgrades, ready to halt the process at a moment's notice.  When all is good for a decent chunk of their users, they upgrade the remainder in short order.  They've avoided the "big bang" approach, shortened their cycle times and not upset their customers or their business.

Sure, you say, they have advantages internal IT shops don't have, and in some cases that's true.  But in many other cases we have the same advantages, and just haven't used them.  Which brings me to the title of this blog, 5-50-500-5000.

Years ago our email system, Lotus Notes, became an increasingly important service.  New software releases came out frequently and offered compelling new features and performance gains, and we wanted to deploy them as quickly as possible.  But with 5000 email users on a single system and everyday business counting on it, any major change was a very high risk.  A test system was of little help, since a few technical people could not adequately test everything, primarily because 5000 users do a lot of different things.  Our solution was to break up the email system into four partitions, while still making it appear as a single email system.  The first partition held about 5 users, just members of the core technical staff.  The second held about 50 users, a mixture of IT and risk-tolerant users.  The 500 system was a broad, representative set from across the entire organization.  The remainder fit into the 5000 set.  
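The partitioning idea can be sketched in Python.  This is only an illustration of the staged approach, not how we actually configured Lotus Notes: the function names and user names are made up, and the sketch assumes the user list is already ordered from most to least risk-tolerant, whereas in practice the groups were hand-picked.

```python
from typing import Callable, List, Sequence


def assign_rings(users: Sequence[str],
                 ring_sizes: Sequence[int] = (5, 50, 500)) -> List[List[str]]:
    """Split users into the 5, 50 and 500 groups; everyone left over is the last group."""
    rings = []
    start = 0
    for size in ring_sizes:
        rings.append(list(users[start:start + size]))
        start += size
    rings.append(list(users[start:]))  # the remainder, i.e. the "5000" set
    return rings


def staged_upgrade(rings: Sequence[Sequence[str]],
                   upgrade_ok: Callable[[Sequence[str]], bool]) -> int:
    """Upgrade one ring at a time and halt at the first ring that reports trouble.

    Returns the number of rings upgraded successfully.
    """
    completed = 0
    for ring in rings:
        if not upgrade_ok(ring):
            break  # stop before the problem reaches a larger group
        completed += 1
    return completed


# Hypothetical user names, assumed ordered from most to least risk-tolerant.
users = [f"user{i:04d}" for i in range(5000)]
rings = assign_rings(users)
print([len(ring) for ring in rings])
```

The point of the design is that a bad release is caught while it affects 5, 50 or 500 people rather than all 5000.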

The risk, to the business, of upgrading any of the 5, 50 or 500 groups was very tolerable.  When it came time to upgrade the 5000 group, we had reasonable assurance that things would go smoothly, again, with respect to the entire business.  These Change Management meetings typically went smoothly and quickly, which they should when the risk is largely mitigated by the facts at hand.  The real victory was users that experienced few problems and enjoyed new email features.

Listen, learn and adapt your IT services to what your processes are telling you.  Your customers, and yourself, will benefit from the results.