Friday, March 11, 2011

Good Engineering

It's heartbreaking to watch the people killed, wounded and displaced by the earthquake in the Pacific Ocean. The fury of nature was mostly focused on Japan. Surprisingly, the damage in Tokyo was not as great as expected. The big city was rattled, but it was relatively unscathed compared to the lesser-known towns and villages. The casualties would have been unthinkably large had the buildings buckled in Tokyo, one of the most densely populated cities in the world.

How did Tokyo manage to evade the inevitable? Someone tweeted - Millions saved in Japan by good engineering and government building code. How profound. Without belittling the unfathomable suffering of the victims, I couldn't help but think about the parallel to my own profession.

Engineering is not about being superficially creative; it's about reliability and trustworthiness. What good would it be to build the tallest tower in the world if it toppled over, killing thousands and destroying far more in property? The true appreciation for engineering comes when something bad does not happen even though things don't go as expected. When building a database infrastructure, or managing one, the true effort of a DBA is manifested in things that do not occur: corruptions do not happen, so recovery never becomes a necessity; security breaches never occur, so no one scrambles to contain the damage. When things don't happen, the DBA is likely doing his or her job most effectively. It's not flashy webpages or nice reports; it's plain, simple non-events that differentiate [to borrow from the oft-repeated and near-cliche] men from boys.

I have a simple mantra (well, actually one of several) - success is not an accident; it's planned. Carefully planned engineering artifacts saved the day. Carefully planned processes save the organization from the perils of life - be it tsunamis or attempted credit card thefts. The success of the projects I execute, I believe, depends on how well they were planned - how prepared I was for all contingencies. There are three very important things in any project - details, details and details. Sometimes people around me get a little impatient that I pay too much attention to planning and details before starting the action. Well, without detailed analysis, I don't see how you can succeed in a project. By dumb luck, maybe; but definitely not because of effort.

One of the other overlooked factors in success is standardization. It applies to building a good layout or an architectural plan that influences future projects. For years, I have been developing and enforcing strict guidelines in my own organization, to address this eventuality - just in case. Time and again, they have proved invaluable by preventing small and large mishaps, just like the building codes did in Tokyo.

Because of what the engineers did, Tokyo was spared; not because of its good luck. Millions of people should thank the unsung heroes that made it their mission to pay attention to the detail and plan very carefully. And scores of CEOs, CIOs and shareholders should thank the unsung heroes in their own organizations who saved them from corporate perils by making sure nothing happens.

Planning ... details ... boring; but important. Not something you will see in headlines, sadly, though.

19 comments:

oraclenerd said...

That's it...how do you measure non-events? You can't. Well, unless you can travel to alternate realities...

This is something I ponder frequently, how do you measure good planning; analysis, data modeling, infrastructure, etc? It's difficult if not impossible because you can't travel the alternate path. The only way, that I see, is to earn the trust of those above you who make the decisions.

Chen Shapira said...

There's another important lesson here:
Japan has annual nationwide earthquake drills. So when the buildings started shaking, people were well practiced in finding a safe spot. When the tsunami warning was issued, people knew where to go and how.

I think the importance of practice is underestimated.

Chen Shapira said...

@oraclenerd

SLA is all about measuring non-events.
If you were 98% available in Q1 and 99.9% available in Q2, then you improved your engineering and therefore have fewer events.

Another metric I use is MTBF - mean time between failures. Also known as "uptime".
The longer you go between catastrophes, the better you are.
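
As a rough illustration of the two metrics mentioned above, here is a minimal Python sketch. The function names and the outage figures are hypothetical, and MTBF is taken here as total uptime divided by the number of failures in the window, which is one common definition and not necessarily the one the commenter uses.

```python
from datetime import timedelta
from typing import Optional, Sequence


def availability(period: timedelta, outages: Sequence[timedelta]) -> float:
    """Fraction of the reporting period during which the service was up."""
    downtime = sum(outages, timedelta())
    return 1 - downtime / period


def mtbf(period: timedelta, outages: Sequence[timedelta]) -> Optional[timedelta]:
    """Mean time between failures: uptime divided by the number of failures.

    Returns None when no failure was observed, because the ratio is
    undefined for that window.
    """
    if not outages:
        return None
    uptime = period - sum(outages, timedelta())
    return uptime / len(outages)


# Hypothetical 90-day quarters (illustrative numbers only).
quarter = timedelta(days=90)
q1_outages = [timedelta(hours=43.2)]   # 43.2 hours of downtime in total
q2_outages = [timedelta(hours=2.16)]   # 2.16 hours of downtime in total

print(f"Q1 availability: {availability(quarter, q1_outages):.3%}")
print(f"Q2 availability: {availability(quarter, q2_outages):.3%}")
print(f"Q2 MTBF: {mtbf(quarter, q2_outages)}")
```

With these made-up figures, Q1 comes out at about 98% and Q2 at about 99.9%, matching the example percentages in the comment. A quarter with no outages has no measurable MTBF at all, which is close to the "non-event" problem raised earlier in the thread.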

Arup said...

@oraclenerd Chen already addressed this. Yes, SLA is a measurable metric, but it's not the only one. One of the things I employ is a walkthrough of the scenarios and the plan before it is executed. The number of scenarios the planner has addressed shows the effectiveness of the planning.

Arup said...

@chen couldn't agree more. Practice is important as well, but sometimes it is not possible. In any case, there is no scenario where detailed planning is not possible.

oraclenerd said...

I agree that SLA is measurable, but, in the planning stage, no such thing exists. That's the crux of my issue.

I think the walk-through is a great idea (and I will steal it)...

Perhaps my question is more philosophical/rhetorical? A non-event is something that never happens. You can't measure something that never happens...right?

Chen Shapira said...

@oraclenerd

SLA is a *requirement* and therefore it must exist before planning and design.

You can't make any design decision before knowing the required availability, mostly because HA costs more money and you need an SLA to justify the expense.

BTW. One of my superpowers is availability planning - looking at system architecture and figuring out the expected availability and how it should be improved to match SLA. Maybe I should give a seminar :)
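
For readers curious what such an estimate can look like, here is a small Python sketch using the textbook approximation for independent components: a chain of components that must all be up multiplies their availabilities, while a redundant group is down only when every member is down. This is not the commenter's actual method; the function names, the component figures and the SLA target are all hypothetical.

```python
from math import prod


def serial(*availabilities: float) -> float:
    """All components are required: the system is up only if every one is up."""
    return prod(availabilities)


def redundant(*availabilities: float) -> float:
    """Any one component suffices: the system is down only if all are down."""
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical figures, assuming failures are independent.
app_server = 0.999
database = 0.99
sla_target = 0.995  # hypothetical required availability

single_db = serial(app_server, database)
mirrored_db = serial(app_server, redundant(database, database))

for name, value in [("single database", single_db),
                    ("mirrored database", mirrored_db)]:
    verdict = "meets" if value >= sla_target else "misses"
    print(f"{name}: {value:.4%} ({verdict} the {sla_target:.1%} target)")
```

Under these assumptions the single-database design falls short of the target and the mirrored design clears it, which is the kind of comparison that lets an SLA justify the extra cost of HA. Real failures are rarely fully independent, so numbers like these are an upper bound at best.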

oraclenerd said...

you're right...my brain was still on "non-event."

I often forget that...because I don't currently sit-in on those meetings.

I think you should give a seminar. You'll have at least one attendee.

leonid said...

#1. The DBA occupation (especially in the US) is not an Engineering Profession by any stretch of imagination. Any Real Engineer would gladly confirm that.

There are engineers who work as DBAs. I do not know any DBAs who work as engineers.

#2. The ability to create and follow plans is just a basic trait required in many fields: from cooking a dinner to building a fence.

#3. Software development without creativity is the reason why corporate IT is stagnating.

Good designers are those who make the difference (Compare Maruti and Benz).

Ability to create good designs comes from culture and education.

It takes generations to bring up a designer with good taste. Without that - it is all code copy pasting.

Which is boring indeed.

#4. Using the Japanese tragedy to bring traffic to a personal blog is shameful.


Reading:
http://en.wikipedia.org/wiki/Software_engineer

http://www.cs.usfca.edu/~parrt/doc/software-not-engineering.html

http://www.codinghorror.com/blog/2005/05/bridges-software-engineering-and-god.html

http://arstechnica.com/civis/viewtopic.php?f=20&t=863774&start=40

http://www.biblepath.com/beatitudes.html

Arup said...

@leonid

>> The DBA occupation ... is not an Engineering Profession...

I agree to some extent; but not entirely. See my next response.

>> There are engineers who work as DBAs. I do not know any DBAs who work as engineers.

Isn't that mere semantics? It's like saying there are cyclists who are also boaters, but none of the boaters is a cyclist. DBA is a more generic term - some focus on managing databases; some focus on creating something out of them. Some do both. Their numbers may be low compared to the general population, but there are several. If you haven't seen one, come to Collaborate; I will introduce a few to you myself. Some even commented right here.

>> #4. Using the Japanese tragedy to bring traffic to a personal blog is shameful.

I'm more curious than hurt. How on earth did you get that opinion?

Arup said...

@chen

>> One of my superpowers is availability planning - looking at system architecture and figuring out the expected availability and how it should be improved to match SLA.

Please, please give a seminar. Not just in paid events like OOW or hotsos; but something anyone can attend.

Arup said...

@oraclenerd

>> That's it...how do you measure non-events?

I assumed your question was indeed rhetorical. I have the same question too, but mine borders on self-pity. It's precisely the inability to measure that unfortunately steals the limelight from the planners and engineers.

[With the proviso that some people will dispute that DBAs are engineers]

oraclenerd said...

Definitely understand your original point.

I have said in the past that if you don't know your DBA, then he/she is good.

I appreciate the article. It's good food for thought. I feel your frustration, err, self-pity. :)

Arup said...

@Leonid please let me know how to contact you. Your profile is private; I can't see your full name or your email. I want to respond to your post, but I don't want to do it in this forum.

Rodger said...

Hey Arup,

I agree with your thoughts.

So much of what I see in IT, is just trial and error. Complexity upon complexity. Rather than simplicity. So few are thinking through the design, or the details.

As a DBA, I've been concerned with the very worst case scenarios. As a developer, I look into what would happen if the wrong data got into the fields. And without proper constraints, it so often does.

I've been writing some of my ideas. Hopefully, more soon. Here's a post that parallels what you write:
http://rodgersnotes.wordpress.com/2010/09/14/database-design-mistakes-to-avoid/

Best

Enrique Aviles said...

@oraclenerd

"Perhaps my question is more philosophical/rhetorical? A non-event is something that never happens. You can't measure something that never happens...right?"

Wouldn't the metric of a non-event be zero?

oraclenerd said...

I was thinking more of NULL or absence of value.

oraclenerd said...

Ran across this article today, follows along the theme here (the truly good aren't recognized because their stuff doesn't break): http://highscalability.com/blog/2011/4/18/6-ways-not-to-scale-that-will-make-you-hip-popular-and-loved.html

Ava Brown said...

I admit, I have not been on this webpage in a long time... however, it was another joy to see. It is such an important topic and ignored by so many, even professionals. I thank you for helping make people more aware of possible issues. Great stuff as usual.
