Saturday, October 18, 2014

Part 1: A Forbidden Discussion in Software Development

As a performance engineer, I feel a bit tarnished after the launch failure of healthcare.gov. I was in no way involved, and those responsible hold no monopoly on launch failures. But from the outside, does it really look like we know what we're doing, like we have a 'big plan'?

One thing missing from such a 'big plan' is a discussion of who is responsible for the specific tasks that ensure system performance. We have not drawn a line for ourselves and said whether the development team or the performance tuning team owns tasks like these:
  • Who validates that the SQL queries hit the right indexes? (sorry, I can't do "indices")
  • Who makes sure the static data in the database is cached, keeping us from fetching to death those little bits of rarely-changing data?
  • Who keeps the SELECT N+1 anti-pattern out of the code base?
And I'm just getting started. The list goes on: pagination for long lists, memory leak detection, CPU profiling, setting throughput goals. Which team is going to tackle these? Dev or performance? We need to draw the line and explicitly say who owns what. We need to start talking about this.
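For anyone who hasn't met the SELECT N+1 anti-pattern from the list above, here is a minimal sketch of it in Python, using the standard library's sqlite3 module and an in-memory database. The customer and orders tables, the CountingConnection wrapper, and the function names are all made up for illustration; none of this comes from a real system. The point is only to show how one query per row sneaks in, and what the single-query alternative might look like.

```python
import sqlite3


def build_db():
    """Create a tiny in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
        INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0);
    """)
    return conn


class CountingConnection:
    """Hypothetical wrapper that counts how many SQL statements are issued."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def execute(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def totals_n_plus_one(db):
    """Anti-pattern: 1 query for the customers, then 1 more query per customer."""
    result = {}
    for cid, name in db.execute("SELECT id, name FROM customer ORDER BY id"):
        rows = db.execute(
            "SELECT total FROM orders WHERE customer_id = ?", (cid,)
        )
        result[name] = sum(total for (total,) in rows)
    return result


def totals_single_query(db):
    """One possible fix: let the database join and aggregate in a single query."""
    rows = db.execute("""
        SELECT c.name, COALESCE(SUM(o.total), 0)
        FROM customer c
        LEFT JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.id
    """)
    return dict(rows)


if __name__ == "__main__":
    slow = CountingConnection(build_db())
    fast = CountingConnection(build_db())
    print(totals_n_plus_one(slow), "queries issued:", slow.queries)
    print(totals_single_query(fast), "queries issued:", fast.queries)
```

With the three customers in this toy data set, the first version issues four queries and the second issues one. Both return the same totals, which is exactly why this anti-pattern passes functional testing and only shows up under load.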

I have seen dozens of development teams, in the company I work for and elsewhere, that resist being tasked with performance concerns. Why is this? Do we deeply yearn to be blindfolded and in plaid pajamas, spewing crappy-performing code that functions beautifully? If so, we also delight in defending these plaid pajamas, not out of laziness or negligence, but out of the most noble intentions: avoiding premature optimization. I'm a Java architect as well as a performance engineer, and I'm at times guilty of this over-zealotry, too.

I feel obliged, at this point, to defend my cred as a performance engineer and put myself on the record. Wasting time on over-optimization is indeed a problem that deserves considerable concern, as does setting and sticking to your performance goals. But as an industry, to the peril of the likes of healthcare.gov, we have failed to define the middle ground, to flesh out design & development's share of the responsibility for performance. The discussion to divvy up these chores seems oddly forbidden.

Before we draw the line, before we decide who's responsible for 'what', we first need to agree on the 'what' part. Does a short list really exist, a list of performance anti-patterns that should be avoided in most cases? If not, then perhaps we don't have a 'big plan' after all.

Keep reading here for part 2.

