If I told you my code was running slow, what would you do? Too many people look at the source and start "optimizing".
They see:
for(int i = 0; i < array.length(); i++)
{
// do stuff
}
And say, "Aha! Everyone knows you should be storing the value of the array length, not re-evaluating it unnecessarily!"
The code gets refactored to
int arraySize = array.length();
for(int i = 0; i < arraySize; i++)
{
// do stuff
}
This should help, right? It won't. Or not nearly as much as you think it will. The only way to speed code up is to profile it and remove the largest bottleneck. It's the only way. You'll never meaningfully speed up a program by guessing. You'll always be wrong. If you accept that statement you'll waste much less time optimizing uselessly. Saving a few milliseconds total on a loop is worth nothing if the database or RPC call inside is taking 2 seconds.
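To make that concrete, here is a minimal sketch in Python of what "profile it first" can look like, using the standard cProfile and pstats modules. Everything in it is made up for illustration: slow_remote_call is a hypothetical stand-in for the database or RPC call, and its 10 ms sleep is simulated latency, not a measurement from any real system.

```python
import cProfile
import pstats
import time


def slow_remote_call(item):
    """Hypothetical stand-in for a database or RPC call."""
    time.sleep(0.01)  # simulated latency, not a real measurement
    return item * 2


def process(items):
    """The loop from above: cached length, slow call inside."""
    results = []
    size = len(items)  # the "optimization" that barely matters here
    for i in range(size):
        results.append(slow_remote_call(items[i]))
    return results


def profile_process(items):
    """Run process() under cProfile and return (results, stats)."""
    profiler = cProfile.Profile()
    results = profiler.runcall(process, items)
    return results, pstats.Stats(profiler)


def cumulative_time(stats, func_name):
    """Total cumulative seconds recorded for functions with this name."""
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    return sum(v[3] for k, v in stats.stats.items() if k[2] == func_name)


if __name__ == "__main__":
    _, stats = profile_process(list(range(20)))
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    share = cumulative_time(stats, "slow_remote_call") / cumulative_time(stats, "process")
    print(f"share of process() spent in the remote call: {share:.0%}")
```

In a run like this, nearly all of the time in process() should show up under slow_remote_call, which is the point: the profiler tells you where the time goes before you touch the loop.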
Performance myths abound, but they are only a small part of what I wanted to talk about. The title holds true for much more than just performance metrics.
What initially prompted this post was a very brief conversation I had while reading the excellent Best Kept Secrets of Peer Code Review book. I was reading a passage out loud that had surprised me:
From Figure 24 it is clear that review size does not affect the defect rate. [Ed: the defect rate is the number of defects found per hour] Although the smaller reviews afforded a few especially high rates, 94% of all reviews had a defect rate under 20 defects per hour regardless of review size.

After I read this to her, she said, "Really? Doesn't seem like that would be the case." I thought the same. Both our intuitions were completely wrong in this case. Whether you're reviewing 100 lines of code or 400 lines of code, the number of defects you'll find in an hour is pretty much the same. Amazing.
This is why you should never, ever trust your intuition. You can't afford to. It might be right sometimes but very often it will be dead wrong. The only way to make a proper decision is to base it on facts and measurements. Perhaps these will be in line with your intuition, perhaps not. But if they are, the time spent measuring was not lost. The time spent measuring when intuition was right is the small payment you make for all the times you measure and see your intuition was completely wrong.
This particular topic is very important to me because I came very close to making a colossal mistake. PhpEd essentially saved my project back in July or August or so. When I graduated, my first task as a recent grad was "new everything". The company wanted me to redesign and reimplement all their web-based software. Some of the stuff they were doing just wasn't maintainable anymore and they wanted a fresh start. A couple of others and I took on the task of designing and implementing a PHP MVC framework from scratch. We wrote our own ORM layer, had some form validation tools and some other neat stuff.
When we had written our first web application with the new framework, page loads were ridiculously slow. (As in 10, 15, 20, sometimes even 30 or more seconds per page load.) We saw that the slowest pages were the ones that made the heaviest use of objects, so our intuition told us we were creating way too many objects and using too much memory. Our intuition was wrong. I ran the PhpEd profiler over our code and noticed that something like 90 or 95% of the processing time was spent in __autoload(), a function that automatically loads a class if it isn't already defined. (So you don't have to sprinkle file includes all over your source.) We changed it from a naive recursive directory search to an in-memory hash of file paths and we instantly got sub-second page loads. We had been about ready to change our entire core architecture and revert to PHP 4 style code just because of a hunch. Measurement saved me.
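For the curious, the shape of that fix looks roughly like the sketch below. It is written in Python rather than PHP, and it is not our actual framework code: the function names are hypothetical, and it assumes one class per file, named ClassName.php, with unique class names. The naive version walks the whole directory tree on every lookup; the indexed version walks it once and then answers from a dictionary.

```python
import os
import tempfile


def find_class_file_naive(root, class_name):
    """Walk the whole tree on every lookup (the slow approach)."""
    target = class_name + ".php"
    for dirpath, _dirnames, filenames in os.walk(root):
        if target in filenames:
            return os.path.join(dirpath, target)
    return None


def build_class_index(root):
    """Walk the tree once and remember where each class file lives."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            name, ext = os.path.splitext(filename)
            if ext == ".php":
                # First match wins; assumes class names are unique.
                index.setdefault(name, os.path.join(dirpath, filename))
    return index


def find_class_file_indexed(index, class_name):
    """Constant-time lookup in the prebuilt index."""
    return index.get(class_name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout, for illustration only.
        os.makedirs(os.path.join(root, "lib", "orm"))
        open(os.path.join(root, "lib", "orm", "Query.php"), "w").close()
        index = build_class_index(root)
        print(find_class_file_naive(root, "Query"))
        print(find_class_file_indexed(index, "Query"))
```

Both lookups return the same path. The difference is that the first one pays for a directory walk per class, while the second pays for one walk per request (or less, if the index is cached somewhere).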
So, please, never trust your intuition. If you have a problem, never base a solution on a guess. Because that's all intuition really is. And like the code review book shows, this can and should be applied to much more than source code. Your process can and should be measured. Vague guesswork and tradition will not get you nearly as far as figuring out what truly works and applying it.

Hi Guillaume,
Nice post! This is the author of Best Kept Secrets where the chart came from.
The reason for the defect rate being constant against quantity of code is: Regardless of the magnitude of the task in front of us, we're still just reading and understanding source code. More difficult code might take more time, but the *amount* of code doesn't change the *rate* (meaning defects/hour) at which we can understand code.
Some factors that some people have suggested are correlated with defect rate are author's experience, reviewer's experience, reviewer's training on code review, and difficulty of code. HOWEVER, in my own experience most of these factors are NOT correlated statistically. I do believe that "code difficulty" slows down review, and if the author is especially new at writing code or new in the particular development group, then rates can be higher.
Your general point about not trusting your intuition is right-on when it comes to statistics. When we did the studies in the book, our initial hypotheses were often dead wrong. Sometimes we even still thought we had something when looking at the data, but statistics proved that the "pattern" was in fact an illusion, not statistically significant.
I think many mis-applications of metrics are due to people reading more into the numbers than is actually there. It's nice to see posts like this one that encourage people to seek the truth.
Finally, I'd be happy to provide you with a high-res version of that chart so you don't have to settle for a scanned-in version! :-)
Thanks.
-- Jason Cohen, http://blog.smartbear.com
yes, you cannot rely on intuition when you have allocated the time not to (rely on intuition), but all lifeforms are always mostly running on intuition. btw (re - your later post above), booze is bad for you. and the 'healthful' stuff in red wine is probably better gained by directly eating the grapeskins. we could research this last intuition-based argument of mine, but it's probably (!) more efficient to just eat grapes. :-)
Clearly, everyone knows that the loop should've been written as:
arraySize = array.length();
for (i = arraySize; i; i--)
{
...
}
And that would've solved all your bottlenecks!
Joking aside, I think an important factor behind having performant code (at least in languages such as C and C++) is *understanding the code your compiler will generate*. If you don't understand what your code will be compiled into, you have no hope of doing some kinds of performance work.
Profiling helps to identify bottlenecks, but once you actually have to write the code, you must understand what assembly (or intermediate language) it will become, and how that will be executed by the VM or the CPU. This is especially important when using optimization flags in certain compilers. Using "const", using "volatile", using "int" vs "char", manually unrolling your loops, etc.: all these are things that a programmer has to worry about when writing highly efficient code (in the case where library calls or bad design aren't your problem, but instruction-level performance bottlenecks are).
--
Alex Ionescu
I definitely agree Alex. (Well, outside of something like web development anyway..)
Too many people use "premature optimization is the root of all evil" as an excuse to write lazy, slow code.
You should always code with a reasonable amount of optimization in mind, depending on the level of responsiveness necessary. If you're writing kernel code then yeah, loop unrolling is something you might want to do from the get-go if you know it'll help. If you're writing application code, however, unrolling a loop should be a last resort, something you do when you know the bottleneck is right there and that's how you can speed your program up. No amount of loop unrolling is going to speed up a slow PHP web app. A C program on the other hand...