Keep your unoptimized source code

I had quite an interesting discussion with my coworker last week. We often have endless discussions about software development, which start with him talking about how nice functional languages are and end with how my knees hurt. They go in all directions, but they are always interesting.
Back on topic: at some point during our discussion, I told him how I think database schema design should work. I believe you should create your schema to be normalized from the start. BCNF is probably a good target, and I would aim for at least the third normal form. Since normalized schemas tend to need more joins, which can make your queries slower, you should follow these steps if you run into a performance problem:
- Measure your queries’ performance.
- Find out which queries are your performance bottleneck.
- Denormalize tables that are part of those queries to gain performance if it applies.
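The first two steps can be sketched in a few lines. This is only a minimal illustration, assuming SQLite through Python's standard `sqlite3` module; the `customer` and `purchase` tables, the two queries and the `time_queries` helper are made up for the example, and a real DBMS will have better profiling tools of its own.

```python
import sqlite3
import time


def time_queries(conn, queries, repeats=5):
    """Run each query several times and return (name, best_seconds), slowest first."""
    timings = []
    for name, sql in queries.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            conn.execute(sql).fetchall()
            best = min(best, time.perf_counter() - start)
        timings.append((name, best))
    return sorted(timings, key=lambda item: item[1], reverse=True)


# Hypothetical normalized schema with a little sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount REAL NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO customer (id, name) VALUES (?, ?)",
    [(i, f"customer {i}") for i in range(100)],
)
conn.executemany(
    "INSERT INTO purchase (customer_id, amount) VALUES (?, ?)",
    [(i % 100, 1.0) for i in range(1000)],
)

# Hypothetical queries the application runs.
queries = {
    "customer_by_id": "SELECT name FROM customer WHERE id = 42",
    "totals_per_customer": (
        "SELECT c.name, SUM(p.amount) FROM customer c "
        "JOIN purchase p ON p.customer_id = c.id GROUP BY c.id"
    ),
}

report = time_queries(conn, queries)
for name, seconds in report:
    print(f"{name}: {seconds * 1000:.3f} ms")
```

Whatever ends up at the top of the report is the candidate for denormalization; everything else can stay in its normal form.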
In many cases, when you denormalize tables, you get a database schema that is less intuitive to understand and to work with. My key point is this: keep your unoptimized, normalized schema. You can have one branch in your source control system for the unoptimized schema and another for the optimized one. It is easy to do and it is inexpensive. You can also have a migration script that takes your unoptimized schema and transforms it into the optimized form. This way you can keep developing in the reference world and have your production server run the optimized version, simply by deploying your reference version plus the optimization migration script.
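Here is a rough sketch of what "reference version plus optimization migration script" could look like, again assuming SQLite and Python's `sqlite3` module. The schema, the `customer_name` column and the function names are hypothetical; the only point is that the optimized schema is derived from the reference one rather than maintained by hand.

```python
import sqlite3

# The clean, normalized schema you keep developing against (hypothetical).
REFERENCE_SCHEMA = """
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE purchase (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    amount REAL NOT NULL
);
"""

# The optimization migration: copy the customer name into purchase
# so that a hot query no longer needs the join (hypothetical).
OPTIMIZATION_MIGRATION = """
ALTER TABLE purchase ADD COLUMN customer_name TEXT;
UPDATE purchase
SET customer_name = (
    SELECT name FROM customer WHERE customer.id = purchase.customer_id
);
"""


def build_reference(conn):
    """Create the unoptimized reference schema."""
    conn.executescript(REFERENCE_SCHEMA)


def apply_optimization(conn):
    """Transform a reference database into the optimized form."""
    conn.executescript(OPTIMIZATION_MIGRATION)


def column_names(conn, table):
    """List a table's column names using SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# Development database: reference schema only.
ref = sqlite3.connect(":memory:")
build_reference(ref)

# Production database: reference schema plus the optimization migration.
prod = sqlite3.connect(":memory:")
build_reference(prod)
prod.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")
prod.execute("INSERT INTO purchase (customer_id, amount) VALUES (1, 9.5)")
apply_optimization(prod)

print(column_names(ref, "purchase"))
print(column_names(prod, "purchase"))
```

If the bottleneck moves later, only `OPTIMIZATION_MIGRATION` has to change; the reference schema stays untouched.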
Why would you want to do this? Here are a few good arguments:
- It is always good to have a clean reference for the optimized parts of your application. References are often much easier to read, understand and change than their optimized counterparts. Just being able to understand some part faster can have a tremendous impact on your own performance as a developer.
- Bottlenecks change over time. After multiple updates to your application, you might discover that your bottleneck has moved. Having the clean reference in a separate branch makes it easier to go back to it, measure again and optimize only the new bottlenecks.
- Technology changes over time. You might wake up one day to find that the new version of your DBMS optimizes your previously slow queries on the fly, without any intervention on your part.
This also applies to code. Optimized code is often harder to read, understand and maintain. A good example of this is Unladen Swallow, a Google-sponsored effort to build a faster implementation of the Python language. In this same spirit, their long-term plan is:
[...] to make Python fast enough to start moving performance-important types and functions from C back to Python.
Google understands the value of code that is easy to read and easy to modify. It is time we put more effort into that goal. Having your unoptimized reference code easily accessible is a good start.
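For code, one simple way to keep the reference accessible is to leave it right next to the optimized version and check that the two agree. The sketch below is only an illustration in Python; counting primes is an arbitrary stand-in for whatever you optimized, and all the function names are made up.

```python
def count_primes_reference(limit):
    """Readable reference: trial division for every number below limit."""
    count = 0
    for n in range(2, limit):
        if all(n % d != 0 for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_optimized(limit):
    """Sieve of Eratosthenes: faster, but the intent is less obvious."""
    if limit < 3:
        return 0
    is_prime = bytearray([1]) * limit
    is_prime[0] = is_prime[1] = 0
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Cross out every multiple of n, starting at n * n.
            is_prime[n * n::n] = bytearray(len(range(n * n, limit, n)))
    return sum(is_prime)


def implementations_agree(limits):
    """Check the optimized version against the reference on the given inputs."""
    return all(
        count_primes_reference(limit) == count_primes_optimized(limit)
        for limit in limits
    )


print(implementations_agree(range(200)))
```

The reference doubles as documentation and as a test oracle: when the optimized version needs to change, you can rewrite it freely and let the reference tell you whether it still does the same thing.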