Friday, January 27, 2012

An Analysis of Normalization Through the Eyes of the Developer - Part II

View Part I
 
This design is also inefficient because it puts a task in the middle tier developer’s hand that a normalized table would already be designed to handle, more to the point, a normalized table would have avoided the problem all together.  I am speaking here of adherence to the first normal form.  The current design would require the developer to manage all data going into all four language columns, per product.  This may include building a subroutine to ensure that only one of the columns in the language family of columns is a “1” and the rest are all “0”s; or manually auditing the data to ensure this; or even worse doing nothing to audit this data and running the risk of having dirty data spread across multiple columns.  A normalized solution allows SQL to naturally avoid this expensive, unnecessary development work, and all future data cleansing work.  Having multiple columns for each language also brings forth the issue of query performance to get the data out.  If the previous inefficiencies are acceptable, the development effort and speed of all queries built against this non-normalized table is guaranteed to become an issue in this design.  There is no quick way to return all records for one specific language in the current non-normalized design.  More importantly, any query that is designed to return the needed data with always be a distance second, performance wise, to a query for the same data built against a properly normalized table.  You are doomed to always have inferior query performance.


One last major issue with this design is the excess space this non-normalized design leads to using.  With one column per language, and four languages, you are essentially saving 3 pieces of data that each say, “I am irrelevant.”.  What a waste!  I am reminded of a former manager who told me once during the architecture of a notification system, “I don’t want to get an email that tells me there is not a problem.”.  The more columns that get added, the more you are guaranteeing wasted space.  As for the argument that states, “Well it’s not that much wasted space, it’s only a bit column.”, how do you think it might be perceived if the accounting department made this argument, “We know we are wasting money, but we’re only wasting a little money.”?  If that is not compelling enough look at it for the ratio of wasted space, in this case we are increasing the necessary data space requirements for this piece of data by 300%!

In Parts I and II, I have pointed out a common development scenario, and have highlighted three shortcomings or on implemented, non-normalized, solution.  In part III, I will propose a normalized solution that eliminates these shortcomings, and saves the developer time, increases application performance and reduces data storage requirements.


View Part III


- Ed Dressler, SQL Server Expert; CTT+, MCDBA, and MCITP LinkIn with Ed

Bookmark and Share

No comments:

Post a Comment