Tuesday, June 21, 2011

ORDER BY - A Presentation Tier Function

I'm working on a project and providing some SQL optimizations for an n-Tier architecture application. Part of this optimization process is the removal of ORDER BY clauses and this got me into a discussion with the application developer about ORDER BY. His thought was that the database would be more efficient about sorting data so there are tons and tons of ordering clauses throughout the database views/queries and such.

It's not that SQL Server (or any database for that matter) is particularly efficient or inefficient at sorting but rather a question of where the task is most appropriate. More specifically I see sorting as a presentation tier activity, I'm not 100% against order by clauses and sorting results in the database but it is extra steps the database is taking to return data when those steps could be done in the presentation tier. In some cases SQL Server is doing it all, data, logic and presentation, but when you have a second application (web/thick client/etc) presenting the data to the client I think that is where the sorting should be done.

Today at the client the developer has SQL Server handling both the data and logic tiers of the architecture (with the three tiers being data, logic and presentation), this isn't a bad thing as it provides a unified interface to the system and allows for other applications to be created to interface with it. But, they also have SQL Server handling part of your presentation tier processing (the order by statements). If this is required (and with the current application design it is in the short term anyway).

One other interesting tidbit about this application's design is they have a ton of views that are all stacked on each other multiple levels deep (up to 8 in some cases). Even though we cannot remove all the order by statements from the database as it would affect the user interface, there's no need for the lower level views to be doing sorting when the levels above them are also sorting the same data. If the query isn't being called directly by the application, remove the order by clauses and let the sort process be done once at the top level with the final result.

This will also reduce the IO load on tempdb, and help resolve another issue this client is experiencing.

This guy uses an interesting analogy to describe how the sql server plans are built. If you go to the bottom of the article he talks about his thoughts on order by:


Why I Hate “ORDER BY”

We believe in division of labor in the Ozar house. When I get home from shopping, I deliver the bags into the dining room table. From there, it’s Erika’s job to put things away in the various cupboards and closets. I know that I could make Erika’s job easier if I sorted items as I loaded them into the car, but that just doesn’t make sense. It’s not efficient for me to stop what I’m doing in the middle of the shopping trip, move things around between bags, and get them in exactly the right order for efficient cabinet loading. There’s plenty of space for efficient sorting when I get home.
Likewise, I don’t want my SQL Server sorting data. The more processing that I can offload to web servers, application servers, and thick clients, the better. SQL Server Enterprise Edition is $30k/CPU, but your web/app servers are likely a great deal cheaper, and you’ve probably got much better methods of scaling out your app server tier. We just don’t have a good way of scaling out SQL Server queries until Denali AlwaysOn hits.

0 comments:

The significant problems we have cannot be solved at the same level of thinking with which we created them.
    -- Albert Einstein (1879 - 1955)