Common Table Expressions (CTE)
A new feature to SQL Server 2005 is the addition of Common Table Expressions. You can think of CTEs as dynamic views. With a sql view, you cannot pass parameters to filter data, with CTEs you can and it does so with ease. So what is a real world application of CTEs you ask? I was recently tasked with building a web app, and in one particular part of the application I needed to traverse hierarchical data. Sounds pretty straight forward right? The data in SQL was architectured with a parent child relationship in one table. This table(Question) held the data that represented, a Question Domain, a question section, possibly subsections, and at the lowest leaf level, question(s).
So given any point in the navigation tree, I need to show where I am in the survey of questions and aggregate points based on the number of sections and questions within those sections. As I began writing a query my first stab was a stored procedure that would in turn call a scalar valued function and aggregate data for parent child relationships. That worked for small pieces of data until I had a domain that included over 100 sections each with 15 questions, needless to say that solution wouldn't have worked.
The second attempt was to aggregate the data and traverse my way back up by running a stored procedure and then in turn looping and calling my stored procedure multiple times to simulate recursion. Whereas this also worked, this was horrible on performance.
So taking the two previous concepts, I was introduced to CTEs, the dynamic view and using it to traverse hiearchical data. First a CTE was built to select the data that I wanted, so for instance
WITH domain (QuesitonId, ParentId, Description)
As
(
SELECT QuestionId, ParentId, Description FROM Question
* note: If the 'WITH' statement had been preceeded by anthing else, that precedding statement would need to be terminated by a (;). As it stands, the above will create a record set that I could query from, like a view!
next:
UNION
SELECT a.QuestionId, a.ParentId, a.Description FROM domain a WHERE (ParentID = QuestionId)
)
* at this point the record set has begun to iterate on itself and will do so until the condition cannot be met and will terminate iterating on its own.
So when the condition stops what we're left with is an in memory record set of the traversed data that we can either query against or use for another CTE if we needed. This method proved to be so much more efficient than calling my stored proc numerous times. Have fun!