Complex SQL Query

Greetings Expert,

    Background: We have a large table of documents (1,000,000+ entries) with a variable number of text columns and bit columns. The variable number of columns is implemented by using a DocTextFields table and a DocCheckboxFields table where only the entries that contain data are listed. There can be over 100 of these variable number of columns. Please see attached Stored-Procedure.xlsx file for more detail about how these tables are defined.

    We are looking for an efficient way to pull back the data so the following is true (see Stored-Procedure.xlsx for an exact example of how we would like the resulting table to look):

        1. The FieldId of the text field or the checkbox field should be the column header of the returned table.

        2. Only the requested fields are included. Typically maybe 10 of the 100 fields are needed.

        3. Only the requested DocIds are included. The stored procedure needs to be in a form where we can add ROW_NUMBER() OVER and pull back only 25 docs at a time.

        4. It needs to be possible to sort by the TextFields or the CheckboxFields

    We are using SQL Server 2008 and have a fair amount of flexibility as to how this should be done. Any help would be greatly appreciated.

    NOTE: I have included Schema-Script.sql and Data-Script.sql for an easy way to create a test database and try out various approaches.

    Please let me know if you need additional information.


Who is Participating?
Hi Karsten,

You are correct, this data is highly normalised and makes sense if the front end allows for a high degree of configurability.

Before I tackle the question, could I just suggest you trim your example code files down by taking the "chaff" out when including them as it just takes us longer to work out what the meat of the code is. For instance your schema code boils down to just a couple of small table definitions, i.e.
CREATE TABLE DocCheckboxFields (DocId uniqueidentifier, FieldId uniqueidentifier)

Open in new window

is the only relevant piece in the first 37 lines.

So on with your question!
Dynamic SQL is the only way I can see this working. Dynamic SQL is technique where you build SQL script using SQL. i.e.
   Declare @SQL nVarChar(Max)
    Select Top 1 @SQL = 'Select [' + A.FieldId + '] = B.Text From DocTextFields'
                              + ' Where DocId = ' + @DocIdParameter
        From FieldsIncluded
    Exec sp_ExecuteSQL @SQL

Open in new window

Dynamic SQL is very very flexible BUT it is also very, very inefficient. Use it sparingly and only if there are no other alternatives.

To take that snippet further, you can loop through a table like FieldsIncluded to build the SQL script. Again Loops are bad in SQL performance terms but this is a scenario where I can see no other choice. I'd recommend creating a table variable, assuming the number of columns in the output is less than 1000. Because a table variable is assured to be in memory it keeps the loop as fast as possible. See code snippet below.

What I've shown you is just a pointer in what I think is the right direction.
It covers point 1 extensively.
Point 2 is a matter of applying the right WHERE clauses to my examples.
Point 3 is a matter of "partitioning" or "paging". I have some code where I have implemented that but it's not to hand easily.
I'll try and post that up later.
Point 4, depending on the end user viewer you use, it may be sufficient to pass it off to that tool, other wise you can build your sort order into the dynamic sql based on data or parameters passed to the routine.

Hope that gets you started!


Declare @FieldsIncluded Table(RowNo Int, FieldId UniqueIdentifier)
Declare @Row Int, @MaxRow Int, @SQL nVarChar(4000)

Insert Into @FieldsIncluded(RowNo, FieldId)
    Select Row_Number() Over(FieldId), Field_Id From FieldsIncluded

Select @Row = 0, @MaxRow = Count(*) From @FieldsIncluded
Set @SQL = 'Select [DocId]'
While @Row < @MaxRow
    Select @SQL = @SQL + ', [' + FieldId + '] = Text' 
        From @FieldsIncluded Where RowNo = @Row
Select @SQL = @SQL + ' From DocTextFields Where DocId = @DocIdParameter'
Exec sp_ExecuteSQL @SQL

Open in new window

Have you considered normalizing your data design?
karstenweberAuthor Commented:
Thanks aikimark for the feedback.

I'm not quite sure I understand what you mean by normalizing the data design. As far as I'm concerned, one of the complications with our approach is that the data is so highly normalized. Instead of storing everything in a huge non-normalized table (similar to an Excel sheet), we are only storing the fields that are not empty (like storing only the cells in an Excel sheet that contain data).

Can you give us some more details on what you were thinking?


Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

>>with a variable number of text columns and bit columns
This part of the description raised my concerns that this table is not as normalized as it could/should be.  I don't see these reflected in the two SQL files you posted.
Anthony PerkinsCommented:
>>we are only storing the fields that are not empty (like storing only the cells in an Excel sheet that contain data).<<
Right, you are going to have to look into using PIVOT to convert the rows into columns.

I thought there was some other table that we weren't seeing.  Maybe I've got an entirely wrong perspective on this problem.  (wouldn't be the first time)
Anthony PerkinsCommented:
>>I thought there was some other table that we weren't seeing.<<
I have to confess I did too.  That is until I looked at the schema.
karstenweberAuthor Commented:
Thanks chrismc. I'll take a look at your detailed response as soon as I get a chance.

karstenweberAuthor Commented:
Thanks chrismc for your input. Much appreciated.
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.