?
Solved

Indexing Strategy for Ad Hoc queries of a Data Warehouse.

Posted on 2003-03-10
4
Medium Priority
?
302 Views
Last Modified: 2008-01-16
I have a table of denormalized data which is approximately 150 columns wide. It will grow to approximately 150,000 rows or so.

The front end creates ad hoc queries which allow pretty much any column to potentially be part of the where clause.

Currently, I capture the queries that are being run, and their execution time so that I can create composite indexes for frequently run and or particularly slow queries.

A successful answer would ideally include a link or two to articles from respected sources as well as personal experiences.
0
Comment
Question by:JimV_ATL
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
4 Comments
 
LVL 3

Expert Comment

by:RiverGuy
ID: 8104440
Have you thought of use MS Analysis Services?  This would be useful if you were limiting column criteria to 1 value per column, i.e., 'WHERE State = 'CA' and AreaCode = '909'.  You would probably end up with sub-second result-sets.
0
 
LVL 18

Expert Comment

by:nigelrivett
ID: 8104470
What's the question?
Your approach seems good.
You will need indexes to support the queries but no point in indexing every column if most are not searched.

Another approach would be to not allow ad hoc queries and force the users to request search criteria - you would then build a stored procedure to support it.

I usually do this sort of thing by giving access to a search SP which is passed the search criteria (not a where clause but the conditions on columns). I can then log all the searches from the SP and put in specific code to support the most common searches. Less common ones are supported by dynamic sql.

I did have an article from a respected source (:)) but lost it when I changed ISPs.
0
 
LVL 3

Accepted Solution

by:
Frostbyte_Zero earned 1000 total points
ID: 8104964
I agree with RiverGuy. MS Analysis Server and use of OLAP Cubes would definitely be the best direction. With OLAP you define dimensions and measures. Measures are strictly numerical data that provide the data the user may be interested in analyzing. Dimensions are hierarchical data sructures that define some categorical business need. For instance, Country-State-County-City-ZipCode is a hierarchy of geographical locations. Year, Quarter, Month, Week, Day may provide time dimensions. Best of all your data is already denormalized! OLAP likes to use this type of data best.

What is your front-end? If you are using ASP, you could consider using ADOMD. That is a Multidimensional Extension of ADO. These are some links to usef ADO-MD syntax:

http://www.microsoft.com/msj/defaultframe.asp?page=/msj/0899/mdx/mdx.htm&nav=/msj/0899/newnav.htm

http://www.asptoday.com/find.asp?q=ADOMD&type=i


ADOMD uses MDX syntax. MDX is a syntax similar to SQL for OLAP Cubes:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/olapdmad/agmdxbasics_04qg.asp

A simpler approach would be to use MS Excel Pivot Table Services. Excel has a built in Pivot Table function that can connect directly to OLAP cubes either as a stand-alone application or used as a Web Component.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/olapdmpr/pt_intro_49wl.asp

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnexcl2k2/html/odc_xlextendolap.asp

I have written a few applications with ADO MD using data going back to 4 years ago and utilizing 10 million rows or more. Designing your cube is the hardest part. But the beauty is, once you define it right you never have to go back to it. Just process it once in a while to keep the data current and the user can slice and dice the data anyway they want. Then you can add some data mining techiniques and monitor that instead. You will find you can get a better feel for what the user really wants after a few short weeks.

Good luck.
0
 
LVL 3

Expert Comment

by:RiverGuy
ID: 8105089
I have never used ADOMD, but from what Frostbyte is saying, it sounds good.  Another good tool (although not free, there is a free sample) is Crystal Analysis.  It looks somewhat like the cube broswer in Analysis Manager, and includes graphs.  Its somewhat of a dynamic report in that the combo boxes for selecting specific dimension criteria are part of the report file itself.

You can also install English query from the SQL Server CD.  This would allow the user to input via an intranet-based web page something like "What was the revenue for California sales in Quarter 1 of 1999", and they would have the results displayed.
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Slowly Changing Dimension Transformation component in data task flow is very useful for us to manage and control how data changes in SSIS.
In this article we will learn how to fix  “Cannot install SQL Server 2014 Service Pack 2: Unable to install windows installer msi file” error ?
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
Via a live example, show how to shrink a transaction log file down to a reasonable size.

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question