Solved

Storing/Executing SQL queries in SQL table

Posted on 2013-12-15
20
307 Views
Last Modified: 2013-12-16
I have a website with thousands of members. When they log in to the website there are certain questions displayed to them. These questions need to be dynamic in terms of them only displaying questions that are relevant to each logged in user. When we create the questions, we assign a select statement to each one defining who should see it. Currently we have to loop through all of the questions, execute the custom query and see if the current session ID is amongst the result set, otherwise don't show the question on the page. The site is coded in ASP.NET however I'd like to get my head around the SQL to optimise this process so that's the focus for this question.

Tables look like this...

UsersTable                        
UserID      Name      Age      Gender      City
12345      Joe      32      Male      London
54321      Bill      44      Male      Paris
45678      Mary      24      Female      Madrid
                        
                        
QuestionTable                        
QuestionID      QText      QRY            
10      Do you travel on London Underground?      Select UserID from UsersTable where City = 'London'            
11      How often do you use the Paris Metro?      Select UserID from UsersTable where City = 'Paris'            
12      How often do you use the Metro Ligero?      Select UserID from UsersTable where City = 'Madrid'            

Current user is logged in to account and session ID is present.      
                  
The syntax of this may be wrong, but within the loop that checks each question in the QuestionsTable, we have something that essentially works like this...

select QuestionID from QuestionTable where SessionID in (EXEC QuestionTable.QRY)                        

This works, but takes way too long to run so I'm looking to optimise this process in some way.
0
Comment
Question by:guswebb
  • 10
  • 5
  • 5
20 Comments
 
LVL 9

Expert Comment

by:QuinnDex
ID: 39719871
in the example questions you have given and assuming all are related to the city in which the user lives I would group the questions by the city they apply to.

you can also do this for other questions ie gender, age etc

this would simplify your query, and allow for betting indexing of your tables (good indexing will give you the best performance boost).

ie user comes from London you would select question you have relevant to London

select * from questions where r1=london

Open in new window


if you want to add gender and age details to the answer you grab them via user id from user details after question is answered,

if you want to filter the questions by gender or age you add these filters to the question when add it to the data base with extra columns ie r2,r3 etc
select * from questions where r1=london and r2=female and r3 between 20 and 30

Open in new window

0
 
LVL 9

Author Comment

by:guswebb
ID: 39719877
Whilst the example queries are all based on city, in fact they can vary from one to the next, and be based on a combination of age, gender, city, or other attributes pulled from the UsersTable or even a join with another table.

My aim is to only show those questions to those users who are amongst the results returned from the custom query stored in the QRY field. In order to do this, each query in the QRY field has to be executed to see if the current sessionID is amongst the UserIDs that it returns. With over 200,000 users and around 1,000 questions being live at any one time, you can see the problem with looping through each custom query one at a time.
0
 
LVL 9

Expert Comment

by:QuinnDex
ID: 39719892
to look deeper into that we need to know how you determine what questions are shown, how do you decide on what questions get allocated the current sessionid

the main thing we would need to know is whats in the QRY field as your selection seems to hing on this

does each question have its own query, if this is the case then looking at some way of standardizing these queries so you only store parameters in the table and call an SP to get results might be a way forward.

but we need to know the process you use currently and table structure with sample data and results wanted
0
 
LVL 9

Author Comment

by:guswebb
ID: 39719901
Most questions are open to all UserIDs and so the custom query field will simply contain a default query of select * from UsersTable

Around 40% of the questions loaded have a custom query assigned to them so that they will only be displayed to those users who fall within the designated sub-population returned by that custom query. The contents of those custom queries could be based on around 800 fields of user-specific metadata stored across 100 tables. For example, we might only want to show a particular question to males who own a Mercedes, whilst another question might need to be shown to women aged 30 who own an iPhone. All of these additional attributes are stored against user accounts, so details of car ownership, mobile device etc. are amongst around 800 fields that exist alongside each user's account. Custom queries can therefore be pretty complex and very different to anything that is already stored in the custom query field, as they may interlock multiple fields/tables.

Determining which questions are shown is driven by a) which questions are open to an individual based on the custom queries that they would fall amongst the result set for, and b) those questions which they have not yet answered. We show 10 questions on the landing page when the user logs in, reloading the page once each question is answered and then again showing 10 available questions that they have not yet answered. Currently it is a case of executing each custom query for those QuestionIDs that the user has not yet answered and seeing if the current SessionID (UserID) is amongst the target population to view that question, then adding this QuestionID to an array of questions to show on the page.

The example tables and sample data above is representative of the process, albeit in a much more simplified state.
0
 
LVL 9

Accepted Solution

by:
QuinnDex earned 500 total points
ID: 39719920
you could run an update over night or during quite periods, that run your queries against the user db, and populate another table that associates user id and question id where the question hasn't been answered yet

this would take a bit of time to populate first run so maybe done on a development server before putting live.. subsequent updates would be done for new users or new questions, scheduled for your quite periods and would be much quicker.

your query then would just be that table for question ids to display by user id

as a question is answered drop that question id from that user

this would speed up the end user experience considerably,  and move your process load to idle periods
0
 
LVL 9

Author Comment

by:guswebb
ID: 39719926
This is where my thought process was heading, the ideal scenario being a table containing each UserID and alongside that a field that contains all of the QuestionIDs open to that person (delimited by semi-colons e.g. 18;20;40;44, would mean that the current user can only see questions 18, 20, 40 and 44). This string of QuestionIDs could be as long as 100 values, or more. Each time a user answers a question, the corresponding questionID value can be stripped from this field leaving only those other questionIDs that are open to them.

This is made slightly more complex by some questions having closing dates which means they come offline automatically. So, in that case the QuestionID would have to be stripped from up to 200,000 rows at the point it is removed. Is there an easy way to do a find/replace within a string using T-SQL? For example, if QuestionID 40 is to come offline due to its closing date having been hit, then the value ';40;' is replaced with ';' in all those rows that contain this value?
0
 
LVL 9

Expert Comment

by:QuinnDex
ID: 39719951
including colons each side of the id to be replaced will mean that you wont replace 2 digits from 3 digit numbers

you will have to add colons to beginning and end of your delimited string or it wont match first or last number, and will have to remove them after the replace

SELECT REPLACE('18;20;40;44',';20;',';');

Open in new window


easiest way to do that would be to add a second ; after the replace and then replace ;; with ''
0
 
LVL 9

Author Comment

by:guswebb
ID: 39719961
I know, that's why my example above included the semi-colon either side of the QuestionID to be removed ;o)

Good idea re: the double ';;', thanks.

How can the SELECT REPLACE be done across 200,000 rows quickly? If my field containing the QuestionID strings is called QuestionString in a table called OpenQuestions and I want to remove QuestionID 20 from all 200,000 rows in that table, would this be enough to do it...

SELECT REPLACE('QuestionString',';20;',';') FROM OpenQuestions

Is that right?
0
 
LVL 9

Assisted Solution

by:QuinnDex
QuinnDex earned 500 total points
ID: 39719967
you need to remove quotes from column name but yes that will do it

SELECT REPLACE(QuestionString,';20;',';') FROM OpenQuestions

Open in new window

0
 
LVL 9

Author Comment

by:guswebb
ID: 39719975
Great, thanks for your assistance. Will give this some testing tomorrow but am fairly sure you've helped guide my thoughts in the right direction.
0
Comprehensive Backup Solutions for Microsoft

Acronis protects the complete Microsoft technology stack: Windows Server, Windows PC, laptop and Surface data; Microsoft business applications; Microsoft Hyper-V; Azure VMs; Microsoft Windows Server 2016; Microsoft Exchange 2016 and SQL Server 2016.

 
LVL 48

Expert Comment

by:PortletPaul
ID: 39720462
>>the ideal scenario being a table containing each UserID and alongside that a field that contains all of the QuestionIDs open to that person (delimited by semi-colons e.g. 18;20;40;44,

I would not describe a separated string containing ids as ideal at all

You would be better off with a normalized design of 2 fields e.g.
UserID AnsweredQuestionID

then something like this becomes possible
(it will list all unanswered questions for a user):

select
q.*
from questions as q
left join answered as a on q.id = AnsweredQuestionID
                                     and a.userid = 1 --<< this would be "the logged-in user"
where a.id is null
0
 
LVL 9

Author Comment

by:guswebb
ID: 39720468
Thanks, but that doesn't take in to account the crux of this issue. Identifying which questions are answered or not is very easy (whether by your method or mine). The main problem here is that each question has a custom select statement assigned to it, so identifying if a user is eligible to answer that question requires that custom query to first of all be executed. This is the process that currently takes way too long and for which I needed a solution. You've completely overlooked that being the core problem!
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39720869
You nominated the 'ideal' which includes a concatenated string. All I am really suggesting is that a concatenated string isn't ideal. My suggestion does not alter the data you store - just the way you store it.
0
 
LVL 9

Author Comment

by:guswebb
ID: 39720926
You have still not grasped the core problem. How will the array of open questions *applicable to the current user* (bearing in mind not all questions are open to all users) be determined and stored for easy access at the point that the user logs in to their account and the page needs to show only those questions that are relevant to them?
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39722617
I'm suggesting you use a normalize design instead of a concatenated string so that you don't need to use replace()

You are free to ignore that suggestion and continue with the concatenated string approach.
0
 
LVL 9

Author Comment

by:guswebb
ID: 39722667
The options are:

Concatenated string in new table containing 2 columns and 200,000 rows (1 row per User ID)

OR

Individual rows per UserID and questionID in new table containing 2 columns and 200,000,000 rows (200,000 UserIDs x 1,000 QuestionIDs)

Why would the 2nd option be better?
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39722754
>>" populate another table that associates user id and question id where the question hasn't been answered yet .." (ID: 39719920)

The table is for questions that has not been answered yet.

Let's say you have 10 unanswered questions for each user (i.e. 2,000,000 data points)
You could handle these as 200,000 concatenated strings of 10 items each, or as 2,000,000 rows if normalized.

When a question is answered by a user a record in this normalized table should be deleted.

If a question passes its due date, deleting a set of records that reference that question id will be simpler than scanning/manipulating a concatenated string.

Indexing a normalized design will be more effective for seeks than scanning concatenated strings.
0
 
LVL 9

Author Comment

by:guswebb
ID: 39722784
Point taken however at this point there are 1,000 open questions and 200,000 users so the table will have 200m rows. I assumed this would not be an efficient object to be interacting with vs. a table of 200,000 rows and simply manipulating the concatenated strings within the ASP page itself (only show questions whose IDs feature in the parsed QuestionID string for the currently logged in user).
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39722835
If you have no data to limit the result right now you would get 200m rows and while that isn't impossible to deal with it wouldn't be wonderful

but neither will 1,000 question id's in a string be wonderful.

I'll leave you to ponder this now as you seem more comfortable with the strings.
0
 
LVL 9

Author Comment

by:guswebb
ID: 39722850
I have been pondering over both routes for several days. My main objective is to avoid excessive processing time and delays when users access their accounts and this content has to load. Pulling a string from one field and parsing it just seems to me to be the leaner option but I will very likely trial both routes. Thanks for your input.
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

IntroductionWhile developing web applications, a single page might contain many regions and each region might contain many number of controls with the capability to perform  postback. Many times you might need to perform some action on an ASP.NET po…
Use this article to create a batch file to backup a Microsoft SQL Server database to a Windows folder.  The folder can be on the local hard drive or on a network share.  This batch file will query the SQL server to get the current date & time and wi…
This video discusses moving either the default database or any database to a new volume.
This video gives you a great overview about bandwidth monitoring with SNMP and WMI with our network monitoring solution PRTG Network Monitor (https://www.paessler.com/prtg). If you're looking for how to monitor bandwidth using netflow or packet s…

706 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

20 Experts available now in Live!

Get 1:1 Help Now