lanterv
asked on
CFML - dynamic query
CF9
MS SQL 2005
I have a query that's built dynamically. I need to find ways to make it more efficient. I'm thinking that I need to eliminate as much of the CF code as possible.
One of the includes is;
../assets/cf/jobagent_quer y_usregion .cfm
I thought someone might help me get started in the right direction. Maybe using a CASE statement.
MS SQL 2005
I have a query that's built dynamically. I need to find ways to make it more efficient. I'm thinking that I need to eliminate as much of the CF code as possible.
<cffunction name="QGetCurrentJobAgentJobCount" returnType="string">
<cfargument name="MemberID" required="True">
<cfset hitcheck = 0>
<cfquery name="getagents" datasource="#request.dsn_live#">
SELECT job_agents.*
FROM job_agents
where job_agents.userid = <cfqueryparam cfsqltype="cf_sql_integer" value="#val(arguments.MemberID)#">
</cfquery>
<cfoutput query="getagents">
<cfset USREGION = #getagents.USREGION#>
<cfset USSTATE = #getagents.USSTATE#>
<cfset COUNTRY = #getagents.COUNTRY#>
<cfset SPECIALTY = #getagents.SPECIALTY#>
<cfset POSITIONTYPE = #getagents.POSITIONTYPE#>
<cfset JOBTITLE = #getagents.JOBTITLE#>
<cfset KEYWORD = #getagents.KEYWORD#>
<cfset industry = #getagents.industry#>
<cfset JOBCLASS = #getagents.JOBCLASS#>
<cfset JOBCLASS = #getagents.industry#>
<cfset agentname = #getagents.agentname#>
<cfset agentID = #getagents.ID#>
<cfset taxproID = #getagents.UserID#>
<cfset hitcheck = 0>
<cfquery name="dbcheck" datasource="#request.dsn_live#">
SELECT id <!--- , USREGION, USSTATE, COUNTRY, SPECIALTY, JOBTITLE, POSITIONTYPE, industry, JOBCLASS, Status --->
FROM board_details
WHERE 0=0
<cfif trim(USREGION) NEQ ""> AND <cfinclude template="../assets/cf/jobagent_query_usregion.cfm"></cfif>
<cfif trim(USSTATE) NEQ ""> AND <cfinclude template="../assets/cf/jobagent_query_state.cfm"></cfif>
<cfif trim(COUNTRY) NEQ "">AND <cfinclude template="../assets/cf/jobagent_query_country.cfm"></cfif>
<cfif trim(SPECIALTY) NEQ ""> AND <cfinclude template="../assets/cf/jobagent_query_specialty.cfm"></cfif>
<cfif trim(JOBTITLE) NEQ ""> AND <cfinclude template="../assets/cf/jobagent_query_jobtitle.cfm"></cfif>
<cfif trim(POSITIONTYPE) NEQ ""> AND <cfinclude template="../assets/cf/jobagent_query_positiontype.cfm"></cfif>
<cfif trim(industry) NEQ "">
and
(
<cfset icounter = 0>
<cfloop list="#industry#" index="i">
<cfset icounter = icounter + 1>
<cfif icounter gt 1>or</cfif>
industry like '%#trim(i)#%'
</cfloop>
<cfif findnocase('All',industry)>
or industry is null or industry = ''
</cfif>
)
</cfif>
<cfif KEYWORD NEQ ""> AND ((COUNTRY Like '%#KEYWORD#%') or (USREGION Like '%#KEYWORD#%') or (USSTATE Like '%#KEYWORD#%') or (USCITY Like '%#KEYWORD#%') or (POSITIONTYPE Like '%#KEYWORD#%') or (SPECIALTY Like '%#KEYWORD#%') or (JOBTITLE Like '%#KEYWORD#%') or (companyname Like '%#KEYWORD#%') or (JOBAD Like '%#KEYWORD#%') or (COMPLOCATION Like '%#KEYWORD#%') or (COMPTITLE Like '%#KEYWORD#%'))</cfif>
<cfif JOBCLASS NEQ ""> AND JOBCLASS = '#JOBCLASS#'</cfif>
AND ((Status='Phase 2 - Job Board'))
<!--- ORDER BY searchweight desc, JOBDATE DESC --->
</cfquery>
<cfset hitcheck = #hitcheck# + #dbcheck.recordcount#>
FYI; Board_Hits_By_UserHittype is a view and Board_Hits has 18,913,743 rows in it
<cfloop query="dbcheck">
<cfquery name="checklog" datasource="#request.dsn_live#">
select count(id) as logcount
from Board_Hits_By_UserHittype
where jobid = #val(dbcheck.id)# and userid = #val(getagents.UserID)# and hittype = 'showjob'
</cfquery>
<cfset hitcheck = #hitcheck# - #checklog.logcount#>
</cfloop>
</cfoutput>
<cfreturn #hitcheck# />
</cffunction>
One of the includes is;
../assets/cf/jobagent_quer
<cfset USREGIONquery = "false">
(
<cfif #find("Any US",USREGION)#>
(USREGION IS NOT NULL)
<cfset USREGIONquery = "true">
<cfelse>
<cfif #find("Carolinas",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%Carolinas%')
<cfelse>
(USREGION Like '%Carolinas%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
<cfif #find("Mountain West",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%Mountain West%')
<cfelse>
(USREGION Like '%Mountain West%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
<cfif #find("Mid Atlantic",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%Mid Atlantic%')
<cfelse>
(USREGION Like '%Mid Atlantic%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
<cfif #find("Plains",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%Plains%')
<cfelse>
(USREGION Like '%Plains%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
<cfif #find("West Great Lakes",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%West Great Lakes%')
<cfelse>
(USREGION Like '%West Great Lakes%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
<cfif #find("Southeast",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%Southeast%')
<cfelse>
(USREGION Like '%Southeast%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
<cfif #find("Southwest",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%Southwest%')
<cfelse>
(USREGION Like '%Southwest%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
<cfif #find("Tri State",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%Tri State%')
<cfelse>
(USREGION Like '%Tri State%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
<cfif #find("Northern Cal",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%Northern Cal%')
<cfelse>
(USREGION Like '%Northern Cal%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
<cfif #find("Northwest",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%Northwest%')
<cfelse>
(USREGION Like '%Northwest%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
<cfif #find("Philly Up",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%Philly Up%')
<cfelse>
(USREGION Like '%Philly Up%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
<cfif #find("Southern Cal",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%Southern Cal%')
<cfelse>
(USREGION Like '%Southern Cal%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
<cfif #find("Great Lakes East",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%Great Lakes East%')
<cfelse>
(USREGION Like '%Great Lakes East%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
<cfif #find("Hawaii",USREGION)#>
<cfif USREGIONquery EQ "true">
or (USREGION Like '%Hawaii%')
<cfelse>
(USREGION Like '%Hawaii%')
<cfset USREGIONquery = "true">
</cfif>
</cfif>
or (USREGION Like '%All%')
</cfif>
)
I thought someone might help me get started in the right direction. Maybe using a CASE statement.
ASKER
I tried moving the following;
<cfquery name="checklog" datasource="#request.dsn_l ive#">
select count(id) as logcount
from Board_Hits_By_UserHittype
where jobid = #val(dbcheck.id)# and userid = #val(getagents.UserID)# and hittype = 'showjob'
</cfquery>
to;
AND ((Status='Phase 2 - Job Board'))
and (select count(id) as logcount
from Board_Hits_By_UserHittype as bh
where bh.jobid = id and bh.userid = #val(getagents.UserID)# and bh.hittype = 'showjob') = 0
and it takes many time longer to run. That makes it check board_hits for every row in board_details and board_hits has 18 million rows.
<cfquery name="checklog" datasource="#request.dsn_l
select count(id) as logcount
from Board_Hits_By_UserHittype
where jobid = #val(dbcheck.id)# and userid = #val(getagents.UserID)# and hittype = 'showjob'
</cfquery>
to;
AND ((Status='Phase 2 - Job Board'))
and (select count(id) as logcount
from Board_Hits_By_UserHittype as bh
where bh.jobid = id and bh.userid = #val(getagents.UserID)# and bh.hittype = 'showjob') = 0
and it takes many time longer to run. That makes it check board_hits for every row in board_details and board_hits has 18 million rows.
I was talking about reworking it so you have less subqueries - period - and definitely not within a loop. But I know that's a big job.
Honestly the above is pretty complex logic, so it's hard to visualize the actual queries that end up executing. Making it hard to give good SQL advice. Given that I'm on a deadline unfortunately, I'm not sure how much help I'm going to be. I don't know if you saw my edit, but it might help other experts (ie draw more attention to your question) if you could post an example of the generated SQL. So other experts can see how it all fits together.
Honestly the above is pretty complex logic, so it's hard to visualize the actual queries that end up executing. Making it hard to give good SQL advice. Given that I'm on a deadline unfortunately, I'm not sure how much help I'm going to be. I don't know if you saw my edit, but it might help other experts (ie draw more attention to your question) if you could post an example of the generated SQL. So other experts can see how it all fits together.
ASKER
Just learning how to " post the generated SQL at the bottom of the .cfm page." would be a coup for me.
You have to enable debugging in the CF Admin. Then all the queries that run will show at the bottom of the page.
ASKER
Three queries were ran inside the cfouput loop;
query
RESULTSET
query
ID
1 197322
2 197332
3 197298
4 197202
CACHED false
EXECUTIONTIME 0
SQL SELECT id FROM board_details WHERE 0=0 AND ( (USREGION IS NOT NULL) ) AND ( (USSTATE Like '%California%') or (USSTATE Like '%All%') ) AND ((Status='Phase 2 - Job Board'))
query
RESULTSET
query
ID
1 197324
CACHED false
EXECUTIONTIME 0
SQL SELECT id FROM board_details WHERE 0=0 AND ( (USREGION Like '%Mountain West%') or (USREGION Like '%All%') ) AND ( (SPECIALTY Like '%US Corporate Generalist%') ) AND ( (JOBTITLE Like '%Manager%') ) AND ( (POSITIONTYPE Like '%Full Time%') ) AND ((Status='Phase 2 - Job Board'))
query
RESULTSET
query
ID
1 197258
CACHED false
EXECUTIONTIME 16
SQL SELECT id FROM board_details WHERE 0=0 AND ( (USREGION Like '%Southeast%') or (USREGION Like '%All%') ) AND ( (USSTATE Like '%Georgia%') or (USSTATE Like '%All%') ) AND ( (COUNTRY Like '%US%') ) AND ( (SPECIALTY Like '%US Corporate Generalist%') or (SPECIALTY Like '%US Corporate Federal%') or (SPECIALTY Like '%US Corporate State and Local - Sales and Use%') or (SPECIALTY Like '%US Corporate International Outbound%') or (SPECIALTY Like '%Tax Technology/Systems%') ) AND ( (JOBTITLE Like '%Supervisor%') or (JOBTITLE Like '%Staff%') ) AND ( (POSITIONTYPE Like '%Full Time%') ) AND JOBCLASS = 'corporate' AND ((Status='Phase 2 - Job Board'))
hitcheck=1
query
RESULTSET
query
ID
1 197322
2 197332
3 197298
4 197202
CACHED false
EXECUTIONTIME 0
SQL SELECT id FROM board_details WHERE 0=0 AND ( (USREGION IS NOT NULL) ) AND ( (USSTATE Like '%California%') or (USSTATE Like '%All%') ) AND ((Status='Phase 2 - Job Board'))
query
RESULTSET
query
ID
1 197324
CACHED false
EXECUTIONTIME 0
SQL SELECT id FROM board_details WHERE 0=0 AND ( (USREGION Like '%Mountain West%') or (USREGION Like '%All%') ) AND ( (SPECIALTY Like '%US Corporate Generalist%') ) AND ( (JOBTITLE Like '%Manager%') ) AND ( (POSITIONTYPE Like '%Full Time%') ) AND ((Status='Phase 2 - Job Board'))
query
RESULTSET
query
ID
1 197258
CACHED false
EXECUTIONTIME 16
SQL SELECT id FROM board_details WHERE 0=0 AND ( (USREGION Like '%Southeast%') or (USREGION Like '%All%') ) AND ( (USSTATE Like '%Georgia%') or (USSTATE Like '%All%') ) AND ( (COUNTRY Like '%US%') ) AND ( (SPECIALTY Like '%US Corporate Generalist%') or (SPECIALTY Like '%US Corporate Federal%') or (SPECIALTY Like '%US Corporate State and Local - Sales and Use%') or (SPECIALTY Like '%US Corporate International Outbound%') or (SPECIALTY Like '%Tax Technology/Systems%') ) AND ( (JOBTITLE Like '%Supervisor%') or (JOBTITLE Like '%Staff%') ) AND ( (POSITIONTYPE Like '%Full Time%') ) AND JOBCLASS = 'corporate' AND ((Status='Phase 2 - Job Board'))
hitcheck=1
ASKER
Ha! I forgot all about that. But, I can't enable debug for this particular site anyway.
How long is the above currently taking? Also how many rows in each of the related tables?
ASKER
Job_Agents - 5767 rows
Board_Details - 6930 rows
Board_Hits - 18,743,918 rows
Overall page load is 1.8 seconds
This is just the "notifications" and will be added to all of the "home" pages. There may be 6 or 7 notification checks per home page.
Board_Details - 6930 rows
Board_Hits - 18,743,918 rows
Overall page load is 1.8 seconds
This is just the "notifications" and will be added to all of the "home" pages. There may be 6 or 7 notification checks per home page.
How much of the time is spent on the Board_Hits query? IF I'm reading it right, you're only using it for counts. Assuming the dbcheck query only returns a small number of id's, I'm thinking you could run a single query instead. Just pass in a list. ie Instead of using a loop here:
<cfloop query="dbcheck">
<cfquery name="checklog" datasource="#request.dsn_l ive#">
select count(id) as logcount
from Board_Hits_By_UserHittype
where jobid = #val(dbcheck.id)# and userid = #val(getagents.UserID)# and hittype = 'showjob'
</cfquery>
<cfset hitcheck = #hitcheck# - #checklog.logcount#>
</cfloop>
I *think* you could you do something like this instead:
<cfset idList = valueList(dbcheck.id)>
<cfif listLen(idList)>
<cfquery name="checklog" datasource="#request.dsn_l ive#">
select count(id) as logcount
from Board_Hits_By_UserHittype
where jobid IN (
<!--- change the SQL type if needed --->
<cfqueryparam value="#idList#" cfsqltype="cf_sql_integer" list="true">
)
and userid = #val(getagents.UserID)#
and hittype = 'showjob'
</cfquery>
<cfset hitcheck = hitcheck - checklog.logcount>
</cfif>
<cfloop query="dbcheck">
<cfquery name="checklog" datasource="#request.dsn_l
select count(id) as logcount
from Board_Hits_By_UserHittype
where jobid = #val(dbcheck.id)# and userid = #val(getagents.UserID)# and hittype = 'showjob'
</cfquery>
<cfset hitcheck = #hitcheck# - #checklog.logcount#>
</cfloop>
I *think* you could you do something like this instead:
<cfset idList = valueList(dbcheck.id)>
<cfif listLen(idList)>
<cfquery name="checklog" datasource="#request.dsn_l
select count(id) as logcount
from Board_Hits_By_UserHittype
where jobid IN (
<!--- change the SQL type if needed --->
<cfqueryparam value="#idList#" cfsqltype="cf_sql_integer"
)
and userid = #val(getagents.UserID)#
and hittype = 'showjob'
</cfquery>
<cfset hitcheck = hitcheck - checklog.logcount>
</cfif>
ASKER
Too slow.
checklog1 (Datasource=ttlive, Time=130701ms, Records=108) in D:\inetpub\wwwroot\Tt.com\ notificati ons.cfc @ 12:37:11.011
select id, jobid, userid, hittype
from Board_Hits_By_UserHittype
where userid = 36075 and hittype = 'showjob'
order by jobid
opposed to the original code;
checklog (Datasource=ttlive, Time=0ms, Records=1) in D:\inetpub\wwwroot\T.com\n otificatio ns.cfc @ 12:42:09.009
select count(id) as logcount
from Board_Hits_By_UserHittype
where jobid = 197322 and userid = 36075 and hittype = 'showjob'
There are a couple of indexes on board_hits. See the attached files.
index1.png
index2.png
checklog1 (Datasource=ttlive, Time=130701ms, Records=108) in D:\inetpub\wwwroot\Tt.com\
select id, jobid, userid, hittype
from Board_Hits_By_UserHittype
where userid = 36075 and hittype = 'showjob'
order by jobid
opposed to the original code;
checklog (Datasource=ttlive, Time=0ms, Records=1) in D:\inetpub\wwwroot\T.com\n
select count(id) as logcount
from Board_Hits_By_UserHittype
where jobid = 197322 and userid = 36075 and hittype = 'showjob'
There are a couple of indexes on board_hits. See the attached files.
index1.png
index2.png
(EDIT)
That first query is different than what I was suggesting. It's missing the JOBID filter. That's probably why it's so slow. Also, the final query should only return the count(id). (I'm guessing you just included the other columns temporarily for debugging.)
Say you have 3 jobs to check, instead of running a separate query for each jobid within a loop:
WHERE jobid = 197322 AND.. <=== first loop
WHERE jobid = 197333 AND.. <=== second loop
WHERE jobid = 197444 AND.. <=== third loop
Run one query for all 3 job id's instead:
WHERE jobid IN ( 197322 , 197333, 197444 ) AND..
Keep all the other filters, you just want to change
WHERE JOBID = #val(dbcheck.id)#
to :
WHERE JOBID IN ( #yourListOfJobIDs# )
instead. Take another look at the second example here to see how to build the single query. Be sure to use cfqueryparam on all your variable parameters. That'll help performance if you're running this function multiple times.
Let me look over the indexes.
That first query is different than what I was suggesting. It's missing the JOBID filter. That's probably why it's so slow. Also, the final query should only return the count(id). (I'm guessing you just included the other columns temporarily for debugging.)
Say you have 3 jobs to check, instead of running a separate query for each jobid within a loop:
WHERE jobid = 197322 AND.. <=== first loop
WHERE jobid = 197333 AND.. <=== second loop
WHERE jobid = 197444 AND.. <=== third loop
Run one query for all 3 job id's instead:
WHERE jobid IN ( 197322 , 197333, 197444 ) AND..
Keep all the other filters, you just want to change
WHERE JOBID = #val(dbcheck.id)#
to :
WHERE JOBID IN ( #yourListOfJobIDs# )
instead. Take another look at the second example here to see how to build the single query. Be sure to use cfqueryparam on all your variable parameters. That'll help performance if you're running this function multiple times.
Let me look over the indexes.
ASKER
I've been testing the two queries. The original only takes seconds. The second takes minutes. Can't seem to get around it.
Are you sure you modified the query correctly? Because I don't see how this:
<!--- note this uses a COUNT, not SELECT COLUMNS .... ->
select count(id) as logcount
from Board_Hits_By_UserHittype
where jobid IN ( 197322, 197323, 197324)
and userid = 36075 and hittype = 'showjob'
would be significantly slower than the cost and o'h of running this query 3x:
select count(id) as logcount
from Board_Hits_By_UserHittype
where jobid = 197322 and userid = 36075 and hittype = 'showjob'
Not unless you're passing in a lot more jobid's than shown in this thread. It shouldn't cause a significant change in the execution plan. SQL Server is usually pretty smart about indexes. When you look at the estimated execution plan in Management Studio, what does it say?
<!--- note this uses a COUNT, not SELECT COLUMNS .... ->
select count(id) as logcount
from Board_Hits_By_UserHittype
where jobid IN ( 197322, 197323, 197324)
and userid = 36075 and hittype = 'showjob'
would be significantly slower than the cost and o'h of running this query 3x:
select count(id) as logcount
from Board_Hits_By_UserHittype
where jobid = 197322 and userid = 36075 and hittype = 'showjob'
Not unless you're passing in a lot more jobid's than shown in this thread. It shouldn't cause a significant change in the execution plan. SQL Server is usually pretty smart about indexes. When you look at the estimated execution plan in Management Studio, what does it say?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
I was going to post a lot of results from testing. But, nothing runs as fast as the original code. I guess I'm looking at a rewrite. Maybe the most efficient thing I could do right now is reduce the log table that has 18 million rows. I really don't know what's in it and how they use it (other than the project I'm working on). You did answer my post. Very good tips.
ASKER
Great tips.
EDIT:
It's obviously a complex query, and I'm stuck w/deadline so I may not be the one to get you through the home stretch, but ... it would help others to see an example of the SQL queries the above code produces. ie Enable debugging, run the code with some parameters, then post the generated SQL at the bottom of the .cfm page.