Link to home
Start Free TrialLog in
Avatar of nmarano
nmarano

asked on

Better Query for large number of records

Experts-

I have a query that returns about 150,000 records that I need to insert into a table in my DB.  As I run the process, it hangs the server.  

Can someone recommend a better method to deal with this?  The insert starts to process, but the server is at a crawl during this.
<cfquery name="qryCampaignStats" datasource="#application.dsn#" blockfactor="50">
		SELECT DISTINCT s.api_vendor,S.Campaign_ID, S.stats_Cost, convert(varchar,S.Period_StartDay,101) as SpendDate, cast(a.referenceID as bigint) as refID, A.kw_client_Id AS ClientId, A.Campaign_Name, A.API_Campaign_ID,
		CASE WHEN C.isOnWatch = 1 THEN C.name + ' &dagger;' ELSE C.Name END AS name 
		FROM API_Campaign_Stats S
		LEFT JOIN API_Campaigns A ON S.Campaign_ID = A.API_Campaign_ID
		LEFT JOIN Clients C ON A.KW_Client_ID = C.id
		WHERE S.period_Type = 'Day' 
		AND (S.Period_StartDay BETWEEN '#form.pFromDate#' AND '#DateFormat(DateAdd("d",1,form.pToDate),"MM/DD/YY")#') 
		AND C.BrandNameID NOT IN (107,109)
    </cfquery>
    <!------><cfdump var="#qryCampaignStats#">
    <cftry>
    
    <cfoutput query="qryCampaignStats">
    <cfquery name="addCampaignStats" datasource="#application.dsn#">
    	insert into platinumASInformation(s_api_vendor,s_campaignID,s_statsCost,s_spendDate,a_referenceID,a_kwClientID,a_campaignName,a_apiCampaignID)
        values('#api_vendor#',#Campaign_ID#,#stats_Cost#,'#SpendDate#',#refID#,#ClientId#,'#Campaign_Name#',#API_Campaign_ID#)
    </cfquery>
    </cfoutput>

Open in new window


Any suggestions would be appreciated

Nick
ASKER CERTIFIED SOLUTION
Avatar of gdemaria
gdemaria
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of nmarano
nmarano

ASKER

Hey Maria-

Thanks...I'll give that a shot
Avatar of nmarano

ASKER

I use that query, "qryCampaignStats" to run query of queries off of further down in my code.  Will I still be able to use this with the insert?

Thanks
nick
No, when you add the insert, the cfquery will no longer return any values.

However, there are ways you can continue that would make the whole thing more efficient.   First do the insert/select to move all 150,000 records, then run a query to pull just the records you want instead of using a query-of-query.   The query-of-query is not efficient and is much slower than querying the database, so that will be another advantage to get rid of that, also I cannot image that you would want to fetch all 150,000 records in a query of query.
Hey nick, since the data you're working with would be so large I have a couple of recommendations: First, if you can't use gdemaria's recommendation (which is a better sql-only approach to moving all that data), you can run multiple statements in one cfquery block, this will send a bunch of transactions through one query request and will speed up your job considerably, however, you will run into cfquery's statement size limitation at some point so you won't be able to send all your 150,000 insert statements in one cfquery block.

How you approach this depends on what your needs are for this data transfer. Is this a one time thing? or is it a job that needs to be done on a regular basis?

If it's a one time thing, (and assuming I don't have access to any database tools) I would just run the transfer in smaller batches, (maybe 2000 or 5000 rows at a time).

If it's a regular thing, I would spend some time in writing a cfc to broker your transactions in a set of batches, I've done this as a function that takes a bunch of transactions at a time (a query with a specific set of rows) and builds a query with all those rows in one transaction, then submits them in one big cfquery object, the broker would break down your 150,000 row query into multiple cffunction calls of smaller batches at a time, that way it can all be run in one go and you're still within the query size limit.

Also, as gdemaria said, using query of queries should be used judiciously since there is a performance loss in using it instead of using straight query of queries.