Solved

MSSQL Loop thru entire table

Posted on 2010-11-22
10
846 Views
Last Modified: 2012-05-10
I need to loop thru an entire table consisting of about 2million rows. I need to do some text matching routines on each row.

 Is cursor a good option? What are my options besides using cursor?

Thanks
0
Comment
Question by:noobe1
  • 2
  • 2
  • 2
  • +4
10 Comments
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 34186257
can you please clarify with data samples what you need to do, actually?
0
 
LVL 9

Expert Comment

by:s_chilkury
ID: 34186269
Use WHILE instead of cursor

WHILE <condition>
BEGIN
     <query>
     <set the operator>
END

//Check the following url's for reference:
//http://stackoverflow.com/questions/834582/while-loop-in-sql
//http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=104176

0
 
LVL 58

Expert Comment

by:cyberkiwi
ID: 34186426
For SQL Server 2005/2008, I would recommend either
(1) a set-based approach (if you can express the logic in a query fashion) or
(2) encapsulate the logic in a sql function, either as a scalar or generate rows using CROSS/OUTER APPLY
Even if you cannot process all 2 million records in one go, doing batches of 10,000 or so in a set will be more efficient by far then a straight cursor/loop one-by-one.
0
Complete VMware vSphere® ESX(i) & Hyper-V Backup

Capture your entire system, including the host, with patented disk imaging integrated with VMware VADP / Microsoft VSS and RCT. RTOs is as low as 15 seconds with Acronis Active Restore™. You can enjoy unlimited P2V/V2V migrations from any source (even from a different hypervisor)

 

Author Comment

by:noobe1
ID: 34186484
@angelIII:

Incoming data : Johar New Town Argon Binary

My table has 5 columns:
ID int
sName varchar(200)  eg - John
sLoca varchar(200) eg London
sDeci varchar(200) eg Hydrocarbon
sGen varchar(200) eg Tri Penta

I will first need to concat the 4 varchar columns into :
John London Hydrocarbon Tri Penta

Then I run a text matching routine on "Johar New Town Argon Binary" with "John London Hydrocarbon Tri Penta" to get a percentage matched. I will need to compare the incoming data with 2 million rows of data I have in the table.

Thanks
0
 

Author Comment

by:noobe1
ID: 34186534
@s_chilkury:
Will look into that.

@cyberkiwi:
Do you think set-based approach in my situation??

Thanks
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 34186539
0
 
LVL 58

Accepted Solution

by:
cyberkiwi earned 500 total points
ID: 34186568
Yes. If it is a one-off reporting function on some industrial hardward (6+cores, SAN) I would do it all in one query over a weeknight/end in about 2 hours or so (guesstimate).  concatenating columns is trivial

col1+isnull(' '+col2,'')+isnull(' '+col3, '')...
the match routine sounds like a straight scalar function that sql server can stream and parallelise easily.

select
-- top (x)
col1.. col4, dbo.mymatchfunction(col1+isnull(' '+col2,'')+isnull(' '+col3, '')..., 'other text') as percentagematch
from tbl2milrows
-- where idfield > [whereiwasupto]

or some such
If this is run regularly or if resources are limited (keep log file small), break it into smaller chunks using id markers
0
 
LVL 15

Expert Comment

by:Anuj
ID: 34186810
Remember SQL is build for Set based operations. Set-based operations always outperform the Row by row approach.  Cursors are Evil! you can avoid by using correlated sub-queries, or by using CTE(For SQL 2005 and Higher) or prefer using table variables or temp tables(you can create Indexes on it) and while loops.

0
 
LVL 6

Expert Comment

by:subhashpunia
ID: 34186820
First keep the concatenated value John London Hydrocarbon Tri Penta in a temp table and then CROSS JOIN it with main table on some ID field and get the matching percentage by using DIFFERENCE('string1', 'steing2') function.
0
 
LVL 2

Expert Comment

by:Dinesh Subramanian
ID: 34187342
If I am not wrong, the procedure you are trying to achieve is a massive programming, and this is a product of IBM called as Quality Stage, which is used for customer and address validation. First of all you have to take the individual percentage for the values entered in to the system, i mean using the occurrence methodology.

first, take the individual comparison values and union then, into a temp table.
sName where occurance are > 0 for the values ie "Johar"
sLoca where occurance are > 0 for the values ie "New Town"
sDeci where occurance are > 0 for the values ie "Argon"
sGen where occurance are > 0 for the values ie "Binary"

now combine the strings and compare with the concatenated string to get the percentage.
0

Featured Post

VMware Disaster Recovery and Data Protection

In this expert guide, you’ll learn about the components of a Modern Data Center. You will use cases for the value-added capabilities of Veeam®, including combining backup and replication for VMware disaster recovery and using replication for data center migration.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Introduction SQL Server Integration Services can read XML files, that’s known by every BI developer.  (If you didn’t, don’t worry, I’m aiming this article at newcomers as well.) But how far can you go?  When does the XML Source component become …
Ever wondered why sometimes your SQL Server is slow or unresponsive with connections spiking up but by the time you go in, all is well? The following article will show you how to install and configure a SQL job that will send you email alerts includ…
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.
Via a live example, show how to shrink a transaction log file down to a reasonable size.

813 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now