Link to home
Start Free TrialLog in
Avatar of AsNanda
AsNanda

asked on

Need a cursor for Select Statement

Hi I have a select statement that runs against 2 tables , these two tables have more then 5 milliion records, tableA has 4.5 mill and tableB has around 50000.
The columns in the tables are properly indexed but it takes a lot of time the way i am doing the select query.Can someone help this to make it faster.
tableA
name
age
zip
zip_4

tableB
city
zip
zip_4

I need to select the data from both the tables where the zips match but the way the data is in secod table
i hv
city   zip           zip_4
a      98765       9999
b      98765       1234
c       98765       2345
d      88989        9999
e      67676        9999

table A has the data in a different way they have all zip_4 in XXXX but not 9999

so i have to run two passes
the first select does

here is the query
select * from tableA A,table B B
where a.zip+a.zip_4=b.zip+b.zip_4

union

select * from tableA A,table B B
where a.zip=b.zip
and
b.zip+b.zip_4  not in
(
select b.zip+b.zip_4 from tableA A,table B B
where a.zip+a.zip_4=b.zip+b.zip_4
)

Can someone tell me an easier way so that query accessing 4.5 mill records doesnt take that long ..
I finaly need to build a table out of this data ... i run a truncate / insert statement .. or something ...
Thanks
Ash
Avatar of Scott Pletcher
Scott Pletcher
Flag of United States of America image

Sorry, but it looks to me like this query is the equivalent of what you are doing:

SELECT *
FROM tableA A
INNER JOIN tableB B ON B.zip = A.zip


I don't fully understand this:
>> table A has the data in a different way they have all zip_4 in XXXX but not 9999 <<


If this is incorrect, perhaps you could give some sample data from the other table and the expected results from the combined data?

Avatar of rafrancisco
rafrancisco

Try this:

SELECT Name, Age, ZIP, ZIP_4, ISNULL(B.City, C.City) AS City
FROM TableA A LEFT OUTER JOIN TableB B
                                     ON A.ZIP = B.ZIP AND A.ZIP_4 = B.ZIP_4
                       LEFT OUTER JOIN TableC C
                                     ON A.ZIP = C.ZIP
Ooops, the second LEFT OUTER JOIN should be TableB and not TableC

SELECT Name, Age, ZIP, ZIP_4, ISNULL(B.City, C.City) AS City
FROM TableA A LEFT OUTER JOIN TableB B
                                     ON A.ZIP = B.ZIP AND A.ZIP_4 = B.ZIP_4
                       LEFT OUTER JOIN TableB C
                                     ON A.ZIP = C.ZIP
Don't use
a.zip+a.zip_4=b.zip+b.zip_4
but
a.zip=b.zip and a.zip_4=b.zip_4
Avatar of AsNanda

ASKER

When i say the data is different i mean that in table B the Cities i have are

City    Zip        zip_4
City1  98765    9999
City2  98765    1245
City3  98765    2345
City4  98789    9999
City5  98666    9999

What i am trying to say is that every 4 digit zip code will belong to the same city if the Zip_4='9999' but its possible that their are zip codes(Zip_4) in their which belong to another city i.e in the example of Zip 98765 where 987651245 belong to city2 and 987652345 belong to city3. So all the Zips from 987650001-987651244 belong to City1 as Zip_4 is 9999 for City 1 but then Zip_4=1245 goes in to City2.
 So technically 9999(in table B) is just to minimize the amount of data and not include every zip code.
But my table A does not have 9999 in Zip_4 i.e an actual zip_4 is used their.

So i will have

98765     0001
98765     0002
98765     0003
....
98765     1245
.....
98789     0001
98789     0002
...
and so on
Now for me to get the exact zips and not duplicating the data the only way i can do it by doing the multiple pass i am doing...a match for all the 9 digits and get their city and then match the fives digits of the zip excluding the 9 digists i have matched
Does this makes sense?
Ash
Try this.  This is the same as the one above except that I added the 9999 in the second LEFT OUTER JOIN:

SELECT Name, Age, ZIP, ZIP_4, ISNULL(B.City, C.City) AS City
FROM TableA A LEFT OUTER JOIN TableB B
                                     ON A.ZIP = B.ZIP AND A.ZIP_4 = B.ZIP_4
                       LEFT OUTER JOIN TableB C
                                     ON A.ZIP = C.ZIP AND C.ZIP_4 = '9999'
Avatar of AsNanda

ASKER

Francisco,
The format of the table does not exactly match the way i put the table strucutes , i have more columns then just column city i want to pull in like state,timeZone,Area,Region...etc..etc from table B, how would i include those?
Ash
You can try this:

SELECT *
FROM TableA A INNER JOIN TableB B
                                     ON A.ZIP = B.ZIP AND A.ZIP_4 = B.ZIP_4
UNION ALL
SELECT *
FROM TableA A INNER JOIN TableB B
                                     ON A.ZIP = B.ZIP AND B.ZIP_4 = '9999'
WHERE NOT EXISTS (SELECT 'X' FROM TableB C
                               WHERE A.ZIP = C.ZIP AND A.ZIP_4 = C.ZIP_4)
Select * from TableA A INNER JOIN TableB B
on A.ZIP = B.ZIP and A.ZIP_4 = case B.ZIP_4 when '9999' then A.ZIP_4 else B.ZIP_4

This will force any TableA row that matches on B.ZIP and does not match on B.Zip_4 to match to the '9999' row in TableB instead.


Since IF an A.ZIP row matches a '9999' row in TableB on the ZIP field, then the exact match of the TableA row (all nine digits) will never occur in TableB.  These are two exclusive sets, or I am missing something.


I'm not sure what you are asking in the follow-up on columns for State, timeZone, unless you are still concerned about the UNION, which I eliminated.

Tom


folderol, your query will work but it will contain duplicate records from tableA.  Based on the sample data in tableB

City    Zip        zip_4
City1  98765    9999
City2  98765    1245
City3  98765    2345
City4  98789    9999
City5  98666    9999

A record in tableA where the ZIP = 98765 and ZIP_4 = 1245 will produce 2 records in the output, one for City1 and another for City2.
Avatar of AsNanda

ASKER


Yes
Francisco is right if we just do the comparison your way tom it will produce 2 rows
if tableA had jim 21 91765 1245

Name Age   City1    Zip   Zip_4
jim     21     City1   91765 9999
jim     21     City2   91765 1245

but jim can belong toonly 1 city and that is with Zip_4 1245.

I ran your query Francisco but it is not returining all the records ... Some of them are missing ..
What will happen if either a Zip or Zip_4 is null?
Ash
ASKER CERTIFIED SOLUTION
Avatar of rafrancisco
rafrancisco

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I strongly suspect this is already true, but just to be sure:
For best performance, tableB should have an index on (zip, zip_4) [in that order].
[TableA very probably does not need an index for this query.]
Avatar of AsNanda

ASKER

Yes i have the indexes Franciscos Q changed the time to 4 mins... i still hv to test Scotts query ...
I will let u guys know how it worked out..
thanks
Ash