katrina_mc
asked on
Write a SQL query to select only rows from a table where 2 columns together are distinct.
1) The first part of the question is, how do I write a SQL query to select only rows from a database table where 2 columns together are distinct?
2)Does anyone know if there is a way to save the results of the SELECT query & insert the rows a few at a time so that the order of the rows is preserved?
Once I have selected only rows where the 2 columns together are unique, I want to insert those rows into an Oracle database. I'm migrating data from Access to Oracle & inserting it by writing a utility in ColdFusion that does all of the Selects from Access & Inserts into Oracle. The problem is, when I do an insert, I keep getting an error in Oracle that says, "maximum number of cursors exceeded" & it stops inserting values half-way through. I have no control over the Oracle database, so my only option is to make this work through ColdFusion. Since the "maximum number of cursors exceeded" error only appears after inserting many rows, I figure if I can insert a few rows at a time then it will all be inserted.
Thanks,
Katrina
2)Does anyone know if there is a way to save the results of the SELECT query & insert the rows a few at a time so that the order of the rows is preserved?
Once I have selected only rows where the 2 columns together are unique, I want to insert those rows into an Oracle database. I'm migrating data from Access to Oracle & inserting it by writing a utility in ColdFusion that does all of the Selects from Access & Inserts into Oracle. The problem is, when I do an insert, I keep getting an error in Oracle that says, "maximum number of cursors exceeded" & it stops inserting values half-way through. I have no control over the Oracle database, so my only option is to make this work through ColdFusion. Since the "maximum number of cursors exceeded" error only appears after inserting many rows, I figure if I can insert a few rows at a time then it will all be inserted.
Thanks,
Katrina
There is no such thing as an order unless you specifically put an ORDER BY clause into the query - otherwise orders are database dependant.
A distinct works on all fields in the select statement. But you probably want other columns too. SO you have to first do a
select distinct col1, col2 from table
Then, if you want other columns too, you have to rejoin that back with the original table to get the results you want - in the menatime deciding how to handle the fields that are not distinct.
A distinct works on all fields in the select statement. But you probably want other columns too. SO you have to first do a
select distinct col1, col2 from table
Then, if you want other columns too, you have to rejoin that back with the original table to get the results you want - in the menatime deciding how to handle the fields that are not distinct.
ASKER
I don't think I explained the "order by" part of the question clearly. I want to write a query that selects all of the rows from a table in MS Access where the Project Number & the Tenant Number are together unique & then isert those rows into an Oracle table. There are other columns in this table besides Project Number & Tenant Number which I also want to insert into the Oracle table.
The reason I am concerned with the order of the original Select query is because I can only insert a few rows into Oracle at a time because I'm getting a "maximum number of open cursors exceeded" error. So I need to somehow cache or save the order of the original Select query results & insert rows 1 - 10, then rows 11 - 20, etc. Without any overlapping of the rows. Does this make sense?
The reason I am concerned with the order of the original Select query is because I can only insert a few rows into Oracle at a time because I'm getting a "maximum number of open cursors exceeded" error. So I need to somehow cache or save the order of the original Select query results & insert rows 1 - 10, then rows 11 - 20, etc. Without any overlapping of the rows. Does this make sense?
I don't believe your group by will work kkhipple,
SELECT DISTINCT(Column1, Column2)
FROM YourTable
Order By Column1 DESC
What is your query? Maybe we could help you fix what you have? So you don't get the error you're seeing?
~trail
SELECT DISTINCT(Column1, Column2)
FROM YourTable
Order By Column1 DESC
What is your query? Maybe we could help you fix what you have? So you don't get the error you're seeing?
~trail
Okay but what happens when project number and tenant number are the same twice like this:
PN TN Description
1 345 Inital work
2 425 Inital WOrk
1 345 COmpleted project
then a select distinct of PN and TN is 1 and 345 but what do you do with the description field?
PN TN Description
1 345 Inital work
2 425 Inital WOrk
1 345 COmpleted project
then a select distinct of PN and TN is 1 and 345 but what do you do with the description field?
Does your query transfer more than 50,000 records in a table? I haven't had a problem transfering records before, or even building large tables. Maybe it's your SQL or an error in it giving you the issues?
~trail
~trail
SELECT DISTINCT(Column1, Column2), Column3
SELECT DISTINCT (CorNumber), (CorLevel), CorName
I just tested this, for the rows with unique values you're still going to get dups!
Example..
PN TN Description
1 345 Inital work
1 345 COmpleted project
2 425 Inital WOrk
2 425 COmpleted project
3 555 Working
7 777 Completed
8 888 Comps
8 888 Initial Work
But you will avoid...
1 345 Inital work
1 345 Inital work
1 345
1 345 COmpleted project
2 425 Inital WOrk
2 425 COmpleted project
2 425 COmpleted project
2 425
2 425 COmpleted project
3 555 Working
7 777 Completed
8 888 Comps
8 888
8 888 Comps
8 888 Initial Work
So to answer this...
1) The first part of the question is, how do I write a SQL query to select only rows from a database table where 2 columns together are distinct?
You do SELECT DISTINCT (FirstDistinctColumn), (SecondDistinctColumn), ColumnThree, ColumnFour...
2)Does anyone know if there is a way to save the results of the SELECT query & insert the rows a few at a time so that the order of the rows is preserved?
can we see the query?
I just tested this, for the rows with unique values you're still going to get dups!
Example..
PN TN Description
1 345 Inital work
1 345 COmpleted project
2 425 Inital WOrk
2 425 COmpleted project
3 555 Working
7 777 Completed
8 888 Comps
8 888 Initial Work
But you will avoid...
1 345 Inital work
1 345 Inital work
1 345
1 345 COmpleted project
2 425 Inital WOrk
2 425 COmpleted project
2 425 COmpleted project
2 425
2 425 COmpleted project
3 555 Working
7 777 Completed
8 888 Comps
8 888
8 888 Comps
8 888 Initial Work
So to answer this...
1) The first part of the question is, how do I write a SQL query to select only rows from a database table where 2 columns together are distinct?
You do SELECT DISTINCT (FirstDistinctColumn), (SecondDistinctColumn), ColumnThree, ColumnFour...
2)Does anyone know if there is a way to save the results of the SELECT query & insert the rows a few at a time so that the order of the rows is preserved?
can we see the query?
If you wanted to insert something like this to your table
PN TN Description
1 345 Inital work
1 345 COmpleted project
2 425 Inital WOrk
2 425 COmpleted project
3 555 Working
7 777 Completed
8 888 Comps
8 888 Initial Work
Without the identical numbers
You could always do..
<cfoutput query="QRY" group="PN">
<cfquery name="InsrtQry" datasource="UrDSN">
INSERT INTO NewTable (Column1, Column2, Column3)
VALUES ('#PN#','#TN#', '#Description#')
</cfquery>
</cfoutput>
But you can't remove all duplicates because some fields are different in your rows, columns may be the same, but rows of info may vary, unless the rows are duplicate, you're going to have a hard time getting distinct values on more than one column.
~trail
PN TN Description
1 345 Inital work
1 345 COmpleted project
2 425 Inital WOrk
2 425 COmpleted project
3 555 Working
7 777 Completed
8 888 Comps
8 888 Initial Work
Without the identical numbers
You could always do..
<cfoutput query="QRY" group="PN">
<cfquery name="InsrtQry" datasource="UrDSN">
INSERT INTO NewTable (Column1, Column2, Column3)
VALUES ('#PN#','#TN#', '#Description#')
</cfquery>
</cfoutput>
But you can't remove all duplicates because some fields are different in your rows, columns may be the same, but rows of info may vary, unless the rows are duplicate, you're going to have a hard time getting distinct values on more than one column.
~trail
By the way the above will only do something like this
Insert this...
PN TN Description
1 345 Inital work
2 425 Inital WOrk
3 555 Working
7 777 Completed
8 888 Comps
from this...
PN TN Description
1 345 Inital work
1 345 COmpleted project
2 425 Inital WOrk
2 425 COmpleted project
3 555 Working
7 777 Completed
8 888 Comps
8 888 Initial Work
So you can see you may be missing some info in your new tables, it all depends on what you want. :o)
Insert this...
PN TN Description
1 345 Inital work
2 425 Inital WOrk
3 555 Working
7 777 Completed
8 888 Comps
from this...
PN TN Description
1 345 Inital work
1 345 COmpleted project
2 425 Inital WOrk
2 425 COmpleted project
3 555 Working
7 777 Completed
8 888 Comps
8 888 Initial Work
So you can see you may be missing some info in your new tables, it all depends on what you want. :o)
This is why I gave that example and asked for clarification - I don't think the asker realizes that there will be 2 records even with the select distinct if the description is different....
Yup gotcha! needs clarification ;^)
ASKER
The only columns that need to be unique together are PN & TN. Description can be repeated. The actual query I'm using now is SELECT * FROM TenantData, because I don't know how to only capture the records with unique Project Number & Tenant Number. If there is more than 1 record with the same Project Number & Tenant Number, then I want to only capture the 1st one. This is what I want:
PN TN Description
1 145 Completed
1 245 Completed
1 345 Initial Work
2 145 Completed
2 245 Initial Work
2 345 Initial Work
This is what I don't want:
PN TN Description
1 145 Anything
1 145 Anything
1 145 Anything
2 245 Anything
2 245 Anything
Thanks,
Katrina
PN TN Description
1 145 Completed
1 245 Completed
1 345 Initial Work
2 145 Completed
2 245 Initial Work
2 345 Initial Work
This is what I don't want:
PN TN Description
1 145 Anything
1 145 Anything
1 145 Anything
2 245 Anything
2 245 Anything
Thanks,
Katrina
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Then you should be able to loop in CF and do the insert one at a time like so:
<cfquery name="getRecords" datasource="yourfirstdsn">
SELECT DISTINCT ProjectNumber, TenantNumber, Description
FROM TenantData
ORDER BY ProjectNumber
</cfquery>
<cfloop query="getRecords">
<cfquery datasource="yourseconddsn" >
INSERT INTO newtable (ProjectNumber, TenantNumber, Description)
VALUES(#ProjectNumber#, #TenantNumber#, '#Description#')
</cfquery>
</cfloop>
<cfquery name="getRecords" datasource="yourfirstdsn">
SELECT DISTINCT ProjectNumber, TenantNumber, Description
FROM TenantData
ORDER BY ProjectNumber
</cfquery>
<cfloop query="getRecords">
<cfquery datasource="yourseconddsn"
INSERT INTO newtable (ProjectNumber, TenantNumber, Description)
VALUES(#ProjectNumber#, #TenantNumber#, '#Description#')
</cfquery>
</cfloop>
Thatnks for the points, I hope you were able to find you answer and get the query working the way you needed. Good luck and thanks!
Best Regards,
~trail
Best Regards,
~trail
ASKER
I'm not sure if the above solution actually works, because I ended up doing it a different way because of the unique situation I had where the Oracle database was abruptly ending my insert statement with the error, "maximum number of cursors exceeded."
If it is useful to anyone, I ended up doing it like this:
<cfquery name="getRecords" datasource="yourfirstdsn">
SELECT *
FROM TenantData
ORDER BY ProjectNumber, TenantNumber
</cfquery>
<cfloop query="getRecords">
<cfquery name="getDuplicates" datasource="yourseconddsn" >
SELECT ProjectNumber
FROM TenantData
WHERE ProjectNumber = #getRecords.ProjectNumber#
AND TenantNumber = #getRecords.TenantNumber#
</cfquery>
<cfif getDuplicates.recordcount EQ 0>
<cfquery datasource="yourseconddsn" >
INSERT INTO newtable (ProjectNumber, TenantNumber, Description)
VALUES(#ProjectNumber#, #TenantNumber#, '#Description#')
</cfquery>
</cfif>
</cfloop>
If it is useful to anyone, I ended up doing it like this:
<cfquery name="getRecords" datasource="yourfirstdsn">
SELECT *
FROM TenantData
ORDER BY ProjectNumber, TenantNumber
</cfquery>
<cfloop query="getRecords">
<cfquery name="getDuplicates" datasource="yourseconddsn"
SELECT ProjectNumber
FROM TenantData
WHERE ProjectNumber = #getRecords.ProjectNumber#
AND TenantNumber = #getRecords.TenantNumber#
</cfquery>
<cfif getDuplicates.recordcount EQ 0>
<cfquery datasource="yourseconddsn"
INSERT INTO newtable (ProjectNumber, TenantNumber, Description)
VALUES(#ProjectNumber#, #TenantNumber#, '#Description#')
</cfquery>
</cfif>
</cfloop>
SELECT
DISTINCT col1, COUNT( DISTINCT col2)
FROM
tableName
GROUP BY
col1
HAVING
COUNT(team_name) = '1'
What this does is grab the distinct values from col1. And adds up all the occurences of distinct values of col2. It will only display those where the number of col2 values is equal to 1
Not sure if this is exactly right but i think i might be on the right track