stopm
asked on
Sql (oracle) select distinct on two rows and order by count
Hi
In one table i have two columns 'Names' and 'Numbers' .
There are multiple rows for the unique 'Names' and they can have with different 'Numbers'
eg
Names | Number |
John 1234
John 1234
John 1234
John 4567
John 6789
I want a distinct list of the 'Names' and a count of how many different 'Numbers' each name has ordered highest to lowest so in example above
Names | Count
John 3
There is also a stored date field so only want results for last 24 hours eg stored > sysdate -1.
Preferably though i would like to include the 'Numbers' field as well but still order by highest count first
eg
Names | Numbers
John 1234
John 4567
John 6789
Steve 2312
Steve 4765
Probably a real simple one but combinations i tried didn't get results i thought they would.
In one table i have two columns 'Names' and 'Numbers' .
There are multiple rows for the unique 'Names' and they can have with different 'Numbers'
eg
Names | Number |
John 1234
John 1234
John 1234
John 4567
John 6789
I want a distinct list of the 'Names' and a count of how many different 'Numbers' each name has ordered highest to lowest so in example above
Names | Count
John 3
There is also a stored date field so only want results for last 24 hours eg stored > sysdate -1.
Preferably though i would like to include the 'Numbers' field as well but still order by highest count first
eg
Names | Numbers
John 1234
John 4567
John 6789
Steve 2312
Steve 4765
Probably a real simple one but combinations i tried didn't get results i thought they would.
oops... missed some requirements. The sample data threw me off.
I'm confused.
You first want the name and a count of distinct numbers then a list of names and distinct numbers.
Which is it? I see these as two different selects.
Can you clarify?
I'm confused.
You first want the name and a count of distinct numbers then a list of names and distinct numbers.
Which is it? I see these as two different selects.
Can you clarify?
OK, see if this test case gives you what you need.
It doesn't have the date logic but I provided that in the post above.
If there are problems, please add to the test case and explain what you are adding and why.
It doesn't have the date logic but I provided that in the post above.
If there are problems, please add to the test case and explain what you are adding and why.
drop table tab1 purge;
create table tab1 (names varchar2(5), mynumber number);
insert into tab1 values('John',1234);
insert into tab1 values('John',1234);
insert into tab1 values('John',1234);
insert into tab1 values('John',4567);
insert into tab1 values('John',6789);
insert into tab1 values('John',5);
--
insert into tab1 values('Steve',2);
insert into tab1 values('Steve',3);
insert into tab1 values('Steve',3);
commit;
select names, max(number_count) from
(
select names,
mynumber,
count(*) over(partition by names, mynumber order by mynumber) number_count
from tab1
)
group by names
order by 2 desc
/
select names, mynumber from
(
select distinct names, mynumber, number_count
from
(
select names,
mynumber,
count(*) over(partition by names, mynumber order by mynumber) number_count
from tab1
)
order by names,number_count desc
)
/
I'm not an Oracle expert, this works in MS SQL Server (using a derived-table):
SELECT DISTINCT
a.Name
, a.Number
, b.RowCount
FROM SomeTable a
JOIN ( SELECT Name
, Number
, COUNT(*) as RowCount
FROM SomeTable
GROUP
BY Name
, Number
) b ON a.Name = b.Name
AND a.Number = b.Number
OUTPUT:
Name Number RowCount
------ ------ --------
John 1234 3
John 4567 1
John 6789 1
Steve 2312 1
Steve 4765 1
>>I'm not an Oracle expert, this works in MS SQL Server (using a derived-table):
Syntax also appears to work in Oracle in my 11gR2 database.
It hits the table twice which could be an issue. The examples I posted should only hit it once.
Syntax also appears to work in Oracle in my 11gR2 database.
It hits the table twice which could be an issue. The examples I posted should only hit it once.
This variant also works in 11g (no nesting)
SELECT DISTINCT
name
, mynumber
, count(*) over(partition BY name, mynumber) AS number_count
FROM Table1
ORDER BY number_count DESC
but you do get all 3 columns returned
{+edit, order by isn't required in the count}
SELECT DISTINCT
name
, mynumber
, count(*) over(partition BY name, mynumber) AS number_count
FROM Table1
ORDER BY number_count DESC
but you do get all 3 columns returned
{+edit, order by isn't required in the count}
ASKER
Hi Slightwv,
So going back to your first sql will this one count blanks as i hadn't realised that some rows have a Name but no Number. (i don't want to count the blanks)
eg
Names Numbers
John 1234
John
John 1234
As for the other question, yes they are two different statements and my preferred one would be to return the names and the numbers with the Name that has the highest amount of combinations at the top of the list.
Names | Numbers
John 1234
John 4567
John 6789
Steve 2312
Steve 4765
Just going to try the other select statement you sent.
Thanks
So going back to your first sql will this one count blanks as i hadn't realised that some rows have a Name but no Number. (i don't want to count the blanks)
eg
Names Numbers
John 1234
John
John 1234
As for the other question, yes they are two different statements and my preferred one would be to return the names and the numbers with the Name that has the highest amount of combinations at the top of the list.
Names | Numbers
John 1234
John 4567
John 6789
Steve 2312
Steve 4765
Just going to try the other select statement you sent.
Thanks
yes, nulls will be counted
Names Numbers
John 1234
John
John 1234
should produce
John <null> 1
John 1234 2
Names Numbers
John 1234
John
John 1234
should produce
John <null> 1
John 1234 2
>> (i don't want to count the blanks)
Add "and numbers is not null" to the inner query?
If the column contains spaces:
and trim(numbers) is not null
Add "and numbers is not null" to the inner query?
If the column contains spaces:
and trim(numbers) is not null
ASKER
Sorry for delay in replying- Easter break
Most of the above queries give me a total count of all rows in the table of Names and Numbers but I need just the unique combinations of Names and Numbers ordered by the highest count of these first (as per example below count below 4,3,2)
Note:- The table has many more rows where the Names and the Numbers are identical but i don't need to know this number only the distinct pairings.
Eg:-
Names | Numbers | Count of unique combinations
John 1234 4
John 4567 4
John 6789 4
John 9876 4
Steve 2312 3
Steve 4765 3
Steve 5678 3
Colin 7777 2
Colin 6666 2
This query below from 'slightwv' does give me a count of the unique combinations of Names and Numbers but would be good if i can get as above and include the Numbers field as well or even just the Names and Numbers without the count field providing they are ordered in highest number of distinct combinations descending:-
select names, count(numbers) from ( select distinct name, numbers
from tab1 where numbers is not null))
group by names order by count(numbers) desc
Names | Count
John 4
Steve 3
Colin 2
Thanks, and hope this makes sense.
Most of the above queries give me a total count of all rows in the table of Names and Numbers but I need just the unique combinations of Names and Numbers ordered by the highest count of these first (as per example below count below 4,3,2)
Note:- The table has many more rows where the Names and the Numbers are identical but i don't need to know this number only the distinct pairings.
Eg:-
Names | Numbers | Count of unique combinations
John 1234 4
John 4567 4
John 6789 4
John 9876 4
Steve 2312 3
Steve 4765 3
Steve 5678 3
Colin 7777 2
Colin 6666 2
This query below from 'slightwv' does give me a count of the unique combinations of Names and Numbers but would be good if i can get as above and include the Numbers field as well or even just the Names and Numbers without the count field providing they are ordered in highest number of distinct combinations descending:-
select names, count(numbers) from ( select distinct name, numbers
from tab1 where numbers is not null))
group by names order by count(numbers) desc
Names | Count
John 4
Steve 3
Colin 2
Thanks, and hope this makes sense.
ASKER
Increased points as it looks a bit more complicated then i thought originally
Sorry but I still don't understand.
Looks like a simple tweak to the second query in http:#a40005407 returns the numbers and count.
Change:
select names, mynumber from
To:
select names, mynumber, number_count from
Making that change with the sample data above I get:
Also looks like the example from John_Vidmar also shows similar results even though it hits the table twice.
Instead of trying to explain what you want please provide sample raw data and expected results. Similar to the test case I posted. It has the base table and sample data. then all we need to see is the actual expected results.
Looks like a simple tweak to the second query in http:#a40005407 returns the numbers and count.
Change:
select names, mynumber from
To:
select names, mynumber, number_count from
Making that change with the sample data above I get:
NAMES MYNUMBER NUMBER_COUNT
----- ---------- ------------
John 1234 3
John 5 1
John 4567 1
John 6789 1
Steve 3 2
Steve 2 1
Also looks like the example from John_Vidmar also shows similar results even though it hits the table twice.
Instead of trying to explain what you want please provide sample raw data and expected results. Similar to the test case I posted. It has the base table and sample data. then all we need to see is the actual expected results.
count() over( ... order by ..)
there is no point or need for order by inside that over(...)
and nesting of the query isn't required either because distinct is a row operator (it's applied after all rows are known)
there is no point or need for order by inside that over(...)
and nesting of the query isn't required either because distinct is a row operator (it's applied after all rows are known)
SELECT DISTINCT
names
, mynumber
, count(*) over(partition BY names, mynumber) AS number_count
FROM Tab1
ORDER BY names, number_count DESC, mynumber ASC
**[Results][2]**:
| NAMES | MYNUMBER | NUMBER_COUNT |
|-------|----------|--------------|
| John | 1234 | 3 |
| John | 5 | 1 |
| John | 4567 | 1 |
| John | 6789 | 1 |
| Steve | 3 | 2 |
| Steve | 2 | 1 |
[1]: http://sqlfiddle.com/#!4/0eddb/3
ASKER
Thanks slightwv
Ref your comment :-
Regarding the second query in http:#a40005407
Yes that query does return the results i want , except it does not order them by highest number of unique combinations 1st it orders them by the Names field descending.
(In my table the Names are actually stored as digits to represent peoples names (eg John = 04, Steve =02 etc.. Thought it would be simpler to use names in example but maybe that was not such a good idea!)
Sorry actual data from my table is a bit sensitive to post on line.
-------------------------- ---------- ---------- ---------
Making these changes below only gave me count of the number of rows which i am not interested in. eg I don't need to know that John 1234 has 3 entries. I only want it to be above Steve because it has 4 unique entries and Steve only has 2
NAMES MYNUMBER NUMBER_COUNT
----- ---------- ------------
John 1234 3
Change:
select names, mynumber from
To:
select names, mynumber, number_count from
Thanks again
Ref your comment :-
Regarding the second query in http:#a40005407
Yes that query does return the results i want , except it does not order them by highest number of unique combinations 1st it orders them by the Names field descending.
(In my table the Names are actually stored as digits to represent peoples names (eg John = 04, Steve =02 etc.. Thought it would be simpler to use names in example but maybe that was not such a good idea!)
Sorry actual data from my table is a bit sensitive to post on line.
--------------------------
Making these changes below only gave me count of the number of rows which i am not interested in. eg I don't need to know that John 1234 has 3 entries. I only want it to be above Steve because it has 4 unique entries and Steve only has 2
NAMES MYNUMBER NUMBER_COUNT
----- ---------- ------------
John 1234 3
Change:
select names, mynumber from
To:
select names, mynumber, number_count from
Thanks again
>>" except it does not order them by highest number of unique combinations 1st it orders them by the Names field descending."
a. that is trivially easy to fix, just alter the order by clause to suit
b. that is what you actually asked for in the question: the provided list is ordered by name first
a. that is trivially easy to fix, just alter the order by clause to suit
SELECT DISTINCT
names
, mynumber
, count(*) over(partition BY names, mynumber) AS number_count
FROM Tab1
ORDER BY number_count DESC, names, mynumber ASC -- change line this to suit
, andb. that is what you actually asked for in the question: the provided list is ordered by name first
Preferably though i would like to include the 'Numbers' field as well but still order by highest count first
eg
Names | Numbers
John 1234
John 4567
John 6789
Steve 2312
Steve 4765
ASKER
Hi PotrletPaul
Your query is returning a count of all the rows as opposed to only a count of the unique combinations of Names and Numbers
eg. There are thousands of rows for John 123, John 4567 and John 6789 in the table but i don't need a count of all these rows i just need a count of the unique parings , in this example would = 3.
I only ever wanted the list ordered by highest number of these unique combinations 1st but grouped together by Names. So if 'Steve' has got more unique combinations then 'John' put all of 'Steve's' unique combinations at the top of the list above 'John's'
Sorry if that was not clear in my statement
Thanks
Your query is returning a count of all the rows as opposed to only a count of the unique combinations of Names and Numbers
eg. There are thousands of rows for John 123, John 4567 and John 6789 in the table but i don't need a count of all these rows i just need a count of the unique parings , in this example would = 3.
I only ever wanted the list ordered by highest number of these unique combinations 1st but grouped together by Names. So if 'Steve' has got more unique combinations then 'John' put all of 'Steve's' unique combinations at the top of the list above 'John's'
Sorry if that was not clear in my statement
Thanks
Please: As I posted in http:#a40013846
Instead of trying to explain what you want please provide sample raw data and expected results. Similar to the test case I posted. It has the base table and sample data. then all we need to see is the actual expected results.
Given all your explanations once we see the raw data along with the expected results we'll be able to post a working test case.
Instead of trying to explain what you want please provide sample raw data and expected results. Similar to the test case I posted. It has the base table and sample data. then all we need to see is the actual expected results.
Given all your explanations once we see the raw data along with the expected results we'll be able to post a working test case.
ASKER
Sorry as i said earlier i cant post the raw data on line.
>>Sorry as i said earlier i cant post the raw data on line.
I'm not asking for real data. Just dummy data that is close enough to your actual structure that when we post working SQL, you can apply it to your data.
You've provided dummy expected results in just about every post but you have never provided the raw dummy data to go with it.
What I am looking for is a sample test case like I posted in http:#a40005407
It has a table and sample data. The only thing I'm missing is:
Given that table and data, what results do you want from it.
Make it up but make it close to your actual situation.
For example say you are really working with payroll data and have:
table employee_pay(ssn varchar2(9), salary number)
111-22-3333, 10000000
666-44-9999, 10
...
Can you not post a dummy table here:
table fuzzy_pickle(flavor varchar2(9), calories number)
dill, 10
kohser, 500
...
Same table structure, same data types, similar data. All you need to do is provide the expected results from the data you provide.
Then any query against fuzzy_pickle would run against your salary table. Just change the names.
I'm not asking for real data. Just dummy data that is close enough to your actual structure that when we post working SQL, you can apply it to your data.
You've provided dummy expected results in just about every post but you have never provided the raw dummy data to go with it.
What I am looking for is a sample test case like I posted in http:#a40005407
It has a table and sample data. The only thing I'm missing is:
Given that table and data, what results do you want from it.
Make it up but make it close to your actual situation.
For example say you are really working with payroll data and have:
table employee_pay(ssn varchar2(9), salary number)
111-22-3333, 10000000
666-44-9999, 10
...
Can you not post a dummy table here:
table fuzzy_pickle(flavor varchar2(9), calories number)
dill, 10
kohser, 500
...
Same table structure, same data types, similar data. All you need to do is provide the expected results from the data you provide.
Then any query against fuzzy_pickle would run against your salary table. Just change the names.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
See? Paul's post above has table, data and results. nothing in it is 'real'.
If that post isn't what you are after, take that example, add or remove data from it and provide the results you are after.
Don't explain 'why' it doesn't work, just post the results as you want to see them.
If that post isn't what you are after, take that example, add or remove data from it and provide the results you are after.
Don't explain 'why' it doesn't work, just post the results as you want to see them.
utterly agree with slightwv
you have implied sample data, but not actually provided any
e.g. the result list at comment 40013818
(which is what I have based my latest queries on)
e.g. the following would have been clear, with less words
you have implied sample data, but not actually provided any
e.g. the result list at comment 40013818
(which is what I have based my latest queries on)
e.g. the following would have been clear, with less words
Sample Data:
Table1
NAMES NUMBERS
Colin 6666
Colin 6666
Colin 6666
Colin 7777
Colin 7777
Colin 7777
John 1234
John 1234
John 1234
John 4567
John 4567
John 4567
John 6789
John 6789
John 6789
John 9876
John 9876
John 9876
Steve 2312
Steve 2312
Steve 2312
Steve 4765
Steve 4765
Steve 4765
Steve 5678
Steve 5678
Steve 5678
Expected Result:
Names | Numbers | Count of unique combinations
John 1234 4
John 4567 4
John 6789 4
John 9876 4
Steve 2312 3
Steve 4765 3
Steve 5678 3
Colin 7777 2
Colin 6666 2
assuming of course that I have interpreted comment 40013818 correctly
Table1
NAMES NUMBERS
Colin 6666
Colin 6666
Colin 6666
Colin 7777
Colin 7777
Colin 7777
John 1234
John 1234
John 1234
John 4567
John 4567
John 4567
John 6789
John 6789
John 6789
John 9876
John 9876
John 9876
Steve 2312
Steve 2312
Steve 2312
Steve 4765
Steve 4765
Steve 4765
Steve 5678
Steve 5678
Steve 5678
Expected Result:
Names | Numbers | Count of unique combinations
John 1234 4
John 4567 4
John 6789 4
John 9876 4
Steve 2312 3
Steve 4765 3
Steve 5678 3
Colin 7777 2
Colin 6666 2
ASKER
Many Thanks PortletPaul your a star, both your query's look to work perfectly but still doing some testing to confirm.
Much appreciated
Much appreciated
Thanks for the award of points, but you have graded the answer as B. Quite frankly I cannot possibly see how that was warranted.
Did you have to perform extra work to get it to work for you?
Did you explain why you chose the B grading?
(And by the way, you have flagged the wrong comment as the accepted answer, which should have been comment ID: 40016346)
I see you quite often award B's and C's
You really need to read that article above. If you feel a question is simple to answer, adjust the points downward. You should not use grading for scaling the percieved difficulty.
I do realise reaching the answer probably took longer than you expected, but hopefully we have not only explained why that was true, but also shown you techniques to express your requirements in future.
What grade should I award?Did the answer lack information?
A B grade means the solution given lacked some information or required you to do a good amount of extra work to resolve the problem. When closing the question, the asker should explain why a B grade was awarded.
Did you have to perform extra work to get it to work for you?
Did you explain why you chose the B grading?
(And by the way, you have flagged the wrong comment as the accepted answer, which should have been comment ID: 40016346)
I see you quite often award B's and C's
You really need to read that article above. If you feel a question is simple to answer, adjust the points downward. You should not use grading for scaling the percieved difficulty.
I do realise reaching the answer probably took longer than you expected, but hopefully we have not only explained why that was true, but also shown you techniques to express your requirements in future.
ASKER
Hi SouthMod
After reading the Grade Award info i can see that i should have given this question an A grade, as the answer did fulfil all requirements.
Not sure if you can change retrospectively?
Thanks
After reading the Grade Award info i can see that i should have given this question an A grade, as the answer did fulfil all requirements.
Not sure if you can change retrospectively?
Thanks
@stopm
SouthMod did alter set the accepted answer to ID: 40016346 and the grading to A so there's no more to be done, but thank you for agreeing to that change.
If you did need to change something you may "Request Attention" (a link you will find under the question) and ask the moderators for assistance.
Paul
SouthMod did alter set the accepted answer to ID: 40016346 and the grading to A so there's no more to be done, but thank you for agreeing to that change.
If you did need to change something you may "Request Attention" (a link you will find under the question) and ask the moderators for assistance.
Paul
Open in new window