asked on

Sql (oracle) select distinct on two rows and order by count

Hi
In one table i have two columns 'Names' and 'Numbers' .
There are multiple rows for the unique 'Names' and they can have with different 'Numbers'

eg

Names | Number |
John 1234
John 1234
John 1234
John 4567
John 6789

I want a distinct list of the 'Names' and a count of how many different 'Numbers' each name has ordered highest to lowest so in example above

Names | Count
John 3

There is also a stored date field so only want results for last 24 hours eg stored > sysdate -1.

Preferably though i would like to include the 'Numbers' field as well but still order by highest count first

eg
Names | Numbers
John 1234
John 4567
John 6789
Steve 2312
Steve 4765

Probably a real simple one but combinations i tried didn't get results i thought they would.

slightwv (䄆 Netminder)

Try this:

select names, count(mynumber) from
( select distinct names,mynumber from tab1 where somedate_column >= trunc(sysdate-1)
)
group by names
/

Open in new window

slightwv (䄆 Netminder)

oops... missed some requirements. The sample data threw me off.

I'm confused.

You first want the name and a count of distinct numbers then a list of names and distinct numbers.

Which is it? I see these as two different selects.

Can you clarify?

slightwv (䄆 Netminder)

OK, see if this test case gives you what you need.

It doesn't have the date logic but I provided that in the post above.

If there are problems, please add to the test case and explain what you are adding and why.

drop table tab1 purge;
create table tab1 (names varchar2(5), mynumber number);

insert into tab1 values('John',1234);
insert into tab1 values('John',1234);
insert into tab1 values('John',1234);
insert into tab1 values('John',4567);
insert into tab1 values('John',6789);
insert into tab1 values('John',5);
--
insert into tab1 values('Steve',2);
insert into tab1 values('Steve',3);
insert into tab1 values('Steve',3);
commit;



select names, max(number_count) from 
(
select names,
	mynumber,
	count(*) over(partition by names, mynumber order by mynumber) number_count
from tab1
)
group by names
order by 2 desc
/



select names, mynumber from
(
select distinct names, mynumber, number_count
from
(
select names,
	mynumber,
	count(*) over(partition by names, mynumber order by mynumber) number_count
from tab1
)
order by names,number_count desc
)
/

Open in new window

John_Vidmar

I'm not an Oracle expert, this works in MS SQL Server (using a derived-table):

SELECT DISTINCT
	a.Name
,	a.Number
,	b.RowCount
FROM	SomeTable	a
JOIN	(	SELECT	Name
		,	Number
		,	COUNT(*) as RowCount
		FROM	SomeTable
		GROUP
		BY	Name
		,	Number
	)	b	ON	a.Name = b.Name
			AND	a.Number = b.Number

OUTPUT:

	Name	Number	RowCount
	------	------	--------
	John	1234	3
	John	4567	1
	John	6789	1
	Steve	2312	1
	Steve	4765	1

Open in new window

slightwv (䄆 Netminder)

>>I'm not an Oracle expert, this works in MS SQL Server (using a derived-table):

Syntax also appears to work in Oracle in my 11gR2 database.

It hits the table twice which could be an issue. The examples I posted should only hit it once.

PortletPaul

This variant also works in 11g (no nesting)

SELECT DISTINCT
name
, mynumber
, count(*) over(partition BY name, mynumber) AS number_count
FROM Table1
ORDER BY number_count DESC

but you do get all 3 columns returned

{+edit, order by isn't required in the count}

stopm

ASKER

Hi Slightwv,

So going back to your first sql will this one count blanks as i hadn't realised that some rows have a Name but no Number. (i don't want to count the blanks)

eg

Names Numbers
John 1234
John
John 1234

As for the other question, yes they are two different statements and my preferred one would be to return the names and the numbers with the Name that has the highest amount of combinations at the top of the list.

Names | Numbers
John 1234
John 4567
John 6789
Steve 2312
Steve 4765

Just going to try the other select statement you sent.
Thanks

PortletPaul

yes, nulls will be counted

Names Numbers
John 1234
John
John 1234

should produce

John <null> 1
John 1234 2

slightwv (䄆 Netminder)

>> (i don't want to count the blanks)

Add "and numbers is not null" to the inner query?

If the column contains spaces:
and trim(numbers) is not null

stopm

ASKER

Sorry for delay in replying- Easter break

Most of the above queries give me a total count of all rows in the table of Names and Numbers but I need just the unique combinations of Names and Numbers ordered by the highest count of these first (as per example below count below 4,3,2)
Note:- The table has many more rows where the Names and the Numbers are identical but i don't need to know this number only the distinct pairings.

Eg:-

Names | Numbers | Count of unique combinations
John 1234 4
John 4567 4
John 6789 4
John 9876 4
Steve 2312 3
Steve 4765 3
Steve 5678 3
Colin 7777 2
Colin 6666 2

This query below from 'slightwv' does give me a count of the unique combinations of Names and Numbers but would be good if i can get as above and include the Numbers field as well or even just the Names and Numbers without the count field providing they are ordered in highest number of distinct combinations descending:-

select names, count(numbers) from ( select distinct name, numbers
from tab1 where numbers is not null))
group by names order by count(numbers) desc

Names | Count
John 4
Steve 3
Colin 2

Thanks, and hope this makes sense.

stopm

ASKER

Increased points as it looks a bit more complicated then i thought originally

slightwv (䄆 Netminder)

Sorry but I still don't understand.

Looks like a simple tweak to the second query in http:#a40005407 returns the numbers and count.

Change:
select names, mynumber from

To:
select names, mynumber, number_count from

Making that change with the sample data above I get:

NAMES   MYNUMBER NUMBER_COUNT
----- ---------- ------------
John        1234            3
John           5            1
John        4567            1
John        6789            1
Steve          3            2
Steve          2            1

Open in new window

Also looks like the example from John_Vidmar also shows similar results even though it hits the table twice.

Instead of trying to explain what you want please provide sample raw data and expected results. Similar to the test case I posted. It has the base table and sample data. then all we need to see is the actual expected results.

PortletPaul

count() over( ... order by ..)
there is no point or need for order by inside that over(...)

and nesting of the query isn't required either because distinct is a row operator (it's applied after all rows are known)

    SELECT DISTINCT
            names
          , mynumber
          , count(*) over(partition BY names, mynumber) AS number_count
    FROM Tab1
    ORDER BY names, number_count DESC, mynumber ASC
    
    

**[Results][2]**:
    
    | NAMES | MYNUMBER | NUMBER_COUNT |
    |-------|----------|--------------|
    |  John |     1234 |            3 |
    |  John |        5 |            1 |
    |  John |     4567 |            1 |
    |  John |     6789 |            1 |
    | Steve |        3 |            2 |
    | Steve |        2 |            1 |



  [1]: http://sqlfiddle.com/#!4/0eddb/3

Open in new window

stopm

ASKER

Thanks slightwv

Ref your comment :-
Regarding the second query in http:#a40005407

Yes that query does return the results i want , except it does not order them by highest number of unique combinations 1st it orders them by the Names field descending.
(In my table the Names are actually stored as digits to represent peoples names (eg John = 04, Steve =02 etc.. Thought it would be simpler to use names in example but maybe that was not such a good idea!)

Sorry actual data from my table is a bit sensitive to post on line.

-------------------------------------------------------
Making these changes below only gave me count of the number of rows which i am not interested in. eg I don't need to know that John 1234 has 3 entries. I only want it to be above Steve because it has 4 unique entries and Steve only has 2

NAMES MYNUMBER NUMBER_COUNT
----- ---------- ------------
John 1234 3

Change:
select names, mynumber from

To:
select names, mynumber, number_count from

Thanks again

PortletPaul

>>" except it does not order them by highest number of unique combinations 1st it orders them by the Names field descending."

a. that is trivially easy to fix, just alter the order by clause to suit

    SELECT DISTINCT
            names
          , mynumber
          , count(*) over(partition BY names, mynumber) AS number_count
    FROM Tab1
    ORDER BY number_count DESC, names, mynumber ASC -- change line this to suit

Open in new window

, and

b. that is what you actually asked for in the question: the provided list is ordered by name first

Preferably though i would like to include the 'Numbers' field as well but still order by highest count first

eg
Names | Numbers
John 1234
John 4567
John 6789
Steve 2312
Steve 4765

stopm

ASKER

Hi PotrletPaul

Your query is returning a count of all the rows as opposed to only a count of the unique combinations of Names and Numbers
eg. There are thousands of rows for John 123, John 4567 and John 6789 in the table but i don't need a count of all these rows i just need a count of the unique parings , in this example would = 3.
I only ever wanted the list ordered by highest number of these unique combinations 1st but grouped together by Names. So if 'Steve' has got more unique combinations then 'John' put all of 'Steve's' unique combinations at the top of the list above 'John's'

Sorry if that was not clear in my statement

Thanks

slightwv (䄆 Netminder)

Please: As I posted in http:#a40013846

Instead of trying to explain what you want please provide sample raw data and expected results. Similar to the test case I posted. It has the base table and sample data. then all we need to see is the actual expected results.

Given all your explanations once we see the raw data along with the expected results we'll be able to post a working test case.

stopm

ASKER

Sorry as i said earlier i cant post the raw data on line.

slightwv (䄆 Netminder)

>>Sorry as i said earlier i cant post the raw data on line.

I'm not asking for real data. Just dummy data that is close enough to your actual structure that when we post working SQL, you can apply it to your data.

You've provided dummy expected results in just about every post but you have never provided the raw dummy data to go with it.

What I am looking for is a sample test case like I posted in http:#a40005407

It has a table and sample data. The only thing I'm missing is:
Given that table and data, what results do you want from it.

Make it up but make it close to your actual situation.

For example say you are really working with payroll data and have:
table employee_pay(ssn varchar2(9), salary number)
111-22-3333, 10000000
666-44-9999, 10
...

Can you not post a dummy table here:
table fuzzy_pickle(flavor varchar2(9), calories number)
dill, 10
kohser, 500
...

Same table structure, same data types, similar data. All you need to do is provide the expected results from the data you provide.

Then any query against fuzzy_pickle would run against your salary table. Just change the names.

ASKER CERTIFIED SOLUTION

PortletPaul

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

slightwv (䄆 Netminder)

See? Paul's post above has table, data and results. nothing in it is 'real'.

If that post isn't what you are after, take that example, add or remove data from it and provide the results you are after.

Don't explain 'why' it doesn't work, just post the results as you want to see them.

PortletPaul

utterly agree with slightwv

you have implied sample data, but not actually provided any
e.g. the result list at comment 40013818

(which is what I have based my latest queries on)

e.g. the following would have been clear, with less words

Sample Data:
Table1
NAMES NUMBERS
Colin 6666
Colin 6666
Colin 6666
Colin 7777
Colin 7777
Colin 7777
John 1234
John 1234
John 1234
John 4567
John 4567
John 4567
John 6789
John 6789
John 6789
John 9876
John 9876
John 9876
Steve 2312
Steve 2312
Steve 2312
Steve 4765
Steve 4765
Steve 4765
Steve 5678
Steve 5678
Steve 5678

Expected Result:
Names | Numbers | Count of unique combinations
John 1234 4
John 4567 4
John 6789 4
John 9876 4
Steve 2312 3
Steve 4765 3
Steve 5678 3
Colin 7777 2
Colin 6666 2

assuming of course that I have interpreted comment 40013818 correctly

stopm

ASKER

Many Thanks PortletPaul your a star, both your query's look to work perfectly but still doing some testing to confirm.
Much appreciated

PortletPaul

Thanks for the award of points, but you have graded the answer as B. Quite frankly I cannot possibly see how that was warranted.

What grade should I award?

A B grade means the solution given lacked some information or required you to do a good amount of extra work to resolve the problem. When closing the question, the asker should explain why a B grade was awarded.

Did the answer lack information?
Did you have to perform extra work to get it to work for you?
Did you explain why you chose the B grading?

(And by the way, you have flagged the wrong comment as the accepted answer, which should have been comment ID: 40016346)

I see you quite often award B's and C's
You really need to read that article above. If you feel a question is simple to answer, adjust the points downward. You should not use grading for scaling the percieved difficulty.

I do realise reaching the answer probably took longer than you expected, but hopefully we have not only explained why that was true, but also shown you techniques to express your requirements in future.

stopm

ASKER

Hi SouthMod

After reading the Grade Award info i can see that i should have given this question an A grade, as the answer did fulfil all requirements.

Not sure if you can change retrospectively?
Thanks

PortletPaul

@stopm

SouthMod did alter set the accepted answer to ID: 40016346 and the grading to A so there's no more to be done, but thank you for agreeing to that change.

If you did need to change something you may "Request Attention" (a link you will find under the question) and ask the moderators for assistance.

Paul