Link to home
Start Free TrialLog in
Avatar of Russ Suter
Russ Suter

asked on

Get the 1st occurrence of a row based on column value

Given the data set in the attached Excel spreadsheet, how would I query against that data so I get only the 1st occurrence of Property_Id?

I already have a query that gets the data I need but it's pretty slow. I'm looking to get this data in the fastest way possible.

Or, to ask another way, given the following simple set of data...

Element_Id     Property_Id     Value
 2528          1               Company conameshort
-4096          1             
 2528          2               Company Name
-4096          2               Text
-4096          3               Company Text Field
 2528          5               6
-4096          5               3

Open in new window

How do I get this...
Element_Id     Property_Id     Value
 2528          1               Company conameshort
 2528          2               Company Name
-4096          3               Company Text Field
 2528          5               6

Open in new window


properties.xlsx

I'm using Microsoft SQL Server 2012
ASKER CERTIFIED SOLUTION
Avatar of slightwv (䄆 Netminder)
slightwv (䄆 Netminder)

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Russ Suter
Russ Suter

ASKER

Sorry, I forgot to mention. I need to do this using a CROSS APPLY. I cannot use any windowed functions to perform this task due to performance reasons.
How about doing the cross apply in an inner query and using row_number on an outer query?

pseudo code:
select element_id, property_id, value
from (
  select element_id, property_id, value,
    row_number() over(partition by property_id order by property_id, element_id desc) rn
   from (
         select blah, blah blah from junk1 cross apply junk2
       ) t2
) t1
where rn=1
 

Open in new window

That still uses ROW_NUMBER() which I can't use.
As you know, I'm not really a SQL Server person.  I was thinking by moving the window function to an outer level and keeping the cross join inside, performance might not be an issue.

Guess a temp table or materialize view from the cross join then using row_number isn't an option either since it uses row_number?
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thanks all!

So it would appear that I was operating under a false assumption that ROW_NUMBER() is slow. I had an interesting few hours last weekend figuring out why my view was running so slow. It turned out that my view uses an APPLY clause to call a user defined function. It's the UDF that is using the ROW_NUMBER() function. When using CROSS APPLY, performance of the view is about twice as slow as when using OUTER APPLY. Furthermore, if I filter the view with any kind of WHERE clause, performance of the view drops even further when using CROSS APPLY. My solution was to just use OUTER APPLY for the view and now performance is great. The UDF is still using ROW_NUMBER(). Thanks all for your contributions.
What does that UDF return?

A scalar value?  e.g. RETURN INT
A table?

I suggest you consider possible alternatives for that function.