Avatar of Russ Suter
Russ Suter
 asked on

Get the 1st occurrence of a row based on column value

Given the data set in the attached Excel spreadsheet, how would I query against that data so I get only the 1st occurrence of Property_Id?

I already have a query that gets the data I need but it's pretty slow. I'm looking to get this data in the fastest way possible.

Or, to ask another way, given the following simple set of data...

Element_Id     Property_Id     Value
 2528          1               Company conameshort
-4096          1             
 2528          2               Company Name
-4096          2               Text
-4096          3               Company Text Field
 2528          5               6
-4096          5               3

Open in new window

How do I get this...
Element_Id     Property_Id     Value
 2528          1               Company conameshort
 2528          2               Company Name
-4096          3               Company Text Field
 2528          5               6

Open in new window


properties.xlsx

I'm using Microsoft SQL Server 2012
Microsoft SQL Server* T-SQLSQL

Avatar of undefined
Last Comment
PortletPaul

8/22/2022 - Mon
ASKER CERTIFIED SOLUTION
slightwv (䄆 Netminder)

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
Russ Suter

ASKER
Sorry, I forgot to mention. I need to do this using a CROSS APPLY. I cannot use any windowed functions to perform this task due to performance reasons.
slightwv (䄆 Netminder)

How about doing the cross apply in an inner query and using row_number on an outer query?

pseudo code:
select element_id, property_id, value
from (
  select element_id, property_id, value,
    row_number() over(partition by property_id order by property_id, element_id desc) rn
   from (
         select blah, blah blah from junk1 cross apply junk2
       ) t2
) t1
where rn=1
 

Open in new window

Russ Suter

ASKER
That still uses ROW_NUMBER() which I can't use.
I started with Experts Exchange in 2004 and it's been a mainstay of my professional computing life since. It helped me launch a career as a programmer / Oracle data analyst
William Peck
slightwv (䄆 Netminder)

As you know, I'm not really a SQL Server person.  I was thinking by moving the window function to an outer level and keeping the cross join inside, performance might not be an issue.

Guess a temp table or materialize view from the cross join then using row_number isn't an option either since it uses row_number?
SOLUTION
PortletPaul

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
SOLUTION
Log in to continue reading
Log In
Sign up - Free for 7 days
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Russ Suter

ASKER
Thanks all!

So it would appear that I was operating under a false assumption that ROW_NUMBER() is slow. I had an interesting few hours last weekend figuring out why my view was running so slow. It turned out that my view uses an APPLY clause to call a user defined function. It's the UDF that is using the ROW_NUMBER() function. When using CROSS APPLY, performance of the view is about twice as slow as when using OUTER APPLY. Furthermore, if I filter the view with any kind of WHERE clause, performance of the view drops even further when using CROSS APPLY. My solution was to just use OUTER APPLY for the view and now performance is great. The UDF is still using ROW_NUMBER(). Thanks all for your contributions.
PortletPaul

What does that UDF return?

A scalar value?  e.g. RETURN INT
A table?

I suggest you consider possible alternatives for that function.
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.