Solved

# select oldest orderid for each of 1,000 customers?

Posted on 2007-07-25
450 Views
Starting with a list containing 1-2,000 customer id's, I need to generate a query that gives me the orderid for each customer, of the earliest order associated with that customer.  So I have a two-part challenge:

The only way I can think of to retrieve the same list of customers is with a code-generated where clause listing each of the id's.  Doesn't smell right, so I thought I'd post it here for feedback.  S'ok?  Not ok? (SQL 2000)

Second, how to retrieve only the oldest orderid for each customer?  Where I also need to ignore orders that cost less than \$10.

The fields are:
customers.customer_code (pk, varchar)
order_entry.customer_code (fk)
order_entry.job_number (pk, int)
order_entry.entry_date (datetime)
order_entry.totalprice (decimal)

Help much appreciated-
0
Question by:ottenm

LVL 142

Accepted Solution

select c.*, o.*
from customers c
left join order_entry o
on o.customer_code = c.customer_code
and o.job_number = ( select top 1 i.job_number from order_entry i where i.customer_code = c.customer_code and i.totalprice >= 10 order by i.entry_date desc )
0

LVL 68

Assisted Solution

If the job_numbers are always in order -- that is, a higher job number is always a later order -- you can do this:

SELECT c.customer_code, MAX(oe.job_number) AS [Latest Order]
FROM customers c
INNER JOIN order_entry oe ON c.customer_code = oe.customer_code
WHERE oe.totalprice >= 10
GROUP BY c.customer_code
ORDER BY c.customer_code
0

Author Comment

Yes, both solutions will work because the job_numbers are increasing.  My mistake for leaving that out of the description.

Any concerns using the same query (or any query for that matter) with 1-2,000 tests in the where clause?  Like:

SELECT c.customer_code, MAX(oe.job_number) AS [Latest Order]
FROM customers c
INNER JOIN order_entry oe ON c.customer_code = oe.customer_code
WHERE oe.totalprice >= 10 and (c.customer_code = 'Cust1' OR c.customer_code = 'Cust2'
OR c.customer_code = 'Cust3' OR c.customer_code = 'Cust4' etc etc etc)
GROUP BY c.customer_code
ORDER BY c.customer_code
0

LVL 142

Expert Comment

I do not "like" the suggestion of ScottPletcher (it is not "wrong"! )
because it uses the "implicit" meaning of the higher job_number value for "older" entry_dates, and also only returns the job_number, while my suggestion returns the entire order_entry row ...
0

Author Comment

Yeah, it's the old "but 98% of the time, the numbers *do* increase!" and then when it really matters, you hit the other 2% and have sad eyes in the boss' office.  Similarly for the job_number, too.  I really just need the job_number right now, but there are some more reports coming that might build off this one.

What about my 2,000-entry WHERE clause?  No complaints there?
0

LVL 68

Expert Comment

>> Any concerns using the same query (or any query for that matter) with 1-2,000 tests in the where clause? <<

Yes; that will not be good for performance.  Instead, create a table of the values with a clustered, unique index and INNER JOIN that table to the query.  SQL is optimized for joins.

Speaking of optimization, I wrote my query to satisfy the q as asked.  The original q asked for the orderid, not the entire row.  So, I was aiming for the most efficient way to do that.  Naturally my query could be modified to return the entire row, but that would reduce the efficiency, requiring joins and/or subqueries, as are already present in angel's query.  That query requires a join *and* a *correlated* subquery *with a sort*.

As to the job number, what else would a higher job number mean?  Particularly if, as is likely, an identity column is used to assign the number.
0

LVL 68

Expert Comment

My other concern with datetime is that it may not be unique enough, esp. if edi or other electronic orders are used.  If multiple orders have the same datetime, which one is selected by the TOP 1 is random; you could get different results for each run.
0

LVL 68

Expert Comment

Is job_number an identity value?  If so, I fail to see how a row inserted later could ever have a lower number, but if you feel unsure, that's fine.

Sorry, I think like a DBA, and as such, I don't do hundred, thousands or possibly millions of extra I/Os -- with the inevitable big hit on performance -- unless I genuinely *need* to.
0

Author Comment

These are some extremely helpful comments from both of you, and thanks for not only providing superb solutions but also taking the time to clarify the trade offs.

As for job numbers, for various reasons we sometimes (2% of the time) have jobs where we need to reset the entry_date to a later date, giving us a 'younger' job with a lower job number.

Again, tremendous help and much appreciated.
0

LVL 68

Expert Comment

>> where we need to reset the entry_date to a later date <<

Wow; OK, I've never seen that done before in a business setting, where the "entry_date" was changed to a later date.  That would explain the 2%.  [You really should rename the column, tho, since it's technically no longer an entry_date.]
0

## Featured Post

### Suggested Solutions

Introduced in Microsoft SQL Server 2005, the Copy Database Wizard (http://msdn.microsoft.com/en-us/library/ms188664.aspx) is useful in copying databases and associated objects between SQL instances; therefore, it is a good migration and upgrade tool…
I wrote this interesting script that really help me find jobs or procedures when working in a huge environment. I could I have written it as a Procedure but then I would have to have it on each machine or have a link to a server-related search that …
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.