Link to home
Start Free TrialLog in
Avatar of BKennedy2008
BKennedy2008

asked on

sql statement Inner join with Max(item A)

Using a following example below from 1 table (will redesign..), I need to produce a list (below) with the latest scanned date of an item and the latest import date for that item.

There will be only 1 UPC Code per day at 1 location, but may be imported several times throughout a period. So I need both for that UPC Code the Max(Scan_Date) and the Max(ImportDate) to achieve a list with the latest import date for that item on that day.
How can I use a join statement to get both for a UPC Code the latest date and latest Scan?

Table A
Scan_Date      Current_Location         UPCCode        ImportDate
1-10-2012       A                                  10                   1-11-2012 22:56
1-10-2012       A                                  10                   1-12-2012 13:56
1-10-2012       B                                   15                   1-10-2012  08:59
1-09-2012       A                                   15                  1-8-2012  08:56
1-09-2012       C                                   10                   1-9-2012 07:56
1-09-2012       B                                    15                   1-9-2012 08:26
1-09-2012       C                                    10                  1-8-2012  08:27
1-08-2012       B                                    15                  1-7-2012 09:30
1-07-2012       C                                    20                  1-7-2012 09:28
1-07-2012       C                                    20                  1-7-2012 10:30
etc..

I want to produce:
10        1-10-2012      A                   1-12-2012 13:56
15         1-10-2012     B                   1-10-2012  08:59
20          1-07-2012    C                    1-7-2012 10:30
ect...

In the example above, UPC 10 has a scan date of 1-10-2012, and 1-9-2012, but I only need  the latest scanned Date (1-10-2012) and the latest import time (1-12-2012 13:56). Then same down the list for each UPC Code.....
Avatar of Jim Horn
Jim Horn
Flag of United States of America image

SELECT Current_Location, UPC_Code, Max(Scan_Date) as scan_date_max, Max(ImportDate) as import_date_max
FROM [Table A]
GROUP BY  Current_Location, UPC_Code
Avatar of BKennedy2008
BKennedy2008

ASKER

That produces multiple entries for each UPC Code, giving me the latest import for that day, but for each day.
I just need the latest date from that list for each UPC Code , and the latest time that UPC Code was imported on that day....
I think I need an added inner join to that statement..

Produced:

UPC             Loc                     Scan Date                           Import Date

10                   A                    2012-12-01  0:00              2012-12-01 12:18
10                   A                     2012-12-28 9:20              2012-12-28  12:00
10                   A                     2012-12-27 9:10              2012-12-27  12:30
etc..
etc..
Got it.. Thanks

 SELECT Current_Location, a.upccode, Max(Scan_Date) as scan_date_max, Max(ImportDate) as import_date_max
FROM scans a
inner join
(Select MAX(ImportDate) as expr1, UPCCode  from Equipment.dbo.Scans group by UPCCode)c
on a.ImportDate = c.expr1 and a.UPCCode=c.UPCCode
GROUP BY  Current_Location, a.upccode
Spell out whether Current_Location participates in the grouping.

The below code returns five rows:  A-10, A-15, B-15, C-10, and C-20.

CREATE TABLE a (Scan_Date date, Current_Location char(1), UPCCode tinyint, ImportDate datetime)

INSERT INTO a  (Scan_Date, Current_Location, UPCCode, ImportDate)
VALUES 
('1-10-2012', 'A', 10, '1-11-2012 22:56'), 
('1-10-2012', 'A', 10, '1-12-2012 13:56'),
('1-10-2012', 'B',  15, ' 1-10-2012  08:59'),
('1-09-2012', 'A',  15, '1-8-2012  08:56'),
('1-09-2012', 'C',  10, '1-9-2012 07:56'),
('1-09-2012', 'B',   15, '1-9-2012 08:26'),
('1-09-2012', 'C',   10, '1-8-2012  08:27'),
('1-08-2012', 'B',   15, '1-7-2012 09:30'),
('1-07-2012', 'C',   20, '1-7-2012 09:28'),
('1-07-2012', 'C',   20, '1-7-2012 10:30')

SELECT Current_Location, UPCCode, Max(Scan_Date) as scan_date_max, Max(ImportDate) as import_date_max
FROM a
GROUP BY Current_Location, UPCCode 
ORDER BY Current_Location, UPCCode 

Open in new window

SELECT A1.Current_Location, A1.UPC_Code, Max(A1.Scan_Date) as scan_date_max, Max(A2.ImportDate) as import_date_max
FROM [Table A] A1
JOIN [Table A] A2 ON A1.CurrentLocation = A2.CurrentLocation
AND A1.UPC_Code = A2.UPC_Code
AND A1.Scan_Date = A2.Scan_Date

GROUP BY  A1.Current_Location, A1.UPC_Code
Close.....

It still gives me for items that were imported at the same time for 2 different days the following:
UPC                Loc                      Scan Date                          Import Date
10                   A                    2012-12-01  0:00              2012-12-01 12:18
10                   A                     2012-12-02 9:20              2012-12-01  12:18

on the list.
Do you see how I can teake care of that?
Using the data you posted in this question, and the T-SQL I provided above, the Current_Location and UPCCode columns are unique.  

What you have as two rows of 10-A I'm showing as a 10-A and 15-A.

I recommend checking your mockup data to make sure it accurately reflects your question.
>It still gives me for items that were imported at the same time for 2 different days the following:
Based on this statement, and not the sample data, you'll have to use a subquery and a JOIN, as deighton posted above.
the current location and UPC are unique for that day, but the UPC Code may be at another location the next day.
SELECT Current_Location,UPC_Code,Scan_Date_max,MAX(A2.ImportDate) maxImport  FROM

(SELECT A1.Current_Location, A1.UPC_Code,  Max(A2.Scan_Date) as Scan_Date_max
FROM [Table A] A1 GROUP BY  A1.Current_Location, A1.UPC_Code) DQ
JOIN [Table A] A2 ON DQ.CurrentLocation = A2.CurrentLocation
AND DQ.UPC_Code = A2.UPC_Code
AND DQ.Scan_Date_max = A2.Scan_Date
GROUP BY DQ.Current_Location, DQ.UPC_Code
Deighton, from your code, I get The multi-part identifier "A2.Scan_Date" could not be bound.
in the 3rd line
Jim Horn You are correct, but I am looking for a query that produces only:

A-10      2012-01-10
B-15      2012-01-09
c-20       2012-01-07

etc..

out of your table that was created...
Can you please explain what logic you want for Current_Location
I need the latest location and the latest date where the UPC Code was scanned .

With the file being updated multiple times within a day, each time it is updated, it gets re-imported into the database. Therefore I need the latest Import date-time from that day it was scanned.

The item may move to another location the next day and the import/scan process starts all over.

My goal here is to find the last known location and date of the UPC item.
ASKER CERTIFIED SOLUTION
Avatar of Jim Horn
Jim Horn
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Rehan- Using your code, it will get
-The latest Import Date and the latest scan date for UPC 11 but combines the location from the latest import date instead of the latest date.

(Scan Date, Location, UPC code, Import Date)

2012-11-29 00:00, A, 11, 2012-12-20 10:10

When the record(s) for UPC 11 are:

2012-11-29 00:00, B, 11, 2012-11-30 07:41
2012-11-04 00:00, A, 11, 2012-12-20 10:10
Ok- I switched the import with the scan date, and it looks good
Thanks to both, but the query runs like a champ that Rehan submitted after switching out the

AND a2.ImportDate = a1.ImportDate)  to

AND a2.Scan_Date = a1.Scan_Date

Thanks again..
select upc_code, scan_date, current_location, importdate from
(select upc_code, scan_date, current_location, importdate,
 row_number() over (partition by upc_code order by scan_date desc) srn,
 row_number() over (partition by upc_code order by importdate desc) irn
 from a) as x
where srn = 1 and irn = 1;
BKennedy2008 - Let me know if my T-SQL answered your question, as it was posted before the one you ultimately accepted, which looks like it borrowed off of previous experts' comments.
I was too slow typing, but I would like to know if my query worked as well.
I found with yours that  i did not realize that if the upc code was scanned at 2 locations for the same day it listed both. I didnt realize these guys sometimes moved it on the same day. Using the other code it gave me only that day but the latest time for that day that item was scanned.
>> I found with yours that ...<<
Whose yours?
awking00 - No it only displays 2682 rows out of the 3100 top daily scans, so its dropping records
That was for jimhorn...
Thanks, it worked with the sample data you posted, but I wasn't sure whether or not there might be other scenarios where it fails.
Out of curiosity, what happens if you change the last line to -
where srn = 1 OR irn = 1?
I have an issue with the way points were awarded in this question.

It's pretty clear that multiple experts helped you, either by helping you frame up the question, discover issues in data that prevented a successful answer, and ultimately giving a final answer.    It's also clear that the answer that was accepted borrowed code from other experts.  Not that there's anything wrong with that, but it is common practice to acknowledge it, and it's not good grading practice to not acknowledge the help of others that led to it.

It appears that two experts (if not more) gave a correct answer to the question as it was defined.

In cases like this a split is more appropriate than awarding all points to the comment that was closest to what was ultimately the answer.
I agree, you helped me alot here getting to a solution, and yes I should have. The question is that I cannot undo points now. I appreciate your help, and will award points properly next time.
or produces multiple scan dates and and import times