BKennedy2008
asked on
sql statement Inner join with Max(item A)
Using a following example below from 1 table (will redesign..), I need to produce a list (below) with the latest scanned date of an item and the latest import date for that item.
There will be only 1 UPC Code per day at 1 location, but may be imported several times throughout a period. So I need both for that UPC Code the Max(Scan_Date) and the Max(ImportDate) to achieve a list with the latest import date for that item on that day.
How can I use a join statement to get both for a UPC Code the latest date and latest Scan?
Table A
Scan_Date Current_Location UPCCode ImportDate
1-10-2012 A 10 1-11-2012 22:56
1-10-2012 A 10 1-12-2012 13:56
1-10-2012 B 15 1-10-2012 08:59
1-09-2012 A 15 1-8-2012 08:56
1-09-2012 C 10 1-9-2012 07:56
1-09-2012 B 15 1-9-2012 08:26
1-09-2012 C 10 1-8-2012 08:27
1-08-2012 B 15 1-7-2012 09:30
1-07-2012 C 20 1-7-2012 09:28
1-07-2012 C 20 1-7-2012 10:30
etc..
I want to produce:
10 1-10-2012 A 1-12-2012 13:56
15 1-10-2012 B 1-10-2012 08:59
20 1-07-2012 C 1-7-2012 10:30
ect...
In the example above, UPC 10 has a scan date of 1-10-2012, and 1-9-2012, but I only need the latest scanned Date (1-10-2012) and the latest import time (1-12-2012 13:56). Then same down the list for each UPC Code.....
There will be only 1 UPC Code per day at 1 location, but may be imported several times throughout a period. So I need both for that UPC Code the Max(Scan_Date) and the Max(ImportDate) to achieve a list with the latest import date for that item on that day.
How can I use a join statement to get both for a UPC Code the latest date and latest Scan?
Table A
Scan_Date Current_Location UPCCode ImportDate
1-10-2012 A 10 1-11-2012 22:56
1-10-2012 A 10 1-12-2012 13:56
1-10-2012 B 15 1-10-2012 08:59
1-09-2012 A 15 1-8-2012 08:56
1-09-2012 C 10 1-9-2012 07:56
1-09-2012 B 15 1-9-2012 08:26
1-09-2012 C 10 1-8-2012 08:27
1-08-2012 B 15 1-7-2012 09:30
1-07-2012 C 20 1-7-2012 09:28
1-07-2012 C 20 1-7-2012 10:30
etc..
I want to produce:
10 1-10-2012 A 1-12-2012 13:56
15 1-10-2012 B 1-10-2012 08:59
20 1-07-2012 C 1-7-2012 10:30
ect...
In the example above, UPC 10 has a scan date of 1-10-2012, and 1-9-2012, but I only need the latest scanned Date (1-10-2012) and the latest import time (1-12-2012 13:56). Then same down the list for each UPC Code.....
ASKER
That produces multiple entries for each UPC Code, giving me the latest import for that day, but for each day.
I just need the latest date from that list for each UPC Code , and the latest time that UPC Code was imported on that day....
I think I need an added inner join to that statement..
Produced:
UPC Loc Scan Date Import Date
10 A 2012-12-01 0:00 2012-12-01 12:18
10 A 2012-12-28 9:20 2012-12-28 12:00
10 A 2012-12-27 9:10 2012-12-27 12:30
etc..
etc..
I just need the latest date from that list for each UPC Code , and the latest time that UPC Code was imported on that day....
I think I need an added inner join to that statement..
Produced:
UPC Loc Scan Date Import Date
10 A 2012-12-01 0:00 2012-12-01 12:18
10 A 2012-12-28 9:20 2012-12-28 12:00
10 A 2012-12-27 9:10 2012-12-27 12:30
etc..
etc..
ASKER
Got it.. Thanks
SELECT Current_Location, a.upccode, Max(Scan_Date) as scan_date_max, Max(ImportDate) as import_date_max
FROM scans a
inner join
(Select MAX(ImportDate) as expr1, UPCCode from Equipment.dbo.Scans group by UPCCode)c
on a.ImportDate = c.expr1 and a.UPCCode=c.UPCCode
GROUP BY Current_Location, a.upccode
SELECT Current_Location, a.upccode, Max(Scan_Date) as scan_date_max, Max(ImportDate) as import_date_max
FROM scans a
inner join
(Select MAX(ImportDate) as expr1, UPCCode from Equipment.dbo.Scans group by UPCCode)c
on a.ImportDate = c.expr1 and a.UPCCode=c.UPCCode
GROUP BY Current_Location, a.upccode
you can check for alternatives in this article:
https://www.experts-exchange.com/Database/Miscellaneous/A_3203-DISTINCT-vs-GROUP-BY-and-why-does-it-not-work-for-my-query.html
https://www.experts-exchange.com/Database/Miscellaneous/A_3203-DISTINCT-vs-GROUP-BY-and-why-does-it-not-work-for-my-query.html
Spell out whether Current_Location participates in the grouping.
The below code returns five rows: A-10, A-15, B-15, C-10, and C-20.
The below code returns five rows: A-10, A-15, B-15, C-10, and C-20.
CREATE TABLE a (Scan_Date date, Current_Location char(1), UPCCode tinyint, ImportDate datetime)
INSERT INTO a (Scan_Date, Current_Location, UPCCode, ImportDate)
VALUES
('1-10-2012', 'A', 10, '1-11-2012 22:56'),
('1-10-2012', 'A', 10, '1-12-2012 13:56'),
('1-10-2012', 'B', 15, ' 1-10-2012 08:59'),
('1-09-2012', 'A', 15, '1-8-2012 08:56'),
('1-09-2012', 'C', 10, '1-9-2012 07:56'),
('1-09-2012', 'B', 15, '1-9-2012 08:26'),
('1-09-2012', 'C', 10, '1-8-2012 08:27'),
('1-08-2012', 'B', 15, '1-7-2012 09:30'),
('1-07-2012', 'C', 20, '1-7-2012 09:28'),
('1-07-2012', 'C', 20, '1-7-2012 10:30')
SELECT Current_Location, UPCCode, Max(Scan_Date) as scan_date_max, Max(ImportDate) as import_date_max
FROM a
GROUP BY Current_Location, UPCCode
ORDER BY Current_Location, UPCCode
SELECT A1.Current_Location, A1.UPC_Code, Max(A1.Scan_Date) as scan_date_max, Max(A2.ImportDate) as import_date_max
FROM [Table A] A1
JOIN [Table A] A2 ON A1.CurrentLocation = A2.CurrentLocation
AND A1.UPC_Code = A2.UPC_Code
AND A1.Scan_Date = A2.Scan_Date
GROUP BY A1.Current_Location, A1.UPC_Code
FROM [Table A] A1
JOIN [Table A] A2 ON A1.CurrentLocation = A2.CurrentLocation
AND A1.UPC_Code = A2.UPC_Code
AND A1.Scan_Date = A2.Scan_Date
GROUP BY A1.Current_Location, A1.UPC_Code
ASKER
Close.....
It still gives me for items that were imported at the same time for 2 different days the following:
UPC Loc Scan Date Import Date
10 A 2012-12-01 0:00 2012-12-01 12:18
10 A 2012-12-02 9:20 2012-12-01 12:18
on the list.
Do you see how I can teake care of that?
It still gives me for items that were imported at the same time for 2 different days the following:
UPC Loc Scan Date Import Date
10 A 2012-12-01 0:00 2012-12-01 12:18
10 A 2012-12-02 9:20 2012-12-01 12:18
on the list.
Do you see how I can teake care of that?
Using the data you posted in this question, and the T-SQL I provided above, the Current_Location and UPCCode columns are unique.
What you have as two rows of 10-A I'm showing as a 10-A and 15-A.
I recommend checking your mockup data to make sure it accurately reflects your question.
What you have as two rows of 10-A I'm showing as a 10-A and 15-A.
I recommend checking your mockup data to make sure it accurately reflects your question.
>It still gives me for items that were imported at the same time for 2 different days the following:
Based on this statement, and not the sample data, you'll have to use a subquery and a JOIN, as deighton posted above.
Based on this statement, and not the sample data, you'll have to use a subquery and a JOIN, as deighton posted above.
ASKER
the current location and UPC are unique for that day, but the UPC Code may be at another location the next day.
SELECT Current_Location,UPC_Code, Scan_Date_ max,MAX(A2 .ImportDat e) maxImport FROM
(SELECT A1.Current_Location, A1.UPC_Code, Max(A2.Scan_Date) as Scan_Date_max
FROM [Table A] A1 GROUP BY A1.Current_Location, A1.UPC_Code) DQ
JOIN [Table A] A2 ON DQ.CurrentLocation = A2.CurrentLocation
AND DQ.UPC_Code = A2.UPC_Code
AND DQ.Scan_Date_max = A2.Scan_Date
GROUP BY DQ.Current_Location, DQ.UPC_Code
(SELECT A1.Current_Location, A1.UPC_Code, Max(A2.Scan_Date) as Scan_Date_max
FROM [Table A] A1 GROUP BY A1.Current_Location, A1.UPC_Code) DQ
JOIN [Table A] A2 ON DQ.CurrentLocation = A2.CurrentLocation
AND DQ.UPC_Code = A2.UPC_Code
AND DQ.Scan_Date_max = A2.Scan_Date
GROUP BY DQ.Current_Location, DQ.UPC_Code
ASKER
Deighton, from your code, I get The multi-part identifier "A2.Scan_Date" could not be bound.
in the 3rd line
in the 3rd line
ASKER
Jim Horn You are correct, but I am looking for a query that produces only:
A-10 2012-01-10
B-15 2012-01-09
c-20 2012-01-07
etc..
out of your table that was created...
A-10 2012-01-10
B-15 2012-01-09
c-20 2012-01-07
etc..
out of your table that was created...
Can you please explain what logic you want for Current_Location
ASKER
I need the latest location and the latest date where the UPC Code was scanned .
With the file being updated multiple times within a day, each time it is updated, it gets re-imported into the database. Therefore I need the latest Import date-time from that day it was scanned.
The item may move to another location the next day and the import/scan process starts all over.
My goal here is to find the last known location and date of the UPC item.
With the file being updated multiple times within a day, each time it is updated, it gets re-imported into the database. Therefore I need the latest Import date-time from that day it was scanned.
The item may move to another location the next day and the import/scan process starts all over.
My goal here is to find the last known location and date of the UPC item.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Rehan- Using your code, it will get
-The latest Import Date and the latest scan date for UPC 11 but combines the location from the latest import date instead of the latest date.
(Scan Date, Location, UPC code, Import Date)
2012-11-29 00:00, A, 11, 2012-12-20 10:10
When the record(s) for UPC 11 are:
2012-11-29 00:00, B, 11, 2012-11-30 07:41
2012-11-04 00:00, A, 11, 2012-12-20 10:10
-The latest Import Date and the latest scan date for UPC 11 but combines the location from the latest import date instead of the latest date.
(Scan Date, Location, UPC code, Import Date)
2012-11-29 00:00, A, 11, 2012-12-20 10:10
When the record(s) for UPC 11 are:
2012-11-29 00:00, B, 11, 2012-11-30 07:41
2012-11-04 00:00, A, 11, 2012-12-20 10:10
ASKER
Ok- I switched the import with the scan date, and it looks good
ASKER
Thanks to both, but the query runs like a champ that Rehan submitted after switching out the
AND a2.ImportDate = a1.ImportDate) to
AND a2.Scan_Date = a1.Scan_Date
Thanks again..
AND a2.ImportDate = a1.ImportDate) to
AND a2.Scan_Date = a1.Scan_Date
Thanks again..
select upc_code, scan_date, current_location, importdate from
(select upc_code, scan_date, current_location, importdate,
row_number() over (partition by upc_code order by scan_date desc) srn,
row_number() over (partition by upc_code order by importdate desc) irn
from a) as x
where srn = 1 and irn = 1;
(select upc_code, scan_date, current_location, importdate,
row_number() over (partition by upc_code order by scan_date desc) srn,
row_number() over (partition by upc_code order by importdate desc) irn
from a) as x
where srn = 1 and irn = 1;
BKennedy2008 - Let me know if my T-SQL answered your question, as it was posted before the one you ultimately accepted, which looks like it borrowed off of previous experts' comments.
I was too slow typing, but I would like to know if my query worked as well.
ASKER
I found with yours that i did not realize that if the upc code was scanned at 2 locations for the same day it listed both. I didnt realize these guys sometimes moved it on the same day. Using the other code it gave me only that day but the latest time for that day that item was scanned.
>> I found with yours that ...<<
Whose yours?
Whose yours?
ASKER
awking00 - No it only displays 2682 rows out of the 3100 top daily scans, so its dropping records
ASKER
That was for jimhorn...
Thanks, it worked with the sample data you posted, but I wasn't sure whether or not there might be other scenarios where it fails.
Out of curiosity, what happens if you change the last line to -
where srn = 1 OR irn = 1?
where srn = 1 OR irn = 1?
I have an issue with the way points were awarded in this question.
It's pretty clear that multiple experts helped you, either by helping you frame up the question, discover issues in data that prevented a successful answer, and ultimately giving a final answer. It's also clear that the answer that was accepted borrowed code from other experts. Not that there's anything wrong with that, but it is common practice to acknowledge it, and it's not good grading practice to not acknowledge the help of others that led to it.
It appears that two experts (if not more) gave a correct answer to the question as it was defined.
In cases like this a split is more appropriate than awarding all points to the comment that was closest to what was ultimately the answer.
It's pretty clear that multiple experts helped you, either by helping you frame up the question, discover issues in data that prevented a successful answer, and ultimately giving a final answer. It's also clear that the answer that was accepted borrowed code from other experts. Not that there's anything wrong with that, but it is common practice to acknowledge it, and it's not good grading practice to not acknowledge the help of others that led to it.
It appears that two experts (if not more) gave a correct answer to the question as it was defined.
In cases like this a split is more appropriate than awarding all points to the comment that was closest to what was ultimately the answer.
ASKER
I agree, you helped me alot here getting to a solution, and yes I should have. The question is that I cannot undo points now. I appreciate your help, and will award points properly next time.
ASKER
or produces multiple scan dates and and import times
FROM [Table A]
GROUP BY Current_Location, UPC_Code