Link to home
Start Free TrialLog in
Avatar of tashmyke
tashmyke

asked on

SQL Subquery find previous record in a group

I would like some help with a subquery that I am trying to write, I have tried varies attempts and have not been able to figure it out.
A solution using SQL-99 syntax would be acceptable, although I am using Access to return the data.

Data sample:
TRAN_ID      DIST      ACT_COST      DATE
12345            19      100.00            12/31/2008
12345            20      100.00            1/31/2009
12345            21      100.00            1/31/2009    
12345            22      0                    1/31/2009    <- I want to return this record
12345            23      0                    2/28/2009
12345            24      0                    2/28/2009    <- and this record
12345            25      0                    3/31/2009
12345            26      0                    4/30/2009

Using the above data set I would like to write a query that will return the following
TRAN_ID     DIST      ACT_COST    DATE        PREV_DIST      PREV_ACT_COST      PREV_DATE
12345             24            0              2/28/2009        22            0                  1/31/2009

I have tried to write the subquery and I am lost, see the mess below.
SELECT DISTINCT
TRAN_ID,
MAX(DIST),
LAST(ACT_COST),
DATE
FROM TABLE A
WHERE DATE  = 2/28/2009
------   I NEED A SUBQUERY TO GET THE PREVIOUS RECORD USING THE GROUPING MAX(DIST) AND LAST(ACT_COST)
----------
GROUP BY TRAN_ID, DATE;
Avatar of peter57r
peter57r
Flag of United Kingdom of Great Britain and Northern Ireland image

What logic are you using to select the previous record.
Avatar of tashmyke
tashmyke

ASKER


I'm going to simplify it, I think my original request might be too complex and hard to explain.

I 'm really looking for last change in the ACT_COST column by TRAN ID, I also will need the DIST and DATE value returned for that record.

Data sample:
TRAN_ID      DIST      ACT_COST      POST_DATE
12345            19      100.00            12/31/2008
12345            20      100.00            1/31/2009
12345            21      100.00            1/31/2009    
12345            22      0                    1/31/2009    
12345            23      0                    2/28/2009
12345            24      0                    2/28/2009    
12345            25      0                    3/31/2009
12345            26      0                    4/30/2009

Using the data set above

SELECT TABLE.TRAN_ID AS TRAN_ID, Max(TABLE.DIST) AS DIST, Last(TABLE.ACT_COST) AS ACT_COST, TABLE.POST_DATE AS POST_DATE
FROM [TABLE]
GROUP BY TABLE.TRAN_ID, TABLE.POST_DATE
HAVING (((TABLE.TRAN_ID)="12345") AND ((TABLE.POST_DATE)=#2/28/2009#))
ORDER BY TABLE.TRAN_ID, Max(TABLE.DIST);

It returns the following record:
TRAN_ID      DIST      ACT_COST      POST_DATE
12345            24      0                    2/28/2009

I would like to expand the query so that it will return that record along with the last change in ACT_COST which is the record where DIST=21

TRAN_ID      DIST      ACT_COST      POST_DATE     DIST_CHG     ACT_COST_CHG      POST_DATE_CHG
12345            24      0                    2/28/2009               21          100.00                  1/31/2009  

Personally, I would say that the chances of you using a query for this are minimal.
This requires recordset processing so that positional changes can be recognised and stored for later re-use.
If your DIST values are consecutive then it is easy.

Last change in ACT_COST calculated for MAX(DIST) which should be equivalent to MAX(POST_DATE):

SELECT t1.TRAN_ID, t1.DIST, t1.ACT_COST, t2.DIST, t2.ACT_COST, t2.POST_DATE
  FROM Table t1
  INNER JOIN Table t2 ON t2.TRAN_ID = t1.TRAN_ID AND t2.DIST = t1.DIST+1
 WHERE t1.ACT_COST <> t2.ACT_COST
    AND t2.DIST = (SELECT MAX(DIST) FROM Table WHERE TRAN_ID = t2.TRAN_ID );
That's a nice try pcelba, but I can't see that it would give the poster the specific answer he used as an example.
The clause :
 INNER JOIN Table t2 ON t2.TRAN_ID = t1.TRAN_ID AND t2.DIST = t1.DIST+1
means that a record can only be matched with its immediate neighbour  and the sample result was using a more distant record.  
So you should get results showing pairs of records when changes occur but unless that is actually what the poster wants then I think the problem is more difficult.
I know my solution is not complete. It was just a reaction to simplified question mentioned in ID: 24861432 ("I 'm really looking for last change in the ACT_COST column by TRAN ID, I also will need the DIST and DATE value returned for that record.")

I am matching immediate neighbour to identify the ACT_COST column value change.

The more distant record can be joined using its TRAN_ID, it is not a problem but looking at my answer it cannot work because the last condition is wrong...

Much easier would be to upsize this table to SQL Server and create a complex query.
It seems Access 2007 allows rather complex queries. Example of what you could need is attached. It just uses Table4 instead of [Table] and the date has British format. The first part is your original query, the second is my part joined.

Sample data:
TRAN_ID      DIST    ACT_COST      POST_DATE
12345      19      100      31.12.2008
12345      20      100      31.1.2009
12345      21      100      31.1.2009
12345      22      50      31.1.2009
12345      23      0      31.1.2009
12345      24      0      31.1.2009
12345      25      0      28.2.2009
12345      26      0      28.2.2009
12345      27      0      31.3.2009
12345      28      0      30.4.2009

The last point to solve are values of DIST column. The query requires no gaps in DIST numbering.
SELECT orig.TRAN_ID, orig.DIST, orig.ACT_COST, orig.POST_DATE, chng.DIST_CHG, chng.ACT_COST_CHG, chng.POST_DATE_CHG
  FROM (
SELECT TABLE4.TRAN_ID AS TRAN_ID, Max(TABLE4.DIST) AS DIST, Last(TABLE4.ACT_COST) AS ACT_COST, TABLE4.POST_DATE AS POST_DATE
FROM TABLE4
GROUP BY TABLE4.TRAN_ID, TABLE4.POST_DATE
HAVING (((TABLE4.TRAN_ID)="12345") AND ((TABLE4.POST_DATE)=#28/2/2009#))
ORDER BY TABLE4.TRAN_ID, Max(TABLE4.DIST) )  orig 
INNER JOIN (
SELECT TOP 1 t1.TRAN_ID, t1.DIST AS DIST_CHG, t1.ACT_COST AS ACT_COST_CHG, t2.DIST, t2.ACT_COST, t1.POST_DATE AS POST_DATE_CHG
FROM Table4 AS t1 INNER JOIN Table4 AS t2 ON (t2.DIST = t1.DIST+1) AND (t2.TRAN_ID = t1.TRAN_ID)
WHERE t1.ACT_COST <> t2.ACT_COST
order by t1.DIST DESC ) chng  ON chng.TRAN_ID = orig.TRAN_ID
;

Open in new window

Please count it as an attempt - my part is working just for one TRAN_ID value only...
It's so close ... it returns the record below if "12345" is the only Tran ID.

TRAN_ID      DIST      ACT_COST      POST_DATE     DIST_CHG     ACT_COST_CHG      POST_DATE_CHG
12345            24      0                    2/28/2009               21          100.00                  1/31/2009  


It needs to work for multiple Tran ID's.

Like peter57r said I think it's too complex.
I know it is a little bit complex but you have no other solution yet...

You still did not answer the question if the DIST values are consecutive to allow
INNER JOIN Table t2 ON t2.TRAN_ID = t1.TRAN_ID AND t2.DIST = t1.DIST+1
It seems the TRAN_ID is a text field. Does it contain digits only?
ASKER CERTIFIED SOLUTION
Avatar of Pavel Celba
Pavel Celba
Flag of Czechia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial