SQL - Populate missing values with values of not null rows above it

I've forgotten how to do this!

I for each task_number I need to populate null values in field 'bugs_tag'
with the value above it (of the largest lower-numbered row number where the value is not null).

I tried a self-join with a.rowno +1 = b.rowno approach,
but that only gives me one add'l value.

I was trying to avoid subqueries but can do if that's expeditious.

Any suggestions?
Thanks!
RalphAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

PortletPaulEE Topic AdvisorCommented:
& I've forgotten how to read your mind....

You are probably looking at some data and attempting to describe that to us, something is above something ...
Why not share some of that data so we also can visualize what you mean?

We also need to know what database you are using! Is it Oracle? SQL Server? Postgres? ???? syntax can differ for each vendor
0
RalphAuthor Commented:
Sure!

Database is Presto.

From the SS I sent earlier, I've taken some of it and put in '-CAPS-' what I want to have filled in.

rowno	task_number	event_date	bugs_tag	owner_id
1	19839751	2017-07-01 00:00:15	exists	-1
2	19839751	2017-07-03 11:04:59	-EXISTS-	500619427
3	19839751	2017-07-13 06:27:50	-EXISTS-	1531268293
4	19839751	2017-07-13 10:44:15	-EXISTS-	632697446
5	19839751	2017-07-13 14:44:53	-EXISTS-	0
6	19839751	2018-03-15 19:26:59	-EXISTS-	100001606943892
7	19839751	2018-03-19 10:28:42	-EXISTS-	0
1	19839752	2017-07-01 00:00:02	exists	-1
2	19839752	2017-07-01 00:01:17	absent	-1
3	19839752	2017-07-01 00:01:17	closed	0
1	19839767	2017-07-01 00:00:03	absent	100008604789082
2	19839767	2017-07-03 14:39:21	-ABSENT-	511722769
3	19839767	2017-07-07 14:52:17	-ABSENT-	745635826
4	19839767	2017-11-01 10:47:29	exists	-1
5	19839767	2017-11-01 11:26:18	-EXISTS-	0
6	19839767	2017-11-01 12:04:18	closed	0
1	19839800	2017-07-01 00:02:05	absent	0
2	19839800	2017-07-01 00:02:05	-ABSENT-	1036139832
3	19839800	2017-07-01 00:02:05	exists	-1

Open in new window


I'm going to try taking, by task_number, the rows where there are 'bugs_tag' values,
and again (another view) where values are missing.
I'll look into using row #s from the first view to bound the range in the 2nd view, using the first view's lower-row-numbered bugs_tag to populate.

View 1
1	19839767	2017-07-01 00:00:03	absent	100008604789082
4	19839767	2017-11-01 10:47:29	exists	-1
6	19839767	2017-11-01 12:04:18	closed	0

Open in new window


View 2
2	19839767	2017-07-03 14:39:21		511722769   <-- These are between View 1's rownos 1 & 4,
3	19839767	2017-07-07 14:52:17		745635826   <-- so I would use row 1's 'absent' to populate these values
5	19839767	2017-11-01 11:26:18		0

Open in new window


Now to try to make this happen.

Any advice?
Thanks.
0
PortletPaulEE Topic AdvisorCommented:
A "traditional" way to "repeat" some value from a prior row is to use a "correlated subquery" inside the select clause e.g.
SELECT
      ROWNO, TASK_NUMBER, bugs_tag
    , (select bugs_tag from table1 t2
       where t2.task_number = t1.task_number
       and t2.rowno <= t1.rowno
       and t2.bugs_tag is not null
       order by rowno desc
       limit 1) x
FROM TABLE1 t1
;

Open in new window

I have not used Presto, but documentation suggests it supports common table expressions (with...), window functions, and lateral joins so the following should work an would be more efficient than the above (although as you will see, the query is very similar)
SELECT
      ROWNO, TASK_NUMBER, bugs_tag, x
FROM TABLE1 t1
cross join lateral (select bugs_tag x from table1 t2
       where t2.task_number = t1.task_number
       and t2.rowno <= t1.rowno
       and t2.bugs_tag is not null
       order by rowno desc
       limit 1) cjl
;

Open in new window

So the result from this query would look like this:
| rowno | task_number | bugs_tag |      x |
|-------|-------------|----------|--------|
|     1 |    19839751 |   exists | exists |
|     2 |    19839751 |   (null) | exists |
|     3 |    19839751 |   (null) | exists |
|     4 |    19839751 |   (null) | exists |
|     5 |    19839751 |   (null) | exists |
|     6 |    19839751 |   (null) | exists |
|     7 |    19839751 |   (null) | exists |
|     1 |    19839752 |   exists | exists |
|     2 |    19839752 |   absent | absent |
|     3 |    19839752 |   closed | closed |
|     1 |    19839767 |   absent | absent |
|     2 |    19839767 |   (null) | absent |
|     3 |    19839767 |   (null) | absent |
|     4 |    19839767 |   exists | exists |
|     5 |    19839767 |   (null) | exists |
|     6 |    19839767 |   closed | closed |
|     1 |    19839800 |   absent | absent |
|     2 |    19839800 |   (null) | absent |
|     3 |    19839800 |   exists | exists |

Open in new window


You can see this as an operating query at http://sqlfiddle.com/#!17/bd06e/5 (using Postgres 9.6)

As to "View1" and "View2"; I just do not understand what theses are meant to do. Sorry.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
RalphAuthor Commented:
The 'LATERAL' option did not work with my company's version of presto, but I got it done.

See:
 , bugs_n_owners3 (rowno, task_number, event_time, bugs_tag, owner_id)
   AS             (SELECT  * FROM bugs_n_owners2
                   UNION
                   SELECT  * FROM bno2_still_open
                  )
                  -- select * from bugs_n_owners3 order by 2, 1 limit 100 /*
                  -- 
                  
                  
                  -- From the above extract rows with 'bugs_tag' values
                  -- 
 , bno3_with      (positioner, rowno, task_number, event_time, bugs_tag, owner_id)
   AS             (SELECT ROW_NUMBER() OVER (PARTITION BY task_number ORDER BY task_number, rowno) AS positioner
                        , *
                   FROM   bugs_n_owners3
                   WHERE  bugs_tag IS NOT NULL
                  )
                  -- select * from bno3_with order by 3, 2 limit 100 /*
                  -- 
                  
                  -- 
                  -- From the above extract rows without 'bugs_tag' values
                  -- 
 , bno3_without   (rowno, task_number, event_time, bugs_tag, owner_id)
   AS             (SELECT *
                   FROM   bugs_n_owners3
                   WHERE  bugs_tag IS NULL
                  )
                  -- select * from bno3_without order by 2, 1 limit 100 /*
                  -- 
                  
                  -- 
                  -- THIS IS FUN!
                  -- The inner Q extracts all the parameters we need, (and more), for use
                  -- in selecting the blank bugs_tag cells to be replaced by the 'with' values from above,
                  -- then the bounding 'with' values are UNIONed in
                  -- 
 , bugs_n_owners  (rowno, task_number, event_time, bugs_tag, owner_id)
   AS             (SELECT  rowno
                        ,  bw3.task_number
                        ,  bw3.event_time
                        ,  bw4.tag_above        AS bugs_tag
                        ,  bw3.owner_id
                   FROM    bno3_without  bw3
                   JOIN   
                          (SELECT  bw1.task_number
                                ,  bw1.positioner
                                ,  bw2.positioner
                                ,  bw1.rowno            AS above
                                ,  bw2.rowno            AS below
                                ,  bw1.event_time
                                ,  bw1.bugs_tag         AS tag_above
                                ,  bw2.bugs_tag         AS tag_below
                           FROM    bno3_with bw1
                           --
                           JOIN    bno3_with bw2
                             ON    bw1.task_number    = bw2.task_number
                            AND    bw1.positioner + 1 = bw2.positioner
                            AND    bw2.rowno          > bw1.rowno + 1
                          ) bw4
                     ON    bw3.task_number = bw4.task_number
                   WHERE   bw3.rowno BETWEEN bw4.above AND bw4.below
                   UNION   SELECT rowno, task_number, event_time, bugs_tag, owner_id
                           FROM bno3_with
                  )
                  -- select * from bugs_n_owners where task_number = 19845056 order by 2, 1 /*
                  --
                  
                  --
                  -- Now we need to find the period that an owner retains the bugs tag
                  -- NOTE:  Up until now ALL FB Employees were here, requiring less logic to
                  --        track changes up to now.
                  -- Final join here filters to just keep members of 'Oculus QA All'
                  --

 , bug_retention  (id, task_number, took_ownership, retention_days)       
   AS             (SELECT  bo1.owner_id                                  AS id
                        ,  bo1.task_number
                        ,  CAST( FROM_UNIXTIME(bo1.event_time) AS VARCHAR) AS took_ownership
                        ,  CASE WHEN bo2.event_time - bo1.event_time > 86399 THEN CAST(SPLIT_PART(CAST(PARSE_DURATION(CONCAT(CAST(CAST(bo2.event_time - bo1.event_time AS BIGINT) AS VARCHAR), 's')) AS VARCHAR), ' ', 1) AS BIGINT) ELSE 0 END AS retention_days
                   FROM    bugs_n_owners bo1
                   JOIN    bugs_n_owners bo2
                     ON    bo1.task_number = bo2.task_number
                    AND    bo1.rowno + 1   = bo2.rowno
                   JOIN    (SELECT DISTINCT CAST(user_id AS BIGINT) AS user_id
                            FROM   employee_teams:di
                            WHERE  CONTAINS(team_names, 'Oculus QA All')
                           ) a
                     ON    bo1.owner_id = user_id
                   WHERE   bo1.bugs_tag = 'exists'
                     AND   bo1.owner_id > 0
                  )
 
SELECT id, task_number, took_ownership, retention_days
FROM   bug_retention 

Open in new window


Thanks!  Good to learn about the 'LATERAL' option in the future!
Go PostgreSQL!
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
SQL

From novice to tech pro — start learning today.