Link to home
Start Free TrialLog in
Avatar of IT Guy
IT Guy

asked on

Output Data to a Table

I have the following query. I've tried to update a table (virtual in a SP) with the results, but I can't get it to work.

I've tried INSERT INTO (Fields.....) before the select last select statement, and I've also tried to copy the output into a ##TempTable (I don't want this, just experimenting) and neither process works.

How would I output the following query to a table?

if object_id('tempdb..#tmp_events','u') is not null
begin drop table #tmp_events end;
go

create table #tmp_events (RN int identity primary key, [date] datetime, EventClass char(1),StartDate datetime, EndDate datetime, [node] int);
go

insert #tmp_events ([date],EventClass,StartDate, EndDate,[node])
select [date]
      ,case when [EVENT] = 'START EVENT' then 'S'
            when [EVENT] = 'END EVENT' then 'E' else 'G' end as [EventClass]
      ,case when [EVENT] = 'START EVENT' THEN [DATE] ELSE NULL END AS [StartDate]
      ,case when [EVENT] = 'END EVENT' THEN [DATE] else NULL END AS [EndDate]
      ,[node]
from #your_data order by [node],[date];
go

; with cte2 as
( select C.RN,C.[date],C.[EventClass],C.[node],C.StartDate, C.EndDate
        ,case when C.[Node] <> isnull(P.[Node],0) then 'Y' else 'N' end as INC_Node
        ,case when C.startdate is not null and P.Startdate is null then 'Y'  else 'N' end as INC_Start
        ,case when C.enddate is null and P.enddate is not null then 'Y' else 'N' end as INC_End
  from #tmp_events C
  left outer join #tmp_events P on P.RN = C.RN-1    -- previous row
)
  select RN,[DATE],[EventClass], [NODE],groupvalue --,StartDate,EndDate, INC_Node, INC_Start, INC_End --, dense_rank() over (order by groupvalue) as [Grouping]  
  from cte2 C
  cross apply (select count(*) as groupvalue from cte2 C2 where C2.rn <= C.rn and (c2.INC_Start = 'Y' or c2.INC_Node = 'Y' OR c2.INC_End = 'Y') ) G
  order by RN;



Avatar of Mark Wills
Mark Wills
Flag of Australia image

G'Day,

Should be (famous last words) relatively straight forward I would have thought - guess you did too :)

Will have a look and get back to you shortly.

And pleased to see using the second query....
Ummm, what error are you getting ?

You will likely get the warning message :
Warning: Null value is eliminated by an aggregate or other SET operation.

Which is the reason for using aggregate in the cross apply so we always get a result - even if NULL. Warnings shouldnt necessarily halt a process - but can do.

You could do (just before the code) : SET ANSI_WARNINGS OFF
And follow up with resetting (after code) : SET ANSI_WARNINGS ON

e.g.
SET ANSI_WARNINGS OFF

drop table ##tempworktable 

; with cte2 as
( select C.RN,C.[date],C.[EventClass],C.[node],C.StartDate, C.EndDate
        ,case when C.[Node] <> isnull(P.[Node],0) then 'Y' else 'N' end as INC_Node
        ,case when C.startdate is not null and P.Startdate is null then 'Y'  else 'N' end as INC_Start
        ,case when C.enddate is null and P.enddate is not null then 'Y' else 'N' end as INC_End
  from #tmp_events C
  left outer join #tmp_events P on P.RN = C.RN-1    -- previous row
)
  select RN,[DATE],[EventClass], [NODE],groupvalue --,StartDate,EndDate, INC_Node, INC_Start, INC_End --, dense_rank() over (order by groupvalue) as [Grouping]   
into ##tempworktable 
  from cte2 C
  cross apply (select count(*) as groupvalue from cte2 C2 where C2.rn <= C.rn and (c2.INC_Start = 'Y' or c2.INC_Node = 'Y' OR c2.INC_End = 'Y') ) G 
  order by RN;


select * from ##tempworktable

--drop table ##tempworktable

SET ANSI_WARNINGS ON


Open in new window


If passing over to other procedures / functions / reporting (etc) then sometimes it is better to create an actual table with a uniquely obvious name explicitly for this specific function....

Have to be careful about / when sharing a table, and the global ##temp table above is not ideal.

In which case, I would be inclined to create a table first....

CREATE TABLE WRK_TEMP_EVENTS ( RN int primary key, [date] datetime, eventclass char(1), [Node] int, GroupValue int)
go

Open in new window


And then populate :

SET ANSI_WARNINGS OFF


truncate table WRK_TEMP_EVENTS


; with cte2 as
( select C.RN,C.[date],C.[EventClass],C.[node],C.StartDate, C.EndDate
        ,case when C.[Node] <> isnull(P.[Node],0) then 'Y' else 'N' end as INC_Node
        ,case when C.startdate is not null and P.Startdate is null then 'Y'  else 'N' end as INC_Start
        ,case when C.enddate is null and P.enddate is not null then 'Y' else 'N' end as INC_End
  from #tmp_events C
  left outer join #tmp_events P on P.RN = C.RN-1    -- previous row
)
insert WRK_TEMP_EVENTS (RN, [date],eventclass,[node],GroupValue)
  select RN,[DATE],[EventClass], [NODE],groupvalue --,StartDate,EndDate, INC_Node, INC_Start, INC_End --, dense_rank() over (order by groupvalue) as [Grouping]  
  from cte2 C
  cross apply (select count(*) as groupvalue from cte2 C2 where C2.rn <= C.rn and (c2.INC_Start = 'Y' or c2.INC_Node = 'Y' OR c2.INC_End = 'Y') ) G
  order by RN;


select * from WRK_TEMP_EVENTS

SET ANSI_WARNINGS ON

Open in new window


The advantage is you can add more structure to WRK_TEMP_EVENTS if you wand to handle things like a userid/ multiuser and suchlike.

1) The sub-query of the CROSS APPLY can contain no error, thus the entire result can be empty. Check whether you need OUTER APPLY instead.

2) The RN approach is not guaranteed to work. Use ROW_NUMBER() instead.

3) "How would I output the following query to a table?"
I wouldn't in the first place. In such a design scenario the caller has to use INSERT INTO .. EXECUTE sproc_name.
@ste5an

The query works fine. That was already proven in another question thread.

The RN is actually an identity from a table. It did start out as row_number()

Yes, one could INSERT INTO ... EXEC but also could insert directly from the CTE, or, use SELECT ... INTO FROM.

Best if we see what the problems are before we offer more suggestions...

I am wondering if there is a scope issue for the temp table #tmp_events - it really does depend on what symptoms / errors are occurring.
Yes, one could INSERT INTO ... EXEC but also could insert directly from the CTE, or, use SELECT ... INTO FROM.
You missed the point.

INSERT INTO .. EXECUTE ,, can have different scopes. CTE, or, use SELECT ... INTO FROM must operate in the same scope.
Thats exactly right @ste5an. You are of course quite correct, and no, I didnt "miss the point".

I even extrapolated the same point about scope and the #tmp table already in use. but anyway.... just guessing a bit until we find out more symptoms / errors etc.
Avatar of IT Guy
IT Guy

ASKER

G'Day Mark.

I think it's the volume of transactions that could be the issue. I limited the query to 5 nodes and it completes in seconds with values written to WRK_TEMP_EVENTS.

I'm running it now on the full record set which is about 168,000 records - it should work given the code has written data as expected, and if it doesn't work, it has to be the volume of data we're processing.

I expected to see records continually added to WRK_TEMP_EVENTS  in a similar way to how the select keeps adding records, but the query looks to be holding the processed records in memory before writing the everything to the table when the query has completed.
Avatar of IT Guy

ASKER

PS: I don't get any errors, just non-action. We are so close. Thanks for your help. 
Running the query without going to WRK_TEMP_EVENTS works as expected ?

It is only when outputting the results to the table ?

Volume shouldnt be an issue, but there could be errors within the procedure.... As ste5an said, could be the difference between cross apply and outer apply - which is why I ask about running the query without capturing data results in a separate table
Avatar of IT Guy

ASKER

Running the query, limited to 5 nodes, writes data to WRK_TEMP_EVENTS .

I'm running the query now without writing to WRK_TEMP_EVENTS to see how long it takes. Then I'll run it with the update to WRK_TEMP_EVENTS - and I'll know roughly what time it will complete.

But yes, the query works beautifully, the groupings are spot-on. I just need to let it complete after I know how long it should take (I don't want to wait all day watching 'Executing' - and timing it without the update will give me a good idea of when it should complete.

And strewth mate, you should be asleep :) I had a good Australian friend in high-school when I was living in England, and I can almost hear your Aussie accent!   
Hahaha.... Yes, sleep, kind of overrated and love that "strewth mate". Very Aussie.

Would be also interesting to show row counts of the #tmp table as well. Shouldnt take too long, although if I remember correctly the old procedure could take the best part of a day ?

We then have to check row counts to see if there is difference between #tmp and WRK and then determine if there was a violation of data constructs....

ASKER CERTIFIED SOLUTION
Avatar of Mark Wills
Mark Wills
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of IT Guy

ASKER

Great. Thanks mate. Get some sleep. I'll try the new suggestion and let you know.
Avatar of IT Guy

ASKER

I added (WITH (NOLOCK) on all the table selects, I added 'outer apply' (thanks ste5an?) and I selected:

select * from WRK_TEMP_EVENTS WITH (NOLOCK)

And I can see data populating in WRK_TEMP_EVENTS - which is excellent news. So long as I know data is populating - which I can now see - that could be it. Done.

I just need to make the first select into a physical table (?? they're all virtual in a way, but a table table, if you know what I mean) and run.

I'll let you know. Thanks both.
Avatar of IT Guy

ASKER

Not sure if it's the OUTER APPLY or NOLOCK, but so far it processed ~50,000 records in 10 minutes which is a clear record. I think copying records into WRK_TEMP_EVENTS must free up a resource, because it wasn't close to being this fast before.
It will be hard to isolate. But if it is continuing, then it could be the OUTER APPLY

Unlikely, because the cross apply will return a count - even if zero - regardless of the where - which is why we are using an aggregate to return a count e.g.  
;with cte as
( select 1 as RN,'A' as a, 'B' as b
  union all
  select 2,'1','2'
  union all
  select 3,'Z','Z'
) select * from cte
  cross apply (select count(*) as groupvalue from cte cte2 where cte2.rn <= cte.rn and a=b) g

Open in new window



If you are measuring progress by : SELECT ... NOLOCK then that probably has shown rows that would otherwise be not available if respecting locking.

It will slow down, the more it has to count. In that regard, there's a couple of thoughts, one of which could well be to materialise the #tmp_events. We need to kepp it all tight and as simplified as possible.

Arguably, we just need a count (rather than select *) to show progress, and compare that to the count of #tmp_events.

10 minutes is a long time in my books for 50,000 rows - hopefully that is elapsed time and includes the time taken to build #tmp_events as well....

Though, cant argue too much if there are new records being set when compared to the old ways....

Avatar of IT Guy

ASKER

G'Day mate.

I'm interested in what you mean by this:

Arguably, we just need a count (rather than select *) to show progress, and compare that to the count of #tmp_events.

I already have 2 tables, 1 representing #tmp_events and one representing the output of the query.

What am I missing, pls?
If you are doing :  select * from WRK_TEMP_EVENTS WITH (NOLOCK)

Then to check progress, you only need a counter e.g.

select count(*) from WRK_TEMP_EVENTS WITH (NOLOCK)

Then compare that to the count taken after the build of #tmp_events - you dont have to return actual rows / column values to check progress. The total number of rows will end up being the same, so it is a direct comparison....

And depending on row counts / volumes, we can probably improve performance by changing #tmp_events into aa actual table (as per WRK_TEMP_EVENTS) where we could optimise even more so.... at the very least we could get a count from a different query window in much the same way you are getting results for WRK_TEMP_EVENTS.

The deciding factor would be how often / how permanent this type of analysis is required. It could be possible to make those tables a bit more incremental instead of having to recalculate everything from scratch each time.


Avatar of IT Guy

ASKER

This is how I see it.

Looping through a cursor of a recordset with really easy 'if this then do that' code - it took 3 times longer than this code.

Also, I kind of took a liberty with the question. Because the data is always only 1 START_EVENT and 1 END_EVENT or 2 START_EVENT and 2 END_EVENT. It's more uniform than I explained - but I wanted 1 proc to fit all - and your code does that well. I tested today on 1 and 2 EVENT data and it worked perfectly on each - and as I say, it's quicker than DECLARE ****** CURSOR FOR.

The runtime totally grows as the recordset grows, but I tested it on the largest recordsets, and it's more than acceptable, compared to the CURSOR results.

Nothing is ever perfect? But your code helped me to supply 1 database which fits both data requirements. So it's a win. And you're awesome. And it's nice to know you. Thank you. 
Thanks mate, very kind words. And my absolute pleasure. Thank you.
Had another curious thought....

Wondering how this approach would handle your volumes...


  ; with cte2 as
( select C.RN,C.[date],C.[EventClass],C.[node],C.StartDate, C.EndDate
        ,case when C.[Node] <> isnull(P.[Node],0)
                 or C.startdate is not null and P.Startdate is null
               or C.enddate is null and P.enddate is not null then 1 else 0 end as INC_Grouping
  from #tmp_events C
  left outer join #tmp_events P on P.RN = C.RN-1    -- previous row
)
  select RN,[DATE],[EventClass], [NODE], SUM(INC_Grouping) OVER(ORDER BY RN ROWS UNBOUNDED PRECEDING) groupvalue --,StartDate,EndDate, INC_Node, INC_Start, INC_End --, dense_rank() over (order by groupvalue) as [Grouping]  
  from cte2 C
--  cross apply (select count(*) as groupvalue from cte2 C2 where C2.rn <= C.rn and (c2.INC_Start = 'Y' or c2.INC_Node = 'Y' OR c2.INC_End = 'Y') ) G
  order by RN;

Open in new window


Avatar of IT Guy

ASKER

I suppose....106,421 records in 1 second is an improvement?

WOW WOW. Very cool. I want to give you more kudos.

Mate, I owe you a Fosters, if that's your thing.

That's much better - I'm happy now. That is very cool. Thanks for following up. Much appreciated.