Alexandre Simões
asked on
How to get the latestdata from this table...
Hi,
I'm using SQL 2005 or 2008
Imagine a table with the following appearance:
ID Col1 Col2 Col3
== ==== ==== ====
1 123 234 NULL
2 23 NULL 456
3 NULL 21 NULL
Is there a way to get the latest values of each column on the table?
On this case I wnat to get:
Col1 Col2 Col3
==== ==== ====
23 21 456
Note that I want to make this query generic for every table, so I don't know the schema of the table :)
I can do this on the C# side but it would be nice to have it all on the SQL inside a SP.
Thanks,
Alex
I'm using SQL 2005 or 2008
Imagine a table with the following appearance:
ID Col1 Col2 Col3
== ==== ==== ====
1 123 234 NULL
2 23 NULL 456
3 NULL 21 NULL
Is there a way to get the latest values of each column on the table?
On this case I wnat to get:
Col1 Col2 Col3
==== ==== ====
23 21 456
Note that I want to make this query generic for every table, so I don't know the schema of the table :)
I can do this on the C# side but it would be nice to have it all on the SQL inside a SP.
Thanks,
Alex
what do you mean by generic? you don't know the column names in advance?
try this
with a as (
select row_number() over(order by case when col1 is null then -1 else id end) as row1,
row_number() over(order by case when col2 is null then -1 else id end) as row2,
row_number() over(order by case when col3 is null then -1 else id end) as row3,
col1,col2,col3, count(id) over() as cnt
from your_table)
select max(case when cnt=row1 then col1 else null end) as col1,
max(case when cnt=row2 then col2 else null end) as col2,
max(case when cnt=row3 then col3 else null end) as col3
from a
with a as (
select row_number() over(order by case when col1 is null then -1 else id end) as row1,
row_number() over(order by case when col2 is null then -1 else id end) as row2,
row_number() over(order by case when col3 is null then -1 else id end) as row3,
col1,col2,col3, count(id) over() as cnt
from your_table)
select max(case when cnt=row1 then col1 else null end) as col1,
max(case when cnt=row2 then col2 else null end) as col2,
max(case when cnt=row3 then col3 else null end) as col3
from a
Here ya go
select
(select Col1 from ExampleTable where ID =
(select max(ID) from ExampleTable where Col1 IS NOT NULL)) Col1
,(select Col2 from ExampleTable where ID =
(select max(ID) from ExampleTable where Col2 IS NOT NULL)) Col2
,(select Col3 from ExampleTable where ID =
(select max(ID) from ExampleTable where Col3 IS NOT NULL)) Col3
results.bmp
ASKER
Hi, thanks for the replies.
momi_sabag: looks good and works fine, I tend not to like CTE's tho as they're slow.
HoggZilla: that's how I'm doing it right now. Works fine.
Now, both approaches work but what I really wanted from this question was the Generic part of it.
If possible I would like to make it dynamic to be used on several tables with different schemas rather that having to code on SP per table.
I know this will be slower but I don't mind if it would be a bit slower but generic.
Any ideas?
momi_sabag: looks good and works fine, I tend not to like CTE's tho as they're slow.
HoggZilla: that's how I'm doing it right now. Works fine.
Now, both approaches work but what I really wanted from this question was the Generic part of it.
If possible I would like to make it dynamic to be used on several tables with different schemas rather that having to code on SP per table.
I know this will be slower but I don't mind if it would be a bit slower but generic.
Any ideas?
Will the first column in the table be an integeter that can be used to determine insert order, eg Identity column?
ASKER
We certainly have to make some rules.
One can be assuming that the first columns is the identity and its type is int, no problem.
thanks,
Alex
One can be assuming that the first columns is the identity and its type is int, no problem.
thanks,
Alex
CTE query would be faster than the inline multiple queries I would think - need to be a very large table to check, and probably no real difference in smaller tables... And a bit of a contrast with your other comment about flexibility versus speed...
certainly the inline query is "easier" to do some dynamic SQL on, and identifying the MAX(ID) as a link is not real hardship, the problem is how do you determine the column ? Are there any other rules ? Are you going to know the table_name ?
You can grab the table definition from the information_schema using columns ordinal position and run a series of selects using each column that way, but it is not going to be very fast... Two ways, either loop, or, use a concatenation technique using for xml...
Assumed that each table has a column known as ID and you have to nominate the table name...
certainly the inline query is "easier" to do some dynamic SQL on, and identifying the MAX(ID) as a link is not real hardship, the problem is how do you determine the column ? Are there any other rules ? Are you going to know the table_name ?
You can grab the table definition from the information_schema using columns ordinal position and run a series of selects using each column that way, but it is not going to be very fast... Two ways, either loop, or, use a concatenation technique using for xml...
Assumed that each table has a column known as ID and you have to nominate the table name...
--
-- method 1 loop through information schema building up some dynamic sql
--
declare @table_name varchar(200)
declare @column_name varchar(200)
declare @sql varchar(max)
declare @i int
set @table_name = 'customers'
set @sql = ''
set @i = 1
while @i <= (select max(ordinal_position) from information_schema.columns where table_name = @table_name)
begin
select @column_name = column_name from information_schema.columns where table_name = @table_name and ordinal_position = @i
set @sql = @sql+'(select ['+@column_name+'] from '+@table_name+' where ID = (select max(ID) from '+@table_name+' where ['+@column_name+'] IS NOT NULL)) as ['+@column_name+'],'+char(13)+char(10)
set @i = @i + 1
end
print @sql
exec(@sql)
--
-- method 2 - a bit cooler, using a concatenation technique...
--
declare @table_name varchar(200)
set @table_name = 'customers'
declare @sql varchar(max)
declare @i int
set @i = (select max(ordinal_position) from information_schema.columns where table_name = @table_name)
set @sql = replace((select '(select ['+column_name+'] from '+table_name+' where ID = (select max(ID) from '+table_name+' where ['+column_name+'] IS NOT NULL)) as ['+column_name+']'+case when ordinal_position = @i then '' else ',#CRLF#' end [text()] from information_schema.columns where table_name = @table_name order by ordinal_position for xml path('')),'#CRLF#',char(13)+char(10))
print @sql
exec(@sql)
Apologies, do not do the exec (@sql) - the queries do have to have a 'Select' to begin with etc...
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks! This is good stuff.
I'm working in your 2nd method and changed it a bit so I don't have to pass in the identity column.
As usually each table have it's own identity column name it would be a overhead to query the database for the identity column name before actually calling your code.
I'll be running more tests to better identify the best option.
Once again, thank you all.
Alex
I'm working in your 2nd method and changed it a bit so I don't have to pass in the identity column.
As usually each table have it's own identity column name it would be a overhead to query the database for the identity column name before actually calling your code.
I'll be running more tests to better identify the best option.
Once again, thank you all.
Alex
declare @table_name varchar(200)
set @table_name = 'Test'
declare @sql varchar(max)
declare @i int
declare @idcolumn_name varchar(50);
-- get identity column name
select @idcolumn_name = column_name
from information_schema.columns
where
table_schema = 'dbo' and table_name = @table_name
and columnproperty(object_id(table_name), column_name,'IsIdentity') = 1
order by table_name
if (@idcolumn_name is not null)
begin
-- generate sql
set @i = (select max(ordinal_position) from information_schema.columns where table_name = @table_name)
set @sql = 'select ' +
replace((
select '(select ['+column_name+'] from '+table_name+' where ' + @idcolumn_name + ' = (select max(' + @idcolumn_name + ') from '+table_name+' where ['+column_name+'] IS NOT NULL)) as ['+column_name+']'+
case when ordinal_position = @i then ''
else ',#CRLF#' end [text()]
from information_schema.columns
where table_name = @table_name
order by ordinal_position for xml path(''))
,'#CRLF#',char(13)+char(10))
print @sql
exec(@sql)
end
Running a quick query on the information schemas is not too big an ask - will not add all that much overhead, and really not too many options... Will be interesting to see the final answer...
Cheers,
Mark Wills
Cheers,
Mark Wills
ASKER
Running this query on a table with 155.000 records and 5 columns takes no more than 3 seconds on my laptop.
This will never happen as the query will never be on this amount of data but even on this kind of extreme scenario and taking in account the kind of query we're doing here, 3 seconds is a very good result.
I will run some more tests and close thread this shortly.
Thanks!
Alex
This will never happen as the query will never be on this amount of data but even on this kind of extreme scenario and taking in account the kind of query we're doing here, 3 seconds is a very good result.
I will run some more tests and close thread this shortly.
Thanks!
Alex
Seems reasonable. Not too many things you can do and retain that ultimate flexibility... As you said before there is a small price and it would seem that it really is a small price to pay... Have found laptops to occasionally outperform grunty servers - so - just be aware that the humble laptop is not a gaurantee that it is a lot slower than the server.
ASKER
Yeah, my laptop is not that bad at all and as this is meant to be applied on a SOA architecture as a logging service I really don't know how good will the "servers" be :)
I'm putting into battle the well known relational logging datamodel and a mirror datamodel approach.
Still, the 3 seconds delay is good for me as this isn't a "production" task.
Once again, thanks for your help.
I'm putting into battle the well known relational logging datamodel and a mirror datamodel approach.
Still, the 3 seconds delay is good for me as this isn't a "production" task.
Once again, thanks for your help.