Mac
asked on
SQL scripted data correction
Using SQL 2012 I perform a manual process each month cleaning up erroneous data captured by the system. Yes it would be better to stop the data at the source but that's harder than it sounds.
I would like to script this entire process so I can run it either with a single call or even automate the running so that it happens before the SSRS report is generated. But I cant figure out how to put all the Hard pieces together with what I have now.
An example from todays work, I receive a trend for the month with unexplained spikes in the data. I note the date and run the following:
The results look like this:
So I find the ID number of the bad record and run this:
The issue I would like to solve is that the value I enter here is an approximation of the values before and after. Just had to be close, we are looking for a trend not specific numbers.
How would I automatically calculate a value to place in that single record and how would I do that for a months worth of data?
I would need to run through the months values looking for a number that is WAY higher than its predecessor and WAY higher than its successor. Then figure out what value to place there that is a reasonable mid value for the direction the data is trending.
Does this make sense? it does in my head .:)
I would like to script this entire process so I can run it either with a single call or even automate the running so that it happens before the SSRS report is generated. But I cant figure out how to put all the Hard pieces together with what I have now.
An example from todays work, I receive a trend for the month with unexplained spikes in the data. I note the date and run the following:
Declare @StDate DateTime
Declare @EndDate DateTime
Set @StDate = '6/06/2015 00:00:00.000' -- <<<<<<<<< SET BEGIN DATE
--Set @EndDate = DateAdd(month,1,@StDate) -- One Month
SET @EndDate = DateAdd(day,1,@StDate) --One Day
Select dateadd(day,(datediff(day,0,TIMEandDATE)),0) as ReadingDate, TimeandDate,Value, id, tagid, _COMMENT
From MyTable
where
TagID = 3004
and value > 0.06 -- <<<<< This varies depending on what I determine is a reasonable value based on before and after values
and TimeAndDate >= @StDate
and TimeAndDate < @EndDate
The results look like this:
2015-06-06 00:00:00.000 2015-06-06 20:00:00.953 0.0713222 8253107 3004
2015-06-06 00:00:00.000 2015-06-06 20:04:59.963 0.07127935 8253134 3004
2015-06-06 00:00:00.000 2015-06-06 20:09:59.947 0.07303441 8253161 3004
2015-06-06 00:00:00.000 2015-06-06 20:30:00.117 0.07020926 8253269 3004
2015-06-06 00:00:00.000 2015-06-06 20:35:00.107 0.1464455 8253296 3004 <<<<< Invalid Data Captured
2015-06-06 00:00:00.000 2015-06-06 20:40:00.193 0.07341963 8253323 3004
2015-06-06 00:00:00.000 2015-06-06 20:50:00.287 0.07256353 8253377 3004
2015-06-06 00:00:00.000 2015-06-06 21:50:00.563 0.07243514 8253701 3004
2015-06-06 00:00:00.000 2015-06-06 22:10:00.280 0.07252073 8253809 3004
So I find the ID number of the bad record and run this:
Update MyTable
Set Value = 0.073 , -- <<<<<<<<<<<<<<<<< SET CALCULATED (Reasonable)VALUE HERE
_COMMENT = LTRIM(ISNULL(_COMMENT,''+'Orig Value '+ CAST(Value AS VARCHAR(20)))) -- Places old value into comment field
where id = 8253296 ---- <<<<<<<<<<<<<<<<<<<<<< ENTER ID OF INVALID DATA HERE
GO
Which then produces this:2015-06-06 00:00:00.000 2015-06-06 20:00:00.953 0.0713222 8253107 3004 NULL
2015-06-06 00:00:00.000 2015-06-06 20:04:59.963 0.07127935 8253134 3004 NULL
2015-06-06 00:00:00.000 2015-06-06 20:09:59.947 0.07303441 8253161 3004 NULL
2015-06-06 00:00:00.000 2015-06-06 20:15:00.040 0.06931031 8253188 3004 NULL
2015-06-06 00:00:00.000 2015-06-06 20:20:00.037 0.06853986 8253215 3004 NULL
2015-06-06 00:00:00.000 2015-06-06 20:25:00.130 0.06926751 8253242 3004 NULL
2015-06-06 00:00:00.000 2015-06-06 20:30:00.117 0.07020926 8253269 3004 NULL
2015-06-06 00:00:00.000 2015-06-06 20:35:00.107 0.073 8253296 3004 Orig Value 0.146445 <<< CORRECTED VALUE
2015-06-06 00:00:00.000 2015-06-06 20:40:00.193 0.07341963 8253323 3004 NULL
2015-06-06 00:00:00.000 2015-06-06 20:45:00.183 0.06935316 8253350 3004 NULL
2015-06-06 00:00:00.000 2015-06-06 20:50:00.287 0.07256353 8253377 3004 NULL
2015-06-06 00:00:00.000 2015-06-06 20:55:00.267 0.06931031 8253404 3004 NULL
2015-06-06 00:00:00.000 2015-06-06 21:00:00.253 0.06776935 8253431 3004 NULL
2015-06-06 00:00:00.000 2015-06-06 21:05:00.350 0.06717008 8253458 3004 NULL
The issue I would like to solve is that the value I enter here is an approximation of the values before and after. Just had to be close, we are looking for a trend not specific numbers.
How would I automatically calculate a value to place in that single record and how would I do that for a months worth of data?
I would need to run through the months values looking for a number that is WAY higher than its predecessor and WAY higher than its successor. Then figure out what value to place there that is a reasonable mid value for the direction the data is trending.
Does this make sense? it does in my head .:)
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Salad-Dodger, do you still need help with this question?
ASKER
Apologies, It was a holiday weekend here and I (and my wife) was determined not to let work intrude this time.
It looks like Chaau has provided an intriguing solution but I have not yet seen if I can implement it. I will do that early in the week. I'm using 2012 so I hope that doesn't present a problem. Fiddle doesn't provide a 2012 option..
Did you have an alternative solution?
It looks like Chaau has provided an intriguing solution but I have not yet seen if I can implement it. I will do that early in the week. I'm using 2012 so I hope that doesn't present a problem. Fiddle doesn't provide a 2012 option..
Did you have an alternative solution?
Not really. The LAG and LEAD functions was introduced in SQL Server 2012 to make this kind of solutions more easy.
ASKER
Oh good. 2012 is what I use because I used PERCENTAGE function elsewhere.
I noticed Fiddler only had 2008 and 2014 as options so I got a little concerned that I would be on an island with this version.
I noticed Fiddler only had 2008 and 2014 as options so I got a little concerned that I would be on an island with this version.
Use 2014 then. It's almost a tuned 2012 ;)
ASKER
Nice answer. Quick too. I'm having a little trouble integrating it into my system, but that's clearly my issue and I shouldn't keep the thread open because of that. Thank you a great deal for the help.