catch the code behind a sql error

there is a specific sql error that happens few times a week.. (maybe one error every other day.. there is no pattern of when the error happens)..

I need to catch the exact process causing the error without overwhelming the profile process...

I don't know when I will see the same error again on the error log.. so if can capture all entries, I am going to capture 100s of 1000s of records in profiler.

if you knew the error number that you are after, how would you go about pinpointing the code that caused it? (if you did not know when it will occur nextx or which database etc)
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Barry CunneyCommented:
Hi 25112,
You could use SQL Alerts to help narrow down potential suspects.

Create a SQL Server event alert and specify the error number you mention.
Also, in the Response options, specify that the alert executes a job which records details, to a table, of current processes executing. Then you will know everything that was executing on your SQL Server when the error number occurred.

The following SQL can be used to get details of current processes:
,qt.text AS [Parent Query]
,session_Id AS [Spid]
,DB_NAME(sp.dbid) AS [Database]
,SUBSTRING (qt.text, (er.statement_start_offset/2) + 1, --#1
((CASE WHEN er.statement_end_offset = -1
ELSE er.statement_end_offset
END - er.statement_start_offset)/2) + 1) AS [Individual Query]
FROM sys.dm_exec_requests er --#2
INNER JOIN sys.sysprocesses sp ON er.session_id = sp.spid
CROSS APPLY sys.dm_exec_sql_text(er.sql_handle)as qt
WHERE session_Id > 50
AND session_Id NOT IN (@@SPID)
ORDER BY (sp.cpu + sp.memusage + sp.physical_io) DESC

Open in new window


Please let me know if you need more advice on how to create SQL Server alerts
25112Author Commented:
hi, I tried the same.. but 'job' section is greyed out.. how can you have the code kicked off at the same time as the error event?

pl see attached.

the exception is "Exception      Error: 823, Severity: 24, State: 7      .Net SqlClient Data Provider            RAZAUser      2040      95      2014-07-15 11:18:20.727      40      RAZA      823      616980      1      EDWEBP801            0XB8DD0A8021C70743BBFDEF23F08E70E1            0      POSRAR1                        RAZAUser      24      7      164986481      0                                                                                    
Barry CunneyCommented:
Hi 25112,
Please confirm that SQL Server Agent is running on your server.
If so, please confirm that you can crate a job by right-clicking on SQL Server Agent in Management Studio and selecting New | Job
Get expert help—faster!

Need expert help—fast? Use the Help Bell for personalized assistance getting answers to your important questions.

25112Author Commented:
yes agent is running- confirmed.. I can create a dummy job no problem..
other jobs are running today - no problems there either..

just that in the response, i see that section greyed out..

(in the new job...'Steps' and 'Schedules' are left empty, right? notification and targets also empty.. i just say 'new job'... give it a name and go to alerts.. am i missing anything else?)
Barry CunneyCommented:
Hi 25112,
Start off by creating a new alert outside of the job
Right-click on SQL Server Agent and choose New | Alert
25112Author Commented:
ok.. that works better.. :) thanks Barry..

now will sql server be very quick to capture all those records in that millisecond when the error happens!?

i am trying to put them in a temp table to analyze it later.. would that be your approach too?
Barry CunneyCommented:
Hi 25112,
For the SELECT in my original post turn this into an INSERT....SELECT and insert to a physical table so as you can then query this table.
Yes there is the possibility that the error and underlying process only exists for milliseconds, but you have to try and attempt to capture some degree of evidence.
On the Alert you can also get it to e-mail you.

So if you get an e-mail and nothing logged by the job in the table then we plan further at this stage.
... maybe we modify the job to only do more basic logging first, like only the PID to prove that we can potentially grab some evidence when the error occurs
25112Author Commented:
Hi, you are right.. it happened quickly and nothing captured...

The error already happened  in the last hour.. but the query did not store the data... is it because the error comes and goes so fast..? and by the time the event is triggered, it is already gone?

kindly suggest your thought/idea on capturing pid?
Barry CunneyCommented:
Hi 25112,
Did you set up an e-mail operator on the Alert so that you get an e-mail when the error number 823 occurs?
25112Author Commented:
yes. that works Barry. (email)

so we have the procedure in place.. only thing is to capture in live what statement caused the error...

in see that the error happens so quickly and our alert send out the email and captures in the table.. but in those millisecond, the spid is probably relieved for another new process.. so the procedure as good as it is, does not help in this case?

profiler is the alternative, but heavy resource and makes users slow on live server app access.

thanks for any other insights you can give.
Marten RuneSQL Expert/Infrastructure ArchitectCommented:
Have you pursuied the root cause analysis for: Error: 823, Severity: 24,State: 7

This indicates failing HW, storage to be exact, DBCC CHECKDB is your friend in this case.
This may happend when a proc writes to a faulty page, i e it depends on what the where clause directs the write/update to take Place. So it's not a faulty application, but faulty hardware.

You need to run CheckDB as soon as possible om theese databases. If they are corrupt, you want to look for new hardware for your system.

Reggards Marten
Marten RuneSQL Expert/Infrastructure ArchitectCommented:
Quote from Paul Randalls page: "...823 is a nasty error to get – it says that an I/O operation failed at the OS level and the I/O subsystem is causing corruption – SQL Server didn't even get a chance to check the page checksum (and potentially raise an 824 error)..."

Read up and take action immediatily

Regards, and good luck, Marten

PS Check backups on Another system, and run DBCC CHECKDB on this restored db to see how bad this is hitting you.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2008

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.