Link to home
Start Free TrialLog in
Avatar of Skip House
Skip HouseFlag for United States of America

asked on

SELECT query occassionally misses a row of data.

I'm looking for reasons why a SELECT query does not return all rows that satisfy the selection criteria.

 I have created such a query and incorporated it in a stored procedure that runs periodically at 15 minute intervals.  Most of the time it returns all of the expected rows.  At other times a row is missing even though the data in the tables being queried remains unchanged as does the selection criteria for three successive executions of the query.  

The routine processes 20 to 30 transactions a day correctly, but every couple of days it misses a transaction.

I suspect this might be due to locking, but it seems improvable that records could remain locked for a 30 minute period.  

Am I overlooking any other possible reason or reasons for this behavior.  

Thanks in advance for your time and consideration.
Avatar of yo_bee
yo_bee
Flag of United States of America image

Without your sql statement and your data set it's hard to say.

Seem very unlikely that it's missing data and more likely the  data does not meet your where criteria or join statement.
Avatar of Skip House

ASKER

Normally, I would agree with your suggestion, but the primary criterion is a date and time stamp.  To fix the data that was missed, I simply update that value to the current time and wait until my routine runs again when it is immediately returned as expected.  Nothing was changed in any other data value or the stored procedure.  
That to me suggests that it is neither criteria or the SQL query, but rather something outside my code, which is the reason I thought of locks.
The main difference however, is that I make the correction after normal business hours when the associated database application (an ERP package) is not being used.
So it is what I stated your where clause is not met until you manipulate the value to meet your criteria.
What's your where clause and what is the date and time that originally is in the record.
>>"but the primary criterion is a date and time stamp.  <<

My guess is you are creating a tiny gap at the top end of your time range.

I do hope you are NOT using between to define your date range. In particular I trust you are not doing something like this:

   where mytimestamp between '2018-11-18' and '2018-11-18 23:59:59'

NEVER try to define the end point precisely. Just avoid using between, and do it this way:

   where mytimestamp >= '2018-11-18' and mytimestamp  < '2018-12-18'  /* note this is now "up one" */

For more on this topic please see: "Beware of Between"

the best practice with date and time ranges is to avoid BETWEEN and to always use the form:

WHERE col >= '20120101' AND col < '20120201'
This form works with all types and all precisions, regardless of whether the time part is applicable.
Itzik Ben-Gan

Also:
"Bad habits to kick : mis-handling date / range queries"
"What do BETWEEN and the devil have in common?"
No between criteria, just greater than to pickup changes in data.
That's why just updating the time stamp to the current date and time causes the record to be returned as expected when I go in after hours to fix it.
I'm currently of the grid so I don't have access to my stored proc, but the where clause is something along the lines of: WHERE [time stamp column] > [a time stored in a procedure variable]
Can you post your select query?
Are you using "WITH (NOLOCK)" in the query?  If so, it could indeed miss rows at times, because you've "told" SQL it can do that.
No WITH (NOLOCK) for precisely that reason,  though I contemplated using it before I changed the procedure to try initially two and then three attempts to retrieve changes at 15 minute intervals.
As I said before, I am off the grid so I can't post the actual queries now. I'll post them by Thursday.
Here is the actual select query that works most of the time but occasionally fails to pickup a record that satisfies the criteria on occasion.  Keep in mind that when I rerun this query in off-hours and reset the value of @EarliestUpload the missing record is returned.

      SELECT SOM_SalesOrderID AS SalesOrderID, SOM_RecordID AS SalesOrderRecordID, CUS_CustomerID AS CustomerID,
            SOM_CUS_RecordID AS CustomerRecordID, SOM_SalesOrderDate AS OrderDate, SOM_BookDate AS BookedDate, NULL, NULL,
            SOM_LastModifiedDate AS ModifiedOn, 0 AS Processed
      FROM dbo.Shipment INNER JOIN dbo.SalesOrder
                  ON Shipment.SH_SOM_RecordID = SalesOrder.SOM_RecordID
            INNER JOIN dbo.Customer
                  ON SOM_CUS_RecordID = CUS_RecordID
      WHERE SOM_RecordID NOT IN (SELECT SalesOrderRecordID FROM tblDexSOHeader)
            AND SH_ShipDate > @EarliestUpload
You say that the query returns as expected once you reset the value of @EarliestUpload. This raises a couple of questions. How are you resetting the value (i.e. are you hardcoding a datetime variable)? If so, does it contain a time element? How was the @EarliestUpload value established originally and did it contain a time element? How is the 15 minute interval established? You indicate that it runs periodically every 15 minutes, does that mean it runs precisely every 15 minutes? It seems to me that the problem, if it's corrected after physically resetting the @EarliestUpload value, must lie in how the value is established and how it's being compared.
I really don't think the problem lies is the datetime value because I have a good indication the date gets set correctly each time it is run.  This is part of a stored proc that is executed by SQL Agent four times an hour at 04, 19, 34, and 49 minutes past the hour.  @EarliestUpload is from a table of the three most recent processing dates and times when the proc is executed.  

Is it possible that GETDATE() fails to return the system time?
Are you using DATETIME2 in any of the columns referenced by @EarliestUpload?

Is @EarliestUpload declared as datetime or datetime2?

Should you be using SYSDATETIME instead of getdate()?
(sysdatetime, perhaps ironically, returns a datetime2 value)

nb: datetime (lack of) precision can lead to rounding error.
No, all date and time references are set to DATETIME.  My application does not need the extra range or precision of DATETIME2 since the query uses greater than queries and picks up transactions over at least a half hour range of time.

Skip
Your where clause is the issue here. The computer system time is very precise and it could be that the records are just shy of 30 mins.  

Without seeing the @earliestupload value it is hard to say if what I am thinking is the reason why.
That is not the issue.  The process makes three attempts to find data that has changed in the last 30 minutes at fifteen minute intervals.  Thus a particular transaction update at one point in time has at a minimum two opportunities to be identified and returned.  

I set it up this way to get around the fact that its record could be locked the first or second time my procedure runs.  I'm assuming any blocking lock would not last for any extended amount of time by the user taking their sweet time to enter a shipment.  I'm assured that users complete their transactions within a minute or two.
But the system is doing what you asking it to do.  So something is wrong with the data that it is falling out of the range of your Where Clause.

You stated earlier in this thread that if you go in and change the value the script works. So something is different about the original value to the modified value.
The source data has a DATETIME value (SH_ShipDate) that I is set to the current date and time when a shipment is made.  That is the value that I compare to @EarliestUpload.  When SH_ShipDate is greater than @EarliestUpload ( which is 30 minutes earlier than the time of processing) the details are returned.

To fix (i.e. retrieve the missing record so that it will be transferred to the CRM system), I simply update SH_ShipDate to the current date and time.  When I do this (usually after hours) it is immediately picked up by the query.  Once I see that it have been successfully processed, I revert it to the actual date and time of the shipment.
Disregard the "I" in the first sentence of my last comment.  I should read "that is set...".
This question needs an answer!
Become an EE member today
7 DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform.
View membership options
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.