Solved

Sql Stored Procedure and records in "use"

Posted on 2016-08-25
10
49 Views
Last Modified: 2016-08-25
I have a VB App that pulls emails from our users subscribed database with a [status] = 0  and sends as an smtp email to a 3rd party api.
As it is successfully sent it updates the sql table [status] = 1

By itself it handles about 12,000 records an hour.

I need to double that ...

I know I can add as separate tasks or even another replica of the same app.

Here is my issue...
If I have one app pulling 10,000 addresses from the SQL Table in my stored procedure

How do I make sure the replica app or second task doesn't pull the same emails into it's own 10,000 addresses recordset?
0
Comment
Question by:lrbrister
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
  • 2
  • +3
10 Comments
 
LVL 50

Expert Comment

by:Vitor Montalvão
ID: 41770272
How do I make sure the replica app or second task doesn't pull the same emails into it's own 10,000 addresses recordset?
If you work inside a transaction the records will be locked and no other process can access them until you release the lock with a COMMIT or ROLLBACK command.
0
 

Author Comment

by:lrbrister
ID: 41770276
So... lets say there are 15000 records
Two processes call the same stored procedure that gets 10,000 records at a time
First app runs and gets first 10k records

Second app runs 2 seconds later and will skip the first 10,000 records and just pulls the remaining 5,000 records?
0
 
LVL 34

Expert Comment

by:ste5an
ID: 41770278
12k rows is pretty few for SQL Server. No need for a replica.

You need to look at the bottleneck: Can that 3rd party API handle this? How long does your current program needs to send one mail? What size has one email? Do you have the necessary bandwidth?

My first attempt after looking at this information and those numbers would be distributing the load over more SMTP sender processes. Cause my tingling developer sense tell me, that it is just here, where we need to do something. Often a simple round-robin does the trick:

Just select rows by user modulo number_of_running_senders  and give each process a unique counter.
0
Get 15 Days FREE Full-Featured Trial

Benefit from a mission critical IT monitoring with Monitis Premium or get it FREE for your entry level monitoring needs.
-Over 200,000 users
-More than 300,000 websites monitored
-Used in 197 countries
-Recommended by 98% of users

 

Author Comment

by:lrbrister
ID: 41770285
ste5an
There are usually 300-400 k records to run.

The third party api can handle petty much whatever we send it ...supposedly

However...
Each of these emails is sent with a unique HTML Content Body for that particular individual.
As the record is pulled in there is a "REPLACE" on the vb side... the HTML CAN be fairly extensive.
Then the record is send.

That's why the low rate
0
 
LVL 13

Expert Comment

by:Nakul Vachhrajani
ID: 41770294
You basically need a look-up/log table where the first process logs the Ids for the addresses being fetched.

When the replica app runs, it should check (by looking at the log table) that it is processing only values that the other app has not processed.

NOTE: Depending upon your implementation, you may want to clear out/truncate the log table once all E-mails have been sent.

Here's a quick sample.

USE tempdb;
GO
--Safety Check
IF OBJECT_ID('dbo.AddressTable','U') IS NOT NULL
BEGIN
    DROP TABLE dbo.AddressTable;
END
GO

--Safety Check
IF OBJECT_ID('dbo.LogTable','U') IS NOT NULL
BEGIN
    DROP TABLE dbo.LogTable;
END
GO

--Safety Check
IF OBJECT_ID('dbo.proc_SendEmails','P') IS NOT NULL
BEGIN
    DROP PROCEDURE dbo.proc_SendEmails;
END
GO

--Create required objects
CREATE TABLE dbo.AddressTable 
       ([PersonId]     INT           NOT NULL,
        [EmailAddress] NVARCHAR(255) NOT NULL,
        [Status]       BIT           NOT NULL
       );
GO

CREATE TABLE dbo.LogTable
      ([PersonId]    INT      NOT NULL,
       [ProcessDate] DATETIME NOT NULL
      );
GO
  
--Assume this SP is sending the E-mails                       
CREATE PROCEDURE dbo.proc_SendEmails 
        @batchSize INT
AS
BEGIN
    SET NOCOUNT ON;

    CREATE TABLE #someTable ([DummyValue] INT NULL,
                             [DummyDate]  DATETIME NULL
                            )

    --When you fetch the E-mails to process,
    --log the Ids into your log table
    INSERT INTO #someTable (DummyValue, DummyDate)
        OUTPUT inserted.DummyValue, 
               inserted.DummyDate  
        INTO dbo.LogTable ([PersonId], [ProcessDate])
    SELECT TOP (@batchSize) 
           at.PersonId AS DummyValue,
           GETDATE()   AS DummyDate
    FROM dbo.AddressTable AS at
    LEFT OUTER JOIN dbo.LogTable AS lt ON at.PersonId = lt.PersonId
    WHERE at.[Status] = 1
      AND at.PersonId IS NOT NULL
      AND lt.PersonId IS NULL;
END
GO

--Generate some test data. NOTE: I have taken test data from the AdventureWorks sample database
USE tempdb;
GO
INSERT INTO dbo.AddressTable (PersonId, EmailAddress, [Status])
SELECT ea.BusinessEntityID,
       ea.EmailAddress,
       CASE (ea.BusinessEntityID % 2) WHEN 1 THEN 1 ELSE 0 END AS [Status]
FROM AdventureWorks2012.Person.EmailAddress AS ea;
GO

--Now, keep running the SP and observe the values logged in dbo.LogTable
USE tempdb;
GO
EXEC dbo.proc_SendEmails @batchSize = 5;
GO

Open in new window

0
 
LVL 26

Expert Comment

by:Shaun Kline
ID: 41770311
If status is an integer data type, you could set the status to a value other than 0 or 1 to indicate the record is being processed when you retrieve the records. After the email is sent you could then set the status to 1.
0
 
LVL 69

Accepted Solution

by:
Scott Pletcher earned 300 total points
ID: 41770351
Don't "corrupt" the existing status column.  It confuses the meaning of that column and is less clear overall.  Instead, add another column(s) to indicate: "row has been pulled for processing but has not been processed yet".

This could be a simple bit flag and/or a datetime of when it was pulled (NULL = not pulled yet) and/or an identifier for what pulled it, etc..

Then code the pull query to skip any rows that are marked as already pulled.

You could consider adding a filtered index on those pull-marking columns if the base table is very large, to cut down the cost of scanning for non-pulled rows.
1
 
LVL 34

Assisted Solution

by:ste5an
ste5an earned 200 total points
ID: 41770391
How do I make sure the replica app or second task doesn't pull the same emails into it's own 10,000 addresses recordset?

Round-robin:

DECLARE @Sample TABLE
    (
      UserID INT ,
      EmailAddress VARCHAR(255)
    );

INSERT  INTO @Sample
VALUES  ( 1, '' ),
        ( 2, '' ),
        ( 3, '' ),
        ( 4, '' ),
        ( 5, '' ),
        ( 7, '' );


DECLARE @NumberOfApps INT = 3;

SELECT  * ,
        S.UserID % @NumberOfApps AS AppID
FROM    @Sample S;

Open in new window

1
 

Author Closing Comment

by:lrbrister
ID: 41770399
A combination of these look like the best approach and easiest to easily "follow"
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 41770466
Fyi, you can mark the rows and pull data from them in the same statement:


DECLARE @unique_id_for_this_process_batch uniqueidentifier
SET @unique_id_for_this_process_batch = NEWID()

/* Mark rows as in-process while pulling details from them so that they can actually be processed. */
UPDATE TOP (10000) tn
SET process_start_date = GETDATE() /*, process_id = @unique_id_for_this_process_batch */
OUTPUT @unique_id_for_this_process_batch,
     INSERTED.process_start_date, INSERTED.id, ...<other_columns_needed_for_processing>
    INTO dbo.inprocess_data ( <column_names> )
FROM dbo.table_name tn
WHERE
    tn.process_start_date IS NULL

<code to process rows in "dbo.inprocess_data" table for the current unique id>
1

Featured Post

Free eBook: Backup on AWS

Everything you need to know about backup and disaster recovery with AWS, for FREE!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A long time ago (May 2011), I have written an article showing you how to create a DLL using Visual Studio 2005 to be hosted in SQL Server 2005. That was valid at that time and it is still valid if you are still using these versions. You can still re…
Real-time is more about the business, not the technology. In day-to-day life, to make real-time decisions like buying or investing, business needs the latest information(e.g. Gold Rate/Stock Rate). Unlike traditional days, you need not wait for a fe…
Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function

728 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question