[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 192
  • Last Modified:

Speed up Large(ish) Inserts

Hi.

In one of our databases we have an orders table (Order) that has a unique ID and other order specific details. We then have a link table that links the main order to products.

On any particular order there can be up to 3000 products. The tables are part of transactional replication.

In the client application a string of product ID's is built and passed to a stored procedure, along with the orderID. The stored procedure then breaks up the string and does an insert for each one in turn.

Tables are:

Order:
OrderID
...
...

Product:
ProductID
...
...

OrderProduct_Link:
ptOrder
ptProduct
Charged


This is slow as hell ! Can anyone help me speed it up? Any advice/help is greatly appreciated.

danny
0
LFMSupport
Asked:
LFMSupport
  • 7
  • 5
  • 4
  • +2
2 Solutions
 
HilaireCommented:
could you post the code of your stored procedure ?
0
 
LFMSupportAuthor Commented:
As requested. Table names are slightly different:

CREATE  PROCEDURE "sp_UpdateOrderFormsArray"
(
@Array Text,
@Separator char(1),
@ptCompID numeric,
@ptMaintID numeric,
@ptOrderID numeric,
@IsCredit bit
)
AS

SET NOCOUNT ON
DECLARE @Separator_Position int  -- This is used to locate each separator character
DECLARE @Array_Value varchar(10)  -- this holds each array value as it is returned
DECLARE @CurMaintDate datetime
DECLARE @PrevMaintDate dateTime
DECLARE @Chargeable numeric
DECLARE @IsRoyalty numeric

DECLARE @TextLen int      -- Length of @Array
DECLARE @Start int            -- Current position in @Array
DECLARE @LookAhead int      -- Distance to look for next separator in @Array
DECLARE @Pattern char(3)

SET @LookAhead = 10

SET @Pattern = '%' + @Separator + '%'    -- Work out the pattern ONCE only
SET @TextLen = DATALENGTH(@array)
SET @Start       = 1

WHILE @Start < @TextLen
BEGIN
      SET @Array_Value = SUBSTRING( @Array, @Start, @LookAhead )             -- get a little string which starts with the token
      SET @Separator_Position = PATINDEX( @Pattern, @Array_Value )       -- just search the little string
      IF @Separator_Position = 0
            SET @Start = @Start + LEN( @Array_Value)
      ELSE
            BEGIN
                  SET @Array_Value = LEFT( @Array_Value, @Separator_Position - 1 )
                  SET @Start = @Start + @Separator_Position
            END
            SET @CurMaintDate = (SELECT CurMaintDate FROM LFMCompMaint WHERE CompMaintID = @ptMaintID)
            SET @PrevMaintDate = (SELECT PrevMaintDate FROM LFMCompMaint WHERE CompMaintID = @ptMaintID)
      
            SET @IsRoyalty = (SELECT DISTINCT LFMFormLicensors_Link.ptFormID
                          FROM LFMFormLicensors_Link INNER JOIN LFMCompFormSupplied_Link ON
                          LFMFormLicensors_Link.ptFormID = LFMCompFormSupplied_Link.ptFormID INNER JOIN
                          LFMCompMaint ON LFMCompFormSupplied_Link.ptCompID = LFMCompMaint.ptCompID
                          WHERE (LFMCompFormSupplied_Link.DateSupplied BETWEEN CONVERT(DATETIME, @PrevMaintDate, 102) AND
                          CONVERT(DATETIME, @CurMaintDate, 102)) AND (LFMFormLicensors_Link.ptFormID = @array_value))

            SET @Chargeable = (SELECT DISTINCT ptFormID FROM LFMCompForm_Link
                                WHERE (ptFormID = @array_value)
                              AND (ptCompID = @ptCompID)
                              AND (ptMaintID = @ptMaintID)
                              AND (Subscribe = 1))

            IF @Chargeable IS NULL
            BEGIN
                  IF @IsCredit = 0
                  BEGIN
                        SET @Chargeable = 1
                  END
                  ELSE
                  BEGIN
                        SET @Chargeable = 0
                  END
            END
            ELSE
            BEGIN
                  IF @IsCredit = 0
                  BEGIN
                        SET @Chargeable = 0
                  END
                  ELSE
                  BEGIN
                        SET @Chargeable = 1
                  END
            END

            --Check if royalty should be charged or not.
            IF NOT @IsRoyalty IS NULL
            BEGIN
                  SET @Chargeable = 1
            END
      INSERT INTO LFMOrderForm_Link(ptOrderID, ptFormID, Chargeable) VALUES (@ptOrderID, @Array_Value, @Chargeable)
END

SET NOCOUNT OFF
0
 
anthonywjones66Commented:
Here are some simple initial tweaks:-

This:-

         SET @CurMaintDate = (SELECT CurMaintDate FROM LFMCompMaint WHERE CompMaintID = @ptMaintID)
         SET @PrevMaintDate = (SELECT PrevMaintDate FROM LFMCompMaint WHERE CompMaintID = @ptMaintID)


can be simplified to:-

      SELECT @CurMaintDate =  CurMaintDate, @PrevMaintDate = PrevMaintDate FROM LFMCompMaint WHERE CompMaintID = @ptMaintID

 but it is the two SELECT DISTINCTS which is hurting the most.


 Replace this line:-

         SET @Chargeable = (SELECT DISTINCT ptFormID FROM LFMCompForm_Link
 
With:-
        IF NOT @IsRoyalty IS NULL
         SET @Chargeable = (SELECT DISTINCT ptFormID FROM LFMCompForm_Link

there is no point running the second select distinct if the first doesn't return an null since the logic further down will force @chargeable to be 1 anyway.

You might also consider change DISTINCT prFormID to SUM(1) in both of these.  I think that may be a little quicker but you ought test with and with out this tweak.


I would recommend you ensure you have SP3 of SQL server installed then become familar with the OPENXML function.

The best way to pass an array of values in to SQL procedure is to pass it as XML.  For example:-

<forms><form ID="1" /><form ID="2" /><form ID="3" /></forms>

I'll get back to how that's done a later if I have time and you are interested.

Anthony.
0
NEW Veeam Agent for Microsoft Windows

Backup and recover physical and cloud-based servers and workstations, as well as endpoint devices that belong to remote users. Avoid downtime and data loss quickly and easily for Windows-based physical or public cloud-based workloads!

 
LFMSupportAuthor Commented:
Hi Anthony,

Thanks very much for the tips. I'll implement them and see how it goes.

Does anyone else have any input. I mean, is this really the best way to go about doing something like this or is there another way?

Danny
0
 
nmcdermaidCommented:
I'm interested in why you are importing concatenated data then breaking it back up again in the database? Is this the format it arrives in?

It would be faster if you just had a standard input file (ie CSV or something)
0
 
LFMSupportAuthor Commented:
The comma seperated list of ID's is simply the way it arrives from the VB application that uses the database.

Danny
0
 
mcmonapCommented:
Hi LFMSupport,

A query:

SELECT DISTINCT LFMFormLicensors_Link.ptFormID
FROM LFMFormLicensors_Link INNER JOIN LFMCompFormSupplied_Link ON
LFMFormLicensors_Link.ptFormID = LFMCompFormSupplied_Link.ptFormID INNER JOIN
LFMCompMaint ON LFMCompFormSupplied_Link.ptCompID = LFMCompMaint.ptCompID
WHERE (LFMCompFormSupplied_Link.DateSupplied BETWEEN CONVERT(DATETIME, @PrevMaintDate, 102) AND
CONVERT(DATETIME, @CurMaintDate, 102)) AND (LFMFormLicensors_Link.ptFormID = @array_value))

In the above query you select a distinct list of LFMFormLicensors_Link.ptFormID based on three tables two dates and LFMFormLicensors_Link.ptFormID.  Am I going mad?  The field you are selecting is in the criteria therefore if it exists @IsRoyalty will always equal @array_value.  If this is the case I think the code below would be more efficient:

IF EXISTS(SELECT  1
      FROM      LFMCompFormSupplied_Link cf
            JOIN LFMCompMaint cm ON cf.ptCompID = cm.ptCompID
      WHERE      cf.DateSupplied BETWEEN CONVERT(DATETIME, @PrevMaintDate, 102) AND CONVERT(DATETIME, @CurMaintDate, 102)
            AND cf.ptFormID = @array_value)
      SET @IsRoyalty = @array_value

I think...?
0
 
anthonywjones66Commented:
Here is another tweak you can apply:-

Add this declare above the loop:-

DECLARE @tbl TABLE (OrderID numeric, FormID numeric, Chargeable bit)

Then at then end of the loop replace this:-

    INSERT INTO LFMOrderForm_Link(ptOrderID, ptFormID, Chargeable) VALUES (@ptOrderID, @Array_Value, @Chargeable)
END

with:-

    INSERT INTO @tbl VALUES (@ptOrderID, @Array_Value, @Chargeable)
END

INSERT INTO LFMOrderForm_Link(ptOrderID, ptFormID, Chargeable) SELECT * FROM @tbl

This will cause all inserts to the table to occur in a single implicit transaction.  Ought to be a little faster then 3000+ individual inserts.

Anthony.

0
 
LFMSupportAuthor Commented:
Anthony, again thanks for the tip.

mcmonap: I don't think the replacement query you posted would work. The @IsRoyalty variable is to determine if a record is what we deem Royalty forms. These are kept in the LFMFormLicensors_Link table.

Danny
0
 
anthonywjones66Commented:
Danny,

I think mcmonap's query will work with a little tweaking. The only thing @IsRoyalty needs to be is either null or not null.  It will be not null if the query returns any records but it will need to be expilicitly set to null in an ELSE if no records exist.  The  EXISTS may actually be faster depending on how clever SQL server is.  If when the query is running it knows that it's being run in the context of an EXISTS then as soon as it realises that there will be some output it can stop and simply return true.  Of course SQL server might not for reasons unknown to us be able to do that but the EXISTS technique certainly won't be any slower and makes the intent of your code more clear as well.

You can then change the types of @IsRoyalty and @IsChargeable to bit instead of numeric.

In fact it probably makes sense to post a combination of all the techniques so far.  This is what I think your procedure ought to look like:-

CREATE  PROCEDURE "sp_UpdateOrderFormsArray"
(
@Array Text,
@Separator char(1),
@ptCompID numeric,
@ptMaintID numeric,
@ptOrderID numeric,
@IsCredit bit
)
AS

SET NOCOUNT ON
DECLARE @Separator_Position int  -- This is used to locate each separator character
DECLARE @Array_Value varchar(10)  -- this holds each array value as it is returned
DECLARE @CurMaintDate datetime
DECLARE @PrevMaintDate dateTime
DECLARE @Chargeable bit

DECLARE @TextLen int     -- Length of @Array
DECLARE @Start int          -- Current position in @Array
DECLARE @LookAhead int     -- Distance to look for next separator in @Array
DECLARE @Pattern char(3)

DECLARE @tbl TABLE (OrderID numeric, FormID numeric, Chargeable bit)

SET @LookAhead = 10

SET @Pattern = '%' + @Separator + '%'    -- Work out the pattern ONCE only
SET @TextLen = DATALENGTH(@array)
SET @Start      = 1

WHILE @Start < @TextLen BEGIN

     SET @Array_Value = SUBSTRING( @Array, @Start, @LookAhead )           -- get a little string which starts with the token
     SET @Separator_Position = PATINDEX( @Pattern, @Array_Value )      -- just search the little string
     IF @Separator_Position = 0
          SET @Start = @Start + LEN( @Array_Value)
     ELSE BEGIN
          SET @Array_Value = LEFT( @Array_Value, @Separator_Position - 1 )
          SET @Start = @Start + @Separator_Position
     END --IF
                
        IF EXISTS (SELECT 1 FROM LFMCompForm_Link
                            WHERE (ptFormID = @array_value)
                         AND (ptCompID = @ptCompID)
                         AND (ptMaintID = @ptMaintID)
                         AND (Subscribe = 1))
            SET @Chargeable = ~ @IsCredit
      ELSE
            SET @Chargeable = @IsCredit

--      END IF

      If @Chargeable = 0 BEGIN
      
            SELECT @CurMaintDate = CurMaintDate, @PrevMaintDate = PrevMaintDate FROM LFMCompMaint WHERE CompMaintID = @ptMaintID

               IF EXISTS (SELECT 1
                      FROM LFMFormLicensors_Link INNER JOIN LFMCompFormSupplied_Link ON
                      LFMFormLicensors_Link.ptFormID = LFMCompFormSupplied_Link.ptFormID INNER JOIN
                      LFMCompMaint ON LFMCompFormSupplied_Link.ptCompID = LFMCompMaint.ptCompID
                      WHERE (LFMCompFormSupplied_Link.DateSupplied BETWEEN CONVERT(DATETIME, @PrevMaintDate, 102) AND
                      CONVERT(DATETIME, @CurMaintDate, 102)) AND (LFMFormLicensors_Link.ptFormID = @array_value)))
                  SET @Chargeable = 1

      END --IF

   INSERT INTO @tbl VALUES (@ptOrderID, @Array_Value, @Chargeable)
END --WHILE

INSERT INTO LFMOrderForm_Link(ptOrderID, ptFormID, Chargeable) SELECT * FROM @tbl

SET NOCOUNT OFF


To break it down.

@IsRoyalty has been factored out altogether.  @Chargeable is now a bit type.

The new @tbl is declared.

I've switched the order in which the original SELECT DISTINCTS are run and changed them to use the IF EXISTS technique.

You had a nested IF structure which boiled down to XOR function.  I've merged that into the IF EXISTS.  The ~ operator is a bitwise NOT.  You might need to start at that for a little while to confirm it does what your original did.

Now if the result of the above is chargeable the other two SELECT queries are unneccessary and are skipped.  On of them is the more complex and possibly slower than all the others.  Since @IsRoyalty at this point would simply be assigned to @Chargeable @IsRoyalty is eliminated.

Finally the loop simply inserts to a local in memory table.

Once the loop is finished all the records are written to the table in single operation.

Anthony.

BTW are you sure you want to be using numeric for ID field types? Int would seem to be a more appropriate data type for those fields, just a thought.
0
 
LFMSupportAuthor Commented:
Anthony, wow!

I'm gonna have to go away and test that for a bit and then come back post the results.

I have a couple of queries though. Do you think it is going to be significantly quicker inserting in to an in memory table and then in to the "real" table as opposed to inserting in to the "real" table directly?

Also, what are the advantages of INT over NUMERIC?

Many thanks,
Danny
0
 
anthonywjones66Commented:
When inserting to an in memory table the server doesn't need to take any locking precautions no other connection will ever see the table the inserts will not need to be logged since in memory tables are expected to be volatile.  On the other hand inserting to a real table requires that part of the table receives a lock.  Also the inserts need to be properly logged in case the server fails before your SP has completed.

These 'Real' operations are much quicker when performed in one insert statement and are treated as a single transaction.


INTs can store over 4 Billion different integers and therefore more than adequate as an ID for most tables they are half the size of NUMERIC.

NUMERICS can store a great range of values with greater precision and with fractional components.  However they are larger you would only use them if you needed to store large or non-integer values with a specifed level precision.

Anthony.
0
 
mcmonapCommented:
In addition table variables require no log writes and, if you have enough memory, no disk writes at all.
0
 
LFMSupportAuthor Commented:
I've not had a chance to test the amended SP - will be able to do this tomorrow now.

On the INT/NUMERIC front - is it an easy task to change the field types or will this cause all kinds of problems with relationships et al?

Thanks for the help so far everyone.

Danny
0
 
mcmonapCommented:
Hi LFMSupport,

It may or may not cause issues with relationships, this will depend on whether the fields you wich to change are directly involved in the constraint.  If they are it almost certainly will, I'm guessing you also have a product that sits on top of the db (perhaps one that produces legal forms?), this might also be affected by the change of field types.

Also as I posted earlier this query:
SELECT      1
FROM      LFMFormLicensors_Link
      INNER JOIN LFMCompFormSupplied_Link ON LFMFormLicensors_Link.ptFormID = LFMCompFormSupplied_Link.ptFormID
      INNER JOIN LFMCompMaint ON LFMCompFormSupplied_Link.ptCompID = LFMCompMaint.ptCompID
WHERE      (LFMCompFormSupplied_Link.DateSupplied BETWEEN CONVERT(DATETIME, @PrevMaintDate, 102) AND CONVERT(DATETIME, @CurMaintDate, 102))
AND (LFMFormLicensors_Link.ptFormID = @array_value)))

Can be optimised by removing the LFMFormLicensors_Link table, since you are checking only for records where the LFMFormLicensors_Link.ptFormID = @array_value and LFMFormLicensors_Link.ptFormID = LFMCompFormSupplied_Link.ptFormID then your can instead do LFMCompFormSupplied_Link.ptFormID = @array_value and omit the LFMFormLicensors_Link table.  You can also use aliases as below which makes code (IMO) more readable:
SELECT      1
FROM      LFMCompFormSupplied_Link cf
      JOIN LFMCompMaint cm ON cf.ptCompID = cm.ptCompID
WHERE      (cf.DateSupplied BETWEEN CONVERT(DATETIME, @PrevMaintDate, 102) AND CONVERT(DATETIME, @CurMaintDate, 102))
      AND (cf.ptFormID = @array_value)))
0
 
anthonywjones66Commented:
Danny,

If this is a live system then you will probably be better off leaving the Numerics as they are.  However if you want to try it out then I would try it this way.

Make a backup and restore the DB to a test server or as a separate DB.
Then place the graph of related tables in a diagram and change all the Numerics to INTs grab.
Save the Script for making those changes and then cancel the diagram.

In Query analyser attempt to run the script against the test DB you might need to tweak the script a little (SQL server can get it's own scripting wrong) to get it to work.  You may also need to restore the back up to test as you go.  If you can get the script to run to your satisfaction make another backup of your live DB and run the script.  You can change your SP parameters at your leisure since Numerics will coerce to INTs without complaint.


mcmonap,

If the other two tables include records for values of ptFormID that are not found in LFMFormLicensors_Link then the optimisation will change the results.

Anthony.
0
 
mcmonapCommented:
Hi anthonywjones66,

That is very true, I hadn't considered that.
0
 
LFMSupportAuthor Commented:
Hi guys,

Sorry for the late response - I got massivly sidetracked!

I've just implemented the new stored procedure and run a few test orders and also a few real orders. So far the results are positive with a noticable speed increase.

Unless there are any objections I'm gonna split the points between anthony and mcmonap?

Danny
0

Featured Post

Fill in the form and get your FREE NFR key NOW!

Veeam is happy to provide a FREE NFR server license to certified engineers, trainers, and bloggers.  It allows for the non‑production use of Veeam Agent for Microsoft Windows. This license is valid for five workstations and two servers.

  • 7
  • 5
  • 4
  • +2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now