Solved

Insert using Union All - Fast database import

Posted on 2011-09-09
8
298 Views
Last Modified: 2012-05-12
I'm trying to make fast record inserts into a MSSQL database. The input is object oriented xml files (750.000 lines), which are parsed into different sql statements.

This week I have been testing inserts using Union all, but the performance is disappointing (10.000 records in 5 to 7 seconds). Please look at my sample, maybe you can tell me what to do.


/****** Object:  Table [dbo].[Test]    Script Date: 09/09/2011 12:15:52 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[Test](
	[Id] [int] IDENTITY(1,1) NOT NULL,
	[Name] [nchar](100) NOT NULL,
	[Address] [nchar](100) NOT NULL,
	[Zip] [int] NOT NULL,
	[City] [nchar](100) NOT NULL,
	[Country] [nchar](100) NOT NULL
) ON [PRIMARY]

GO


------------------------------------------------------------------------------
static void TestIt()
{
    int cnt=0;
    StringBuilder sql;
    Stopwatch sw = new Stopwatch();
            

    for (int y = 0; y < 5; y++)
    {
        sw.Reset();
        sw.Start();

        SqlConnection conn = new SqlConnection("MyConnectionString");
        conn.Open();

        try
        {
            for (int i = 0; i < 100; i++)
            {
                sql = new StringBuilder();
                sql.Append("INSERT INTO Test (Name, Address, Zip, City, Country) SELECT \r\n");

                for (int j = 0; j < 100; j++)
                {
                    sql.AppendFormat("'{0}', '{1}', {2}, '{3}', '{4}' UNION ALL SELECT \r\n",
                        "My name is unknown", "Somewhere near the street", cnt, "The big city ABCDEFFGHIJKLMN", "CountryCountryCountryCountryCountry");
                            
                    cnt++;
                }

                // Subtract the last "union all select"
                sql.Length -= 19;

                // Insert this batch
                SqlCommand cmd = new SqlCommand(sql.ToString(), conn);
                cmd.ExecuteNonQuery();
                                    }
        }
        finally
        {
            conn.Close();
        }
        sw.Stop();
        Debug.WriteLine(sw.ElapsedMilliseconds);
    }
}

Open in new window

0
Comment
Question by:VPuser
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
  • 2
  • +1
8 Comments
 
LVL 50

Expert Comment

by:Lowfatspread
ID: 36509791
yes performance is disappointing...

there was a similar question about 9-12 months ago on EE which tried the full range of insert statement
possibilities but couldn't achieve an "adequate" insert rate/time...

unfortunately i cant find it at present...

not that i'm convinced its much better for you , have you considered just passing the xml document to a stored procedure...

or using bulk insert / BCP ?
0
 
LVL 3

Assisted Solution

by:Srm74
Srm74 earned 166 total points
ID: 36509875
In my opinion bulk insert is by far the fastest.
But you have to be aware of the size of you bulk...
The SQL server has a buffer of app. 2 MB (configureable).
If you exceed this buffer performance will decrease alot.

So if you cant do it in one bulk insert, do a series.. Alot faster than single insert...

 
0
 

Author Comment

by:VPuser
ID: 36509910
I'm not doing single inserts, but at batch of 100 records, and yes I found out that if the "batch" is too large the performance decreases, but I didn't know that i was 2MB.

I think that the xml structure is to complex to parse it to a spc, there is build a lot off business intelligence into the parser.

How does bulk insert work?

 
0
Best Practices: Disaster Recovery Testing

Besides backup, any IT division should have a disaster recovery plan. You will find a few tips below relating to the development of such a plan and to what issues one should pay special attention in the course of backup planning.

 
LVL 3

Expert Comment

by:Srm74
ID: 36509988
BULK INSERT runs in-process with the database engine of SQL Server and thus avoids passing data through the network layer of the Client API - this makes it faster than BCP and DTS / SSIS.

Also, with BULK INSERT, you can specify the ORDER BY of the data, and if this is the same as the PK of the table, then the locking occurs at a PAGE level. Writes to the transaction logs happen at a page level rather than a row level as well.

In the case of regular INSERT, the locking and the Transaction log writes are at a row level. That makes BULK INSERT faster than an INSERT statement.
0
 
LVL 50

Accepted Solution

by:
Lowfatspread earned 167 total points
ID: 36510076
still cant find the EE question i was refering to

however this maybe useful to you

http://www.ikriv.com/en/prog/info/SqlInsert.html
0
 
LVL 1

Assisted Solution

by:ananken
ananken earned 167 total points
ID: 36510291
In order to perform a bulk insert, what you have to do is:
Parse the XML you have into a CSV (comma-delimited) format. In case multiple tables are involved, you need multiple CSV's.
Prepare the tables to bulk insert the CSV data into. These can either be concrete DB tables, or Temp tables.
Once everything is ready (files and tables), run the code below:
          BULK INSERT <table name> FROM '<local SQL file path' WITH (FIELDTERMINATOR=',',ROWTERMINATOR ='\n')
For example::

BULK INSERT #RawData FROM 'C:\Loading\File1.csv' WITH (FIELDTERMINATOR=',',ROWTERMINATOR ='\n') 

Open in new window


Enjoy,
ananken.
0
 

Author Comment

by:VPuser
ID: 36517215
Thanks a lot.

SQLBulkCopy performs a lot better (10.000 rows in 0.3sec and 100.000 rows in 3.5sec), and I like the code structure in it.

The Bulk import is a bit slower (100.000 rows in 4.5sec), and the structure is not as nice as BulkCopy, so I will go for the SQLBulkCopy.

/Morten
0
 

Author Closing Comment

by:VPuser
ID: 36517224
Thanks guys!
0

Featured Post

Online Training Solution

Drastically shorten your training time with WalkMe's advanced online training solution that Guides your trainees to action. Forget about retraining and skyrocket knowledge retention rates.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

It was really hard time for me to get the understanding of Delegates in C#. I went through many websites and articles but I found them very clumsy. After going through those sites, I noted down the points in a easy way so here I am sharing that unde…
Real-time is more about the business, not the technology. In day-to-day life, to make real-time decisions like buying or investing, business needs the latest information(e.g. Gold Rate/Stock Rate). Unlike traditional days, you need not wait for a fe…
In a recent question (https://www.experts-exchange.com/questions/29004105/Run-AutoHotkey-script-directly-from-Notepad.html) here at Experts Exchange, a member asked how to run an AutoHotkey script (.AHK) directly from Notepad++ (aka NPP). This video…

752 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question