David Bach
asked on
More Efficient Inserting Of Rows For SQL Server
Greetings:
I have a database with a table called "RawTestData", which contains just short of 1,000,000 rows.
I need to parse each row into 3 columns and store the result in a table (in the same database) called "RawTestDataFormatted".
A C# method is used to insert each row.
The rows are committed at a rate of 23.18 rows per second. At this rate it will take 11.8 hours to complete.
I'm looking for a more efficient way of committing the rows. My C# method is as follows:
Thank you,
David Bach
protected bool SaveRawTestDataFormatted(string strStudentName,
                                        int intTestNumber,
                                        DateTime dtSection,
                                        ref long bigRawTestDataId)
{
    cmdParse.Parameters.Clear();

    try
    {
        cnnParse.ConnectionString = ConfigurationManager.ConnectionStrings["ParseConnectString"].ConnectionString;

        cmdParse.Parameters.Add("@strStudentName", SqlDbType.NVarChar, 50).Value = strStudentName;
        cmdParse.Parameters.Add("@intTestNumber", SqlDbType.Int).Value = intTestNumber;
        cmdParse.Parameters.Add("@dtSection", SqlDbType.DateTime).Value = dtSection;
        cmdParse.Parameters.Add("@bigRawTestDataId", SqlDbType.BigInt).Direction = ParameterDirection.Output;

        cmdParse.Connection = cnnParse;
        cmdParse.CommandText = "up_i_RawTestDataFormatted";
        cmdParse.CommandType = CommandType.StoredProcedure;

        cnnParse.Open();
        cmdParse.ExecuteNonQuery();
        cnnParse.Close();

        bigRawTestDataId = Convert.ToInt64(cmdParse.Parameters["@bigRawTestDataId"].Value);
    }
    catch (Exception ex)
    {
        Response.Write("Error: " + ex.Message);
        return false;
    }

    return true;
}
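As a reference point for where the time goes: most of the per-row cost in a loop like the one above typically comes from opening and closing the connection and committing an implicit transaction once per row. A minimal sketch of the same stored-procedure call, reusing one open connection and committing in batches, might look like the following. This is an illustration, not the original code: the RawRow type, the rows collection, and the 10,000-row batch size are hypothetical stand-ins.

protected void SaveAllRawTestDataFormatted(IEnumerable<RawRow> rows)
{
    string connStr = ConfigurationManager.ConnectionStrings["ParseConnectString"].ConnectionString;

    using (SqlConnection cnn = new SqlConnection(connStr))
    {
        cnn.Open();  // open once, not once per row

        using (SqlCommand cmd = new SqlCommand("up_i_RawTestDataFormatted", cnn))
        {
            cmd.CommandType = CommandType.StoredProcedure;

            // Create the parameters once; only the values change per row.
            SqlParameter pName = cmd.Parameters.Add("@strStudentName", SqlDbType.NVarChar, 50);
            SqlParameter pTest = cmd.Parameters.Add("@intTestNumber", SqlDbType.Int);
            SqlParameter pDate = cmd.Parameters.Add("@dtSection", SqlDbType.DateTime);
            SqlParameter pId   = cmd.Parameters.Add("@bigRawTestDataId", SqlDbType.BigInt);
            pId.Direction = ParameterDirection.Output;

            SqlTransaction tx = cnn.BeginTransaction();
            cmd.Transaction = tx;
            int inBatch = 0;

            foreach (RawRow row in rows)  // RawRow/rows are hypothetical
            {
                pName.Value = row.StudentName;
                pTest.Value = row.TestNumber;
                pDate.Value = row.Section;
                cmd.ExecuteNonQuery();

                if (++inBatch == 10000)  // commit every 10,000 rows
                {
                    tx.Commit();
                    tx = cnn.BeginTransaction();
                    cmd.Transaction = tx;
                    inBatch = 0;
                }
            }

            tx.Commit();  // commit the final partial batch
        }
    }
}

The two changes — one connection open for the whole run, and one explicit transaction per batch instead of an implicit one per row — usually account for most of the speedup available while keeping the row-at-a-time stored procedure.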
The T-SQL is as follows:
USE [DB_117672_training]
GO
/****** Object: StoredProcedure [dbo].[up_i_RawTestDataFormatted] Script Date: 6/14/2019 1:40:30 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:
-- Create date: 6/13/2019
-- Description: Insert raw data formatted row
-- =============================================
ALTER PROCEDURE [dbo].[up_i_RawTestDataFormatted]
    -- Add the parameters for the stored procedure here
    @strStudentName nvarchar(50),
    @intTestNumber int,
    @dtSection datetime,
    @bigRawTestDataId bigint OUT
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Insert statements for procedure here
    INSERT RawTestDataFormatted
        (strStudentName,
         intTestNumber,
         dtSection)
    VALUES (@strStudentName,
            @intTestNumber,
            @dtSection)

    SET @bigRawTestDataId = SCOPE_IDENTITY()
END
Any assistance you might provide would be most appreciated.
Thank you,
David Bach
You probably want BULK INSERT for this task.
Another open question: why C#? Why not do it in T-SQL?
If it must be in C#, then you should look into parallel operations. Maybe a simple Parallel.For can do it, assuming that you have a good way to partition your data.
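To sketch what the T-SQL route could look like: if the parsing logic can be expressed in SQL, the whole job collapses into a single set-based statement that the server executes in one pass, with no per-row round trips at all. The layout of RawTestData isn't shown in the question, so the strRawLine column and the comma-delimited "name,testnumber,datetime" format below are assumptions purely for illustration:

-- Hypothetical: assumes RawTestData has a single strRawLine column
-- holding comma-separated "name,testnumber,datetime" values.
INSERT INTO dbo.RawTestDataFormatted (strStudentName, intTestNumber, dtSection)
SELECT
    LEFT(strRawLine, p1.pos - 1),                                     -- text before 1st comma
    CAST(SUBSTRING(strRawLine, p1.pos + 1, p2.pos - p1.pos - 1) AS int),
    CAST(SUBSTRING(strRawLine, p2.pos + 1, 4000) AS datetime)         -- text after 2nd comma
FROM dbo.RawTestData
CROSS APPLY (SELECT CHARINDEX(',', strRawLine) AS pos) AS p1          -- 1st comma position
CROSS APPLY (SELECT CHARINDEX(',', strRawLine, p1.pos + 1) AS pos) AS p2;  -- 2nd comma

A single INSERT ... SELECT like this would normally finish a million rows in seconds rather than hours, because the parsing and inserting happen entirely inside the engine.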
ASKER
Greetings kaufmed and ste5an;
kaufmed - I was hoping to avoid creating a flat file and loading it back into SQL using BULK INSERT.
ste5an - The solution does not require that it be done in C#. If you have a suggestion using T-SQL, I am most interested. I can see I confused the description of what I'm after; my apologies. I'm using C# to invoke the T-SQL INSERT.
I look forward to your responses.
Much thanks,
David Bach
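On the flat-file concern: the BULK INSERT statement is not the only bulk path. ADO.NET's SqlBulkCopy class streams rows to the server directly from memory, with no intermediate file. A sketch, assuming the parsed rows have been staged in a DataTable (the parsed variable and connStr are hypothetical names) whose columns match the target table:

// Sketch: bulk-load parsed rows without a flat file via SqlBulkCopy.
// 'parsed' is a hypothetical DataTable whose columns line up with
// dbo.RawTestDataFormatted; 'connStr' is the connection string.
using (SqlConnection cnn = new SqlConnection(connStr))
{
    cnn.Open();
    using (SqlBulkCopy bulk = new SqlBulkCopy(cnn))
    {
        bulk.DestinationTableName = "dbo.RawTestDataFormatted";
        bulk.BatchSize = 10000;       // send and commit in 10,000-row batches
        bulk.BulkCopyTimeout = 0;     // no timeout for a long-running load
        bulk.WriteToServer(parsed);
    }
}

Note that SqlBulkCopy uses the same server-side bulk-load mechanism as BULK INSERT, so whether a hosting provider permits it is a separate question.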
Please post the table DDL of your table including existing indices. Also add some sample data. And explain your extraction/splitting logic.
ASKER CERTIFIED SOLUTION
Where does the source data come from?
ASKER
John - if I understand you correctly, I would generate an INSERT statement for each row, then send the 132,930 INSERT statements for Table-A along with INSERT statements for the 4M to 5M rows for Table-B, which has a one-to-many relationship with Table-A.
Is this correct, John?
kaufmed - The source is Table-C which contains raw log data which I then parse so the data is formatted.
ste5an - the DDL is as follows for the Table-A:
/****** Object: Table [dbo].[RawTestDataFormatted] Script Date: 6/16/2019 2:40:22 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[RawTestDataFormatted](
    [bigRawTestDataId] [bigint] IDENTITY(1,1) NOT NULL,
    [strStudentName] [nvarchar](50) NOT NULL,
    [intTestNumber] [int] NOT NULL,
    [dtSection] [datetime] NOT NULL,
    CONSTRAINT [PK_RawTestDataFormatted] PRIMARY KEY CLUSTERED
    (
        [bigRawTestDataId] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
The DDL for Table-B is:
/****** Object: Table [dbo].[RawTestDataFormattedAnswer] Script Date: 6/16/2019 2:43:32 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[RawTestDataFormattedAnswer](
    [fkbigRawTestDataId] [bigint] NOT NULL,
    [intQuestionNumber] [int] NOT NULL,
    [chrAnswer] [char](1) NOT NULL,
    CONSTRAINT [PK_RawTestDateFormattedAnswers] PRIMARY KEY CLUSTERED
    (
        [fkbigRawTestDataId] ASC,
        [intQuestionNumber] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[RawTestDataFormattedAnswer] WITH CHECK ADD CONSTRAINT [FK_RawTestDataFormattedAnswer_RawTestDataFormatted] FOREIGN KEY([fkbigRawTestDataId])
REFERENCES [dbo].[RawTestDataFormatted] ([bigRawTestDataId])
ON DELETE CASCADE
GO
ALTER TABLE [dbo].[RawTestDataFormattedAnswer] CHECK CONSTRAINT [FK_RawTestDataFormattedAnswer_RawTestDataFormatted]
GO
I learned from my client's hosting company that they do not allow the BULK INSERT T-SQL command to be issued. I'm considering generating INSERT statements for each row for both tables and using SSMS to process them. I'm hoping this might be faster than opening and closing the database connection for each row processed.
Update - my client's hosting company got back to me with the following answer:
"I'm afraid we don't support any type of BULK INSERT operations because of the load it puts on the server. The only suggestion I can offer is that you load your data locally, then backup your database, upload it via FTP, and restore it using our Control Panel."
This is an idea I may try as well.
Please let me know if you have additional input for me.
Much thanks,
David Bach
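Regarding the plan above to generate INSERT statements and run them through SSMS: a single INSERT statement can carry up to 1,000 row value tuples in its VALUES clause (the documented limit for a table value constructor), which cuts the statement count by three orders of magnitude and amortizes the per-statement overhead. A sketch of what one generated batch could look like; the sample values are invented for illustration only:

-- Hypothetical generated batch: up to 1,000 rows per INSERT statement.
INSERT INTO dbo.RawTestDataFormatted (strStudentName, intTestNumber, dtSection)
VALUES (N'Alice Smith', 1, '2019-06-01T09:00:00'),
       (N'Bob Jones',   1, '2019-06-01T09:05:00'),
       (N'Carol White', 2, '2019-06-02T10:00:00');
       -- ... up to 1,000 tuples, then start the next INSERT

Wrapping each such batch in an explicit BEGIN TRANSACTION / COMMIT would also keep the log from flushing once per statement.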
ASKER
Greetings kaufmed, ste5an and John:
I want to thank all of you for your diligence, time and patience with me.
I was able to use John's suggestion to execute roughly 25,000 to 30,000 INSERT statements at a time to fill Table-A. I also abandoned the idea of 2 tables and deleted Table-B. After 6 passes of INSERT statements I was able to fill Table-A with the data required.
Thank you again,
David Bach
Glad it worked to your liking. I am on vacation and see messages occasionally.