Solved

Bulk Insert using Xquery is very slow.

Posted on 2008-10-24
4
1,057 Views
Last Modified: 2013-11-10
I am using Xquery to fetch information from many xml files and bulk insert to the database. My query follows

insert into RKDFndStdFinVal_TempMain
(RepNo,PerEndDate,PerType,StmtDate,StmtType,COACode,[Value],LicFlag)
SELECT
ref.value ('(xRepNo/.)[1]', 'varchar(20)') as RepNo,
ref.value ('(xPerEndDate/.)[1]', 'varchar(20)') as PerEndDate,
ref.value ('(xPerType/.)[1]','varchar(15)') as PerType,
ref.value ('(xStmtDate/.)[1]', 'varchar(20)') as StmtDate,
ref.value ('(xStmtType/.)[1]', 'varchar(5)') as StmtType,
ref.value ('(xCOACode/.)[1]', 'varchar(10)') as COACode,
ref.value ('(xValue/.)[1]', 'float') as [Value],
ref.value ('(xLicFlag/.)[1]', 'varchar(5)') as LicFlag
FROM   TestXML
CROSS APPLY XMLDoc.nodes ('//GNF_ConvertedXML/RKDFndStdFinVal/row') R(ref) ;


I have around 60,000 xml files and the number of data row is huge. For 100 XML files itslef it will exceed 1 million +.  For 100 xml files its taking 1 hour for insertion. But i want all the insertion for 60000 files to be done in 15 hours.

Is there any way i can optimize the query or change any property in mssql server configuration to achive this result?
0
Comment
Question by:anand_dan27
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
4 Comments
 

Author Comment

by:anand_dan27
ID: 22801792
i have changed SQL Server database recovry model to Bulk_Logged. Still the same result.

ALTER DATABASE <name> SET RECOVERY BULK_LOGGED
 
0
 
LVL 17

Expert Comment

by:HoggZilla
ID: 22808193
XQuery is probably the best solution, have you tried opening multiple threads using SSIS Packages? You could use XML Source in SSIS but it might be slower than XQuery, not sure.
If you can divide the files into logical groups, process multiple groups simultaneously using synchronous threads. One SSIS Package can execute all of the processes. If you use XML Source, you could use the Union All component to contol the inserts to the same table.
0
 
LVL 17

Expert Comment

by:HoggZilla
ID: 22808207
I meant to add this link, as it might also be useful if you have not already seen it. Good luck.
http://blogs.msdn.com/mattm/archive/2007/12/11/using-xml-source.aspx
 
0
 
LVL 51

Accepted Solution

by:
Mark Wills earned 500 total points
ID: 22808780
Why not expand the xml into temp table and then run your queeries ? ie do the bulk insert, or openrowsets to insert into either a variable, table, whatever, and then use openxml to query from. Found that for large files, openxml can be a bit quicker than Xquery.

maybe you could post a small sample ?

e.g. (for you need to swap My_Views to RKDFndStdFinVal_TempMain - or your temp staging table as appropriate) ...


--views.xml contains : <NewDataSet><Views><SystemID>1</SystemID><ViewName>sam_view</ViewName></Views></NewDataSet>
 
CREATE TABLE My_VIEWS (SystemID varchar,ViewName  varchar(20))
 
DECLARE @xmlstring xml
DECLARE @idoc int;
 
SET @xmlstring = (select CONVERT(varchar(max), BulkColumn, 2) FROM  OPENROWSET(Bulk 'c:\ee\views.xml', SINGLE_BLOB) [rowsetresults])
 
EXEC sp_xml_preparedocument @idoc OUTPUT,@xmlstring;
--insert would go here
SELECT * FROM OPENXML (@idoc,'/NewDataSet/Views',2) with My_Views ;
EXEC sp_xml_removedocument @idoc 

Open in new window

0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
This article shows gives you an overview on SQL Server 2016 row level security. You will also get to know the usages of row-level-security and how it works
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function
Via a live example, show how to shrink a transaction log file down to a reasonable size.

726 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question