?
Solved

openXML with large XML file

Posted on 2005-04-15
9
Medium Priority
?
2,149 Views
Last Modified: 2008-11-05
Hi,

   I am working with openXML to get data from XML file and write to database. I am reading the file into varchar variable and then access the openXML method. However varchar maximum size is 8000.Hence if my file size is more than 8000 this method does not work.
Does anyone know of a method to use openXML with large files?
0
Comment
Question by:LTSave
9 Comments
 
LVL 6

Expert Comment

by:alextr2003fr
ID: 13794990
Maybe you should try to read your file into text instead of varchar.
0
 

Author Comment

by:LTSave
ID: 13796102
Text type cannot be declared in stored procedure. hence I cannot use Text.
0
 
LVL 26

Expert Comment

by:Hilaire
ID: 13805210
I have developped a custom stored procedure to read from a big XML file directly from the disk (any location visible from the server) or from a text field.
Both versions can read much more than 8000 characters.

Let me know which one fits your needs better and I'll post the code.

Hilaire
0
Configuration Guide and Best Practices

Read the guide to learn how to orchestrate Data ONTAP, create application-consistent backups and enable fast recovery from NetApp storage snapshots. Version 9.5 also contains performance and scalability enhancements to meet the needs of the largest enterprise environments.

 

Author Comment

by:LTSave
ID: 13806012
Hello Hilaire,

I would be interested in the stored procedure which reads from a big XML file on the disk.

Thanks
0
 
LVL 26

Accepted Solution

by:
Hilaire earned 1000 total points
ID: 13806101
This version handles files up to 74000 characters
It returns a handle to the pre-processed XML file as an output parameter, as does sp_xml_preparedocument.
So basically :
- use it instead of sp_xml_preparedocument
- use the returned pointer with OPENXML
- "close" the file with the "normal" sp_xml_removedocument procedure

HTH
Hilaire

create proc usp_OpenXML_From_File(@strPathToFile varchar(500), @hdoc int output) as
declare @objStream int, @retval int, @filesize int, @chunknum int
declare @chunkstr nvarchar(4000), @strSQL nvarchar(4000)
declare @str1 nvarchar(4000), @str2 nvarchar(4000), @str3 nvarchar(4000), @str4 nvarchar(4000), @str5 nvarchar(4000),
      @str6 nvarchar(4000), @str7 nvarchar(4000), @str8 nvarchar(4000), @str9 nvarchar(4000), @str10 nvarchar(4000),
      @str11 nvarchar(4000), @str12 nvarchar(4000), @str13 nvarchar(4000), @str14 nvarchar(4000), @str15 nvarchar(4000),
      @str16 nvarchar(4000), @str17 nvarchar(4000), @str18 nvarchar(4000), @str19 nvarchar(4000), @str20 nvarchar(4000)
declare @ErrorSource varchar(255),  @ErrorDesc varchar(255)

-- create Stream Object from file
exec @retval = sp_OACreate 'ADODB.Stream', @objStream OUT
--open the stream object.  default stream type is text and should fit our needs
exec sp_OASetProperty @objStream, 'Type', 2 -- text(default)
exec sp_OASetProperty @objStream, 'CharSet', 'utf-8'
exec sp_OAMethod @objStream, 'Open'
--load file content into stream
exec @retval = sp_OAMethod @objStream, 'LoadFromFile', Null, @strPathToFile
IF @retval = 0 -- succesfully loaded file
begin
      -- get file size
      exec sp_OAGetProperty @objStream, 'Size', @filesize OUT
      -- you need to check file size and decide if it's too big or not
      -- depending on the number of @strN you declared :
      -- eg with @str1 to @str20, you can handle files up to 20*3700 = 74000 characters
      if @filesize <= 74000
      begin
            set @chunknum = 1
            while @filesize > 0
            begin
                  exec sp_OAMethod @objStream, 'ReadText', @chunkstr out, 3700
                  set @chunkstr = replace(@chunkstr, char(39), char(39) + char(39))
                  select       @str1 = isnull(case when @chunknum = 1 then @chunkstr else @str1 end,''),
                        @str2 = isnull(case when @chunknum = 2 then @chunkstr else @str2 end,''),
                        @str3 = isnull(case when @chunknum = 3 then @chunkstr else @str3 end,''),
                        @str4 = isnull(case when @chunknum = 4 then @chunkstr else @str4 end,''),
                        @str5 = isnull(case when @chunknum = 5 then @chunkstr else @str5 end,''),
                        @str6 = isnull(case when @chunknum = 6 then @chunkstr else @str6 end,''),
                        @str7 = isnull(case when @chunknum = 7 then @chunkstr else @str7 end,''),
                        @str8 = isnull(case when @chunknum = 8 then @chunkstr else @str8 end,''),
                        @str9 = isnull(case when @chunknum = 9 then @chunkstr else @str9 end,''),
                        @str10 = isnull(case when @chunknum = 10 then @chunkstr else @str10 end,''),
                        @str11 = isnull(case when @chunknum = 11 then @chunkstr else @str11 end,''),
                        @str12 = isnull(case when @chunknum = 12 then @chunkstr else @str12 end,''),
                        @str13 = isnull(case when @chunknum = 13 then @chunkstr else @str13 end,''),
                        @str14 = isnull(case when @chunknum = 14 then @chunkstr else @str14 end,''),
                        @str15 = isnull(case when @chunknum = 15 then @chunkstr else @str15 end,''),
                        @str16 = isnull(case when @chunknum = 16 then @chunkstr else @str16 end,''),
                        @str17 = isnull(case when @chunknum = 17 then @chunkstr else @str17 end,''),
                        @str18 = isnull(case when @chunknum = 18 then @chunkstr else @str18 end,''),
                        @str19 = isnull(case when @chunknum = 19 then @chunkstr else @str19 end,''),
                        @str20 = isnull(case when @chunknum = 20 then @chunkstr else @str20 end,''),
                        @chunknum = @chunknum + 1, @filesize = @filesize - 3700
            end
      end
end

exec('declare @handle int;exec sp_xml_preparedocument @handle out, ''' + @str1 + @str2 + @str3 + @str4 + @str5
 + @str6 + @str7 + @str8 + @str9 + @str10 + @str11 + @str12 + @str13 + @str14 + @str15
 + @str16 + @str17 + @str18 + @str19 + @str20+ ''';declare c1 cursor global for select @handle')
open global c1
fetch c1 into @hdoc
close global c1
deallocate global c1
exec sp_OAMethod @objStream, 'Close'
exec sp_OADestroy @objStream
go
0
 
LVL 26

Expert Comment

by:Hilaire
ID: 13806183
Sample code of how to use it

DECLARE @hDoc int
EXEC usp_OpenXML_From_File 'c:\temp\test.xml', @hDoc OUTPUT

--  Reading the file using OPENXML
     SELECT *
     FROM OPENXML(@hDOC, '/Project/Tasks/Task', 2)
     WITH
     (
          taskID nvarchar(255) '@ID'
     )

exec sp_xml_removedocument @hdoc

Notes :
- the code above can easily be extended to open bigger files, just add more @strN variables ...
- the file must be visible from the SQL Server, but UNC paths are OK provided that the SQL Server has read permissions

eg a valid path could be

'\\file-server\publicSharedDirectory\filename.xml'

0
 

Author Comment

by:LTSave
ID: 13806555
Thanks Hilaire, your code is of great help.
0
 
LVL 8

Expert Comment

by:mani_sai
ID: 22885718
Hilaire:
Can you give the code sample which reads the large XML data from the text field?
Thank you very much.
0
 
LVL 26

Expert Comment

by:Hilaire
ID: 22889219
Here's the code I digged out from my personal archives. It comes "as is" and I won't have much time to give further support for the moment, but It's worth a try.

This code was developped on a sql server 2000 database, I'm pretty sure SQL server 2005 can handle this in a much better way, using native functions. I can't provide much help for the moment, since I happen to work mostly on Oracle databases these days (my customers do ^^), so I don't even have a suitable test environment at the time being.

/*
Here's a stored procedure that can return a valid handle on a xml file stored in a text column.
It does not use a text variable.
A reference to the text field is passed to the stored procedure with a table_name.column_name string AND a text pointer.
It's not perfect but handles XML strings up to around 300 Ko. Not perfect but better than the 8Ko limit with sp_xml_preparedocument ... The 300 Ko limitation is due to the fact that the stored procedure uses dynamic SQL.
*/
--The procedure
create proc usp_prepare_textXmlDoc (@textcol sysname, @txtPtr varbinary(16), @hDoc int output)
as begin
        declare @sql nvarchar(4000), @sql1 nvarchar(800), @sql2 nvarchar(2800), @sql3 nvarchar(400)
        declare @i int, @chunks# int, @tmpPtr varbinary(16)
        -- create temp table
        select cast('' as text) as txt into #temp
        set @i=0
        -- update #temp table with dynamic SQL
        set @sql = N'updatetext #temp.txt @tmpPtr null null ' + @textcol + ' @txtPtr'
        select @tmpPtr = textptr(txt) from #temp
        exec sp_executesql @sql, N'@tmpPtr varbinary(16),@txtPtr varbinary(16)', @tmpPtr, @txtPtr
        -- calculate number of chunks : chunk size is 7500 instead of 8000, cause we still have to escape single quotes
        select @chunks# = datalength(txt)/7500+1 from #temp
        while @i < @chunks#
        begin
                set @sql1 = coalesce(@sql1 + N',@c', N'declare @c') + cast(@i as nvarchar) + N' varchar(8000)'
                set @sql2 = coalesce(@sql2 + N',@c', N';select @c') + cast(@i as nvarchar) + N'=replace(substring(txt,' + cast(@i*7500+1 as nvarchar) + N',7500),char(39),char(39)+char(39))'
                set @sql3 = coalesce(@sql3 + N'+@c', N' from #temp; exec(''declare @handle int;exec sp_xml_preparedocument @handle out, ''''''+@c') + cast(@i as nvarchar)
                set @i = @i + 1
        end
        set @sql = @sql1 + @sql2 + @sql3 + '+'''''';declare c1 cursor global for select @handle'');open global c1;fetch c1 into @hdoc;close global c1;deallocate global c1'
        exec sp_executesql @sql, N'@hdoc int output', @hDoc out
        drop table #temp
end
go

-- How to use it
declare @txtPtr varbinary(16), @hdoc int
select @txtPtr = textptr(MATERIAL) from LIBRARY where LIBRARYKEY = 1
-- use your own table and column name to retrieve the actual text field
exec usp_prepare_textXmlDoc 'LIBRARY.MATERIAL', @txtPtr , @hDoc output
select @hdoc as [handle] -- this should be a valid handle if the text field content is well-formed XML
-- do your stuff with openXML here ...
exec sp_xml_removedocument @hdoc

Hope this will help

Hilaire
0

Featured Post

What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Your data is at risk. Probably more today that at any other time in history. There are simply more people with more access to the Web with bad intentions.
In this blog post, we’ll look at how using thread_statistics can cause high memory usage.
In this video, Percona Director of Solution Engineering Jon Tobin discusses the function and features of Percona Server for MongoDB. How Percona can help Percona can help you determine if Percona Server for MongoDB is the right solution for …
Despite its rising prevalence in the business world, "the cloud" is still misunderstood. Some companies still believe common misconceptions about lack of security in cloud solutions and many misuses of cloud storage options still occur every day. …
Suggested Courses

864 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question