Import XML file in sqlserver 2000

hi,

i want to import an xml file into sql server.

how can i start, pls help me with example.

regards,

M AS

devdeptAsked:
Who is Participating?

[Webinar] Streamline your web hosting managementRegister Today

x
 
Aneesh RetnakaranConnect With a Mentor Database AdministratorCommented:
SELECT    *
FROM       OPENXML (@idoc, '/newsitem/newsitempart/newsobject/text/p')
WITH (  
value nvarchar(4000) '.') ---------Make sure that this is NVARCHAR
0
 
Aneesh RetnakaranDatabase AdministratorCommented:
Search on Books online for 'OpenXML' there are some examples

DECLARE @idoc int
DECLARE @doc varchar(1000)
SET @doc ='
<ROOT>
<Customer CustomerID="VINET" ContactName="Paul Henriot">
   <Order CustomerID="VINET" EmployeeID="5" OrderDate="1996-07-04T00:00:00">
      <OrderDetail OrderID="10248" ProductID="11" Quantity="12"/>
      <OrderDetail OrderID="10248" ProductID="42" Quantity="10"/>
   </Order>
</Customer>
<Customer CustomerID="LILAS" ContactName="Carlos Gonzlez">
   <Order CustomerID="LILAS" EmployeeID="3" OrderDate="1996-08-16T00:00:00">
      <OrderDetail OrderID="10283" ProductID="72" Quantity="3"/>
   </Order>
</Customer>
</ROOT>'
--Create an internal representation of the XML document.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
-- Execute a SELECT statement that uses the OPENXML rowset provider.
SELECT    *
FROM       OPENXML (@idoc, '/ROOT/Customer',1)
            WITH (CustomerID  varchar(10),
                  ContactName varchar(20))

EXEC sp_xml_removedocument @idoc
0
 
devdeptAuthor Commented:
tnx a lot for your reply.  even sample is working fine.

But xml file which i am having is very completed which is as below


<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="http://idsdat06.reuters.com/rtrsnewsml/2000-10-06/rtrsnewsmlsubset01.xsl"?>
<!DOCTYPE newsitem SYSTEM "http://idsdat06.reuters.com/rtrsnewsml/2000-10-06/rtrsnewsml.dtd">


<!-- NOTE/CR01: IIM 2:60 Timezone Offset is currently NOT being applied by the Media Server -->

<newsitem
 itemid="OLR070241"
 id="MTFH07237_2006-05-10_19-31-04_OLR070241"
 date="2006-05-10T19:31:04+00:00"
 publisher="http://www.reuters.com/ids"
 xml:lang="ar"
 parts="1"
 revision="01"
>

<title>&#1588;&#1607;&#1608;&#1583;:&#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</title>

<headline>&#1588;&#1607;&#1608;&#1583;:&#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</headline>

<!-- Byline not present in IIM data -->

<copyright xml:lang="en-GB">(c) Reuters 2006. All rights reserved. Republication or redistribution of Reuters content, including by caching, framing or similar means, is expressly prohibited without the prior written consent of Reuters. Reuters and the Reuters sphere logo are registered trademarks and trademarks of the Reuters group of companies around the world.</copyright>

<credit>Reuters</credit>

<dateline>2006-05-10 19:31:04 GMT (Reuters)</dateline>

<caption>WMEGN-GAZA-RAID:&#1588;&#1607;&#1608;&#1583;:&#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</caption>

<!-- Citation undefined -->
<!-- tagline undefined -->

<!-- Text of News object -->

<newsitempart role="MAIN"
 id="MTFH07237_2006-05-10_19-31-04_OLR070241_MAIN" >
<newsobject mimetype="text/x-newstext"
 id="MTFH07237_2006-05-10_19-31-04_OLR070241_MAIN1"
 mediatype="text" >
<text>
<p>&#1588;&#1607;&#1608;&#1583;: &#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</p>
</text>
</newsobject>

</newsitempart>

<!-- METADATA -->

<metadata>
      <codes class="TP:ID">
            <code code="RTRIDS"/>
      </codes>
      <codes class="IDS:DTDINFO">
            <code code="RTRSNewsML DTD, includes rtrsnewsml.metadata.dtd, rtrsnewsml.handling.dtd, rtrsnewsml.dc.dtd"/>
      </codes>
      <codes class="IDS:FILENAME">
            <code code="2006-05-10T193104Z_01_OLR070241_RTRIDST_0_WMEGN-GAZA-RAID.XML"/>
      </codes>
      <codes class="IDS:MEDIATYPE">
            <code code="TEXT"/>
      </codes>
      <codes class="TP:NUMLOCALDDFS">
            <code code="0"/>
      </codes>
      <codes class="TP:LOCALDDFLIST">
      </codes>
<!-- NOTE/CR01: IIM 2:60 Timezone Offset is currently NOT being applied by the Media Server -->
      <codes class="IIM:DATETIMECREATED">
            <code code="20060510 193104 GMT+00:00 (IIM Timezone Offset currently NOT applied, so always 00:00)"/>
      </codes>
      <codes class="IIM:LANGUAGE">
            <code code="ar"/>
      </codes>
      <codes class="IIM:EDITSTATUS">
            <code code="REUTERS"/>
      </codes>
<!-- No Category code present in IIM data -->
      <codes class="IIM:SUPPLEMENTALCATEGORY">
            <code code="OVR"/>
      </codes>
      <codes class="N2000:TOPIC">
            <code code="OVR"/>
      </codes>
<!-- No N2000:NamedItem codes present in IIM data -->
<!-- No Keywords present in IIM data -->
      <!-- Supplemental category mapping table version: 2.4.1 (Class version: 00.00.12) -->
      <!--  No subject references mapped  -->
      <codes class="IDS:FFHTESTDATA">
            <code code="FeedHandler v4.01"/>
            <code code="NewsMLGenerator v3.1"/>
            <code code="NewsMLGeneratorIIM v3.35"/>
            <code code="NewsMLGeneratorIIMText v03.03"/>
            <code code="NewsMLGeneratorIIMPicture v3.2"/>
            <code code="NewsMLGeneratorIIMGraphic v3.1"/>
            <code code="ContentManager v03.01"/>
      </codes>

      <codes class="IDS:NEWSMLDUID">
            <code code="MTFH07237_2006-05-10_19-31-04_OLR070241_NEWSML"/>
      </codes>

      <codes class="IDS:NEWSMLDUIDBASE">
            <code code="MTFH07237_2006-05-10_19-31-04_OLR070241"/>
      </codes>

      <codes class="IIM:SPECIALINSTRUCTIONS">
            <code code="New Story"/>
      </codes>


<!-- DC codes -->

      <dc element="dc.date.created" value="2006-05-10T19:31:04+00:00"/>
      <dc element="dc.date.published" value="2006-05-10T19:31:04+00:00"/>
      <dc element="dc.creator.program" value="RTR_JANUS 2.300"/>
</metadata>

<!-- HANDLING -->
<handling>
      <routing>WMEGN</routing>
      <product>TXT</product>
      <priority>2</priority>
      <service>RTR_TNS</service>
      <slug>WMEGN-GAZA-RAID</slug>
</handling>


<!-- SOURCE DATA -->
<sourcedata id="MTFH07237_2006-05-10_19-31-04_OLR070241_SOURCEDATA">
DISCLAIMER: Please note that this data is RTRSNewsML compliant, but will NOT contain ALL the expected markup.
Input File: mdf-10-05-2006-19-31-00S.II2
</sourcedata>
</newsitem>





from this file i need only values under columns newsobject and SUPPLEMENTALCATEGORY

in above example &#1588;&#1607;&#1608;&#1583;: &#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;

and

OVR

are the values.



your guidelines will be appreciated.

REgards,

M AS
0
2018 Annual Membership Survey

Here at Experts Exchange, we strive to give members the best experience. Help us improve the site by taking this survey today! (Bonus: Be entered to win a great tech prize for participating!)

 
devdeptAuthor Commented:

Just for information newsobject value is arabic which our EE is not supporting. so you can take instead of arabic as english.

Regards,

M AS
0
 
Aneesh RetnakaranDatabase AdministratorCommented:
I'll take a look into it and will send you the Query from Home, Hang on for 2 hrs...
0
 
Aneesh RetnakaranDatabase AdministratorCommented:
Sorry for this delayed reply.. actually it was heavily raining here at the night

I found some errors in your xml , i have added one additional tag 'xml' , the code is not complete any way I hope you can complete it ..


declare @doc varchar(8000)
SET @doc  = '<xml>
<newsitem
 itemid="OLR070241"
 id="MTFH07237_2006-05-10_19-31-04_OLR070241"
 date="2006-05-10T19:31:04+00:00"
 publisher="http://www.reuters.com/ids"
 xml:lang="ar"
 parts="1"
 revision="01"/>

<title>&#1588;&#1607;&#1608;&#1583;:&#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</title>

<headline>&#1588;&#1607;&#1608;&#1583;:&#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</headline>

<!-- Byline not present in IIM data -->

<copyright xml:lang="en-GB">(c) Reuters 2006. All rights reserved. Republication or redistribution of Reuters content, including by caching, framing or similar means, is expressly prohibited without the prior written consent of Reuters. Reuters and the Reuters sphere logo are registered trademarks and trademarks of the Reuters group of companies around the world.</copyright>

<credit>Reuters</credit>

<dateline>2006-05-10 19:31:04 GMT (Reuters)</dateline>

<caption>WMEGN-GAZA-RAID:&#1588;&#1607;&#1608;&#1583;:&#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</caption>

<!-- Citation undefined -->
<!-- tagline undefined -->

<!-- Text of News object -->

<newsitempart role="MAIN"
 id="MTFH07237_2006-05-10_19-31-04_OLR070241_MAIN" >
<newsobject mimetype="text/x-newstext"
 id="MTFH07237_2006-05-10_19-31-04_OLR070241_MAIN1"
 mediatype="text" >
<text>
<p>&#1588;&#1607;&#1608;&#1583;: &#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</p>
</text>
</newsobject>

</newsitempart>

<!-- METADATA -->

<metadata>
     <codes class="TP:ID">
          <code code="RTRIDS"/>
     </codes>
     <codes class="IDS:DTDINFO">
          <code code="RTRSNewsML DTD, includes rtrsnewsml.metadata.dtd, rtrsnewsml.handling.dtd, rtrsnewsml.dc.dtd"/>
     </codes>
     <codes class="IDS:FILENAME">
          <code code="2006-05-10T193104Z_01_OLR070241_RTRIDST_0_WMEGN-GAZA-RAID.XML"/>
     </codes>
     <codes class="IDS:MEDIATYPE">
          <code code="TEXT"/>
     </codes>
     <codes class="TP:NUMLOCALDDFS">
          <code code="0"/>
     </codes>
     <codes class="TP:LOCALDDFLIST">
     </codes>
<!-- NOTE/CR01: IIM 2:60 Timezone Offset is currently NOT being applied by the Media Server -->
     <codes class="IIM:DATETIMECREATED">
          <code code="20060510 193104 GMT+00:00 (IIM Timezone Offset currently NOT applied, so always 00:00)"/>
     </codes>
     <codes class="IIM:LANGUAGE">
          <code code="ar"/>
     </codes>
     <codes class="IIM:EDITSTATUS">
          <code code="REUTERS"/>
     </codes>
<!-- No Category code present in IIM data -->
     <codes class="IIM:SUPPLEMENTALCATEGORY">
          <code code="OVR"/>
     </codes>
     <codes class="N2000:TOPIC">
          <code code="OVR"/>
     </codes>
<!-- No N2000:NamedItem codes present in IIM data -->
<!-- No Keywords present in IIM data -->
     <!-- Supplemental category mapping table version: 2.4.1 (Class version: 00.00.12) -->
     <!--  No subject references mapped  -->
     <codes class="IDS:FFHTESTDATA">
          <code code="FeedHandler v4.01"/>
          <code code="NewsMLGenerator v3.1"/>
          <code code="NewsMLGeneratorIIM v3.35"/>
          <code code="NewsMLGeneratorIIMText v03.03"/>
          <code code="NewsMLGeneratorIIMPicture v3.2"/>
          <code code="NewsMLGeneratorIIMGraphic v3.1"/>
          <code code="ContentManager v03.01"/>
     </codes>

     <codes class="IDS:NEWSMLDUID">
          <code code="MTFH07237_2006-05-10_19-31-04_OLR070241_NEWSML"/>
     </codes>

     <codes class="IDS:NEWSMLDUIDBASE">
          <code code="MTFH07237_2006-05-10_19-31-04_OLR070241"/>
     </codes>

     <codes class="IIM:SPECIALINSTRUCTIONS">
          <code code="New Story"/>
     </codes>


<!-- DC codes -->

     <dc element="dc.date.created" value="2006-05-10T19:31:04+00:00"/>
     <dc element="dc.date.published" value="2006-05-10T19:31:04+00:00"/>
     <dc element="dc.creator.program" value="RTR_JANUS 2.300"/>
</metadata>

<!-- HANDLING -->
<handling>
     <routing>WMEGN</routing>
     <product>TXT</product>
     <priority>2</priority>
     <service>RTR_TNS</service>
     <slug>WMEGN-GAZA-RAID</slug>
</handling> </xml>'

DECLARE @idoc int

exec sp_xml_preparedocument @idoc OUTPUT, @doc

SELECT    *
FROM       OPENXML (@idoc, '/xml/newsitem')
WITH (       itemid  varchar(10) ,
      id      varchar(100) ,
      date    varchar(100),
publisher  varchar(100),
[xml:lang]  varchar(10),
parts int,
revision varchar(2)
)


SELECT    *
FROM       OPENXML (@idoc, '/xml/metadata/codes', 1)
WITH (       class  varchar(100) '@class',
      code      varchar(20) 'code/@code' )

SELECT    *
FROM       OPENXML (@idoc, '/xml/metadata/dc', 1)
WITH (       element  varchar(10) '@element',
      value      varchar(20) '@value' )

--- from handling section
SELECT    *
FROM       OPENXML (@idoc, '/xml/handling')
WITH (       Routing  varchar(10) 'routing',
      product      varchar(10) 'product',
      priority int 'priority',
service varchar(10) 'service',
slug varchar(10) 'slug'
)
EXEC sp_xml_removedocument @idoc
0
 
devdeptAuthor Commented:
Tnx alot.

I have two more things

1. Query to retrieve data from newsitempart/newsobject i.e data between <text > and </text>

2. My xml sheet has arabic data also. Whenever i take arabic data it gives following error.

XML parsing error: An Invalid character was found in text content.


Regards,

M Abdussami
0
 
Aneesh RetnakaranDatabase AdministratorCommented:
>1. Query to retrieve data from newsitempart/newsobject i.e data between <text > and </text>

SELECT    *
FROM       OPENXML (@idoc, '/xml/newsitempart/newsobject/text/p')
WITH (  
value nvarchar(4000) '.')


>2. My xml sheet has arabic data also. Whenever i take arabic data it gives following error

Do this
Save the xml as a .xml file and open it in Internet explorer, it will give you where exaclty the error occured
0
 
devdeptAuthor Commented:
When i open in internet explorer it is not giving any error. But in @doc variable when i set contents of xml file which is having arabic contents it gives error in TSQL. is there is any way to read such encode xml files.

Regards,

M Abdussami
0
 
Aneesh RetnakaranDatabase AdministratorCommented:
can u post the XML herre
0
 
devdeptAuthor Commented:
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="http://idsdat06.reuters.com/rtrsnewsml/2000-10-06/rtrsnewsmlsubset01.xsl"?>
<!DOCTYPE newsitem SYSTEM "http://idsdat06.reuters.com/rtrsnewsml/2000-10-06/rtrsnewsml.dtd">


<!-- NOTE/CR01: IIM 2:60 Timezone Offset is currently NOT being applied by the Media Server -->

<newsitem
 itemid="OLR070241"
 id="MTFH07237_2006-05-10_19-31-04_OLR070241"
 date="2006-05-10T19:31:04+00:00"
 publisher="http://www.reuters.com/ids"
 xml:lang="ar"
 parts="1"
 revision="01"
>

<title>&#1588;&#1607;&#1608;&#1583;:&#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</title>

<headline>&#1588;&#1607;&#1608;&#1583;:&#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</headline>

<!-- Byline not present in IIM data -->

<copyright xml:lang="en-GB">(c) Reuters 2006. All rights reserved. Republication or redistribution of Reuters content, including by caching, framing or similar means, is expressly prohibited without the prior written consent of Reuters. Reuters and the Reuters sphere logo are registered trademarks and trademarks of the Reuters group of companies around the world.</copyright>

<credit>Reuters</credit>

<dateline>2006-05-10 19:31:04 GMT (Reuters)</dateline>

<caption>WMEGN-GAZA-RAID:&#1588;&#1607;&#1608;&#1583;:&#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</caption>

<!-- Citation undefined -->
<!-- tagline undefined -->

<!-- Text of News object -->

<newsitempart role="MAIN"
 id="MTFH07237_2006-05-10_19-31-04_OLR070241_MAIN" >
<newsobject mimetype="text/x-newstext"
 id="MTFH07237_2006-05-10_19-31-04_OLR070241_MAIN1"
 mediatype="text" >
<text>
<p>&#1588;&#1607;&#1608;&#1583;: &#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</p>
</text>
</newsobject>

</newsitempart>

<!-- METADATA -->

<metadata>
      <codes class="TP:ID">
            <code code="RTRIDS"/>
      </codes>
      <codes class="IDS:DTDINFO">
            <code code="RTRSNewsML DTD, includes rtrsnewsml.metadata.dtd, rtrsnewsml.handling.dtd, rtrsnewsml.dc.dtd"/>
      </codes>
      <codes class="IDS:FILENAME">
            <code code="2006-05-10T193104Z_01_OLR070241_RTRIDST_0_WMEGN-GAZA-RAID.XML"/>
      </codes>
      <codes class="IDS:MEDIATYPE">
            <code code="TEXT"/>
      </codes>
      <codes class="TP:NUMLOCALDDFS">
            <code code="0"/>
      </codes>
      <codes class="TP:LOCALDDFLIST">
      </codes>
<!-- NOTE/CR01: IIM 2:60 Timezone Offset is currently NOT being applied by the Media Server -->
      <codes class="IIM:DATETIMECREATED">
            <code code="20060510 193104 GMT+00:00 (IIM Timezone Offset currently NOT applied, so always 00:00)"/>
      </codes>
      <codes class="IIM:LANGUAGE">
            <code code="ar"/>
      </codes>
      <codes class="IIM:EDITSTATUS">
            <code code="REUTERS"/>
      </codes>
<!-- No Category code present in IIM data -->
      <codes class="IIM:SUPPLEMENTALCATEGORY">
            <code code="OVR"/>
      </codes>
      <codes class="N2000:TOPIC">
            <code code="OVR"/>
      </codes>
<!-- No N2000:NamedItem codes present in IIM data -->
<!-- No Keywords present in IIM data -->
      <!-- Supplemental category mapping table version: 2.4.1 (Class version: 00.00.12) -->
      <!--  No subject references mapped  -->
      <codes class="IDS:FFHTESTDATA">
            <code code="FeedHandler v4.01"/>
            <code code="NewsMLGenerator v3.1"/>
            <code code="NewsMLGeneratorIIM v3.35"/>
            <code code="NewsMLGeneratorIIMText v03.03"/>
            <code code="NewsMLGeneratorIIMPicture v3.2"/>
            <code code="NewsMLGeneratorIIMGraphic v3.1"/>
            <code code="ContentManager v03.01"/>
      </codes>

      <codes class="IDS:NEWSMLDUID">
            <code code="MTFH07237_2006-05-10_19-31-04_OLR070241_NEWSML"/>
      </codes>

      <codes class="IDS:NEWSMLDUIDBASE">
            <code code="MTFH07237_2006-05-10_19-31-04_OLR070241"/>
      </codes>

      <codes class="IIM:SPECIALINSTRUCTIONS">
            <code code="New Story"/>
      </codes>


<!-- DC codes -->

      <dc element="dc.date.created" value="2006-05-10T19:31:04+00:00"/>
      <dc element="dc.date.published" value="2006-05-10T19:31:04+00:00"/>
      <dc element="dc.creator.program" value="RTR_JANUS 2.300"/>
</metadata>

<!-- HANDLING -->
<handling>
      <routing>WMEGN</routing>
      <product>TXT</product>
      <priority>2</priority>
      <service>RTR_TNS</service>
      <slug>WMEGN-GAZA-RAID</slug>
</handling>


<!-- SOURCE DATA -->
<sourcedata id="MTFH07237_2006-05-10_19-31-04_OLR070241_SOURCEDATA">
DISCLAIMER: Please note that this data is RTRSNewsML compliant, but will NOT contain ALL the expected markup.
Input File: mdf-10-05-2006-19-31-00S.II2
</sourcedata>
</newsitem>
0
 
Aneesh RetnakaranDatabase AdministratorCommented:
Let me checK..
0
 
Aneesh RetnakaranDatabase AdministratorCommented:
It is working fine for me both in SQL server and IE


declare @doc varchar(8000)
SET @doc  = '
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="http://idsdat06.reuters.com/rtrsnewsml/2000-10-06/rtrsnewsmlsubset01.xsl"?>
<!DOCTYPE newsitem SYSTEM "http://idsdat06.reuters.com/rtrsnewsml/2000-10-06/rtrsnewsml.dtd">


<!-- NOTE/CR01: IIM 2:60 Timezone Offset is currently NOT being applied by the Media Server -->

<newsitem
 itemid="OLR070241"
 id="MTFH07237_2006-05-10_19-31-04_OLR070241"
 date="2006-05-10T19:31:04+00:00"
 publisher="http://www.reuters.com/ids"
 xml:lang="ar"
 parts="1"
 revision="01"
>

<title>&#1588;&#1607;&#1608;&#1583;:&#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</title>

<headline>&#1588;&#1607;&#1608;&#1583;:&#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</headline>

<!-- Byline not present in IIM data -->

<copyright xml:lang="en-GB">(c) Reuters 2006. All rights reserved. Republication or redistribution of Reuters content, including by caching, framing or similar means, is expressly prohibited without the prior written consent of Reuters. Reuters and the Reuters sphere logo are registered trademarks and trademarks of the Reuters group of companies around the world.</copyright>

<credit>Reuters</credit>

<dateline>2006-05-10 19:31:04 GMT (Reuters)</dateline>

<caption>WMEGN-GAZA-RAID:&#1588;&#1607;&#1608;&#1583;:&#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</caption>

<!-- Citation undefined -->
<!-- tagline undefined -->

<!-- Text of News object -->

<newsitempart role="MAIN"
 id="MTFH07237_2006-05-10_19-31-04_OLR070241_MAIN" >
<newsobject mimetype="text/x-newstext"
 id="MTFH07237_2006-05-10_19-31-04_OLR070241_MAIN1"
 mediatype="text" >
<text>
<p>&#1588;&#1607;&#1608;&#1583;: &#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;&#1586;</p>
</text>
</newsobject>

</newsitempart>

<!-- METADATA -->

<metadata>
     <codes class="TP:ID">
          <code code="RTRIDS"/>
     </codes>
     <codes class="IDS:DTDINFO">
          <code code="RTRSNewsML DTD, includes rtrsnewsml.metadata.dtd, rtrsnewsml.handling.dtd, rtrsnewsml.dc.dtd"/>
     </codes>
     <codes class="IDS:FILENAME">
          <code code="2006-05-10T193104Z_01_OLR070241_RTRIDST_0_WMEGN-GAZA-RAID.XML"/>
     </codes>
     <codes class="IDS:MEDIATYPE">
          <code code="TEXT"/>
     </codes>
     <codes class="TP:NUMLOCALDDFS">
          <code code="0"/>
     </codes>
     <codes class="TP:LOCALDDFLIST">
     </codes>
<!-- NOTE/CR01: IIM 2:60 Timezone Offset is currently NOT being applied by the Media Server -->
     <codes class="IIM:DATETIMECREATED">
          <code code="20060510 193104 GMT+00:00 (IIM Timezone Offset currently NOT applied, so always 00:00)"/>
     </codes>
     <codes class="IIM:LANGUAGE">
          <code code="ar"/>
     </codes>
     <codes class="IIM:EDITSTATUS">
          <code code="REUTERS"/>
     </codes>
<!-- No Category code present in IIM data -->
     <codes class="IIM:SUPPLEMENTALCATEGORY">
          <code code="OVR"/>
     </codes>
     <codes class="N2000:TOPIC">
          <code code="OVR"/>
     </codes>
<!-- No N2000:NamedItem codes present in IIM data -->
<!-- No Keywords present in IIM data -->
     <!-- Supplemental category mapping table version: 2.4.1 (Class version: 00.00.12) -->
     <!--  No subject references mapped  -->
     <codes class="IDS:FFHTESTDATA">
          <code code="FeedHandler v4.01"/>
          <code code="NewsMLGenerator v3.1"/>
          <code code="NewsMLGeneratorIIM v3.35"/>
          <code code="NewsMLGeneratorIIMText v03.03"/>
          <code code="NewsMLGeneratorIIMPicture v3.2"/>
          <code code="NewsMLGeneratorIIMGraphic v3.1"/>
          <code code="ContentManager v03.01"/>
     </codes>

     <codes class="IDS:NEWSMLDUID">
          <code code="MTFH07237_2006-05-10_19-31-04_OLR070241_NEWSML"/>
     </codes>

     <codes class="IDS:NEWSMLDUIDBASE">
          <code code="MTFH07237_2006-05-10_19-31-04_OLR070241"/>
     </codes>

     <codes class="IIM:SPECIALINSTRUCTIONS">
          <code code="New Story"/>
     </codes>


<!-- DC codes -->

     <dc element="dc.date.created" value="2006-05-10T19:31:04+00:00"/>
     <dc element="dc.date.published" value="2006-05-10T19:31:04+00:00"/>
     <dc element="dc.creator.program" value="RTR_JANUS 2.300"/>
</metadata>

<!-- HANDLING -->
<handling>
     <routing>WMEGN</routing>
     <product>TXT</product>
     <priority>2</priority>
     <service>RTR_TNS</service>
     <slug>WMEGN-GAZA-RAID</slug>
</handling>


<!-- SOURCE DATA -->
<sourcedata id="MTFH07237_2006-05-10_19-31-04_OLR070241_SOURCEDATA">
DISCLAIMER: Please note that this data is RTRSNewsML compliant, but will NOT contain ALL the expected markup.
Input File: mdf-10-05-2006-19-31-00S.II2
</sourcedata>
</newsitem>
'
DECLARE @idoc int

exec sp_xml_preparedocument @idoc OUTPUT, @doc

SELECT    *
FROM       OPENXML (@idoc, '/newsitem')
WITH (      itemid  varchar(10) ,
     id     varchar(100) ,
     date    varchar(100),
publisher  varchar(100),
[xml:lang]  varchar(10),
parts int,
revision varchar(2)
)


SELECT    *
FROM       OPENXML (@idoc, '/newsitem/metadata/codes', 1)
WITH (      class  varchar(100) '@class',
     code     varchar(20) 'code/@code' )

SELECT    *
FROM       OPENXML (@idoc, '/newsitem/metadata/dc', 1)
WITH (      element  varchar(10) '@element',
     value     varchar(20) '@value' )

--- from handling section
SELECT    *
FROM       OPENXML (@idoc, '/newsitem/handling')
WITH (      Routing  varchar(10) 'routing',
     product     varchar(10) 'product',
     priority int 'priority',
service varchar(10) 'service',
slug varchar(10) 'slug'
)

SELECT    *
FROM       OPENXML (@idoc, '/newsitem/newsitempart/newsobject/text/p')
WITH (  
value nvarchar(4000) '.')
EXEC sp_xml_removedocument @idoc
0
 
devdeptAuthor Commented:
Ofcourse it is working fine at my side also if the contents of my xml is only english. But in my real data <text> </text> is arabic (unicode). When i process with that real data which is having arabic contents it is giving error message.

So we need to know how to parse if we have a unicode xml file.

Thanks a lot for continues support.

Regards,

M Abdussami
0
 
Aneesh RetnakaranDatabase AdministratorCommented:
>. But in my real data <text> </text> is arabic (unicode)

Is it giving error in SQL or in IE
0
 
devdeptAuthor Commented:
yes it is nvarchar
0
 
Aneesh RetnakaranDatabase AdministratorCommented:
I am getting Arabic characters here
0
 
Aneesh RetnakaranDatabase AdministratorCommented:
&#1588;&#1607;&#1608;&#1583;: &#1594;&#1575;&#1585;&#1577; &#1580;&#1608;&#1610;&#1577; &#1575;&#1587;&#1585;&#1575;&#1574;&#1610;&#1604;&#1610;&#1577; &#1601;&#1610; &#1594;&#1586;&#1577; &#1604;&#1605; &#1610;&#1578;&#1590;&#1581; &#1607;&#1583;&#1601;&#1607;&#1575; &#1576;&#1593;&#1583;/&#1585;&#1608;&#1610;&#1578;&#1585;
0
 
devdeptAuthor Commented:
:)..

Thanks a lot.

It is working fine after i did little modification in page encoding of xml file from 'utf-8'  to  'windows-1256'


Regards,

M Abdussami
0
 
Aneesh RetnakaranDatabase AdministratorCommented:
Ok.. for me UTF-8 is working fine ..
0
All Courses

From novice to tech pro — start learning today.