Link to home
Start Free TrialLog in
Avatar of tobinmarch
tobinmarch

asked on

Need a Stored Procedure which gets XML data from a URL and inserts it into a table.

I don't know if this is possible but I'll let you take a look at this.

Below is a sample of the XML data recieved when typing a URL in the browser.

<?xml version="1.0" standalone="yes" ?>
 <!DOCTYPE pswapi (View Source for full doctype...)>
- <data>
 <transaction TransactionID="123" SubscriberID="123" EventCode="D" SubAccountID="0" SiteID="123" ProductCode="xxxx" Amount="0.00" IPAddress="DELETE" ScheduledAction="N" ScheduledDate="0000-00-00 00:00:00" User1="11" User2="" Timestamp="2003-04-19 03:21:01" OriginalTransactionID="" />
 <transaction TransactionID="321" SubscriberID="321" EventCode="D" SubAccountID="0" SiteID="321" ProductCode="yyyy" Amount="0.00" IPAddress="DELETE" ScheduledAction="N" ScheduledDate="0000-00-00 00:00:00" User1="22" User2="" Timestamp="2003-04-19 03:21:01" OriginalTransactionID="" />
 </data>

My ISP is running SQL Server 2000 and IIS 5.0.

Ideal Solution is to have a Stored Prrocedure run at regular intervals via Scheduled Transactions in SQL Server which gets the above data and inserts it into a table.

Due to the urgency I am making this worth 500 pts.

Good Luck, Thanks
Avatar of amit_g
amit_g
Flag of United States of America image

DECLARE @oXMLDom int
DECLARE @hr int
DECLARE @src varchar(255), @desc varchar(255)
DECLARE @bLoaded bit
DECLARE @xml as varchar(2000)
DECLARE @ixml int

EXEC @hr = sp_OACreate 'Microsoft.XMLDOM', @oXMLDom OUT

IF @hr <> 0
BEGIN
     EXEC sp_OAGetErrorInfo @oXMLDom, @src OUT, @desc OUT
     SELECT hr=convert(varbinary(4),@hr), Source=@src, Description=@desc

     RETURN
END

--select @oXMLDom

EXEC @hr = sp_OASetProperty @oXMLDom, 'async', false

--select @xml

IF @hr <> 0
BEGIN
     EXEC sp_OAGetErrorInfo @oXMLDom, @src OUT, @desc OUT
     SELECT hr=convert(varbinary(4),@hr), Source=@src, Description=@desc

     RETURN
END

EXEC @hr = sp_OAMethod @oXMLDom, 'load', @bLoaded OUT, 'http://localhost/Test/SQLTest2.xml'

--select @bLoaded

IF @hr <> 0
BEGIN
     EXEC sp_OAGetErrorInfo @oXMLDom, @src OUT, @desc OUT
     SELECT hr=convert(varbinary(4),@hr), Source=@src, Description=@desc

     RETURN
END

EXEC @hr = sp_OAGetProperty @oXMLDom, 'xml', @xml OUT

--select @xml

IF @hr <> 0
BEGIN
     EXEC sp_OAGetErrorInfo @oXMLDom, @src OUT, @desc OUT
     SELECT hr=convert(varbinary(4),@hr), Source=@src, Description=@desc

     RETURN
END

EXEC sp_xml_preparedocument @ixml OUTPUT, @xml

-- do insert here
SELECT *
FROM OPENXML (@ixml, '/data/transaction',1)
     WITH (
               TransactionID  varchar(10),
               SubscriberID varchar(20),
               ProductCode varchar(20)
-- add more columns and give correct type
          )

EXEC @hr = sp_OADestroy @oXMLDom

IF @hr <> 0
BEGIN
     EXEC sp_OAGetErrorInfo @oXMLDom, @src OUT, @desc OUT
     SELECT hr=convert(varbinary(4),@hr), Source=@src, Description=@desc

     RETURN
END
Avatar of tobinmarch
tobinmarch

ASKER

I am working on this solution...I need access to the stored procedures being called from this script before i can fully test it.
Does you ISP allow you access to set up Virtual Directories for SQL server?  If that is the case you can set this up to create an xml file on the fly.  Look it up.  Really handy!
The ideal solution to this problem would be to code this logic in a vbs or vb or any other language and schedule the vbs file or the exe in windows scheduler. If you can get this permission from the ISP, do it this way.

Speaking of that, by giving execute permission to these stored procedures (sp_OA*), your ISP would essentially give you almost complete access to his hosting m/c !!!
Would it be possible to copy these sp_OA* stored procedures to my database, rename them and call them from the script?
No.  They are Extended Stored Procedures.  That is, they are in a DLL, in this case odsole70.dll

Anthony
I don't want to over simplify the problem, but SQL Server handles XML fine without a bunch of over-complicated scripting:


http://support.microsoft.com/default.aspx?scid=kb;en-us;Q316005
>>SQL Server handles XML fine without a bunch of over-complicated scripting<<

I agree.

Anthony
As I have already written, the ideal and the simplest way to do so would be in an vb/c++ exe or in vbs file and schedule it in windows scheduler (may be 10-15 lines of code) but the way the problem has been written, it looks like we don't have that option. The proposed code is not complicated because of SQL server not handling the xml well but because we want to pull xml from web using http from the stored procedure. The use of sp_OA* is just for that reason. If the xml file was available locally, there is no need to use these.

Also the code looks complicated because of error handling after every sp_OA call, if we remove those statements the real code is only ...

DECLARE @hr int
DECLARE @src varchar(255), @desc varchar(255)
DECLARE @bLoaded bit
DECLARE @xml as varchar(2000)
DECLARE @ixml int

EXEC @hr = sp_OACreate 'Microsoft.XMLDOM', @oXMLDom OUT

EXEC @hr = sp_OASetProperty @oXMLDom, 'async', false

EXEC @hr = sp_OAMethod @oXMLDom, 'load', @bLoaded OUT, 'http://localhost/Test/SQLTest2.xml'

EXEC @hr = sp_OAGetProperty @oXMLDom, 'xml', @xml OUT

EXEC sp_xml_preparedocument @ixml OUTPUT, @xml

SELECT *
FROM OPENXML (@ixml, '/data/transaction',1)
    WITH (
              TransactionID  varchar(10),
              SubscriberID varchar(20),
              ProductCode varchar(20)
-- add more columns and give correct type
         )

EXEC @hr = sp_OADestroy @oXMLDom
Any time you use sp_OA it adds complications and unreliability--I would only use them as last restort.  Why don't you check out SQLXML upgrades for SQL--I bet these will add the functionality you need:

http://www.microsoft.com/downloads/details.aspx?FamilyId=4023DEEA-F179-45DE-B41D-84E4FF655A3B&displaylang=en


Brett
I am not going to be able to use the sp_OA* stored procedures on my SQL Server due to security settings at my ISP :(

I have read the article referred to me by arbert:

http://support.microsoft.com/default.aspx?scid=kb;en-us;Q316005

I suspect this may be a good direction to follow. I have a question about the XML Data Source they are refering to inthis article. Their XML data is structured like this:

<ROOT>
  <Customers>
    <CustomerId>1111</CustomerId>
    <CompanyName>Sean Chai</CompanyName>
    <City>NY</City>
  </Customers>
  <Customers>
    <CustomerId>1112</CustomerId>
    <CompanyName>Tom Johnston</CompanyName>
    <City>LA</City>
  </Customers>
  <Customers>
    <CustomerId>1113</CustomerId>
    <CompanyName>Institute of Art</CompanyName>
  </Customers>
</ROOT>

Each of the values are surounded by individual tags...eg.<CustomerID>1111</CustomerID>. My XML data looks like. <Transaction TransactionID="1111" etc.../>. The values are embedded in the Transaction tag. How can I reference this data????

Microsoft is referencing their data like this:
<ElementType name="CustomerId" dt:type="int" />

How would I reference the data in my XML file??
I suspected that from the beginning but if you need this implemented in stored procedure, that is the only solution.

You can read your XML like this ...

DECLARE @ixml int
DECLARE @xml as varchar(2000)

set @xml = 'Place your XML here'

EXEC sp_xml_preparedocument @ixml OUTPUT, @xml

SELECT *
FROM OPENXML (@ixml, '/data/transaction',1)
    WITH (
              TransactionID  varchar(10),
              SubscriberID varchar(20),
              ProductCode varchar(20)
-- add more columns and give correct type
         )

You still need to solve the problem of getting xml from url. Your problem has 2 parts

1) Getting the xml from url using only SQL Server.
2) Parsing the xml and inserting records in SQL.

The problem 2 has a very simple solution that is listed here.

Problem 1 is still pending.
I don't like it when people quote "that is the only solution".  A stored proc is NOT your only solution here.  You can easily setup a scheduled DTS package to accomplish the same thing.  The only gotcha is that you need to install the SQLXML services that Microsoft released after SQL2000 was released.  After you install this, you can access XML just like any other data source.

Brett
Can I use a .vbs script to get the data and execute a stored procedure passing the xml as a variable. this could be set up to run in windows scheduler provided my ISP will allow it. I think they will as I know them personally.

If this would work could someone provide me with a sample of the vbscript?
What does this service pack allow me to do?
http://www.microsoft.com/downloads/details.aspx?FamilyId=4023DEEA-F179-45DE-B41D-84E4FF655A3B&displaylang=en

It states that you can:
Load XML into SQL Server

But how??
arbert, I never said that stored proc is the only solution. What I wrote was that if you need this done via stored procedure then this is the only solution i.e. to use sp_OA*. Please read the comments carefully. Here is what I wrote ...

"... if you need this implemented in stored procedure, that is the only solution."

In fact, I have already mentioned ...

"The ideal solution to this problem would be to code this logic in a vbs or vb or any other language and schedule the vbs file or the exe in windows scheduler..."

tobinmarch, yes you can do it in vbs file. In fact that would be much simpler and better way to do so. Here is how you can do that.

'VBS File CODE

Const adVarChar = 200
Const adParamInput = &H0001
Const adCmdStoredProc = &H0004

dim oXMLDom, bLoaded, sXML

set oXMLDom = CreateObject("Microsoft.XMLDOM")

oXMLDom.async = false

bLoaded = oXMLDom.Load("http://localhost/Test/Test.xml")

if (not bLoaded) then
     MsgBox("Error - XML Not Loaded")
end if

sXML = oXMLDom.xml

dim oConn, oCmd, oParameter

set oConn = CreateObject("ADODB.Connection")
set oCmd = CreateObject("ADODB.Command")

oConn.Open("Provider=SQLOLEDB.1;Persist Security Info=True;Initial Catalog=pubs;Data Source=.;User Id=sa;PASSWORD=sql;")

oCmd.ActiveConnection = oConn
oCmd.CommandText = "usp_InsertFromXML"
oCmd.CommandType = adCmdStoredProc

set oParameter = oCmd.CreateParameter("@xml", adVarChar, adParamInput, 2000, sXML)

oCmd.Parameters.Append oParameter

oCmd.Execute

MsgBox("Done")

set oParameter = nothing
set oCmd = nothing
set oConn = nothing
set oXMLDom = nothing

// Stored Procedure

CREATE PROCEDURE usp_InsertFromXML(@xml as varchar(2000))  AS

declare @ixml as int

EXEC sp_xml_preparedocument @ixml OUTPUT, @xml

insert into DataTransactions
(
              TransactionID,
              SubscriberID,
              ProductCode
-- add more columns
)
SELECT *
FROM OPENXML (@ixml, '/data/transaction',1)
    WITH (
              TransactionID  varchar(10),
              SubscriberID varchar(20),
              ProductCode varchar(20)
-- add more columns and give correct type
         )
GO
I tried the above code and I get the Error "Error - XML Not Loaded" when i use:
bLoaded = oXMLDom.Load("https://secure.pswbilling.com/cgi-bin/reports/transaction-search.xcgi?AccountID=xxx&AcctPassword=xxx")

If we can get this to work I think we will have a suitable solution.
Cut and paste the URL

https://secure.pswbilling.com/cgi-bin/reports/transaction-search.xcgi?AccountID=xxx&AcctPassword=xxx

in some browser and make sure you get correct xml.
Tobinmarch, to answer your question above "What does this service pack allow me to do?", Like I said above,
"After you install this, you can access XML just like any other data source."  Basically you can connect to the XML file just like you would a database.....

brett

I do get correct xml each time
The data I get looks exactly like that in my original question, except for the values of course.
TobinMarch, what version of MDAC do you have loaded on that machine--the older versions didn't have the all (or any) of the XML pieces loaded.  You might look at downloading MDAC 2.7

http://microsoft.com/data


Brett
The data I get looks exactly like that in my original question, except for the values of course.
I will load it tonight and re-test.
I will load it tonight and re-test.
I have tested the VB Script and it works when I use a local file on my development SQL/IIS box using the following XML file:

(Source)

<data>
  <transaction TransactionID="111" SubscriberID="111" EventCode="D" SubAccountID="0" SiteID="111" ProductCode="xxx" Amount="0.00" IPAddress="DELETE" ScheduledAction="N" ScheduledDate="0000-00-00 00:00:00" User1="11" User2="" Timestamp="2003-04-22 03:20:30" OriginalTransactionID="" />
  <transaction TransactionID="222" SubscriberID="717030" EventCode="D" SubAccountID="0" SiteID="222" ProductCode="yyy" Amount="0.00" IPAddress="DELETE" ScheduledAction="N" ScheduledDate="0000-00-00 00:00:00" User1="22" User2="" Timestamp="2003-04-22 03:20:30" OriginalTransactionID="" />
</data>

*but...

The XML code I am recieving from the internet looks like that in the question I posted at the top of the page. It has...

<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE pswapi (View Source for full doctype...)>


...preceeding the data. This is causing the VB Script to get an error loading XML Data. If you can fix this then I believe we will have a solution.
The DOCTYPE element contains incorrect data. Ask the provider about it and say that the xml that they produce is not loadable in MSXML parser.
OK
This is what the tech at PSW said:

Here is what I believe is happening to your requests.  Either the way you are requesting the file, or how VB is interpreting the page returned, you're seeing the rendered page that IE is displaying.  If you take that page and view the source of it, you'll see that nowhere do we use the text "(View Source for full doctype...)".  In fact if you view the source, you'll see this:
 
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE pswapi [
  <!ELEMENT data (transaction*)>
  <!ATTLIST transaction       TransactionID ID #REQUIRED
                              SubscriberID NMTOKEN #REQUIRED
                              EventCode NMTOKEN #REQUIRED
                              SubaccountID NMTOKEN #REQUIRED
                              SiteID NMTOKEN #REQUIRED
                              ProductCode NMTOKEN #REQUIRED
                              Amount NMTOKEN #REQUIRED
                              IPAddress NMTOKEN #REQUIRED
                              ScheduledAction NMTOKEN #REQUIRED
                              ScheduledDate NMTOKEN #REQUIRED
                              User1 NMTOKEN #REQUIRED
                              User2 NMTOKEN #REQUIRED
                              Timestamp NMTOKEN #REQUIRED
                              OriginalTransactionID NMTOKEN #REQUIRED >
]>
<data>

I told him that I thought there was a problem with the DOCTYPE  element and the text "(View Source for full doctype...)" was messing up the script.

Please help figure this out for me, I think we are close to resolving this.
The tech is a absolutely right.  When you see the document in IE, this is what you see (I just did it):

  <?xml version="1.0" standalone="yes" ?>
  <!DOCTYPE pswapi (View Source for full doctype...)>
- <data>
  <transaction TransactionID="111" SubscriberID="111" EventCode="D" SubAccountID="0" SiteID="111" ProductCode="xxx" Amount="0.00" IPAddress="DELETE" ScheduledAction="N" ScheduledDate="0000-00-00 00:00:00" User1="11" User2="" Timestamp="2003-04-22 03:20:30" OriginalTransactionID="" />
  <transaction TransactionID="222" SubscriberID="717030" EventCode="D" SubAccountID="0" SiteID="222" ProductCode="yyy" Amount="0.00" IPAddress="DELETE" ScheduledAction="N" ScheduledDate="0000-00-00 00:00:00" User1="22" User2="" Timestamp="2003-04-22 03:20:30" OriginalTransactionID="" />
  </data>

Remember that IE does NOT check for valid XML documents, it only checks to see that the document is well-formed.

In this case the document is well formed, but is not a valid XML document.  The reason is that it is not valid is because it does not comply with the DTD.  The top most element must match the name of the DOCTYPE declaration.

Anthony
So what is the solution to my problem here? Is it something we can code arround or do I need to complain once again to the tech?
ASKER CERTIFIED SOLUTION
Avatar of amit_g
amit_g
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Works great!! Thanks to all for the help.
I appreciate all your help with this. I would have struggled for a long time if it hadn't been for your skill and quick responses.

Thanks TOBIN
Anytime. I also learned few things.

Thanks for the A.
I ran into one complication. I am getting an error when I try to parse more than 27 records. Probably 25 records if I use real data.

I believe the problem is that in the Stored Proc. we are defining the incoming xml data as a varchar(2000) and the data is greater than 2000 bits/bytes or whatever. I have changed this to 8000 which is that maximum allowed and can only process 27 records that way. When I leave it at 2000 I can only get 6 records.

I would like to be able to process an unlimited (within reason of course) number of records.

What can be done to fix this?

Desperatly tobinmarch

I will award more points to whoever can fix this.
You said in the first post that this is scheduled process. So one way (easy one) to solve this would be to increase the frequency i.e. run it more often. I hope the data you get is incremental so in each run you will get less data.

If this is not possible you have 2 alternatives

1) Get XML directly into stored procedure. For that you will need to use sp_OA* and that you have ruled out already.

2) Parse the XML in vbs file itself and do one insert at a time. I can give you direction to do so if you decide to take this alternative.
I believe i have resolved this by changing the XML data's data type to text. In the vbs file i changed the line:

set oParameter = oCmd.CreateParameter("@xml", adVarChar, adParamInput, 2000, sXML)

to

set oParameter = oCmd.CreateParameter("@xml", adVarChar, adParamInput, 2147483647, sXML)

2147483647 is the max length of a text data type.

I was able to import 10000 records this way.

I believe all is good. Thanks again for the help.
I need the stored procedure to insert only transaction which are unique. The TransactionID field has a unique value. How can this be done easily???
So the data is not incremental. Try this little change in the stored proc

insert into DataTransactions
(
             TransactionID,
             SubscriberID,
             ProductCode
-- add more columns
)
SELECT *
FROM OPENXML (@ixml, '/data/transaction',1)
   WITH (
             TransactionID  varchar(10),
             SubscriberID varchar(20),
             ProductCode varchar(20)
-- add more columns and give correct type
        ) as DT1 -- Note the alias here
-- add following condition
where
     not exists (select 1 from DataTransactions DT2 where DT2.TransactionID = DT1.TransactionID)
Works fine..

One more thing, the value in for Timestamp is not being inserted in the database. Maybe I am missing something obvious but here is my SP...

CREATE PROCEDURE usp_InsertFromXML(@xml AS TEXT)  AS

DECLARE @ixml AS INT

EXEC sp_xml_preparedocument @ixml OUTPUT, @xml

INSERT INTO DataTransactions
     (
     TransactionID,
     SubscriberID,
     EventCode,
     SubAccountID,
     SiteID,
     ProductCode,
     Amount,
     IPAddress,
     ScheduledAction,
     ScheduledDate,
     User1,
     User2,
     _Timestamp,
     OriginalTransactionID
     )
SELECT *
FROM OPENXML (@ixml, '/data/transaction',1)
   WITH (
             TransactionID  varchar(10),
             SubscriberID varchar(20),
             EventCode varchar(50),
     SubAccountID varchar(50),
     SiteID varchar(50),
     ProductCode varchar(50),
     Amount varchar(50),
     IPAddress varchar(50),
     ScheduledAction varchar(50),
     ScheduledDate varchar(50),
     User1 varchar(50),
     User2 varchar(50),
     Timestamp varchar(50),
     OriginalTransactionID varchar(50)
     )
     AS DT1
     WHERE
         not exists (SELECT 1 FROM DataTransactions DT2 WHERE DT2.TransactionID = DT1.TransactionID)
GO


Timestamp is a reserverd word and I think that is mesing up the insert...

_Timestamp is the DB Field in the DB and it is a varchar(50).
I don't think that is the porblem but to be sure enclose any name that is reserved word in [ ].

//...
    ScheduledDate varchar(50),
    User1 varchar(50),
    User2 varchar(50),
    [Timestamp] varchar(50), -- ******************
    OriginalTransactionID varchar(50)
//...