Solved

XML Download and Bulk Insert into MS SQL Database

Posted on 2004-08-31
440 Views
Last Modified: 2008-01-09
I have a very serious problem.  Our programmer left the company and I've been tasked with finding a solution, although I have no programming experience and very little SQL experience.  He created a program that downloads XML data from a website and then bulk inserts the data into a table in an MS SQL database.  The problem is that it ignores data that is already in the table, but I need to be able to update that data if need be.  In other words, I want to update existing records if the data has changed between imports.

Here's the program:

<!-- ---------------------------------------------------------------
-
- Title   : XXXXTrans.wsf
- Author  : XXXX
- Created : 30 Jun 2004
- Purpose : Query daily XXXX transactions and import into
-           XXXX database.
- Params  : HttpTimeout    : Timeout in seconds, -1 = Infinite.
-                            Default is 10 minutes (600 seconds).
-           TransStartDate : Transaction start date relative to
-                            today. Default is -14 (two weeks ago).
-           TransEndDate   : Transaction end date relative to today.
-                            Default is 0 (today).
- Returns : N/A
- Notes   :
- History :
-
---------------------------------------------------------------- -->

<job id="XXXX XML Import">
<script language="JScript" src="lib\AddrList.js"/>
<script language="JScript" src="lib\DateUtils.js"/>
<script language="JScript" src="lib\FileUtils.js"/>
<script language="JScript" src="lib\MailUtils.js"/>
<script language="VBScript" src="QueryEfsNet.vbs"/>
<script language="JScript">


var ErrorLogFile = "TransErr.log";
var Database     = 'XXXX';
var SqlServer    = 'XXXX';

var XmlTempFile  = "XXXX.xml";


var g_SessionLog = '';
function Trace(Text)
{
  WScript.Echo(Text);
  g_SessionLog = g_SessionLog + "\r\n" + Text;
}

var Args = WScript.Arguments;
if(Args.length > 3)
{
  var WshShell = new ActiveXObject('WScript.Shell');
  WshShell.Popup('Invalid script arguments.\r\n\r\nSyntax: [HttpTimeout [TransStartDate [TransEndDate]]]', 30, 'Syntax Error');
  WScript.Quit(255);
}

function lpad(v)
{
  return (v < 10 ? '0' : '') + v;
}

function RelativeDate(Diff)
{
  var Now   = new Date();
  var Today = new Date(Now.getFullYear(), Now.getMonth(), Now.getDate());
  var Rel   = new Date(Today.valueOf() + (MILLISECONDSPERDAY * Diff));
  var m     = Rel.getMonth() + 1;
  var d     = Rel.getDate();
  var y     = Rel.getFullYear();
  return y.toString().substr(2, 2) + lpad(m) + lpad(d);
}

var Timeout        = 600;
var TransStartDate = RelativeDate(-14);
var TransEndDate   = RelativeDate(0);

switch(Args.length)
{
  case 3:
    TransEndDate   = RelativeDate(parseInt(Args(2)));
  case 2:
    TransStartDate = RelativeDate(parseInt(Args(1)));
  case 1:
    Timeout        = parseInt(Args(0));
  default:
    break;
}

Trace('Timeout       = ' + Timeout);
Trace('TxStartDate   = ' + TransStartDate);
Trace('TxEndDate     = ' + TransEndDate);

var Success = true;  
var ErrorLogExists = false;

var Provider   = 'Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;';
var Catalog    = 'Initial Catalog=' + Database + ';';
var DataSource = 'Data Source=' + SqlServer + ';';

try {
  var XmlInputFile = fso().GetAbsolutePathName(XmlTempFile);
  Trace('XML temp file = ' + XmlInputFile);

  var ArchivePath = fso().GetParentFolderName(XmlInputFile) + '\\Archive\\';
  Trace('Archive path  = ' + ArchivePath);

  Trace('Querying XXXX for transaction data...');
  var URL = XXXXQueryTransactions(TransStartDate, TransEndDate);
  Trace('Query result URL = ' + URL);

  if(URL == '')
    throw new Error(0, 'Empty URL returned from EfsNet');

  var Http = new ActiveXObject("HttpUtil.Http");
  var ResCode = Http.Get(URL, XmlInputFile, Timeout);
  Trace('HTTP Result = ' + ResCode);

  if(ResCode != '200')
    throw new Error(0, 'HTTP GET failed');

  if(!fso().FileExists(XmlInputFile))
    throw new Error(0, 'XML data file not found: ' + XmlInputFile);

  if(fso().FileExists(ErrorLogFile))
    fso().DeleteFile(ErrorLogFile);

  Trace('Archiving XML data file...');
  var TS = TimeStamp('') + '_';
  fso().CopyFile(XmlInputFile, ArchivePath + TS + 'XXXX.xml', false);

  Trace('Creating XML Loader Com Object...');
  var Bulk = new ActiveXObject("SQLXMLBulkLoad.SQLXMLBulkLoad");

  Trace('Configuring loader options...');
  Bulk.ConnectionString = Provider + Catalog + DataSource;
  Bulk.IgnoreDuplicateKeys = true;
  //Bulk.Transaction = true;
  Bulk.ErrorLogFile = ErrorLogFile;

  Trace('Executing XML Bulk Load...');
  Bulk.Execute("XXXXTransMap.xml", XmlInputFile);
}
catch(e) {
  Trace(e.description);
  Success = false;  
}

try {
  ErrorLogExists = fso().FileExists(ErrorLogFile);
}
catch(e) {
  Trace(e.description);
  Success = false;  
}

var SendTo     = e_XXXX;
var SendCC     = '';
var SendFrom   = e_XXXX;
var Subject    = 'XXXX XML Import';
var MsgBody    = (!Success || ErrorLogExists) ? 'Error(s) occurred during import.\r\n' : 'Import succeeded.\r\n';
var Attachment = ErrorLogExists ? ErrorLogFile : '';

SendMail(SendTo, SendCC, Subject, MsgBody + g_SessionLog, Attachment, SendFrom);

</script>
</job>
//eof

Thanks for the help,

David
Question by:uxphreak
7 Comments
 
LVL 4

Expert Comment

by:chipple
ID: 11960700
Is emptying the table every time an option? Or do you need to keep some older data?
 

Author Comment

by:uxphreak
ID: 11963678
I need to keep all the data in case it needs to be referenced again.

David
 
LVL 4

Accepted Solution

by:chipple (earned 250 total points)
ID: 11963818
Then the only option I see is to bulk load the data into an empty staging table, and then process it from there.

However, with the information you've given it's hard to go much further than that, and things may get a bit complex.
I'm guessing your database structure is described in XXXXTransMap.xml, so if you could show what's in there it would be helpful.
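
Very roughly, the processing step would look something like the T-SQL below. It's only a sketch -- the table and column names (Transactions, Transactions_Staging, TransID, Amount, TransDate) are made up, since I don't know your real schema -- but the idea is to bulk load into an empty staging table and then reconcile it against the live table:

-- Staging table has the same layout as the live table and is emptied
-- before every import, so it only ever holds the current feed
TRUNCATE TABLE Transactions_Staging

-- (the SQLXMLBulkLoad step runs here, mapped to Transactions_Staging)

-- Update rows that already exist in the live table
UPDATE t
SET    t.Amount    = s.Amount,
       t.TransDate = s.TransDate
FROM   Transactions t
       INNER JOIN Transactions_Staging s
               ON s.TransID = t.TransID

-- Insert rows that are new
INSERT INTO Transactions (TransID, Amount, TransDate)
SELECT s.TransID, s.Amount, s.TransDate
FROM   Transactions_Staging s
WHERE  NOT EXISTS (SELECT 1 FROM Transactions t
                   WHERE t.TransID = s.TransID)

You would presumably also need to point the mapping file (XXXXTransMap.xml) at the staging table instead of the live one.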
 
LVL 26

Assisted Solution

by:rdcpro (earned 250 total points)
ID: 11965388
Using the SQL XML Bulk Load component, the IgnoreDuplicateKeys option will leave rows that are already in the table untouched.  If you set it to false, the insert will fail when it gets to a row that's already there.

I'm not sure the bulk loader component can do what you want; it uses the bulk load interface of SQL Server's native SQLOLEDB OLE DB provider, and I don't think you can update existing rows with it.  Read SQL Server Books Online about the BULK INSERT operation and its limitations.  This is really what an updategram is for, but updategrams aren't efficient for large volumes of data.

How big is this insert?

If it's large, chipple may be right--you might have to load into a staging table and write a trigger that calls a sproc to process the newly loaded records. Maybe you can do this as a DTS package.
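
As a rough sketch (the object names are made up, and this assumes the staging-table layout chipple described), the processing could be wrapped in a stored procedure that whatever kicks it off -- a trigger, a DTS step, or the script itself right after Bulk.Execute -- only has to call once:

CREATE PROCEDURE dbo.usp_ProcessStagedTransactions
AS
BEGIN
    SET NOCOUNT ON

    BEGIN TRANSACTION

    -- Apply changes to rows that are already in the live table
    UPDATE t
    SET    t.Amount    = s.Amount,
           t.TransDate = s.TransDate
    FROM   dbo.Transactions t
           INNER JOIN dbo.Transactions_Staging s
                   ON s.TransID = t.TransID

    -- Add rows that aren't there yet
    INSERT INTO dbo.Transactions (TransID, Amount, TransDate)
    SELECT s.TransID, s.Amount, s.TransDate
    FROM   dbo.Transactions_Staging s
    WHERE  NOT EXISTS (SELECT 1 FROM dbo.Transactions t
                       WHERE t.TransID = s.TransID)

    -- Empty the staging table ready for the next import
    TRUNCATE TABLE dbo.Transactions_Staging

    COMMIT TRANSACTION
END

The script could then run it as a final step with something like:

osql -S <server> -d <database> -E -Q "EXEC dbo.usp_ProcessStagedTransactions"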

Regards,
Mike Sharp
 
LVL 26

Expert Comment

by:rdcpro
ID: 11965406
Also, is it possible that there is existing data in the table that is not in the feed at all?

Regards,
Mike Sharp
