Solved

DTS: Non-normalized Flat-file to Normalized SQL Table

Posted on 2002-05-18
7
400 Views
Last Modified: 2013-11-30
Hi.

I have a need to import a pipe-delimited flat-file into SQL, but the problem is that there are a varying number of fields in each row of the flat-file, although field 3 onwards are all the same type of data. For those who know it, my file is the output from the NTResKit ADDUSERS.EXE which enumerates NT Groups and their memberships. For those who don't know it, the file is pretty much this kind of format:

Group1|Description1|Member1|Member2|...|MemberN
Group2|Description2|Member1|Member2|...|MemberN

What I need to be able to do is to import each Group-Member combination as a separate row into SQL so I would then have:

Group1 Description1 Member1
Group1 Description1 Member2
Group1 Description1 Member3
Group1 Description1 MemberN
Group2 Description2 Member1
Group2 Description2 Member2

Or, if you want to get really fancy, then one table for the Groups and their Descriptions and another table for Members and link the two by some ID key would be fantastic!

And finally, for some extra points, if you could get it to repeat this script for ALL files in a directory (all same format) that really would be the Holy Grail!

But the main thing is at least importing ONE file. Any help much appreciated - especially welcomed as VBScript for DTS since I could then understand what's going on and adapt it next time this crops up!!

Thanks,
Calum.
0
Comment
Question by:Calum2
  • 6
7 Comments
 
LVL 5

Expert Comment

by:spcmnspff
ID: 7029930
Okay I know this is a delayed response but I had allocate time to think about this one.

Setting up DTS to import the file with a pipe column delimiter and an LF row delimiter will give us the denormalized table with the total number of columns (N) and the maximun number of users in a given group (u) where
N = u + 2 In other words the total number of columns will be 2 (one for the name and one for the desription) plus the total number of users from the group that has the most users.  In the other groups that have less users, the remaining fields after the last user listed are null.

Like this:
Col1    Col2   Col3    Col4    Col5
Name1   Desc1  User1   User3  <Null>
Name2   Desc2  User2  <Null>  <Null>
Name3   Desc3  User1   User2   User3


Now if we import this table into a temp staging table in an SQL server database with an Auto-increment ID field we have this:

GroupID  Col1    Col2   Col3    Col4    Col5
1        Name1   Desc1  User1   User3  <Null>
2        Name2   Desc2  User2  <Null>  <Null>
3        Name3   Desc3  User1   User2   User3


Now we can write N - 2 queries to place the users in a table with the groupid and user name

Create Table UserTable
GroupID Int,
UserName VarChar(100)
Primary Key NonClustered (GroupID, UserName)

Insert Into UserTable
Select GroupID, Col3 From TempGroupTable Where Col3 Is Not Null

Insert Into UserTable
Select GroupID, Col4 From GroupTable Where Col4 Is Not Null

Insert Into UserTable
Select GroupID, Col5 From GroupTable Where Col5 Is Not Null

And a last query to create the Master GroupTable

Create Table GroupTable
GroupID Int,
Name VarChar(15),
Description VarChar(100),
Primary Key Clustered (GroupID)

Insert Group Table
Select GroupID, Name, Description
From TempGroupTable

Drop table TempGroupTable

Okay Stay tuned for the automation of this int a single DTS package . . .
0
 
LVL 5

Accepted Solution

by:
spcmnspff earned 200 total points
ID: 7029984
The trick is to find out N so you know how many iterations you need of the UserTable insert.  After you import the data into a temporary staging table this can be found by Counting the records in syscolumns.

Create Proc UserInsert As

Declare @N Int

Select @N = Count(*)
from syscolumns SC, sysobects SO
Where SC.ID = AO.ID and SO.Name = 'TempGroupTable'


/*The next challange is to iterate N - 3 insert queries:
I would do this using dynamic SQL.  If we name thecolumns in our staging table col1, col2, col2, coln . . .
*/


Decare @i int, @SQL VarChar(500)
Set @i = 3

While @i <= @N
Begin
    Set @SQL = 'Insert Into UserTable Select GroupID, Col'
    + Cast(@i as VarChar(10)) +
    ' From TempGroupTable Where Col'
    + Cast(@i as VarChar(10)) +
    ' Is Not Null'

    Exec (@SQL)
    Set @i = @i + 1
End


So here's the summary for the DTS package:
1.) Import file into a staging table that has an autoid
2.) Perform an executeSQL task to Truncate the user table and Call the insert user sp  
4.) Perform a second executeSQL task to truncate the group table and insert the new data.
5.) Truncate the staging table.

It's going to be a little tricky but I think I've laid out an adequate plan here . . . let me know how it works.  Have Fun!  
0
 
LVL 5

Expert Comment

by:spcmnspff
ID: 7029998
A couple of minor changes . . . In my example Set '@i=3' should be set '@i=2' And 'While @i<=@N' should be 'While @i<=@N-1'.  Thanks =)

0
VMware Disaster Recovery and Data Protection

In this expert guide, you’ll learn about the components of a Modern Data Center. You will use cases for the value-added capabilities of Veeam®, including combining backup and replication for VMware disaster recovery and using replication for data center migration.

 
LVL 5

Expert Comment

by:spcmnspff
ID: 7031072
One more typo . . . to find @N:

Select @N = Count(*)
from syscolumns SC, sysobects SO
Where SC.ID = AO.ID and SO.Name = 'TempGroupTable'

Should read:

Select @N = Count(*)
from syscolumns SC, sysobects SO
Where SC.ID = SO.ID and SO.Name = 'TempGroupTable'


Oh well I'm sure you get the idea! =)
0
 
LVL 5

Expert Comment

by:spcmnspff
ID: 7068987
Were you able to try this?
0
 

Author Comment

by:Calum2
ID: 7074626
Sorry - have been off on a completely IT-free vacation!

I ended up looping through my files with a VBScript, parsing out the data and loading it up into SQL using ADO - not fast, not cool, but worked.

However, I would like to award you the points for taking the trouble to present this suggested solution - very much appreciate your time and hard work.

Thanks.
0
 
LVL 5

Expert Comment

by:spcmnspff
ID: 7075292
Thanks, I apperciate it.  That did present an interestin challenge. =)
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Why is this different from all of the other step by step guides?  Because I make a living as a DBA and not as a writer and I lived through this experience. Defining the name: When I talk to people they say different names on this subject stuff l…
International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.
Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…

920 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now