?
Solved

DTS: Non-normalized Flat-file to Normalized SQL Table

Posted on 2002-05-18
7
Medium Priority
?
435 Views
Last Modified: 2013-11-30
Hi.

I have a need to import a pipe-delimited flat-file into SQL, but the problem is that there are a varying number of fields in each row of the flat-file, although field 3 onwards are all the same type of data. For those who know it, my file is the output from the NTResKit ADDUSERS.EXE which enumerates NT Groups and their memberships. For those who don't know it, the file is pretty much this kind of format:

Group1|Description1|Member1|Member2|...|MemberN
Group2|Description2|Member1|Member2|...|MemberN

What I need to be able to do is to import each Group-Member combination as a separate row into SQL so I would then have:

Group1 Description1 Member1
Group1 Description1 Member2
Group1 Description1 Member3
Group1 Description1 MemberN
Group2 Description2 Member1
Group2 Description2 Member2

Or, if you want to get really fancy, then one table for the Groups and their Descriptions and another table for Members and link the two by some ID key would be fantastic!

And finally, for some extra points, if you could get it to repeat this script for ALL files in a directory (all same format) that really would be the Holy Grail!

But the main thing is at least importing ONE file. Any help much appreciated - especially welcomed as VBScript for DTS since I could then understand what's going on and adapt it next time this crops up!!

Thanks,
Calum.
0
Comment
Question by:Calum2
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 6
7 Comments
 
LVL 5

Expert Comment

by:spcmnspff
ID: 7029930
Okay I know this is a delayed response but I had allocate time to think about this one.

Setting up DTS to import the file with a pipe column delimiter and an LF row delimiter will give us the denormalized table with the total number of columns (N) and the maximun number of users in a given group (u) where
N = u + 2 In other words the total number of columns will be 2 (one for the name and one for the desription) plus the total number of users from the group that has the most users.  In the other groups that have less users, the remaining fields after the last user listed are null.

Like this:
Col1    Col2   Col3    Col4    Col5
Name1   Desc1  User1   User3  <Null>
Name2   Desc2  User2  <Null>  <Null>
Name3   Desc3  User1   User2   User3


Now if we import this table into a temp staging table in an SQL server database with an Auto-increment ID field we have this:

GroupID  Col1    Col2   Col3    Col4    Col5
1        Name1   Desc1  User1   User3  <Null>
2        Name2   Desc2  User2  <Null>  <Null>
3        Name3   Desc3  User1   User2   User3


Now we can write N - 2 queries to place the users in a table with the groupid and user name

Create Table UserTable
GroupID Int,
UserName VarChar(100)
Primary Key NonClustered (GroupID, UserName)

Insert Into UserTable
Select GroupID, Col3 From TempGroupTable Where Col3 Is Not Null

Insert Into UserTable
Select GroupID, Col4 From GroupTable Where Col4 Is Not Null

Insert Into UserTable
Select GroupID, Col5 From GroupTable Where Col5 Is Not Null

And a last query to create the Master GroupTable

Create Table GroupTable
GroupID Int,
Name VarChar(15),
Description VarChar(100),
Primary Key Clustered (GroupID)

Insert Group Table
Select GroupID, Name, Description
From TempGroupTable

Drop table TempGroupTable

Okay Stay tuned for the automation of this int a single DTS package . . .
0
 
LVL 5

Accepted Solution

by:
spcmnspff earned 600 total points
ID: 7029984
The trick is to find out N so you know how many iterations you need of the UserTable insert.  After you import the data into a temporary staging table this can be found by Counting the records in syscolumns.

Create Proc UserInsert As

Declare @N Int

Select @N = Count(*)
from syscolumns SC, sysobects SO
Where SC.ID = AO.ID and SO.Name = 'TempGroupTable'


/*The next challange is to iterate N - 3 insert queries:
I would do this using dynamic SQL.  If we name thecolumns in our staging table col1, col2, col2, coln . . .
*/


Decare @i int, @SQL VarChar(500)
Set @i = 3

While @i <= @N
Begin
    Set @SQL = 'Insert Into UserTable Select GroupID, Col'
    + Cast(@i as VarChar(10)) +
    ' From TempGroupTable Where Col'
    + Cast(@i as VarChar(10)) +
    ' Is Not Null'

    Exec (@SQL)
    Set @i = @i + 1
End


So here's the summary for the DTS package:
1.) Import file into a staging table that has an autoid
2.) Perform an executeSQL task to Truncate the user table and Call the insert user sp  
4.) Perform a second executeSQL task to truncate the group table and insert the new data.
5.) Truncate the staging table.

It's going to be a little tricky but I think I've laid out an adequate plan here . . . let me know how it works.  Have Fun!  
0
 
LVL 5

Expert Comment

by:spcmnspff
ID: 7029998
A couple of minor changes . . . In my example Set '@i=3' should be set '@i=2' And 'While @i<=@N' should be 'While @i<=@N-1'.  Thanks =)

0
Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!

 
LVL 5

Expert Comment

by:spcmnspff
ID: 7031072
One more typo . . . to find @N:

Select @N = Count(*)
from syscolumns SC, sysobects SO
Where SC.ID = AO.ID and SO.Name = 'TempGroupTable'

Should read:

Select @N = Count(*)
from syscolumns SC, sysobects SO
Where SC.ID = SO.ID and SO.Name = 'TempGroupTable'


Oh well I'm sure you get the idea! =)
0
 
LVL 5

Expert Comment

by:spcmnspff
ID: 7068987
Were you able to try this?
0
 

Author Comment

by:Calum2
ID: 7074626
Sorry - have been off on a completely IT-free vacation!

I ended up looping through my files with a VBScript, parsing out the data and loading it up into SQL using ADO - not fast, not cool, but worked.

However, I would like to award you the points for taking the trouble to present this suggested solution - very much appreciate your time and hard work.

Thanks.
0
 
LVL 5

Expert Comment

by:spcmnspff
ID: 7075292
Thanks, I apperciate it.  That did present an interestin challenge. =)
0

Featured Post

On Demand Webinar: Networking for the Cloud Era

Ready to improve network connectivity? Watch this webinar to learn how SD-WANs and a one-click instant connect tool can boost provisions, deployment, and management of your cloud connection.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I have a large data set and a SSIS package. How can I load this file in multi threading?
It is possible to export the data of a SQL Table in SSMS and generate INSERT statements. It's neatly tucked away in the generate scripts option of a database.
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.
Suggested Courses

752 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question