how to read in a csv data dictionary and build SQL Server tables from it.

Greetings,

I have a csv sheet that is a data dictionary.  It looks like this:

TableName	PhysicalTableName	PrimaryKey	ForeignKey	ForeignKey	FieldName	PhysicalFieldName	DATATYPE	LENGTH
 Owner Table	OWNER	X	Lease ID	API14	*Owner ID	*OWNR_ID	VARCHAR	50
 Owner Table	OWNER	Owner ID	Lease ID	API14	**Lease ID List	**LSE_ID_LIST	VARCHAR	50
 Owner Table	OWNER	Owner ID	Lease ID	API14	**API14 List	**API14_LIST	VARCHAR	50
 Owner Table	OWNER				Lease Count	LSE_CNT	VARCHAR	50

Open in new window


As you can see, this is not about importing a csv full of data.  This is a csv with schema metadata.
I need to know if there's a way in SQL Server to build the tables defined in this csv.  There are over 2000 lines in this document.
I'm hoping I don't have to run this manually.

If anyone has any ideas, I'd greatly appreciate it.
Thanks
LVL 9
Evan CutlerVolunteer Chief Information OfficerAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Pawan KumarDatabase ExpertCommented:
can you give me one example what you need from the CSV data we are inserting into the table.

This would be very very difficult to build the SQL even after insertion of CSV into the table. Please use below code to import the data.

USE [DBNAME]
GO

CREATE TABLE [dbo].[p1](
	[ Owner Table] [varchar](50) NULL,
	[OWNER] [varchar](50) NULL,
	[X] [varchar](50) NULL,
	[Lease ID] [varchar](50) NULL,
	[API14] [varchar](50) NULL,
	[*Owner ID] [varchar](50) NULL,
	[*OWNR_ID] [varchar](50) NULL,
	[VARCHAR] [varchar](50) NULL,
	[50] [varchar](50) NULL
) ON [PRIMARY]

GO

BULK
INSERT [p1]
FROM 'c:\pawan\p\p1.txt'
WITH
(
	FIELDTERMINATOR = '	',
	ROWTERMINATOR = '\n'
)
GO

Open in new window


OUTPUT

/*------------------------
SELECT *  FROM [p1]
------------------------*/
 Owner Table                                       OWNER                                              X                                                  Lease ID                                           API14                                              *Owner ID                                          *OWNR_ID                                           VARCHAR                                            50
-------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- --------------------------------------------------
TableName                                          PhysicalTableName                                  PrimaryKey                                         ForeignKey                                         ForeignKey                                         FieldName                                          PhysicalFieldName                                  DATATYPE                                           LENGTH
 Owner Table                                       OWNER                                              X                                                  Lease ID                                           API14                                              *Owner ID                                          *OWNR_ID                                           VARCHAR                                            50
 Owner Table                                       OWNER                                              Owner ID                                           Lease ID                                           API14                                              **Lease ID List                                    **LSE_ID_LIST                                      VARCHAR                                            50
 Owner Table                                       OWNER                                              Owner ID                                           Lease ID                                           API14                                              **API14 List                                       **API14_LIST                                       VARCHAR                                            50
 Owner Table                                       OWNER                                              NULL                                               NULL                                               NULL                                               Lease Count                                        LSE_CNT                                            VARCHAR                                            50

(5 row(s) affected)

Open in new window

0
Evan CutlerVolunteer Chief Information OfficerAuthor Commented:
I don't have that.

I have a blank database.  The tables inside this database is defined in the CSV.
Again, This is not a data file.  This is, at best for lack of better term, a data dictionary.
There are over 200 tables defined in this document.  I really don't want to build them from the csv manually.

As you can see in the example, I have the Table_Name, Column_name, Data_Type, and Data_Length.
Does SQL Server have the ability to ingest this metadata, and create tables from the definition?

Thanks
0
Evan CutlerVolunteer Chief Information OfficerAuthor Commented:
In the example above....
somehow I need the information:

TableName	PhysicalTableName	PrimaryKey	ForeignKey	ForeignKey	FieldName	PhysicalFieldName	DATATYPE	LENGTH
 Owner Table	OWNER	X	Lease ID	API14	*Owner ID	OWNR_ID	VARCHAR	50
 Owner Table	OWNER	Owner ID	Lease ID	API14	**Lease ID List	**LSE_ID_LIST	VARCHAR	50
 Owner Table	OWNER	Owner ID	Lease ID	API14	**API14 List	**API14_LIST	VARCHAR	50
 Owner Table	OWNER				Lease Count	LSE_CNT	VARCHAR	50

Open in new window


to turn into:
CREATE TABLE OWNER (
     OWNR_ID varchar(50),
     LSE_ID_LIST varchar(50),
     API14_LIST varchar(50),
     ...
)

Open in new window


I realize there's asterisk's in my information...just ignore them.  Thanks

Appreciate it.
)
0
Determine the Perfect Price for Your IT Services

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden with our free interactive tool and use it to determine the right price for your IT services. Download your free eBook now!

Pawan KumarDatabase ExpertCommented:
There is nothing built in in SQL server ..we have to create some logic to get the statements.. do u have have data also in this CSV file?
eg. data for OWNER  table?
0
Evan CutlerVolunteer Chief Information OfficerAuthor Commented:
Thanks.  No.  The OWNER table is going to be populated by a SSIS package to be built.
so you are suggesting creating some external code to build the tables?
0
Pawan KumarDatabase ExpertCommented:
so you are suggesting creating some external code to build the tables?
yes read the data from the inserted table and build sql for each tableaname in the table and generate sql using dynamic sql.
0
Pawan KumarDatabase ExpertCommented:
We can do like below to create the CREATE TABLE statements.

--

DECLARE @SQL AS VARCHAR(MAX) = ''

SET @SQL = '
;WITH CTE AS
(
	SELECT * FROM p1 WHERE [ Owner Table] <> ''TableName''
)
,CTE1 AS 
(
	SELECT DISTINCT LTRIM(RTRIM([OWNER])) tName FROM CTE
)
,CTE2 AS
(
	SELECT * FROM CTE1 c1
	CROSS APPLY
	(
		SELECT CONCAT(REPLACE([*OWNR_ID],''*'',''''),'' '',[VARCHAR],''('',[50],'')'') l FROM CTE c
		WHERE c.[OWNER] = c1.tName
	)r
)
SELECT CONCAT( ''CREATE TABLE '' , tName , '' ( '' , STUFF 
				((
					SELECT '', '' + l
					FROM CTE2 a
					WHERE ( a.tName = c.tName )
					FOR XML PATH('''')
				) ,1,2,'''') , '' ) '' , ''
GO ''
				) cusr
FROM CTE2 c 
GROUP BY tName '
EXEC (@SQL)

Open in new window


OUTPUT

/*------------------------
OUTPUT
------------------------*/
cusr
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CREATE TABLE OWNER ( OWNR_ID VARCHAR(50), LSE_ID_LIST VARCHAR(50), API14_LIST VARCHAR(50), LSE_CNT VARCHAR(50) ) 
GO 

(1 row(s) affected)

Open in new window


After you got the statement then just copy them to a new query windows and press F5 and all our tables will be created.
0
Scott PletcherSenior DBACommented:
If you have some time, I'm working on a far more comprehensive and flexible approach.  Please let me know if you're in a big hurry, as this will take a bit of time to write up.

No need for me to spend the time and effort if you just want something quick-and-dirty anyway :-).
0
Scott PletcherSenior DBACommented:
Let's assume you have the csv data loaded into a table named
dbo.data_dictionary_import
with these columns:
TableName      PhysicalTableName      PrimaryKey      ForeignKey1      ForeignKey2      FieldName      PhysicalFieldName      DataType      Length

Below is my script to process that data.

I use new control tables to store the info coming in.  I do two passes of the csv data to load those tables:
the first for the main tables,
the second to store the FK relationships (and note missing FK table(s)/column(s), if any).

Once the control tables are loaded,
you can make any modifications to those tables you want/need to make,
then we can use "standard" XML concatenation technique (or another method if you prefer) to generate a script to create the tables.
I've allowed for keys to have multiple columns, even if you don't currently have that situation in your tables, it's possible in the future.

Edit: The "NewDataType*" columns are just in case you want to change the data type of a column when the new table is created.  For example, "LSE_CNT" should really be an int column rather than varchar.  The new type and implementation method would just be options to do that, if it's needed.  If not needed, of course you can just ignore it :-).

/* Create control tables to hold imported data dictionary data. */
CREATE TABLE dbo.tables 
    (
    TableId int IDENTITY(1, 1) NOT NULL CONSTRAINT tables__PK PRIMARY KEY,
    TableName varchar(100) NULL,
    PhysicalTableName varchar(100) NOT NULL,
    SchemaName varchar(30) NOT NULL CONSTRAINT tables__DF_SchemaName DEFAULT 'dbo',
    PrimaryKeyCount tinyint NULL
    );
CREATE TABLE dbo.columns 
    (
    TableId int NOT NULL /*CONSTRAINT columns__FK_tables FOREIGN KEY REFERENCES dbo.tables ( TableId )*/,
    ColumnId int IDENTITY(1, 1) NOT NULL,
    FieldName varchar(100) NULL,
    PhysicalFieldName varchar(100) NOT NULL,
    DataType varchar(50) NULL,
    Length varchar(10) NULL,
    NewDataType varchar(50) NULL,
    NewDataTypeImplementation tinyint NULL,
    CONSTRAINT columns__PK PRIMARY KEY ( TableId, ColumnId )
    );
CREATE TABLE dbo.primary_key_columns
    (
    PkId int IDENTITY(1, 1) NOT NULL,
    TableId int NOT NULL,
    ColumnId int NOT NULL
    /*CONSTRAINT primary_key_columns__FK_columns FOREIGN KEY REFERENCES dbo.columns ( TableId, ColumnId )*/
    );
CREATE TABLE dbo.foreign_key_columns 
    (
    TableIdFrom int NOT NULL,
    ColumnIdFrom int NOT NULL,
    KeySequence tinyint NOT NULL,
    TableIdTo int NULL,
    ColumnIdTo int NULL --,
    /*CONSTRAINT foreign_key_columns__FK_columns1 FOREIGN KEY REFERENCES dbo.columns ( TableIdFrom, ColumnIdFrom )*/,
    /*CONSTRAINT foreign_key_columns__FK_columns2 FOREIGN KEY REFERENCES dbo.columns ( TableIdTo, ColumnIdTo )*/
    );


/* Main code to load the control tables. */

DECLARE /* data dictionary columns */
    @TableName varchar(100),
    @PhysicalTableName varchar(100),
    @PrimaryKey varchar(50),
    @ForeignKey1 varchar(50),
    @ForeignKey2 varchar(50),
    @FieldName varchar(50),
    @PhysicalFieldName varchar(50),
    @DataType varchar(50),
    @Length varchar(10)
DECLARE @ColumnId int
DECLARE @TableId int


DECLARE cursor_data_dictionary CURSOR LOCAL FAST_FORWARD FOR
SELECT 
    ddi.TableName, ddi.PhysicalTableName, 
    ddi.PrimaryKey, 
    ddi.ForeignKey1, ddi.ForeignKey2, 
    ddi.FieldName, ddi.PhysicalFieldName,
    ddi.DataType, ddi.Length
FROM dbo.data_dictionary_import ddi;
--ORDER BY ddi.TableName, ddi.PhysicalTableName

OPEN cursor_data_dictionary;

/* Load all main table info (no FK info). */
WHILE 1 = 1 
BEGIN
    FETCH NEXT FROM cursor_data_dictionary INTO @TableName, @PhysicalTableName, 
        @PrimaryKey, 
        @ForeignKey1, @ForeignKey2, 
        @FieldName, @PhysicalFieldName,
        @DataType, @Length
    IF @@FETCH_STATUS <> 0
        IF @@FETCH_STATUS = -1
            BREAK
        ELSE
            CONTINUE;

    IF EXISTS(SELECT 1 FROM dbo.tables WHERE PhysicalTableName = @PhysicalTableName)
    BEGIN
        SELECT @TableId = TableId
        FROM dbo.tables
        WHERE PhysicalTableName = @PhysicalTableName
    END /*IF*/
    ELSE
    BEGIN
        INSERT INTO dbo.tables ( TableName, PhysicalTableName )
        SELECT @TableName, @PhysicalTableName
        SET @TableId = SCOPE_IDENTITY()
    END /*ELSE*/

    IF @PrimaryKey = 'X'
    BEGIN
        IF EXISTS(SELECT 1 FROM dbo.columns WHERE TableId = @TableId AND PhysicalFieldName = @PhysicalFieldName)
        BEGIN
            SELECT @ColumnId = ColumnId
            FROM dbo.columns
            WHERE TableId = @TableId AND PhysicalFieldName = @PhysicalFieldName
        END /*IF*/
        ELSE
        BEGIN
            INSERT INTO dbo.columns ( TableId, FieldName, PhysicalFieldName, DataType, Length )
            SELECT @TableId, REPLACE(@FieldName, '*', ''), REPLACE(@PhysicalFieldName, '*', ''), @DataType, @Length
            SET @ColumnId = SCOPE_IDENTITY()
        END /*ELSE*/

        IF NOT EXISTS(SELECT 1 FROM dbo.primary_key_columns WHERE TableId = @TableId AND ColumnId = @ColumnId)
        BEGIN
            INSERT INTO dbo.primary_key_columns ( TableId, ColumnId )
            SELECT @TableId, @ColumnId
        END /*IF*/        
    END /*IF*/
    ELSE
    IF NOT EXISTS(SELECT 1 FROM dbo.columns WHERE TableId = @TableId AND PhysicalFieldName = REPLACE(@PhysicalFieldName, '*', ''))
    BEGIN
        INSERT INTO dbo.columns ( TableId, FieldName, PhysicalFieldName, DataType, Length )
        SELECT @TableId, REPLACE(@FieldName, '*', ''), REPLACE(@PhysicalFieldName, '*', ''), @DataType, @Length
        SET @ColumnId = SCOPE_IDENTITY()            
    END /*IF*/    

END /*WHILE*/

CLOSE cursor_data_dictionary;

UPDATE t
SET PrimaryKeyCount = ISNULL(c.PrimaryKeyCount, 0)
FROM dbo.tables t
LEFT OUTER JOIN (
    SELECT TableId, COUNT(*) AS PrimaryKeyCount
    FROM dbo.primary_key_columns
    GROUP BY TableId
) AS c ON c.TableId = t.TableId

SELECT 'tables' AS table_name, * FROM dbo.tables;
SELECT 'columns' AS table_name, * FROM dbo.columns;
SELECT 'primary_key_columns' AS table_name, * FROM dbo.primary_key_columns;

OPEN cursor_data_dictionary;

/* Load FK info (only). */
WHILE 1 = 1 
BEGIN
    FETCH NEXT FROM cursor_data_dictionary INTO @TableName, @PhysicalTableName, 
        @PrimaryKey, 
        @ForeignKey1, @ForeignKey2, 
        @FieldName, @PhysicalFieldName,
        @DataType, @Length
    IF @@FETCH_STATUS <> 0
        IF @@FETCH_STATUS = -1
            BREAK
        ELSE
            CONTINUE;

    IF @ForeignKey1 = ''
        CONTINUE;

    /* Load FK info here */

END /*WHILE*/

DEALLOCATE cursor_data_dictionary;

SELECT 'foreign_key_columns' AS table_name, * FROM dbo.foreign_key_columns;

Open in new window

0
Scott PletcherSenior DBACommented:
I've made final adjustments to the code above.
Please refresh your screen and recapture the code to make sure you get the latest version.

The actual generation of the CREATE TABLE statement with PK and FK constraints is rather trivial once the control tables are loaded.  It won't take long at all to code that up once/if you're ok with the approach overall.
0
Scott PletcherSenior DBACommented:
Here's the sample data I used for testing the script above, based on your original data:

CREATE TABLE dbo.data_dictionary_import (
    TableName varchar(100) NULL,
    PhysicalTableName varchar(100) NULL,
    PrimaryKey varchar(50) NULL,
    ForeignKey1 varchar(50) NULL,
    ForeignKey2 varchar(50) NULL,
    FieldName varchar(50) NULL,
    PhysicalFieldName varchar(50) NULL,
    DataType varchar(50) NULL,
    Length varchar(10) NULL
    )
INSERT INTO dbo.data_dictionary_import VALUES
('Owner Table', 'OWNER',               'X',              'Lease ID',      'API14',      '*Owner ID',          '*OWNR_ID',          'VARCHAR',          '50'),
('Owner Table',      'OWNER'      ,            'Owner ID',      'Lease ID', 'API14'      ,   '**Lease ID List',      '**LSE_ID_LIST','VARCHAR',          '50'),
('Owner Table',      'OWNER'       ,           'Owner ID',      'Lease ID', 'API14'       ,  '**API14 List',      '**API14_LIST',          'VARCHAR',          '50'),
('Owner Table',      'OWNER'            ,             '',             '',             ''     ,    'Lease Count',       'LSE_CNT',      'VARCHAR',          '50')
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Evan CutlerVolunteer Chief Information OfficerAuthor Commented:
You guys are awesome! Thanks so much for helping me out.
This is great!
0
Scott PletcherSenior DBACommented:
Here's the code to generate the PKs:


SELECT 'ALTER TABLE [' + t.SchemaName + '].[' + t.PhysicalTableName + '] ADD ' +
    'CONSTRAINT [' + t.PhysicalTableName + '__PK' + '] PRIMARY KEY CLUSTERED ( ' + STUFF(
    (SELECT ', [' + c.PhysicalFieldName + ']'
     FROM dbo.columns c
     WHERE c.TableId = pkc.TableId AND c.ColumnId = pkc.ColumnId
     FOR XML PATH('')
    ), 1, 2, '') + ') WITH ( FILLFACTOR = 99 );'  AS [--Primary_Key_Definitions]
FROM dbo.primary_key_columns pkc
INNER JOIN dbo.tables t ON t.TableId = pkc.TableId
0
Scott PletcherSenior DBACommented:
And the code for the FKs.  I can't easily test this code as neither referenced table definition -- Lease / API14 -- was present in the sample data.:

SELECT 'ALTER TABLE [' + tFrom.SchemaName + '].[' + tFrom.PhysicalTableName + '] ADD ' +
    'CONSTRAINT [' + tFrom.PhysicalTableName + '__PK' + '] FOREIGN KEY ( ' + STUFF(
    (SELECT ', [' + c.PhysicalFieldName + ']'
     FROM dbo.columns c
     WHERE c.TableId = fkc.TableIdFrom AND c.ColumnId = fkc.ColumnIdFrom
     FOR XML PATH('')
    ), 1, 2, '') + ') ' +
    'REFERENCES [' + tTo.SchemaName + '].[' + tTo.PhysicalTableName + '] ( ' + STUFF(
    (SELECT ', [' + c.PhysicalFieldName + ']'
     FROM dbo.columns c
     WHERE c.TableId = fkc.TableIdTo AND c.ColumnId = fkc.ColumnIdTo
     FOR XML PATH('')
    ), 1, 2, '') + ') ' AS [--Foreign_Key_Definitions]
FROM dbo.foreign_key_columns fkc
INNER JOIN dbo.tables tFrom ON tFrom.TableId = fkc.TableIdFrom
INNER JOIN dbo.tables tTo ON tFrom.TableId = fkc.TableIdTo
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
SQL

From novice to tech pro — start learning today.