Avatar of Evan Cutler
Evan Cutler
Flag for United States of America asked on

how to read in a csv data dictionary and build SQL Server tables from it.

Greetings,

I have a csv sheet that is a data dictionary.  It looks like this:

TableName	PhysicalTableName	PrimaryKey	ForeignKey	ForeignKey	FieldName	PhysicalFieldName	DATATYPE	LENGTH
 Owner Table	OWNER	X	Lease ID	API14	*Owner ID	*OWNR_ID	VARCHAR	50
 Owner Table	OWNER	Owner ID	Lease ID	API14	**Lease ID List	**LSE_ID_LIST	VARCHAR	50
 Owner Table	OWNER	Owner ID	Lease ID	API14	**API14 List	**API14_LIST	VARCHAR	50
 Owner Table	OWNER				Lease Count	LSE_CNT	VARCHAR	50

Open in new window


As you can see, this is not about importing a csv full of data.  This is a csv with schema metadata.
I need to know if there's a way in SQL Server to build the tables defined in this csv.  There are over 2000 lines in this document.
I'm hoping I don't have to run this manually.

If anyone has any ideas, I'd greatly appreciate it.
Thanks
Microsoft SQL ServerSQL

Avatar of undefined
Last Comment
Scott Pletcher

8/22/2022 - Mon
Pawan Kumar

can you give me one example what you need from the CSV data we are inserting into the table.

This would be very very difficult to build the SQL even after insertion of CSV into the table. Please use below code to import the data.

USE [DBNAME]
GO

CREATE TABLE [dbo].[p1](
	[ Owner Table] [varchar](50) NULL,
	[OWNER] [varchar](50) NULL,
	[X] [varchar](50) NULL,
	[Lease ID] [varchar](50) NULL,
	[API14] [varchar](50) NULL,
	[*Owner ID] [varchar](50) NULL,
	[*OWNR_ID] [varchar](50) NULL,
	[VARCHAR] [varchar](50) NULL,
	[50] [varchar](50) NULL
) ON [PRIMARY]

GO

BULK
INSERT [p1]
FROM 'c:\pawan\p\p1.txt'
WITH
(
	FIELDTERMINATOR = '	',
	ROWTERMINATOR = '\n'
)
GO

Open in new window


OUTPUT

/*------------------------
SELECT *  FROM [p1]
------------------------*/
 Owner Table                                       OWNER                                              X                                                  Lease ID                                           API14                                              *Owner ID                                          *OWNR_ID                                           VARCHAR                                            50
-------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- --------------------------------------------------
TableName                                          PhysicalTableName                                  PrimaryKey                                         ForeignKey                                         ForeignKey                                         FieldName                                          PhysicalFieldName                                  DATATYPE                                           LENGTH
 Owner Table                                       OWNER                                              X                                                  Lease ID                                           API14                                              *Owner ID                                          *OWNR_ID                                           VARCHAR                                            50
 Owner Table                                       OWNER                                              Owner ID                                           Lease ID                                           API14                                              **Lease ID List                                    **LSE_ID_LIST                                      VARCHAR                                            50
 Owner Table                                       OWNER                                              Owner ID                                           Lease ID                                           API14                                              **API14 List                                       **API14_LIST                                       VARCHAR                                            50
 Owner Table                                       OWNER                                              NULL                                               NULL                                               NULL                                               Lease Count                                        LSE_CNT                                            VARCHAR                                            50

(5 row(s) affected)

Open in new window

Evan Cutler

ASKER
I don't have that.

I have a blank database.  The tables inside this database is defined in the CSV.
Again, This is not a data file.  This is, at best for lack of better term, a data dictionary.
There are over 200 tables defined in this document.  I really don't want to build them from the csv manually.

As you can see in the example, I have the Table_Name, Column_name, Data_Type, and Data_Length.
Does SQL Server have the ability to ingest this metadata, and create tables from the definition?

Thanks
Evan Cutler

ASKER
In the example above....
somehow I need the information:

TableName	PhysicalTableName	PrimaryKey	ForeignKey	ForeignKey	FieldName	PhysicalFieldName	DATATYPE	LENGTH
 Owner Table	OWNER	X	Lease ID	API14	*Owner ID	OWNR_ID	VARCHAR	50
 Owner Table	OWNER	Owner ID	Lease ID	API14	**Lease ID List	**LSE_ID_LIST	VARCHAR	50
 Owner Table	OWNER	Owner ID	Lease ID	API14	**API14 List	**API14_LIST	VARCHAR	50
 Owner Table	OWNER				Lease Count	LSE_CNT	VARCHAR	50

Open in new window


to turn into:
CREATE TABLE OWNER (
     OWNR_ID varchar(50),
     LSE_ID_LIST varchar(50),
     API14_LIST varchar(50),
     ...
)

Open in new window


I realize there's asterisk's in my information...just ignore them.  Thanks

Appreciate it.
)
I started with Experts Exchange in 2004 and it's been a mainstay of my professional computing life since. It helped me launch a career as a programmer / Oracle data analyst
William Peck
Pawan Kumar

There is nothing built in in SQL server ..we have to create some logic to get the statements.. do u have have data also in this CSV file?
eg. data for OWNER  table?
Evan Cutler

ASKER
Thanks.  No.  The OWNER table is going to be populated by a SSIS package to be built.
so you are suggesting creating some external code to build the tables?
Pawan Kumar

so you are suggesting creating some external code to build the tables?
yes read the data from the inserted table and build sql for each tableaname in the table and generate sql using dynamic sql.
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
SOLUTION
Pawan Kumar

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
GET A PERSONALIZED SOLUTION
Ask your own question & get feedback from real experts
Find out why thousands trust the EE community with their toughest problems.
Scott Pletcher

If you have some time, I'm working on a far more comprehensive and flexible approach.  Please let me know if you're in a big hurry, as this will take a bit of time to write up.

No need for me to spend the time and effort if you just want something quick-and-dirty anyway :-).
Scott Pletcher

Let's assume you have the csv data loaded into a table named
dbo.data_dictionary_import
with these columns:
TableName      PhysicalTableName      PrimaryKey      ForeignKey1      ForeignKey2      FieldName      PhysicalFieldName      DataType      Length

Below is my script to process that data.

I use new control tables to store the info coming in.  I do two passes of the csv data to load those tables:
the first for the main tables,
the second to store the FK relationships (and note missing FK table(s)/column(s), if any).

Once the control tables are loaded,
you can make any modifications to those tables you want/need to make,
then we can use "standard" XML concatenation technique (or another method if you prefer) to generate a script to create the tables.
I've allowed for keys to have multiple columns, even if you don't currently have that situation in your tables, it's possible in the future.

Edit: The "NewDataType*" columns are just in case you want to change the data type of a column when the new table is created.  For example, "LSE_CNT" should really be an int column rather than varchar.  The new type and implementation method would just be options to do that, if it's needed.  If not needed, of course you can just ignore it :-).

/* Create control tables to hold imported data dictionary data. */
CREATE TABLE dbo.tables 
    (
    TableId int IDENTITY(1, 1) NOT NULL CONSTRAINT tables__PK PRIMARY KEY,
    TableName varchar(100) NULL,
    PhysicalTableName varchar(100) NOT NULL,
    SchemaName varchar(30) NOT NULL CONSTRAINT tables__DF_SchemaName DEFAULT 'dbo',
    PrimaryKeyCount tinyint NULL
    );
CREATE TABLE dbo.columns 
    (
    TableId int NOT NULL /*CONSTRAINT columns__FK_tables FOREIGN KEY REFERENCES dbo.tables ( TableId )*/,
    ColumnId int IDENTITY(1, 1) NOT NULL,
    FieldName varchar(100) NULL,
    PhysicalFieldName varchar(100) NOT NULL,
    DataType varchar(50) NULL,
    Length varchar(10) NULL,
    NewDataType varchar(50) NULL,
    NewDataTypeImplementation tinyint NULL,
    CONSTRAINT columns__PK PRIMARY KEY ( TableId, ColumnId )
    );
CREATE TABLE dbo.primary_key_columns
    (
    PkId int IDENTITY(1, 1) NOT NULL,
    TableId int NOT NULL,
    ColumnId int NOT NULL
    /*CONSTRAINT primary_key_columns__FK_columns FOREIGN KEY REFERENCES dbo.columns ( TableId, ColumnId )*/
    );
CREATE TABLE dbo.foreign_key_columns 
    (
    TableIdFrom int NOT NULL,
    ColumnIdFrom int NOT NULL,
    KeySequence tinyint NOT NULL,
    TableIdTo int NULL,
    ColumnIdTo int NULL --,
    /*CONSTRAINT foreign_key_columns__FK_columns1 FOREIGN KEY REFERENCES dbo.columns ( TableIdFrom, ColumnIdFrom )*/,
    /*CONSTRAINT foreign_key_columns__FK_columns2 FOREIGN KEY REFERENCES dbo.columns ( TableIdTo, ColumnIdTo )*/
    );


/* Main code to load the control tables. */

DECLARE /* data dictionary columns */
    @TableName varchar(100),
    @PhysicalTableName varchar(100),
    @PrimaryKey varchar(50),
    @ForeignKey1 varchar(50),
    @ForeignKey2 varchar(50),
    @FieldName varchar(50),
    @PhysicalFieldName varchar(50),
    @DataType varchar(50),
    @Length varchar(10)
DECLARE @ColumnId int
DECLARE @TableId int


DECLARE cursor_data_dictionary CURSOR LOCAL FAST_FORWARD FOR
SELECT 
    ddi.TableName, ddi.PhysicalTableName, 
    ddi.PrimaryKey, 
    ddi.ForeignKey1, ddi.ForeignKey2, 
    ddi.FieldName, ddi.PhysicalFieldName,
    ddi.DataType, ddi.Length
FROM dbo.data_dictionary_import ddi;
--ORDER BY ddi.TableName, ddi.PhysicalTableName

OPEN cursor_data_dictionary;

/* Load all main table info (no FK info). */
WHILE 1 = 1 
BEGIN
    FETCH NEXT FROM cursor_data_dictionary INTO @TableName, @PhysicalTableName, 
        @PrimaryKey, 
        @ForeignKey1, @ForeignKey2, 
        @FieldName, @PhysicalFieldName,
        @DataType, @Length
    IF @@FETCH_STATUS <> 0
        IF @@FETCH_STATUS = -1
            BREAK
        ELSE
            CONTINUE;

    IF EXISTS(SELECT 1 FROM dbo.tables WHERE PhysicalTableName = @PhysicalTableName)
    BEGIN
        SELECT @TableId = TableId
        FROM dbo.tables
        WHERE PhysicalTableName = @PhysicalTableName
    END /*IF*/
    ELSE
    BEGIN
        INSERT INTO dbo.tables ( TableName, PhysicalTableName )
        SELECT @TableName, @PhysicalTableName
        SET @TableId = SCOPE_IDENTITY()
    END /*ELSE*/

    IF @PrimaryKey = 'X'
    BEGIN
        IF EXISTS(SELECT 1 FROM dbo.columns WHERE TableId = @TableId AND PhysicalFieldName = @PhysicalFieldName)
        BEGIN
            SELECT @ColumnId = ColumnId
            FROM dbo.columns
            WHERE TableId = @TableId AND PhysicalFieldName = @PhysicalFieldName
        END /*IF*/
        ELSE
        BEGIN
            INSERT INTO dbo.columns ( TableId, FieldName, PhysicalFieldName, DataType, Length )
            SELECT @TableId, REPLACE(@FieldName, '*', ''), REPLACE(@PhysicalFieldName, '*', ''), @DataType, @Length
            SET @ColumnId = SCOPE_IDENTITY()
        END /*ELSE*/

        IF NOT EXISTS(SELECT 1 FROM dbo.primary_key_columns WHERE TableId = @TableId AND ColumnId = @ColumnId)
        BEGIN
            INSERT INTO dbo.primary_key_columns ( TableId, ColumnId )
            SELECT @TableId, @ColumnId
        END /*IF*/        
    END /*IF*/
    ELSE
    IF NOT EXISTS(SELECT 1 FROM dbo.columns WHERE TableId = @TableId AND PhysicalFieldName = REPLACE(@PhysicalFieldName, '*', ''))
    BEGIN
        INSERT INTO dbo.columns ( TableId, FieldName, PhysicalFieldName, DataType, Length )
        SELECT @TableId, REPLACE(@FieldName, '*', ''), REPLACE(@PhysicalFieldName, '*', ''), @DataType, @Length
        SET @ColumnId = SCOPE_IDENTITY()            
    END /*IF*/    

END /*WHILE*/

CLOSE cursor_data_dictionary;

UPDATE t
SET PrimaryKeyCount = ISNULL(c.PrimaryKeyCount, 0)
FROM dbo.tables t
LEFT OUTER JOIN (
    SELECT TableId, COUNT(*) AS PrimaryKeyCount
    FROM dbo.primary_key_columns
    GROUP BY TableId
) AS c ON c.TableId = t.TableId

SELECT 'tables' AS table_name, * FROM dbo.tables;
SELECT 'columns' AS table_name, * FROM dbo.columns;
SELECT 'primary_key_columns' AS table_name, * FROM dbo.primary_key_columns;

OPEN cursor_data_dictionary;

/* Load FK info (only). */
WHILE 1 = 1 
BEGIN
    FETCH NEXT FROM cursor_data_dictionary INTO @TableName, @PhysicalTableName, 
        @PrimaryKey, 
        @ForeignKey1, @ForeignKey2, 
        @FieldName, @PhysicalFieldName,
        @DataType, @Length
    IF @@FETCH_STATUS <> 0
        IF @@FETCH_STATUS = -1
            BREAK
        ELSE
            CONTINUE;

    IF @ForeignKey1 = ''
        CONTINUE;

    /* Load FK info here */

END /*WHILE*/

DEALLOCATE cursor_data_dictionary;

SELECT 'foreign_key_columns' AS table_name, * FROM dbo.foreign_key_columns;

Open in new window

Scott Pletcher

I've made final adjustments to the code above.
Please refresh your screen and recapture the code to make sure you get the latest version.

The actual generation of the CREATE TABLE statement with PK and FK constraints is rather trivial once the control tables are loaded.  It won't take long at all to code that up once/if you're ok with the approach overall.
Your help has saved me hundreds of hours of internet surfing.
fblack61
ASKER CERTIFIED SOLUTION
Scott Pletcher

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
Evan Cutler

ASKER
You guys are awesome! Thanks so much for helping me out.
This is great!
Scott Pletcher

Here's the code to generate the PKs:


SELECT 'ALTER TABLE [' + t.SchemaName + '].[' + t.PhysicalTableName + '] ADD ' +
    'CONSTRAINT [' + t.PhysicalTableName + '__PK' + '] PRIMARY KEY CLUSTERED ( ' + STUFF(
    (SELECT ', [' + c.PhysicalFieldName + ']'
     FROM dbo.columns c
     WHERE c.TableId = pkc.TableId AND c.ColumnId = pkc.ColumnId
     FOR XML PATH('')
    ), 1, 2, '') + ') WITH ( FILLFACTOR = 99 );'  AS [--Primary_Key_Definitions]
FROM dbo.primary_key_columns pkc
INNER JOIN dbo.tables t ON t.TableId = pkc.TableId
Scott Pletcher

And the code for the FKs.  I can't easily test this code as neither referenced table definition -- Lease / API14 -- was present in the sample data.:

SELECT 'ALTER TABLE [' + tFrom.SchemaName + '].[' + tFrom.PhysicalTableName + '] ADD ' +
    'CONSTRAINT [' + tFrom.PhysicalTableName + '__PK' + '] FOREIGN KEY ( ' + STUFF(
    (SELECT ', [' + c.PhysicalFieldName + ']'
     FROM dbo.columns c
     WHERE c.TableId = fkc.TableIdFrom AND c.ColumnId = fkc.ColumnIdFrom
     FOR XML PATH('')
    ), 1, 2, '') + ') ' +
    'REFERENCES [' + tTo.SchemaName + '].[' + tTo.PhysicalTableName + '] ( ' + STUFF(
    (SELECT ', [' + c.PhysicalFieldName + ']'
     FROM dbo.columns c
     WHERE c.TableId = fkc.TableIdTo AND c.ColumnId = fkc.ColumnIdTo
     FOR XML PATH('')
    ), 1, 2, '') + ') ' AS [--Foreign_Key_Definitions]
FROM dbo.foreign_key_columns fkc
INNER JOIN dbo.tables tFrom ON tFrom.TableId = fkc.TableIdFrom
INNER JOIN dbo.tables tTo ON tFrom.TableId = fkc.TableIdTo
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.