Find DataTypes of a staging table

I have a staging table that has been imported from excel, every column is varchar(255).  I need to move the data into a new table that has the correct datatypes for each column.  I need a best guess at least on the column datatype and we never know how many columns the table is going to be, we only know it will be called staging_[something].  The datatypes I need to check for are int, datetime, float, decimal, varchar, bit.  What is the best way to do this in T-SQL.  Thanks
nutnutAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Scott PletcherSenior DBACommented:
Profiling the columns for int, float, decimal and bit is fairly straightforward.

Checking datetime vs general varchar could be very difficult.  Do you have some type of restrictions on how the date/datetime could be entered?
0
nutnutAuthor Commented:
Yes we do we have maybe 5 different common ways. I was thinking as long as the bulk of the column have a 13,14,15 or 16 in its a date because all dates will be from 2013,14,15 and 16?
0
David ToddSenior DBACommented:
Hi,

Can you look at Excel and from there guess at the datatype?

Regards
  David
0
Scott PletcherSenior DBACommented:
I've got very limited time now, but I think the code below will provide you a decent start.  You seemed to have a better handle on testing for date than I did -- I would want specific patterns -- so I left that for you.

My code presumes you want to look at *every* row before deciding on the type.  If there are too many rows, you'll probably want to add a "TOP (nnnn)" expression.

I've only coded column1, but column2 would follow exactly the same pattern, and so on for column3, column4, etc..

I'd SELECT these results into a temp table, either one row for all columns or one row per column, and analyze the types there.

The basic analysis would be something like:
  -- if total_count = bit_ok_count and sign_location_max = 0 and float_location_final_max = 0  then  type = bit
  -- if total_count = int_ok_count and sign_location_max IN (0, 1) and float_location_final_max = 0  then  type = int
  --  if total_count = decimal_ok_count and sign_location_max IN (0, 1) and float_location_final_max = 0  then  type = decimal
  --  if total_count = datetime_ok_count  then  type = datetime
  -- else  type = varchar.
 
SELECT
    SUM(1) AS Total_Count,
    MIN(sign_location) AS sign_location_min,
    MAX(sign_location) AS sign_location_max,
    MIN(float_location_final) AS float_location_final_min,
    MAX(float_location_final) AS float_location_final_max,
    SUM(CASE WHEN column1_mod_final NOT LIKE '%[^0123456789]%' THEN 1 ELSE 0 END) AS int_ok_count,
    SUM(CASE WHEN column1_mod_final NOT LIKE '%[^0-9.]%' AND column1_mod_final NOT LIKE '%.%.%' THEN 1 ELSE 0 END)
        AS decimal_ok_count,
    SUM(CASE WHEN column1_mod_final IN ('0', '1', 'False', 'True') THEN 1 ELSE 0 END) AS bit_ok_count
    --add your date testing logic here AS datetime_ok_count
FROM ( --dbo.staging_table1
    SELECT '12e+5' AS column1
) AS staging_table1
CROSS APPLY (
    SELECT PATINDEX('%[+-]%', column1) AS sign_location,
        PATINDEX('%[DE]%', column1) AS float_location_prelim
) AS ca10
CROSS APPLY (
    SELECT SUBSTRING(column1, sign_location, 1) AS [sign]
) AS ca20
CROSS APPLY (
    SELECT CASE WHEN sign_location > 0 THEN STUFF(column1, sign_location, 1, '') ELSE column1 END AS column1_mod_1
) AS ca30
CROSS APPLY (
    SELECT CASE WHEN float_location_prelim > 0 AND
        SUBSTRING(column1_mod_1, float_location_prelim + 1, 99) NOT LIKE '%[^0123456789]%' AND
        (sign_location = 0 OR sign_location > float_location_prelim) THEN float_location_prelim ELSE 0 END
        AS float_location_final
) AS ca4
CROSS APPLY (
    SELECT CASE WHEN float_location_final > 0 THEN STUFF(column1_mod_1, float_location_final, 1, '')
        ELSE column1_mod_1 END AS column1_mod_final
) AS ca6
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
nutnutAuthor Commented:
Great answer thanks!!
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Query Syntax

From novice to tech pro — start learning today.