I am a total beginner to SQL and although I understand the concepts I do not understand the syntax. I have searched long enough to make my eyes bleed so now I need to just ask:
How do you read a CSV file into an SQLEXPRESS/Server 2008 table when the CSV file is not well formed?
The data file is from a data recorder and the first line in the file is always the column headers. The data that follows may have sync data inserted between the lines of sensor data to indicate that something happened while recording and that the data may have a gap in the readings. Obviously these sync lines are not predictable. And there is also a possibility that the data may just be malformed. The sync lines contain the MESSAGE ID and a WORD COUNT of 0 followed by a comma. The comma is an artifact of the way the data recorder fills out incomplete fields (ie. a bug we can't fix.)
Here are a few lines from a sensor data file:
MESSAGE ID, WORD COUNT, BUFFER NUMBER, TIME MSP (SECONDS), TIME LSP (SECONDS), NUMBER OF MEAS, MEAS 1 CODE, MEAS 1 VEHICLE ID, MEAS 1 FREQ, MEAS 1 VALIDITY, MEAS 1 RANGE (METERS)
7159, 75, 0, 5.8982400000000000e+005, 3.0441000000000004e+003, 18, 1, 18, 0, 0,-4.2217182187500000e+006
7159, 75, 0, 5.8982400000000000e+005, 3.0442000000000003e+003, 18, 1, 18, 0, 1,-4.2217274531250000e+006
7159, 75, 0, 5.8982400000000000e+005, 3.0443000000000002e+003, 18, 1, 18, 0, 1,-4.2217365625000000e+006
7159, 75, 0, 5.8982400000000000e+005, 3.0444000000000001e+003, 18, 1, 18, 0, 1,-4.2217456250000000e+006
7159, 75, 0, 5.8982400000000000e+005, 3.0445000000000000e+003, 18, 1, 18, 0, 1,-4.2217548437500000e+006
This is the code I have written so far to read the data into a temporary table and then parse the data to the permanent table.
-- Start with a clean slate and drop the RawSensorData table if it
-- already exists
IF OBJECT_ID ('#RawSensorData', 'U') IS NOT NULL
DROP TABLE #RawSensorData
-- Create a temporary table that we can read the raw sensor data into
CREATE TABLE #RawSensorData ([RawSensorData] VARCHAR(MAX))
-- Read in the raw sensor data as a string
BULK INSERT #RawSensorData
-- Drop the SensorData table if it already exists
IF OBJECT_ID ('[dbo].[SensorData]', 'U') IS NOT NULL
DROP TABLE [dbo].[SensorData]
-- Create the SensorData table
IF OBJECT_ID ('[dbo].[SensorData]', 'U') IS NULL
CREATE TABLE [dbo].[SensorData]
[ MESSAGE ID] smallint,
[ WORD COUNT] smallint,
[ BUFFER NUMBER] smallint,
[ TIME MSP (SECONDS)] float,
[ TIME LSP (SECONDS)] float,
[ NUMBER OF MEAS] smallint,
[ MEAS 1 CODE] smallint,
[ MEAS 1 VEHICLE ID] smallint,
[ MEAS 1 FREQ] smallint,
[ MEAS 1 VALIDITY] smallint,
[ MEAS 1 RANGE (METERS)] float
-- Insert the data from the RawSensorData table into
-- the parsed SensorData table
[ MESSAGE ID],
[ WORD COUNT],
[ BUFFER NUMBER],
[ TIME MSP (SECONDS)],
[ TIME LSP (SECONDS)],
[ NUMBER OF MEAS],
[ MEAS 1 CODE],
[ MEAS 1 VEHICLE ID],
[ MEAS 1 FREQ],
[ MEAS 1 VALIDITY],
[ MEAS 1 RANGE (METERS)]
-- What do I do here to take the single column data from
-- the RawSensorData table and convert to the multi-column
-- SensorData table?
-- Delete the temporary table
DROP TABLE #RawSensorData
So I have two questions:
1. What do I need to put in the SELECT statement to take the single column data from the RawSensorData table and convert to the multi-column SensorData table?
2. How do I count how many sync lines and error lines are in the original data?
And... Business Rules dictate that I cannot use BCP, DTS, SSIS, and OPENROWSET or anything else that either calls an external program or that would require changing the Surface Area Configuration.
Capturing the lines with errors would be a bonus but is not necessary.
As you can see, the data is captured with 100 nanosecond accuracy and will generate files that are several million lines so efficient code is a major plus.
And since I am new to this I am looking for example code, the more specific the better, general suggestions would not be helpful at this point.
<<< Everything is easier when you know how. >>>