Solved

# Bulk Insert Help

Posted on 2008-10-25
710 Views
Hi Everyone

I am trying to bulk insert the attached file into a table I am creating with the code below. I am getting errors about the primary key being the wrong data type.

I don't know how to solve this issue, and I was hoping someone could iron out the code so the bulk insert succeeds. I had to change the file extension to allow uploads; the original extension is .csv.

USE master
Go

DROP DATABASE      ChartOfAccounts
CREATE DATABASE ChartOfAccounts
ON
(NAME = 'ChartOfAccounts',
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\ChartOfAccounts.mdf',
SIZE = 10,
MAXSIZE = 50,
FILEGROWTH = 5 )
LOG ON
(NAME = 'ChartOfAccountsLog',
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\ChartOfAccounts.ldf',
SIZE = 5MB,
MAXSIZE = 25MB,
FILEGROWTH = 5MB )

GO

USE ChartOfAccounts
GO

DROP TABLE ChartOfAccounts
CREATE TABLE ChartOfAccounts
(
[Account] int NOT NULL      PRIMARY KEY CLUSTERED IDENTITY (1,1)
, [Description]      varchar (128) NULL
, [Short_Description]      varchar(512)      NOT NULL
, [Balance]       Char (128)            NULL
)
GO

BULK INSERT ChartOfAccounts
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
GO

COA.csv.txt
Question by: salberta

LVL 18

Expert Comment

ID: 22805816
Declare the identity as bigint; your first row's value is too big for an integer.

LVL 17

Expert Comment

ID: 22805822
Check the type of data in the CSV file. The first column, i.e. Account, should be a plain integer without a decimal point, quotes, or any characters other than digits. If you are providing values for this PK, then identity insert has to be turned on; make sure the data is 1 2 3 4 ...
If data for the column is missing, turn identity insert off. The setting is SET IDENTITY_INSERT; check what the default is and change it to the other state if required.
But if it is purely a data type mismatch between the data in the .CSV file and the column, it may be that some default or missing option in the BULK INSERT command (which controls the data load operation) is leaving single/double quotes around the PK data.
It may also be that your column data type is int and there is a value in the .CSV file exceeding its upper limit, or you may be inserting NULL into a NOT NULL column.
All the best.

-- k_murli_krishna

LVL 18

Expert Comment

ID: 22805824
There are other problems: some rows of the .csv file only contain two or three columns; every row should contain all four.

LVL 18

Expert Comment

ID: 22805831
The first account number is 100001, and you can't insert an explicit value into an IDENTITY column without turning identity insert on. SET IDENTITY_INSERT can be used to disable the IDENTITY property of a column by enabling values to be explicitly inserted.
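A minimal sketch of what that looks like, assuming the ChartOfAccounts table from the question (the sample values are illustrative, not from the file):

```sql
-- Allow explicit values into the IDENTITY column
SET IDENTITY_INSERT ChartOfAccounts ON;

-- An explicit column list is required while IDENTITY_INSERT is ON
INSERT INTO ChartOfAccounts (Account, Description, Short_Description, Balance)
VALUES (100001, 'Cash', 'Cash', NULL);

-- Turn it back off so the column auto-generates values again
SET IDENTITY_INSERT ChartOfAccounts OFF;
```

Note that only one table per session can have IDENTITY_INSERT set ON at a time.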

LVL 18

Expert Comment

ID: 22805837
Also, you need to add FIRSTROW = 2, because the BULK INSERT will fail on the first row: it attempts to insert the header text into a numeric column.

LVL 17

Expert Comment

ID: 22806583
jmoss111, you are correct about some rows having less data and about identity insert. salberta, there is a row in the .CSV file whose entire content is ,, i.e. there is no data at all, and hence no value for the PK, which is a real problem. After solving everything else based on our comments, you will still need to clear this last one, since apart from the PK, Short_Description is also NOT NULL.

Author Comment

ID: 22806786
Hi Everyone-

I am very new to this. Working code, or at least a walkthrough of how you would perform this action, would be appreciated. I am a systems administrator and was handed SQL and this project. The first field needs to be a primary key. If there are duplicates I can find them, or if you have a way to search for dupes, that would help.
I can create and populate the table from the bulk insert statement and from an import; I cannot change any table values, however, due to null errors.
I have Books Online and I am poring through them.
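One common way to search for duplicate Account values is a GROUP BY with HAVING; a sketch, assuming the rows have been bulk-inserted into a staging table (the name ChartOfAccounts_Staging here is illustrative):

```sql
-- List each Account value that appears more than once, with its count
SELECT Account, COUNT(*) AS Occurrences
FROM ChartOfAccounts_Staging
GROUP BY Account
HAVING COUNT(*) > 1
ORDER BY Account;
```

An empty result means the column is safe to promote to a primary key.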

LVL 18

Expert Comment

ID: 22806995
salberta,

See the code attached. I made the changes I told you about in earlier posts, i.e.:

removed IDENTITY (1,1)

and added a "," at the end of the first 10 rows to get the correct number of columns, and it worked great.

Jim

USE master
Go

--DROP DATABASE      ChartOfAccounts
CREATE DATABASE ChartOfAccounts
ON
(NAME = 'ChartOfAccounts',
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\ChartOfAccounts.mdf',
SIZE = 10,
MAXSIZE = 50,
FILEGROWTH = 5 )
LOG ON
(NAME = 'ChartOfAccountsLog',
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\ChartOfAccounts.ldf',
SIZE = 5MB,
MAXSIZE = 25MB,
FILEGROWTH = 5MB )

GO

USE ChartOfAccounts
GO

DROP TABLE ChartOfAccounts
CREATE TABLE ChartOfAccounts
(
[Account] int NOT NULL      PRIMARY KEY CLUSTERED
, [Description]      varchar (128) NULL
, [Short_Description]      varchar(512)      NOT NULL
, [Balance]       Char (128)            NULL
)
GO

BULK INSERT ChartOfAccounts
FROM 'C:\coa.txt'
WITH
(
FIELDTERMINATOR = ',',
FIRSTROW = 2,
ROWTERMINATOR = '\n'
)
GO


LVL 18

Expert Comment

ID: 22806997
You can fix the text file, right?

LVL 51

Accepted Solution

Mark Wills earned 500 total points
ID: 22808341
FIRSTROW = 2, identity_insert ON, and check for dupes. Use a staging table...


-- first do a staging table

if object_id(N'ChartOfAccounts_staging','U') is not null drop table ChartOfAccounts_staging

-- create a staging table
CREATE TABLE ChartOfAccounts_Staging
(
[Account] int NULL   --  PRIMARY KEY CLUSTERED  --IDENTITY (1,1)
, [Description]      varchar (128) NULL
, [Short_Description]      varchar(512)      NULL
, [Balance]       Char (128)            NULL
)
GO

BULK INSERT ChartOfAccounts_staging
FROM 'C:\ee\COA.csv.txt'
WITH
(
FIELDTERMINATOR = ',',
FIRSTROW = 2,
ROWTERMINATOR = '\n'
)
GO

-- show dupes - maybe remove dupes

select * from chartofaccounts_staging
where account in
(select account from chartofaccounts_staging group by account having count(*) > 1)
order by account
GO

-- now populate from staging...

if object_id(N'ChartOfAccounts','U') is not null drop table ChartOfAccounts
GO

-- create new home

CREATE TABLE ChartOfAccounts
(
[Account] int NOT NULL      PRIMARY KEY CLUSTERED IDENTITY (1,1)
, [Description]      varchar (128) NULL
, [Short_Description]      varchar(512)      NOT NULL
, [Balance]       Char (128)            NULL
)
GO

set identity_insert ChartOfAccounts ON
GO

-- build exclude dupes (any instance)

insert chartofaccounts (account,description,short_description,balance)
select distinct account,description,short_description,balance
from chartofaccounts_staging
where isnull(account,0) > 0
and len(isnull(short_description,'')) > 0
and not exists (select * from chartofaccounts c where c.account = chartofaccounts_staging.account)
and account not in (select account from chartofaccounts_staging group by account having count(*) > 1)

-- turn identity control back on

set identity_insert ChartOfAccounts OFF
GO


LVL 18

Expert Comment

ID: 22808585
Like mark_wills said, never import directly into a production table; always use a staging table.

Author Comment

ID: 22813610
Hi jmoss111-

It seems that even a simple modification of the lines of the file is escaping me. Could you please give me an example?
Sorry to be a bother.

salberta

LVL 18

Expert Comment

ID: 22814357
Your text file contains 4 columns, but some rows of data only contain two columns and don't have the extra delimiters to fill out the row, i.e. two trailing commas: ,,

LVL 30

Expert Comment

ID: 22818478
To extend jmoss111's comments: line 35 just has ,, in it. That indicates three columns (separated by two commas), while every other line has three commas in it, indicating four columns.
You need to work out what is meant to be at line 35; it definitely should not be ,,
Looking at the data, there's no gap in the numbers between 121011 and 121012, so I'm guessing someone accidentally typed it in and it should be removed.

Also, do you have a reason to make that column an IDENTITY?
You usually create an IDENTITY column so that when you insert data, the table automatically puts an incrementing value in the field for you.
So after you have successfully imported this data, do you expect to add more rows and have the table generate an Account number for you automatically? If not, then you shouldn't be using the IDENTITY property in your table.
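If the Account values will always come from the source data rather than being auto-numbered, the column can simply be a plain primary key, as in the earlier working code; a sketch of that alternative definition:

```sql
-- Account values come from the CSV, so no IDENTITY property is needed
CREATE TABLE ChartOfAccounts
(
  [Account] int NOT NULL PRIMARY KEY CLUSTERED
, [Description] varchar(128) NULL
, [Short_Description] varchar(512) NOT NULL
, [Balance] char(128) NULL
);
```

With no IDENTITY property, explicit Account values insert directly and SET IDENTITY_INSERT is never required.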

LVL 51

Expert Comment

ID: 22821404
hi salberta - have you tried my solution? It does work as you want; it just uses a staging table in between. Are there any problems with that approach? The only question I have is that it will ignore dupes in the staging table altogether (i.e. not even one instance wins), and you probably want to report those as errors for manual intervention rather than trying to decide which CofA entry becomes "real". But if there is a rule to help determine the "real" entry among those dupes, that can be added in...
