Solved

Bulk Insert Help

Posted on 2008-10-25
15
709 Views
Last Modified: 2013-11-30
Hi Everyone

I am trying to bulk insert the attached file into a table I am creating with the code below.  I am getting errors with the Primary key being the wrong data type.

i don't know how to solve this issue, I was hoping someone could iron out the code to allow the bulk insert to be successful.  I had to change the file extension to allow uploads.  The extension is .csv


USE master
Go

DROP DATABASE      ChartOfAccounts
CREATE DATABASE ChartOfAccounts
ON
(NAME = 'ChartOfAccounts',
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\ChartOfAccounts.mdf',
SIZE = 10,
MAXSIZE = 50,
FILEGROWTH = 5 )
LOG ON
(NAME = 'ChartOfAccountsLog',
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\ChartOfAccounts.ldf',
SIZE = 5MB,
MAXSIZE = 25MB,
FILEGROWTH = 5MB )

GO



USE ChartOfAccounts
GO

DROP TABLE ChartOfAccounts
CREATE TABLE ChartOfAccounts
      (
      [Account] int NOT NULL      PRIMARY KEY CLUSTERED IDENTITY (1,1)
      , [Description]      varchar (128) NULL
      , [Short_Description]      varchar(512)      NOT NULL
      , [Balance]       Char (128)            NULL
      )
GO

BULK INSERT ChartOfAccounts
        FROM 'C:\Documents and Settings\Administrator\My Documents\KudlerAccounts\KudlerCOA.csv'
            WITH
    (
                FIELDTERMINATOR = ',',
                ROWTERMINATOR = '\n'
    )
GO


BULK
INSERT ChartOfAccounts
        FROM 'C:\Documents and Settings\Administrator\My Documents\KudlerAccounts\KudlerCOA.csv'
            WITH
    (
                FIELDTERMINATOR = ',',
                ROWTERMINATOR = '\n'
    )
GO

COA.csv.txt
0
Comment
Question by:salberta
  • 8
  • 2
  • 2
  • +2
15 Comments
 
LVL 18

Expert Comment

by:jmoss111
ID: 22805816
declare the identity as bigint; your first row is way bigger than integer
0
 
LVL 17

Expert Comment

by:k_murli_krishna
ID: 22805822
See the type of data in CSV file. For first column i.e. Account it should be plain integer without decimal point and any type of quotes or charcters other than digits. If you are providing values for this PK, then IDENTITY insert has to be turned on. Make sure the data is 1 2 3 4 ......................
If data for column is missing then turn IDENTITY insert off. I think there is @@IDENTITY global variable for this. Check what is the default and change it to the only other state if required.
But if it is a purely data type mismatch between data in .CSV file and column, then maybe due to some default/missing option in BULK INSERT command which controls the data load operation, single/double quotes are dynamically enclosing the PK data.
It also maybe that your column data type is int and there is a value exceeding its upper limit in .CSV file. You may also happen to be inserting NULL into NOT NULL column. Also, identity property may be having ORDER/NOORDER just like in a sequence. Check CYCLE/NOCYCLE and MAXVALUE as well.
All the best.

-- k_murli_krishna
0
 
LVL 18

Expert Comment

by:jmoss111
ID: 22805824
There are other problems: the .csv file in some cases only accounts for two or three columns and should account for all four in the file.
0
Back Up Your Microsoft Windows Server®

Back up all your Microsoft Windows Server – on-premises, in remote locations, in private and hybrid clouds. Your entire Windows Server will be backed up in one easy step with patented, block-level disk imaging. We achieve RTOs (recovery time objectives) as low as 15 seconds.

 
LVL 18

Expert Comment

by:jmoss111
ID: 22805831
The first account number is 100001 and cant be held in an integer column. You can't insert into an IDENTITY without turning it off. SET IDENTITY_INSERT can be used to disable the IDENTITY property of a column by enabling values to be explicitly inserted.
0
 
LVL 18

Expert Comment

by:jmoss111
ID: 22805837
Also you need to add FIRSTROW = 2, because the BULK INSERT will  fail when trying to insert the first row into the data due to the fact that you're attempting to insert text into a numeric column.
0
 
LVL 17

Expert Comment

by:k_murli_krishna
ID: 22806583
jmoss111, you are correct about less data in some rows and IDENTITY insert. Salberta, there is a row in .CSV file having data as ,, i.e. there is no data at all & hence no data for PK which is a real problem. After solving all else based on our comments, you will need to clear this 1 last one since apart from PK, Short_Description is also NOT NULL.
0
 

Author Comment

by:salberta
ID: 22806786
Hi Everyone-
 
I am very new to this,  Working code or at least how you would perform this action would be apprecitated.    i am a Systems Administrator and was handed SQL and this project.     The first field needs to be a primary.  If there is duplicates I can find them.
or if you have a way to search for dupes, that would help
I can create and populate the table from the bulk insert statement and from an import.  i cannot change and table values howerver, due to null errors.
I have books online and I am pouring througth them.  
0
 
LVL 18

Expert Comment

by:jmoss111
ID: 22806995
salberta,

See code attached.  I made the changes that I told you about in earlier posts, I.E. :

removed IDENTITY (1,1)
added  ,FIRSTROW = 2,

and added a ", " at the end of the first 10 rows to get the correct # of columns and it worked great.

Jim

USE master
Go
 
--DROP DATABASE      ChartOfAccounts
CREATE DATABASE ChartOfAccounts
ON
(NAME = 'ChartOfAccounts',
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\ChartOfAccounts.mdf',
SIZE = 10,
MAXSIZE = 50,
FILEGROWTH = 5 )
LOG ON
(NAME = 'ChartOfAccountsLog',
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\ChartOfAccounts.ldf',
SIZE = 5MB,
MAXSIZE = 25MB,
FILEGROWTH = 5MB )
 
GO
 
 
 
USE ChartOfAccounts
GO
 
DROP TABLE ChartOfAccounts
CREATE TABLE ChartOfAccounts
      (
      [Account] int NOT NULL      PRIMARY KEY CLUSTERED
      , [Description]      varchar (128) NULL
      , [Short_Description]      varchar(512)      NOT NULL
      , [Balance]       Char (128)            NULL
      )
GO
 
BULK INSERT ChartOfAccounts
        FROM 'C:\coa.txt'
            WITH
    (
                FIELDTERMINATOR = ',',FIRSTROW = 2,
                ROWTERMINATOR = '\n'
    )
GO

Open in new window

0
 
LVL 18

Expert Comment

by:jmoss111
ID: 22806997
You can fix the text file, right?
0
 
LVL 51

Accepted Solution

by:
Mark Wills earned 500 total points
ID: 22808341
FIRSTROW = 2, identity_insert ON, and check for dupes. Use a staging table...



-- first do a staging table
 
if object_id(N'ChartOfAccounts_staging','U') is not null drop table ChartOfAccounts_staging
 
-- create a staging table
CREATE TABLE ChartOfAccounts_Staging
      (
      [Account] int NULL   --  PRIMARY KEY CLUSTERED  --IDENTITY (1,1)
      , [Description]      varchar (128) NULL
      , [Short_Description]      varchar(512)      NULL
      , [Balance]       Char (128)            NULL
      )
GO
 
-- load staging 
 
BULK INSERT ChartOfAccounts_staging
        FROM 'C:\ee\COA.csv.txt'
            WITH
    (
                FIELDTERMINATOR = ',',
                FIRSTROW = 2,
                ROWTERMINATOR = '\n'
    )
GO
 
-- show dupes - maybe remove dupes
 
select * from chartofaccounts_staging
where account in 
(select account from chartofaccounts_staging group by account having count(*) > 1)
order by account
GO
 
-- now populate from staging...
 
if object_id(N'ChartOfAccounts','U') is not null drop table ChartOfAccounts
GO
 
-- create new home
 
CREATE TABLE ChartOfAccounts
      (
      [Account] int NOT NULL      PRIMARY KEY CLUSTERED IDENTITY (1,1)
      , [Description]      varchar (128) NULL
      , [Short_Description]      varchar(512)      NOT NULL
      , [Balance]       Char (128)            NULL
      )
GO
 
-- allow identity updates 
 
set identity_insert ChartOfAccounts ON
GO
 
-- build exclude dupes (any instance)
 
insert chartofaccounts (account,description,short_description,balance)
select distinct account,description,short_description,balance
from chartofaccounts_staging
where isnull(account,0) > 0
and len(isnull(short_description,'')) > 0
and not exists (select * from chartofaccounts c where c.account = chartofaccounts_staging.account)
and account not in (select account from chartofaccounts_staging group by account having count(*) > 1)
 
-- turn identity control back on
 
set identity_insert ChartOfAccounts OFF
GO

Open in new window

0
 
LVL 18

Expert Comment

by:jmoss111
ID: 22808585
Like mark_wills said, never import directly into a production table; always use a staging  table.
0
 

Author Comment

by:salberta
ID: 22813610
Hi jmoss111-
 
It seems that even a simple mod of the lines of the file are escaping me.   Could you please give me an example.
Sorry to be a bother.
 
salberta
0
 
LVL 18

Expert Comment

by:jmoss111
ID: 22814357
Your text file contains 4 columns, some rows of data only contain two columns but you don't have the extra delimiters to fill out the row, IE two ,,
0
 
LVL 30

Expert Comment

by:nmcdermaid
ID: 22818478
To extend jmoss111 comments, line 35 just has ,, in it. This indicates three columns (seperated by commas). Every other line has three commas in it, indicating four columns.
You need to work out what is meant to be at line 35. It definitely should not be ,,
Looking at the data, there's no space in the numbers between 121011 and 121012, so I'm guessing someone accidentally typed it in, and it should be removed.
 
Also, do you have a reason to make that column an IDENTITY?
You usually create an IDENTITY column so that if you insert data into it, it will automatically put in an incrementing value in the field for you.
So after you have successfully imported this daat, do you expect to be adding more data to the table, and have the table automatically generate an Account for you? if not, then you shouldn't be using the Identity property in your table.
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 22821404
hi salberta - have you tried my solution ? it does work as you want - just uses a staging table in between - is there any problems with that approach ? The only question I have is that it will ignore dupes in the staging table - altogether (ie not even 1 instance wins),  and you probably want to report those as errors for manual intervention rather than trying to decide which CofA entry becomes "real". But if there is a rule to help determine the "real" entry in those dupes, than can add that in...
0

Featured Post

Netscaler Common Configuration How To guides

If you use NetScaler you will want to see these guides. The NetScaler How To Guides show administrators how to get NetScaler up and configured by providing instructions for common scenarios and some not so common ones.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

JSON is being used more and more, besides XML, and you surely wanted to parse the data out into SQL instead of doing it in some Javascript. The below function in SQL Server can do the job for you, returning a quick table with the parsed data.
The Delta outage: 650 cancelled flights, more than 1200 delayed flights, thousands of frustrated customers, tens of millions of dollars in damages – plus untold reputational damage to one of the world’s most trusted airlines. All due to a catastroph…
Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed
Via a live example, show how to backup a database, simulate a failure backup the tail of the database transaction log and perform the restore.

770 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question