• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1780

MSSQL 2005 - non-XML format file errors - overflow - CSV into existing SQL table

I'm attempting to use BULK INSERT, as noted below, to import a 135 MB CSV file into a SQL table. I'm getting the following errors:

-------------------------------------------------------------------------------------------------------------------------
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 1, column 2 (SS_ID).
Msg 4867, Level 16, State 1, Line 1

     My format file calls for importing two numeric columns from the CSV data file, SS_ID and FQDN_ID. The column types are assigned as Integer, and FQDN_ID is the PK for the table.

     The import only errors on the fields where the SS_ID is larger than 4 digits. I thought an INT data type was 12 char capable? In the sample data, it would error on rows 1-5, but not 6 and 7. What am I missing in my format file to make this work? I've tinkered, rather extensively, and can't seem to find what I'm missing.

Here is the import query
-------------------------------------------------------------------------------------------------------------------------
BULK INSERT imshostnamestemp
   FROM 'd:\importreports\scripts\imsextracts\old\Hostnames 20061128 09_10-1164723052.csv'
   WITH ( FORMATFILE = 'd:\importreports\scripts\bcp\imshostname3.fmt',
          FIRSTROW = 1,
          BATCHSIZE = 50000 );

Format File
-------------------------------------------------------------------------------------------------------------------------
9.0
7
1    SQLCHAR    0    0     "\""        0    DummyField1            SQL_Latin1_General_CP1_CI_AS
2    SQLINT     0    12    "\",\""     1    SS_ID                  SQL_Latin1_General_CP1_CI_AS
3    SQLCHAR    0    0     "\",\""     2    Asset_Name             SQL_Latin1_General_CP1_CI_AS
4    SQLINT     0    12    "\",\""     3    FQDN_ID                SQL_Latin1_General_CP1_CI_AS
5    SQLCHAR    0    0     "\",\""     4    Asset_Alias_Hostname   SQL_Latin1_General_CP1_CI_AS
6    SQLCHAR    0    0     "\",\""     5    Description            SQL_Latin1_General_CP1_CI_AS
7    SQLCHAR    0    0     "\",\""     0    DummyProduct           SQL_Latin1_General_CP1_CI_AS
8    SQLCHAR    0    0     "\",\""     0    DummyState             SQL_Latin1_General_CP1_CI_AS
9    SQLCHAR    0    0     "\"\r\n"    0    Dummyfield2            SQL_Latin1_General_CP1_CI_AS


Sample Data
    - comma separated; however, every value is quote-surrounded, whether it's text or not. The field terminators account for this: the first quote is treated as a field of its own so it can be ignored, then the real data follows, and the last two columns are ignored.

-------------------------------------------------------------------------------------------------------------------------
"1","ino0a662","41062","ino0a662.svr.bank.net","FQDN","SERVER","PROD"
"13","cmhtu505","16992","cmhtu505.tivoli.bank.net","FQDN","SERVER","DISPOSED"
"16","cmhtu730","17002","cmhtu730.tivoli.bank.net","FQDN","SERVER","DISPOSED"
"23","dfwtu202","22942","dfwtu202.tivoli.bank.net","FQDN","SERVER","DECOM"
"52","sdftu400","77555","sdftu400.tivoli.bank.net","FQDN","SERVER","DECOM"
"56","banconewh","4116","banconewh.dc.bank.net","FQDN","SERVER","DISPOSED"
"57","bankone-net","4117","bankone-net.dc.bank.net","FQDN","SERVER","DISPOSED"


Asked by: jiriki76

Anthony Perkins Commented:
>>I thought an INT data type was 12 char capable?<<
Not quite: an integer goes from -2,147,483,648 to 2,147,483,647.
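
For the record, those bounds are easy to sanity-check from any query window (a throwaway snippet, not part of the original thread):

SELECT CAST('2147483647' AS int) AS max_int;  -- succeeds: the upper bound of int
SELECT CAST('2147483648' AS int) AS too_big;  -- fails: one past the bound raises a conversion overflow error
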
jiriki76 (Author) Commented:
So why is it coughing up an overflow on numeric data larger than 9999, instead of 2,147,483,647? I simply don't follow. Should I try changing the destination table column to bigint maybe? I don't understand, but would it solve the problem?

I import this same file to the table every day, about 400,000 rows of data, but I'm doing it via VBScript and ADO. I'm hoping to increase performance significantly, but I know the data is fine with respect to the table, as I can import the same file in 10 minutes via VBScript/ADO.

Thoughts?
Anthony Perkins Commented:
Perhaps you are confusing the T-SQL integer data type (-2^31 to 2^31 - 1) with the VBScript Integer data type (-2^15 to 2^15 - 1).
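
The 16-bit behavior is easy to reproduce in T-SQL using smallint, which shares the -2^15 to 2^15 - 1 range of a VBScript Integer (illustrative only):

SELECT CAST(32767 AS smallint) AS fits;      -- 2^15 - 1, the upper bound of a 16-bit integer
SELECT CAST(32768 AS smallint) AS overflow;  -- fails: arithmetic overflow
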
jiriki76 (Author) Commented:
Well, confusion or not, the sample data I provided is well within the constraints of both T-SQL and VB, right?

Here's the destination table structure. Given the table structure below, why does the import fail on an overflow?

CREATE TABLE [dbo].[IMSHostnamesTEMP](
      [SS_ID] [int] NULL,
      [Asset_Name] [nvarchar](1000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
      [FQDN_ID] [int] NOT NULL,
      [Asset_Alias_Hostname] [nvarchar](1000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
      [Description] [nvarchar](4000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
      [dummyfield1] [nvarchar](100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
      [dummyfield2] [nvarchar](100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
      [dummyproduct] [nvarchar](100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
      [dummystate] [nvarchar](100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
)
Anthony Perkins Commented:
You do realize that your table exceeds the allowed row lenght of 806 bytes, right?
Anthony Perkins Commented:
Let's try that again:
You do realize that your table exceeds the maximum allowed row length of 8060 bytes, right?  It has 12,844 bytes.
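
For anyone who wants to check the arithmetic, the declared widths can be summed from the catalog views (a minimal sketch against the table defined above; note sys.columns reports nvarchar(n) as 2*n bytes):

SELECT SUM(c.max_length) AS declared_row_bytes
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.IMSHostnamesTEMP');
-- 4 + 2000 + 4 + 2000 + 8000 + 4*200 = 12,808 bytes for the columns alone;
-- with row overhead that is the ~12,844 quoted above, far past the 8,060-byte in-row limit.
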
Anthony Perkins Commented:
When you change the number of columns from 7 to the correct number, 5, it will import successfully (at least it did for me). Just understand that if the data row width exceeds 8,060 bytes it will fail.
Anthony Perkins Commented:
Or to clarify (and sorry for the multiple posts): you have chosen to ignore columns 1, 7, 8 and 9 in your format file, which means you should have a total of 5 columns left, not 7.
jiriki76 (Author) Commented:
Would it be a true statement to say that the second line of the format file indicates the number of columns in the source that will be sent to the destination? I understood it to be simply the number of columns in the source, period. OK... changed, but still getting the overflows... although they've moved...

Regarding the data length issues, I'll respond in a separate post, so as not to make this more confusing.

Using "5" for  # of columns
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 1, column 2 (SS_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 2, column 2 (SS_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 3, column 2 (SS_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 4, column 2 (SS_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 5, column 2 (SS_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 6, column 4 (FQDN_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 7, column 2 (SS_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 8, column 2 (SS_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 9, column 2 (SS_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 10, column 2 (SS_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 11, column 4 (FQDN_ID).
Msg 4865, Level 16, State 1, Line 1
Cannot bulk load because the maximum number of errors (10) was exceeded.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
(0 row(s) affected)

Using "7" for the # of columns
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 1, column 2 (SS_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 2, column 4 (FQDN_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 3, column 4 (FQDN_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 4, column 4 (FQDN_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 5, column 4 (FQDN_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 6, column 4 (FQDN_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 9, column 4 (FQDN_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 14, column 4 (FQDN_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 16, column 4 (FQDN_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 17, column 4 (FQDN_ID).
Msg 4867, Level 16, State 1, Line 1
Bulk load data conversion error (overflow) for row 18, column 4 (FQDN_ID).
Msg 4865, Level 16, State 1, Line 1
Cannot bulk load because the maximum number of errors (10) was exceeded.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
(0 row(s) affected)

And the first 11 rows of the current file that it worked through...
---------
"00001","ino0a662","4594","beltu100","ALTERNATE HOSTNAME","SERVER","PROD"
"1","ino0a662","41062","ino0a662.svr.bank.net","FQDN","SERVER","PROD"
"13","cmhtu505","16992","cmhtu505.tivoli.bank.net","FQDN","SERVER","DISPOSED"
"16","cmhtu730","17002","cmhtu730.tivoli.bank.net","FQDN","SERVER","DISPOSED"
"23","dfwtu202","22942","dfwtu202.tivoli.bank.net","FQDN","SERVER","DECOM"
"52","sdftu400","77555","sdftu400.tivoli.bank.net","FQDN","SERVER","DECOM"
"56","banconewh","4116","banconewh.dc.bank.net","FQDN","SERVER","DISPOSED"
"57","bankone-net","4117","bankone-net.dc.bank.net","FQDN","SERVER","DISPOSED"
"58","cmhapp003","15701","cmhapp003.bank.net","FQDN","SERVER","DISPOSED"
"60","bospayweb01","5266","bospayweb01.dc.bank.net","FQDN","SERVER","DECOM"
"61","cascryprd","7894","cascryprd.ad.bank.net","FQDN","SERVER","PROD"

And of course, the format file as it stands now..
9.0
7
1    SQLCHAR    0    0     "\""        0    DummyField1            SQL_Latin1_General_CP1_CI_AS
2    SQLINT     0    12    "\",\""     1    SS_ID                  ""
3    SQLCHAR    0    0     "\",\""     2    Asset_Name             SQL_Latin1_General_CP1_CI_AS
4    SQLINT     0    12    "\",\""     3    FQDN_ID                ""
5    SQLCHAR    0    0     "\",\""     4    Asset_Alias_Hostname   SQL_Latin1_General_CP1_CI_AS
6    SQLCHAR    0    0     "\",\""     5    Description            SQL_Latin1_General_CP1_CI_AS
7    SQLCHAR    0    0     "\",\""     0    DummyProduct           SQL_Latin1_General_CP1_CI_AS
8    SQLCHAR    0    0     "\",\""     0    DummyState             SQL_Latin1_General_CP1_CI_AS
9    SQLCHAR    0    0     "\"\r\n"    0    Dummyfield2            SQL_Latin1_General_CP1_CI_AS
jiriki76 (Author) Commented:
Regarding data column width, I'm not understanding what I'm reading, I guess... I found this in Books Online regarding T-SQL and length, scale, and precision.

http://msdn2.microsoft.com/en-us/library/ms190476.aspx
Length for a numeric data type is the number of bytes that are used to store the number. Length for a character string or Unicode data type is the number of characters. The length for binary, varbinary, and image data types is the number of bytes. For example, an int data type can hold 10 digits, is stored in 4 bytes, and does not accept decimal points. The int data type has a precision of 10, a length of 4, and a scale of 0.

So, in my situation, a value of, say, 15701 should import properly into an INT data type, because it's less than 10 digits? However, the last sentence is what confuses me: length of 4... the preceding sentence indicates that's 4 bytes. Yet after adjusting my format file to 5 columns, the two FQDN_ID column failures were for values starting with the digit 7. All of the other FQDN_ID values in those first 11 lines had a leading digit less than 7. Why is that an issue? And why were the SS_ID values in rows 6 and 11 fine?

Befuddled... still reading, trying to find the error of my ways...
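
A quick way to see the length-versus-precision distinction the BOL passage is drawing (an editor's check, not output from the thread):

SELECT DATALENGTH(CAST(15701 AS int)) AS bytes_stored;  -- always 4 for an int, regardless of digit count
-- Length = 4 bytes of storage; precision = up to 10 digits of value. 15701 uses 5 of those 10 digits.
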
jiriki76 (Author) Commented:
What befuddles me the most is that the data value 15701 can be manually placed into an INT field at any point in the table. Why is it giving me so much grief importing it?

Oh, and that 7 really is a 5; I accidentally pasted it with the 7 when I was gathering the error information to paste it... I promise, I really did change it to 5...
Anthony Perkins Commented:
Try it this way:

9.0
8
1      SQLCHAR      0      0      "\""      0      DummyField      SQL_Latin1_General_CP1_CI_AS
2      SQLCHAR      0      12      "\",\""      1      SS_ID      SQL_Latin1_General_CP1_CI_AS
3      SQLCHAR      0      0      "\",\""      2      Asset_Name      SQL_Latin1_General_CP1_CI_AS
4      SQLCHAR      0      12      "\",\""      3      FQDN_ID      SQL_Latin1_General_CP1_CI_AS
5      SQLCHAR      0      0      "\",\""      4      Asset_Alias_Hostname      SQL_Latin1_General_CP1_CI_AS
6      SQLCHAR      0      0      "\",\""      5      Description      SQL_Latin1_General_CP1_CI_AS
7      SQLCHAR      0      0      "\",\""      0      DummyProduct      SQL_Latin1_General_CP1_CI_AS
8      SQLCHAR      0      0      "\"\r\n"      0      DummyState      SQL_Latin1_General_CP1_CI_AS
Anthony Perkins Commented:
Let me know if you are not following what I am doing in the above format.

>>Would it be a true statement to say that the second line of the format file indicates the number of columns in the source that will be sent to the destination?<<
That is correct.  I misread.

Regarding the host file data type (the second column in the format file), this is what BOL has to say on the subject:

"Host file data type: Data type stored in the particular field of the data file. With ASCII data files, use SQLCHAR; for native format data files, use default data types."
jiriki76 (Author) Commented:
Working...  I saw that before, but I guess I was making another mistake, because it was complaining then too.

One adjustment was to remove the \r\n from the last field terminator, and it worked fine until it got to the bottom of the file. Here's what I got:

Msg 4832, Level 16, State 1, Line 1
Bulk load: An unexpected end of file was encountered in the data file.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

I presume something at the end of my CSV file is fishy, maybe a hidden ASCII char or something. Wondering if I shouldn't use VBScript to clean up the end of the file first, and then let it upload via BULK INSERT? Here is the final format file I'm using:

9.0
8
1     SQLCHAR     0     0     "\""     0     DummyField     SQL_Latin1_General_CP1_CI_AS
2     SQLCHAR     0     12     "\",\""     1     SS_ID     SQL_Latin1_General_CP1_CI_AS
3     SQLCHAR     0     0     "\",\""     2     Asset_Name     SQL_Latin1_General_CP1_CI_AS
4     SQLCHAR     0     12     "\",\""     3     FQDN_ID     SQL_Latin1_General_CP1_CI_AS
5     SQLCHAR     0     0     "\",\""     4     Asset_Alias_Hostname     SQL_Latin1_General_CP1_CI_AS
6     SQLCHAR     0     0     "\",\""     5     Description     SQL_Latin1_General_CP1_CI_AS
7     SQLCHAR     0     0     "\",\""     0     DummyProduct     SQL_Latin1_General_CP1_CI_AS
8     SQLCHAR     0     0     "\""     0     DummyState     SQL_Latin1_General_CP1_CI_AS


I changed the terminator for the 8th column, because it would give me a "column is too long in the data file for row x, column 8" error. So I think that's what's causing the issue: it calls it a new row as soon as it gets to the last quote, yet at the end of the file, it doesn't understand what to do with the CR/LF. Gotta figure out what's really there.
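
If the tail of the file keeps tripping the load, one option worth a try (hedged: ERRORFILE and MAXERRORS are documented BULK INSERT arguments, but this was not tested against this particular file, and the .err path is hypothetical) is to let the load tolerate a few bad rows and capture them for inspection:

BULK INSERT imshostnamestemp
   FROM 'd:\importreports\scripts\imsextracts\old\Hostnames 20061128 09_10-1164723052.csv'
   WITH ( FORMATFILE = 'd:\importreports\scripts\bcp\imshostname3.fmt',
          FIRSTROW = 1,
          BATCHSIZE = 50000,
          MAXERRORS = 10,                                              -- tolerate up to 10 bad rows
          ERRORFILE = 'd:\importreports\scripts\bcp\hostnames.err' );  -- rejected rows are written here

The contents of the error file would show exactly what the loader saw at the end of the file.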

Thank you so much for the assistance here... It imported 400,000 rows in about 9 seconds, versus the 15 minutes it was before.
Anthony Perkins Commented:
Please do not overlook that your table design is seriously flawed, as it exceeds the maximum row width of 8,060 bytes. This could come back to haunt you.
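
If that warning is taken up, one hedged way back under the limit (the narrower widths below are illustrative assumptions, not values from the thread; pick sizes that fit the real data) is simply to declare the columns realistically:

CREATE TABLE [dbo].[IMSHostnamesTEMP](
      [SS_ID] [int] NULL,
      [Asset_Name] [nvarchar](255) NULL,            -- was 1000; assumed enough for a hostname
      [FQDN_ID] [int] NOT NULL,
      [Asset_Alias_Hostname] [nvarchar](255) NULL,  -- was 1000
      [Description] [nvarchar](1000) NULL,          -- was 4000
      [dummyfield1] [nvarchar](100) NULL,
      [dummyfield2] [nvarchar](100) NULL,
      [dummyproduct] [nvarchar](100) NULL,
      [dummystate] [nvarchar](100) NULL
);
-- declared row size drops to roughly 3,828 bytes, comfortably inside the 8,060-byte limit.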