• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 287
  • Last Modified:

SQL - Processing text files without CR

Hi,
can you please help.
I am loading and processing the contents of text files into a table.
However, some of my files have LF instead of CR.

I've been told that if I open the file in notpad and then save it, it will resolve the problem.

Has anyone come across a similar problem and if so a little guidance on how to do this would be really appreciated.

Regards

0
Eric Harris
Asked:
Eric Harris
  • 8
  • 5
1 Solution
 
Richard QuadlingSenior Software DeveloperCommented:
What are you using to load the data?

I use
 TRUNCATE TABLE TripleCDataConversion.dbo.Addresses
 BULK INSERT TripleCDataConversion.dbo.Addresses
 FROM 'D:\CCC2SQL\AD.TAB'
 WITH
  (
  FIELDTERMINATOR = '\t',
  KEEPIDENTITY,
  KEEPNULLS,
  LASTROW =               3313,
  ORDER (UniqueID ASC)
  )

Open in new window


The FIELDTERMINATOR is the tab character.

By default the ROWTERMINATOR is '\r\n' (http://msdn.microsoft.com/en-us/library/ms188365.aspx).

If you added or used ROWTERMINATOR = '\n', then that should all work for you.
0
 
Richard QuadlingSenior Software DeveloperCommented:
http://msdn.microsoft.com/en-us/library/ms191485.aspx disagrees with the previous link.

Here it says "\n" is the default ROWTERMINATOR.

Either way, play around with these to find what matches your data.
0
 
dqmqCommented:
Different platforms use conventions for marking a new line in text files  For example, Unix uses LF and Windows uses CRLF.  Some older systems use CR, but I think it's pretty rare today.

It's generally best to convert to the native format when the file is first loaded onto the platform.  FTP will do so, provided the transfer is done in ASCII or character mode rather than binary mode.  Free commands utilities are also available that can do the conversion for most platforms.

Some text editors are capable of translating different standards and even translating to the "correct" standard during a save operation. Notepad, I kind of doubt--you may have better luck with Wordpad a command line tool designed for the purpose.
 
0
Cloud Class® Course: Microsoft Azure 2017

Azure has a changed a lot since it was originally introduce by adding new services and features. Do you know everything you need to about Azure? This course will teach you about the Azure App Service, monitoring and application insights, DevOps, and Team Services.

 
Eric HarrisDeveloperAuthor Commented:
Hi There
Thanks for this it looked great to me.
However, although the syntax looks fine it doesn't seem to do anything.
I've tried /t /n and /r and nothing seems to have any effect.

here's my code

  EXEC ('bulk INSERT temp_EPOS_Ord_File FROM "' + 'V:\ewh\Website\Live\ErroredFiles\' + @filename + '" WITH (FIELDTERMINATOR = ' + '''' + '\n' + '''' + ', KEEPIDENTITY, KEEPNULLS, LASTROW = 3313)')

Any ideas
0
 
Richard QuadlingSenior Software DeveloperCommented:
ROWTERMINATOR.

Not FIELDTERMINATOR.

Different purposes.

column FIELDTERMINATOR column FIELDTERMINATOR column ROWTERMINATOR
0
 
Eric HarrisDeveloperAuthor Commented:
oops.
Cheers but still not joy

  EXEC ('bulk INSERT temp_EPOS_Ord_File FROM "' + 'V:\ewh\Website\Live\ErroredFiles\' + @filename + '" WITH (ROWTERMINATOR = ' + '''' + '\n' + '''' + ', KEEPIDENTITY, KEEPNULLS, LASTROW = 3313)')
0
 
Richard QuadlingSenior Software DeveloperCommented:
Using the following code
create table FamilyNames (Name varchar(50), Age smallint)
bulk insert FamilyNames from 'D:\FamilyNamesCR.csv' with (CODEPAGE='RAW', DATAFILETYPE='CHAR', FIELDTERMINATOR='\t', ROWTERMINATOR='\r')
bulk insert FamilyNames from 'D:\FamilyNamesLF.csv' with (CODEPAGE='RAW', DATAFILETYPE='CHAR', FIELDTERMINATOR='\t', ROWTERMINATOR='\n')
bulk insert FamilyNames from 'D:\FamilyNamesCRLF.csv' with (CODEPAGE='RAW', DATAFILETYPE='CHAR', FIELDTERMINATOR='\t', ROWTERMINATOR='\r\n')
select * from FamilyNames
drop table FamilyNames

Open in new window


And 3 different CSV files (Windows CRLF, Unix LF and Mac CR).

I'm getting ...

(5 row(s) affected)
Msg 4864, Level 16, State 1, Line 3
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 2 (Age).
Msg 4864, Level 16, State 1, Line 4
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 2 (Age).

(5 row(s) affected)

Open in new window


So 5 rows imported from the first file using the \r terminator, but that's it.

Changing the terminator to be used for the third file to \n (rather than the correct \r\n) and that file is imported.

Argh! http://msdn.microsoft.com/en-us/library/ms191485.aspx
Using Row Terminators
The row terminator can be the same character as the terminator for the last field. Generally, however, a distinct row terminator is useful. For example, to produce tabular output, terminate the last field in each row with the newline character (\n) and all other fields with the tab character (\t). To place each data record on its own line in the data file, specify the combination \r\n as the row terminator.

Note:  
When you use bcp interactively and specify \n (newline) as the row terminator, bcp automatically prefixes it with a \r (carriage return) character, which results in a row terminator of \r\n.

I SEEM to be getting this error using BULK INSERT.

Seems like MS SQL won't let you import data from a unix machine.

But no. Example C at http://msdn.microsoft.com/en-us/library/ms188365.aspx

C. Using line feed as a row terminator
The following example imports a file that uses the line feed as a row terminator such as a UNIX output:
DECLARE @bulk_cmd varchar(1000)
SET @bulk_cmd = 'BULK INSERT AdventureWorks2008R2.Sales.SalesOrderDetail
FROM ''<drive>:\<path>\<filename>'' 
WITH (ROWTERMINATOR = '''+CHAR(10)+''')'
EXEC(@bulk_cmd)

Open in new window

FamilyNames.zip
0
 
Richard QuadlingSenior Software DeveloperCommented:
Here you go. Using the same zip files as previously supplied.

create table FamilyNames (Name varchar(50), Age smallint)

-- Successful statement to import data where the row terminator is CR \r
bulk insert FamilyNames from 'D:\FamilyNamesCR.csv' with (FIELDTERMINATOR='\t', ROWTERMINATOR='\r')


-- Successful statement to import data where the row terminator is LF \n
DECLARE @bulk_cmd varchar(1000)
SET @bulk_cmd = 'bulk insert FamilyNames from ''D:\FamilyNamesLF.csv'' with (FIELDTERMINATOR=''\t'', ROWTERMINATOR='''+CHAR(10)+''')'
EXEC(@bulk_cmd)


-- Successful statement to import data where the row terminator is CRLF \r\n
bulk insert FamilyNames from 'D:\FamilyNamesCRLF.csv' with (FIELDTERMINATOR='\t', ROWTERMINATOR='\n')

-- Show the lengths to show that no trailing odd characters were added
select *, LEN(name) from FamilyNames order by Name

drop table FamilyNames

Open in new window


outputs
Name	Age	(No column name)
Hesper	8	6
Hesper	8	6
Hesper	8	6
Oscar	3	5
Oscar	3	5
Oscar	3	5
Richard	43	7
Richard	43	7
Richard	43	7
Sally	37	5
Sally	37	5
Sally	37	5
Thomas	6	6
Thomas	6	6
Thomas	6	6

Open in new window

with message of
(5 row(s) affected)

(5 row(s) affected)

(5 row(s) affected)

(15 row(s) affected)

Open in new window

0
 
Eric HarrisDeveloperAuthor Commented:
Thanks for this you are being really helpful.
I can see the theory is absolutely correct but it still doesn't seem to work.

I don't have a field delimiter. I'm inserting the record as a single string.
I've tried 9,10 and 13 but still no success.

The files are being sent to me by a third party so I have no control over format. i'm told that they are in 'linex' format so it should need the /n (char(10)) options you provided.
It would be nice if they were consistent. It appears some are line terminited and some CR.
(just my luck)

All I know if that if I open the file in notepad and then save it, the terminator corrects iteaslf and all is OK withthe file.
Obviously I can't do this in batch mode.

I will keep trying. Thanks again for your help.
0
 
Richard QuadlingSenior Software DeveloperCommented:
OK. So you need only to define the ROWTERMINATOR.

Can you zip a file and send it to me.

As you are on windows, can you try the following.

1 - Open a command prompt
2 - Change directory to the location of the file.
3 - Type in the following command
edit /078 <filename>

Open in new window


That will give a screen along the lines of An example screen showing the binary mode, wrapped to 78 characters.
If you look carefully, moving the cursor to the end of the word "UPDATE" (the file I'm showing is a saved SQL statement), the next character is musical note. It has a value (look in the bottom right corner of the screen) "Line:1   Col:7   Value:13". Next to it "Line:1   Col:8   Value:10" and finally, some Value:9.

This is telling me that the lines terminator with CR(13 / 0x0D) LF (10 / 0x0A).

The 9's are tab characters for the next line.

The actual text looks like
UPDATE
                                CONTRACTS.dbo.POP_HEADER
                        SET
                                POH_RECHARGE_DATETIME = GetDate(), -- Now
                                POH_RECHARGE_STATUS = 6 -- Recharged/NoDamage/Ignore
                        WHERE
                                POH_PRIMARY = 1269460;

Open in new window


Once you can confirm how the lines are terminated, then we can see why you are not getting the desired results.

How are you executing the SQL?

0
 
Richard QuadlingSenior Software DeveloperCommented:
If you look through the rest of the file, you will see the same 13 / 10 pairing for each line ending.
0
 
Eric HarrisDeveloperAuthor Commented:
Well thanks to you it is now working.
I did everything you said which confirmed that it was indeed a LF being used as a row terminator.
So I went bak to your last solution and actually found out that it was the wat I was displaying the data to check the results.

In fact it works fine

Firstly I must apologies for not picking this up before your last post but must must also add that I really benefitted from the extra posts you made. That's some really useful stuff
0
 
Eric HarrisDeveloperAuthor Commented:
What an abolutely brilliant response.
some much useful informaton in here
0
 
Richard QuadlingSenior Software DeveloperCommented:
Your question made me examine a LOT of things I took for granted. That is one of the joys of Experts Exchange. I may know a lot, but I don't know everything and every question can be a learning opportunity for both of us.

Really pleased you got it working and very chuffed with your comments.

I just read your profile. If you fancy a chat, drop me a line. My e-addy is on my profile.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

  • 8
  • 5
Tackle projects and never again get stuck behind a technical roadblock.
Join Now