Link to home
Start Free TrialLog in
Avatar of andre_st
andre_st

asked on

How to import a csv-file, when some fields contain line brakes, into columns in Excel?

I need to import a large CSV file (exported from win 2000 MSSQL-server), which contains of fields like this.

"title","text","ID"
"this is the title","And this is a lot 
of text.

<b>This should be in bold</b>

And the text ends here for this entry","334"
"title number 2","and this title also has
linebrakes in it...

<i>which is causing me</i>a headache,"897"

Open in new window


etc.

I need to get this into Excel, with each data field in one column. But it seems that if there are linebrakes in the file, it all gets messed up - and the . I thought that was the reason why you can configure Excel to use the double quotation character as a text delimiter during import, but that doesn't help when it comes to the linebrakes.

I've been testing all different types of delimiters, using comma,tab, semicolon - and single and double quotation marks as text delimiters. But nothing works so far...

How do I go about this?

The reason I need to set this up like this in Excel, is because I need to add some data to each entry. Therefore I thought it would be simplest to import it into delimited columns, and then just add the new columns - and export it again as a new csv-file.

So if you happen to know a way how to add for example the data field: "additional data" between "title" and "text" in the whole document (without the need of using columns in the first place), then that would solve my problem to!
SOLUTION
Avatar of Superdave
Superdave
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of andre_st
andre_st

ASKER

Thanks for your reply. I had seen the settings for CRLF/LF when exporting - but I didn't know what they meant. So thanks for pointing that out. But it still doesn't work. I've once again tried all different delimiters , this time with an LF-exported file from MSSQL.

I began to wonder if the exported file actually was in the LF-format, as I saw no difference at all after importing to Excel. So I found a third party tool to convert CRLF to LF. And after I used it I could at least se difference when opening the file in notepad - as there were no linebrakes at all. But when importing to Excel, the fields still get all messed up as before. I can't really understand how Excel divides the data into cells/columns. One problem seems to be though that it is also dividing the data based on the font tags for bold and italic <i>, <b>. But it also still seems to be dividing based on linebrakes. I don't know how Linebrakes/Line Feeds actually work. But if I for example copy the text from textfile I have - where in Notepad I can see no line brakes at all, and copy it to MS Word, all line brake show up again.

I am thinking that it is probably s simpler solution for me to find out how to insert a datafield into the document, without having to sort this column problem out. I guess this could be done in Excel as well. To make it find out for example the fourth semicolon of each entry, and add specific data there. Do you know how to do that?
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I couldnt do that with the third party tool I was using at first. But I found out it can easily be done with Notepad++

In there you can chose to view all characters, including CR/LF by:

View > Show Symbol > Show all characters.

Then you can use the search and replace function. Enable the "extended search mode", and search for "\r\n", and replace with <br />. (it is not possible to search for "CRLF".

Worked like a charm, and now I can import to excel, with all the fields in the right cells/columns.

Thanks for your help guys