SQL*Loader: How to strip out CR/LF in one column

rowek asked
Last Modified: 2013-12-18
We have a very simple need and have received a lot of conflicting advice.  The simple question is "How do I remove a CR or LF from one column contained in a CSV text INFILE?"   I want to do this inside the SQL*Loader 10g CTL file below.  The rest of this process has run for four months as is, but we now want to remove these characters from the COMMENTS column so that its data can be loaded as well.

Note that MS Access is the source database; we run a TransferText to produce the CSV text file.  CR/LF is legal in MS Access but not in Oracle, so we need to get SQL*Loader to strip these out.  Must be done here, not upstream.  Thanks!

LOAD DATA
INFILE      'Refer to app.config file'
BADFILE     'Refer to app.config file'
DISCARDFILE 'Refer to app.config file'
APPEND
INTO TABLE dairy.insp
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(
  insp_ID INTEGER EXTERNAL,
  Insp_Empl_ID CHAR,
.
. <some detail removed>
.
  SA_Date_Hazard_Analysis_Issue  "to_date(:SA_Date_Hazard_Analysis_Issue,'MM/DD/YYYY HH24:MI:SS')",
  SA_Date_HACCP_Plan_Issue        "to_date(:SA_Date_HACCP_Plan_Issue,'MM/DD/YYYY HH24:MI:SS')",
  SA_Date_Prereq_Programs_issue  "to_date(:SA_Date_Prereq_Programs_issue,'MM/DD/YYYY HH24:MI:SS')",
  Insp_Order  INTEGER EXTERNAL,
  COMMENTS CHAR,   -- <=========== Need to remove CR/LFs from this column
  CREATE_DATE "to_date(:CREATE_DATE,'MM/DD/YYYY HH24:MI:SS')",
  CREATE_USER CHAR,
  MODIFY_DATE "to_date(:MODIFY_DATE,'MM/DD/YYYY HH24:MI:SS')",
  MODIFY_USER CHAR,
  PROGRAM_USE CHAR,
  Upload_Date     "to_date(:Upload_Date ,'MM/DD/YYYY HH24:MI:SS')",
  Press_Insp  CHAR
)

Commented:
If your records are *not* delimited by CR/LF then you can do a simple replace

 COMMENTS CHAR "REPLACE(:COMMENTS, CHR(13)||CHR(10), ' ')",

Author

Commented:
Actually I think I will need an OR.
My fault, but it can be either a CR or an LF or both.   I will look at the text file with a hex editor and post more detail, but if you could help me with an OR situation I should be able to award points.  Thanks!

Commented:
In that case, use translate

COMMENTS CHAR "TRANSLATE(:COMMENTS, CHR(13)||CHR(10), '  ')",   -- to replace each CR or LF with a white space

Again, the important factor is that you don't use newlines as record delimiters. If you do, then this method will not work and you will most likely have to either modify the export process from Access or do some preprocessing on the CSV file.
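
For readers who want the characters removed outright rather than replaced by spaces, here is a minimal sketch of the same idea written as an entry for the control file's field list. It is a variant under stated assumptions, not something posted in this thread: the CHAR(4000) length is hypothetical (as far as I know, a delimited CHAR field without an explicit length is limited to 255 bytes), and the leading 'x' is only a placeholder so that TRANSLATE has a non-empty replacement string. The caveat above still applies: this only helps when the stray CR/LF characters are not also the record terminator.

```sql
  -- Hypothetical variant of the COMMENTS entry in the field list.
  -- TRANSLATE maps 'x' to 'x' and deletes CR (CHR(13)) and LF (CHR(10)),
  -- because those two have no counterpart in the replacement string.
  -- CHAR(4000) is an assumed maximum length, not taken from the original file.
  COMMENTS CHAR(4000) "TRANSLATE(:COMMENTS, 'x'||CHR(13)||CHR(10), 'x')",
```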

Author

Commented:
Okay, this may sound silly, but what is the best way to tell if that is the case?  Use a hex editor to look at the file?  MS Access runs on Windows XP and creates a standard CSV text file.  If I do have newlines, what else can I do? The change must take place in this process, as the export process cannot be changed. Thanks.  Testing now.

Author

Commented:
I just did extensive testing.   The text file is dependent on the CRLF at the end of each row to terminate the row.   Looks like the REPLACE and TRANSLATE will not work.  Is there another way to handle this INSIDE of the SQL*Loader Control file?  Thank you for your efforts so far.

Commented:
To the best of my knowledge, no. What I've done in situations like this was to run a small app that would pre-process the file and remove the newlines inside the text field.

Can you completely bypass SQL loader and just write an app that connects to the Access database, pulls the data and inserts it into Oracle?

Author

Commented:
I would love to write that small app, but the situation has gone political and additional programming is not allowed.  If we could do it in the CTL file then that would be okay, but no new code.  The solution we decided on was to use an aliasing query on the export in order to replace the COMMENTS column contents with NULL.  Not elegant, but they do not really care about that column for their application.  I guess if there was a silver bullet they would go for it, but the project is out of funding and we need to close it out quickly.  
I did confirm with Visual Notepad++ that there are CRLFs at the end of each row and in the records that are getting thrown to the BAD file.

Thanks again for your help.
CERTIFIED EXPERT
Top Expert 2005

Commented:
rowek,
   The problem is that SQL*Loader can't tell that you are still on the same row.  The CR/LF looks like a new row.  Try taking out the OPTIONALLY in the "OPTIONALLY ENCLOSED BY" and see if that allows it to scan multiple records.  I'll look and see if I can find an example of multiple physical records becoming one logical record.

Good luck!
CERTIFIED EXPERT
Top Expert 2005

Commented:
rowek,
   OK, I think this is the right path.  Put the OPTIONALLY back in if you have any data not enclosed by quotes (like INSP_ID, I'm guessing).  Then add this as the next line:

CONTINUEIF LAST PRESERVE != ''''

As long as the last column in the record is a string (and enclosed by a quote), I think this will work.  Here's where I got it from (10g, but not much changed in this area): http://download.oracle.com/docs/cd/B19306_01/server.102/b14215/ldr_control_file.htm#i1005509

Good luck!
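
Since the control file in the question encloses fields in double quotes rather than single quotes, a variant of this suggestion might test for a trailing double quote instead. The fragment below is an untested sketch under several assumptions: every logical record ends with a quoted, non-empty last field (here Press_Insp), no line break inside COMMENTS happens to follow a double quote, and it is acceptable that the pieces are joined without the line break between them. The placement of the clause (after the load method, before INTO TABLE) follows the layout used in Oracle's documented control-file examples.

```sql
LOAD DATA
INFILE      'Refer to app.config file'
BADFILE     'Refer to app.config file'
DISCARDFILE 'Refer to app.config file'
APPEND
-- Keep appending physical lines to the same logical record until a line's
-- last non-blank character is a double quote. PRESERVE is kept from the
-- suggestion above.
CONTINUEIF LAST PRESERVE != '"'
INTO TABLE dairy.insp
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(
  insp_ID INTEGER EXTERNAL,
  Insp_Empl_ID CHAR,
  -- remaining fields unchanged from the original control file
  Press_Insp  CHAR
)
```

If the last field can be empty or unquoted, the test on the final character will misfire and records will be merged incorrectly, so it is worth checking the BAD file on a small sample first.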

Author

Commented:
DrSQL: the politics have won out on this one.  We elected to drop the column, but I will do this: I will try your suggestion tomorrow and if it works I will award points.  I want to leave a solution.  gatorvip was very close, but we needed the CRLF at the end of the stream.  Must run now, late for an appointment, but will try in the morning.  Thank you both for trying. Keith
CERTIFIED EXPERT
Top Expert 2005

Commented:
awking00,
    From what I gather, the problem is that the content of the data file might look like:

1,'FRED','I have only myself to blame.<CR>
<LF>If only I had looked right when I stepped off the curb on my visit to England.'

And SQL*Loader sees that as two records, when it should be one.  I could be wrong, but that's the problem I was trying to solve for rowek.

Good luck!

Author

Commented:
DrSQL, that's exactly the issue: rows are being "broken" sometimes two or three times.  Sorry I cannot test at the moment, must run payroll for my troops.  I hope to confirm your technique later today.  Cheers.
awking00, Information Technology Specialist
CERTIFIED EXPERT

Commented:
Thanks, DrSQL
Now, I see the problem. I gather it only gets worse if the comments data also contains commas.

Author

Commented:
awking00, do you still think your method is worth pursuing?  I did reveal your code and it appears to only strip out those characters for columns that I pass to it.  This means I would not harm the CRLF needed at the end of each row.  Will test after payroll run.
awking00, Information Technology Specialist
CERTIFIED EXPERT

Commented:
I would say it's worth a try.

Author

Commented:
My data has been changed.  The PM asked us to put an "NR" in the COMMENTS field so we could close out the project. If I can locate a copy of the old data (the one with the CRLFs in it) then I will see this through.  It's important for me to know, and for the other folks who are searching for a solution like I was.

Keith

Commented:
>>CONTINUEIF LAST PRESERVE != ''''

This might work. Last year I tried to get it to run in an external table but was unable to do so, then I took a different approach.

>>Create a function to remove cr/lf, then only apply it to the comments column.

I don't think that's going to work, because by the time the COMMENTS field is passed to the function, the record has already been parsed by SQL*Loader. If you test the function separately it will work, of course, but not when it's invoked from the control file.

>> It's important for me to know and for the other folks that are searching out a solution like I was.

I'm definitely interested to see if you're able to find a viable solution here.
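
For anyone who wants to try the function idea quoted above, here is a minimal sketch of what such a function could look like. It is not awking00's actual code, which does not appear in this thread; the name strip_crlf and its parameter name are hypothetical. As noted, it can only clean up CR/LF characters that are still inside the field after SQL*Loader has already split the file into records.

```sql
-- Hypothetical helper; create it where the loading user can execute it.
CREATE OR REPLACE FUNCTION strip_crlf (p_text IN VARCHAR2)
  RETURN VARCHAR2
IS
BEGIN
  -- Delete every CR (CHR(13)) and LF (CHR(10)); the 'x' placeholder keeps
  -- the replacement string non-empty so TRANSLATE does not return NULL.
  RETURN TRANSLATE(p_text, 'x' || CHR(13) || CHR(10), 'x');
END strip_crlf;
/

-- Assumed usage in the control file's field list (CHAR(4000) is an assumed length):
--   COMMENTS CHAR(4000) "strip_crlf(:COMMENTS)",
```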

Author

Commented:
I am out all this week for surgery...it may be a while.  Thank you both for trying hard.

Author

Commented:
Good news, I may be able to locate an old copy of the data and test out this suggestion:
<<<awking00:Create a function to remove cr/lf, then only apply it to the comments column.>>>
Will know in the morning. I want to know just as badly as you guys.  This is a real weakness of Oracle SQL*Loader, IMHO.

Author

Commented:
Thank you for your patience on this one, surgery had me down for a while.  I gave you an excellent for effort.  The solution is partial.  It seems to work well when only one CRLF is found in the COMMENTS field.  If multiple CRLFs are found then I get "second enclosure string not present".  In the BAD file I see a double quote to start the string but no ending double quote.  Thank you again for the solution and the patience.  Keith
