Solved

script to insert multiple line breaks in clob oracle

Posted on 2014-12-12
14
746 Views
Last Modified: 2014-12-21
I have a table with a clob column and some of the records need to have line breaks inserted. Due to the large number of records involved, doing so manually would take too long. Is there a way to insert multiple line breaks into each clob field?

When I say multiple line breaks I don't mean multiple line breaks all together; I mean multiple line breaks spread out among the content of the clob field
0
Comment
Question by:corgano
  • 6
  • 6
  • 2
14 Comments
 
LVL 20

Expert Comment

by:flow01
ID: 40497787
Where does the need come from?  Problems with a maximum linesize , problems with  a pl/sql varchar2 maximum size of 32767 ?  Something else ?
Is it purely technical and does it not matter if a line break splits a word, if it does matter what should be the rules for inserting.
Should you be able to distinguish between  inserted breaks and breaks originaly existing?
Is the content nice formatted XMLtype or something else ?
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 40498345
Other questions:
What is the largest CLOB length you are dealing with?
I assume you just don't want to insert the line feeds randomly.  
Are you looking to do a replace?
If not, how do you know where the line breaks go?
0
 

Author Comment

by:corgano
ID: 40498528
The need is based on a customer of ours who placed a large amount of data (32k+ characters) in their knowledgebase with no line breaks. Whiel theire older product ignored this, the newer version enforces certain limitations on the number of characters before a line break is needed.

So I am trying to see if there is a way to insert a line break every so many characters in the club field to resolve the issue as the customer has possibly hundreds of such records and is not likely to want to manually updated each one.

My opinion is that if it is possible to resolve this at the db layer, I am not concerned if the line break cuts a word apart. If the customers agents are told about it or notice it they can manually resolve that.

Either inserting the link breaks every xx number of characters would be good, as I am not sure how useful a replace would be without an anchor character to focus on.

The content is formatted xml btw
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 40499334
>>The content is formatted xml btw

You cannot break XML every X characters or you can end up with invalid XML.  You might break it in the middle of a node name and break the XML.

You would need to write code that went every X characters then looked for a text node prior to that to insert a break.  And also make sure that text node was 'OK' to break.

If you inserted a line break in something like a phone number or date would almost surly break an application that was looking for a specific format.

That would be a LOT of work and like any written code, prone to bugs which could break the XML.

I would suggest the app be changed to deal with CLOBs.

Changing the data to fit issues in a poorly designed application will only lead to trouble.
0
 

Author Comment

by:corgano
ID: 40499510
Making the change in code is not an option; doing so would break a number of other things; that decision was already made.

What if the data was moved into a VARCHAR table then the insertion was done?
0
 
LVL 76

Accepted Solution

by:
slightwv (䄆 Netminder) earned 500 total points
ID: 40500260
Varchar2 outside of pl/sql is limited to a max of 4000 characters.  Possibly less depending on the database's character set.  Even if you store the CLOB 4000 characters at a time, you'll still have the problem of no line breaks when you put it all back together to get a valid XML doc.

Looks like you'll need to write some pretty specialized code to find a text node somewhere around where you want to insert the line break, insert the line break and repeat for the length of the CLOB while not generating bad XML or splitting a column like a date or phone number that the app requires to be in a specific format.

I wish you well with that.  It will require a LOT of testing and will still likely break something once executed in production.
0
 

Author Comment

by:corgano
ID: 40509052
One followup question if you don't mind - what if the content in question were formatted as html instead of xml?

Would it be possible to insert <p></p> to break up the long content (cosmetic issues aside)?
0
Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

 
LVL 20

Expert Comment

by:flow01
ID: 40509959
That would not change your problem of not having a line break.
And to split in paragraphs the html should have the long text in paragraphs already and i would expect an insert of  </p><p>

<html><p>longtext</p></html>
to
<html><p>textpart</p><p>textpart</p><p>textpart</p></html>

...  and all remarks about testing and breaking  of slightwv would still be true
(I did make a package to split a clob  to get some feeling about it : and it does not feel to good. But I wil add the source to give an idea of a possible  approach)

Whatever you do:  be sure to save your original data first.
split-xmlclob.pck.sql
0
 

Author Comment

by:corgano
ID: 40509981
Thanks - much appreciated
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 40510157
HTML these days should be XML compliant so the previous comments stand.

Sounds like you only want to change the format of a specific field not just every X characters in the entire document.  That can change things.

If you specifically know what nodes/tags you want to go after, you might be able to parse those and use UPDATEXML and some custom code to take specific nodes and insert paragraph tags of CR/LF characters.
0
 

Author Comment

by:corgano
ID: 40510352
in order to resolve the issue I am dealing with I need to insert line breaks or paragraph tags into the clob after a certain number of characters
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 40511148
I understand that.  As we have posted you cannot take the entire XML/HTML CLOB and randomly insert characters every 'X' characters.

What I was talking about is if there was a specific node/field in the XML/HTML that you need to break up, that may be possible.

To say for sue you will need to provide A LOT more detail about the specific requirements and how the application processes the CLOB.
0
 

Author Comment

by:corgano
ID: 40511616
The content of the clob is the content of a knowledge article. In earlier versions of our product the product allowed for parsing lines in excess of 6k characters, but a design change in our new version limits the parsing to around 6k. If the line has more than 6k characters then problems occur when trying to search for it.

Due to the nature of the design change it cannot be regressed, and we only have one customer that is experience this issue with approx. 2500 articles. We do know that breaking up the long lines into less than 6k lines resolves the issue, so due to the number of articles we are for a way to modify the affected articles by inserting line break/paragraph tags. We considered trying to use a replace cunction to insert the tags right after each period, but put that option aside due to the number of breaks that might get inserts (would differ between clobs)
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 40511933
A little better but still doesn't help a lot.

Is that 6k in a single XML node of 6k total characters into the XML no matter how many tags there is?

Can you have:
<html>
<p>Hello</p>
<br/>
--6000 more br tags
<p>World</p>
...
</html>

Open in new window


So breaking 6k into it will break the html.

Or do you have:
<html>
<p>
--the entire text of the Bible
</p>
</html>

Open in new window


So breaking on a word around 6k will not break the html.

Since you are 'searching' based on the breaks, you really cannot break mid-word.  If the searching you are doing allows for terms and phrases, breaking the text into chunks can cause missed searches.
0

Featured Post

What Security Threats Are You Missing?

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

Join & Write a Comment

This article describes some very basic things about SQL Server filegroups.
Never store passwords in plain text or just their hash: it seems a no-brainier, but there are still plenty of people doing that. I present the why and how on this subject, offering my own real life solution that you can implement right away, bringin…
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
This video shows how to copy an entire tablespace from one database to another database using Transportable Tablespace functionality.

706 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

18 Experts available now in Live!

Get 1:1 Help Now