Solved

Wrap text of script tags into CDATA

Posted on 2014-03-12
Last Modified: 2014-03-13
Hello all

I'm trying to wrap the inner text of script tags in a CDATA section.
I already have the following code (which I copied from another site), but it only partially works:
Regex regScriptCDATA = new Regex("(<script[^<>]*>)([^<>]+)(<\\/script>)");
MatchCollection m = regScriptCDATA.Matches(strHtml);
String strScriptCDATA = "$1" + "//<![CDATA[\n" + "$2" + "//]]>\n" + "$3";
strHtml = regScriptCDATA.Replace(strHtml, strScriptCDATA);

My problem is that if the text contains < or > it won't wrap.

The background is that I have a website which (still) needs to be XHTML compatible. It's an ASP.NET application with hundreds of user controls that have script inside their markup, only sometimes wrapped in CDATA.

Instead of modifying each usercontrol and putting CDATA into these scripts I was thinking of modifying the HTML by replacing the script tags which don't have CDATA before rendering the site.

Using regular expressions would be preferable.
Question by:Albert Van Halen
4 Comments
 
Expert Comment

by:Terry Woods
ID: 39924751
Can you please give an example of the input, the output you're currently getting, and the output you want to get?
 
Author Comment

by:Albert Van Halen
ID: 39925745
Hi Terry

The input I have at the moment looks like this:
<script type="text/javascript">
function test() {
var x = 0;
if(x > 0)
    alert('test');
}
</script>
<script type="text/javascript">
function test2() {
var x = 0;
if(x == 0)
    alert('test');
}
</script>
<script type="text/javascript">
//<![CDATA[
function test3() {
var x = 0;
if(x > 0)
    alert('test');
}
//]]>
</script>

The output I'm getting is this:
<script type="text/javascript">
function test() {
var x = 0;
if(x > 0)
    alert('test');
}
</script>
<script type="text/javascript">
//<![CDATA[
function test2() {
var x = 0;
if(x == 0)
    alert('test');
}
//]]>
</script>
<script type="text/javascript">
//<![CDATA[
function test3() {
var x = 0;
if(x > 0)
    alert('test');
}
//]]>
</script>

Note that the second code block is wrapped in CDATA but the first is not. The third code block was already wrapped in CDATA, so that's OK.

The first code block isn't wrapped in CDATA because the inner text of the script node contains a '>' character.

Basically I want to have a regex searching for script tags which do not contain CDATA.

I hope this is clear for you.
Thanks in advance !
 
Accepted Solution

by:
Terry Woods earned 2000 total points
ID: 39925877
Try this pattern, using a negative lookahead to check each character between the script tags:

(<script[^<>]*>)((?:(?!CDATA|<\/script>)[\w\W])+)(<\/script>)

Also, rather than using [^<>]+ for the script body, which stops at the first < or > inside the script (the > in "x > 0" in your first block) and so never reaches the closing tag, I've used the negative lookahead to ensure we don't go past the closing script tag. This also means that other tags can be contained within the script tags.
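
In case it helps, here is a minimal sketch of how the pattern might be dropped into the replace call from the question. The class name ScriptCdataWrapper and the method name Wrap are made up for illustration; the replacement string is the one from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (names are illustrative, not from the original post).
public static class ScriptCdataWrapper
{
    // Group 1 = opening <script ...> tag, group 2 = script body, group 3 = closing tag.
    // The negative lookahead is checked before every body character, so the body
    // can never contain "CDATA" and can never run past "</script>".
    private static readonly Regex ScriptWithoutCdata = new Regex(
        @"(<script[^<>]*>)((?:(?!CDATA|<\/script>)[\w\W])+)(<\/script>)");

    // Wraps the body of every script block that does not already mention CDATA.
    public static string Wrap(string html)
    {
        // Same replacement string as in the question: $1, $2, $3 are the three groups.
        return ScriptWithoutCdata.Replace(html, "$1//<![CDATA[\n$2//]]>\n$3");
    }
}
```

It would be called as strHtml = ScriptCdataWrapper.Wrap(strHtml); before rendering. Two things to keep in mind: a script whose body contains the literal text CDATA anywhere (even inside a string) is skipped, and the match is case-sensitive, so an upper-case SCRIPT tag would need RegexOptions.IgnoreCase.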

It seems to work on myregextester.com.

I used [\w\W] to match any one character (including newlines), but if you activate single-line mode (RegexOptions.Singleline in .NET) you could just use . instead.
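
A small sketch of the difference, assuming the same pattern with . in place of [\w\W]. The class and method names are hypothetical and only there to show the effect of the option.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class (names are illustrative).
public static class SinglelineDemo
{
    // Same pattern as above, but using '.' for "any character".
    private const string Pattern =
        @"(<script[^<>]*>)((?:(?!CDATA|<\/script>).)+)(<\/script>)";

    // Default options: '.' does not match '\n', so a multi-line script body cannot match.
    public static bool MatchesWithDefaultOptions(string html)
    {
        return Regex.IsMatch(html, Pattern);
    }

    // Singleline: '.' matches every character including '\n'.
    public static bool MatchesWithSingleline(string html)
    {
        return Regex.IsMatch(html, Pattern, RegexOptions.Singleline);
    }
}
```

For a script block whose body spans several lines, the first method should return false and the second true, which is why the option (or [\w\W]) is needed here.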
 
Author Closing Comment

by:Albert Van Halen
ID: 39925981
Excellent, this is exactly what I want. Thanks !!