REA_ANDREW
asked on
Regular Expression
Hi,
I am new to regular expressions and need to extract part of some HTML. Do carriage returns matter? For example:
<div id="subnavmaroon">
<ul>
<li class="picborder"><a href="#">Agriculture</a></li>
<li class="picborder"><a href="#">Mining & Exploration</a></li>
<li><a href="#">Petroleum</a></li>
</ul>
</div>
I require a regular expression which will give me
<li class="picborder"><a href="#">Agriculture</a></li>
<li class="picborder"><a href="#">Mining & Exploration</a></li>
<li><a href="#">Petroleum</a></li>
So, in effect, the innerHTML of the UL tag.
Thanks in advance
Andrew
ASKER
Thank you for getting back to me. I need to replace what it finds with nothing. It is searching a file, so the match will span multiple lines.
Thanks
Andrew
Yeah, just a small adaptation then:
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

namespace ConsoleApplication1 {
    class Program {
        static void Main(string[] args) {
            string input = @"
This is some text to be preserved
<div id=""subnavmaroon"">
<ul>
<li class=""picborder""><a href=""#"">Agriculture</a></li>
<li class=""picborder""><a href=""#"">Mining & Exploration</a></li>
<li><a href=""#"">Petroleum</a></li>
</ul>
</div>
This was after the emptied div";

            string result = Regex.Replace(
                input,
                @"<div id=""subnavmaroon"">\s*<ul>(.*?)</ul>\s*</div>",
                @"<div id=""subnavmaroon""></ul></div>",
                RegexOptions.Singleline
            );
            Console.WriteLine(result);
            Console.ReadLine();
        }
    }
}
--
Outputs:
--
This is some text to be preserved
<div id="subnavmaroon"></ul></div>
This was after the emptied div
Oops... You'll probably want to throw in a <ul> start tag into the string that is the third argument to Regex.Replace. Of course, if you want to wipe the div as well, just make the third argument an empty string.
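To make the two variants just described concrete, here is a minimal sketch. The class and method names (SubnavCleaner, EmptyList, RemoveDiv) are made up for illustration, and it assumes the markup looks exactly like the sample: the same id attribute, no extra attributes, and no nested <ul> inside the list.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
static class SubnavCleaner
{
    // Same pattern as in the program above. \s* also swallows \r and \n,
    // and RegexOptions.Singleline lets '.' match across line breaks.
    const string Pattern = @"<div id=""subnavmaroon"">\s*<ul>(.*?)</ul>\s*</div>";

    // Variant 1: keep the div and leave a well-formed, empty <ul></ul> inside it.
    public static string EmptyList(string input)
    {
        return Regex.Replace(
            input,
            Pattern,
            @"<div id=""subnavmaroon""><ul></ul></div>",
            RegexOptions.Singleline);
    }

    // Variant 2: wipe the div as well by replacing the match with an empty string.
    public static string RemoveDiv(string input)
    {
        return Regex.Replace(input, Pattern, string.Empty, RegexOptions.Singleline);
    }

    static void Main()
    {
        string html = "before\r\n<div id=\"subnavmaroon\">\r\n<ul>\r\n<li>x</li>\r\n</ul>\r\n</div>\r\nafter";
        Console.WriteLine(EmptyList(html));
        Console.WriteLine(RemoveDiv(html));
    }
}
```

Note that RemoveDiv leaves the line breaks that surrounded the div in place, so an empty line remains where the block used to be.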
ASKER
<div id="subnavmaroon">
<ul>
and ending
</ul>
</div>
I need to return anything in between.
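If the goal is to get that inner text back rather than replace it, one possible approach is to reuse the same pattern with Regex.Match and read the first capture group. This is only a sketch: SubnavExtractor and InnerHtml are hypothetical names, and it assumes the markup matches the sample exactly (a regex is brittle against attribute changes or a nested <ul>).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
static class SubnavExtractor
{
    // Returns whatever sits between <ul> and </ul> inside the subnavmaroon div,
    // or null when the markup is not found.
    public static string InnerHtml(string input)
    {
        Match m = Regex.Match(
            input,
            @"<div id=""subnavmaroon"">\s*<ul>(.*?)</ul>\s*</div>",
            RegexOptions.Singleline); // '.' must match \r and \n for a multi-line capture

        // Groups[1] is the (.*?) capture; Trim drops the line breaks around it.
        return m.Success ? m.Groups[1].Value.Trim() : null;
    }

    static void Main()
    {
        string html = "<div id=\"subnavmaroon\">\r\n<ul>\r\n<li><a href=\"#\">Petroleum</a></li>\r\n</ul>\r\n</div>";
        Console.WriteLine(InnerHtml(html)); // prints <li><a href="#">Petroleum</a></li>
    }
}
```

On the carriage return question: line breaks only matter here because '.' does not match \n by default. With RegexOptions.Singleline it does, and \s already matches both \r and \n.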