Solved

Regular Expression

Posted on 2006-10-30
Last Modified: 2010-04-16
Hi,

I am new to regular expressions and need to extract part of an HTML document. Do carriage returns matter? For example:

<div id="subnavmaroon">
<ul>
<li class="picborder"><a href="#">Agriculture</a></li>
<li class="picborder"><a href="#">Mining &amp; Exploration</a></li>
<li><a href="#">Petroleum</a></li>
</ul>
</div>

I need a regular expression that will give me:

<li class="picborder"><a href="#">Agriculture</a></li>
<li class="picborder"><a href="#">Mining &amp; Exploration</a></li>
<li><a href="#">Petroleum</a></li>

So, the innerHTML of the UL tag.

Thanks in advance

Andrew
Question by:REA_ANDREW
5 Comments
 
LVL 20

Author Comment

by:REA_ANDREW
ID: 17836915
So, starting with

<div id="subnavmaroon">
<ul>

and ending with

</ul>
</div>

I need to return anything in between.
 
LVL 6

Accepted Solution

by:
der_jth earned 500 total points
ID: 17837114
Match m = Regex.Match(input, @"<div id=""subnavmaroon"">\s*<ul>(.*?)</ul>\s*</div>", RegexOptions.Singleline);
string result = m.Groups[1].Value;

Let me know if you have any issues with this.
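
On the carriage-return question: in .NET, `.` matches every character except `\n` unless RegexOptions.Singleline is set, so it is the line feed rather than the carriage return that stops the match. The following is a minimal sketch of that difference, assuming the HTML is already held in a string; the class name SinglelineDemo and the helper ExtractInner are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class SinglelineDemo {
  // Same pattern as in the answer above.
  const string Pattern = @"<div id=""subnavmaroon"">\s*<ul>(.*?)</ul>\s*</div>";

  // Hypothetical helper: returns the trimmed content of the <ul>,
  // or null when the pattern does not match.
  public static string ExtractInner(string html, RegexOptions options) {
    Match m = Regex.Match(html, Pattern, options);
    return m.Success ? m.Groups[1].Value.Trim() : null;
  }

  static void Main() {
    // Shortened sample input with Windows-style \r\n line endings.
    string input =
      "<div id=\"subnavmaroon\">\r\n<ul>\r\n" +
      "<li><a href=\"#\">Petroleum</a></li>\r\n" +
      "</ul>\r\n</div>";

    // Default options: '.' cannot cross '\n', so there is no match.
    Console.WriteLine(ExtractInner(input, RegexOptions.None) == null);

    // Singleline: '.' also matches '\n', so the list items are captured.
    Console.WriteLine(ExtractInner(input, RegexOptions.Singleline));
  }
}
```

The `\s*` parts of the pattern already absorb the line breaks between the tags, so Singleline is only needed for the `(.*?)` group that spans the list items.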
 
LVL 20

Author Comment

by:REA_ANDREW
ID: 17837256
Thank you for getting back to me. I need to replace what it finds with nothing. It is searching a file, so the match will span multiple lines.

Thanks

Andrew
 
LVL 6

Expert Comment

by:der_jth
ID: 17837317
Yeah, just a small adaptation then:

using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

namespace ConsoleApplication1 {
  class Program {
    static void Main(string[] args) {

      string input = @"
This is some text to be preserved
<div id=""subnavmaroon"">
<ul>
<li class=""picborder""><a href=""#"">Agriculture</a></li>
<li class=""picborder""><a href=""#"">Mining &amp; Exploration</a></li>
<li><a href=""#"">Petroleum</a></li>
</ul>
</div>
This was after the emptied div";

      string result = Regex.Replace(
        input,
        @"<div id=""subnavmaroon"">\s*<ul>(.*?)</ul>\s*</div>",
        @"<div id=""subnavmaroon""></ul></div>",
        RegexOptions.Singleline
      );

      Console.WriteLine(result);
      Console.ReadLine();
    }
  }
}

--

Outputs:

--
This is some text to be preserved
<div id="subnavmaroon"></ul></div>
This was after the emptied div
 
LVL 6

Expert Comment

by:der_jth
ID: 17837326
Oops... You'll probably want to add a <ul> start tag to the string that is the third argument to Regex.Replace. Of course, if you want to wipe the div as well, just make the third argument an empty string.
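
The two options just described can be sketched as follows. This is only an illustration under the same assumptions as the program above (input held in a string, same pattern); the class name ReplaceVariants and the methods EmptyList and RemoveDiv are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class ReplaceVariants {
  const string Pattern = @"<div id=""subnavmaroon"">\s*<ul>(.*?)</ul>\s*</div>";

  // Option 1: keep the div and leave a well-formed, empty list inside it.
  public static string EmptyList(string html) {
    return Regex.Replace(
      html,
      Pattern,
      @"<div id=""subnavmaroon""><ul></ul></div>",
      RegexOptions.Singleline);
  }

  // Option 2: remove the div and everything inside it.
  public static string RemoveDiv(string html) {
    return Regex.Replace(html, Pattern, "", RegexOptions.Singleline);
  }

  static void Main() {
    string input =
      "before\n<div id=\"subnavmaroon\">\n<ul>\n" +
      "<li>Petroleum</li>\n</ul>\n</div>\nafter";

    Console.WriteLine(EmptyList(input));
    Console.WriteLine("--");
    Console.WriteLine(RemoveDiv(input));
  }
}
```

Note that the pattern does not consume the line breaks around the div, so removing the whole div leaves an empty line where it used to be.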