.NET C# - Read HTML to find links with a specific class

Hi,

I have written a simple .NET C# application that downloads the source of an HTML web page and reads the stream line by line.

The HTML contains many links like this:
<a class="org" href="/business/linkaddress">Link Text</a>

I need to extract the link address (e.g. "/business/linkaddress") for all links that have class="org" and put them into an array.

Can I do this with Regex?

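// Requires: System, System.IO, System.Net, System.Text, System.Text.RegularExpressions
string target = @"HTTP ADDRESS";

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(target);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();

string resultline;
int i = 0;

using (Stream responseStream = response.GetResponseStream())
using (StreamReader htmlStream = new StreamReader(responseStream, Encoding.UTF8))
{
    // Braces are needed here: without them only the first statement belongs to the loop.
    while ((resultline = htmlStream.ReadLine()) != null)
    {
        i = i + 1;
        ClearCurrentConsoleLine();   // the application's own helper (not shown)
        Console.WriteLine(i);

        // "XXXXX" is a placeholder for the pattern I am asking about
        if (Regex.IsMatch(resultline, "XXXXX"))
        {
            Console.WriteLine(resultline.Trim(new char[] { ' ', '\t' }));
        }
    }
}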

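For reference, here is a minimal sketch of one way the "XXXXX" placeholder could be filled in. It is not a verified solution from this thread. The class name OrgLinkExtractor and the method ExtractOrgLinks are hypothetical names used only for illustration. The pattern assumes that the attributes are double-quoted, that class is exactly "org", and that class appears before href inside the tag, as in the sample link above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class OrgLinkExtractor
{
    // Matches an <a ...> tag containing class="org" followed later in the
    // same tag by href="..."; the href value is captured in the "url" group.
    // [^>]*? keeps the match inside a single tag.
    private static readonly Regex OrgLink = new Regex(
        @"<a\s+[^>]*?class\s*=\s*""org""[^>]*?href\s*=\s*""(?<url>[^""]*)""",
        RegexOptions.IgnoreCase);

    // Returns the href values of all matching links found in the given text.
    public static string[] ExtractOrgLinks(string html)
    {
        var links = new List<string>();
        foreach (Match m in OrgLink.Matches(html))
        {
            links.Add(m.Groups["url"].Value);
        }
        return links.ToArray();
    }
}
```

This could be called once per line inside the existing loop, e.g. ExtractOrgLinks(resultline), adding the results to a shared list, or once on the whole page via htmlStream.ReadToEnd(). Calling it on the whole page also catches a tag that is split across two lines. A regex like this is fragile against attribute order, single quotes, and multiple class names, so for anything beyond a known, regular page an HTML parser (for example the third-party HtmlAgilityPack library) is probably the safer choice.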

Asked by: mhdi