  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 346

.NET C# - Read HTML to find links with a specific class

Hi,

I have written a simple .NET C# application that downloads the source code of an HTML web page and reads the stream line by line.

The HTML contains many links like this:
<a class="org" href="/business/linkaddress">Link Text</a>

I need to extract the link address (e.g. "/business/linkaddress") for all links that have class="org" and put them into an array.

Can I do this with Regex?

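using System;
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

string target = @"HTTP ADDRESS";

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(target);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();

string resultline;
int i = 0;

using (Stream responseStream = response.GetResponseStream())
using (StreamReader htmlStream = new StreamReader(responseStream, Encoding.UTF8))
{
    // Braces are required here; without them only the first statement belongs to the loop.
    while ((resultline = htmlStream.ReadLine()) != null)
    {
        i = i + 1;
        ClearCurrentConsoleLine(); // helper defined elsewhere in the application
        Console.WriteLine(i);

        // "XXXXX" is a placeholder for the pattern I am asking about.
        if (Regex.IsMatch(resultline, "XXXXX"))
        {
            Console.WriteLine(resultline.Trim(new char[] { ' ', '\t' }));
        }
    }
}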
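
For illustration, here is a minimal sketch of the kind of extraction described above. It is not taken from the accepted solution. It assumes every matching link is written exactly like the sample: class before href, both attributes in double quotes, and a class value of exactly "org". The names OrgLinkExtractor and ExtractOrgLinks are hypothetical. A pattern like this will miss links whose attributes are reordered or quoted differently, so treat it as a starting point only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OrgLinkExtractor
{
    // Assumption: attributes appear as class="org" followed by href="...",
    // both double-quoted, as in the sample link from the question.
    private static readonly Regex OrgLink = new Regex(
        "<a\\s+class=\"org\"\\s+href=\"([^\"]*)\"",
        RegexOptions.IgnoreCase);

    // Returns the href value of every matching link found in the given text.
    // The text can be a single line or the whole page source.
    public static List<string> ExtractOrgLinks(string html)
    {
        var links = new List<string>();
        foreach (Match m in OrgLink.Matches(html))
        {
            // Group 1 is the captured href value.
            links.Add(m.Groups[1].Value);
        }
        return links;
    }
}
```

Inside the read loop, each line could be passed to OrgLinkExtractor.ExtractOrgLinks(resultline) and the results collected with AddRange on a single List<string>; calling ToArray() on that list at the end gives the array.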

Asked by: mhdi

1 Solution

dejaanbu commented:
