.NET C# - Read HTML to find links with a specific class

Posted on 2012-03-10
Last Modified: 2012-03-11
Hi,

I have written a simple .NET C# application that downloads the source of an HTML web page and reads the response stream line by line.

The HTML contains many links like this:
<a class="org" href="/business/linkaddress">Link Text</a>

I need to extract the link address (e.g. "/business/linkaddress") from every link that has class="org" and put the addresses into an array.

Can I do this with Regex?

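// Requires: using System; using System.IO; using System.Net;
//           using System.Text; using System.Text.RegularExpressions;
string target = @"HTTP ADDRESS";

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(target);

string resultline;
int i = 0;

// Dispose the response and both streams once reading is finished.
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (Stream responseStream = response.GetResponseStream())
using (StreamReader htmlStream = new StreamReader(responseStream, Encoding.UTF8))
{
    // Without braces only the first statement belongs to the loop.
    while ((resultline = htmlStream.ReadLine()) != null)
    {
        i = i + 1;
        ClearCurrentConsoleLine(); // helper from the application, not shown in this post
        Console.WriteLine(i);

        // "XXXXX" is the placeholder for the pattern being asked about.
        if (Regex.IsMatch(resultline, "XXXXX"))
        {
            Console.WriteLine(resultline.Trim(new char[] { ' ', '\t' }));
        }
    }
}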
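
For the regex question above, here is a minimal sketch of one possible pattern. It is an illustration only, not taken from the accepted answer. The names OrgLinkExtractor and ExtractOrgLinks are hypothetical. The pattern assumes double-quoted attributes, a class value that is exactly "org", and an opening <a> tag that fits on one line (or in one string). A lookahead checks for the class, so class and href may appear in either order.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: the class and method names are illustrative only.
static class OrgLinkExtractor
{
    // Matches an opening <a ...> tag that contains class="org" anywhere in the
    // tag (checked by the lookahead) and captures the double-quoted href value.
    private static readonly Regex OrgLink = new Regex(
        @"<a\b(?=[^>]*\bclass\s*=\s*""org"")[^>]*\bhref\s*=\s*""(?<href>[^""]*)""",
        RegexOptions.IgnoreCase);

    // Returns the href of every matching link found in the given text,
    // in the order the links appear.
    public static string[] ExtractOrgLinks(string html)
    {
        var links = new List<string>();
        foreach (Match m in OrgLink.Matches(html))
        {
            links.Add(m.Groups["href"].Value);
        }
        return links.ToArray();
    }
}
```

Inside the read loop this could be called as OrgLinkExtractor.ExtractOrgLinks(resultline), with the results appended to a List<string> that is converted to an array at the end. Regex matching on HTML is fragile: single-quoted attributes, multiple class names, or tags split across lines would all be missed by this pattern, and an HTML parser is usually the more robust choice for such pages.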

Question by:mhdi
Accepted Solution

by:
dejaanbu earned 2000 total points
ID: 37706314