Solved

Regular expression: pick up the last occurrence

Posted on 2006-11-21
Last Modified: 2010-04-16
Hi,

How do I pick up the last occurrence of a matching string with a regular expression? For example, I have the string:

<a href=url1><a href=url2>......<end>

I want to pick up the last URL before "<end>".

I use the regular expression pattern "<a href=(.+?)><end>", but it gives me "url1><a href=url2>......".
Question by:yeshengl
 
LVL 96

Expert Comment

by:Bob Learned
ID: 17998952
I find that parsing HTML is far easier than using regular expressions.

Here is a previous VB.NET question that highlights what I mean:

  http://www.experts-exchange.com/Programming/Programming_Languages/Dot_Net/VB_DOT_NET/Q_21767542.html

I have equivalent C# code.

Bob
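If a regular expression is still preferred for the string in the question, here is a minimal sketch of one way to do it. The lazy "(.+?)" does not help there because the engine still starts at the leftmost "<a href=" and only then extends to the right. Putting a greedy ".*" in front pushes the start of the anchor as far right as possible, and "[^>]+" keeps the capture inside a single tag. The names LastUrlFinder and FindLastUrl are made up for this example, and the pattern assumes unquoted href values exactly as in the sample string.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the code posted in this thread.
public static class LastUrlFinder
{
    // Greedy ".*" consumes as much as possible, so the "<a href=" that follows
    // is the last one that still has "<end>" somewhere after it.
    // "[^>]+" stops the capture at the end of the tag.
    // Singleline lets "." match newlines in case the input spans several lines.
    private static readonly Regex LastAnchor = new Regex(
        @".*<a href=([^>]+)>.*?<end>",
        RegexOptions.Singleline);

    // Returns the href of the last anchor before "<end>", or null if nothing matches.
    public static string FindLastUrl(string input)
    {
        Match m = LastAnchor.Match(input);
        return m.Success ? m.Groups[1].Value : null;
    }
}

public static class Program
{
    public static void Main()
    {
        string input = "<a href=url1><a href=url2>......<end>";
        Console.WriteLine(LastUrlFinder.FindLastUrl(input)); // prints "url2"
    }
}
```

This only covers the simple markup shown in the question. Quoted attributes, extra attributes before href, or mixed case would need a more tolerant pattern, which is where a real HTML parser tends to pay off.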
 
LVL 15

Expert Comment

by:ozymandias
ID: 17999850
Bob, could you post that C# code anyway?
I'd like to have a play.
I can post a separate Q if you like...topic area of your choice :)
Thanks.
 
LVL 96

Accepted Solution

by:
Bob Learned earned 50 total points
ID: 18169812
Sorry, I lost track of this one :(

using System;
using System.Collections;
using System.Threading;
using System.Runtime.InteropServices;
using System.Windows.Forms;

public class HtmlAnchor
{
  public string HRef = "";
  public string Class = "";
  public string Text = "";
}

public class HtmlImage
{
  public string Src = "";
}

public class HtmlDocument
{
  private ArrayList _anchors = new ArrayList();
  private ArrayList _images = new ArrayList();
 
  [ComImport(), Guid("0000010c-0000-0000-C000-000000000046"), InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
  interface IPersist
  {

    void GetClassID(ref Guid pClassId);
  }
  [ComImport(), Guid("7FD52380-4E07-101B-AE2D-08002B2EC713"), InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
  interface IPersistStreamInit : IPersist
  {

    new void GetClassID(ref Guid pClassId);

    [PreserveSig()]
    int IsDirty();

    void Load(UCOMIStream pStm);

    void Save(UCOMIStream pStm, [MarshalAs(UnmanagedType.Bool)] bool fClearDirty);

    void GetMaxSize(ref long pCbSize);

    void InitNew();
  }
  private mshtml.HTMLDocument m_document;
  private string m_url = "";

  public HtmlDocument(string url)
  {
    m_url = url;
    Thread thread = new Thread(new ThreadStart(StartGetDocument));
    thread.Start();
    while (m_document == null || m_document.readyState != "complete")
    {
      Application.DoEvents();
    }
    this.FindAnchors(m_document);
  }

  private void StartGetDocument()
  {
    mshtml.HTMLDocument doc = new mshtml.HTMLDocument();
    IPersistStreamInit ips = (IPersistStreamInit)doc;
    ips.InitNew();
    m_document = (mshtml.HTMLDocument)doc.createDocumentFromUrl(m_url, "\0");
  }

  public HtmlDocument(mshtml.HTMLDocument document)
  {
    this.FindAnchors(document);
  }

  private void FindAnchors(mshtml.HTMLDocument document)
  {
    // Select by tag name; getElementsByName would match the "name" attribute instead.
    foreach (mshtml.HTMLAnchorElementClass element in document.getElementsByTagName("a"))
    {
      HtmlAnchor anchor = new HtmlAnchor();
      anchor.HRef = GetAttribute(element, "href");
      anchor.Class = GetAttribute(element, "class");
      anchor.Text = element.innerText;
      _anchors.Add(anchor);
    }
  }

  private void FindImages(mshtml.HTMLDocument document)
  {
    // Select by tag name; getElementsByName would match the "name" attribute instead.
    foreach (mshtml.HTMLImgClass element in document.getElementsByTagName("img"))
    {
      HtmlImage image = new HtmlImage();
      image.Src = GetAttribute(element, "src");
      _images.Add(image);
    }
  }

  private string GetAttribute(mshtml.IHTMLElement element, string attribName)
  {
    if (element.getAttribute(attribName, 0) != null)
    {
      return element.getAttribute(attribName, 0).ToString();
    }
    return "";
  }

  public HtmlAnchor[] Anchors
  {
    get
    {
      return (HtmlAnchor[])(_anchors.ToArray(typeof(HtmlAnchor)));
    }
  }
}

Bob
 
LVL 15

Expert Comment

by:ozymandias
ID: 18363488
Thanks for that code, btw :)