[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

regex html div

Posted on 2006-10-23
3
Medium Priority
?
711 Views
Last Modified: 2009-12-16
I have read a html file into one long string.  

I need a regex that gets me everything between <div id="change"> and its matching closing </div>.

There is also the possibility of there being other DIV's inbetween this one.
0
Comment
Question by:cophi
  • 2
3 Comments
 
LVL 6

Expert Comment

by:VovinE
ID: 17790188
This is not possible with regular expressions.
If there were no closing div's inside your div, the regular expression would look something liket this:

"<div id=\"change\">[A-Za-z<>/\\"' \t\n\r]*?</div>"

If you need to match against closing div's also, then regular expression is not the way to get it :)
0
 
LVL 22

Expert Comment

by:_TAD_
ID: 17790435

HTML, if properly coded, is really just a series of XML fields.

Try reading your html file as if it were XML data and then use xPath navigation to find the div tag with the proper attribute.
0
 
LVL 6

Accepted Solution

by:
VovinE earned 2000 total points
ID: 17826740
Using regular expressions you might write a parser that does this for you.

Here is sample parser which does what you want:

    public class DivExtractor
    {
        private Regex regex;
        public DivExtractor()
        {
            regex = new Regex("(<div([^>]*)>)|(</div>)|(.)?", RegexOptions.IgnoreCase);
        }

        public string GetDiv(string content)
        {
            string[] r = GetDivs(content);
            if (r.Length > 0)
                return r[0];
            else
                return "";

        }
        public string[] GetDivs(string content)
        {
            extractions = new List<int>();
            level = 0;
            regex.Replace(content, DivMatched);
            List<string> results = new List<string>();
            int i = 0;
            while (i < extractions.Count-1)
            {
                int start = extractions[i++];
                int len = extractions[i++] - start;
                results.Add(content.Substring(start, len));
            }
            return results.ToArray();
        }

        private List<int> extractions;
        private int level;

        private string DivMatched(Match m)
        {
            if (m.Groups[1].Success)
            {
                if (level > 0)
                    level++;
                else if (m.Groups[2].Value.Contains("id=\"change\""))
                {
                    extractions.Add(m.Index + m.Length); // store starting extraction position
                    level++;
                }
            }
            else if (m.Groups[3].Success && level > 0)
            {
                level--;
                if (level == 0)
                {
                    extractions.Add(m.Index); // store closing index position
                }
            }
            return "";
        }
    }


Usage is very simple:

new DivExtractor().GetDiv(html)  // to extract single (first?) div content (without the div tags)

0

Featured Post

Important Lessons on Recovering from Petya

In their most recent webinar, Skyport Systems explores ways to isolate and protect critical databases to keep the core of your company safe from harm.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Real-time is more about the business, not the technology. In day-to-day life, to make real-time decisions like buying or investing, business needs the latest information(e.g. Gold Rate/Stock Rate). Unlike traditional days, you need not wait for a fe…
Hello there! As a developer I have modified and refactored the unit tests which was written by fellow developers in the past. On the course, I have gone through various misconceptions and technical challenges when it comes to implementation. I would…
We’ve all felt that sense of false security before—locking down external access to a database or component and feeling like we’ve done all we need to do to secure company data. But that feeling is fleeting. Attacks these days can happen in many w…
When cloud platforms entered the scene, users and companies jumped on board to take advantage of the many benefits, like the ability to work and connect with company information from various locations. What many didn't foresee was the increased risk…
Suggested Courses
Course of the Month19 days, 11 hours left to enroll

873 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question