Solved

What do the following regular expressions look like?

Posted on 2006-10-21
Last Modified: 2010-03-31
Hello,
I need to pattern-match the following HTML tags. Please note that * means anything could appear in its place.

<head>*</head>
<title>*</title>
<body *>
<table *>


Worth 500 points.

Thanks,
Rick
Question by:richardsimnett

Expert Comment

by:sciuriware
ID: 17782952
Regular expressions are typically applied one line at a time, and
your first two examples usually extend over multiple lines,
so there is no perfect solution for those.

In a regex, .* could replace your *

;JOOP!
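
For what it's worth, Java's regex engine is not strictly limited to one line: by default `.` simply does not match line terminators, and the DOTALL flag (`Pattern.DOTALL`, or `(?s)` inside the pattern) lifts that restriction. A minimal sketch, where the class name, method name, and sample string are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotallDemo {
    // Returns the text between <title> and </title>, or null if nothing matches.
    static String titleOf(String html, boolean dotall) {
        int flags = Pattern.CASE_INSENSITIVE | (dotall ? Pattern.DOTALL : 0);
        Matcher m = Pattern.compile("<title>(.*?)</title>", flags).matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<title>My\npage</title>"; // title split over two lines
        System.out.println(titleOf(html, false)); // no match: '.' stops at the newline
        System.out.println(titleOf(html, true));  // matches across the line break
    }
}
```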

Accepted Solution

by:
CEHJ earned 500 total points
ID: 17782992
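Try something like this:

      // Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
      public static String matchTag(String tag, String toMatch) {
            String result = null;
            // For the tag "<head>" this builds: (?ims)<head(?: [^>]*)*>(.*?)</head>
            StringBuilder sb = new StringBuilder("(?ims)").append(tag)
                  .insert(tag.length() + 5, "(?: [^>]*)*") // optional attributes, placed just before the '>'
                  .append("(.*?)")                         // group 1: the tag's content, matched reluctantly
                  .append(tag)
                  .insert(tag.length() + 23, "/");         // turns the second copy into the closing tag
            Pattern p = Pattern.compile(sb.toString());
            Matcher m = p.matcher(toMatch);
            if (m.find()) {
                  result = m.group(1);
            }
            return result;
      }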

Expert Comment

by:CEHJ
ID: 17782993
(You would pass the tag as <body>, <title>, etc.)
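
To make the index arithmetic concrete, here is a small sketch that prints the pattern the builder produces and calls the method with a few tags. The class name, the `buildPattern` helper, and the sample HTML are assumptions added for illustration; the pattern construction itself is the one from the accepted solution.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchTagDemo {
    // Same construction as the accepted solution, split out so the pattern can be inspected.
    static String buildPattern(String tag) {
        return new StringBuilder("(?ims)").append(tag)
                .insert(tag.length() + 5, "(?: [^>]*)*")
                .append("(.*?)")
                .append(tag)
                .insert(tag.length() + 23, "/")
                .toString();
    }

    // Returns the content of the first matching element, or null if there is none.
    static String matchTag(String tag, String toMatch) {
        Matcher m = Pattern.compile(buildPattern(tag)).matcher(toMatch);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up sample input
        String html = "<html><head>\n<title>Hello</title>\n</head>"
                + "<body bgcolor=\"white\">Hi</body></html>";
        System.out.println(buildPattern("<title>"));   // (?ims)<title(?: [^>]*)*>(.*?)</title>
        System.out.println(matchTag("<title>", html)); // Hello
        System.out.println(matchTag("<body>", html));  // Hi
        System.out.println(matchTag("<table>", html)); // null, the sample has no table
    }
}
```

Note that the method returns only the content between the opening and closing tags (group 1), not the tags themselves.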

Expert Comment

by:avsrivastava
ID: 17783018
This code matches greedily, so each match is the longest possible string (multiple occurrences will still show up, though).
If you want to match each of the tags separately, make 4 patterns and then match the input string against each.

            Pattern pattern = Pattern.compile("<head>.*</head>|<title>.*</title>|<body .*>|<table .*>");
            Matcher matcher = pattern.matcher(inputString); // inputString is the string in which you want to find the pattern
            boolean found = false;
            while (matcher.find()) {
                System.out.println("I found the text \""+matcher.group()+"\" starting at " +
                   "index "+matcher.start()+" and ending at index "+matcher.end()+"\n");
                found = true;
            }
            if (!found) {
                System.out.println("No match found.\n");
            }
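
The "4 patterns" suggestion could look roughly like the sketch below. It is not the poster's code: the class and method names are invented, and it swaps in reluctant quantifiers, `[^>]*` for attributes, and the DOTALL flag so that matches stay short and can span line breaks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FourPatterns {
    static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;

    // One pattern per tag from the question.
    static final Pattern[] PATTERNS = {
        Pattern.compile("<head>.*?</head>", FLAGS),
        Pattern.compile("<title>.*?</title>", FLAGS),
        Pattern.compile("<body\\b[^>]*>", FLAGS),
        Pattern.compile("<table\\b[^>]*>", FLAGS),
    };

    // Collects every match of every pattern, pattern by pattern.
    static List<String> findAll(String input) {
        List<String> hits = new ArrayList<>();
        for (Pattern p : PATTERNS) {
            Matcher m = p.matcher(input);
            while (m.find()) {
                hits.add(m.group());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample input
        String html = "<head><title>A</title></head>"
                + "<body class=\"x\"><table border=\"1\"></table><table></table></body>";
        for (String hit : findAll(html)) {
            System.out.println(hit);
        }
    }
}
```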

Author Comment

by:richardsimnett
ID: 17783502
OK guys, I see what you're saying, but couldn't I just do something like this? (This is what I currently have, but it doesn't work.)

html = html.replaceFirst("<head>.*</head>", replaceHead());

Basically, the intent is to replace the entire <head> tag and all the text it contains with the head generated by replaceHead(). It never seems to match. I have also tried this variation:

message = message.replaceFirst("/<head>.*</head>/gis", randomHead());


Neither has worked.

Thanks,
Rick
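
Two things probably explain why neither call matched, assuming the HTML spans several lines. First, `.` does not match line terminators unless DOTALL is enabled. Second, Java has no `/.../gis` delimiter syntax, so the slashes and the letters `gis` are treated as literal characters; flags are embedded as `(?is)` instead. The sketch below shows one way the one-liner could be made to work. The `replaceHead()` here is a hypothetical stub (the real one is not shown in the thread), and `Matcher.quoteReplacement` is an extra precaution so that any `$` or `\` in the replacement is taken literally.

```java
import java.util.regex.Matcher;

class ReplaceHeadDemo {
    // Hypothetical stand-in for the poster's replaceHead(); the real method is not shown.
    static String replaceHead() {
        return "<head><title>New $title</title></head>";
    }

    // Replaces the first <head ...>...</head> element, across line breaks, ignoring case.
    static String swapHead(String html) {
        return html.replaceFirst("(?is)<head\\b[^>]*>.*?</head>",
                Matcher.quoteReplacement(replaceHead()));
    }

    public static void main(String[] args) {
        String html = "<html><head>\n<title>Old</title>\n</head><body></body></html>";
        // The original pattern leaves the input unchanged because '.' stops at the newline.
        System.out.println(html.replaceFirst("<head>.*</head>", "REPLACED").equals(html));
        System.out.println(swapHead(html));
    }
}
```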

Expert Comment

by:CEHJ
ID: 17783541
Have you run the code I posted?

Author Comment

by:richardsimnett
ID: 17786352
CEHJ,
Yes, I just finished testing it, and it worked great for the <head> tags. I had to change it a little to make it do replacements, and I had to write a separate function to deal with the <table> and <body> tags, but that one is also based on your suggestion.

Here are the new functions:

public String replaceTag(String tag, String toMatch,String replacement)
     {
          String result = null;
          StringBuilder sb = new StringBuilder("(?ims)").append(tag).insert(tag.length() + 5,"(?: [^>]*)*").append("(.*?)").append(tag).insert(tag.length() + 23, "/");
          Pattern p = Pattern.compile(sb.toString());
          //cfg.writeLog("Pattern: " + sb.toString());
          Matcher m = p.matcher(toMatch);
          result = m.replaceFirst(replacement);
          return result;
     }
   
     public String replaceTagHead(String tag, String toMatch, String replacement)
     {
         String result = null;
         StringBuilder sb = new StringBuilder("(?ims)").append(tag).insert(tag.length() + 5,"(?: [^>]*)*"); //.append("(.*?)").append(tag).insert(tag.length() + 23, "/");
         Pattern p = Pattern.compile(sb.toString());
         //cfg.writeLog("Tag Head Pattern: " + sb.toString());
         Matcher m = p.matcher(toMatch);
         result = m.replaceFirst(replacement);
         
         return result;
     }

Thanks for the Help!

Rick

Expert Comment

by:CEHJ
ID: 17786730
Well done. At first glance though, the above two methods look the same, and indeed should be the same theoretically, as should any tag replacement you're doing..?
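
As posted, the two methods differ only in whether the pattern continues past the opening tag: replaceTag replaces the whole element, while replaceTagHead replaces just the opening tag with its attributes. One possible way to fold them into a single method is sketched below. The class name and the `openingTagOnly` parameter are invented here, and the sketch builds the pattern by concatenation with `(?:\s[^>]*)?` for attributes rather than using the insert offsets from the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagReplacer {
    // tag is passed as "<head>", "<body>", etc., as in the accepted solution.
    static String replaceTag(String tag, String input, String replacement,
                             boolean openingTagOnly) {
        String name = Pattern.quote(tag.substring(1, tag.length() - 1));
        String regex = "(?is)<" + name + "(?:\\s[^>]*)?>";
        if (!openingTagOnly) {
            regex += ".*?</" + name + ">"; // extend the match to the closing tag
        }
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        return Pattern.compile(regex).matcher(input)
                .replaceFirst(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String html = "<head>\nx</head><body a=\"1\">y</body>"; // made-up sample input
        System.out.println(replaceTag("<head>", html, "<head></head>", false));
        System.out.println(replaceTag("<body>", html, "<body>", true));
    }
}
```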