Solved

Replacing string pattern

Posted on 2003-12-04
Last Modified: 2010-08-05
I am trying to accomplish the following:

In the downloaded HTML source at <download dir>/<filename>.html, replace the src attribute of all img tags with relative links to the downloaded image files.
For example, the original <img src="hw5/CheckOut.gif"...> in index.html should be replaced with the relative <img src="index_html_files/CheckOut.gif"...>. You may assume that all image tags are of the form <img...src="<linked image>"...>. Image tags may also contain alt, width and height attributes in any order (i.e. the src attribute could be first, last or in between; the other attributes do not matter). The src attribute may NOT contain any .. relative paths!

And this is what I did so far:
----------------------------
  public static String patternReplace(String htmlWebPage, String subDirName){
    final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL ;
    final String REPLACE_PATTERN = "<img\\s+src\\s*=\\s*('|\")(.*?)('|\")";

    Pattern myPattern = Pattern.compile(REPLACE_PATTERN, FLAGS);
    Matcher myMatcher = myPattern.matcher(htmlWebPage);

    StringBuffer buffy = new StringBuffer();
    if(myMatcher.find() ){
      myMatcher.appendReplacement(buffy, subDirName);
    }

    myMatcher.appendTail(buffy);
    System.out.println(buffy.toString());

    String newHtml=buffy.toString();
    return newHtml;
  }
-----------------
final String REPLACE_PATTERN = "<img\\s+src\\s*=\\s*('|\")(.*?)('|\")";
This finds anything matching the above pattern, but how do I replace the image directory with the subDirName that I am passing in?
I think I might have to use groups, but I don't have much idea how to do it.

Question by:dkim18
 
LVL 92

Expert Comment

by:objects
Comment Utility
You want to use the replaceAll() method, and use groups for any parts of the matching string you need to reuse.

0
 
LVL 92

Expert Comment

by:objects
Comment Utility
$n is used to insert the nth capturing group.
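
A minimal sketch of what $n does in a replacement string, assuming a simplified pattern in which group 1 is the directory part of the src value and group 2 is the file name. The class name GroupReferenceDemo, the method swapDirectory and the sample input are made up for illustration; they are not from the original question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupReferenceDemo {
    // Group 1 (optional): everything up to and including the last '/'.
    // Group 2: the file name after the last '/'.
    private static final Pattern SRC =
        Pattern.compile("src=\"([^\"]*/)?([^\"/]+)\"");

    // Hypothetical helper: keeps the file name ($2) and swaps the directory.
    static String swapDirectory(String html, String newDir) {
        // quoteReplacement protects any '$' or '\' that newDir might contain.
        String replacement = "src=\"" + Matcher.quoteReplacement(newDir) + "/$2\"";
        return SRC.matcher(html).replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println(
            swapDirectory("<img src=\"hw5/CheckOut.gif\">", "index_html_files"));
    }
}
```

Here $2 in the replacement is filled with whatever group 2 captured for each match, so the file name survives while the rest of the src value is rewritten.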
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
0
 
LVL 92

Accepted Solution

by:
objects earned 350 total points
Comment Utility
So you'll need to change your regexp to break the src path into path and filename, and replace the match with the following (where n is the group number of the filename):

"<img src=\"index_html_files/$n\""
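
One possible way to put this together, as a sketch rather than a definitive implementation. It assumes the src value is quoted with ' or ", that src may appear anywhere among the attributes, and that the tag contains no '>' before src. The class name ImageSrcRewriter and the method rewriteImageSources are hypothetical. Unlike the fixed replacement above, this variant also keeps the attributes that precede src by capturing them in group 1.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImageSrcRewriter {
    // Group 1: "<img", any attributes before src, and "src=".
    // Group 2: the opening quote character (' or ").
    // Non-capturing group: the directory part, up to the last '/'.
    // Group 3: the file name. \2 requires the matching closing quote.
    private static final Pattern IMG_SRC = Pattern.compile(
        "(<img\\b[^>]*?\\bsrc\\s*=\\s*)(['\"])(?:[^'\">]*/)?([^'\"/>]+)\\2",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: points every img src at subDirName/<file name>.
    static String rewriteImageSources(String html, String subDirName) {
        String replacement =
            "$1$2" + Matcher.quoteReplacement(subDirName) + "/$3$2";
        return IMG_SRC.matcher(html).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String html = "<img alt=\"cart\" src=\"hw5/CheckOut.gif\" width=\"10\">";
        System.out.println(rewriteImageSources(html, "index_html_files"));
    }
}
```

Because replaceAll() walks over every match, no explicit loop or counter is needed. A regular expression is only adequate here because the question guarantees a simple tag shape; arbitrary HTML would need a real parser.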
0
 

Author Comment

by:dkim18
Comment Utility
I guess I used a little trick, without using groups or replaceAll():

  public static String patternReplace(String htmlWebPage, String subDirName, String[] counter){
    final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL ;
    final String REPLACE_PATTERN = "<img\\s+src\\s*=\\s*('|\")(.*?)/";
    String replace_str = "<img src=\"" + subDirName + "/";
    Pattern myPattern = Pattern.compile(REPLACE_PATTERN, FLAGS);
    Matcher myMatcher = myPattern.matcher(htmlWebPage);

    StringBuffer buffy = new StringBuffer();
    for(int i = 0; i < counter.length ; i++){
      if (myMatcher.find()) {
        myMatcher.appendReplacement(buffy, replace_str);
      }
    }

    myMatcher.appendTail(buffy);
    return buffy.toString();
  }

I couldn't get all the concepts, so this is fine for now.
Thanks anyway...
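
For comparison, a sketch of the same appendReplacement() approach driven by while (find()) instead of an external counter. It assumes the directory part of src should be dropped entirely and that values are quoted with ' or ". The class name AppendReplacementLoop and the method replaceAllMatches are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendReplacementLoop {
    // Group 1: "<img ... src=" plus the opening quote.
    // The optional non-capturing group consumes the directory part, if any.
    private static final Pattern IMG_SRC = Pattern.compile(
        "(<img\\b[^>]*?\\bsrc\\s*=\\s*['\"])(?:[^'\">]*/)?",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: loops until find() reports no further match.
    static String replaceAllMatches(String html, String subDirName) {
        Matcher m = IMG_SRC.matcher(html);
        StringBuffer out = new StringBuffer();
        String replacement = "$1" + Matcher.quoteReplacement(subDirName) + "/";
        while (m.find()) {
            // Copies the text before the match, then the replacement.
            m.appendReplacement(out, replacement);
        }
        // Copies whatever follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceAllMatches(
            "<img src=\"hw5/a.gif\"><img src='b.gif'>", "index_html_files"));
    }
}
```

The loop ends on its own when find() returns false, so the number of images does not have to be known in advance.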
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
As long as you achieved your goal :)

http://www.objects.com.au/staff/mick
0
