• Status: Solved

Regular Expression


Hi Experts!

I need a regex that will filter out the title of the image.

Input: <img src="x.jpg" alt="image" title="mysooper dooper image" />

Output: <img src="x.jpg" alt="image" title=" " />
Asked by: dlcnet

2 Solutions
 
CEHJ commented:
Try
s = s.replaceAll("title=\".*?\"", "title=\"\"");

 
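for_yan commented:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String ss = "<img src=\"x.jpg\" alt=\"image\" title=\"mysooper dooper image\" /> ";

// Capture the title text in group 1 and print each match.
Pattern p = Pattern.compile("title=\"(.*?)\"");
Matcher m = p.matcher(ss);
while (m.find()) {
    System.out.println(m.group(1));
}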


Output:
mysooper dooper image

 
CEHJ commented:
You really need the following to tolerate whitespace around the equals sign, though:
s = s.replaceAll("title\\s*=\\s*\".*?\"", "title=\"\"");
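A minimal sketch of what the added \s* buys, assuming a hypothetical helper class TitleBlanker (the class and method names are illustrative, not from the thread). The same pattern blanks the title whether or not there are spaces around the equals sign:

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class TitleBlanker {
    // \s* on either side of '=' tolerates input such as: title = "..."
    private static final Pattern TITLE =
            Pattern.compile("title\\s*=\\s*\".*?\"");

    static String blankTitle(String html) {
        return TITLE.matcher(html).replaceAll("title=\"\"");
    }

    public static void main(String[] args) {
        // Both lines print: <img src="x.jpg" title="" />
        System.out.println(blankTitle("<img src=\"x.jpg\" title=\"a\" />"));
        System.out.println(blankTitle("<img src=\"x.jpg\" title = \"a\" />"));
    }
}
```

Compiling the pattern once is a design choice for repeated calls; a one-off String.replaceAll with the same regex should behave the same way.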

 
dlcnet (Author) commented:
@CEHJ

Hi! I tried both of them and the title of the image is still there :(
 
CEHJ commented:
Please show your actual input where it failed
 
CEHJ commented:
The code below prints the following output:

<img src="x.jpg" alt="image" title="" />

String s = "<img src=\"x.jpg\" alt=\"image\" title=\"mysooper dooper image\" /> ";
s = s.replaceAll("title\\s*=\\s*\".*?\"", "title=\"\"");
System.out.println(s);

 
käµfm³d 👽 commented:
How about:
String source = "<img src=\"x.jpg\" alt=\"image\" title=\"mysooper dooper image\" />";

String result = source.replaceAll("<img [^>]*)title=\"[^\"]*\"([^>]*)", "$1$2");

 
käµfm³d 👽 commented:
Hmmm...  I misread the question  : (

Correction:
String source = "<img src=\"x.jpg\" alt=\"image\" title=\"mysooper dooper image\" />";

String result = source.replaceAll("<img [^>]*title=\")[^\"]*(\"[^>]*)", "$1$2");

 
dlcnet (Author) commented:
@CEHJ

My bad :) It works... however, if I have something like this, it crashes:
title="blablal&&
bla
bla
bla"

The title spans multiple lines. I believe each line ends with a CR.
 
CEHJ commented:
OK. Try
s = s.replaceAll("(?s)title\\s*=\\s*\".*?\"", "title=\"\"");
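A small sketch of the difference the (?s) flag makes, assuming a hypothetical DotallDemo class and a made-up two-line title. Without the flag, . does not match line terminators, so a title containing a line break is left untouched; with it, the match runs across the break:

```java
// Hypothetical demo class, for illustration only.
class DotallDemo {
    // Applies the thread's pattern with or without the embedded DOTALL flag (?s).
    static String blankTitle(String html, boolean dotall) {
        String regex = (dotall ? "(?s)" : "") + "title\\s*=\\s*\".*?\"";
        return html.replaceAll(regex, "title=\"\"");
    }

    public static void main(String[] args) {
        String html = "<img src=\"x.jpg\" title=\"bla\r\nbla\" />";

        // Without (?s) nothing matches, so the input comes back unchanged.
        System.out.println(blankTitle(html, false).equals(html)); // prints: true

        // With (?s) the dot also matches \r and \n.
        System.out.println(blankTitle(html, true)); // prints: <img src="x.jpg" title="" />
    }
}
```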

 
käµfm³d 👽 commented:
Although I did forget an opening parenthesis, the pattern I posted should account for multiple lines. Corrected pattern below:
String result = source.replaceAll("(<img [^>]*title=\")[^\"]*(\"[^>]*)", "$1$2");
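To illustrate how the two groups work, here is a sketch that wraps the corrected pattern in a hypothetical ImgTitleBlanker class (the class name and the sample inputs are assumptions). Group 1 keeps everything up to the opening quote, group 2 keeps the closing quote and the rest of the tag, and [^"]* matches line breaks as well, so no flag is needed. Because the pattern is anchored on <img, a title on another element is left alone:

```java
// Hypothetical wrapper around the pattern above, for illustration only.
class ImgTitleBlanker {
    static String blankImgTitles(String html) {
        // $1 = "<img ... title=\"", $2 = "\" ..." up to (not including) '>'
        return html.replaceAll("(<img [^>]*title=\")[^\"]*(\"[^>]*)", "$1$2");
    }

    public static void main(String[] args) {
        // Only the img title is emptied; the anchor's title is kept.
        System.out.println(blankImgTitles(
                "<a title=\"link\"><img src=\"x.jpg\" title=\"pic\" /></a>"));
        // prints: <a title="link"><img src="x.jpg" title="" /></a>

        // A title spanning two lines is emptied too.
        System.out.println(blankImgTitles("<img title=\"a\nb\" />"));
        // prints: <img title="" />
    }
}
```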

 
käµfm³d 👽 commented:
@CEHJ

That won't work either unless you turn on single-line mode  : )
 
käµfm³d 👽 commented:
Never mind. I missed it  : (
 
for_yan commented:
This works for me; I just tested:

String ss = "<img src=\"x.jpg\" alt=\"image\" title=\"mysooper dooper image\" /> ";
ss = ss.replaceAll("title=\"(.*?)\"", "");
System.out.println(ss);


Output:
<img src="x.jpg" alt="image"  /> 
 

 
for_yan commented:
Or this way, if you want to leave title= in place:

String ss = "<img src=\"x.jpg\" alt=\"image\" title=\"mysooper dooper image\" /> ";
ss = ss.replaceAll("title=\"(.*?)\"", "title=\"\"");
System.out.println(ss);


Output:
<img src="x.jpg" alt="image" title="" /> 

 
CEHJ commented:
>>ss= ss.replaceAll("title=\"(.*?)\"","title=\"\"");

The group is redundant and simply creates overhead. The pattern will also fail for multi-line titles, because . does not match line terminators by default.
 
msk_apk commented:
// Reluctant .*? stops at the first closing quote; a greedy .* would overrun
// if another quoted attribute followed the title.
String regexString = "title=\"(.*?)\"";
Pattern p = Pattern.compile(regexString);
String one = "<img src=\"x.jpg\" alt=\"image\" title=\"mysooper dooper image\" />";
String two = "<img src=\"x.jpg\" alt=\"image\" title=\" \" />"; // desired result, not used below

Matcher matcher = p.matcher(one);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
 
msk_apk commented:
Sorry, I believe I am repeating an earlier answer.
 
for_yan commented:
This works with a multi-line title:

String ss = "<img src=\"x.jpg\" alt=\"image\" title=\"mysooper " + System.getProperty("line.separator") + "dooper image\" /> ";
// [^\"]* matches any character except a double quote, including line breaks.
ss = ss.replaceAll("title=\"([^\"]*)\"", "title=\"\"");
System.out.println("result: " + ss);


Output:

result: <img src="x.jpg" alt="image" title="" /> 

 
for_yan commented:
Yes, true, the group is not necessary. I first thought that "filter out" meant the opposite, that is, to extract the title; the group is left over from that.
 
for_yan commented:

The same thing works without the group:

String ss = "<img src=\"x.jpg\" alt=\"image\" title=\"mysooper " + System.getProperty("line.separator") + "dooper image\" /> ";
// [^\"]* matches any character except a double quote, including line breaks.
ss = ss.replaceAll("title=\"[^\"]*\"", "title=\"\"");
System.out.println("result without group: " + ss);


result without group: <img src="x.jpg" alt="image" title="" /> 

 
msk_apk commented:
The group is necessary if you would like to get the title text.
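As a rough sketch of that extraction case, assuming a hypothetical TitleExtractor class (not from the thread): group 1 captures the text between the quotes, and the negated class [^"]* lets the capture span line breaks without any flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class TitleExtractor {
    // Group 1 is the title text; \s* tolerates spaces around '='.
    private static final Pattern TITLE =
            Pattern.compile("title\\s*=\\s*\"([^\"]*)\"");

    // Returns every title value found in the input, in order of appearance.
    static List<String> titles(String html) {
        List<String> found = new ArrayList<>();
        Matcher m = TITLE.matcher(html);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(titles(
                "<img src=\"x.jpg\" alt=\"image\" title=\"mysooper dooper image\" />"));
        // prints: [mysooper dooper image]
    }
}
```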
Question has a verified solution.
