Solved

regular expression optional whitespace

Posted on 2010-09-13
Last Modified: 2012-05-10
Hi experts,

From the following block of text, I am trying to retrieve 9ismyanswer using a regular expression:

rooms </strong>&nbsp;<i>9ismyanswer</i></td>

The part I am having trouble with is that I still want a match when there is any number of spaces (including zero) between rooms and </strong>.

Here is what I've got so far, and it works as long as there is exactly one space after rooms:

pattern:
(?<=rooms </strong>&nbsp;<i>)([\s\S]*?)(?=</i></td>)

I've tried the following, but neither has worked. Basically, I'm looking for a wildcard that matches any amount of whitespace.

(?<=rooms[\s]*</strong>&nbsp;<i>)([\s\S]*?)(?=</i></td>)
(?<=rooms\s*</strong>&nbsp;<i>)([\s\S]*?)(?=</i></td>)

Thanks for your help
Question by:paries
19 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 33669040
\s* should have worked. What was the text that failed?
 
LVL 9

Expert Comment

by:rfportilla
ID: 33669155
Will this not work?

/.*<i>([^<]*).*/$1/
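
The pattern above is written in sed/Perl substitution style. A rough PHP equivalent might look like the sketch below; it assumes the input is a single line containing just one <i> element, and the variable names are only illustrative.

```php
<?php
// Sample text from the question.
$str = 'rooms </strong>&nbsp;<i>9ismyanswer</i></td>';

// Replace the whole line with whatever follows <i> up to the next '<'.
// '$1' in the replacement refers to the first capturing group.
$answer = preg_replace('/.*<i>([^<]*).*/', '$1', $str);

echo $answer, PHP_EOL; // 9ismyanswer
```

Note that this does not check for "rooms" at all, so it would pick up the last <i> element on any line.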
 
LVL 30

Expert Comment

by:Marco Gasi
ID: 33669193
Your regexp didn't work because variable-length look-behind is not supported by PHP's regex engine (PCRE). Try this one:

(?<=<i>)[\w]+(?=</i></td>)

preg_match("/(?<=<i>)[\w]+(?=<\/i><\/td>)/", "rooms </strong>&nbsp;<i>9ismyanswer</i></td>");
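
To see the difference, here is a small sketch comparing the two kinds of look-behind. It assumes a stock PHP build with the bundled PCRE library; as far as I know, an unbounded \s* inside a look-behind fails to compile there, so preg_match() returns false rather than 0.

```php
<?php
$str = 'rooms </strong>&nbsp;<i>9ismyanswer</i></td>';

// Variable-length look-behind: the pattern does not compile, so
// preg_match() returns false (the warning is suppressed here with @).
$variable = @preg_match('#(?<=rooms\s*</strong>&nbsp;<i>)[\s\S]*?(?=</i></td>)#', $str, $m1);

// Fixed-length look-behind: compiles and matches.
$fixed = preg_match('#(?<=<i>)\w+(?=</i></td>)#', $str, $m2);

var_dump($variable);   // bool(false)
echo $m2[0], PHP_EOL;  // 9ismyanswer
```

Checking the return value with === false is what separates "pattern is invalid" from "pattern is fine but found nothing".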

Bye
 
LVL 84

Expert Comment

by:ozo
ID: 33669203
It doesn't need to be a look-behind
(?: should suffice rather than (?<=
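
One way to read this suggestion, sketched under the assumption that a capturing group for the answer is acceptable: match the prefix with a plain non-capturing group, where \s* is allowed, and capture only the wanted text.

```php
<?php
// Sample with several spaces after "rooms".
$str = 'rooms   </strong>&nbsp;<i>9ismyanswer</i></td>';

// The prefix is consumed by (?:...) instead of being asserted by (?<=...),
// so it may have variable length. Group 1 holds the answer.
$pattern = '#(?:rooms\s*</strong>&nbsp;<i>)([\s\S]*?)(?=</i></td>)#';

if (preg_match($pattern, $str, $m)) {
    echo $m[1], PHP_EOL; // 9ismyanswer
}
```

The trade-off is that the full match ($m[0]) now includes the prefix, so the answer has to be read from $m[1].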
 
LVL 30

Expert Comment

by:Marco Gasi
ID: 33669231
But he used look-behind in his regex and in that regex (?: doesn't work.
 
LVL 5

Accepted Solution

by:
vks_vicky earned 500 total points
ID: 33669821
You could try this

(?<=rooms\s{2}</strong>&nbsp;<i>)([\s\S]*?)(?=</i></td>)

Here {2} is the number of whitespace characters after rooms. You cannot use {*}, because it is an illegal repetition. Hope this helps.
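
A quick sketch of what the fixed quantifier does in PHP: it compiles, but it only matches when exactly that many whitespace characters are present. The second pattern is an alternative that is not from this thread: PCRE's \K resets the start of the reported match, which lets the prefix use \s* without any look-behind.

```php
<?php
// Fixed quantifier inside the look-behind: exactly two whitespace characters.
$fixed = '#(?<=rooms\s{2}</strong>&nbsp;<i>)[\s\S]*?(?=</i></td>)#';

// Alternative (not from the thread): \K drops everything matched so far
// from the reported match, so the prefix may be variable length.
$reset = '#rooms\s*</strong>&nbsp;<i>\K[\s\S]*?(?=</i></td>)#';

$results = [];
foreach ([0, 1, 2, 5] as $n) {
    $str = 'rooms' . str_repeat(' ', $n) . '</strong>&nbsp;<i>9ismyanswer</i></td>';
    $results[$n] = [
        'fixed' => preg_match($fixed, $str),
        'reset' => preg_match($reset, $str, $m),
        'text'  => $m[0] ?? null,
    ];
    printf("%d space(s): fixed=%d reset=%d\n", $n, $results[$n]['fixed'], $results[$n]['reset']);
}
```

Only the two-space sample matches the fixed pattern, while the \K version matches all four.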
 
LVL 5

Expert Comment

by:vks_vicky
ID: 33669879
Try this one too.


(rooms)[\s]*(</strong>)&nbsp;<i>([\s\S]*?)(?=</i></td>)

This would be the result

Group(0) = rooms </strong>&nbsp;<i>9ismyanswer
Group(1) = rooms
Group(2) = </strong>
Group(3) = 9ismyanswer

You can enter any number of spaces between rooms and </strong>
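
For anyone trying this in PHP rather than in a regex tester, a minimal sketch follows. The sample strings are made up for illustration, and '#' is used as the delimiter so the slashes in the pattern need no escaping.

```php
<?php
$pattern = '#(rooms)[\s]*(</strong>)&nbsp;<i>([\s\S]*?)(?=</i></td>)#';

$samples = [
    'rooms</strong>&nbsp;<i>9ismyanswer</i></td>',        // no whitespace
    'rooms </strong>&nbsp;<i>9ismyanswer</i></td>',       // one space
    "rooms \t\n </strong>&nbsp;<i>9ismyanswer</i></td>",  // mixed whitespace
];

$answers = [];
foreach ($samples as $sample) {
    // Group 3 holds the text between <i> and </i></td>.
    $answers[] = preg_match($pattern, $sample, $m) ? $m[3] : null;
}

print_r($answers); // every entry is "9ismyanswer"
```

Because \s also covers tabs and newlines, the third sample matches as well.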
 
LVL 30

Expert Comment

by:Marco Gasi
ID: 33670605
@vks_vicky: neither of your regexps gives the desired result, either in Regex Coach or in Espresso.

@paries: you have two valid solutions here:

1: mine, posted in ID 33669193: (?<=<i>)[\w]+(?=</i></td>)

2: as suggested by ozo, (?:[\w]+)(?=</i></td>), without using look-behind

Hope this helps
 
LVL 5

Expert Comment

by:vks_vicky
ID: 33670829
@marqusG, I'm using RegEx Tester, a plugin for Eclipse.

Also I've tested the expression with an online tester

http://www.regexplanet.com/simple/index.html.

Please check with those; I get results there. I'll try to check with Regex Coach or Espresso and let you know.
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 33670946
As the thread of comments illustrates, REGEX is complicated and often hard to get right.  Tangentially related: There are literally thousands of REGEX patterns published on the WWW that purport to validate an email address, and almost all of them are wrong in one way or another.  The take-away message is that REGEX can be a powerful tool for good, or for wasting your debugging time!

Try running this little script.
<?php // RAY_temp_paries.php
error_reporting(E_ALL);

// TEST DATA FROM THE POST AT EE
$str = 'rooms </strong>&nbsp;<i>9ismyanswer</i></td>';

// PROCESS THE TEST DATA
echo pluck($str, 'i');

// A FUNCTION TO PLUCK OUT THE INFORMATION BETWEEN TAGS
function pluck($string, $tag, $case_sensitive=FALSE)
{
    // FORMAT THE SEARCH ARGUMENTS
    $open_tag = '<'  . $tag . '>';
    $clos_tag = '</' . $tag . '>';

    // COPY THE ORIGINAL STRING
    $str = $string;

    // IF CASE-INSENSITIVE SEARCH
    if (!$case_sensitive)
    {
        $str      = strtoupper($string);
        $open_tag = strtoupper($open_tag);
        $clos_tag = strtoupper($clos_tag);
    }

    // FIND THE LOCATIONS OR RETURN FALSE IF NOT PARSABLE
    $a = strpos($str, $open_tag);
    if ($a === FALSE) return FALSE;
    $a += strlen($open_tag); // START OF THE DATA, JUST PAST THE OPENING TAG
    $z = strpos($str, $clos_tag, $a);
    if ($z === FALSE) return FALSE;

    // RETURN THE DATA FROM THE ORIGINAL STRING
    // THE THIRD ARGUMENT OF substr() IS A LENGTH, NOT AN END POSITION
    return substr($string, $a, $z - $a);
}

 
LVL 30

Expert Comment

by:Marco Gasi
ID: 33670958
I'm very sorry, vks, I don't mean to pick on your solutions, but I tested your two regexps at regexplanet and the result is Matches = No. Regex Tester (a plugin for Firefox) says it is not a valid regexp. Maybe I'm missing something, but I really don't know where I'm going wrong with these tests...
 
LVL 5

Expert Comment

by:vks_vicky
ID: 33670980
@marqusG, I'm not sure what you are doing and how you are trying it out.

I'm attaching a screenshot of regexplanet.

And I'm just trying to help. It's your choice whether you use my solution or not!
Screen-shot-2010-09-14-at-4.37.2.png
 
LVL 30

Expert Comment

by:Marco Gasi
ID: 33671057
Please excuse me. Through my own inattention I had not seen group 0, group 1 and so on to the right! I had only seen matches() No (I don't fully understand what they mean by this).
I didn't want to drive you mad: your solution works fine, as do the others.

Best
 
LVL 5

Expert Comment

by:vks_vicky
ID: 33671207
You can read more about it @

http://www.regular-expressions.info/brackets.html

The sections "Backtracking Into Capturing Groups" and "Backreferences to Failed Groups" explain why the match was "No" while the groups are still available.
 
LVL 30

Expert Comment

by:Marco Gasi
ID: 33671979
@vks: thanks for the links. But I still have a question for you: how do you use grouping in PHP? I have used this
[code]<?php
$str = "rooms </strong>&nbsp;<i>9ismyanswer</i></td>";
$regex = "(rooms)[\s]*(</strong>)&nbsp;<i>([\s\S]*?)(?=</i></td>)";
preg_match_all("/(rooms)[\s]*(</strong>)&nbsp;<i>([\s\S]*?)(?=</i></td>)/", $str, $matches);
echo "<pre>";
var_dump($matches);
echo "</pre>";
?>[/code]

But the result is NULL.
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 33672178
@marqusG

You are using forward slash as your delimiter, but not escaping the ones you are using in your pattern. Try changing your delimiter or escaping your internal forward slashes:
<?php
	$str = "rooms </strong>&nbsp;<i>9ismyanswer</i></td>";
	$regex = "(rooms)[\s]*(</strong>)&nbsp;<i>([\s\S]*?)(?=</i></td>)";
	preg_match_all("#(rooms)[\s]*(</strong>)&nbsp;<i>([\s\S]*?)(?=</i></td>)#", $str, $matches);
	echo "<pre>";
	var_dump($matches);
	echo "</pre>";
?>
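
If you would rather keep "/" as the delimiter, here is a sketch of the other two options: escaping by hand, or letting preg_quote() do it. The variable names are only illustrative.

```php
<?php
$str = 'rooms </strong>&nbsp;<i>9ismyanswer</i></td>';

// Option 1: keep "/" as the delimiter and escape each "/" in the pattern.
$escaped = '/(rooms)\s*(<\/strong>)&nbsp;<i>([\s\S]*?)(?=<\/i><\/td>)/';

// Option 2: let preg_quote() escape a literal fragment; passing the
// delimiter as the second argument makes it escape "/" as well.
$tail   = preg_quote('</i></td>', '/');
$quoted = '/<i>(.*?)' . $tail . '/';

preg_match($escaped, $str, $a);
preg_match($quoted, $str, $b);

echo $a[3], PHP_EOL; // 9ismyanswer
echo $b[1], PHP_EOL; // 9ismyanswer
```

preg_quote() is mostly useful when the literal part comes from a variable; for a hand-written HTML pattern, switching the delimiter as shown above is usually less noisy.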

 
LVL 30

Expert Comment

by:Marco Gasi
ID: 33672199
Sometimes I feel stupid...:-(
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 33672205
I like to think we're all here to learn  :D
 

Author Closing Comment

by:paries
ID: 33678554
Thanks for the help, everybody. vks_vicky's solution did exactly what I was looking for. I learned quite a bit from the discussion too.