Regular expression: optional whitespace

Hi experts,

From the following block of text, I am trying to retrieve 9ismyanswer using a regular expression:

rooms </strong>&nbsp;<i>9ismyanswer</i></td>

The part I am having trouble with: I still want a match if there is any number of spaces (including zero) between rooms and </strong>.

Here is what I've got so far, and it works as long as there is exactly one space after rooms:

pattern:
(?<=rooms </strong>&nbsp;<i>)([\s\S]*?)(?=</i></td>)

I've tried the following, but neither has worked. Basically, I'm looking for a wildcard for any number of whitespace characters.

(?<=rooms[\s]*</strong>&nbsp;<i>)([\s\S]*?)(?=</i></td>)
(?<=rooms\s*</strong>&nbsp;<i>)([\s\S]*?)(?=</i></td>)

Thanks for your help
paries Asked:
 
vks_vicky Commented:
You could try this

(?<=rooms\s{2}</strong>&nbsp;<i>)([\s\S]*?)(?=</i></td>)

Here {2} is the number of whitespace characters after rooms. You cannot use {*}, though, because it is an illegal repetition. Hope this helps.
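
If the look-behind has to stay and the number of spaces has a small known upper bound, PCRE (the engine behind PHP's preg functions) accepts a look-behind whose alternatives are each fixed-length, even when the lengths differ. A rough sketch under that assumption, covering zero to two whitespace characters; the helper name extract_with_lookbehind is hypothetical and not from this thread:

```php
<?php
// Hypothetical helper: look-behind with one fixed-length branch per allowed
// whitespace count (0, 1 or 2). Assumes PHP's PCRE-based preg_match().
function extract_with_lookbehind(string $html): ?string
{
    // "~" is used as the delimiter so the forward slashes need no escaping.
    $pattern = '~(?<=rooms</strong>&nbsp;<i>'
             . '|rooms\s</strong>&nbsp;<i>'
             . '|rooms\s\s</strong>&nbsp;<i>)'
             . '[\s\S]*?(?=</i></td>)~';

    // $m[0] is the whole match, which here is only the wanted text.
    return preg_match($pattern, $html, $m) === 1 ? $m[0] : null;
}

echo extract_with_lookbehind('rooms </strong>&nbsp;<i>9ismyanswer</i></td>'), "\n";
```

Inputs with more whitespace than the listed branches allow simply do not match, so this only suits cases where the upper bound is really known.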
 
ozo Commented:
\s* should have worked. What was the text that failed?
 
rfportilla Commented:
Will this not work?

/.*<i>([^<]*).*/$1/

 
Marco Gasi (Freelancer) Commented:
Your regex didn't work because variable-length look-behind is not supported by the PHP regex engine (PCRE). Try this one:

(?<=<i>)[\w]+(?=</i></td>)

preg_match("/(?<=<i>)[\w]+(?=<\/i><\/td>)/", "rooms </strong>&nbsp;<i>9ismyanswer</i></td>");
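
The pattern above no longer checks for "rooms". One way to keep that check without a look-behind is PCRE's \K, which drops everything matched so far from the reported match. A minimal sketch, assuming PHP's preg functions; the helper name extract_after_rooms is hypothetical:

```php
<?php
// Hypothetical helper: returns the text between <i> and </i></td> that follows
// "rooms", allowing any amount of whitespace (including none) before </strong>.
function extract_after_rooms(string $html): ?string
{
    // "~" is the delimiter, so the forward slashes need no escaping.
    // \K resets the start of the reported match, so $m[0] holds only the wanted text.
    $pattern = '~rooms\s*</strong>&nbsp;<i>\K[\s\S]*?(?=</i></td>)~';

    return preg_match($pattern, $html, $m) === 1 ? $m[0] : null;
}

echo extract_after_rooms('rooms </strong>&nbsp;<i>9ismyanswer</i></td>'), "\n";
echo extract_after_rooms('rooms</strong>&nbsp;<i>9ismyanswer</i></td>'), "\n";
```

Because \s* sits in the ordinary part of the pattern rather than inside a look-behind, the fixed-length restriction does not apply to it.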

Bye
 
ozo Commented:
It doesn't need to be a look-behind
(?: should suffice rather than (?<=
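
A sketch of how that might look in PHP: the prefix is consumed by a non-capturing group, so it becomes part of the whole match, and the wanted text is read from capture group 1 instead. The helper name capture_after_rooms is hypothetical:

```php
<?php
// Hypothetical helper illustrating the approach without a look-behind.
function capture_after_rooms(string $html): ?string
{
    // (?: ... ) consumes the prefix without capturing it;
    // ( ... ) captures the text we actually want.
    $pattern = '~(?:rooms\s*</strong>&nbsp;<i>)([\s\S]*?)(?=</i></td>)~';

    // $m[0] would include the prefix, so return group 1 instead.
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

echo capture_after_rooms('rooms   </strong>&nbsp;<i>9ismyanswer</i></td>'), "\n";
```

The trade-off is that the caller has to read a group rather than the whole match, which matters only for tools that expose nothing but the whole match.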
 
Marco Gasi (Freelancer) Commented:
But he used look-behind in his regex, and in that regex (?: doesn't work.
 
vks_vicky Commented:
Try this one too.


(rooms)[\s]*(</strong>)&nbsp;<i>([\s\S]*?)(?=</i></td>)

This would be the result

Group(0) = rooms </strong>&nbsp;<i>9ismyanswer
Group(1) = rooms
Group(2) = </strong>
Group(3) = 9ismyanswer

You can enter any number of spaces between rooms and </strong>
 
Marco Gasi (Freelancer) Commented:
@vks_vicky: neither of your regexes gives the desired result, in either Regex Coach or Espresso.

@paries: you have two valid solutions here:

1: mine, posted in ID 33669193: (?<=<i>)[\w]+(?=</i></td>)

2: as suggested by ozo, (?:[\w]+)(?=</i></td>), without using look-behind

Hope this helps
 
vks_vicky Commented:
@marqusG, I'm using RegEx Tester, a plugin for Eclipse.

I've also tested the expression with an online tester:

http://www.regexplanet.com/simple/index.html

I get results with both, so please check with them. I'll try Regex Coach or Espresso and let you know.
 
Ray Paseur Commented:
As the thread of comments illustrates, REGEX is complicated and often hard to get right.  Tangentially related: There are literally thousands of REGEX patterns published on the WWW that purport to validate an email address, and almost all of them are wrong in one way or another.  The take-away message is that REGEX can be a powerful tool for good, or for wasting your debugging time!

Try running this little script.
<?php // RAY_temp_paries.php
error_reporting(E_ALL);

// TEST DATA FROM THE POST AT EE
$str = 'rooms </strong>&nbsp;<i>9ismyanswer</i></td>';

// PROCESS THE TEST DATA
echo pluck($str, 'i');

// A FUNCTION TO PLUCK OUT THE INFORMATION BETWEEN TAGS
function pluck($string, $tag, $case_sensitive=FALSE)
{
    // FORMAT THE SEARCH ARGUMENTS
    $open_tag = '<'  . $tag . '>';
    $clos_tag = '</' . $tag . '>';

    // COPY THE ORIGINAL STRING
    $str = $string;

    // IF CASE-INSENSITIVE SEARCH
    if (!$case_sensitive)
    {
        $str      = strtoupper($string);
        $open_tag = strtoupper($open_tag);
        $clos_tag = strtoupper($clos_tag);
    }

    // FIND THE LOCATIONS OR RETURN FALSE IF NOT PARSABLE
    $a = strpos($str, $open_tag);
    if ($a === FALSE) return FALSE;
    $a += strlen($open_tag);          // FIRST CHARACTER AFTER THE OPENING TAG
    $z = strpos($str, $clos_tag, $a); // SEARCH FOR THE CLOSING TAG AFTER THE OPENING TAG
    if ($z === FALSE) return FALSE;

    // RETURN THE DATA FROM THE ORIGINAL STRING (THE THIRD ARGUMENT OF substr() IS A LENGTH)
    return substr($string, $a, $z - $a);
}

 
Marco Gasi (Freelancer) Commented:
I'm very sorry, vks, I don't want to pick on your solutions, but I tested your two regexes at regexplanet and the result is Matches = No. Regex Tester (a plugin for Firefox) says it is not a valid regex. Maybe I'm missing something, but I really don't know where I'm going wrong with these tests...
 
vks_vicky Commented:
@marqusG, I'm not sure what you are doing or how you are trying it out.

I'm attaching a screenshot of regexplanet.

I'm just trying to help. It's your choice whether you use my solution or not!
Screen-shot-2010-09-14-at-4.37.2.png
 
Marco Gasi (Freelancer) Commented:
Please excuse me. Through my own inattention I had not seen group 0, group 1 and so on to the right! I had only seen matches() No (I don't understand very well what they mean by this).
I didn't want to drive you mad: your solution works fine, as do the others.

Best
 
vks_vicky Commented:
You can read more about it at:

http://www.regular-expressions.info/brackets.html

The sections "Backtracking Into Capturing Groups" and "Backreferences to Failed Groups" explain why the match was "No" while the groups are still available.
 
Marco Gasi (Freelancer) Commented:
@vks: thanks for the links. I still have a question for you: how do you use grouping in PHP? I have used this:
[code]<?php
$str = "rooms </strong>&nbsp;<i>9ismyanswer</i></td>";
$regex = "(rooms)[\s]*(</strong>)&nbsp;<i>([\s\S]*?)(?=</i></td>)";
preg_match_all("/(rooms)[\s]*(</strong>)&nbsp;<i>([\s\S]*?)(?=</i></td>)/", $str, $matches);
echo "<pre>";
var_dump($matches);
echo "</pre>";
?>[/code]

But result is NULL
 
käµfm³d 👽 Commented:
@marqusG

You are using forward slash as your delimiter, but not escaping the ones you are using in your pattern. Try changing your delimiter or escaping your internal forward slashes:
<?php
	$str = "rooms </strong>&nbsp;<i>9ismyanswer</i></td>";
	$regex = "(rooms)[\s]*(</strong>)&nbsp;<i>([\s\S]*?)(?=</i></td>)";
	preg_match_all("#(rooms)[\s]*(</strong>)&nbsp;<i>([\s\S]*?)(?=</i></td>)#", $str, $matches);
	echo "<pre>";
	var_dump($matches);
	echo "</pre>";
?>
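
The other option mentioned above, keeping / as the delimiter and escaping the internal forward slashes, might look like the sketch below. The second half uses preg_quote() with the delimiter passed as its second argument, which escapes the literal parts automatically; the variable names $prefix and $suffix are made up for illustration:

```php
<?php
$str = 'rooms </strong>&nbsp;<i>9ismyanswer</i></td>';

// Same pattern with "/" kept as the delimiter and every internal slash escaped as \/
preg_match_all('/(rooms)[\s]*(<\/strong>)&nbsp;<i>([\s\S]*?)(?=<\/i><\/td>)/', $str, $matches);
echo $matches[3][0], "\n"; // group 3 of the first match

// Alternative: let preg_quote() escape the literal pieces, including the delimiter
$prefix = preg_quote('</strong>&nbsp;<i>', '/');
$suffix = preg_quote('</i></td>', '/');
preg_match('/rooms\s*' . $prefix . '([\s\S]*?)' . $suffix . '/', $str, $m);
echo $m[1], "\n";
```

Single-quoted PHP strings are used here so that backslashes such as \s reach the regex engine unchanged.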

 
Marco Gasi (Freelancer) Commented:
Sometimes I feel stupid...:-(
 
käµfm³d 👽 Commented:
I like to think we're all here to learn  :D
 
paries (Author) Commented:
Thanks for the help, everybody. vks_vicky's solution did exactly what I was looking for. I learned quite a bit from the discussion too.
Question has a verified solution.
