paries
asked on
regular expression optional whitespace
Hi experts,
From the following block of text, I am trying to retrieve 9ismyanswer using a regular expression:
rooms </strong> <i>9ismyanswer</i></td>
The part that I am having trouble with is I want to still get a match if there is any number of spaces (including 0) between rooms and </strong>
Here is what I've got so far, and it works as long as there is exactly one space after rooms:
pattern:
(?<=rooms </strong> <i>)([\s\S]*?)(?=</i></td>)
I've tried the following, but neither has worked. Basically, I'm looking for a wildcard for any number of whitespace characters.
(?<=rooms[\s]*</strong> <i>)([\s\S]*?)(?=</i></td>)
(?<=rooms\s*</strong> <i>)([\s\S]*?)(?=</i></td>)
Thanks for your help
\s* should have worked. What was the text that failed?
Will this not work?
/.*<i>([^<]*).*/$1/
Your regex didn't work because variable-length look-behind is not yet implemented in PHP's regex engine. Try this one:
(?<=<i>)[\w]+(?=</i></td>)
preg_match("/(?<=<i>)[\w]+(?=<\/i><\/td>)/", "rooms </strong> <i>9ismyanswer</i></td>");
Bye
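If the "rooms ... </strong>" prefix still needs to be part of the condition, one way around the fixed-length look-behind limit is to match the prefix normally and then reset the match start with \K. This is only a minimal sketch: the sample strings are made up for illustration, and the single literal space before <i> is copied from the asker's pattern.

```php
<?php
// Hypothetical samples: zero, one and several spaces after "rooms".
$samples = [
    'rooms</strong> <i>9ismyanswer</i></td>',
    'rooms </strong> <i>9ismyanswer</i></td>',
    'rooms    </strong> <i>9ismyanswer</i></td>',
];

// The prefix is matched normally (so \s* is allowed), then \K drops it
// from the reported match. "#" is the delimiter, so "/" needs no escaping.
$pattern = '#rooms\s*</strong> <i>\K[\s\S]*?(?=</i></td>)#';

$results = [];
foreach ($samples as $sample) {
    // preg_match() returns 1 on a match; $m[0] is the text after \K.
    $results[] = preg_match($pattern, $sample, $m) === 1 ? $m[0] : null;
}

print_r($results); // each entry is "9ismyanswer"
```

A plain capturing group around [\s\S]*? would work just as well; \K only saves you from having to pick the group out of the result.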
It doesn't need to be a look-behind
(?: should suffice rather than (?<=
But he used look-behind in his regex and in that regex (?: doesn't work.
Try this one too.
(rooms)[\s]*(</strong>)&nbsp;<i>([\s\S]*?)(?=</i></td>)
This would be the result
Group(0) = rooms </strong> <i>9ismyanswer
Group(1) = rooms
Group(2) = </strong>
Group(3) = 9ismyanswer
You can enter any number of spaces between rooms and </strong>
@vks_vicky: neither of your regexps gives the desired result, in either Regex Coach or Espresso.
@paries: you have two valid solutions here:
1: mine, posted in ID 33669193: (?<=<i>)[\w]+(?=</i></td>)
2: as suggested by ozo, (?:[\w]+)(?=</i></td>), without using look-behind
Hope this helps
@marqusG, I'm using RegEx Tester, a plugin for Eclipse.
I've also tested the expression with an online tester:
http://www.regexplanet.com/simple/index.html.
Please check with them; I do get results. I'll try to check with Regex Coach or Espresso and let you know.
As the thread of comments illustrates, REGEX is complicated and often hard to get right. Tangentially related: There are literally thousands of REGEX patterns published on the WWW that purport to validate an email address, and almost all of them are wrong in one way or another. The take-away message is that REGEX can be a powerful tool for good, or for wasting your debugging time!
Try running this little script.
<?php // RAY_temp_paries.php
error_reporting(E_ALL);

// TEST DATA FROM THE POST AT EE
$str = 'rooms </strong> <i>9ismyanswer</i></td>';

// PROCESS THE TEST DATA
echo pluck($str, 'i');

// A FUNCTION TO PLUCK OUT THE INFORMATION BETWEEN TAGS
function pluck($string, $tag, $case_sensitive=FALSE)
{
    // FORMAT THE SEARCH ARGUMENTS
    $open_tag = '<' . $tag . '>';
    $clos_tag = '</' . $tag . '>';

    // COPY THE ORIGINAL STRING
    $str = $string;

    // IF CASE-INSENSITIVE SEARCH
    if (!$case_sensitive)
    {
        $str = strtoupper($string);
        $open_tag = strtoupper($open_tag);
        $clos_tag = strtoupper($clos_tag);
    }

    // FIND THE LOCATIONS OR RETURN FALSE IF NOT PARSABLE
    $a = strpos($str, $open_tag);
    if ($a === FALSE) return FALSE;
    $a += strlen($open_tag); // STEP PAST THE OPENING TAG
    $z = strpos($str, $clos_tag, $a); // SEARCH FOR THE CLOSING TAG AFTER THE OPENING TAG
    if ($z === FALSE) return FALSE;

    // RETURN THE DATA FROM THE ORIGINAL STRING (THE THIRD ARGUMENT OF substr() IS A LENGTH)
    return substr($string, $a, $z - $a);
}
I'm very sorry, vks, I don't want to pick on your solutions, but I tested your two regexps at regexplanet and the result is Matches = No. Regex Tester (a plugin for Firefox) says that it is not a valid regex. Maybe I'm missing something, but I really don't know where I'm going wrong with these tests...
@marqusG, I'm not sure what you are doing and how you are trying it out.
I'm attaching a screenshot of regexplanet.
And I'm just trying to help. It's your choice whether you use my solution or not!!
Screen-shot-2010-09-14-at-4.37.2.png
Please excuse me. Through my own inattention I had not seen group 0, group 1 and so on to the right! I had only seen matches() No (I don't really understand what they mean by this).
I didn't want to drive you mad: your solution works fine, as do the others.
Best
You can read more about it @
http://www.regular-expressions.info/brackets.html
The sections "Backtracking Into Capturing Groups" and "Backreferences to Failed Groups" tell you why the match was "No" while the groups are still available.
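Another possible reading of the "No", assuming the tester's matches() column reports a whole-string match the way Java's Matcher.matches() does: the pattern finds a substring but cannot cover the entire input, because the look-ahead consumes nothing. The sketch below imitates that difference in PHP with \A and \z anchors; the variable names are only for illustration.

```php
<?php
$str  = 'rooms </strong> <i>9ismyanswer</i></td>';
$core = '(rooms)\s*(</strong>) <i>([\s\S]*?)(?=</i></td>)';

// Substring search: the pattern may match anywhere inside the subject.
$found = preg_match('#' . $core . '#', $str, $groups);

// Whole-string match: \A and \z force the pattern to span the entire
// subject. The look-ahead does not consume "</i></td>", so this fails.
$whole = preg_match('#\A(?:' . $core . ')\z#', $str);

var_dump($found);      // int(1)
var_dump($whole);      // int(0)
echo $groups[3], "\n"; // 9ismyanswer
```

So a "No" for the whole-string test and filled-in groups from the substring search can sit side by side without contradiction.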
@vks: thanks for the links. I still have a question for you, though: how do you use grouping in PHP? I have used this:
[code]<?php
$str = "rooms </strong> <i>9ismyanswer</i></td>";
$regex = "(rooms)[\s]*(</strong>)&nbsp;<i>([\s\S]*?)(?=</i></td>)";
preg_match_all("/(rooms)[\s]*(</strong>) <i>([\s\S]*?)(?=</i></td>)/", $str, $matches);
echo "<pre>";
var_dump($matches);
echo "</pre>";
?>[/code]
But the result is NULL
@marqusG
You are using forward slash as your delimiter, but not escaping the ones you are using in your pattern. Try changing your delimiter or escaping your internal forward slashes:
<?php
$str = "rooms </strong> <i>9ismyanswer</i></td>";
$regex = "(rooms)[\s]*(</strong>) <i>([\s\S]*?)(?=</i></td>)";
preg_match_all("#(rooms)[\s]*(</strong>) <i>([\s\S]*?)(?=</i></td>)#", $str, $matches);
echo "<pre>";
var_dump($matches);
echo "</pre>";
?>
untitled.JPG
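For comparison, here is a small sketch of both fixes side by side, with the failing call included so the difference is visible. The variable names are made up for illustration, and the @ is only there to silence the warning from the deliberately broken pattern.

```php
<?php
$str = 'rooms </strong> <i>9ismyanswer</i></td>';

// Broken: the first "/" inside "</strong>" ends the pattern early, the rest
// is read as modifiers, compilation fails and the call returns false.
$bad = @preg_match_all('/(rooms)\s*(</strong>) <i>([\s\S]*?)(?=</i></td>)/', $str, $m1);

// Fix 1: keep "/" as the delimiter and escape every internal slash.
$escaped = preg_match_all('/(rooms)\s*(<\/strong>) <i>([\s\S]*?)(?=<\/i><\/td>)/', $str, $m2);

// Fix 2: switch to a delimiter that does not occur in the pattern.
$hash = preg_match_all('#(rooms)\s*(</strong>) <i>([\s\S]*?)(?=</i></td>)#', $str, $m3);

var_dump($bad);       // bool(false)
echo $m2[3][0], "\n"; // 9ismyanswer
echo $m3[3][0], "\n"; // 9ismyanswer
```

With the default PREG_PATTERN_ORDER, $m[3][0] is capture group 3 of the first match. Checking the return value against false before touching $matches is a cheap way to catch this kind of pattern error.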
Sometimes I feel stupid...:-(
I like to think we're all here to learn :D
ASKER
Thanks for the help, everybody. vks_vicky's solution did exactly what I was looking for. I learned quite a bit from the discussion too.