headzoo
asked on
RegExp Question
Howdy,
I'm pretty comfortable with regexps, but I don't know how to do this: Using PHP's preg_match(_all) to parse a SQL SELECT statement. Specifically, a SELECT statement can pretty much have an unlimited number of AND/OR clauses. So how do you write a regexp that can handle zero or more AND/OR statements? Keep in mind I need to capture back references for each one.
So here is the regexp I use to get a one or none AND/OR statements:
((AND|OR) ([a-zA-Z0-9\-_]+) ([=<>\!]+) \'([a-zA-Z0-9\-_]+)\')?
This is part of a larger regexp, but it's the part that stumps me. So that's fine for a SQL query like this:
SELECT column FROM table
WHERE column = 'something'
AND column = 'something'
But what if I have a query like this
SELECT column FROM table
WHERE column = 'something'
AND column = 'something'
AND column = 'something'
OR column = 'something'
AND column 'something'
That's a ridiculous query, but it gets the point across that the query may have zero, one, or more AND/OR statements.
I'm pretty comfortable with regexps, but I don't know how to do this: Using PHP's preg_match(_all) to parse a SQL SELECT statement. Specifically, a SELECT statement can pretty much have an unlimited number of AND/OR clauses. So how do you write a regexp that can handle zero or more AND/OR statements? Keep in mind I need to capture back references for each one.
So here is the regexp I use to get a one or none AND/OR statements:
((AND|OR) ([a-zA-Z0-9\-_]+) ([=<>\!]+) \'([a-zA-Z0-9\-_]+)\')?
This is part of a larger regexp, but it's the part that stumps me. So that's fine for a SQL query like this:
SELECT column FROM table
WHERE column = 'something'
AND column = 'something'
But what if I have a query like this
SELECT column FROM table
WHERE column = 'something'
AND column = 'something'
AND column = 'something'
OR column = 'something'
AND column 'something'
That's a ridiculous query, but it gets the point across that the query may have zero, one, or more AND/OR statements.
ASKER
I tried that, but it didn't work. The regexp matches, but it doesn't capture the back references in the second, third, etc, AND/OR statements. I'm not worried about all that other wacky things that could be in an SQL query. This is more about me just being curious how something like this works.
You can also use preg_split with PREG_SPLIT_DELIM_CAPTURE. That always works out nice in this kind of situations.
-r-
-r-
ASKER
What I was thinking is using a regexp to bet all the AND/OR statements in one back reference, and then run another one to break it up. So if the query looks like this:
SELECT column FROM table
WHERE column = 'something'
AND column = 'something'
OR column = 'something'
LIMIT 1
I could get just the "AND column = 'something' OR column = 'something'", and then run another regexp on that. I'll have to look into it.
SELECT column FROM table
WHERE column = 'something'
AND column = 'something'
OR column = 'something'
LIMIT 1
I could get just the "AND column = 'something' OR column = 'something'", and then run another regexp on that. I'll have to look into it.
how's about using preg_match_all y'all?
$regex = "((AND|OR)\\s+([a-zA-Z0-9\ -_]+)\\s*( [=<>\!]+)\ \s*\'?([a- zA-Z0-9\-_ ]+)\'?(\\s +|$))";
$num_of_matches = preg_match_all($regex, $query, $matches);
print_r($matches);
$regex = "((AND|OR)\\s+([a-zA-Z0-9\
$num_of_matches = preg_match_all($regex, $query, $matches);
print_r($matches);
of course you need to make the regex correct :S
$regex = "/((AND|OR)\\s+([a-zA-Z0-9 \-_]+)\\s* ([=<>\!]+) \\s*\'?([a -zA-Z0-9\- _]+)\'?(\\ s+|$))/is" ;
$regex = "/((AND|OR)\\s+([a-zA-Z0-9
here's my more complete version:
<pre>
<?php
$query = "SELECT * FROM `tablename` WHERE `foo`='bar' AND `bar`=5 OR `cheese`='nice' AND beer='too'";
$regex = "/(WHERE|AND|OR)\\s+`?(\\w +)`?\\s*=\ \s*'?(\\w+ )'?(\\s*|$ )/is";
preg_match_all($regex, $query, $matches, PREG_SET_ORDER);
print_r($matches);
?>
</pre>
test it here:
http://www.keteracel.com/test/test681.php
cheers,
keteracel
<pre>
<?php
$query = "SELECT * FROM `tablename` WHERE `foo`='bar' AND `bar`=5 OR `cheese`='nice' AND beer='too'";
$regex = "/(WHERE|AND|OR)\\s+`?(\\w
preg_match_all($regex, $query, $matches, PREG_SET_ORDER);
print_r($matches);
?>
</pre>
test it here:
http://www.keteracel.com/test/test681.php
cheers,
keteracel
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Have you concidered working your code out locally before posting keteracel?
It is nice to view your progress and incremental code, but might be easier for the Questioner to keep track of what is happening. From the emails he received he could have deducted that there are 4 more alternative solutions, which isn't the case.
-r-
It is nice to view your progress and incremental code, but might be easier for the Questioner to keep track of what is happening. From the emails he received he could have deducted that there are 4 more alternative solutions, which isn't the case.
-r-
Well, now I have another email from your flame ;)
<?
die("lol");
?>
die("lol");
?>
Of course, it becomes even more complicated when you consider that you need to handle parentheses inside the query:
SELECT column FROM table WHERE condition1 AND (condition2 OR condition3);
In that case, you turn to something like BNF.