• Status: Solved

Can a regex be conditional (pick up this or that)?

I'm building a small application that gathers sold prices on eBay. I divided the script into five preg_match_all sections, one for each of the five pieces of data I want to pull:

1) title
2) item number
3) bids
4) price
5) date

With the help of other Experts here I've been able to filter only the sold items. Everything works pretty well, but there are a few glitches and my script needs some fine-tuning. Here is an example:

preg_match_all("'<td class=\"prc bidsold g-b\">(.*?)</td>'si", $source, $price_arr);
foreach ($price_arr[1] as $price)
    echo "<tr><td><input type=\"text\" name=\"price[]\" size=\"10\" value=\"".$price."\"></td>";

This script gets the sold price of an item. The line to scrape looks like this:

<td class=\"bids\"><div class=\"bin1\">5 Bids</div><span class=\"sold\">Sold</span></td><td class=\"prc bidsold g-b\">$300.00</td><td class=\"tme  rt\">

But if the seller offers free shipping, the line looks like this:

<td class=\"bids\"><div class=\"bin1\">1 Bid</div><span class=\"sold\">Sold</span></td><td class=\"prc\"><div class=\"bidsold g-b\">$10.76</div><span class=\"tfsp\">Free shipping</span></td><td class=\"tme  rt\">

As you can see, the "prc bidsold g-b" class that I used to find on a td tag is now split: the "prc" part is still on the td tag, but "bidsold g-b" is now on a div tag inside it.

So the question is: how can I modify the preg_match_all regex to pick up both cases? I tried different approaches, and the best I could get was an empty cell in my result table. As it stands, nothing is returned at all if the seller offered free shipping.

If at all possible I'd like a detailed explanation, because I have a few other similar situations to address (for example, if the title is bold I don't pick it up, but that is another question).

1 Solution
käµfm³d 👽Commented:
The simplest solution would probably be to just add a new condition for the separate scenario, using a vertical bar ( OR ) to separate the conditions. For the following, I just created a pattern to match the "free shipping" condition. It is pretty much the same as what you had before, just with <div> and the altered class attributes.
preg_match_all("'(?:<td class=\"prc bidsold g-b\">|<div class=\"bidsold g-b\">)(.*?)(?:</td>|</div>)'si", $source, $price_arr);

käµfm³d 👽Commented:

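The (?: ... ) is a non-capturing group. It gives the OR ( | ) an internal boundary, so each alternation covers only the opening tags or only the closing tags, and it does not add an extra capture group, so the price is still returned in $price_arr[1].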
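
To make the behaviour easier to check, here is a minimal sketch that runs the pattern above against the two sample rows from the question. The helper name extract_sold_prices and the ~ delimiter are my own choices for illustration, not something from the original script, and the sample HTML is written with plain quotes instead of the escaped ones shown in the question.

```php
<?php
// The two sample rows from the question, joined into one string.
// Single-quoted strings are used so that "$300.00" is not interpolated.
$source = '<td class="bids"><div class="bin1">5 Bids</div><span class="sold">Sold</span></td>'
        . '<td class="prc bidsold g-b">$300.00</td><td class="tme  rt">'
        . '<td class="bids"><div class="bin1">1 Bid</div><span class="sold">Sold</span></td>'
        . '<td class="prc"><div class="bidsold g-b">$10.76</div>'
        . '<span class="tfsp">Free shipping</span></td><td class="tme  rt">';

// Hypothetical helper (name assumed for this example only).
function extract_sold_prices(string $html): array
{
    // First (?: ... ) group: either opening tag. Second: either closing tag.
    // Only (.*?) captures, so the prices land in $matches[1].
    $pattern = '~(?:<td class="prc bidsold g-b">|<div class="bidsold g-b">)(.*?)(?:</td>|</div>)~si';
    preg_match_all($pattern, $html, $matches);
    return $matches[1];
}

$prices = extract_sold_prices($source);
print_r($prices);
```

With these two rows, $prices should hold one entry per sold item, with or without free shipping. Note that the pattern accepts either closing tag after either opening tag; that is fine for rows shaped like the samples, but it is an assumption worth re-checking if the page markup changes.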
gamebits (Author) Commented:
Awesome, works perfectly. Are you up for more points? I can post two more questions similar to this one. In the meantime, you may want to have a look at my other question that is still open.
käµfm³d 👽Commented:
I'm always game to help out  :)
Question has a verified solution.
