Solved

Regular Expression for xml extraction

Posted on 2003-03-20
Last Modified: 2013-11-19
I have XML files of the following form:

<a>
.
.
.
<b>
<c>variable-length multi-line markup</c>
<c>variable-length multi-line markup</c>
<c>variable-length multi-line markup</c>
.
.
.
<c>variable-length multi-line markup</c>
</b>
.
.
.
</a>

I'm trying to find the easiest way to extract the c elements into an array. I tried using

preg_match_all("'(\<c\>)[\s\S]*(\<\/c\>)'",$xml,$matches);

but $matches[0][0] just grabs everything between the first <c> and the last </c>. I'm wondering if there is some expression I could use to limit the matches to individual sets of <c>...</c>, or if some other method might work.

Thanks.
Question by:huenterprises
4 Comments
 
LVL 2

Accepted Solution

by:
bobsledbob earned 500 total points
ID: 8178273

preg_match_all("'(\<c\>)[\s\S]*?(\<\/c\>)'",$xml,$matches);

The question mark added after the * means "don't be greedy". Your original pattern matched from the first <c> all the way to the very last </c>. A non-greedy quantifier matches as little as possible rather than as much as possible, so each match stops at the nearest </c>.

By the way, the following should work too, and it's a bit shorter. Do you really need to parenthesize the <c>'s?

preg_match_all("|<c>[\s\S]*?</c>|", $xml, $matches);
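
To make the difference concrete, here is a small sketch that runs the greedy and the non-greedy pattern against the same input. The sample string is invented for illustration and only mimics the structure from the question; the third call, which adds a capture group to get just the text between the tags, is an optional extension and not part of the answer above.

```php
<?php
// Made-up sample input with two <c> elements, one spanning two lines.
$xml = "<a>\n<b>\n<c>one\ntwo</c>\n<c>three</c>\n</b>\n</a>";

// Greedy: * grabs as much as possible, so there is a single match
// running from the first <c> to the last </c>.
preg_match_all('|<c>[\s\S]*</c>|', $xml, $greedy);

// Non-greedy: *? stops at the nearest </c>, giving one match per element.
preg_match_all('|<c>[\s\S]*?</c>|', $xml, $lazy);

// Optional: a capture group returns only the contents, in $inner[1].
preg_match_all('|<c>([\s\S]*?)</c>|', $xml, $inner);

echo count($greedy[0]), "\n"; // 1
echo count($lazy[0]), "\n";   // 2
print_r($inner[1]);           // "one\ntwo" and "three"
```

Note that a regular expression like this assumes the <c> elements carry no attributes and are never nested inside each other; if either can happen, an XML parser is the safer choice.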

More info about greedy at the PHP manual page...

http://www.php.net/manual/en/function.preg-match-all.php

 
LVL 33

Expert Comment

by:snoyes_jw
ID: 8181794
You might also try using PHP's XML parser
(http://www.php.net/manual/en/ref.xml.php)

In your open/close tag handler functions, you can set a flag when you reach the opening tag "c" and unset it at the closing tag, then test for that flag in your character data handler function to add the contents to an array.
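
A minimal sketch of that flag-based approach, assuming the XML parser extension is available and a PHP version that supports closures. The variable names ($items, $inC) and the sample input are made up for illustration. This version collects only the character data inside each <c>; if the elements contain nested markup that must be preserved, the tag handlers would also need to record those tags.

```php
<?php
// Made-up sample input; element names mirror the question.
$xml = "<a><b><c>first\nitem</c><c>second</c></b></a>";

$items = [];   // collected text of each <c> element
$inC = false;  // flag: currently inside a <c> element?

$parser = xml_parser_create();
// Keep tag names as written; by default the parser upper-cases them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    // Open-tag handler: set the flag and start a new array entry.
    function ($parser, $name, $attrs) use (&$items, &$inC) {
        if ($name === 'c') {
            $inC = true;
            $items[] = '';
        }
    },
    // Close-tag handler: unset the flag.
    function ($parser, $name) use (&$inC) {
        if ($name === 'c') {
            $inC = false;
        }
    }
);

xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$items, &$inC) {
        if ($inC) {
            // The handler can fire more than once per element, so append.
            $items[count($items) - 1] .= $data;
        }
    }
);

if (!xml_parse($parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($parser)));
}

print_r($items);
```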
 

Author Comment

by:huenterprises
ID: 8182220
Thanks.

preg_match_all("|<c>[\s\S]*?</c>|", $xml, $matches);

That was just what I was needing.
 
LVL 3

Expert Comment

by:jayrod
ID: 8182373
Where is the best place to learn about preg_match syntax?
