Solved

How to pull an HTML element present inside a div

Posted on 2011-04-20
Last Modified: 2012-06-27
I need to pull the h1 element from HTML markup, but only the one inside a div with a certain id. For example, in the following HTML:

<html>
<body>
<h1>some testing</h1>
<div id="column-middle page-content"><p>some stuff inside first div</p> <h1>some title inside the div </h1><p> khjk lh</p></div>
<h1>My First Heading</h1> </div> more text
<p>My first paragraph.</p>
<div id="anding-left page-content">some stuff inside second div </div>
<p>content=23 april 2011 </p>
</body>
</html>

I only need <h1>some title inside the div </h1>





Question by:mmalik15
12 Comments
 
LVL 3

Expert Comment

by:Mrugesh1
ID: 35431691
Try this code:

<html>
<head>
<script type="text/javascript" language="javascript">
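function GetH1Value()
{
      // Look up the div by its id, then search for h1 elements only inside it.
      var divObj = document.getElementById("column-middle page-content");
      var headings = divObj.getElementsByTagName("h1");
      // Show the first h1 found inside the div.
      alert("<h1>" + headings[0].innerHTML + "</h1>");
}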
</script>
</head>
<body>
<h1>some testing</h1>
<div id="column-middle page-content"><p>some stuff inside first div</p> <h1>some title inside the div </h1><p> khjk lh</p></div>
<h1>My First Heading</h1> </div> more text
<p>My first paragraph.</p>
<div id="anding-left page-content">some stuff inside second div </div>
<p>content=23 april 2011 </p>

<button id="btnTest" onclick="GetH1Value()">Get H1 Value</button>

</body>
</html>

 

Author Comment

by:mmalik15
ID: 35431704
Thanks for the comment with the JavaScript code, but I am purely looking for a regular expression.
 
LVL 3

Expert Comment

by:Mrugesh1
ID: 35431741
I think a regular expression won't work for this kind of problem.
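
For comparison, here is a minimal sketch of the non-regex route in Python, using the standard library's html.parser. The names H1InDivParser and h1_in_div are made up for illustration, and SAMPLE is a trimmed copy of the markup from the question. It tracks div nesting, so it should also cope with nested divs, but treat it as a sketch rather than a hardened scraper.

```python
from html.parser import HTMLParser

# Trimmed copy of the sample markup from the question.
SAMPLE = """<body>
<h1>some testing</h1>
<div id="column-middle page-content"><p>some stuff inside first div</p> <h1>some title inside the div </h1><p> khjk lh</p></div>
<h1>My First Heading</h1>
<div id="anding-left page-content">some stuff inside second div </div>
</body>"""


class H1InDivParser(HTMLParser):
    """Collect the text of every <h1> inside the div with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.div_depth = 0   # greater than 0 while inside the target div
        self.in_h1 = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            if self.div_depth > 0:
                self.div_depth += 1      # nested div inside the target
            elif dict(attrs).get("id") == self.target_id:
                self.div_depth = 1       # entered the target div
        elif tag == "h1" and self.div_depth > 0:
            self.in_h1 = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1
        elif tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        if self.in_h1:
            self.titles[-1] += data


def h1_in_div(html, target_id):
    """Return the text of each h1 found inside the div whose id is target_id."""
    parser = H1InDivParser(target_id)
    parser.feed(html)
    parser.close()
    return parser.titles


print(h1_in_div(SAMPLE, "column-middle page-content"))
```

The h1 elements outside the div are skipped because div_depth is still 0 when the parser reaches them.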

 

Author Comment

by:mmalik15
ID: 35431752
(?<=<div\sid="column-middle\spage-content").*(?=<\/div>)

The regular expression above matches the whole div, but I guess there should be a way to extract only the h1 inside this div.
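
One way to build on that pattern is to do it in two passes: first grab the div's contents, then look for the h1 inside what was matched. A rough sketch in Python follows. It assumes the div sits on a single line (as in the sample) and contains no nested div. The names DIV_BODY, H1 and h1_inside_div are hypothetical, and SAMPLE is a trimmed copy of the markup from the question.

```python
import re

# Trimmed copy of the sample markup from the question.
SAMPLE = """<body>
<h1>some testing</h1>
<div id="column-middle page-content"><p>some stuff inside first div</p> <h1>some title inside the div </h1><p> khjk lh</p></div>
<h1>My First Heading</h1>
<div id="anding-left page-content">some stuff inside second div </div>
</body>"""

# Pass 1: the pattern from the post above, which matches the div's contents.
# Without a DOTALL flag, . stops at the end of the line, so the div must be on one line.
DIV_BODY = re.compile(r'(?<=<div\sid="column-middle\spage-content").*(?=<\/div>)')

# Pass 2: any h1 element with plain-text content.
H1 = re.compile(r'<h1>[^<]*</h1>')


def h1_inside_div(html):
    """Return the h1 elements found inside the first matching div."""
    div_match = DIV_BODY.search(html)
    if div_match is None:
        return []
    return H1.findall(div_match.group(0))


print(h1_inside_div(SAMPLE))
```

Splitting the work this way keeps each pattern simple, at the cost of a second search.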
 
LVL 48

Expert Comment

by:hernst42
ID: 35432343
If you do not have nested divs, you could try modifying your pattern to:

(?<=<div\sid="column-middle\spage-content").*\<h1\>([^<]+)\<\/h1\>.*(?=<\/div>)
 

Author Comment

by:mmalik15
ID: 35432498
Thanks for the comment, hernst42.

Sorry, but even assuming there are no nested divs, your suggested regular expression does not seem to match and extract the h1.
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 35432674
Which language?
 

Author Comment

by:mmalik15
ID: 35432703
Hi kaufmed,

I need to use it in my XML file again, and I wonder if it's possible using only a regular expression.

P.S. I can use a Python script too.
 
LVL 75

Accepted Solution

by:
käµfm³d   👽 earned 2000 total points
ID: 35433009
i need to use it my xml file again...
I'm not sure what you mean.

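Here is a Python script that should do the job.
import re

# Read the whole file at once so the pattern can span line breaks.
with open('/test.txt', 'r') as f:
    data = f.read()

# (?s) lets . match newlines and (?i) ignores case. The (?:.(?!</div>))+ part
# stops before the div's closing tag, so this assumes no nested divs.
matcher = re.compile(r'(?si)<div\s*[^>]*id="column-middle page-content"(?:.(?!</div>))+(<h1>[^<]*</h1>)')
for match in matcher.findall(data):
    print(match + '\n')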
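
To try the same pattern without a file on disk, here is a small sketch that runs it against an in-memory string. The function name find_h1 is made up for illustration, and SAMPLE is a trimmed copy of the markup from the question. The same caveat applies: a nested div inside the target div would end the search early.

```python
import re

# Trimmed copy of the sample markup from the question.
SAMPLE = """<body>
<h1>some testing</h1>
<div id="column-middle page-content"><p>some stuff inside first div</p> <h1>some title inside the div </h1><p> khjk lh</p></div>
<h1>My First Heading</h1>
<div id="anding-left page-content">some stuff inside second div </div>
</body>"""

# Same pattern as in the script above, written as a raw string.
H1_IN_DIV = re.compile(
    r'(?si)<div\s*[^>]*id="column-middle page-content"(?:.(?!</div>))+(<h1>[^<]*</h1>)'
)


def find_h1(html):
    """Return the captured h1 element for each matching div that contains one."""
    return H1_IN_DIV.findall(html)


print(find_h1(SAMPLE))

# A div with no h1 inside yields nothing, even if an h1 follows the div.
print(find_h1('<div id="column-middle page-content"><p>x</p></div><h1>outside</h1>'))
```

Because the repeated group is greedy, a div holding several h1 elements would yield only the last one.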

 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 35433015
P.S.

Execute under v2.7.
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 35433022
Execute under v2.7.
That was a bit misleading...  : (

I meant to say that I tested it under v2.7. I believe it should work for that or above.
 

Author Closing Comment

by:mmalik15
ID: 35697607
Sorry for the late reply, kaufmed; I've just come back from holidays.
