Solved

how to extract data from <p class=""> in multiple html files to csv file

Posted on 2005-04-25
Last Modified: 2010-04-15
I have approximately 2400 HTML pages from which I need to extract the values of the <p class=" "> elements.
Each page includes the customer's company name, address, city, state, zip, phone, and fax.
I need these on a single row per page, separated by commas. Is there a program that will do this?
If so, I could really use it. Thanks for your time.

example:

<p class="NomEmpresa">Company</p>
</td>
<td valign="Top">
<p class="NomEmpresa" align="right">company number</p>
</td>
<tr>
<td colspan="2">
<p class="Adresa">add1</p>
</td>
</tr>
<tr>
<td colspan="2">
<p><span class="Adresa">csz</span></p>
</td>
</tr>
<tr>
<td colspan="2">
<p><span class="Adresa"></a></span></p>
</td>
</tr>
<tr>
<td colspan="2">
<p><span class="Adresa">Tel.: Fax: </span></p>
</td>
</tr>
<tr>
<td colspan="2">
<p><span class="Adresa">E-mail: </span></p>
Question by:christpher7
4 Comments
 

Expert Comment

by:grg99
ID: 13861354
Try this, works fine:

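use strict; use warnings;

# FileList.txt holds one HTML file name per line.
# Match both classes from the example, allowing extra attributes such as align="right".
my $Pat = qr/class="(?:NomEmpresa|Adresa)"[^>]*>([^<]+)</;

open( my $FL, '<', 'FileList.txt' ) or die "No FileList.txt: $!\n";

while ( my $Fn = <$FL> ) {
   chomp $Fn;                             # strip the newline, otherwise open() fails
   next if $Fn eq '';
   open( my $F, '<', $Fn ) or die "No file $Fn: $!\n";
   my @fields;

   while ( <$F> ) {
      push @fields, $1 while /$Pat/g;     # collect every match on the line
   }

   close( $F );
   print join( ',', @fields ), "\n";      # one comma-separated row per HTML file
}
close( $FL );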


Expert Comment

by:efn
ID: 13863589
> Try this, works fine:

I believe it, but since this is the C area, you might confide what language it's written in!  (Perl, I think.)

Accepted Solution

by:grg99 (earned 2000 total points)
ID: 13865568
Tee hee-- Perl it is.

If it HAS to be in C, it will take a few more lines.
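
For illustration only, here is a rough C sketch of the same idea. It is not code from this thread: the function names (`next_class`, `extract_row`, `slurp`, `process_list`) are made up for the example, and it assumes the class names shown in the question, a FileList.txt with one file name per line, and pages small enough to read into memory. It uses plain substring search rather than an HTML parser, so it only matches the exact `class="..."` spelling, and it does no CSV quoting, so a field that itself contains a comma would shift the columns.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Class attributes taken from the sample HTML in the question. */
static const char *const kClasses[] = { "class=\"NomEmpresa\"", "class=\"Adresa\"" };

/* Find the earliest wanted class attribute at or after p.
   Returns a pointer just past the attribute, or NULL if none is left. */
const char *next_class(const char *p)
{
    const char *best = NULL;
    size_t best_len = 0;
    for (size_t i = 0; i < sizeof kClasses / sizeof kClasses[0]; i++) {
        const char *hit = strstr(p, kClasses[i]);
        if (hit && (!best || hit < best)) {
            best = hit;
            best_len = strlen(kClasses[i]);
        }
    }
    return best ? best + best_len : NULL;
}

/* Write the text of every matching element in html to out as one
   comma-separated row. Empty elements are skipped.
   Returns the number of fields written. */
int extract_row(const char *html, char *out, size_t outsz)
{
    assert(out != NULL && outsz > 0);
    int count = 0;
    size_t used = 0;
    out[0] = '\0';
    for (const char *p = next_class(html); p; p = next_class(p)) {
        const char *start = strchr(p, '>');   /* end of the opening tag */
        if (!start) break;
        start++;
        size_t len = strcspn(start, "<");     /* text up to the next tag */
        p = start + len;
        if (len == 0) continue;               /* e.g. <span class="Adresa"></a> */
        if (used + len + 2 > outsz) break;    /* need room for ',' + text + NUL */
        if (count > 0) out[used++] = ',';
        memcpy(out + used, start, len);
        used += len;
        out[used] = '\0';
        count++;
    }
    return count;
}

/* Read a whole file into a malloc'd, NUL-terminated buffer (caller frees). */
char *slurp(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    char *buf = NULL;
    size_t len = 0, n;
    char chunk[4096];
    while ((n = fread(chunk, 1, sizeof chunk, f)) > 0) {
        char *tmp = realloc(buf, len + n + 1);
        if (!tmp) { free(buf); fclose(f); return NULL; }
        buf = tmp;
        memcpy(buf + len, chunk, n);
        len += n;
    }
    fclose(f);
    if (!buf) buf = malloc(1);                /* empty file */
    if (buf) buf[len] = '\0';
    return buf;
}

/* For each file named in list_path (one per line), write one row to out.
   Returns the number of rows written, or -1 if the list cannot be opened. */
int process_list(const char *list_path, FILE *out)
{
    FILE *list = fopen(list_path, "r");
    if (!list) return -1;
    char name[1024], row[4096];
    int rows = 0;
    while (fgets(name, sizeof name, list)) {
        name[strcspn(name, "\r\n")] = '\0';   /* drop the line ending */
        if (name[0] == '\0') continue;
        char *html = slurp(name);
        if (!html) { fprintf(stderr, "cannot read %s\n", name); continue; }
        extract_row(html, row, sizeof row);
        fprintf(out, "%s\n", row);
        free(html);
        rows++;
    }
    fclose(list);
    return rows;
}
```

Calling `process_list("FileList.txt", stdout)` from `main` and redirecting the output to a file would give one row per page, much like the Perl script. The fixed 4096-byte row buffer is an arbitrary choice for the sketch; fields that do not fit are dropped rather than truncated.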



 

Author Comment

by:christpher7
ID: 13879790
Thanks grg99.
