Solved

RETRIEVING TEXT FILE CONTENT

Posted on 2001-07-15
Last Modified: 2013-12-25
I have a lot of HTML files. I want a Perl script that will loop through each file and store the title and description values in two different arrays.

Please find below the template of the HTML pages.

<html>
<head>
<title> i am the title</title>
<description> i describe this html page</description>
</head>
<body>
</body>
</html>

Question by:augblay
2 Comments
 
Accepted Solution

by: psogaa, earned 200 total points
ID: 6285949
Use the Perl script below, passing the target directory as its argument.

****************************************************

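#!/usr/bin/perl
use strict;
use warnings;

# The target directory is the first command-line argument.
@ARGV or die "usage: $0 <directory>\n";
my $targetDir = $ARGV[0];

# Collect the names of all .htm/.html files in the directory.
opendir( my $dh, $targetDir ) or die "can't open dir $targetDir: $!";
my @files = grep( /\.html?$/i, readdir($dh) );
closedir($dh);

my ( @titles, @descriptions );
foreach my $file (@files) {
  open( my $fh, '<', "$targetDir/$file" ) or die "can't open file $file: $!";
  # Slurp the whole file; $/ is undefined only inside the do block.
  my $fileContent = do { local $/; <$fh> };
  close($fh);

  # Capture the tag contents, trimming surrounding whitespace.
  # Push only on a successful match; otherwise $1 and $2 would still
  # hold the values captured from the previous file.
  if ( $fileContent =~ /<title>\s*(.*?)\s*<\/title>.*?<description>\s*(.*?)\s*<\/description>/si ) {
    push( @titles,       $1 );
    push( @descriptions, $2 );
  }
  else {
    warn "no title/description found in $file\n";
  }
}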
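
The script above uses a single regex, so it expects <title> to appear before <description> in every file. If that ordering cannot be relied on, one possible variation is to match each tag on its own. The sketch below is only an illustration run against the template from the question; the helper name extract_meta is made up for this example and is not part of the accepted answer.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the answer above): returns the
# trimmed contents of the first <title> and <description> elements in $html.
# Each tag is matched separately, so their order does not matter; a missing
# tag yields undef in that position.
sub extract_meta {
    my ($html) = @_;
    my ($title) = $html =~ /<title>\s*(.*?)\s*<\/title>/si;
    my ($desc)  = $html =~ /<description>\s*(.*?)\s*<\/description>/si;
    return ( $title, $desc );
}

# The template from the question, used here as sample input.
my $sample = <<'HTML';
<html>
<head>
<title> i am the title</title>
<description> i describe this html page</description>
</head>
<body>
</body>
</html>
HTML

my ( $title, $desc ) = extract_meta($sample);
print "title: $title\n";              # prints "title: i am the title"
print "description: $desc\n";         # prints "description: i describe this html page"
```

In the loop above, the call would take $fileContent instead of $sample, and the results would be pushed onto @titles and @descriptions only when both values are defined.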

Expert Comment

by:Moondancer
ID: 6419722
Open today, need more?
Moondancer
Community Support Moderator @ Experts Exchange

Question has a verified solution.
