Solved

Extraction

Posted on 2011-05-09
Last Modified: 2012-06-27
Hi,

I'm looking for a way to extract the meta keywords from a list of websites, using a text file that contains a list of URLs as input.

For example:

file.txt

http://site.com/main/index.htm

output:

<meta name="keywords" content="information on photolithography photoresist equipment semiconductor wafer processing photoresist track systems silicon wafer processing photoresist track systems silicon wafer spin track Organic Light Emitting Diodes OLED wafer coating wafer fabrication">

Thank you
Question by:faithless1

Accepted Solution

by: Kim Ryan
You could look at this module: http://search.cpan.org/~gaas/HTML-Parser-3.68/lib/HTML/HeadParser.pm . It will parse the header of the web page you nominate. It then has methods to access the meta tags for charset and name (the one you need).
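
For anyone who wants to see the same idea end to end, here is a minimal sketch in Python using only the standard library. It is not the HTML::HeadParser module mentioned above, just an illustration of the same approach: parse each page and read the content of its `<meta name="keywords">` tag. The names `MetaKeywordsParser`, `extract_keywords`, and `keywords_from_file` are made up for this example, and the sketch assumes the input file has one URL per line and that pages decode acceptably as UTF-8.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class MetaKeywordsParser(HTMLParser):
    """Remembers the content of the first <meta name="keywords"> tag seen."""

    def __init__(self):
        super().__init__()
        self.keywords = None

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags matter, and only until the first match is found.
        if tag != "meta" or self.keywords is not None:
            return
        attr = dict(attrs)  # attribute names arrive lowercased
        if (attr.get("name") or "").lower() == "keywords":
            self.keywords = attr.get("content") or ""


def extract_keywords(html):
    """Return the keywords string from an HTML document, or None if absent."""
    parser = MetaKeywordsParser()
    parser.feed(html)
    return parser.keywords


def keywords_from_file(path):
    """Yield (url, keywords) for each non-blank line of a URL list file.

    Assumption: one URL per line; pages are decoded as UTF-8 with
    undecodable bytes replaced.
    """
    with open(path) as handle:
        for line in handle:
            url = line.strip()
            if not url:
                continue
            with urlopen(url, timeout=10) as response:
                html = response.read().decode("utf-8", errors="replace")
            yield url, extract_keywords(html)
```

To run it against the example input, loop over `keywords_from_file("file.txt")` and print each pair. The function returns only the value of the content attribute, so if you need the full tag as in the sample output, wrap the value back into `<meta name="keywords" content="...">` when printing.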