Extracting HTML tag contents to use as a parameter

Posted on 2004-10-15
Last Modified: 2013-12-25

I'm no expert at Perl, although I can hack together basic scripts. Our Perl developer was recently released for professional negligence, so I am having to pick up the pieces of one of his projects.

What I need to be able to do is extract the contents of the HTML <title> tag from the page which called the script, so I can pass it as a parameter in a query string to a pricing search tool.

We currently have it calling a fixed field through:
my $q=new CGI;
my $part = $q->param('Part');

but this means altering the layout of several thousand pages to include a page specific part description. A better option is to just pass the contents of the Title tag.
This is a UNIX script.

Anyone have any ideas?


Question by:asparak

Expert Comment

ID: 12319096
Are you saying that your "database" of parts descriptions is kept within the <title> parts descriptions </title> tags and is spread across several thousand pages of static HTML?  Can you show us an example of the HTML and Perl script you're using and/or provide a link to one of the pages?

Author Comment

ID: 12319455
The Title tag looks like a standard HTML tag : <title>My Server Model here</title>

Our current test version, which is only available on our internal network while I try to fix this, contains the following:

<a href="/cgi-bin/pTester3.cgi?Part=My%20Server%20Model%20here" target="_blank">Price It <span class="rightarrow">&raquo;</span></a>

What I need to do is to change this to:
<a href="/cgi-bin/pTester3.cgi" target="_blank">Price It <span class="rightarrow">&raquo;</span></a> to remove that hard coding issue.

The cgi script then needs to parse the HTML file it was called from and extract the description from the Title tag, to complete the following piece of code within pTester3.cgi
my $q=new CGI;
#my $part = $q->param('Part');
#Need to fix the following line:
my $part=$q->(the title of the page I was called from);
print $q->redirect(-URL=>"https://$portal/webquery/Query?Query.findButton=Find&Query.prodDescOutput=$part&Query.selectedOuputColumns=partNum%2CunitPrice");

This issues a command to our secure server to display up to the minute pricing information back to an authorised user for that product group. $portal is a parameter extracted from the session cookie to point the command to the right pricing information portal.

I have anonymised things a little.
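One thing worth noting about the redirect line above: $part is interpolated straight into the URL, so a description containing spaces, "&" or "=" would break the query string. Below is a minimal sketch of escaping it first. The helper names escape_query_value and pricing_url and the host portal.example.com are made up for illustration, and the hand-rolled escaping assumes $part is a plain byte string. If the URI::Escape module is installed, its uri_escape function is the more usual way to do this.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode everything except the characters
# that RFC 3986 calls "unreserved". Assumes a byte string, not wide characters.
sub escape_query_value {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord($1))/ge;
    return $value;
}

# Hypothetical helper: build the pricing URL from the thread with the
# part description safely escaped. Parameter names are copied verbatim.
sub pricing_url {
    my ($portal, $part) = @_;
    return "https://$portal/webquery/Query"
         . '?Query.findButton=Find'
         . '&Query.prodDescOutput=' . escape_query_value($part)
         . '&Query.selectedOuputColumns=partNum%2CunitPrice';
}

# In the CGI script this would be used roughly as:
#   print $q->redirect(-URL => pricing_url($portal, $part));
```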

Expert Comment

ID: 12320522
It is not possible to do it in the manner you're wanting because the HTML page won't be passing anything to the CGI script.  If you don't want to pass the info in the link, you'll probably need the Perl script to read and parse the tags in the calling HTML file, which would not be very efficient.

Accepted Solution

FishMonger earned 125 total points
ID: 12320640
Here's a module that will help extract the info.

Expert Comment

ID: 12320678

Author Comment

ID: 12320864

I'll need to try and get the admins to have HeadParser installed on the server. Then try to figure out how to write the code to get HTTP_Referer and parse it.

Not sure I'm up to this, but I'll give it a go, unless you can think of a better way to approach this. All I have to go on are the few sparse notes of the developer before he was escorted from the building.

One thought I had was to embed some code in the head to populate the part dynamically. The head is templated, so one change could be made to all the pages. It's just the body portion of the page we want to avoid having to customise.
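For anyone attempting the same thing, here is a rough sketch of the HeadParser route, not the code that was actually deployed. It assumes the calling pages are static files under the web server's document root, and that the Referer header is present and trustworthy, which it often is not. The helper names referer_to_path and title_from_html are made up, and percent-encoded characters in the path (such as %20) are not decoded here.

```perl
use strict;
use warnings;
use HTML::HeadParser;

# Hypothetical helper: map a Referer URL to a file under the document root.
# Returns undef for a missing, malformed or suspicious-looking Referer.
sub referer_to_path {
    my ($referer, $docroot) = @_;
    return unless defined $referer;
    my ($path) = $referer =~ m{^https?://[^/]+(/[^?#]*)};
    return unless defined $path;
    return if $path =~ m{(?:^|/)\.\.(?:/|$)};    # refuse directory traversal
    return $docroot . $path;
}

# Return the <title> text of an HTML string, or undef if none is found.
sub title_from_html {
    my ($html) = @_;
    my $parser = HTML::HeadParser->new;
    $parser->parse($html);
    return $parser->header('Title');
}

# In the CGI script this might be wired up roughly as:
#   my $file = referer_to_path($q->referer, $ENV{DOCUMENT_ROOT});
#   ... read $file into $html, then:
#   my $part = title_from_html($html);
```

If no usable Referer arrives, falling back to the existing Part parameter keeps the link working.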


Expert Comment

ID: 12331944
> The cgi script then needs to parse the HTML file it was called from and extract the description from the Title tag
In general: impossible. Period.
You either need active scripting on the client side to do that, or you need to tell your CGI to request the page itself (which is totally unreliable).

You have the following choices:
  1. leave as is, meaning the generated page contains links with GET or POST requests which carry "your title"
  2. build sessions server-side where you store the title, and then get it back when your CGI is called again with the same session ID
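To make the "request the page itself" option concrete, here is a hedged sketch using HTTP::Tiny, which ships with Perl 5.14 and later. It only fetches a Referer that points at the site's own host, and returns undef otherwise so the caller can fall back to choice 1. The helper names is_own_page and fetch_referring_page are made up for illustration, and the Referer header remains client-supplied and optional, so this is no more reliable than described above.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical check: accept a Referer only if it points at our own host.
# URLs with userinfo ("user@host") are rejected outright.
sub is_own_page {
    my ($referer, $own_host) = @_;
    return 0 unless defined $referer && defined $own_host;
    my ($host) = $referer =~ m{^https?://([^/:?#@]+)(?::\d+)?(?:[/?#]|$)}i;
    return 0 unless defined $host;
    return lc($host) eq lc($own_host) ? 1 : 0;
}

# Fetch the referring page's HTML, or return undef so the caller can
# fall back to the explicit Part parameter.
sub fetch_referring_page {
    my ($referer, $own_host) = @_;
    return unless is_own_page($referer, $own_host);
    my $response = HTTP::Tiny->new(timeout => 5)->get($referer);
    return $response->{success} ? $response->{content} : undef;
}

# In the CGI script this might look roughly like:
#   my $html = fetch_referring_page($q->referer, $q->server_name);
#   my $part = defined $html ? ... parse the title ... : $q->param('Part');
```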

Author Comment

ID: 12336838
It's been a nasty hack, but with the help of a friend who's far better at Perl than I am, we have managed to test HTML::HeadParser successfully on dev. Just need to push it out to live now.

Probably explains why it took our developer 6 months to do.


Question has a verified solution.
