Solved

Filtering HTML on a Web Server

Posted on 1999-01-18
Last Modified: 2013-12-25
Every request that reaches a web server is answered with an HTML stream sent back to the requesting client. Is there a way I can filter ALL the HTML output from the web server BEFORE it is sent to the client? This is not about handling files with specific extensions differently; it concerns all the HTML output of the web server. I want to identify a specific tag that might appear anywhere in the output and DO SOMETHING when I find it.

Thanks.
Question by:ariefishler

Accepted Solution

by:
mouatts earned 100 total points
ID: 1832312
In short: yes, maybe. To do it you must write an extension to the web server, and how you do that depends on which web server you are using.

The basic approach is this: when the server transmits an HTML page, it reads the head section of the page and sends any HTTP headers indicated by HTTP-EQUIV meta tags.
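To make the head-parsing idea concrete, here is a minimal sketch in Python rather than in a real server extension. It only illustrates the principle: it collects the HTTP-EQUIV/CONTENT pairs found inside the head section, which a server could then send as headers. The class and function names (`HeadMetaParser`, `headers_from_head`) are made up for this example and are not part of any server API.

```python
from html.parser import HTMLParser


class HeadMetaParser(HTMLParser):
    """Collect (http-equiv, content) pairs from <meta> tags inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.headers = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attr_map = dict(attrs)
            name = attr_map.get("http-equiv")
            value = attr_map.get("content")
            if name and value is not None:
                self.headers.append((name, value))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def headers_from_head(html_text):
    """Return the header pairs a head-parsing server would pick up (hypothetical helper)."""
    parser = HeadMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.headers
```

Meta tags that appear outside the head section are ignored here, which mirrors the "head only" behaviour described above.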

This is known as head parsing. The trick to doing what you want is to add some parsing of your own that checks the rest of the file as well.
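As a rough illustration of "checking the rest of the file", the sketch below scans a whole HTML document for one specific tag and reports where it was found and with which attributes. It is written in Python for brevity, not as an NSAPI/ISAPI/Apache module, and the names `TagFinder` and `find_tag` as well as the tag `mytag` in the usage note are invented for the example.

```python
from html.parser import HTMLParser


class TagFinder(HTMLParser):
    """Record every occurrence of one target tag anywhere in the document."""

    def __init__(self, target_tag):
        super().__init__()
        self.target_tag = target_tag.lower()
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag == self.target_tag:
            # getpos() gives the parser's current (line, column) position.
            self.matches.append((self.getpos(), dict(attrs)))


def find_tag(html_text, target_tag):
    """Return a list of (position, attributes) for each matching tag (hypothetical helper)."""
    finder = TagFinder(target_tag)
    finder.feed(html_text)
    finder.close()
    return finder.matches
```

For instance, `find_tag(page, "mytag")` returns an empty list when the tag is absent, so the "do something" step can simply be skipped for pages that do not contain it.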

There is a slight problem with this: whilst HTML files are parsed, most other types are not. So you will also need to include code to parse any other types that you wish to apply this to, and configure the server accordingly.
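One way to picture the "only some types are parsed" point is a gate on the response's Content-Type before any filtering happens. The Python sketch below assumes the filter is handed the content type and the body as strings; `is_html_response` and `filter_response` are hypothetical names, and a real server extension would get this information through its own API.

```python
def is_html_response(content_type):
    """True when the media type (ignoring parameters such as charset) is text/html."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/html"


def filter_response(content_type, body, transform):
    """Apply transform to HTML bodies only; pass every other type through untouched."""
    if is_html_response(content_type):
        return transform(body)
    return body
```

Extending the filter to other types would then mean widening the check in `is_html_response` and supplying a parser that understands those types.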

I don't think doing this is much of a problem with Netscape (I know it isn't, because I've done it), and I think it shouldn't be a problem with either Apache or IIS. To do it with Oracle you will probably need to be on version 3, and you will need to dump the listener and use an alternative HTTP server (if you're not using Oracle this won't make any sense, so ignore it).

It is quite a big job, and certainly too big to cover here. I would suggest that you look up the documentation for Apache, NSAPI (for Netscape) or ISAPI (for IIS) as your next step.

Incidentally, in all cases you will basically have to write a library (a DLL on Windows, a .so on Unix) which is linked into the server dynamically. Generally such libraries are (and should be) written in C.

HTH
   Steve

Author Comment

by:ariefishler
ID: 1832313
Thanks for the answer. This is more or less what I expected, but before I accept I just want to clarify some things.
I actually need to parse only HTML (static, dynamic, and any other generated HTML) that passes through the web server at a specific site.
I think that makes the part you wrote about "There is a slight problem with this in that whilst HTML files are parsed most other types are not. So you will also need to include the code to parse any others that you wish to implement this upon and configure the server accordingly." irrelevant, doesn't it?

Can you elaborate on the other part you wrote: "The basic approach is that when an HTML page is transmitted by the server the server reads the head section of the page and transmits any HTTP messages indicated by HTTP-EQUIV meta tags"? What exactly is the web server doing here? I could not understand it from what you wrote. Do you also mean that I will have to hook my extension in at the end of this parsing done by the server?

What I am trying to do is plant my own tag in the HTML pages and parse it with my own code. This is something like what Cold Fusion does, if you know the product, but I think they have a specific extension for their files, and maybe that is how they map them to their code.
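To show what "plant my own tag and parse it with my own code" could look like, here is a small Python sketch. The tag name `x-insert`, its `name` attribute, and the `expand_custom_tags` function are all invented for illustration; this is not how Cold Fusion or any particular server does it. Each planted tag is replaced by the output of a handler registered under that name.

```python
import re

# Hypothetical custom tag: <x-insert name="..."> (case-insensitive, optional "/>").
TAG_PATTERN = re.compile(r'<x-insert\s+name="([^"]*)"\s*/?>', re.IGNORECASE)


def expand_custom_tags(html_text, handlers):
    """Replace each custom tag with the string returned by its handler."""

    def replace(match):
        handler = handlers.get(match.group(1))
        if handler is None:
            # Unknown names are left in the page untouched.
            return match.group(0)
        return handler()

    return TAG_PATTERN.sub(replace, html_text)
```

With a handler table such as `{"counter": lambda: "42"}`, a page containing `<x-insert name="counter">` would go out to the browser with `42` in its place. A regular expression keeps the sketch short; a production filter would more likely use a proper HTML parser.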

Wouldn't it be easier just to put a CGI call at the place where I want my special tag and have the CGI do what I want, instead of parsing all the HTML? How much would I pay in communication time? (The HTML goes to the browser, and then the CGI/servlet is called again from the BROWSER. Will that be expensive?)

Thanks, it would sure help :)