Non-HTML Info from Get?

I would like my Web parser to retrieve some less common kinds of data from the web. I have seen other scripts do all of the things below, but I have yet to find code for it. This is information that is NOT in the HTML code that gets returned.

The information I want is:
1. Server type (IIS, Apache, etc.) that the GET went to for the page.
2. Last modified - sometimes this appears in the text, but I know they're
    getting it elsewhere, because what appears in the text is not always
    what appears in the script page.
3. Content size - i.e. how big the file is in kilobytes. I would like to do
    this without saving it to a file.
4. Anything else that isn't in the HTML that I can grab!

There must be some kind of attributes returned during the GET that
hold all this information. How do I access them?

How do they do it???
Where can I find out about this?
jgore Asked:

monas Commented:
jgore,

On the web, documents are transmitted using the HTTP protocol. When you request a document, you get a response consisting of:
1) response header;
2) blank line;
3) document text.

All the information you need is in the header. However, the HTTP standards do NOT make all of the information you asked about mandatory, so some responses may be missing it.
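To make the header / blank line / body layout concrete, here is a minimal Perl sketch that splits a raw response into those three parts using only core Perl. The helper name parse_response and the sample response text are made up for illustration; a real response may lack any of these headers.

```perl
use strict;
use warnings;

# Split a raw HTTP response into status line, headers, and body.
# parse_response is a hypothetical helper name, not part of any library.
sub parse_response {
    my ($raw) = @_;

    # The header block ends at the first blank line.
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    $body = '' unless defined $body;

    my ($status_line, @lines) = split /\r?\n/, $head;

    my %headers;
    for my $line (@lines) {
        # Each header is "Name: value"; names are case-insensitive,
        # so they are stored in lower case here.
        my ($name, $value) = $line =~ /^([^:]+):\s*(.*)$/ or next;
        $headers{lc $name} = $value;
    }
    return ($status_line, \%headers, $body);
}

# Made-up sample response, for illustration only.
my $raw = join "\r\n",
    'HTTP/1.0 200 OK',
    'Server: Apache/1.3.6 (Unix)',
    'Last-Modified: Mon, 05 Jul 1999 10:00:00 GMT',
    'Content-Length: 13',
    'Content-Type: text/html',
    '',
    '<html></html>';

my ($status, $headers, $body) = parse_response($raw);
print "Status:         $status\n";
print "Server:         $headers->{server}\n";
print "Last-Modified:  $headers->{'last-modified'}\n";
print "Content-Length: $headers->{'content-length'}\n";
```

Items 1 to 3 of the question map to the Server, Last-Modified and Content-Length headers, when the server chooses to send them.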

To see this in practice, telnet to port 80 of a web server of your choice (telnet server 80), and enter:

HEAD http://www.server.of.your.choice/index.html HTTP/1.0

and hit <enter> twice.
You will be given the header information (if you also want the document text, use GET instead of HEAD).
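The same telnet experiment can be scripted. The following is only a sketch, using the core IO::Socket::INET module; the helper names build_head_request and fetch_head are hypothetical. It sends the path plus a Host header rather than the full URL shown above, which is an assumption about what most servers accept.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Build the text of a HEAD request. build_head_request is a hypothetical
# helper name; the Host header is an addition that many servers expect.
sub build_head_request {
    my ($host, $path) = @_;
    return "HEAD $path HTTP/1.0\r\n"
         . "Host: $host\r\n"
         . "\r\n";    # the blank line ends the request
}

# Connect to port 80, send the request, and return the raw reply text.
sub fetch_head {
    my ($host, $path) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => 80,
        Proto    => 'tcp',
        Timeout  => 10,
    ) or die "Cannot connect to $host: $@\n";

    print {$sock} build_head_request($host, $path);
    local $/;               # slurp mode: read until the server closes
    my $reply = <$sock>;
    close $sock;
    return $reply;
}

# Only touch the network when a host is given on the command line, e.g.
#   perl head.pl www.server.of.your.choice /index.html
if (@ARGV) {
    my ($host, $path) = @ARGV;
    print fetch_head($host, defined $path ? $path : '/');
}
```

Run without arguments, the script does nothing; with a host name it prints whatever header lines that server returns.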

Documentation of what can appear in the header is in the standard. For the most widely used version of the protocol (HTTP/1.0), see http://www.cis.ohio-state.edu/htbin/rfc/rfc1945.html . You may also want to check out the RFC for HTTP/1.1.
guadalupe Commented:
This will give you what you want:

use LWP::Simple;
my ($content_type, $document_length, $modified_time, $expires, $server) = head("http://www.sn.no/");
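head() from LWP::Simple hands back only those five values. If other headers are wanted too, the object interface in LWP::UserAgent exposes the whole response. A rough sketch follows, assuming LWP is installed; describe_response is a hypothetical helper name, and the URL is taken from the command line.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Pull the fields asked about out of an HTTP::Response object.
# describe_response is a hypothetical helper name. scalar() keeps a
# missing header from collapsing the key/value list.
sub describe_response {
    my ($res) = @_;
    return {
        status        => scalar $res->status_line,
        server        => scalar $res->header('Server'),
        last_modified => scalar $res->header('Last-Modified'),
        length        => scalar $res->header('Content-Length'),
        type          => scalar $res->header('Content-Type'),
    };
}

# Only touch the network when a URL is given on the command line, e.g.
#   perl headers.pl http://www.sn.no/
if (@ARGV) {
    my $ua   = LWP::UserAgent->new(timeout => 10);
    my $res  = $ua->head($ARGV[0]);
    my $info = describe_response($res);

    for my $key (sort keys %$info) {
        my $value = defined $info->{$key} ? $info->{$key} : '(not sent)';
        printf "%-14s %s\n", $key, $value;
    }

    # Every header the server sent, not just the ones picked out above.
    print $res->headers->as_string;
}
```

Because this is a HEAD request, nothing is saved to a file and no document body is downloaded; the size comes from Content-Length when the server supplies it.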
 
jgore (Author) Commented:
To guadalupe:
You will not be able to examine the response code or response headers (like 'Content-Type') when you are accessing the web using this function. If you need that information you should use the full OO interface (see LWP::UserAgent).

At least you pointed me in the right direction.
I didn't know where to look.

Hmmmm...me thinks this just got a lot harder!

To monas:
Thanks! I downloaded them. I'll read those.


Thanks to you both!  Cya'z
Perl
