Connecting directly to port 80...

Hello,
Not sure if this is the proper place for this question but...
I need to create a script that connects to port 80 of a specific server.
The connecting part is not the problem; the trouble starts when I try to "talk" to the server to retrieve a site. The script would behave as a browser, retrieving the page and any images associated with it.
So the question really is:
How do I format a proper request to a web server?
I tried:
get index.html http/1.1
get / http/1.1
and neither worked. I tried reading the RFC for HTTP/1.1 and it was just too technical for this semi-geek.
All I need to do is retrieve the main file and parse out anything that is in a font size bigger than 3 or in bold...

Thanks for your help
Asked:
sinner052397
mouatts Commented:
Your first attempt is almost correct if you aren't going via a proxy. If you are, you must use the absolute URL and not the relative one.

I think that http should be HTTP.

Make sure that the line is terminated with a CR and LF.
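
The three points above (relative versus absolute URL, uppercase HTTP, CR LF at the end) can be sketched in Python. This is only an illustration: `request_line` and `www.example.com` are made-up names, and upper-casing the method is a convenience of this sketch, since servers expect `GET` rather than `get`.

```python
def request_line(method: str, host: str, path: str = "/", via_proxy: bool = False) -> bytes:
    """Build an HTTP/1.1 request line terminated by CR LF (hypothetical helper)."""
    # Method names are case-sensitive on the wire, so "get" becomes "GET".
    method = method.upper()
    # The path must start with a slash: "index.html" becomes "/index.html".
    if not path.startswith("/"):
        path = "/" + path
    # Via a proxy the target is the absolute URL; otherwise it is just the path.
    target = f"http://{host}{path}" if via_proxy else path
    return f"{method} {target} HTTP/1.1\r\n".encode("ascii")


if __name__ == "__main__":
    print(request_line("get", "www.example.com", "index.html"))
    print(request_line("GET", "www.example.com", "/", via_proxy=True))
```

A request line on its own is still not a complete request; the server also waits for any headers and a terminating blank line.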


Steve
sinner052397 Author Commented:
What would a fully correct answer look like?
From a telnet session, without a proxy, I tried:

get index.html HTTP/1.1   and I got HTTP/1.1 400 Bad Request
get / HTTP/1.1   and I got HTTP/1.1 400 Bad Request
get get http://www.sf.cl   HTTP/1.1 and I got  HTTP/1.1 400 Bad Request

My other question is: if you do not know the name of the first page, what is the proper call? If I connect to cnn.com, how do I know whether to ask for index.html, index.htm, default.htm, default.asp, etc.?
Could it be because I am trying this through a telnet session as opposed to through the script, and thus not emulating a browser?

Thanks again,
Marcelo


sinner052397 Author Commented:
I also tried:
get /index.html HTTP/1.1
GET index.html HTTP/1.1
GET /index.html HTTP/1.1
and none worked...

faster Commented:
GET / HTTP/1.0\r\n
\r\n
faster Commented:
In short, you need 1 empty line.  The complete request format is:

Request line\r\n
header1: value1\r\n
header2: value2\r\n
...
\r\n

\r\n can also be replaced with \n, but \r\n is better because that is what the HTTP standard specifies.
Headers are optional in HTTP 1.0, but in HTTP 1.1 you must send a "Host" header. The host is the name of the server, for example:

Host: www.microsoft.com\r\n
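
Putting that format together, a minimal Python sketch might look like the following. The names `build_request` and `fetch` and the host `www.example.com` are hypothetical, and the `Connection: close` header is an addition not mentioned above; it asks the server to close the connection after the response so the read loop knows when to stop.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Assemble request line, headers, and the terminating empty line."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line
        f"Host: {host}",          # required in HTTP/1.1
        "Connection: close",      # assumption: have the server close when done
        "",                       # end of the last header line
        "",                       # the one empty line that ends the request
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str = "/", port: int = 80, timeout: float = 10.0) -> bytes:
    """Send the request over a plain TCP socket and return the raw response."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Show exactly what would be sent, with \r\n made visible.
    print(build_request("www.example.com"))
```

Calling `fetch("www.example.com")` would return the status line, headers, and body as raw bytes. Requesting `/` leaves the choice of default document (index.html, default.htm, and so on) to the server, so the script does not need to guess the file name.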