open a URL and print its contents

OK, this applies to both OSes: Windows and Linux.

I'm trying to open a URL and display its contents on the console, much like wget would do, only simpler.

Basically I need some code to perform the HTTP request and print the content. That's all.
I tried several approaches, but they mostly need MFC or some other technologies.
Simple C would do.

If it's not possible, then is there any way to do it without MFC or the like?
No external libraries (like libcurl); it's just a small and simple tool.

Thanks so much
urif Asked:
 
van_dy Commented:
This is fairly easy to do along the lines suggested by others.
Still, here is something; see if it works for you.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define BUF_SIZE 1024

int main(int argc, char **argv)
{
      struct sockaddr_in to;
      int sockfd;
      char data[BUF_SIZE];
      int n;

      char *text = "GET http://www.google.co.in/ HTTP/1.0\r\n\r\n";
      char *ip = "216.239.39.99";     // dotted decimal IP of the site you want to retrieve from,
                                      // in our example google.co.in

      memset(&to, 0, sizeof(to));
      to.sin_family = AF_INET;
      to.sin_port = htons(80);
      inet_aton(ip, &to.sin_addr);

      sockfd = socket(AF_INET, SOCK_STREAM, 0);
      if(sockfd < 0){
            perror("socket");
            exit(1);
      }

      if(!connect(sockfd, (struct sockaddr *)&to, sizeof(to)))
            printf("connected\n");
      else{
            printf("connection failed\n");
            exit(1);
      }

      write(sockfd, text, strlen(text));
      n = read(sockfd, data, BUF_SIZE - 1);     // modify this to handle the general case: read
                                                // repeatedly from the socket until you get EOF
      if(n < 0)
            n = 0;                              // read error: print nothing
      data[n] = 0;                              // terminate right after the last byte read

      printf("%s\n", data);
      close(sockfd);
      return 0;
}

Pretty primitive, but I guess it will work. The IP and the page you specify can be made to come from the command line;
you should do that. A better approach would be to parse the host name out of the URL and use gethostbyname
to get the IP. Well, I think you get the general idea to proceed from here.

hope this helps.
van_dy
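
Editor's sketch of the suggestion above (parse the host out of the URL, then look up its IP). This is only one way to do it and makes a few assumptions: the helper names split_url and resolve_ipv4 are made up for illustration, only plain "http://host/path" URLs without a port are handled, and getaddrinfo is used in place of the gethostbyname mentioned in the post because it is the current POSIX interface for the same job.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: split "http://host/path" into host and path.
   Returns 0 on success, -1 if the URL is not plain http or does not fit. */
int split_url(const char *url, char *host, size_t host_len,
              char *path, size_t path_len)
{
    const char *prefix = "http://";
    const char *start, *slash;
    size_t n;

    if (strncmp(url, prefix, strlen(prefix)) != 0)
        return -1;
    start = url + strlen(prefix);
    slash = strchr(start, '/');
    n = slash ? (size_t)(slash - start) : strlen(start);
    if (n == 0 || n >= host_len)
        return -1;
    memcpy(host, start, n);
    host[n] = '\0';
    /* A URL with no path part means the root page "/" */
    if (snprintf(path, path_len, "%s", slash ? slash : "/") >= (int)path_len)
        return -1;
    return 0;
}

/* Hypothetical helper: fill *out with the first IPv4 address of host.
   Returns 0 on success, -1 if the name cannot be resolved. */
int resolve_ipv4(const char *host, unsigned short port, struct sockaddr_in *out)
{
    struct addrinfo hints, *res;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;        /* IPv4 only, to match sockaddr_in */
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;
    memcpy(out, res->ai_addr, sizeof(*out));
    out->sin_port = htons(port);
    freeaddrinfo(res);
    return 0;
}
```

The filled-in sockaddr_in can be passed to connect() directly, so the hard-coded IP string is no longer needed.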
 
grg99 Commented:
Basically you need to open a socket to the server, port 80.
Send a string of the format "GET URL", optionally followed by options.
You'll get back a response code, hopefully 200 "OK", followed by the web page.

Some pickier web servers require some options like "HTTP/1.1", but many don't.

You can experiment with telnet:

C:\>telnet cnn.com 80
GET /
200 OK
<!DOCTYPE HTML...>  ...

To do it from a program, look up the networking functions like socket(), connect(), etc. These come with good examples.
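
Editor's sketch: when the request includes an HTTP version, the reply starts with a status line such as "HTTP/1.0 200 OK", where 200 is the success code. A program can pull the number out with sscanf. The function name http_status_code is an assumption made for this example, not something from the thread.

```c
#include <stdio.h>

/* Hypothetical helper: return the numeric status code from a response
   that starts with a status line like "HTTP/1.0 200 OK", or -1 if the
   text does not start with such a line. */
int http_status_code(const char *response)
{
    int major, minor, code;

    if (sscanf(response, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    return code;
}
```

Checking for 200 before printing the body makes errors such as 403 or 404 easy to tell apart from a real page.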




 
ravs120499 Commented:
Here are the functions you need to call in sequence. I think that should get you started. You can use man pages to figure out the details.

/* Open a socket and get a socket descriptor. This is analogous to using the open() function to open a file */
socket()

/* Set socket options - not mandatory to call this */
setsockopt()

/* Connect to the Webserver host/port. You will need to dig around a bit to figure out how to fill in the sockaddr struct. */
connect()

/* Send the HTTP GET command to the server. If the server supports HTTP 1.0, all you need to say is GET /path/to/file HTTP/1.0 followed by an empty line. With HTTP 1.1 you need to send some additional headers (at least a Host header) */
write()

/* Read the data sent back by the server */
read()

/* Do whatever you want with the data */
printf()

HTH
ravs
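
Editor's sketch of the first steps in the sequence above (socket, then connect), including how the sockaddr struct gets filled in. It assumes an IPv4 address given as a dotted-decimal string; the function name connect_ipv4 is made up for illustration, and the optional setsockopt() step is left out.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP connection to ip:port.
   Returns a connected socket descriptor, or -1 on any failure. */
int connect_ipv4(const char *ip, unsigned short port)
{
    struct sockaddr_in to;
    int fd;

    memset(&to, 0, sizeof(to));
    to.sin_family = AF_INET;
    to.sin_port = htons(port);          /* port in network byte order */
    if (inet_pton(AF_INET, ip, &to.sin_addr) != 1)
        return -1;                      /* not a valid dotted-decimal address */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&to, sizeof(to)) != 0) {
        close(fd);                      /* do not leak the descriptor */
        return -1;
    }
    return fd;
}
```

Calling perror() right after a failing socket() or connect() prints the system's reason for the failure, which is usually the quickest way to diagnose it.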
 
ravs120499 Commented:
Forgot to add - for write() and read(), you use the socket descriptor (the one you got back from the socket call) like a file descriptor.
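
Editor's sketch of that point: because the descriptor behaves like a file descriptor, plain write() and read() loops are enough. Both calls may transfer fewer bytes than requested, so each is wrapped in a loop. The names send_all and copy_until_eof are assumptions for this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all len bytes are sent.
   Returns 0 on success, -1 on error. */
int send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: copy everything from fd to out until read()
   returns 0 (EOF). Returns the number of bytes copied, or -1 on error. */
long copy_until_eof(int fd, FILE *out)
{
    char buf[1024];
    long total = 0;
    ssize_t n;

    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        fwrite(buf, 1, (size_t)n, out);
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

The same two functions work on a pipe or a regular file, which makes them easy to try out without a network connection. Note that the read loop only ends when the server closes the connection.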
 
ravs120499 Commented:
Also forgot - you need to close() the socket descriptor when done.
 
urif (Author) Commented:
Thanks everyone.

ravs, can you be more specific?
I know about sockets, but for some reason it doesn't work under Windows, and on Linux I get a lot of "cannot open socket" kind of errors.

Do you have a working example?

Thanks again
 
ravs120499 Commented:
What errors do you get?

Rgds
 
urif (Author) Commented:
ravs, never mind. Thanks for trying to help anyway.
As I said:

> i need some code to perform the HTTP request and print the content

I don't have the time to start trying to build the code (the project is a huge one and the asm part is taking most of my time). If you have a working piece of code, great; if not, thanks so much anyway.
 
ravs120499 Commented:
ASM - that's assembly language, right?

Urif, it will take about 15 mins for an assembly programmer like you to get the C program running. I can give you working code, but not do your assignment (ok, ok, insignificant part of big project) for you. I would at least like to see you try - somehow, the "I'm too busy" doesn't motivate me too much.

Oh, but never mind, I see that van_dy has done it anyway.

Regards
 
van_dy Commented:
Yeah, what part of your project (which incorporates
something like a quick and dirty web client) would need to be coded
in assembly anyway? We would like to know and help
you with that too :D
 
urif (Author) Commented:
Thanks everyone, let me try the code.

ravs, while I would generally agree with you, I am pressed for time; that's why I wound up asking for a piece of code, however insignificant it is. As you said, an asm programmer would come up with the code fast, but since I am generally a low-level systems programmer (mostly kernel-related work that has to do with crypto), I haven't done much with sockets, and my head is totally occupied with the asm.

Now, what project has asm and a web thing in it? Well, I can't tell you. But the socket part has to do with getting a specific piece of data from a server that sits halfway across the world and only understands HTTP requests. Since this is "trivial", I didn't even want to take my attention away from the assembler function, which is really a pain as it is.

Next time I will try "harder", I promise, but not now. I am sorry if I offended anyone; it wasn't my intention.

Thanks.
As soon as I try the code I'll post my reply, and maybe, if you are good, some assembler for you to "try" to decode :)
 
urif (Author) Commented:
Your code worked, but with some changes. I had to add some lines, but it essentially worked.
The only problem is portability to the Win32 platform.

Thanks so much for the help
 
van_dy Commented:
On Windows you will need to call something like WSAStartup() first.
I haven't programmed on that platform; you may consider taking
a look at the initial pages of Beej's guide to set it up right for Windows:

http://www.ecst.csuchico.edu/~beej/guide/net/
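
Editor's sketch of what the Windows-specific part can look like, kept behind #ifdef _WIN32 so the same file still builds on Linux. WSAStartup, WSACleanup and closesocket are the real Winsock calls; the wrapper names net_init, net_cleanup, net_close and the socket_t typedef are assumptions made for this example. It has not been tried against the code in this thread.

```c
#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
typedef SOCKET socket_t;
#else
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
typedef int socket_t;
#endif

/* Hypothetical wrapper: call once before any other socket function.
   Returns 0 on success, -1 on failure. */
int net_init(void)
{
#ifdef _WIN32
    WSADATA wsa;
    return WSAStartup(MAKEWORD(2, 2), &wsa) == 0 ? 0 : -1;
#else
    return 0;   /* nothing to initialize on POSIX systems */
#endif
}

/* Hypothetical wrapper: call once when the program is done with sockets. */
void net_cleanup(void)
{
#ifdef _WIN32
    WSACleanup();
#endif
}

/* Hypothetical wrapper: Winsock sockets are closed with closesocket(),
   not close(). */
int net_close(socket_t s)
{
#ifdef _WIN32
    return closesocket(s);
#else
    return close(s);
#endif
}
```

On Windows the socket is not a file descriptor, so send() and recv() are used where the Linux code uses write() and read(); both pairs exist on Linux too. With MinGW the program has to be linked against the Winsock library (-lws2_32).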
 
urif (Author) Commented:
As far as I know you might need either an SDK from Microsoft or something similar.
Do you know if MinGW (gcc for Windows) supports sockets?
 
van_dy Commented:
I really can't guide you much there because I haven't
programmed on Windows at all. However, please see this link
and find out if the stuff is useful.

http://www.ecst.csuchico.edu/~beej/guide/net/html/intro.html#windows
 
urif (Author) Commented:
Hmmm, something strange happened.

OK, I tried the code you posted using Google as the reference, and everything is OK. I get:

connected

then the whole HTML code of the page.
Now, I changed the page to the server and page I am trying to access, and nothing happens. I just get the "connected", but when it goes into the while() loop for the read != EOF it just stays there.
The page is accessible through a web browser and wget as well, so there is no problem there.
Any ideas?

Thanks
 
urif (Author) Commented:
By the way, on Windows, Cygwin gives you the answer: it supports sockets and gcc, so basically I can compile the same code with minor adjustments.
 
urif (Author) Commented:
OK, I removed the HTTP/1.0 from the code and this is what I got:


<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>403 Forbidden</TITLE>
</HEAD><BODY>
<H1>Forbidden</H1>
You don't have permission to access http://xxx.yyy
on this server.<P>
<HR>
<ADDRESS>Apache/1.3.9 Server at yyy.yyy Port 80</ADDRESS>
</BODY></HTML>

but from the browser and from the same computer it works...
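
Editor's note, offered as a guess rather than a diagnosis: the thread does not say what finally fixed this. One plausible cause is the request format. Browsers and wget send only the path in the GET line and name the server in a separate Host header, whereas the sample code sends the full URL in the GET line, which some servers treat as a proxy request and refuse. A minimal sketch of building the browser-style request follows; the function name build_get_request is made up for illustration, and the host and path values are whatever your URL contains.

```c
#include <stdio.h>

/* Hypothetical helper: write an HTTP/1.0 GET request for path on host
   into buf. Returns the request length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t len, const char *host, const char *path)
{
    int n = snprintf(buf, len,
                     "GET %s HTTP/1.0\r\n"      /* path only, not the full URL */
                     "Host: %s\r\n"             /* tells the server which site is meant */
                     "Connection: close\r\n"    /* ask the server to close when done */
                     "\r\n",                    /* empty line ends the request */
                     path, host);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The resulting string replaces the hard-coded request text and is sent with write() exactly as before. If the server still answers 403, it may be filtering on something else a browser sends, such as a User-Agent header.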