Saving an Image from a web server in C++

Posted on 2007-10-15
Last Modified: 2012-05-05
My goal is to download and save an image from a web server from within a C++ program, without using external tools such as wget. After I submit my GET request to the web server, it comes back with:

generic-message = Request-Line | Status-Line
                          *(message-header CRLF)
                          CRLF
                          [ message-body ]

see (http://www.rfc.net/rfc2616.html#p31)

Actual Output: tj@e1505:~> ./client www.google.com
HTTP/1.1 200 OK
Content-Type: image/gif
Last-Modified: Wed, 07 Jun 2006 19:38:24 GMT
Expires: Sun, 17 Jan 2038 19:14:07 GMT
Server: gws
Content-Length: 8558
Date: Mon, 15 Oct 2007 21:07:41 GMT

GIF89anjýÏ CIS(S...rest of file
**End Actual Output***

I'm assuming that [message-body] is the actual contents of the image that I want to save. Right now my code writes both the headers and the payload to the file instead of just the payload. How do I output just the payload, so that the image is saved without the headers? I've been thinking the trick might be to find the double CRLF that marks the end of the header section, but what happens if it gets separated across two different buffers? Would it never be found? Any other suggestions to improve the code are appreciated.
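One way to handle a double CRLF that straddles two reads is to keep the unparsed bytes in a string and search the accumulated data rather than each buffer on its own. The following is only a sketch of that idea; the class name HeaderStripper and its members are hypothetical and do not come from the code in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): collects response bytes
// until the blank line that ends the HTTP headers has been seen, then hands
// back only body bytes.
class HeaderStripper {
public:
    // Feed one recv() result. Returns the body bytes contained in this chunk
    // (empty while the headers are still incomplete).
    std::string feed(const char* data, std::size_t len) {
        assert(data != 0 || len == 0);
        if (headers_done_)
            return std::string(data, len);   // past the headers: all body

        pending_.append(data, len);          // length-aware, keeps '\0' bytes
        const std::size_t pos = pending_.find("\r\n\r\n");
        if (pos == std::string::npos)
            return std::string();            // terminator not complete yet

        headers_done_ = true;
        headers_ = pending_.substr(0, pos);  // header text, blank line dropped
        std::string body = pending_.substr(pos + 4);
        pending_.clear();
        return body;
    }

    bool headers_done() const { return headers_done_; }
    const std::string& headers() const { return headers_; }

private:
    std::string pending_;        // bytes seen so far, until the headers end
    std::string headers_;        // status line plus header lines
    bool headers_done_ = false;
};
```

In a receive loop, each recv() result would be passed to feed(), and whatever comes back would be written with file.write(body.data(), body.size()). Because the search runs over everything received so far, it does not matter where the buffer boundary falls inside the terminator.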

Right now I'm just trying to get this to work with the gif on google's homepage, hence the static GET request. Code:
#include <netdb.h>
#include <netinet/in.h>
#include <unistd.h>
#include <iostream>
#include <fstream>
using namespace std;

main(int argc, char *argv[])
{
      int socketDescriptor,bytes_received,total=0;
      struct sockaddr_in serverAddress;
      struct hostent *hostInfo;
      char buffer[1024];
      string request;
      fstream file;

      if (argc != 2) {
            cerr << "Usage: " << argv[0] << " hostname" << endl;
            exit(1);
      }
      if ((hostInfo=gethostbyname(argv[1])) == NULL) {
            cerr << "Unable to resolve host." << endl;;
            exit(1);
      }

      socketDescriptor = socket(AF_INET, SOCK_STREAM, 0);
      if (socketDescriptor < 0) {
            cerr << "Cannot create socket.\n";
            exit(1);
      }

      serverAddress.sin_family = hostInfo->h_addrtype;
      memcpy((char *) &serverAddress.sin_addr.s_addr, hostInfo->h_addr_list[0], hostInfo->h_length);
        serverAddress.sin_port = htons(80);

      if (connect(socketDescriptor, (struct sockaddr *) &serverAddress, sizeof(serverAddress)) < 0) {
            cerr << "Cannot connect.\n";
            exit(1);
      }

      request = "GET /intl/en_ALL/images/logo.gif HTTP/1.1\r\n";
      request += "Host: ";
      request += argv[1];
      request += "\r\n";
      request += "Connection: close\r\n";
      request += "\r\n";

      size_t request_size = request.size() + 1;
      char crequest[request_size];
      strncpy( crequest, request.c_str(), request_size );

      if (send(socketDescriptor, crequest, request_size, 0) < 0) {
            cerr << "Cannot send data.";
            close(socketDescriptor);
            exit(1);
      }

      file.open("test.gif",fstream::out);
      if (! file.is_open()) {
            cerr << "Unable to open file for writing." << endl;
            exit(1);
      }

      do {
            memset(crequest, 0x0, 1025);//zero the buffer
            bytes_received = recv(socketDescriptor, buffer, sizeof(buffer), 0);
total += bytes_received;
            cout << buffer;
            file << buffer;
      } while (bytes_received != 0);
cout << "\r\n" << "Total Bytes Received: " << total << endl;

      file.close();
      close(socketDescriptor);
}
Question by:n664dc
6 Comments
 
LVL 12

Expert Comment

by:OnegaZhang
ID: 20082547
You can allocate a 2 KB buffer (this expects the payload to be smaller than that), fill it, and then find the position of the CRLF.
 
LVL 3

Author Comment

by:n664dc
ID: 20082598
This now produces a partial image.  I guess it's corrupt... I dunno where to go from here yet.
#include <netdb.h>
#include <netinet/in.h>
#include <unistd.h>
#include <iostream>
#include <fstream>
using namespace std;

fstream file;

bool check_for_end_headers(char buffer[1024])
{
      for(int i=0;i<1021;i++)
            if(buffer[i] == '\r' && buffer[i+1] == '\n' && buffer[i+2] == '\r' && buffer[i+3] == '\n') {
                  cout << "FOUND" << endl;
                  for(int j=i+4;j<1024;j++) {
                        //cout << buffer[j];
                        file << buffer[j];
                  }
                  //cout << "\r\n\r\n\r\n";
                  return true;
            }
      return false;
}

main(int argc, char *argv[])
{
      int socketDescriptor,bytes_received,total=0;
      struct sockaddr_in serverAddress;
      struct hostent *hostInfo;
      char buffer[1024];
      string request;

      if (argc != 2) {
            cerr << "Usage: " << argv[0] << " hostname" << endl;
            exit(1);
      }
      if ((hostInfo=gethostbyname(argv[1])) == NULL) {
            cerr << "Unable to resolve host." << endl;;
            exit(1);
      }

      socketDescriptor = socket(AF_INET, SOCK_STREAM, 0);
      if (socketDescriptor < 0) {
            cerr << "Cannot create socket.\n";
            exit(1);
      }

      serverAddress.sin_family = hostInfo->h_addrtype;
      memcpy((char *) &serverAddress.sin_addr.s_addr, hostInfo->h_addr_list[0], hostInfo->h_length);
        serverAddress.sin_port = htons(80);

      if (connect(socketDescriptor, (struct sockaddr *) &serverAddress, sizeof(serverAddress)) < 0) {
            cerr << "Cannot connect.\n";
            exit(1);
      }

      request = "GET /intl/en_ALL/images/logo.gif HTTP/1.1\r\n";
      request += "Host: ";
      request += argv[1];
      request += "\r\n";
      request += "Connection: close\r\n";
      request += "\r\n";

      size_t request_size = request.size() + 1;
      char crequest[request_size];
      strncpy( crequest, request.c_str(), request_size );

      if (send(socketDescriptor, crequest, request_size, 0) < 0) {
            cerr << "Cannot send data.";
            close(socketDescriptor);
            exit(1);
      }

      file.open("test.gif",fstream::out);
      if (! file.is_open()) {
            cerr << "Unable to open file for writing." << endl;
            exit(1);
      }

      do {
            memset(crequest, 0x0, 1025);//zero the buffer
            bytes_received = recv(socketDescriptor, buffer, sizeof(buffer), 0);
total += bytes_received;
            cout << buffer;
            if(!check_for_end_headers(buffer))
                  file << buffer;
      } while (bytes_received != 0);
cout << "\r\n" << "Total Bytes Received: " << total << endl;

      file.close();
      close(socketDescriptor);
}
 
LVL 6

Accepted Solution

by:
SeanDurkin earned 300 total points
ID: 20083276
I slightly changed around your code so that it would get the whole image into the file:

bool check_for_end_headers(char buffer[], char newBuf[], int &len)
{
      for(int i=0;i<1021 && buffer[i] != '\0';i++)
            if(buffer[i] == '\r' && buffer[i+1] == '\n' && buffer[i+2] == '\r' && buffer[i+3] == '\n')
            {
                  int j(4);
                  for(; (i+j) < 1024 && buffer[i+j] != '\0'; j++)
                  {
                        newBuf[j-4] = buffer[i+j];
                  }
                  len = j-4;
                  return true;
            }
      return false;
}

[in main()]
char query[500];
sprintf(query, "GET /intl/en_ALL/images/logo.gif HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n", host);
send(sd, query, strlen(query), 0);

fstream file;
file.open("test.gif", fstream::out);

char text[50000];
char temp[1025], newTemp[1024]; // temp has one spare byte for the '\0' stored after a full 1024-byte read
bool alreadyFound = false;
int pos = 0, bytes, len;
while((bytes = recv(sd, temp, 1024, 0)) > 0)
{
      temp[bytes] = '\0';
      
      if(alreadyFound)
      {
            for(int i = 0; i < bytes; ++i)
            {
                  //text[pos++] = temp[i];
                  file << temp[i];
            }
      }
      else if(check_for_end_headers(temp, newTemp, len))
      {
            for(int i = 0; i < len; ++i)
            {
                  //text[pos++] = newTemp[i];
                  file << newTemp[i];
            }
            
            alreadyFound = true;
      }
}
text[pos] = '\0';
//cout << text;
//file.write(text, strlen(text));
cin.ignore(1024, '\n');
file.close();

The problem is that the GIF file contains a lot of null characters ('\0'). When I stored the data in the large character array called text, only the first 9 characters were printed or written, because the first '\0' sat around the 10th position in the array. That is why I commented out all the statements that use text and wrote the characters one at a time instead. The code now extracts only the message body, but I haven't figured out why the GIF data contains so many null characters, or whether they are supposed to be there or need to be replaced with something else. Hopefully you or someone else can make sense of it; I've spent more than an hour on it already, so this is where I stop. :)

Best of luck,

Sean
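The truncation described above is what happens whenever binary data goes through C-string functions: strlen() and operator<< on a char* both stop at the first '\0'. Zero bytes are normal in a GIF; for example, the header stores the image width and height as 16-bit little-endian values, so any dimension below 256 contributes a zero byte. The sketch below contrasts the two ways of writing. The helper names are made up for illustration, and an in-memory stream stands in for the file.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Writes a buffer the way `stream << buf` does: as a C string, so output
// stops at the first '\0'.
std::string write_as_c_string(const char* buf) {
    assert(buf != 0);
    std::ostringstream out;
    out << buf;
    return out.str();
}

// Writes exactly `len` bytes, whatever their values.
std::string write_as_bytes(const char* buf, std::size_t len) {
    assert(buf != 0);
    std::ostringstream out;
    out.write(buf, static_cast<std::streamsize>(len));
    return out.str();
}
```

With a buffer whose tenth byte is zero, write_as_c_string() keeps only the first nine bytes, while write_as_bytes() keeps all of them. For the download code this suggests using file.write(buffer, bytes_received) with the count returned by recv().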
 
LVL 3

Author Comment

by:n664dc
ID: 20083368
After reading into the RFC a bunch more, I'm taking a slightly different approach.  I'll start working on it again some time tomorrow night.

Sean - so did your code correctly save the image despite being unable to display the contents on the screen?

Thanks so far all.
 
LVL 39

Assisted Solution

by:itsmeandnobodyelse
itsmeandnobodyelse earned 300 total points
ID: 20083624
>>>> The problem is that the GIF file contains a lot of null C characters ('\0'),
You need to open and write the .gif file in binary mode.

        ofstream file("test.gif",ios::out | ios::binary);
        if (file)
        {
              file.write(buf, bytesread);
        }

You would need to read the whole .gif with recv, e.g. by

   // accumulate everything recv() delivers into one growing heap buffer
   int bytesread = 0;
   char buffer[1024];
   char* buf = NULL;
   while (true)
   {
           int rc = recv(socketDescriptor, buffer, sizeof(buffer), 0);
           if (rc <= 0)   // 0: peer closed the connection, -1: error (SOCKET_ERROR on Winsock)
                break;
           char* pbuf = new char[bytesread + rc];
           if (buf  != NULL)
           {
                 memcpy(pbuf, buf, bytesread);
                 delete []buf;
           }
           buf = pbuf;
           memcpy(&buf[bytesread], buffer, rc);
           bytesread += rc;
           // a short read does not mean the stream has ended on TCP,
           // so keep reading until recv() returns 0
      }

Regards, Alex
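The same read-until-closed loop can be written with std::vector<char> doing the buffer growth. This is a sketch under the assumption of a POSIX system and a request sent with "Connection: close", so that the server ends the response by closing the connection; the function name recv_until_close is hypothetical.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <cassert>
#include <cerrno>
#include <vector>

// Hypothetical helper: reads from a connected stream socket until the peer
// closes it. Returns false if recv() reports an error.
bool recv_until_close(int fd, std::vector<char>& out) {
    assert(fd >= 0);
    char chunk[1024];
    for (;;) {
        const ssize_t n = recv(fd, chunk, sizeof(chunk), 0);
        if (n > 0) {
            // append exactly n bytes; embedded '\0' bytes are kept
            out.insert(out.end(), chunk, chunk + n);
        } else if (n == 0) {
            return true;             // orderly shutdown: everything received
        } else if (errno != EINTR) {
            return false;            // real error
        }
        // EINTR: interrupted by a signal, so retry the call
    }
}
```

The collected bytes still include the status line and headers, so the body would have to be separated out before calling file.write().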
 
LVL 3

Author Comment

by:n664dc
ID: 20089883
Functioning Code:
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <iostream>
#include <fstream>
#include <string>
#include <math.h>
#include <stdlib.h>
#include <string.h>
using namespace std;

fstream file;
int bytes_remaining=-1;

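int cstrhex_to_int(char * string)
{
      int value=0;
      for(int i=0;i<(int)strlen(string);i++) {
            char c = string[i];
            int digit;
            if(c >= '0' && c <= '9') digit = c-'0';
            else if(c >= 'a' && c <= 'f') digit = c-'a'+10;//hex digits a-f
            else if(c >= 'A' && c <= 'F') digit = c-'A'+10;//hex digits A-F
            else break;//stop at the first non-hex character
            value = value*16 + digit;
      }
      return(value);
}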

int cstrdec_to_int(char * string)
{
      int value=0;
      for(int i=0;i<strlen(string);i++)
            value += (int)pow((double)10,(double)strlen(string)-1-i)*(string[i]-'0');
      return(value);
}

void get_chunksize(char * buffer)
{
      for(int i=0;i<1020;i++) {
//cout << "Checking: " << buffer[i];
            if(buffer[i] == '\r' && buffer[i+1] == '\n' && buffer[i+2] == '\r' && buffer[i+3] == '\n') {//find double CRLF
                  char hexstring[9];//up to 8 hex digits plus the terminator
                  char * p;
                  int size=0;
                  memset(hexstring,'\0',sizeof(hexstring));//max filesize of 4,294,967,295 (a lot)
                  p = &buffer[i+4];//FIXME: end of line could be beyond the end of the buffer?
                  for(int j=i+4;j<1024 && buffer[j]!='\r';j++,size++) {}//find the end of chunk-size
                  memcpy(hexstring,p,size);//move over hexvalues to 'hexstring'
                  bytes_remaining = cstrhex_to_int(hexstring);//convert it to decimal form so it's useful in main
                  cout << "\r\nLooking for " << bytes_remaining << " bytes." << endl;
                  //send the payload portion of this buffer to file
                  file.write(&buffer[i+6+size],1024-(i+6+size));
                  bytes_remaining -= 1024-(i+size+6);
                  return;
            }
            else if(i<997 && buffer[i] == 'C' && buffer[i+1] == 'o' && buffer[i+2] == 'n' && buffer[i+3] == 't' && buffer[i+4] == 'e' && buffer[i+5] == 'n' && buffer[i+6] == 't' && buffer[i+7] == '-' && buffer[i+8] == 'L') {//FIXME: can go beyond the end of the buffer? - Does the Content-Length Header exist?
                  //all warnings from previous if statment still apply here
                  char bytestring[10];
                  char * p;
                  int size=0;
                  memset(&bytestring,'\0',10);
                  p = &buffer[i+16];
                  for(int j=i+16;j<1024 && buffer[j]!='\r';j++,size++) {}//find the length of the Content-Length Header
                  memcpy(bytestring,p,size);
                  bytes_remaining = cstrdec_to_int(bytestring);
                  cout << "\r\nLooking for " << bytes_remaining << " bytes." << endl;
                  for(int j=i+16+size;j<1024;j++) {//find double CRLF
                        if(buffer[j] == '\r' && buffer[j+1] == '\n' && buffer[j+2] == '\r' && buffer[j+3] == '\n') {//find double CRLF
                              for(int k=j+4;k<1024;k++)
                                    file << buffer[k];
                              bytes_remaining -= 1024-(j+4);
                              return;
                        }
                  }
                  return;
            }
      }
return;
}

int main(int argc, char *argv[])
{
      int socketDescriptor,bytes_received;
      struct sockaddr_in serverAddress;
      struct hostent *hostInfo;
      char buffer[1024];
      string request;

      if (argc != 2) {
            cerr << "Usage: " << argv[0] << " hostname" << endl;
            exit(1);
      }
      if ((hostInfo=gethostbyname(argv[1])) == NULL) {
            cerr << "Unable to resolve host." << endl;;
            exit(1);
      }

      socketDescriptor = socket(AF_INET, SOCK_STREAM, 0);
      if (socketDescriptor < 0) {
            cerr << "Cannot create socket.\n";
            exit(1);
      }

      serverAddress.sin_family = hostInfo->h_addrtype;
      memcpy((char *) &serverAddress.sin_addr.s_addr, hostInfo->h_addr_list[0], hostInfo->h_length);
        serverAddress.sin_port = htons(80);

      if (connect(socketDescriptor, (struct sockaddr *) &serverAddress, sizeof(serverAddress)) < 0) {
            cerr << "Cannot connect.\n";
            exit(1);
      }

      request = "GET /intl/en_ALL/images/logo.gif HTTP/1.1\r\n";
      request += "Host: ";
      request += argv[1];
      request += "\r\n";
//request += "Accept-Encoding: chunked\r\n";
      request += "Connection: close\r\n";
      request += "\r\n";

      //send exactly the request text; the string's terminating '\0' is not part of the HTTP request
      if (send(socketDescriptor, request.c_str(), request.size(), 0) < 0) {
            cerr << "Cannot send data.";
            close(socketDescriptor);
            exit(1);
      }

      file.open("test.gif",fstream::out|ios::binary);
      if (! file.is_open()) {
            cerr << "Unable to open file for writing." << endl;
            exit(1);
      }

      do {
cout << "\r\nBytes Remaining: " << bytes_remaining << endl;
            memset(buffer, 0x0, 1024);//zero the buffer
            bytes_received = recv(socketDescriptor, buffer, sizeof(buffer), 0);
cout << "Bytes received: " << bytes_received << endl;
            cout << buffer;
            if(bytes_remaining == -1)
                  get_chunksize(buffer);
            else {
                  if(bytes_received > bytes_remaining)
                        for(int i=0;i<bytes_remaining;i++)
                              file << buffer[i];
                  else {
                        file.write(buffer, bytes_received);
                        bytes_remaining -= bytes_received;
                  }
            }
      } while (bytes_received > 0);//stop when the server closes (0) or recv fails (-1)

      file.close();
      close(socketDescriptor);
}

Thanks all!
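Hand-rolled conversions built on pow() are easy to get wrong, for instance when a chunk size contains the hex digits a-f. The standard strtoul() function handles both the decimal Content-Length value and a hexadecimal chunk-size. The sketch below is one possible shape for that; the function names are hypothetical, and the header lookup only matches the exact spelling "Content-Length:" even though HTTP header names are case-insensitive.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Shared helper: parses an unsigned number in the given base from the start
// of `text`. Returns -1 if no digits could be parsed.
long parse_number(const char* text, int base) {
    assert(text != 0);
    char* end = 0;
    const unsigned long value = std::strtoul(text, &end, base);
    return end == text ? -1 : static_cast<long>(value);
}

// Parses a hexadecimal chunk-size line such as "216E\r\n".
long parse_chunk_size(const std::string& line) {
    return parse_number(line.c_str(), 16);
}

// Returns the Content-Length value from a block of headers, or -1 when the
// header is absent or has no numeric value.
long find_content_length(const std::string& headers) {
    const std::string name = "Content-Length:";
    const std::string::size_type pos = headers.find(name);
    if (pos == std::string::npos)
        return -1;
    // strtoul skips the whitespace between the colon and the digits
    return parse_number(headers.c_str() + pos + name.size(), 10);
}
```

For the response shown in the question, find_content_length() would yield 8558, the number of body bytes to expect after the blank line.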