C++ CGI Script replace space with %20

I'm looking for a way to replace each space with %20 in a link (a string) that my script outputs. This is one of my first attempts at using C++ for CGI, so it's probably a relatively simple problem. I have looked at URL-encoding functions, but the space is the only character I will ever need to encode, so I thought that simply replacing spaces with %20 would be faster than implementing a complete URL-encoding function. Speed is an issue here because the script will be getting well over two million views a day. Also, the space cannot be replaced by a +, because the site I am linking to can't seem to parse a query string with a + in it.
Asked by BuickFreak
 
CmdrRickHunter commented:
string or char array?

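string solution:
#include <string>
using std::string;

string in = "hello world";
string out;
// size_type matches the unsigned type returned by in.size()
for (string::size_type inPos = 0; inPos < in.size(); inPos++) {
  if (in[inPos] != ' ')
    out += in[inPos];
  else
    out += "%20";
}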

char solution:
#include <cstring>   /* for strcpy */

char* in;    /* must point to a NUL-terminated input string */
char* out;   /* must point to a buffer of at least outLength + 1 chars (see below) */
#define INVALID_CHAR(c) ((c) == ' ')   /* define this to be whatever chars are invalid */
#define REPLACEMENT_STRING(c) "%20"
/* The replacement string is, right now, a constant, because you wanted it to be just for spaces.
    The code, however, is capable of working with anything, rather quickly... just make
    the replacement string do the "right thing".
    We will assume, for now, that all replacement strings are 3 characters long.
*/

char* inP = in;
int outLength = 0;
// determine how long the output string will be
while (*inP) {   // test the character, not the pointer, so the loop stops at the NUL
  if (INVALID_CHAR(*inP))
    outLength += 3;
  else
    outLength++;
  inP++;
}

// do whatever allocation code you wish, just make sure out is at least outLength + 1 characters long

char* outP = out;
inP = in;
while (*inP) {
  if (INVALID_CHAR(*inP)) {
     strcpy(outP, REPLACEMENT_STRING(*inP));
     outP += 3;
     inP++;
  } else {
     *outP++ = *inP++; // copy the char
  }
}
*outP = '\0'; // terminate the output string
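
Putting the two passes above together with one possible allocation strategy gives something like the following. This is only a sketch: the function name encode_spaces and the choice of new[] (so the caller must delete[] the result) are assumptions made for illustration, not part of the fragment above.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns a newly allocated copy of `in` with every
// space replaced by "%20". The caller owns the result and must delete[] it.
char* encode_spaces(const char* in) {
    // Pass 1: work out how long the output will be.
    std::size_t outLength = 0;
    for (const char* p = in; *p != '\0'; ++p) {
        outLength += (*p == ' ') ? 3 : 1;
    }

    // One extra char for the terminating NUL.
    char* out = new char[outLength + 1];

    // Pass 2: copy, expanding each space as we go.
    char* outP = out;
    for (const char* p = in; *p != '\0'; ++p) {
        if (*p == ' ') {
            std::memcpy(outP, "%20", 3);
            outP += 3;
        } else {
            *outP++ = *p;
        }
    }
    *outP = '\0';
    return out;
}
```

Counting first means the buffer is allocated exactly once, at the cost of reading the input twice.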


Ignoring everything above, this kind of speed is not an issue for something getting hit 2 million times per day. That's only about 23 hits per second, so you've got roughly 43 ms to deliver each result. I have coded complete URL encoders that can do this in a few microseconds per character (on average), running on a 500 MHz machine. Remember, even if you hit memory at 60 ns on every single instruction, you still have about 17 instructions per microsecond to work with. And it's actually difficult to make code miss caches that much.
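
To make the arithmetic concrete, here is a small sketch of the traffic numbers. It assumes the load is spread evenly over the day and handled one request at a time, which real traffic will not be; the function names are purely illustrative.

```cpp
// Average request rate for a given daily volume, assuming an even spread.
constexpr double hits_per_second(double hits_per_day) {
    return hits_per_day / (24.0 * 60.0 * 60.0);   // 86,400 seconds in a day
}

// Average time available per request, in milliseconds, if requests
// were served strictly one after another.
constexpr double budget_ms(double hits_per_day) {
    return 1000.0 / hits_per_second(hits_per_day);
}

// Number of instructions that fit in one microsecond if each costs `ns_each`.
constexpr double instructions_per_us(double ns_each) {
    return 1000.0 / ns_each;
}
```

For 2,000,000 hits a day this works out to about 23 hits per second and roughly 43 ms per request on average; at 60 ns per instruction, about 16.7 instructions fit in a microsecond. Peak-hour traffic will be well above the daily average, so treat these as rough bounds only.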

If you're really worried about speed, use a profiler. Find out where your slowdowns occur. It probably won't be the URL encoding part.
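
Short of a full profiler, a quick timing loop can give a first impression of how cheap the replacement is. The sketch below uses std::chrono; replace_spaces and average_ns_per_call are hypothetical names, iterations is assumed to be greater than zero, and the measured figure will vary with compiler, optimisation level, and machine, so no particular result is claimed.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical helper: copy `in`, writing "%20" for each space.
std::string replace_spaces(const std::string& in) {
    std::string out;
    out.reserve(in.size());          // the output is at least this long
    for (char c : in) {
        if (c == ' ')
            out += "%20";
        else
            out += c;
    }
    return out;
}

// Average wall-clock nanoseconds per call over `iterations` runs.
// Assumes iterations > 0.
double average_ns_per_call(const std::string& input, int iterations) {
    using clock = std::chrono::steady_clock;
    volatile std::size_t sink = 0;   // keeps the results in use
    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = sink + replace_spaces(input).size();
    }
    const auto stop = clock::now();
    const double total_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return total_ns / iterations;
}
```

A loop like this only measures the encoding step in isolation. It says nothing about the database lookup or process start-up, which is where a profiler run on the whole script would be more informative.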
 
jkr commented:
If you are using VC++, you could use 'InternetCanonicalizeUrl()' to do that for you. If not, you could use

#include <string>

using namespace std;

string strURL = "http://www.somesite.com/url with spaces/index.html";

string::size_type pos = 0;
while ( string::npos != ( pos = strURL.find ( ' ', pos))) { // as long as we find spaces

    strURL.replace ( pos, 1, "%20"); // replace them with %20
    pos += 3; // continue searching after the inserted "%20"
}
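
The same find/replace loop can be wrapped in a function so that it is easy to call from the CGI code. This is a sketch only; the name encode_spaces_in_place is made up for illustration. It compares against std::string::npos as the "not found" value and skips past each inserted "%20" before searching again.

```cpp
#include <string>

// Hypothetical helper: replaces every space in `url` with "%20", in place.
void encode_spaces_in_place(std::string& url) {
    std::string::size_type pos = 0;
    // find() returns std::string::npos when no further space exists.
    while ((pos = url.find(' ', pos)) != std::string::npos) {
        url.replace(pos, 1, "%20");
        pos += 3;   // resume after the text just inserted
    }
}
```

Each replace() shifts the remainder of the string, so for very long inputs with many spaces a single-pass copy into a new string may be cheaper. For link-sized strings the difference is unlikely to matter.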
 
BuickFreak (author) commented:
Thanks. That worked like a charm. I guess I am just nervous about server load. The script is looking through a monster DB to get results and I have tried several solutions from PHP to mod_perl. With my traffic increasing as much as it has over the last month, even my mod_perl solution was having trouble during peak hours.
Question has a verified solution.
