Solved

C Programming: substring extraction

Posted on 2013-10-26
Last Modified: 2013-10-28
I am trying to create an HTTP GET request, which requires the following format:
GET {Request-URI} HTTP/1.1
Host: {host_name}

If I have a string such as "http://www.csun.edu/~steve", how do I extract the {host_name} part (http://www.csun.edu), and how do I extract the {Request-URI} part (/~steve)?

I need a strategy that will do this dynamically. In other words, the next string might be "http://www.cnn.com/money", or it might have no path at all, like "http://www.google.com".
Question by:pzozulka

Accepted Solution

by:
Bill Bach earned 500 total points
ID: 39603269
You can do this via brute force, or through the use of a parser. If you know that all input is a valid URL, I'd go with something like brute force. The logic would look like:
1) Detect and strip off the "http://", as this is not part of the host name.
2) Set the HostName and Request strings to "".
3) Copy characters to the HostName string until you find the next "/" or the end of the string.
4) If you hit the end of the string, exit and return the strings as needed.
5) Copy from the current position to the end of the string into the Request string.
6) Exit and return the strings as needed.

Can you create the working code from this description?
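
For reference, the steps above could be sketched roughly like this. It is only one way to do it, not the code from this thread: the function name split_url and the buffer-size parameters are made up for illustration, and it assumes the input is a valid http:// URL. One deliberate difference from step 4: when there is no path, it returns "/" instead of an empty string, because an HTTP/1.1 request line needs a non-empty request target.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): splits url into a host
 * name and a request path. Returns 0 on success, or -1 if either part
 * does not fit in the buffer provided by the caller. */
int split_url(const char *url, char *host, size_t host_sz,
              char *request, size_t req_sz)
{
    const char *scheme = "http://";
    size_t scheme_len = strlen(scheme);
    size_t host_len;
    const char *path;

    /* Step 1: skip "http://" only when it is a prefix of the string. */
    if (strncmp(url, scheme, scheme_len) == 0)
        url += scheme_len;

    /* Step 3: the host runs up to the first '/' or the end of the string. */
    host_len = strcspn(url, "/");
    if (host_len >= host_sz)
        return -1;
    memcpy(host, url, host_len);
    host[host_len] = '\0';              /* always terminate the copy */

    /* Steps 4-5: whatever is left is the request path.
     * Assumption: fall back to "/" when the URL has no path. */
    path = url + host_len;
    if (*path == '\0')
        path = "/";
    if (strlen(path) >= req_sz)
        return -1;
    strcpy(request, path);              /* fits: length checked above */
    return 0;
}
```

A caller could then build the request with something like snprintf(buf, sizeof buf, "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n", request, host), where buf is whatever send buffer you use.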

Author Comment

by:pzozulka
ID: 39603298
Thank you very much. I think this is exactly what I was looking for. I will try to implement it in code and will get back soon with my solution.

Author Comment

by:pzozulka
ID: 39604278
I think I got it:
// parse host to strip http:// and any dirs
// *************************************
char src[50];
char dest[100];
char *httpProto = "http://";

if((strstr(host, httpProto)) != NULL) { //does host contain http://
	memset(dest, '\0', sizeof(dest));
	strcpy(dest, host+strlen(httpProto));
	fprintf(configfp,"Stripped HTTP:\n%s\n",dest);
	// http:// stripped -- use dest
}
// *************************************

// copy chars off to the hostName until you find "/" or '\0'
// *************************************
char request[50];
char hostName[100];
int a, b = 0, forwardSlash = 0;

if((strstr(host, httpProto)) != NULL) { // host contains http://
	for(a=0; dest[a] != '\0'; a++) { //copy domain name into hostName
		if(dest[a] != '/') {
			hostName[a] = dest[a];
		}
		else { // forward slash found
			forwardSlash = 1;
			break;
		}
	}
	while( (forwardSlash == 1) && (dest[a] != '\0')) { 
		request[b] = dest[a];
		a++;
		b++;
	}
	fprintf(configfp,"Domain:\n%s\n",hostName);
	fprintf(configfp,"Request: \n%s\n",request);
}
else { // host didn't contain http:// to begin with
...
// extract directory after domain name
}


Author Comment

by:pzozulka
ID: 39604280
I know I don't need to extract hostName twice (once in the IF and once in the ELSE), depending on whether the original host name had an http:// in it or not. I just realized that.

Author Comment

by:pzozulka
ID: 39604323
I'm doing something wrong. Here's what I get:
fprintf(configfp,"Host:\n%s\n",host);
fprintf(configfp,"Domain:\n%s\n",hostName);

Host:
www.facebook.com
Domain:
www.facebook.comðÙÐÁ

Expert Comment

by:Bill Bach
ID: 39604535
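You never null-terminate the hostName string, so the print keeps going into whatever bytes happen to follow it. Add line 30.5 (right after the for loop closes, before the while loop):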
    hostName[a]=0;
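
To show where the terminator lands, here is a rough sketch of the two copy loops from the earlier post, wrapped in a function so it can stand alone. The function name copy_host_and_request is made up, and it assumes dest already has "http://" stripped and that both output buffers are large enough. Note that request has the same problem as hostName in the posted code, so this version terminates it as well.

```c
#include <stddef.h>

/* Hypothetical wrapper around the loops from the post above.
 * Assumptions: dest has "http://" already stripped, and hostName and
 * request are big enough for the text plus a terminator. */
void copy_host_and_request(const char *dest, char *hostName, char *request)
{
    size_t a = 0, b = 0;

    /* Copy the domain name up to the first '/' or the end of the string. */
    while (dest[a] != '\0' && dest[a] != '/') {
        hostName[a] = dest[a];
        a++;
    }
    hostName[a] = '\0';   /* the terminator that was missing */

    /* Copy whatever is left (possibly nothing) as the request. */
    while (dest[a] != '\0') {
        request[b++] = dest[a++];
    }
    request[b] = '\0';    /* request needs one too */
}
```

With the terminators in place, the %s in fprintf stops at the end of the copied text no matter what was in the uninitialized arrays beforehand.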

Author Comment

by:pzozulka
ID: 39604585
I also have another urgent question, if you could please take a look at:

http://www.experts-exchange.com/Programming/Languages/C/Q_28278601.html

Expert Comment

by:Bill Bach
ID: 39605406
Looks like two good answers to that one already. I agree that allocating memory in a function is a bad idea, and I also agree that the issue is in the malloc to begin with. "buffer" is defined as "char **", but the result of the malloc call (which is meant to be a "char *") gets assigned to buffer directly. The first few bytes of the string data then get interpreted as another pointer, and it breaks.
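
The code from that other question is not reproduced here, so the following is only a guess at the general shape of the problem, with made-up names (make_copy, text). It sketches the usual way to hand an allocation back through a "char **" parameter: assign to *buffer, not to buffer. It still allocates inside a function, so the caller has to remember to free the result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical function; not the code from the linked question.
 * Returns 0 on success, -1 if the allocation fails. */
int make_copy(char **buffer, const char *text)
{
    size_t n = strlen(text) + 1;   /* room for the terminator */

    /* Writing "buffer = malloc(n);" here would store the new block in
     * the local char ** and leave the caller's pointer untouched. */
    *buffer = malloc(n);           /* write through the pointer-to-pointer */
    if (*buffer == NULL)
        return -1;

    memcpy(*buffer, text, n);      /* copies the text and its '\0' */
    return 0;
}
```

The caller passes the address of its own pointer, for example make_copy(&p, "hello"), and calls free(p) when done.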