How do I use memcpy to allocate multiple characters?

This code is supposed to copy a token (two characters) from the 1-D character array linetext to *token. But when I cout *token, only the second character is printed. Does this mean that I copied from linetext incorrectly, or that I'm printing *token incorrectly?

(*token) = (char *) malloc(tokenLength + 1);
memcpy (*token, &linetext[i], tokenLength);
cout << "token: " << *token << endl;

97WideGlide

Check the value of i. C arrays are zero-based (the first character is at position 0). I am guessing that you are indexing into linetext assuming the first character is at linetext[1].

Hope it helps.
Please let me know.

97WideGlide

Also, note the memory is not initialized by malloc() so you might want to make sure *token is null terminated after the memcpy().
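
A minimal sketch of what null-terminating means here, assuming a hypothetical helper `copyToken` (not part of the posted code). `malloc()` returns uninitialized memory and `memcpy()` copies exactly the requested number of bytes, so the byte after the token has to be set to `'\0'` by hand before `cout` can treat the buffer as a C string:

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: copies `len` characters starting at line[start]
// into a freshly malloc'ed, null-terminated buffer.
// The caller is responsible for calling free() on the result.
char *copyToken(const char *line, int start, int len)
{
    char *token = static_cast<char *>(std::malloc(len + 1)); // +1 for '\0'
    if (token == nullptr)
        return nullptr;                    // allocation failed
    std::memcpy(token, &line[start], len); // copies the characters only
    token[len] = '\0';                     // terminate so it is a valid C string
    return token;
}
```

For example, `copyToken("a <= b", 2, 2)` yields the string `<=`. Without the `token[len] = '\0'` line, printing the buffer would read past the copied characters until it happened to hit a zero byte.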

vwps

ASKER

>> Check the value of i

Thanks 97WideGlide! You were right, the values of i in memcpy were incorrect. While I think I fixed it for 1-character tokens, it still doesn't work for 2-character tokens. Is it because of the way the lines are looped through? (Please see attached code)

>> you might want to make sure *token is null terminated after the memcpy()

What does this mean?

/* system_utilities.cpp */
 
#include <iostream>
#include <fstream>
#include <string.h> 
#include "system_utilities.h"
#include "definitions.h"
 
using namespace std;
 
ifstream file; // create input stream object
 
char linetext[256]; // array of characters to hold 255 characters + terminating null char
int length; // variable to hold length of current line of input (num chars)
int pos = 0; // variable to hold position of last character of current input line read by getNextToken
char *word;
 
int errorcode;
 
void printError(int errorcode)
{
   switch ( errorcode )
   {
      case END_OF_FILE:
         cout << "Error: End of file has been reached." << endl;
         break;
      case FILE_NOT_OPENED:
         cout << "Error: File could not be opened." << endl;
         break;
      case TOKEN_NOT_FOUND:
         cout << "Error: Token not found." << endl;
         break;
      default:
         cout << "There is an error." << endl;
   }
}
 
int openInputFile (char fname[])
{
   file.open(fname, ios::in); // what is ios::in for?
   if (file.good())
   {
      cout << fname << " was read successfully" << endl;
      return 0; 
   }
   else
   {
      cout << "file not read successfully" << endl;
      return FILE_NOT_OPENED; 
   }
}
 
// NOTE: WILL NOT READ NEXT LINE UNLESS LAST CHARACTER OF CURRENT LINE HAS A SPACE AFTER IT
 
int getNextToken(char **token) 
{
   int tokenFound = 0; // whether or not a token has been found
   int tokenLength = 0; // length of token
  
   while(tokenFound == 0) // when no token has been found
   {
      if(pos == 0) // if you're at the beginning of a line
      {
         int j;
         for (j = 0; j <256; j++) // set all character array values to NULL
         {
            linetext[j] = '\0';
         }
         file.getline(linetext, 256); // read in a line of text from the file
         length = strlen(linetext); // get the length of the line
      }
      
	  // recognizing single characters 
	  
	  int i;
	  for(i = pos; i < length && (tokenFound == 0); i++) // loop through whole line
	  { 
	     if( ((linetext[i] != ' ') && (linetext[i] != '\0')) && ( (linetext[i+1] == ' ') || (i == (length - 1))) ) // if the character is not a space or a null, and the next character is a space or this is the last character in the line
		 {
			cout << "       pos: " << pos << " linetext[" << i << "] = " << linetext[i] << endl;
			cout << "       i = " << i << "      length = " << length << endl;
			tokenLength = 1; // QUESTION: how do i fix this? i am confused about pos and i's difference
			(*token) = (char *) malloc(tokenLength + 1); // first row of token has enough space for length of token + null character
			memcpy(*token, &linetext[i - tokenLength + 1], tokenLength); // copy token into *token
			cout << "ONE CHARACTER TOKEN: " << *token << endl;
			cout << "       pos: " << pos << endl;
			pos = pos + tokenLength; 
		 }
		 
		 else if( (symbol(linetext[i]) && (symbol(linetext[i+1]))) ) // else if it's a symbol and the next one is a symbol 
		 {
			cout << linetext[i] << linetext[i+1] << " is a symbol with more than 1 character" << endl; // how is it executing both the if and the else statements?
			cout << "pos: " << pos << endl;
			tokenLength = 2; // QUESTION: how do i fix this? i am confused about pos and i's difference
			(*token) = (char *) malloc(tokenLength + 1);
			memcpy(*token, &linetext[i - tokenLength + 1], tokenLength);
			cout << "tokenLength: " << tokenLength << endl;
			cout << "TWO CHARACTER SYMBOL TOKEN: " << *token << endl; // QUESTION: why is this only printing the last character of the token?
			pos = pos + tokenLength; 
			cout << "pos: " << pos << endl;
			tokenFound = 1;
			i++;
		 } 
		 
		 else if( (linetext[i] != ' ') && (linetext[i] != '\0') && (linetext[i + 1] != ' ') && (linetext[i + 1] != '\0') )
		 {
			while( (linetext[i + 1] != ' ') && (linetext[i + 1] != '\0') && (i != length) )
			{
				cout << "                     next character is part of the token" << endl;
			// else if this one and the next one is a letter (two or more letters --> a word)
			/*cout << linetext[i] << linetext[i + 1] << " is a 2-letter word" << endl;
			tokenLength = 2; // QUESTION: how do i fix this? i am confused about pos and i's difference
			cout << "      tokenLength = " << tokenLength << endl;
			(*token) = (char *) malloc(tokenLength + 1);
			memcpy(*token, &linetext[i], tokenLength);
			*/
				i++;
			//tokenFound = 1;
			}
		 }
		 
	  }
	  
	  if(i == length) // if you've reached the end of the line 
	  {
		pos = 0; // reset pos to 0
		if(file.eof()) // if you're also at the end of the file
		{
			return END_OF_FILE;
		}
	  }
	}
	return 0;
}
 
int symbol(char c)
{
   if ((c == '>') || (c == '<') || (c == '/') || (c == '='))
      {
      return 1;
      }
   else
      {
      return 0;
      }
}
 
/* system_utilities.h */
 
int openInputFile(char fname[]); // input file name string, open file and assign to file-level ifstream variable
int getNextToken( char **token ); // finds next token in input file, allocates new space, assigns token to point to new space, copies characters
void printError(int errorcode); // prints appropriate error message
int symbol(char c);
 
/* main.cpp */
 
#include <iostream>
#include "system_utilities.h"
#include "definitions.h"
 
using namespace std;
 
char *token1[10000];
 
int main() 
{
   openInputFile("/Users/vshen/Documents/EECS 211/Program4/onlychars.txt");
   while(getNextToken(token1) != END_OF_FILE);
   cout << "The file has been tokenized." << endl;
   return 0;
}
 
/* definitions.h */
 
#define END_OF_FILE 201 
#define FILE_NOT_OPENED 202 
#define TOKEN_NOT_FOUND 301 
#define MAX_LINE_LENGTH 255

Infinity08

I would suggest using strncpy instead of memcpy:

http://www.cplusplus.com/reference/clibrary/cstring/strncpy.html

strncpy(*token, &linetext[i], tokenLength);
(*token)[tokenLength] = '\0';
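
The explicit terminator on the second line matters because `strncpy` writes at most `tokenLength` bytes and appends no `'\0'` when the source is at least that long. A small sketch illustrating this, using a hypothetical function `firstChars` and a buffer pre-filled with `'#'` as a stand-in for uninitialized memory:

```cpp
#include <cstring>

// Hypothetical demonstration: copy the first n characters of src into buf
// (buf must hold at least n + 1 bytes). Returns the byte found at buf[n]
// right after strncpy, before the terminator is written.
char firstChars(char *buf, const char *src, std::size_t n)
{
    std::memset(buf, '#', n + 1); // stand-in for malloc's uninitialized bytes
    std::strncpy(buf, src, n);    // writes exactly n bytes, never buf[n]
    char leftover = buf[n];       // still '#': strncpy did not terminate here
    buf[n] = '\0';                // terminate explicitly
    return leftover;
}
```

Calling `firstChars(buf, "<=rest", 2)` returns `'#'` and leaves `buf` holding `<=`, which shows that the terminator comes from the explicit assignment and not from `strncpy`.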

SOLUTION

evilrix

97WideGlide

I'm not sure what you are trying to accomplish.

For example, depending on your input stream, I don't think you are guaranteed that your token will be "1 char only" as you mention in your comments.

But to answer your questions directly without trying to guess:

82: tokenLength = 1; // QUESTION: how do i fix this? i am confused about pos and i's difference
tokenLength = i + 1;

Hold on. As I was in the process of making suggestions, I thought that it would be best if you explained what it is that you are trying to do before I make more suggestions. If this is a homework project and you are constrained in the way you should solve the problem, say so. We'll work with what you have. Otherwise, if you just want to get the job done there might be standard C functions which will make your code much more straightforward.

In the meantime, consider the change to line 82 above. It might be enough of a suggestion to enable you to change the rest of your code to function the way you want.

vwps

ASKER

Thank you for your comments. I'm sorry I didn't include the specifications earlier. evilrix, I agree that my code is a mess. Unfortunately, every function that I wrote (except for "symbol") is required. I am explicitly forbidden from using the built-in string class.

The purpose of this program is to split lines of text read in from a text file into tokens. The tokens will be single characters, words, symbols (one or two characters long), and quoted strings. For the full assignment description, see the attached file.

>> In the meantime, consider the change to line 82 above.

tokenLength = i + 1 does not work because there can be multiple tokens in a line. I think it's supposed to be closer to tokenLength = i - pos + 1, with i being the index of the last character in the token and pos being the last character read by the function before this token began. I'm not really sure how to implement pos.
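
One way to make the index arithmetic concrete, sketched under the assumption that `pos` holds the index where scanning should resume (one past the previous token) instead of the last character already read. The names `TokenBounds`, `isSymbol`, and `nextTokenBounds` are hypothetical, treating a symbol as a word boundary is also an assumption, and the quoted strings from the assignment are not handled:

```cpp
#include <cstring>

struct TokenBounds {
    int start;  // index of the token's first character, or -1 if none
    int length; // number of characters in the token
};

// Same symbol set as the symbol() function in the posted code.
static bool isSymbol(char c)
{
    return c == '>' || c == '<' || c == '/' || c == '=';
}

// Finds the next token in `line` at or after index `pos`.
TokenBounds nextTokenBounds(const char *line, int pos)
{
    int length = static_cast<int>(std::strlen(line));
    int start = pos;
    while (start < length && line[start] == ' ') // skip leading spaces
        start++;
    if (start >= length)
        return {-1, 0}; // nothing left on this line

    int end = start; // index of the token's last character
    if (isSymbol(line[start]))
    {
        // a symbol is one character, or two if another symbol follows
        if (start + 1 < length && isSymbol(line[start + 1]))
            end = start + 1;
    }
    else
    {
        // a word runs until a space, a symbol, or the end of the line
        while (end + 1 < length && line[end + 1] != ' ' && !isSymbol(line[end + 1]))
            end++;
    }
    return {start, end - start + 1}; // length = last index - first index + 1
}
```

After copying `length` characters from `&line[start]`, the caller would set `pos = start + length` so that the next call resumes just past this token, and reset `pos` to 0 when `start` comes back as -1 and a new line is read.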
p4directions.txt

SOLUTION

itsmeandnobodyelse

ASKER CERTIFIED SOLUTION

97WideGlide

vwps

ASKER

Thanks very much, and I'm sorry for the delay in assigning points.