• Status: Solved

Split method in evc++

Hi all,

I know that there is a Split method in VB and PHP.
I am wondering how I can do the same in EVC++.
Thanks in advance.
mwcmp
Asked:

AlexFM Commented:
See strtok function.

jkr Commented:
'strtok()' is one way:

/* STRTOK.C: In this program, a loop uses strtok
 * to print all the tokens (separated by commas
 * or blanks) in the string named "string".
 */

#include <string.h>
#include <stdio.h>

char string[] = "A string\tof ,,tokens\nand some  more tokens";
char seps[]   = " ,\t\n";
char *token;

int main( void )
{
   printf( "%s\n\nTokens:\n", string );
   /* Establish string and get the first token: */
   token = strtok( string, seps );
   while( token != NULL )
   {
      /* While there are tokens in "string" */
      printf( " %s\n", token );
      /* Get next token: */
      token = strtok( NULL, seps );
   }
   return 0;
}

Another alternative would be CTokenizer: http://www.codeproject.com/string/tokenizer.asp

CTokenizer tok(_T("A-B+C*D-E"), _T("-+"));
CString cs;

while(tok.Next(cs))
    TRACE2("Token: '%s', Tail: '%s'\n", (LPCTSTR)cs, (LPCTSTR)tok.Tail());

mwcmp (Author) Commented:
But my buffer is in TCHAR format, my delimiter is whitespace, and I want to split the buffer into CStrings. How can that be done? And how can I implement it in EVC++?

jkr Commented:
>> but my buffer is in TCHAR format... and my delimiter is a whitespace, and I want to split them into CString

Where do you see a problem here?

CTokenizer tok(_T("A-B+C*D-E"), _T(" ")); // change "-+" to " "
CString cs;

while(tok.Next(cs))
   TRACE2("Token: '%s', Tail: '%s'\n", (LPCTSTR)cs, (LPCTSTR)tok.Tail());

mwcmp (Author) Commented:
while(tok.Next(cs))
   TRACE2("Token: '%s', Tail: '%s'\n", (LPCTSTR)cs, (LPCTSTR)tok.Tail());

What does this do?

mwcmp (Author) Commented:
I also get this error:

fatal error C1083: Cannot open include file: 'bitset': No such file or directory
>>#      include <bitset>

jkr Commented:
>>What does this do?

It splits a string - as mentioned in the article linked above:  CTokenizer - http://www.codeproject.com/string/tokenizer.asp

>>and also, I get this error

Um, from what code?

mwcmp (Author) Commented:
fatal error C1083: Cannot open include file: 'bitset': No such file or directory
>>#     include <bitset>

mwcmp (Author) Commented:
What I mean is, what does this part of the code do?

while(tok.Next(cs))
   TRACE2("Token: '%s', Tail: '%s'\n", (LPCTSTR)cs, (LPCTSTR)tok.Tail());

jkr Commented:
And, where does that code/line live?

jkr Commented:
while(tok.Next(cs))
  TRACE2("Token: '%s', Tail: '%s'\n", (LPCTSTR)cs, (LPCTSTR)tok.Tail());

As elaborated in the article (hint! hint!), the above code breaks down the line and prints the parts...

mwcmp (Author) Commented:
>>As elaborated in the article (hint! hint!), the above code breaks down the line and prints the parts...
But I do not need to output the tokens; instead, I need to store some of them in other CStrings.


>>And, where does that code/line live?
>>#     include <bitset>
in the header file of Tokenizer

jkr Commented:
>>but I do not need to output the token, instead, I need to store some of them in some other CString

That's what it does before printing it... notice the first argument after the format string in 'TRACE2()': this is the string 'cs' in which the token was stored.

>>>>And, where does that code/line live?
>>>>#     include <bitset>
>>in the header file of Tokenizer

'bitset' usually is in your 'include' directory. It might not be available for evc++, though (the article suggests that it should work for WinCE as well, but...). If you do not have 'bitset', you might want to give 'strtok()' a chance. Using that with 'CString' works like this:

#include <string.h>
#include <stdlib.h> // for free()

//...

CString strSrc = "This is a test string";
CString strToken;
char* psz = strdup(strSrc);
char* pszToken = NULL;

pszToken = strtok(psz, " ");

while (pszToken) {

    strToken = pszToken; // token is now stored in 'strToken'

    pszToken = strtok(NULL, " ");
}

free(psz);

mwcmp (Author) Commented:
CString strToken;
char* psz = _strdup(storage);//where storage is a TCHAR array of size 81
char* pszToken = NULL;
pszToken = strtok(psz, " ");

while (pszToken) {
   strToken = pszToken;
   pszToken = strtok(NULL, " ");
}

and I get this error:-
>>error C2664: '_strdup' : cannot convert parameter 1 from 'unsigned short [81]' to 'const char *'
        Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or function-style cast

Does this mean that TCHAR cannot be used here?

jkr Commented:
Yes, it can. Sorry, we need to move to the respective TCHAR-aware functions:

CString strToken;
TCHAR* psz = _tcsdup(storage); // where storage is a TCHAR array of size 81
TCHAR* pszToken = NULL;
pszToken = _tcstok(psz, _T(" "));

while (pszToken) {
   strToken = pszToken; // token is now stored in 'strToken'
   pszToken = _tcstok(NULL, _T(" "));
}

free(psz); // release the copy made by _tcsdup()

mwcmp (Author) Commented:
ok.. error free now..
But how can I retrieve the data?

jkr Commented:
That depends on what you want to do with that. In the above sample, the contents of 'strToken' are overwritten on each run of the loop. You could store them in a string list though, e.g.

CStringList lstTokens;
CString strToken;
TCHAR* psz = _tcsdup(storage);//where storage is a TCHAR array of size 81
TCHAR* pszToken = NULL;
pszToken = _tcstok(psz, _T(" "));

while (pszToken) {
  strToken = pszToken;
  lstTokens.AddTail(&strToken); // add to list
  pszToken = _tcstok(NULL, _T(" "));
}
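
For readers without MFC at hand, here is a minimal portable sketch of the same loop, assuming standard C++ rather than eVC++. std::wcstok stands in for the Unicode form of _tcstok, and a std::vector<std::wstring> stands in for the string list. The helper name splitTokens is hypothetical. Note that the standard std::wcstok takes a third argument that holds the tokenizer state, unlike the two-argument call used in this thread.

```cpp
#include <cassert>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper (not part of MFC or the thread's code):
// split `input` on any character found in `delims` and return
// each token as its own std::wstring.
std::vector<std::wstring> splitTokens(const std::wstring& input,
                                      const wchar_t* delims)
{
    std::vector<std::wstring> tokens;

    // wcstok writes terminators into its input, so work on a
    // private, writable, null-terminated copy.
    std::vector<wchar_t> buffer(input.begin(), input.end());
    buffer.push_back(L'\0');

    wchar_t* state = nullptr;  // tokenizer state, owned by the caller
    for (wchar_t* tok = std::wcstok(buffer.data(), delims, &state);
         tok != nullptr;
         tok = std::wcstok(nullptr, delims, &state)) {
        tokens.emplace_back(tok);  // copy the token out of the buffer
    }
    return tokens;
}
```

Calling splitTokens(L"This is a test string", L" ") would give five separate strings, so nothing is overwritten between loop iterations.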

mwcmp (Author) Commented:
This line gave me an error:
lstTokens.AddTail(&strToken); // add to list

>>error C2664: 'struct __POSITION *__thiscall CStringList::AddTail(const unsigned short *)' : cannot convert parameter 1 >>from 'class CString *' to 'const unsigned short *'
        Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or function-style cast

jkr Commented:
That's a bit weird, but CStringLists might be different for evc++ - try

while (pszToken) {
 strToken = pszToken;
 lstTokens.AddTail(strToken); // add to list
 pszToken = _tcstok(NULL, _T(" "));
}

instead.

Jaime Olivares Commented:
I would suggest a pure MFC solution. If you are working with MFC, I don't think you have to deal with strtok, which has some problems (it is not re-entrant).

int split(TCHAR separator, CString theString, CStringArray &result, bool alowEmptyElement)
{
    CString part;
   
    result.RemoveAll();
    for (int pos=0; pos<theString.GetLength(); pos++) {
         if (theString[pos]==separator) {
               if (!part.IsEmpty() || alowEmptyElement) {
                       result.Add(part);
                       part.Empty();
               }
         } else
               part += theString[pos];
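     }
    // Add the text after the last separator, which the loop alone would drop.
    if (!part.IsEmpty() || alowEmptyElement)
        result.Add(part);
    return result.GetSize();
}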

Jaime Olivares Commented:
It would be better to use:
int split(TCHAR separator, CString &theString, CStringArray &result, bool alowEmptyElement)

notice change in 2nd argument.
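
As a rough illustration of the same character-by-character approach outside MFC, the sketch below uses std::wstring and std::vector in place of CString and CStringArray (both substitutions are assumptions made for portability, and the name splitOn is hypothetical). The input is taken by const reference to avoid a copy, and the text after the last separator is flushed once the loop ends.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical portable counterpart of the split() idea above.
// `allowEmpty` controls whether empty pieces (for example between two
// consecutive separators) are kept.
std::vector<std::wstring> splitOn(wchar_t separator,
                                  const std::wstring& text,
                                  bool allowEmpty)
{
    std::vector<std::wstring> result;
    std::wstring part;

    for (wchar_t ch : text) {
        if (ch == separator) {
            if (!part.empty() || allowEmpty) {
                result.push_back(part);
                part.clear();
            }
        } else {
            part += ch;
        }
    }

    // The final piece has no separator after it, so add it here.
    if (!part.empty() || allowEmpty) {
        result.push_back(part);
    }
    return result;
}
```

With the input L"a,,b" and a comma separator, this yields two elements when allowEmpty is false and three (the middle one empty) when it is true.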

mwcmp (Author) Commented:
I will try both of them

jkr Commented:
>> think you don't have to deal with strtok, it has some problems (it is not re-entrant)

And how would that cause *any* trouble here?

mwcmp (Author) Commented:
jkr,

Removing '&' resolved the error, but I have never dealt with CStringList before, so I am still trying.
I tried GetAt(), but I wonder why I have to pass in a POSITION type; I have no idea what that is.
Why can't I use int?

Jaime Olivares Commented:
>> think you don't have to deal with strtok, it has some problems (it is not re-entrant)

>>And how would that cause *any* trouble here?

mwcmp will use this function in a secondary thread, so if he calls split() concurrently in the primary thread, he will have trouble.

mwcmp (Author) Commented:
jamie

>int split(TCHAR separator, CString &theString, CStringArray &result, bool alowEmptyElement)

1) why is the separator in TCHAR, and not CString?
2) theString --> the String that I want to split? But my string is in TCHAR
3) what is alowEmptyElement?  Do I set it to be false by default?

jkr Commented:
>>Why can't I use int?

You can - using 'FindIndex()', e.g.

CString str = strTokens.FindIndex(0);

>>mwcmp will use this formula in a secondary thread

Hmm, cannot see that here. And, even if so, that would only apply if both threads use 'strtok()' concurrently. And even then, you won't have a problem when using the multithreaded CRT. From the docs: "However, calling this function simultaneously from multiple threads does not have undesirable effects."

Jaime Olivares Commented:
>1) why is the separator in TCHAR, and not CString?
That's my implementation. It could be a CString, but the function would be more complex and slower.

2) theString --> the String that I want to split? But my string is in TCHAR
Your string could be a TCHAR *; I suggest a variation for this case below.

3) what is alowEmptyElement?  Do I set it to be false by default?
It enables or disables generating an empty array element when two consecutive separators appear. Yes, you can set it to false by default; it is your decision.

---------------------------------------

int split(TCHAR separator, TCHAR *theString, CStringArray &result, bool alowEmptyElement)
{
    CString part;

    result.RemoveAll();
    for (int pos=0; pos<theString[pos];  pos++) {
         if (theString[pos]==separator) {
               if (!part.IsEmpty() || alowEmptyElement) {
                       result.Add(part);
                       part.Empty();
               }
         } else
               part += theString[pos];
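     }
    // Add the text after the last separator, which the loop alone would drop.
    if (!part.IsEmpty() || alowEmptyElement)
        result.Add(part);
    return result.GetSize();
}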

Jaime Olivares Commented:
Sorry about my last post; the loop must be:

   for (int pos=0; theString[pos];  pos++) {
 
mwcmp (Author) Commented:
I did this:

CStringArray temp;
bool flag=false;
TCHAR delimiter = "";
int size = split(delimiter, buffer1, temp, flag);

and had 5 errors:

1) error C2440: 'initializing' : cannot convert from 'char [1]' to 'unsigned short'
        This conversion requires a reinterpret_cast, a C-style cast or function-style cast
2) error C2664: 'split' : cannot convert parameter 2 from 'unsigned short [81]' to 'unsigned short &'
        A reference that is not to 'const' cannot be bound to a non-lvalue
3) error C2109: subscript requires array or pointer type
4) error C2109: subscript requires array or pointer type
5) error C2109: subscript requires array or pointer type

Error 1 refers to TCHAR delimiter = "";
Error 2 refers to int size = split(delimiter, buffer1, temp, flag);
Errors 3-5 refer to the for, if and else respectively in your method.

mwcmp (Author) Commented:
BTW,
I tried the strtok method, and it succeeded.

jkr, FYI: it should be lstTokens.GetAt(lstTokens.FindIndex(8)),
as FindIndex() returns a POSITION that can be passed into GetAt().
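
Some background on why an int is not accepted directly: CStringList is a linked list, so there is no constant-time lookup by index. POSITION is an opaque handle to a node, and FindIndex() walks from the head of the list to find the node for a given index. The sketch below shows the same idea with std::list as a stand-in (an assumption for illustration; elementAt is a hypothetical helper, not an MFC function).

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>

// Hypothetical helper: return the element at zero-based `index`.
// Like FindIndex() followed by GetAt(), it has to walk the list node
// by node, so the cost grows with the index.
const std::wstring& elementAt(const std::list<std::wstring>& items,
                              std::size_t index)
{
    assert(index < items.size());
    std::list<std::wstring>::const_iterator it = items.begin();
    std::advance(it, index);  // step forward `index` nodes
    return *it;
}
```

If tokens are mostly read back by index, an array-like container (CStringArray in MFC) may be the more natural fit.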

mwcmp (Author) Commented:
So should I use the MFC or strtok solution?

jkr Commented:
I'd use 'strtok()' - for the simple reason that it has been well tested over the last 30 years and *probably* is a bit faster (but that might also not matter in this case)

Jaime Olivares Commented:
I see two problems. First, the separator must be a single character, something like:

TCHAR separator = ',';

Second, you must use my last version. Here it is, debugged:

int split(TCHAR separator, TCHAR *theString, CStringArray &result, bool alowEmptyElement)
{
    CString part;

    result.RemoveAll();
    for (int pos=0; theString[pos];  pos++) {
         if (theString[pos]==separator) {
               if (!part.IsEmpty() || alowEmptyElement) {
                       result.Add(part);
                       part.Empty();
               }
         } else
               part += theString[pos];
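     }
    // Add the text after the last separator, which the loop alone would drop.
    if (!part.IsEmpty() || alowEmptyElement)
        result.Add(part);
    return result.GetSize();
}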

mwcmp (Author) Commented:
OK, I am now left with errors 1 & 2.

>TCHAR separator = ',';
Does that mean that I am not able to use the method with whitespace? In other words, can I not use it in my context here?

What is error 2 about?

Jaime Olivares Commented:
TCHAR separator = ' ';  <--- space there

or

TCHAR separator = 32;

I think the second error must be related to my first version; the second version takes the second argument as TCHAR * (or maybe you have written TCHAR &).
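
The C2440 error reported earlier in the thread comes from initializing a single character with a string literal. A small sketch of the difference, assuming a Unicode build where TCHAR is a wide character (wchar_t is used here as a stand-in, and the constant names are hypothetical):

```cpp
#include <cassert>

// A single wide character: this is what one TCHAR holds in a Unicode build.
const wchar_t kSpace = L' ';

// A wide string literal: an array holding the space character plus the
// terminating L'\0'. It cannot be assigned to a single character.
const wchar_t kSpaceString[] = L" ";
```

Single quotes give one character, which is also why writing 32 (the ASCII code for a space) works as a separator value.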

mwcmp (Author) Commented:
I changed from & to * in the method:


CStringArray temp;
bool flag=false;
TCHAR delimiter = ' ';
int size = split(delimiter, buffer1, temp, flag);

but still get this error
>>error C2664: 'split' : cannot convert parameter 2 from 'unsigned short [81]' to 'unsigned short &'
        A reference that is not to 'const' cannot be bound to a non-lvalue

Jaime Olivares Commented:
Is buffer1 of type CString? You mentioned it is TCHAR *. If it is a CString, then you have to use my first version (the one with the CString & argument).

mwcmp (Author) Commented:
Nope. buffer1 is a TCHAR array.

Jaime Olivares Commented:
Try a simple test:
TCHAR *buffer1 = TEXT("ONE,TWO,THREE,FOUR");
CStringArray temp;
bool flag=false;
TCHAR delimiter = ',';
int size = split(delimiter, buffer1, temp, flag);

mwcmp (Author) Commented:
it is still the same error. :?

Jaime Olivares Commented:
I can't believe it; I am compiling with EVC++ right now with no errors. I think you have not copied the proper version correctly.
I will transcribe it again:
int split(TCHAR separator, TCHAR *theString, CStringArray &result, bool alowEmptyElement)
{
    CString part;

    result.RemoveAll();
    for (int pos=0; theString[pos];  pos++) {
         if (theString[pos]==separator) {
               if (!part.IsEmpty() || alowEmptyElement) {
                       result.Add(part);
                       part.Empty();
               }
         } else
               part += theString[pos];
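     }
    // Add the text after the last separator, which the loop alone would drop.
    if (!part.IsEmpty() || alowEmptyElement)
        result.Add(part);
    return result.GetSize();
}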

mwcmp (Author) Commented:
I wonder why too. I had already copied the final version and I still get the same error

mwcmp (Author) Commented:
What is 'strtok()' actually? Why can't I use it?

jkr Commented:
>> What is 'strtok()' actually? Why can't I use it?

Of course you can use it. Who said (incorrectly) that you couldn't?

Jaime Olivares Commented:
Because strtok is traditionally not thread-safe, except in specific implementations, it is not recommended for use in portable code. There are dozens of forums that discuss this issue, so it is not incorrect to mention it.
But, in fact, you can use it.
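
To make the re-entrancy point concrete, here is a minimal sketch of a tokenizer that keeps all of its state in a variable owned by the caller instead of in hidden storage inside the library. The function name nextToken is hypothetical, and narrow char strings are used only to keep the example short; it is one possible approach, not the way any particular CRT implements strtok.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical re-entrant tokenizer: `cursor` is the only state, and it
// belongs to the caller, so independent scans cannot disturb each other.
// Returns false when no more tokens are left.
bool nextToken(const char*& cursor, const char* delims, std::string& token)
{
    cursor += std::strspn(cursor, delims);            // skip leading delimiters
    if (*cursor == '\0') {
        return false;                                 // nothing left to read
    }
    std::size_t len = std::strcspn(cursor, delims);   // length of this token
    token.assign(cursor, len);
    cursor += len;                                    // resume here next time
    return true;
}
```

Because each scan has its own cursor, two strings can be tokenized in an interleaved fashion, or from different threads, without sharing any state. The input string is also left unmodified.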
