michaelh77 asked:
Parsing unicode text file with wcstok?
Hello,
I have an Excel spreadsheet containing data in multiple languages (English, Korean, Japanese, Chinese, and Arabic). I'm looking for a way to get this data into the wchar_t portion of my C structure.
My first thought was to save the Excel spreadsheet as a Unicode text file and then parse the file using wcstok.
However, when I do this, my second call to wcstok returns NULL. If I replace the Unicode text file with a normal text file containing only English characters, it seems to parse properly.
Am I barking up the wrong tree? Is it possible to parse a Unicode text file with wcstok? Or is there something more involved to this process?
Thanks,
Mike
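For what it's worth, wcstok itself has no trouble with non-English characters once the text is correctly held in a wchar_t buffer. A minimal sketch, assuming the standard three-argument wcstok (VC++ 6.0 ships a two-argument form without the final state pointer); split_fields is a made-up helper name, not something from the original code:

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper: splits `line` in place on any character in `delims`,
 * stores up to `max` token pointers in `out`, and returns how many were stored.
 */
size_t split_fields(wchar_t *line, const wchar_t *delims,
                    wchar_t **out, size_t max)
{
    size_t n = 0;
    wchar_t *state = NULL;   /* wcstok keeps its position here between calls */
    wchar_t *tok = wcstok(line, delims, &state);

    while (tok != NULL && n < max) {
        out[n++] = tok;
        tok = wcstok(NULL, delims, &state);
    }
    return n;
}
```

Note that wcstok treats a run of delimiters as a single separator, so an empty spreadsheet cell is skipped rather than returned as an empty token.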
ASKER
Yes, I am developing in VC++ 6.0.
I am using fgetws to read in each line. I also noticed that when I use fgetws to count the number of records in my file, it does not give the proper number: there are over 100 entries in my file, but the count comes out as 25.
Is it possible I could be using fgetws wrong?
Thanks.
int main( int argc, char ** argv )
{
    wchar_t line[256];
    int i,entries;
    wchar_t *token;
    FILE *fp;

    fp = fopen("unitest.txt","r");
    if (fp == NULL)
    {
        printf("could not open i18n file.\n");
        exit(0);
    }

    i = 0;
    while (fgetws (line, 256, fp) != NULL)
    {
        i++;
    }
    rewind(fp);
    entries = i;
    // malloc space for my names_from_file structure...

    i = 0;
    while (fgetws (line, 256, fp) != NULL)
    {
        token = wcstok (line, L",");
        //wcscpy(names_from_file_english[i].name,token);
        token = wcstok (NULL, L",");
        //wcscpy(names_from_file_chinese[i].name,token);
        token = wcstok (NULL, L",");
        //wcscpy(names_from_file_korean[i].name,token);
        i++;
    }
    fclose(fp);
    //free names_from_file structure...
    exit(0);
}
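A likely cause of both symptoms, assuming the file was saved with Excel's "Unicode Text" option (which, as far as I know, writes tab-delimited UTF-16LE with a byte-order mark): a stream opened with "r" is in text mode, where the VC++ runtime feeds fgetws from multibyte input and treats a 0x1A byte as end of file. The zero high bytes of English characters then terminate the string early, and a stray 0x1A byte inside a CJK character stops the read, which would be consistent with the short count. One way around this is to open the file in binary mode and assemble the 16-bit units by hand. A sketch under those assumptions; read_unit and read_utf16le_line are hypothetical helpers, and surrogate pairs are not combined:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Reads one UTF-16LE code unit; returns -1 at end of file or on a dangling byte. */
static long read_unit(FILE *fp)
{
    int lo = fgetc(fp);
    int hi = (lo == EOF) ? EOF : fgetc(fp);

    if (lo == EOF || hi == EOF)
        return -1;
    return (long)lo | ((long)hi << 8);
}

/*
 * Hypothetical helper: reads one line of UTF-16LE text from a stream that was
 * opened in BINARY mode ("rb"). The byte-order mark and carriage returns are
 * dropped, and the line feed ends the line without being stored.
 * Returns 1 if a line (possibly empty) was read, 0 at end of file.
 * Characters that do not fit in `buf` are discarded.
 */
int read_utf16le_line(FILE *fp, wchar_t *buf, size_t cap)
{
    size_t n = 0;
    int got_any = 0;
    long u;

    assert(cap > 0);
    while ((u = read_unit(fp)) != -1) {
        got_any = 1;
        if (u == 0xFEFF || u == 0x000D)   /* skip BOM and CR */
            continue;
        if (u == 0x000A)                  /* LF ends the line */
            break;
        if (n + 1 < cap)
            buf[n++] = (wchar_t)u;
    }
    buf[n] = L'\0';
    return got_any;
}
```

With something like this, the file would be opened as fopen("unitest.txt", "rb"), each line read with read_utf16le_line(fp, line, 256) in place of fgetws, and the fields split on L"\t" rather than L"," if the file really is tab-delimited.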
Nothing has happened on this question in more than 13 months. It's time for cleanup!
My recommendation, which I will post in the Cleanup topic area, is to
accept answer by Kryp [grade B] (only a hint towards a solution).
PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER!
jmcg
EE Cleanup Volunteer
Is it returning the whole string as the first token for example (when it shouldn't be)?
I assume that you're using VC++ as your compiler, since you mention Excel.