Can you explain with an example? I would really appreciate it.
Hi,
I want code that decodes a string received in the ISO-8859-1 charset. Here is the sample:
أهد
The string above represents Arabic words.
I have code that decodes Base64; here it is:
static inline int DecodeBase64Char(unsigned int nCode)
{
    // Map a Base64 character to its 6-bit value; 64 means "not a Base64 digit"
    if (nCode >= 'A' && nCode <= 'Z')
        return nCode - 'A';
    if (nCode >= 'a' && nCode <= 'z')
        return nCode - 'a' + 26;
    if (nCode >= '0' && nCode <= '9')
        return nCode - '0' + 52;
    if (nCode == '+')
        return 62;
    if (nCode == '/')
        return 63;
    return 64;
}
int CMimeCodeBase64::Decode(un
{
    const unsigned char* pbData = m_pbInput;
    const unsigned char* pbEnd = m_pbInput + m_nInputSize;
    unsigned char* pbOutStart = pbOutput;
    unsigned char* pbOutEnd = pbOutput + nMaxSize;
    int nFrom = 0;
    unsigned char chHighBits = 0;
    while (pbData < pbEnd)
    {
        if (pbOutput >= pbOutEnd)
            break;
        unsigned char ch = *pbData++;
        if (ch == '\r' || ch == '\n')
            continue;
        ch = (unsigned char) DecodeBase64Char(ch);
        if (ch >= 64) // invalid encoding, or trailing pad '='
            break;
        switch ((nFrom++) % 4) // position within the current 4-character group
        {
        case 0:
            chHighBits = ch << 2; // shift left: top 6 bits of byte 1
            break;
        case 1:
            *pbOutput++ = chHighBits | (ch >> 4); // shift right: low 2 bits of byte 1
            chHighBits = ch << 4; // top 4 bits of byte 2
            break;
        case 2:
            *pbOutput++ = chHighBits | (ch >> 2); // low 4 bits of byte 2
            chHighBits = ch << 6; // top 2 bits of byte 3
            break;
        default:
            *pbOutput++ = chHighBits | ch; // low 6 bits of byte 3
        }
    }
    return (int)(pbOutput - pbOutStart);
}
But this decoding is only suitable for strings in formats like these:
1- Unknown charset, like: =?UNKNOWN?B?zdHf5ccg7ccgw8
2- Windows-1256 charset, like: =?windows-1256?B?2sfj4SDj5
Please help.
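The two formats listed above look like MIME encoded-words (RFC 2047), which have the shape =?charset?encoding?payload?= with B meaning Base64. As a rough sketch of how such a word could be taken apart before the Base64 step, the code below splits it and decodes the payload. The names EncodedWord, splitEncodedWord and decodeBase64 are hypothetical, and the decoder is a standalone restatement of the loop from the question, not the CMimeCodeBase64 class itself. The result is raw bytes in the named charset; converting those bytes to Unicode is a separate step that is not shown.

```cpp
#include <cassert>
#include <string>

// Parts of one encoded-word: =?charset?encoding?payload?=
struct EncodedWord {
    std::string charset;   // e.g. "windows-1256"
    char encoding;         // 'B' for Base64
    std::string payload;   // still encoded
};

// Hypothetical helper: splits an encoded-word, returns false if it is malformed.
bool splitEncodedWord(const std::string& word, EncodedWord& out)
{
    const std::size_t n = word.size();
    if (n < 8 || word.compare(0, 2, "=?") != 0 || word.compare(n - 2, 2, "?=") != 0)
        return false;
    const std::size_t q1 = word.find('?', 2);   // '?' that ends the charset
    if (q1 == std::string::npos || q1 == 2 || q1 + 2 >= n - 2 || word[q1 + 2] != '?')
        return false;
    out.charset = word.substr(2, q1 - 2);
    out.encoding = word[q1 + 1];
    out.payload = word.substr(q1 + 3, n - 2 - (q1 + 3));
    return true;
}

// Maps a Base64 character to its 6-bit value; 64 means "not a Base64 digit".
static int decodeBase64Char(unsigned char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return 64;
}

// Decodes Base64 text into raw bytes; stops at '=' padding or an invalid character.
std::string decodeBase64(const std::string& in)
{
    std::string out;
    unsigned char high = 0;
    int pos = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        if (c == '\r' || c == '\n')
            continue;
        const int v = decodeBase64Char(c);
        if (v >= 64)
            break;
        switch (pos++ % 4) {
        case 0:
            high = static_cast<unsigned char>(v << 2);
            break;
        case 1:
            out += static_cast<char>(high | (v >> 4));
            high = static_cast<unsigned char>(v << 4);
            break;
        case 2:
            out += static_cast<char>(high | (v >> 2));
            high = static_cast<unsigned char>(v << 6);
            break;
        default:
            out += static_cast<char>(high | v);
        }
    }
    return out;
}
```

For a made-up complete word such as =?US-ASCII?B?TWFu?=, splitEncodedWord yields the charset US-ASCII, the encoding B and the payload TWFu, which decodeBase64 turns into the three bytes of "Man". The samples in the question are cut off before the closing ?=, so this sketch rejects them as malformed.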
Here is a function that will do the conversion.
dest is the destination buffer that receives the result
maxspace is the maximum number of characters dest can hold
src is the input string
---
#include <stdio.h>
#include <wchar.h>

int decodeCharEntities( wchar_t *dest, size_t maxspace, const char *src );
char *str = "أهد
int main( int argc, char **argv )
{
    wchar_t dest[ 100 ];
    int len = decodeCharEntities( dest, 100, str );
    // %ls is the portable wprintf conversion for a wchar_t string
    wprintf( L"%ls (len=%d)\n", dest, len );
    return 0;
}
// helper function for decodeCharEntities
static wchar_t parseCharEntity( const char *&src )
{
    int val = 0, base = 10;
    if( *src != '#' )
        return (wchar_t) *src;
    if( *++src == 'x' )
    {
        base = 16;
        ++src;
    }
    for(;; ++src ) switch( *src )
    {
    case '0': case '1': case '2': case '3': case '4':
    case '5': case '6': case '7': case '8': case '9':
        val = val * base + ( *src & 0xf );
        break;
    case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
    case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
        val = ( val << 4 ) + 9 + ( *src & 0xf );
        break;
    default:
        return (wchar_t) val;
    case '\0':
        return 0;
    }
}
// decodes xml-style char entities
// returns the number of characters written to dest (excluding '\0')
int decodeCharEntities( wchar_t *dest, size_t maxspace, const char *src )
{
    size_t cnt;
    if( !( cnt = maxspace ) )
        return 0;
    for( --maxspace; --cnt && *src; ++dest, ++src )
    {
        if( *src != '&' )
            *dest = (wchar_t) *src;
        else if( !( *dest = parseCharEntity( ++src ) ) )
            break;
    }
    *dest = 0;
    return maxspace - cnt;
}
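You can do that like this:
#include <stdlib.h>
#include <string>
#include <iostream>
using namespace std;
// Converts the text that follows "&#" (for example "1601;") to its character.
// Returns 0 if the text is not a decimal number terminated by ';'.
wchar_t EntityToUnicode(const string& strEntity) {
    char* pcCnvEnd;
    long l = strtol(strEntity.c_str(), &pcCnvEnd, 10);
    if (*pcCnvEnd == ';') return (wchar_t) l;
    return 0;
}
// Returns the number of "&#" delimiters found in strIn.
int DecodeEntities(string strIn, wstring& strResult) {
    const string strDelim = "&#";
    int nPos = 0;
    int nCount = 0;
    int nFound;
    string strToken;
    wchar_t wc;
    while(1) {
        nFound = strIn.find(strDelim,nPos);
        if (-1 == nFound) {
            strToken = strIn.substr(nPos,strIn.le
            // Skip tokens that are not entities instead of appending a null character
            if ((wc = EntityToUnicode(strToken)) != 0) strResult += wc;
            break;
        }
        strToken = strIn.substr(nPos,nFound - nPos);
        // The text before the first "&#" is empty, so it is skipped here
        if ((wc = EntityToUnicode(strToken)) != 0) strResult += wc;
        nPos = nFound + strDelim.size();
        ++nCount;
    }
    return nCount;
}
int main () {
    wstring strResult;
    DecodeEntities("أ
    wcout << strResult;
}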
See also http://en.wikipedia.org/wi
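For comparison, here is a self-contained sketch of the same idea that also accepts hexadecimal references such as &#x641; and copies ordinary characters through unchanged. The names decodeNumericEntities and digitValue are hypothetical and do not come from either answer. It assumes the input is plain ASCII apart from the references, and that wchar_t is wide enough for the decoded values (on platforms with a 16-bit wchar_t, values above 0xFFFF would be truncated).

```cpp
#include <cassert>
#include <string>

// Returns the numeric value of a decimal or hexadecimal digit, or -1 otherwise.
static int digitValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: replaces "&#1601;" and "&#x641;" style references with
// the character they name. Malformed references are copied through as-is.
std::wstring decodeNumericEntities(const std::string& in)
{
    std::wstring out;
    std::size_t pos = 0;
    while (pos < in.size()) {
        if (in.compare(pos, 2, "&#") == 0) {
            std::size_t first = pos + 2;   // index of the first digit
            int base = 10;
            if (first < in.size() && (in[first] == 'x' || in[first] == 'X')) {
                base = 16;
                ++first;
            }
            unsigned long value = 0;
            std::size_t i = first;
            for (; i < in.size(); ++i) {
                const int d = digitValue(in[i]);
                if (d < 0 || d >= base)
                    break;
                value = value * base + d;
                if (value > 0x10FFFF)      // beyond the Unicode range: give up
                    break;
            }
            // Accept only "digits followed by ';'" with a non-zero value.
            if (i > first && i < in.size() && in[i] == ';' && value != 0) {
                out += static_cast<wchar_t>(value);
                pos = i + 1;
                continue;
            }
        }
        out += static_cast<wchar_t>(static_cast<unsigned char>(in[pos]));
        ++pos;
    }
    return out;
}
```

Calling decodeNumericEntities("&#1571;&#1607;&#1583;") gives a three-character wide string with the values 1571, 1607 and 1583, which are the three Arabic letters of the sample in the question.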
by: x4u, posted on 2006-12-19 at 05:15:26, ID: 18165308
The example you posted is not Base64 encoded. It uses the character escaping known from XML/SGML, so-called character entities. For example, &#1601; is the character ف, which has the decimal value 1601.
To successfully decode and process such strings you need to store them as 16-bit strings, i.e. wchar_t *. A char * can hold only 8-bit strings.
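A tiny sketch of why the storage type matters, assuming 8-bit char and a wchar_t of at least 16 bits (true on common platforms, though not guaranteed by the language). The function names storeNarrow and storeWide are made up for this illustration.

```cpp
#include <cassert>

// Stores a character value in one 8-bit unit; everything above the low 8 bits is lost.
unsigned char storeNarrow(unsigned long value)
{
    return static_cast<unsigned char>(value);
}

// Stores the same value in a wide character, which has room for 1601.
wchar_t storeWide(unsigned long value)
{
    return static_cast<wchar_t>(value);
}
```

With the value 1601, storeNarrow keeps only 1601 modulo 256, which is 65 (the letter 'A'), while storeWide keeps the full value.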