Writing wstring to text file

I tried writing a wstring to a text file, but I keep getting a file that contains single-byte characters instead of wide characters.

Can someone please provide example code?
MACCONSUELA Asked:
 
Axter Commented:
Another approach is to open the file in binary mode and use the stream's write() function with a cast to char*.

Example:
#include <stdlib.h>
#include <string>
#include <fstream>
using namespace std;

int main()
{
     wstring wstr = L"This is UNICODE";

     // Open in binary mode and write the raw bytes of the wide string.
     ofstream ofs;
     ofs.open("c:\\myfile.bin", ios_base::out | ios_base::trunc | ios_base::binary);
     ofs.write((char*)&wstr[0], wstr.size() * sizeof(wchar_t));

     system("pause");
     return 0;
}
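
For completeness, a minimal read-back sketch along the same lines (the file name just matches the example above, adjust as needed); it opens the file in binary mode and reads the raw bytes straight back into a wstring:

#include <fstream>
#include <string>
#include <iostream>
using namespace std;

int main()
{
     // Open the file written above in binary mode, positioned at the end so
     // tellg() gives us the size in bytes.
     ifstream ifs("c:\\myfile.bin", ios_base::in | ios_base::binary | ios_base::ate);
     streamsize bytes = ifs.tellg();
     ifs.seekg(0, ios_base::beg);

     // Read the raw bytes straight into a wstring buffer.
     wstring wstr(bytes / sizeof(wchar_t), L'\0');
     ifs.read((char*)&wstr[0], bytes);

     wcout << wstr << endl;
     return 0;
}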
 
jkr Commented:
#include <string>
#include <fstream>

using namespace std;

int main()
{
     wstring wstr = L"This is UNICODE";
     wofstream wofs("myfile.txt");

     wofs << wstr << endl;
     return 0;
}

actually should do it....
 
MACCONSUELA (Author) Commented:
That's what I tried in the first place, but it didn't work.
The text file I get has single-byte characters.
 
jkr Commented:
>>The text file I get has single-byte characters.

Err, how do you know? Most editors will display the text as readable characters; a hex editor will show you the 'real' content.
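
If you don't have a hex editor handy, a rough sketch along these lines (the file name is just an example, adjust the path to wherever your file is) will dump the raw bytes so you can see for yourself whether each character takes one byte or two:

#include <cstdio>
#include <fstream>
using namespace std;

int main()
{
     // Print every byte of the file in hex; wide characters show up as pairs of bytes.
     ifstream ifs("myfile.txt", ios_base::in | ios_base::binary);
     char c;
     while (ifs.get(c))
          printf("%02X ", (unsigned char)c);
     printf("\n");
     return 0;
}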
 
MACCONSUELA (Author) Commented:
I used VC++, and opened it as a binary file.
 
MACCONSUELA (Author) Commented:
Just in case, I copied your code and pasted it into a new VC++ project.

#include <string>
#include <iostream>
#include <fstream>

using namespace std;

int main(int argc, char* argv[])
{
     wstring wstr = L"This is UNICODE";
     wofstream wofs ( "c:\\myfile.txt");
     wofs << wstr << endl;

     return 0;
}

I ran the above program, and I still ended up with a single-byte-character text file.
I verified this by opening the file with VC++ in binary mode.
I also checked the file via Explorer and viewed its properties: the file is 17 bytes.
If it were Unicode, it would be double that size.

jkr, have you tried this code on your computer? If so, what did you get?
 
MACCONSUELA (Author) Commented:
By the way, I tried this out on Windows 2000 and on Windows 98.
I compiled it in VC++ version 6.0.

The code failed to work on either operating system.
 
fl0yd Commented:
That's really strange... I also tried to get it to work on my Win2k system with VC6 SP5. I can't get the source to compile to begin with; VC complains about the << operator not being defined for a std::wstring. So I dropped the std::wstring and placed the string literal right in its place. I #undef'd _MBCS (multi-byte character set), #define'd _UNICODE and even UNICODE, even though this shouldn't be necessary. To no avail - all I get is single-byte characters in the output file. I also know for *SURE* they are single-byte characters.
 
DanRollins Commented:
As long as wifstream can read them back, it seems that the standard allows wofstream to write 8-bit values:
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF8&th=ab095398862f62b2&rnum=12

So the question remains:  How does one write 16-bit values to the file?

Well, as with all STL problems there is a very easy solution. It is as obvious as the nose on your face. All you need to do is single-step through the code to find that the STL geniuses determine the size of the output character based upon the character attributes of the active locale. A locale is a very simple object, like they cover in DP 101 when they teach you about punchcards and all that. Sheesh. It goes without SAYING!

Here is the simple solution:

#include <locale>
#include <fstream>
#include <string>

using namespace std;

typedef codecvt<wchar_t, char, mbstate_t> Mybase;

// Simple_codecvt: a "do nothing" conversion facet. Every do_xxx() reports
// noconv, so the stream writes the wide characters out without narrowing them.
class Simple_codecvt : public Mybase {
public:
     typedef wchar_t _E;
     typedef char _To;
     typedef mbstate_t _St;
     explicit Simple_codecvt(size_t _R = 0)
          : Mybase(_R) {}
protected:
     virtual result do_in(_St& _State, const _To *_F1, const _To *_L1, const _To *& _Mid1,
          _E *_F2, _E *_L2, _E *& _Mid2) const
          {return (noconv); }
     virtual result do_out(_St& _State, const _E *_F1, const _E *_L1, const _E *& _Mid1,
          _To *_F2, _To *_L2, _To *& _Mid2) const
          {return (noconv); }
     virtual result do_unshift(_St& _State, _To *_F2, _To *_L2, _To *& _Mid2) const
          {return (noconv); }
     virtual int do_length(_St& _State, const _To *_F1, const _To *_L1, size_t _N2) const _THROW0()
          {return (_N2 < (size_t)(_L1 - _F1) ? _N2 : _L1 - _F1); }
     virtual bool do_always_noconv() const _THROW0() {return (true); }
     virtual int do_max_length() const _THROW0() {return (2); }
     virtual int do_encoding() const _THROW0() {return (2); }
};

int main()
{
     wstring wstr = L"This is UNICODE";

     // Imbue the stream with a locale that uses the pass-through facet,
     // then open the file in binary mode and write the wide string.
     locale loc = _ADDFAC(locale::classic(), new Simple_codecvt);
     wofstream wofs;
     wofs.imbue(loc);
     wofs.open("c:\\temp\\myfile.bin", ios_base::out | ios_base::trunc | ios_base::binary);
     wofs << wstr << endl;
     return 0;
}

=-=-=-=-=-=-=-
-- Dan
P.S.  I know that most programmer types are too literal to understand sarcasm, so to the above description, please add:

            NOT!
<bows>
 
DanRollins Commented:
P.S. The above code was cribbed from
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF8&selm=3b82680f%240%2423296%40wodc7nh0.news.uu.net&rnum=9

which cribbed it from:
P.J. Plauger, "Standard C/C++: Unicode Files," C/C++ Users Journal, April 1999.

 
DanRollins Commented:
Axter is correct.  The original question is about how to write a wide-character string to the file and that solution should work.  The stuff about wofstream is a side issue.

-- Dan

 
rstaveley Commented:
This is really a plea to DanRollins....

Please could you take a look at http:/Q_20797258.html#9747020 and tell me if you know why my attempt to implement your cribbed suggestion didn't work?

I appreciate that Axter has a working solution for this, doing binary writes, but I'd really like to see how it can be done with codecvt.
 
rstaveley Commented:
Actually, looking at Axter's response (all these many moons later and with the benefit of more modern compilers), I see that his approach gets UTF-32 UNICODE on GCC 3.2 and UTF-16 UNICODE on VC 7.1, which is a reflection of the respective wchar_t implementations. That is exactly what MACCONSUELA asked for, of course.

But aren't you supposed to have a wchar_t('\0') at the beginning of the file for it to be valid UNICODE.... or is that just a convention for UTF-16 XML?

--------8<--------
#include <iostream>
#include <fstream>
#include <string>

int main()
{
        using std::ios;
        std::wstring wstr = L"abc";    
        std::ofstream ofs;
        ofs.open("three_wchars.txt",ios::out|ios::trunc|ios::binary);
        ofs.write(reinterpret_cast<char*>(&wstr[0]),wstr.size()*sizeof(wchar_t));
}
--------8<--------
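
For what it's worth, the marker conventionally put at the start of UTF-16 text files is the byte order mark U+FEFF rather than a NUL; whether you need it depends on what will read the file. A rough sketch of adding one to the binary-write approach above (the file name is just illustrative, and this is an assumption about what BOM-aware editors expect rather than something from the original answers):

--------8<--------
#include <fstream>
#include <string>

int main()
{
        using std::ios;
        std::wstring wstr = L"abc";
        std::ofstream ofs("three_wchars_bom.txt", ios::out|ios::trunc|ios::binary);

        // Write the byte order mark first so BOM-aware tools can detect the encoding
        // (with GCC's 4-byte wchar_t this becomes a UTF-32 BOM instead).
        wchar_t bom = 0xFEFF;
        ofs.write(reinterpret_cast<char*>(&bom), sizeof(bom));

        // Then the raw wide characters, exactly as in the snippet above.
        ofs.write(reinterpret_cast<const char*>(wstr.data()), wstr.size()*sizeof(wchar_t));
}
--------8<--------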
Question has a verified solution.
