redsg asked:
Creating and writing into a Unicode file
I'm using Delphi 7 and would like to find out how to programmatically create a Unicode text file and write lines of Unicode strings (of WideString type) into the file. Will using FileCreate automatically encode the text file as non-Unicode?
Sorry, but Delphi 7's default string type and its text-file routines are ANSI only. You would have to write the file byte by byte, which is awkward to handle.
When using Delphi 7, you can use the TNT controls. They used to be free, but are now part of the TMS Software offerings.
ASKER
I'm already using the TNT controls, i.e. the Unicode strings/lines are currently in a TNTMemo component. The challenge for me is to create a Unicode-encoded text file and write the Unicode lines into it.
TNT controls should support TTntInifile as far as I remember. Didn't they also support text files or streams?
You should have given all the information at the beginning... :-P
ASKER
I've tried the following and it seems to write the Unicode characters into the text file. However, each line is not written on a new line when the file is viewed in Notepad, although the lines are displayed one per line when viewed in MS Word.
How can I ensure that each string being "fed" into the method is written on a new line? Adding #13#10 to the end of the WideString variable doesn't seem to work.
In addition, the FreeAndNil() function in briangochnauer's code is returning an error. Is there an alternative way to close the TFileStream instance?
procedure WriteToTextFile(newFile: TFileStream; StringToWrite: WideString);
var
  ws: PWideChar;
  buf: array of byte;
begin
  ws := PWideChar(StringToWrite + #13#10);
  SetLength(buf, Length(ws)*2);
  buf[0] := $FF; // UTF-16LE byte order mark
  buf[1] := $FE;
  Move(ws[0], buf[2], Length(buf)-2);
  newFile.Write(buf[0], Length(buf));
end;
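Editor's note: in the procedure as posted, the buffer holds Length(ws)*2 bytes, but two of them are taken by the byte order mark, so only Length(ws)-1 characters are copied and the trailing #10 is cut off. The mark is also written again on every call. That may be why Notepad does not break the lines. The following is a minimal sketch of one way around both points, not the hidden accepted solution: WriteUtf16Bom and WriteWideLine are hypothetical helper names, and output.txt is a placeholder file name.

```pascal
program WriteUtf16Lines;
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;

// Writes the UTF-16LE byte order mark. Call once, on an empty stream.
procedure WriteUtf16Bom(Stream: TStream);
const
  BOM: array[0..1] of Byte = ($FF, $FE);
begin
  Stream.WriteBuffer(BOM, SizeOf(BOM));
end;

// Writes one line followed by CR LF, two bytes per character.
procedure WriteWideLine(Stream: TStream; const Line: WideString);
var
  ws: WideString;
begin
  ws := Line + #13#10;
  // ws always contains at least the CR LF pair, so ws[1] is valid.
  Stream.WriteBuffer(ws[1], Length(ws) * SizeOf(WideChar));
end;

var
  fs: TFileStream;
begin
  fs := TFileStream.Create('output.txt', fmCreate); // placeholder file name
  try
    WriteUtf16Bom(fs);
    WriteWideLine(fs, 'first line');
    WriteWideLine(fs, 'second line');
  finally
    fs.Free;
  end;
end.
```

Because the line break is appended inside WriteWideLine, the caller does not need a separate call just to write #13#10.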
ASKER
I think I've got it:
// create TFileStream instance
str := TFileStream.Create(fileLocation, fmCreate);
// for each line (of WideString type) in TNTMemo..
for i := 0 to memo.Lines.Count-1 do
begin
  // .. insert line
  WriteToTextFile(str, memo.Lines[i]);
  // .. insert line break
  WriteToTextFile(str, #13#10);
end;
ASKER
then to include this line at the end of it all:
str.Destroy;
str.Free would be better than calling the destructor directly. Or FreeAndNil(str);
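As a minimal sketch of that advice: wrapping the work in try..finally ensures the stream is released even if a write fails. FreeAndNil is declared in SysUtils, so the error reported earlier may simply come from that unit missing from the uses clause, although the thread does not confirm this. The file name is a placeholder.

```pascal
program FreeStreamSafely;
{$APPTYPE CONSOLE}
uses
  SysUtils, // declares FreeAndNil
  Classes;  // declares TFileStream and fmCreate
var
  str: TFileStream;
begin
  str := TFileStream.Create('output.txt', fmCreate); // placeholder file name
  try
    // ... write to the stream here ...
  finally
    FreeAndNil(str); // frees the object and sets str to nil
  end;
end.
```

Calling str.Free inside the finally block works just as well; FreeAndNil only adds the step of clearing the variable.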
ASKER
@briangochnauer,
Understood what you meant. I shall build up the WideString with the necessary #13#10's before passing it to the write function. Thanks much!
>> i shall build up the widestring with the necessary #13#10's before passing it through the write function.
This is not needed with my procedure (shown below). It takes a string and works correctly under Delphi 7 through Delphi XE.
Building the string with CRLF (#13#10) is needed, but not as a WideString.
I suggest using a TStringList to concatenate the strings:
var SL: TStringList;
SL := TStringList.Create;
Then you can add a string with SL.Add('My next string');
and finally
WriteUnicodeFileString('c:\temp\file name.txt', SL.Text);
procedure WriteUnicodeFileString(const AFilename: string; const AString: string);
const
  BOM: array[0..1] of Byte = ($FF, $FE); // UTF-16LE byte order mark
var
  str: TFileStream;
  ws: WideString;
begin
{$IFDEF UNICODE}
  ws := AString;             // Delphi 2009 and later: string is already UTF-16
{$ELSE}
  ws := UTF8Decode(AString); // earlier versions: AString is expected to hold UTF-8
{$ENDIF}
  str := TFileStream.Create(AFilename, fmCreate);
  try
    str.WriteBuffer(BOM, SizeOf(BOM));
    // Write every character, starting at the first one (WideString is 1-based)
    if ws <> '' then
      str.WriteBuffer(ws[1], Length(ws) * SizeOf(WideChar));
  finally
    FreeAndNil(str); // requires SysUtils in the uses clause
  end;
end;
A quote from my Delphi XE Development Essentials courseware manual:
>> Console or Text File I/O
First the bad news: neither console nor Text file I/O support reading Unicode strings. And writing also only supports AnsiStrings. This means that as soon as you call write or writeln, the contents of a (Unicode) string will be converted to AnsiString when needed, and written to the output.
This means that any Text file I/O needs to be rewritten using streams or other techniques. However, since a UTF8String is also an AnsiString (with the 65001 code page specified), there is a good workaround for writing to console output provided you set the console codepage to UTF8 and use a font that can display the Unicode characters (that’s Lucida Console for example):
program ConsoleUTF8;
{$APPTYPE CONSOLE}
uses
  Windows, SysUtils;
begin
  SetConsoleOutputCP(65001);
  write(AnsiChar(239), AnsiChar(187), AnsiChar(191)); // UTF-8 BOM
  Writeln(Output, UTF8String('[¿¿¿¿¿¿¿¿¿¿¿¿ ¿¿¿¿¿¿¿]'));
end.
This will produce Cyrillic characters on the standard output. Note that Lucida Console cannot display all Unicode characters – Chinese and the Clef are not shown, but at least Cyrillic characters display without problems.
Note that I’m also writing the BOM to the output in case you want to save the console output to a text file and read it afterwards. That way, you can set the font afterwards and also see the Chinese or Clef characters without problems. Provided they were written as UTF8.
This is also the basis for writing UTF8 data to Text files: printing UTF8Strings on a file which starts with the UTF-8 BOM:
program UnicodeTextFile;
{$APPTYPE CONSOLE}
uses
  Windows, SysUtils;
var
  F: Text;
begin
  Assign(F, 'output.txt');
  Rewrite(f);
  write(f, AnsiChar(239), AnsiChar(187), AnsiChar(191)); // UTF-8 BOM
  writeln(f, UTF8String('[¿¿¿¿¿¿¿¿¿¿¿¿ ¿¿¿¿¿¿¿]'));
  Close(f);
end.
Since UTF8String is an AnsiString, we can combine the code above with writeln of normal strings, which will be converted to AnsiStrings, as long as we keep away from high-ascii characters (since these would indicate the start of a UTF8 special character byte sequence).
program UnicodeTextFile;
{$APPTYPE CONSOLE}
uses
  Windows, SysUtils;
var
  F: Text;
begin
  Assign(F, 'output.txt');
  Rewrite(f);
  write(f, AnsiChar(239), AnsiChar(187), AnsiChar(191)); // UTF-8 BOM
  writeln(f, UTF8String('[¿¿¿¿¿¿¿¿¿¿¿¿ ¿¿¿¿¿¿¿]'));
  writeln(f, 'This is a UTF-16 String which will be written as AnsiString');
  Close(f);
end.
As long as we convert UTF-16 Unicode Strings to UTF8 before writing to Text files, and don’t forget to use the UTF-8 BOM as prefix, this will work fine for writing files with Unicode UTF-8 output.
>> TStrings / TStringList
Apart from the UTF-8 text file trick just covered, the easiest way to produce text output that supports the TEncoding formats is to use the SaveToFile method of a TStrings or TStringList. The SaveToFile method has been extended with a second argument, specifying the encoding.
begin
Memo1.Lines.SaveToFile('Me
By default, the second argument uses TEncoding.Default, which is the default ANSI Code Page of the machine. This means that by default, the SaveToFile will not produce Unicode output, but ANSI output instead (in other words: the previous behavior of the application, but any explicit Unicode characters or data will be lost, unless the SaveToFile gets a second argument value using a TEncoding field other than Default, ASCII or UTF7).
Note that the corresponding LoadFromFile does not need a second argument of type TEncoding (although an overload that accepts one exists), since the encoding can normally be determined from the BOM in the first few bytes of the file:
Memo1.Lines.LoadFromFile('
end;
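For illustration, here is a minimal sketch of the SaveToFile approach described in the quote, assuming Delphi 2009 or later (where TEncoding is available in SysUtils). It uses a TStringList instead of Memo1 so that it runs as a console program; the file names are placeholders and are not taken from the truncated courseware listing.

```pascal
program SaveWithEncoding;
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;
var
  SL: TStringList;
begin
  SL := TStringList.Create;
  try
    SL.Add('first line');
    SL.Add('second line');
    // TEncoding.Unicode is UTF-16 little endian
    SL.SaveToFile('utf16.txt', TEncoding.Unicode); // placeholder file name
    // TEncoding.UTF8 keeps ASCII text compact
    SL.SaveToFile('utf8.txt', TEncoding.UTF8);     // placeholder file name
  finally
    SL.Free;
  end;
end.
```

This route is not available in Delphi 7, where the stream-based procedures earlier in the thread remain the practical option.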