Delphi: Japanese website and TWebBrowser

Hello experts,
I have to capture some elements from a Japanese website using the TWebBrowser component.
I have Delphi 6 and Delphi 2007.

The trouble is that when I look at the page source obtained from TWebBrowser, I get ????? instead of Japanese characters.

I believe this is a UTF-8 or Unicode problem.
Any idea how to do this with Delphi 6 or Delphi 2007?

Or do I need a more recent version of Delphi? Will TWebBrowser work with it? How should I adapt my program?

Sorry for all these questions, but I have no idea how to work with Japanese characters or what changes that implies.

regards

yarekGmail Asked:
 
Mahdi78 Commented:
I made an application with Delphi 2009 (which supports Unicode) and attached it to this reply. I browsed the website you gave me and it works well: the Japanese characters are clear. Try this application with your website as follows:

1- Select the UTF-8 checkbox and browse the website.
2- Unselect the UTF-8 checkbox and browse the website.

Then tell me what happened at each step.
Project1.exe
 
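Mahdi78 Commented:
Add mshtml to the uses clause and put the following in the OnDocumentComplete event handler:


uses mshtml;

procedure TForm1.WebBrowser1DocumentComplete(ASender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
begin
  // Use "as" so the interface is obtained via QueryInterface rather than a hard cast
  if Assigned(WebBrowser1.Document) then
    (WebBrowser1.Document as IHTMLDocument2).charset := 'utf-8';
end;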

 
yarekGmail Author Commented:
OK, I have added this, but I still get: ????

1) Do you think it can work on Delphi 6 or Delphi 2010?
2) If you can help me, I am OK to pay you for that.
Regards
 
Mahdi78 Commented:
I have some questions:
Do you have this problem with all Japanese websites?

Is the website yours (can you manage it)?

Can you give me the website address?
 
yarekGmail Author Commented:
It is not my website.
Example: http://crm.cegedim.jp/

But it could be any Japanese website!

regards
 
yarekGmail Author Commented:
Yes, everything looks great, but this is not the goal:
display works well for me too in D6.

But when I get the source HTML and try to save it, I lose the Japanese characters and get '????'.

I use this to get the source code from the TWebBrowser component:
http://delphi.about.com/od/adptips2005/qt/webbrowserhtml.htm

The goal of the project is not to display the web page correctly (this works), but to capture some data from it. Right now, when I get the data through the source code, it comes out as '????' characters.

Regards
 
Mahdi78 Commented:
You should save the HTML code as a UTF-8 file.
 
yarekGmail Author Commented:
Hello
Thanks again for your help

Now about your answer: how? The source code of the page already gives me '????' instead of Japanese characters, so when I save it I also get '????'.
I am stuck here.

regards

 
Mahdi78 Commented:
OK, save the HTML text held in the memo this way:

Memo1.Lines.SaveToFile(ExtractFilePath(Application.ExeName) + 'file.html', TEncoding.UTF8);

I tried it with Arabic characters and it works well.
 
Mahdi78 Commented:
To save the HTML source of the web browser to a UTF-8 file, use this code:


procedure TForm1.Button4Click(Sender: TObject);
var
  List: TStringList;
begin
  List := TStringList.Create;
  try
    // Read the page markup from the browser document
    List.Text := WebBrowser1.OleObject.Document.documentElement.innerHTML;
    // Write it next to the executable, encoded as UTF-8
    List.SaveToFile(ExtractFilePath(Application.ExeName) + 'file.html', TEncoding.UTF8);
  finally
    List.Free;
  end;
end;

 
yarekGmail Author Commented:
Error in Delphi 2007:
SaveToFile takes only one parameter! It does not accept the second argument:
, TEncoding.UTF8

Regards

 
Mahdi78 Commented:
Be right back.
This application was built with Delphi 2009 and I didn't get the error. Check it:


Project1.exe
 
yarekGmail Author Commented:
No, it does not work correctly here:
I check the UTF-8 box, then press the Go button,
and when the page is loaded I press the Save button.
Then when I open the file in Notepad++ (with UTF-8 checked) I get blank characters as well!
The characters are still bad.
 
jimyX Commented:
Hmm, a Unicode issue. Delphi 2007 does not support Unicode by default; you have to use supporting packages or upgrade to Delphi 2009 or above. Anyway, here is a PAQ, hope it helps:

http://www.experts-exchange.com/Programming/Languages/Pascal/Delphi/Q_26807134.html
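
A minimal sketch of where the '????' can come from, assuming a pre-2009 Delphi in which string is AnsiString. The program name and variable names are only illustrative. Assigning a WideString to an ANSI string converts it through the system ANSI code page, and characters that the code page cannot represent are replaced, typically with '?':

```pascal
program AnsiLossDemo;
{$APPTYPE CONSOLE}

var
  Wide: WideString;
  Narrow: AnsiString;
begin
  // Build two Japanese characters (U+65E5, U+672C) from code points,
  // so the demo does not depend on the encoding of the source file
  SetLength(Wide, 2);
  Wide[1] := WideChar($65E5);
  Wide[2] := WideChar($672C);

  // Implicit WideString -> AnsiString conversion uses the system ANSI code page;
  // on a code page without Japanese characters the text is lost at this point
  Narrow := Wide;

  Writeln('WideString length: ', Length(Wide));
  Writeln('AnsiString content: ', Narrow);
end.
```

The same conversion happens when innerHTML (a WideString coming from the browser) is assigned to TStringList.Text or TMemo.Lines.Text in Delphi 6 or 2007, which is why saving afterwards cannot bring the characters back.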
 
Mahdi78 Commented:
Open the file with Notepad and look at the attached screenshot.

Try this code; otherwise you should use Delphi 2009 or Delphi 2010:

procedure TForm1.Button2Click(Sender: TObject);
var
  UTF8Encoding: TEncoding;
begin
  // GetEncoding returns a new instance that the caller must free
  UTF8Encoding := TEncoding.GetEncoding(65001); // 65001 = UTF-8 code page
  try
    Memo1.Lines.Text := WebBrowser1.OleObject.Document.documentElement.innerHTML;
    Memo1.Lines.SaveToFile(ExtractFilePath(Application.ExeName) + 'file.html', UTF8Encoding);
  finally
    UTF8Encoding.Free;
  end;
end;

Screenshot.jpg
 
Mahdi78 Commented:
I have another solution if you need it, without using D2009, D2010 or the TMS Unicode components.
It is a DLL that I will build for use with any of your projects.

If you want to go this way, tell me.
 
yarekGmail Author Commented:
Great! That would be the best solution. Maybe we can get in touch so I can explain the project to you in detail: yarekc at gmail dot com.

regards
 
Mahdi78 Commented:

OK, I sent you an email.

You're welcome
 
Mahdi78 Commented:
I have attached the DLL to this reply. Put it in the project folder and use it as in the following sample:


unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls, OleCtrls, SHDocVw;

type
  TForm1 = class(TForm)
    WebBrowser1: TWebBrowser;
    Button1: TButton;
    Edit1: TEdit;
    Button4: TButton;
    procedure Button1Click(Sender: TObject);
    procedure Button4Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

  Function SaveHTML(WB : TWebBrowser; Filename: string ): Boolean; external 'MyDLL.dll';

implementation

uses mshtml;

{$R *.dfm}

procedure TForm1.Button1Click(Sender: TObject);
begin
  WebBrowser1.Navigate(Edit1.Text);
end;

procedure TForm1.Button4Click(Sender: TObject);
begin
  if SaveHTML(WebBrowser1, ExtractFilePath(Application.ExeName) + 'file.html') then
    ShowMessage('File exported successfully!');
end;

end.

MyDLL.dll
 
sYk0 Commented:
There's a very simple answer to this problem.

In Delphi 2007, add "WideStrings" to the uses clause of your application, then do as Mahdi78 suggested with a few modifications:

uses
  ..., WideStrings;

implementation

procedure TForm1.Button1Click(Sender: TObject);
var
  List : TWideStringList;
begin
  List := TWideStringList.Create;
  try
    List.Text := WebBrowser1.OleObject.Document.documentElement.innerHTML;
    List.SaveToFile(ExtractFilePath(Application.ExeName)+'file.html');
  finally
    List.Free;
  end;
end;
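
Since the stated goal is to capture data from the page rather than to save it, the same idea can be applied to a single element: keep the value in a WideString from start to finish. A rough sketch, assuming the target element has an id attribute; GetElementTextById is a hypothetical helper name, not something from the RTL or from this thread:

```pascal
uses
  SysUtils, SHDocVw, MSHTML;

// Hypothetical helper: returns the text of the element with the given id,
// or an empty string when the document or the element is not available.
function GetElementTextById(Browser: TWebBrowser;
  const ElementId: WideString): WideString;
var
  Doc3: IHTMLDocument3;
  Element: IHTMLElement;
begin
  Result := '';
  // Supports performs a QueryInterface and returns False for a nil document
  if Supports(Browser.Document, IHTMLDocument3, Doc3) then
  begin
    Element := Doc3.getElementById(ElementId);
    if Assigned(Element) then
      Result := Element.innerText; // innerText is a WideString, so no ANSI conversion here
  end;
end;
```

The result stays intact only as long as it is not assigned to a plain string, a TStringList or a TMemo in Delphi 6 or 2007.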

 
Ephraim Wangoya Commented:
You don't need an external DLL or any special function. You just need UTF8Encode, which converts the text to a properly encoded UTF-8 string. Also, don't save using TStringList or TMemo.

uses mshtml;

procedure TForm3.Button1Click(Sender: TObject);
begin
  WebBrowser1.Navigate('http://crm.cegedim.jp/');
end;

procedure TForm3.ButtonSaveClick(Sender: TObject);
var
  Element: IHTMLElement;
  FileName: string;
  Text: UTF8String;
  Stream: TFileStream;
begin
  if Assigned(WebBrowser1.Document) then
  begin
    Element := (WebBrowser1.Document as IHTMLDocument2).body;
    if not Assigned(Element) then
      Exit;

    // Walk up from <body> to the root element so the whole document is captured
    while Element.parentElement <> nil do
      Element := Element.parentElement;

    // outerHTML is a WideString; encode it to UTF-8 without going through an ANSI string
    Text := UTF8Encode(Element.outerHTML);

    // Save next to the executable, with the same base name and an .html extension
    FileName := ChangeFileExt(ParamStr(0), '.html');
    Stream := TFileStream.Create(FileName, fmCreate);
    try
      if Length(Text) > 0 then
        Stream.WriteBuffer(Pointer(Text)^, Length(Text));
    finally
      Stream.Free;
    end;
  end;
end;

procedure TForm3.WebBrowser1DocumentComplete(ASender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
begin
  // Use "as" so the interface is obtained via QueryInterface rather than a hard cast
  if Assigned(WebBrowser1.Document) then
    (WebBrowser1.Document as IHTMLDocument2).charset := 'utf-8';
end;
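
If an editor still guesses the wrong encoding when opening the saved file, one option is to write a UTF-8 byte order mark (the bytes EF BB BF) before the text. A minimal sketch, assuming the HTML is already held in a WideString; SaveWideStringAsUtf8 is a hypothetical helper name, not part of the RTL:

```pascal
uses
  Classes, SysUtils;

// Hypothetical helper: writes Source to FileName as UTF-8, preceded by a BOM.
procedure SaveWideStringAsUtf8(const FileName: string; const Source: WideString);
const
  Utf8Bom: array[0..2] of Byte = ($EF, $BB, $BF);
var
  Encoded: UTF8String;
  Stream: TFileStream;
begin
  // Encode directly from WideString to UTF-8, never through an ANSI string
  Encoded := UTF8Encode(Source);
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    // The BOM lets editors that sniff the encoding recognise the file as UTF-8
    Stream.WriteBuffer(Utf8Bom, SizeOf(Utf8Bom));
    if Length(Encoded) > 0 then
      Stream.WriteBuffer(Pointer(Encoded)^, Length(Encoded));
  finally
    Stream.Free;
  end;
end;
```

It could be called as SaveWideStringAsUtf8(FileName, Element.outerHTML) in place of the stream code above. Whether the BOM is wanted depends on what reads the file afterwards; some parsers prefer UTF-8 without it.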

Question has a verified solution.
