Solved

Delphi: Japanese website and TWebBrowser

Posted on 2011-03-01
Last Modified: 2012-05-11
Hello experts,
I have to capture some elements from a Japanese website using the TWebBrowser component.
I have Delphi 6 and Delphi 2007.

The trouble is that when I look inside the source from TWebBrowser, I get ????? instead of Japanese characters.

I believe it is a UTF-8 or Unicode problem.
Any idea how to do it with Delphi 6 or Delphi 2007?

Or do I need a more recent version of Delphi? Will TWebBrowser work on it? How should I adapt my program?

Sorry for all these questions, but I have no idea how to work with Japanese characters or what changes it implies.

Regards

Question by:yarekGmail
21 Comments
 
LVL 9

Expert Comment

by:Mahdi78
ID: 35006515
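Add MSHTML to the uses clause and put the following in the OnDocumentComplete event handler:


uses MSHTML;

procedure TForm1.WebBrowser1DocumentComplete(ASender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
begin
  // Use "as" (QueryInterface) rather than a hard cast of the IDispatch reference
  if Assigned(WebBrowser1.Document) then
    (WebBrowser1.Document as IHTMLDocument2).charset := 'utf-8';
end;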

 

Author Comment

by:yarekGmail
ID: 35006739
OK, I have added this, but I still get: ????

1) Do you think it can work on Delphi 6 or Delphi 2010?
2) If you can help me, I am OK to pay you for that.
Regards
 
LVL 9

Expert Comment

by:Mahdi78
ID: 35006807
I have some questions:
Do you have this problem with all Japanese websites?

Is the website yours (you can manage it)?

Can you give me the website address?

 

Author Comment

by:yarekGmail
ID: 35006940
It is not my website.
Example: http://crm.cegedim.jp/

But it can be any Japanese website!

Regards
 
LVL 9

Accepted Solution

by:
Mahdi78 earned 668 total points
ID: 35006989
I made an application with Delphi 2009 (which supports Unicode) and attached it to this reply. I browsed the website you gave me and it works well; the Japanese characters are displayed clearly. Try this application with your website in the following way:

1- Select the utf-8 checkbox and browse the website.
2- Unselect the utf-8 checkbox and browse the website.

Then tell me what happened in each step.
Project1.exe
 

Author Comment

by:yarekGmail
ID: 35007194
Yes, everything looks great, but this is not the goal:
display works well for me in Delphi 6 as well.

But when I get the source HTML and try to save it, I lose the Japanese characters and get '????'.

I use this to get the source code from the TWebBrowser component:
http://delphi.about.com/od/adptips2005/qt/webbrowserhtml.htm

The goal of the project is not to display the web page correctly (this works), but to capture some data from it. Now, when I get the data through the source code, it contains '????' characters.

Regards
 
LVL 9

Expert Comment

by:Mahdi78
ID: 35007308
You should save the HTML code as a UTF-8 file.
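
For Delphi 6 or Delphi 2007, one way to do that is sketched below. It assumes the HTML is kept in a WideString from the moment it leaves the browser, and that the file should start with a UTF-8 byte order mark (optional, but it helps editors detect the encoding). SaveWideAsUtf8 is a hypothetical helper name, not something from this thread or from the RTL.

```pascal
uses
  SysUtils, Classes;

// Hypothetical helper: writes a WideString to disk as UTF-8 bytes.
procedure SaveWideAsUtf8(const Html: WideString; const FileName: string);
const
  // UTF-8 byte order mark (assumption: the reader of the file wants one)
  Utf8Bom: array[0..2] of Byte = ($EF, $BB, $BF);
var
  Bytes: UTF8String;
  Stream: TFileStream;
begin
  // Encode straight from UTF-16 to UTF-8, never through an ANSI string
  Bytes := UTF8Encode(Html);
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    Stream.WriteBuffer(Utf8Bom, SizeOf(Utf8Bom));
    if Length(Bytes) > 0 then
      Stream.WriteBuffer(Pointer(Bytes)^, Length(Bytes));
  finally
    Stream.Free;
  end;
end;
```

It could be called as SaveWideAsUtf8(WebBrowser1.OleObject.Document.documentElement.outerHTML, 'file.html'); the OleVariant result converts to WideString without going through the ANSI code page.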
 

Author Comment

by:yarekGmail
ID: 35007338
Hello
Thanks again for your help

Now about your answer: HOW? Since the source code of the page gives me '????' instead of Japanese characters, when I save it I also get '????'.
I am stuck here.

regards

 
LVL 9

Expert Comment

by:Mahdi78
ID: 35007464
OK, save the HTML text from the memo this way:

Memo1.Lines.SaveToFile(ExtractFilePath(Application.ExeName)+'file.html', TEncoding.UTF8);

I tried it with Arabic characters and it works well.
 
LVL 9

Expert Comment

by:Mahdi78
ID: 35007630
To save the HTML source of the web browser to a UTF-8 file, use this code:


procedure TForm1.Button4Click(Sender: TObject);
var
  List: TStringList;
begin
  List := TStringList.Create;
  try
    List.Text := WebBrowser1.OleObject.Document.documentElement.innerHTML;
    // The TEncoding overload of SaveToFile exists only in Delphi 2009 and later
    List.SaveToFile(ExtractFilePath(Application.ExeName)+'file.html', TEncoding.UTF8);
  finally
    List.Free;
  end;
end;

 

Author Comment

by:yarekGmail
ID: 35007773
Error in Delphi 2007:
SaveToFile takes only one parameter, so the compiler rejects
, TEncoding.UTF8

Regards
 
LVL 9

Expert Comment

by:Mahdi78
ID: 35008066
Be right back.
This application was built with Delphi 2009 and I didn't get the error; please check it.


Project1.exe
 

Author Comment

by:yarekGmail
ID: 35008222
No, it does not work correctly here:
I check the utf-8 box, then press the GO button,
and when the page is loaded I press the Save button.
Then when I open the file in Notepad++ (with UTF-8 checked) I get blank characters as well!
The characters are still bad.
 
LVL 24

Expert Comment

by:jimyX
ID: 35008563
Hmmm, a Unicode issue. Delphi 2007 does not support Unicode by default; you have to use supporting packages or upgrade to Delphi 2009 or above. Anyway, here is a PAQ; I hope it helps:
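
To illustrate where the '????' comes from: in Delphi 6 and Delphi 2007, string is AnsiString, so assigning the browser's WideString HTML to a string or TStringList converts it through the system ANSI code page. The console sketch below is only an illustration (the program name and the two sample characters are arbitrary); it shows the lossy conversion next to UTF8Encode, which works on the WideString directly.

```pascal
program AnsiLossDemo;
{$APPTYPE CONSOLE}

var
  Wide: WideString;
  Narrow: AnsiString;
begin
  // Two Japanese characters, U+65E5 and U+672C, built from their code points
  Wide := WideChar($65E5);
  Wide := Wide + WideChar($672C);

  // Converting to AnsiString goes through the system ANSI code page;
  // characters that code page cannot represent are replaced, typically by '?'
  Narrow := AnsiString(Wide);
  WriteLn('ANSI text:    ', Narrow);

  // UTF8Encode works from the WideString, so nothing is lost
  WriteLn('UTF-8 length: ', Length(UTF8Encode(Wide))); // 6 (3 bytes per character)
end.
```

On a Windows installation whose ANSI code page is not Japanese, the first line would be expected to show replacement characters, which matches what ends up in the saved file.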

http://www.experts-exchange.com/Programming/Languages/Pascal/Delphi/Q_26807134.html
 
LVL 9

Expert Comment

by:Mahdi78
ID: 35009564
Open the file with Notepad and look at the attached screenshot.


Try this code. TEncoding is only available from Delphi 2009 onward, so you should use Delphi 2009 or Delphi 2010.

procedure TForm1.Button2Click(Sender: TObject);
var
  UTF8Encoding: TEncoding;
begin
  // Code page 65001 is UTF-8; GetEncoding returns a new instance that the caller must free
  UTF8Encoding := TEncoding.GetEncoding(65001);
  try
    Memo1.Lines.Text := WebBrowser1.OleObject.Document.documentElement.innerHTML;
    Memo1.Lines.SaveToFile(ExtractFilePath(Application.ExeName)+'file.html', UTF8Encoding);
  finally
    UTF8Encoding.Free;
  end;
end;

Screenshot.jpg
 
LVL 9

Expert Comment

by:Mahdi78
ID: 35015808
I have another solution if you need it, without using D2009, D2010 or the TMS Unicode components.
It is a DLL that I will build for you to use with any of your projects.

If you want to go this way, tell me.
 

Author Comment

by:yarekGmail
ID: 35015859
Great! That would be the best solution. Maybe we can get in touch so I can explain the project to you in detail: yarekc at gmail dot com.

Regards
 
LVL 9

Expert Comment

by:Mahdi78
ID: 35015901

OK, I sent you an email.

You're welcome.
 
LVL 9

Expert Comment

by:Mahdi78
ID: 35025557
I have attached the DLL to this reply. Put it in the project folder and use it as in the following sample:


unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls, OleCtrls, SHDocVw;

type
  TForm1 = class(TForm)
    WebBrowser1: TWebBrowser;
    Button1: TButton;
    Edit1: TEdit;
    Button4: TButton;
    procedure Button1Click(Sender: TObject);
    procedure Button4Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

  Function SaveHTML(WB : TWebBrowser; Filename: string ): Boolean; external 'MyDLL.dll';

implementation

uses mshtml;

{$R *.dfm}

procedure TForm1.Button1Click(Sender: TObject);
begin
  WebBrowser1.Navigate(Edit1.Text);
end;

procedure TForm1.Button4Click(Sender: TObject);
begin
  if SaveHTML(WebBrowser1, ExtractFilePath(Application.ExeName)+'file.html') then
    ShowMessage('File exported successfully!');
end;

end.


MyDLL.dll
 
LVL 3

Assisted Solution

by:sYk0
sYk0 earned 664 total points
ID: 35025696
There's a very simple answer to this problem.

In Delphi 2007 add "WideStrings" to the uses section of your application then do as Mahdi78 suggested but with a few modifications.

uses
  ..., WideStrings;

implementation

procedure TForm1.Button1Click(Sender: TObject);
var
  List : TWideStringList;
begin
  List := TWideStringList.Create;
  try
    List.Text := WebBrowser1.OleObject.Document.documentElement.innerHTML;
    List.SaveToFile(ExtractFilePath(Application.ExeName)+'file.html');
  finally
    List.Free;
  end;
end;

 
LVL 32

Assisted Solution

by:Ephraim Wangoya
Ephraim Wangoya earned 668 total points
ID: 35173674
You don't need an external DLL or any special function. You just need to use UTF8Encode, which converts the text properly to UTF-8. Also, don't save using TStringList or TMemo.

// Requires MSHTML in the uses clause (IHTMLDocument2, IHTMLElement)
procedure TForm3.Button1Click(Sender: TObject);
begin
  WebBrowser1.Navigate('http://crm.cegedim.jp/');
end;

procedure TForm3.ButtonSaveClick(Sender: TObject);
var
  Element: IHTMLElement;
  FileName: string;
  Text: UTF8String;
  Stream: TFileStream;
begin
  if Assigned(WebBrowser1.Document) then
  begin
    Element := (WebBrowser1.Document as IHTMLDocument2).body;
    if Element = nil then
      Exit; // the document has no body yet

    // Walk up to the root element so the whole page is captured
    while Element.parentElement <> nil do
      Element := Element.parentElement;

    // outerHTML is a WideString; encode it to UTF-8 without passing through an ANSI string
    Text := UTF8Encode(Element.outerHTML);

    FileName := ChangeFileExt(ParamStr(0), '.html');
    // fmCreate alone is sufficient to create or overwrite the file
    Stream := TFileStream.Create(FileName, fmCreate);
    try
      if Length(Text) > 0 then
        Stream.WriteBuffer(Pointer(Text)^, Length(Text));
    finally
      FreeAndNil(Stream);
    end;
  end;
end;

procedure TForm3.WebBrowser1DocumentComplete(ASender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
begin
  if Assigned(WebBrowser1.Document) then
    (WebBrowser1.Document as IHTMLDocument2).charset := 'utf-8';
end;
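
Since the original goal was to capture some elements rather than the whole page, the same idea can be applied per element. The following is a rough sketch, assuming the MSHTML unit shipped with your Delphi version declares IHTMLDocument3; CollectTagText is a hypothetical helper name. The text stays in a WideString the whole time, so it can be passed to UTF8Encode afterwards.

```pascal
uses
  SysUtils, SHDocVw, MSHTML;

// Hypothetical helper: returns the text of every element with the given
// tag name, one element per line, without converting to an ANSI string.
function CollectTagText(WB: TWebBrowser; const TagName: WideString): WideString;
var
  Doc3: IHTMLDocument3;
  Items: IHTMLElementCollection;
  Elem: IHTMLElement;
  I: Integer;
begin
  Result := '';
  // Supports returns False when no document is loaded yet
  if not Supports(WB.Document, IHTMLDocument3, Doc3) then
    Exit;
  Items := Doc3.getElementsByTagName(TagName);
  for I := 0 to Items.length - 1 do
    if Supports(Items.item(I, 0), IHTMLElement, Elem) then
      Result := Result + Elem.innerText + #13#10;
end;
```

For example, Text := UTF8Encode(CollectTagText(WebBrowser1, 'td')) would give UTF-8 bytes that can be written with the same TFileStream code as above.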

