Solved

reading web page data

Posted on 2003-12-05
Last Modified: 2010-04-05
When I display a program in Netscape 7, I can use "File/Save Page As..." to save the page contents as either an HTML file or a text file.

In Delphi 7 I can display a web page using the WebBrowser control.

Does anyone know how to save the contents of the page displayed in a web browser to a text file, preferably in plain text format (i.e. dropping all the HTML tags)?

Alternatively, can I download an HTML file directly from a site using some other method within Delphi?

Having downloaded a file containing HTML code, can I strip that back to just the actual text displayed, without all the formatting tags?
Question by:Kymberley
11 Comments
 
LVL 23

Accepted Solution

by:
Ferruccio Accalai earned 600 total points
ID: 9881711
uses UrlMon;

procedure TForm1.Button1Click(Sender: TObject);
begin
  // URLDownloadToFile returns S_OK (0) on success.
  if URLDownloadToFile(nil,
    'http://www.experts-exchange.com/Programming/Programming_Languages/Delphi/Q_20817340.html',
    'c:\MyQuestion.txt', 0, nil) <> 0 then
    MessageBox(Handle, 'An error occurred while downloading the file.',
      PChar(Application.Title), MB_ICONERROR or MB_OK);
end;
 
LVL 26

Assisted Solution

by:EddieShipman
EddieShipman earned 200 total points
ID: 9884823
You can use this to get just the text from an HTML string:

uses ..., MSHTML, ActiveX, Variants;

function RemoveHTMLFromString(const AHTML: string): string;
var
  vDocument: IHTMLDocument2;
  vHTML: OleVariant;
begin
  // Create an in-memory HTML document; no visible browser control is needed.
  vDocument := CoHTMLDocument.Create as IHTMLDocument2;
  vDocument.designMode := 'On';
  // IHTMLDocument2.write expects a SafeArray of variants.
  vHTML := VarArrayCreate([0, 0], varVariant);
  vHTML[0] := AHTML;
  vDocument.Write(PSafeArray(TVarData(vHTML).VArray));
  vDocument.Close;
  // outerText of the body is the rendered text with the tags removed.
  Result := vDocument.body.outerText;
  vDocument := nil;
end;
 
LVL 1

Assisted Solution

by:mgazza
mgazza earned 200 total points
ID: 9906642
Why don't we just use WinInet rather than third-party components?
Oh, and declare WinInet in the uses clause.

procedure HTTPDownload(const Remote: string; var Data: string);
var
  Session, Request: HINTERNET;
  BytesRead: DWORD;
  Buffer: array[0..511] of Char;
  Chunk: string;
begin
  Session := InternetOpen('Mozilla/4.0 (compatible)', INTERNET_OPEN_TYPE_PRECONFIG, nil, nil, 0);
  try
    Request := InternetOpenUrl(Session, PChar(Remote), nil, 0, INTERNET_FLAG_RAW_DATA, 0);
    if Request <> nil then
    begin
      try
        // Append only the bytes actually read; stop on failure or end of data.
        while InternetReadFile(Request, @Buffer, SizeOf(Buffer), BytesRead) and (BytesRead > 0) do
        begin
          SetString(Chunk, PChar(@Buffer[0]), BytesRead);
          Data := Data + Chunk;
        end;
      finally
        InternetCloseHandle(Request);
      end;
    end
    else
      MessageBox(0, 'Could Not Resolve Host!', 'Error', 0);
  finally
    // The session handle must be closed as well as the request handle.
    if Session <> nil then
      InternetCloseHandle(Session);
  end;
end;
 
LVL 1

Expert Comment

by:mgazza
ID: 9906666
Input a fully valid HTTP address and you get the raw data back. If you point it at a script, it returns the script's output, not the script file itself!
 
LVL 26

Expert Comment

by:EddieShipman
ID: 9906884
mgazza, it will return the HTML text and that is what idHTTP does for him, anyway.
 
LVL 1

Expert Comment

by:mgazza
ID: 9906937
Yeah, I know. It's just that this is all you need; no web browser component.
 

Author Comment

by:Kymberley
ID: 9909656
Thanks for your comments. The URLDownloadToFile method in the first response worked, so I evidently have whatever other components were required. I wrote my own HTML stripping routine: since the data was all in HTML tables, I was able to build up rows from cells in tab-delimited format.

BTW - It's her not him

Kymberley
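
For anyone reading later, the table-to-tab-delimited idea described above might look roughly like the sketch below. This is not the routine Kymberley wrote; the function name TableHtmlToTsv and the sample table are made up for illustration. It assumes simple, non-nested tables, ignores HTML entities such as &amp;, and discards any text after the last closing row tag.

```pascal
program TableToTsv;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: flattens HTML table markup into tab-delimited rows.
// Closing </td> and </th> tags become tabs, closing </tr> tags end the row,
// and every other tag is dropped.
function TableHtmlToTsv(const AHtml: string): string;
var
  I, TagEnd: Integer;
  Tag, Row: string;
begin
  Result := '';
  Row := '';
  I := 1;
  while I <= Length(AHtml) do
  begin
    if AHtml[I] = '<' then
    begin
      // Scan forward to the matching '>' (or the end of the input).
      TagEnd := I + 1;
      while (TagEnd <= Length(AHtml)) and (AHtml[TagEnd] <> '>') do
        Inc(TagEnd);
      Tag := Trim(LowerCase(Copy(AHtml, I + 1, TagEnd - I - 1)));
      if (Tag = '/td') or (Tag = '/th') then
        Row := Row + #9
      else if Tag = '/tr' then
      begin
        // Drop the tab left by the last cell, then emit the finished row.
        if (Row <> '') and (Row[Length(Row)] = #9) then
          SetLength(Row, Length(Row) - 1);
        Result := Result + Row + sLineBreak;
        Row := '';
      end;
      I := TagEnd + 1;
    end
    else
    begin
      // Keep cell text, but skip raw line breaks from the HTML source.
      if not (AHtml[I] in [#10, #13]) then
        Row := Row + AHtml[I];
      Inc(I);
    end;
  end;
end;

begin
  // Writes two rows, with the cells of each row separated by a tab.
  Write(TableHtmlToTsv('<table><tr><td>Code</td><td>Price</td></tr>' +
    '<tr><td>ABC</td><td>1.25</td></tr></table>'));
end.
```

In a real application the input would be the contents of the file saved by URLDownloadToFile, and the result could be written out with a TStringList. A DOM-based approach through MSHTML is likely more robust for messy pages.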
 
LVL 26

Expert Comment

by:EddieShipman
ID: 9912436
Sorry for the confusion...
 
LVL 26

Expert Comment

by:EddieShipman
ID: 9912441
I have code to get the data from table cells using the DOM if you want it.
 

Author Comment

by:Kymberley
ID: 9912573
Thanks for the offer, Eddie, but I have already written that code, and I have already downloaded the data I was after from the internet and loaded it into my databases. The download method was the crucial hint I needed.
 
LVL 1

Expert Comment

by:mgazza
ID: 9912911
good luck all!!!