ST3VO asked:

Assistance needed with code

Hi all,

I am using this code from a previous question, but I need it to be more specific.

This is the code:

function ExtractCSSFileName(Text: String): string;
var
  AUrl: String;
  i,t,p,x: Integer;
begin
  if pos('<link',text) > 0 then
    begin
      t := posex('href=', Text, pos('<link',text)+5)+6;
      x := posex('>', Text, pos('<link',text)+5)-1;
      AUrl := copy(Text,t,x-t);
      i := LastDelimiter('/', AUrl);
      Result := Copy(AUrl, i + 1, Length(AUrl) - (i));
    end else result := 'Not Found';
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  s: string;
begin
  s := ExtractCSSFileName(memo1.text);
  ShowMessage(s);
end;

The code is fine, but sometimes it does not get just the .css filename, so I need the code to do these checks:

1. Look for any "<link" //without the quotes
2. If found then look for between "<link" and ">" //without quotes
3. Look and extract anything between href="  and " 

examples:

this:
 <link href="/styles/main.css" rel="stylesheet" type="text/css" />
 
should result as "/styles/main.css" //without the quotes


this:
<link href="style.css" rel="stylesheet" type="text/css" />

should result in: "style.css" //without the quotes

and this:

<link rel="stylesheet" type="text/css" href="styles/style.css">

should result in: "styles/style.css" //without quotes

Hope you can help

thx

st3vo
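
For reference, the three checks listed in the question can be sketched roughly as below. This is only a minimal illustration, not code from any poster in this thread: the name ExtractLinkHref is hypothetical, it assumes the href value is wrapped in double quotes, and it looks at the first <link tag only.

```pascal
program LinkHrefDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils;

// Hypothetical helper: returns the text between href=" and the next "
// inside the first <link ...> tag, or '' when nothing suitable is found.
function ExtractLinkHref(const Html: string): string;
var
  Lower: string;
  TagStart, TagEnd, ValStart, ValEnd: Integer;
begin
  Result := '';
  Lower := LowerCase(Html);                // ASCII-only, so positions match Html
  TagStart := Pos('<link', Lower);         // check 1: is there a "<link"?
  if TagStart = 0 then Exit;
  TagEnd := PosEx('>', Lower, TagStart);   // check 2: the tag ends at the next ">"
  if TagEnd = 0 then Exit;
  ValStart := PosEx('href="', Lower, TagStart);        // check 3: find href="
  if (ValStart = 0) or (ValStart > TagEnd) then Exit;  // must be inside this tag
  ValStart := ValStart + Length('href="');
  ValEnd := PosEx('"', Lower, ValStart);   // closing quote of the value
  if (ValEnd = 0) or (ValEnd > TagEnd) then Exit;
  Result := Copy(Html, ValStart, ValEnd - ValStart);
end;

begin
  // The three sample strings from the question.
  WriteLn(ExtractLinkHref('<link href="/styles/main.css" rel="stylesheet" type="text/css" />'));
  WriteLn(ExtractLinkHref('<link href="style.css" rel="stylesheet" type="text/css" />'));
  WriteLn(ExtractLinkHref('<link rel="stylesheet" type="text/css" href="styles/style.css">'));
end.
```

With the three sample strings this should print /styles/main.css, style.css and styles/style.css. Single-quoted or unquoted href values are not handled by this sketch.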




ASKER CERTIFIED SOLUTION
Ashok

Revised .....

procedure TForm1.Button1Click(Sender: TObject);
var
  s1, s2, s3, sFilename: String;
  iStart, iStop : Integer;
begin
  s1 := '<link href="/styles/main.css" rel="stylesheet" type="text/css" />';
  // should result as "/styles/main.css" //without the quotes
  s2 := '<link href="style.css" rel="stylesheet" type="text/css" />';
  // should result in: "style.css" //without the quotes
  s3 := '<link rel="stylesheet" type="text/css" href="styles/style.css">';
  // should result in: "styles/style.css" //without quotes
  sfileName := GetFilename(s1);
  ShowMessage('@'+ sFilename + '@');
  sfileName := GetFilename(s2);
  ShowMessage('@'+ sFilename + '@');
  // Following was added for testing only.....
  // s3 := '<link rel="stylesheet" type="text/css" href="styles/style.txt">';
  sFileName := GetFilename(s3);
  ShowMessage('@'+ sFilename + '@');
end;

function TForm1.GetFilename(s1: String): String;
var
  bFound: Boolean;
  sPart, sFilename: String;
  iStart, iStop : Integer;
begin
  s1 := Lowercase(s1);
  bFound := ((Pos('href=', s1) > 0) and (Pos('.css', s1) > 0));
  if bFound then
  begin
    iStart := Pos('href=', s1) + 6;
    sPart := Copy(s1, iStart, 500);
    iStop := Pos('.css', sPart) + 3;
    sFilename := Copy(sPart, 1, iStop);
  end
  else
    sFileName := 'Not Found';
  Result := sFilename;
end;

HTH
Ashok
ST3VO

ASKER

It's almost perfect. The only problem is that a <link can also have an href= pointing to an .ico file or any other file that is not .css. That's why it should only return a result if there is a <link and the file type is .css.

Know what I mean?  

Everything else is perfect...
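
A rough sketch of the requirement just described (skip <link tags that point to .ico or other files and return only a .css href) could look like the following. The name FindCssHref is hypothetical, and the sketch assumes double-quoted href values and a value that literally ends in .css, so something like style.css?v=2 would not match.

```pascal
program CssLinkDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils;

// Hypothetical helper: scans every <link ...> tag and returns the href of the
// first one whose value ends in ".css"; returns '' when none qualifies.
function FindCssHref(const Html: string): string;
var
  Lower, Value: string;
  TagStart, TagEnd, ValStart, ValEnd: Integer;
begin
  Result := '';
  Lower := LowerCase(Html);
  TagStart := Pos('<link', Lower);
  while TagStart > 0 do
  begin
    TagEnd := PosEx('>', Lower, TagStart);
    if TagEnd = 0 then Exit;                     // unterminated tag: stop scanning
    ValStart := PosEx('href="', Lower, TagStart);
    if (ValStart > 0) and (ValStart < TagEnd) then
    begin
      ValStart := ValStart + Length('href="');
      ValEnd := PosEx('"', Lower, ValStart);
      if (ValEnd > 0) and (ValEnd < TagEnd) then
      begin
        Value := Copy(Html, ValStart, ValEnd - ValStart);
        if AnsiEndsText('.css', Value) then      // case-insensitive suffix test
        begin
          Result := Value;
          Exit;
        end;
      end;
    end;
    TagStart := PosEx('<link', Lower, TagEnd);   // move on to the next <link tag
  end;
end;

const
  // Made-up sample page: a favicon link followed by a stylesheet link.
  Page =
    '<link rel="shortcut icon" href="/favicon.ico">' +
    '<link rel="stylesheet" type="text/css" href="styles/style.css">';

begin
  WriteLn(FindCssHref(Page));
  WriteLn('[' + FindCssHref('<link href="/favicon.ico">') + ']');
end.
```

For the sample page this should print styles/style.css, and [] for the string that contains only the favicon link.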
So you can have two "href=" in one string?  I was going by 3 string samples you provided.

If only one "href=", then it would send "Not Found" as Filename.

Ashok
Modified to handle up to THREE href.  You could add more code if required.

procedure TForm1.Button1Click(Sender: TObject);
var
  s1, s2, s3, sFilename: String;
  iStart, iStop : Integer;
begin
  s1 := '<link href="/styles/main.txt" href="/styles/main.two" href="/styles/main.css" rel="stylesheet" type="text/css" />';
  // should result as "/styles/main.css" //without the quotes
  s2 := '<link href="style.css" rel="stylesheet" type="text/css" />';
  // should result in: "style.css" //without the quotes
  s3 := '<link rel="stylesheet" type="text/css" href="styles/style.css">';
  // should result in: "styles/style.css" //without quotes
  sfileName := GetCSSFilename(s1);
  ShowMessage('@'+ sFilename + '@');
  sfileName := GetCSSFilename(s2);
  ShowMessage('@'+ sFilename + '@');
  // Following was added for testing only.....
  // s3 := '<link rel="stylesheet" type="text/css" href="styles/style.txt">';
  sFileName := GetCSSFilename(s3);
  ShowMessage('@'+ sFilename + '@');
end;

function TForm1.GetCSSFilename(s1: String): String;
var
  bFound: Boolean;
  sPart, sFilename: String;
  iStart, iStop : Integer;
begin
  s1 := Lowercase(s1);
  bFound := ((Pos('href=', s1) > 0) and (Pos('.css', s1) > 0));
  if bFound then
  begin
    iStart := Pos('href=', s1) + 6;
    sPart := Copy(s1, iStart, 500);
    bFound := (Pos('href=', sPart) > 0);
    if bFound then  // # 2nd href
    begin
      iStart := Pos('href=', sPart) + 6;
      sPart := Copy(sPart, iStart, 500);
      bFound := (Pos('href=', sPart) > 0);
      if bFound then  // # 3rd href
      begin
        iStart := Pos('href=', sPart) + 6;
        sPart := Copy(sPart, iStart, 500);
      end;
    end;
    iStop := Pos('.css', sPart) + 3;
    sFilename := Copy(sPart, 1, iStop);
  end
  else
    sFileName := 'Not Found';
  Result := sFilename;
end;

HTH
Ashok
Here is the BEST solution.....  For UNLIMITED href (MAXIMUM href = 999,999,999 within one string)

procedure TForm1.Button3Click(Sender: TObject);
var
  s1, s2, s3, sFilename: String;
  iPos: Integer;
begin // 12345678901234567890123456789012345678901234567890123456789012345678901234567890
  s1 := '<link href="/styles/main.abc" href="/styles/main.two" href="/styles/main.css" rel="stylesheet" type="text/css" />';
  s2 := '<link href="style.css" rel="stylesheet" type="text/css" />';
  s3 := '<link rel="stylesheet" type="text/css" href="styles/style.css">';
  sFileName := GetCSSFilename(s1);
  ShowMessage('@'+ sFilename + '@');
  sFileName := GetCSSFilename(s2);
  ShowMessage('@'+ sFilename + '@');
  sFileName := GetCSSFilename(s3);
  ShowMessage('@'+ sFilename + '@');
end;

function TForm1.LastPos(const SubStr: String; const S: String): Integer;
begin
   result := Pos(ReverseString(SubStr), ReverseString(S)) ;
   if (result <> 0) then
     result := ((Length(S) - Length(SubStr)) + 1) - result + 1;
end;

function TForm1.GetCSSFilename(s1: String): String;
var
  sPart: String;
  iPos: Integer;
begin
  iPos := LastPos('.css', s1);
  sPart := Copy(s1, 1, iPos + 3);
  iPos := LastPos('href=', sPart);
  sPart := Copy(sPart, iPos + 6, 500);
  Result := sPart;
end;

HTH
Ashok
BTW, if your string could be bigger than 500 characters, just replace 500 with a bigger number.

HTH
Ashok
This is perfectly OK.

function TForm1.GetCSSFilename(s1: String): String;
var
  sPart: String;
  iPos: Integer;
begin
  iPos := LastPos('.css', s1);
  sPart := Copy(s1, 1, iPos + 3);
  iPos := LastPos('href=', sPart);
  sPart := Copy(sPart, iPos + 6, 99999999);
  Result := sPart;
end;

HTH
Ashok
Slightly modified function would also work.....

function TForm1.GetCSSFilename(s1: String): String;
var
  sPart: String;
  iPos: Integer;
begin
  iPos := Pos('.css', s1);
  sPart := Copy(s1, 1, iPos + 3);
  iPos := LastPos('href=', sPart);
  sPart := Copy(sPart, iPos + 6, 99999999);
  Result := sPart;
end;

HTH
Ashok
Slightly modified function would also work.....

function TForm1.GetCSSFilename(s1: String): String;
var
  sPart: String;
  iPos: Integer;
begin
  iPos := Pos('.css', s1);
  sPart := Copy(s1, 1, iPos + 3);
  iPos := LastPos('href=', sPart);
  Result := Copy(sPart, iPos + 6, 99999999);
end;
St3vo, to get what you need, you should simply modify the function like this

function ExtractCSSFileName(Text: String): string;
var
  AUrl: String;
  i,t,p,x: Integer;
begin
  if pos('<link',text) > 0 then
    begin
      t := posex('href=', Text, pos('<link',text)+5)+6;
      x := posex('>', Text, pos('<link',text)+5)-1;
      AUrl := copy(Text,t,x-t);
      Result := AUrl; //the full found path, without extracting just the filename
    end else result := 'Not Found';
end;

have you checked the EE website ?
it uses import for importing stylesheets
just my 2 cents, parsing is all ok, as long as the structure of the document does not change

otherwise use the tools that come with delphi ...
have a look at this, no parsing involved
it should give you all the stylesheets linked to a page
unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls, OleCtrls, SHDocVw;

type
  TForm1 = class(TForm)
    Memo1: TMemo;
    WebBrowser1: TWebBrowser;
    Button1: TButton;
    Button2: TButton;
    Edit1: TEdit;
    procedure Button1Click(Sender: TObject);
    procedure WebBrowser1DocumentComplete(ASender: TObject;
      const pDisp: IDispatch; var URL: OleVariant);
    procedure Button2Click(Sender: TObject);
  private
    procedure AddMsg(Msg: String);
  end;

var
  Form1: TForm1;

implementation

uses MSHTML_TLB;

{$R *.dfm}


procedure TForm1.AddMsg(Msg: String);
begin
  Memo1.Lines.Add(Msg);
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  // 'http://www.experts-exchange.com'
  WebBrowser1.Navigate(Edit1.Text);
end;

procedure TForm1.Button2Click(Sender: TObject);
var doc: IHTMLDocument2;
  n: integer;
  ss: IDispatch;
  i: OleVariant;
  sx: IHTMLStyleSheet;
begin
  doc := WebBrowser1.Document as IHTMLDocument2;
  n := doc.styleSheets.length;
  AddMsg('Number of items : ' + IntToStr(n));
  i := 0;
  repeat
    ss := doc.styleSheets.item(i);
    sx := ss as IHTMLStyleSheet;
    if not VarIsNull(sx) and (sx <> nil) then
      AddMsg(Format('StyleSheet %d: Title="%s", Type="%s", HRef="%s", cssText="%s"', [integer(i)+1, sx.title, sx.type_, sx.href, sx.cssText]));
    i := i + 1;
  until i > n-1;
end;

procedure TForm1.WebBrowser1DocumentComplete(ASender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
begin
  AddMsg('Doc complete');
  Button2.Enabled := True;
end;

end.


ST3VO

ASKER

OK, I've tested all of the examples and none work as well as this one:

function GetFilename(s1: String): String;
var
  bFound: Boolean;
  sPart, sFilename: String;
  iStart, iStop : Integer;
begin
  s1 := Lowercase(s1);
  bFound := ((Pos('href=', s1) > 0) and (Pos('.css', s1) > 0));
  if bFound then
  begin
    iStart := Pos('href=', s1) + 6;
    sPart := Copy(s1, iStart, 500);
    iStop := Pos('.css', sPart) + 3;
    sFilename := Copy(sPart, 1, iStop);
  end
  else
    sFileName := 'Not Found';
  Result := sFilename;
end;

The only problem with it is that it gets non .css files too...otherwise it's what I need.

St3vo, I'd like to understand one thing.
In this question https://www.experts-exchange.com/questions/25007404/get-website-webpage-css-link-from-html-code.html you've PAQed my last function, but here you are talking about another function that also comes from that question.
Did you try the accepted function (.css) from there?
That function gets just the filename.css
Do you need instead to get also the path?
'cause in this case that function should be changed as follows


function ExtractCSSFileName(Text: String; WholePath: Boolean): string;
var
  AUrl, temppath: String;
  i, t: Integer;
begin
  if pos('.css', Text) > 0 then
  begin
    i := pos('.css', Text); // position of the first '.css' in Text
    begin
      t := i;
      while Text[t] <> '"' do
        dec(t);
      temppath := copy(Text, t, i-t+1);
    end;
    AUrl := temppath+'css';
    if WholePath then begin
       i := LastDelimiter('/', AUrl);
       Result := copy(AUrl, i + 1, Length(AUrl) - (i));
       end else Result := AUrl;
  end
  else
    Result := 'Not Found';
end;
procedure TForm1.Button1Click(Sender: TObject);
var
  s: string;
begin
  s := ExtractCSSFileName(Memo1.Text, True);
  ShowMessage(s);
end;


ST3VO

ASKER

Hi Ferruccio68,

I've tried your last function here and it returns a " before the CSS filename. For example, instead of style.css I'm getting "style.css. This happens only when there is no path before the filename. Can the " be removed?

stevo, does this have to work for all websites?
ST3VO

ASKER

If possible at all yes.

I'll try to explain further.

If you go to ANY website with your browser, save the page as HTML (not as MHT), and then view the saved page, the ".css" stylesheets are not applied because the page uses relative URLs. BUT if you know the styles path and the stylesheet filename, you can build an absolute URL to the CSS file and the page will look fine.

That's why I need to get that information...know what I mean?

Here is the BEST solution.....  For UNLIMITED href (MAXIMUM href = 999,999,999 within one string)
This is tested solution.

procedure TForm1.Button3Click(Sender: TObject);
var
  s1, s2, s3, sFilename: String;
  iPos: Integer;
begin // 12345678901234567890123456789012345678901234567890123456789012345678901234567890
  s1 := '<link href="/styles/main.abc" href="/styles/main.two" href="/styles/main.css" rel="stylesheet" type="text/css" />';
  s2 := '<link href="style.css" rel="stylesheet" type="text/css" />';
  s3 := '<link rel="stylesheet" type="text/css" href="styles/style.css">';
  sFileName := GetCSSFilename(s1);
  ShowMessage('@'+ sFilename + '@');
  sFileName := GetCSSFilename(s2);
  ShowMessage('@'+ sFilename + '@');
  sFileName := GetCSSFilename(s3);
  ShowMessage('@'+ sFilename + '@');
end;

function TForm1.LastPos(const SubStr: String; const S: String): Integer;
begin
   result := Pos(ReverseString(SubStr), ReverseString(S)) ;
   if (result <> 0) then
     result := ((Length(S) - Length(SubStr)) + 1) - result + 1;
end;

function TForm1.GetCSSFilename(s1: String): String;
var
  sPart: String;
  iPos: Integer;
begin
  iPos := LastPos('.css', s1); // Here you need LastPos, not Pos (That's why it did not work for you last time you tried.)
  sPart := Copy(s1, 1, iPos + 3);
  iPos := Pos('href=', sPart);
  Result := Copy(sPart, iPos + 6, 500);
end;

HTH
Ashok
Here is the BEST solution.....  For UNLIMITED href (MAXIMUM href = 999,999,999 within one string)
This is tested solution.

procedure TForm1.Button3Click(Sender: TObject);
var
  s1, s2, s3, sFilename: String;
  iPos: Integer;
begin // 12345678901234567890123456789012345678901234567890123456789012345678901234567890
  s1 := '<link href="/styles/main.abc" href="/styles/main.two" href="/styles/main.css" rel="stylesheet" type="text/css" />';
  s2 := '<link href="style.css" rel="stylesheet" type="text/css" />';
  s3 := '<link rel="stylesheet" type="text/css" href="styles/style.css">';
  sFileName := GetCSSFilename(s1);
  ShowMessage('@'+ sFilename + '@');
  sFileName := GetCSSFilename(s2);
  ShowMessage('@'+ sFilename + '@');
  sFileName := GetCSSFilename(s3);
  ShowMessage('@'+ sFilename + '@');
end;

function TForm1.LastPos(const SubStr: String; const S: String): Integer;
begin
   result := Pos(ReverseString(SubStr), ReverseString(S)) ;
   if (result <> 0) then
     result := ((Length(S) - Length(SubStr)) + 1) - result + 1;
end;

function TForm1.GetCSSFilename(s1: String): String;
var
  sPart: String;
  iPos: Integer;
begin
  iPos := LastPos('.css', s1); // Here you need LastPos, not Pos (That's why it did not work for you last time you tried.)
  sPart := Copy(s1, 1, iPos + 3);
  iPos := Pos('href=', sPart);
  Result := Copy(sPart, iPos + 6, 999999999);
end;

HTH
Ashok
ST3VO

ASKER

I still have one issue: what if the CSS path is something like this:

/styles/main.css

Because it begins with a "/" (without quotes), my full URL becomes:

http://www.somesite.com//styles/main.css  ... as you can see, there are two slashes (//).

I've set the baseurl to:

var base:string;
begin
base:='http://www.somesite.com/';

then when I run the function it returns two slashes (//).

thx
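
One way to avoid the doubled slash described above is to normalise both parts before joining them. The following is only a sketch: JoinUrl is a hypothetical helper, and it assumes the base is the site root, so it does not do full relative-URL resolution (a path without a leading "/" is normally relative to the page's own folder, not to the root).

```pascal
program JoinUrlDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: joins a base URL and a path with exactly one "/"
// between them, whether or not either side already carries a slash.
function JoinUrl(const Base, Path: string): string;
var
  B, P: string;
begin
  B := Base;
  P := Path;
  // Drop any trailing slashes from the base ...
  while (Length(B) > 0) and (Copy(B, Length(B), 1) = '/') do
    Delete(B, Length(B), 1);
  // ... and any leading slashes from the path.
  while (Length(P) > 0) and (Copy(P, 1, 1) = '/') do
    Delete(P, 1, 1);
  Result := B + '/' + P;
end;

begin
  WriteLn(JoinUrl('http://www.somesite.com/', '/styles/main.css'));
  WriteLn(JoinUrl('http://www.somesite.com/', 'style.css'));
  WriteLn(JoinUrl('http://www.somesite.com', 'styles/style.css'));
end.
```

All three calls should give a URL with a single slash after the host name, for example http://www.somesite.com/styles/main.css for the first one.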

Can you provide me actual FULL string value and what you expect?

For example,
  s1 := '<link href="/styles/main.txt" href="/styles/main.two" href="/styles/main.css" rel="stylesheet" type="text/css" />';
  // should result as "/styles/main.css"

BTW, I would be able to respond tomorrow morning because I do not have Delphi installed at home.

Thanks,
Ashok
Or do you always want it like this?

styles/main.css

Without starting first slash.

Thanks,
Ashok
ST3VO

ASKER

It's strange actually..I don't mind having the slash...Actually I need to get whatever is after href="  but for some reason I'm getting 2 slashes instead of the one :o/
My last post marked with "This is tested solution." does not return two slashes.  It returns everything between double quotes.

Can you post the code here, mentioning exactly where you see the two slashes?

Thanks,
Ashok
It could be because

base := 'http://www.somesite.com/';  // This already ends with ONE SLASH

Now if you try to combine above plus FileName returned by my code like this....

sFileName := GetCSSFilename(s3);
sCompleteStr := base + sFileName;

Is this the case?

Ashok
Here is the BEST solution.....  For UNLIMITED href (MAXIMUM href = 999,999,999 within one string)
This is tested solution.  My mistake in previous post.  First I had it working with LastPos in both places.

procedure TForm1.Button3Click(Sender: TObject);
var
  s1, s2, s3, sFilename: String;
  iPos: Integer;
begin // 12345678901234567890123456789012345678901234567890123456789012345678901234567890
  s1 := '<link href="/styles/main.abc" href="/styles/main.two" href="/styles/main.css" rel="stylesheet" type="text/css" />';
  s2 := '<link href="style.css" rel="stylesheet" type="text/css" />';
  s3 := '<link rel="stylesheet" type="text/css" href="styles/style.css">';
  sFileName := GetCSSFilename(s1);
  ShowMessage('@'+ sFilename + '@');
  sFileName := GetCSSFilename(s2);
  ShowMessage('@'+ sFilename + '@');
  sFileName := GetCSSFilename(s3);
  ShowMessage('@'+ sFilename + '@');
end;

function TForm1.LastPos(const SubStr: String; const S: String): Integer;
begin
   result := Pos(ReverseString(SubStr), ReverseString(S)) ;
   if (result <> 0) then
     result := ((Length(S) - Length(SubStr)) + 1) - result + 1;
end;

function TForm1.GetCSSFilename(s1: String): String;
var
  sPart: String;
  iPos: Integer;
begin
  iPos := Pos('.css', s1);               // We assume that you cannot have more than one .CSS file within one String.
  sPart := Copy(s1, 1, iPos + 3);  // Now we have String upto the end of .CSS
  iPos := LastPos('href=', sPart);  // Here you need LastPos, not Pos (That's why it did not work for you last time you tried.)
  Result := Copy(sPart, iPos + 6, 999999999);
end;

HTH
Ashok
What's the problem with my last posted function? It works right and lets you extract either just the filename or the whole path plus filename, that is, anything between href=" and "
ST3VO

ASKER

Thanks for all your help. There are too many combinations so it's really not going to be possible to do it this way so I'm going to think of some other way. Happy new year!