Solved

Hard question about string search

Posted on 2000-04-06
Last Modified: 2010-05-02
OK.
I was just wondering if anyone knows how browsers pull the attribute values out of tags.
I want to know how to take a tag like this:
<img src="hello.jpg" border=0 width="2" height=34 align="texttop" alt=hey>

and get its values. This must be very difficult, because attributes can be written either as border=0 or as border="0".
How the heck does this work?
Is it possible to make a function that works like this: call StripTags(richtextbox.text, "<img src,border,width,align,height,alt", output) and output then contains the values separated by ","? This is too hard to do in VB, right?
Question by:Geo24
7 Comments
 
LVL 28

Expert Comment

by:AzraSound
You may get ideas from these two samples. One strips all the hyperlinks out of a document and the other strips all the tags out of a document.

http://www.freecode.com/cgi-bin/viewproduct.pl?3171

http://www.freecode.com/cgi-bin/viewproduct.pl?3170

 

Expert Comment

by:LeXien
You will need to think methodically. Work out how you would do it manually, then automate that. Use string functions such as Right, Left, Mid and InStr. You might want to keep a progress bar in mind, because VB's string manipulation is sssslllloooowww!
If you could make a function that returns a specified tag or word from the string, that would be a great start.

I'll have a look for you and see if I can work out some functions.
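
A minimal sketch of such a starting function, using only InStr and Mid$. The name GetNextTag and its parameters are made up for this illustration; it returns the text between the first "<" and the next ">" and does no validation of the HTML.

```vb
' Returns the text between the first "<" and the following ">" in sHtml,
' searching from position lStart. Returns "" when no complete tag is found.
' GetNextTag is a hypothetical helper name, not a built-in VB function.
Public Function GetNextTag(ByVal sHtml As String, Optional ByVal lStart As Long = 1) As String
    Dim lOpen As Long
    Dim lClose As Long

    lOpen = InStr(lStart, sHtml, "<")
    If lOpen = 0 Then Exit Function

    lClose = InStr(lOpen + 1, sHtml, ">")
    If lClose = 0 Then Exit Function

    ' Copy everything strictly between the two brackets
    GetNextTag = Mid$(sHtml, lOpen + 1, lClose - lOpen - 1)
End Function
```

For example, GetNextTag("text <img src=a.gif> more") returns img src=a.gif, which can then be cut up further with the same functions.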
 
LVL 28

Expert Comment

by:AzraSound
The way I would approach it is to create an array of strings that holds all the possible tags. Then I would load the HTML source into a RichTextBox and loop with its Find method to locate each tag, and go from there using string manipulation.
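
A rough sketch of that loop, under some assumptions: InStr on a plain string stands in for the RichTextBox Find method here, and FindTags is a made-up name. It only reports where each listed tag occurs; picking the attributes apart is left to later string handling.

```vb
' Lists every occurrence of the given tag names in sHtml.
' Returns lines of the form "position: tag text", each ended by vbCrLf.
' FindTags is a hypothetical name; InStr is used instead of RichTextBox.Find.
Public Function FindTags(ByVal sHtml As String, vNames As Variant) As String
    Dim i As Long
    Dim lPos As Long
    Dim lEnd As Long
    Dim sOpen As String
    Dim sNext As String
    Dim sOut As String

    For i = LBound(vNames) To UBound(vNames)
        sOpen = "<" & vNames(i)
        ' vbTextCompare makes the search case-insensitive (<IMG ...> and <img ...>)
        lPos = InStr(1, sHtml, sOpen, vbTextCompare)
        Do While lPos > 0
            lEnd = InStr(lPos, sHtml, ">")
            If lEnd = 0 Then Exit Do
            ' Accept the hit only if the name ends here, so "<a" does not match "<applet"
            sNext = Mid$(sHtml, lPos + Len(sOpen), 1)
            If sNext = " " Or sNext = ">" Then
                sOut = sOut & lPos & ": " & Mid$(sHtml, lPos, lEnd - lPos + 1) & vbCrLf
            End If
            ' Continue the search after the end of this tag
            lPos = InStr(lEnd, sHtml, sOpen, vbTextCompare)
        Loop
    Next i

    FindTags = sOut
End Function
```

It could be called as Debug.Print FindTags(RichTextBox1.Text, Array("img", "a")), where RichTextBox1 is whatever control holds the HTML source.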
 
LVL 9

Expert Comment

by:samopal
It's very simple if you use the HTML object model instead of string parsing. Here is a small example:

Dim ob As Variant
For Each ob In WebBrowser1.document.All
    If ob.tagName = "IMG" Then
        Debug.Print "Src=" & ob.src & " Border=" & ob.border & " Width=" & ob.Width
    End If
Next

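WebBrowser1 is a WebBrowser control :-)) or it may be created in code:

Set WebBrowser1 = CreateObject("internetexplorer.application")
WebBrowser1.Navigate "www....."
WebBrowser1.Visible = True

Either way, wait until the page has finished loading before looping over document.All.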

HTH
 
LVL 2

Expert Comment

by:Jagar
Try something like this:

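Dim sStr As String, sResults As String, sValue As String
Dim vParts As Variant, p As Long

sStr = "<img src=""hello.jpg"" border=0 width=""2"" height=34 align=""texttop"" alt=hey>"

' Drop the closing ">" so it does not end up in the last value
If Right(sStr, 1) = ">" Then sStr = Left(sStr, Len(sStr) - 1)

' Split on spaces (note: this breaks values that contain spaces)
vParts = Split(sStr, " ")

For p = LBound(vParts) To UBound(vParts)
   If InStr(vParts(p), "=") > 0 Then
      ' Everything after the first "=" is the value
      sValue = Mid(vParts(p), InStr(vParts(p), "=") + 1)
      ' Strip one pair of surrounding double quotes, if present
      If Len(sValue) >= 2 And Left(sValue, 1) = """" And Right(sValue, 1) = """" Then
         sValue = Mid(sValue, 2, Len(sValue) - 2)
      End If
      sResults = sResults & sValue & ","
   End If
Next p
Debug.Print sResults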

That should give you what you want without having to create a webbrowser object.
 
LVL 1

Accepted Solution

by:
ATM earned 100 total points
Hey, here is some cool code that can parse your string, not only when you use double quotes but also when there are single quotes. But what will you do if your ALT parameter looks like this:
ALT=mama miya bambarabiya kerkudu
This code can handle that too. Create a new form, add Text1, Command1, List1 and List2, and copy and paste the code:

Dim TagParamIndex(2, 9) As String

Private Sub Command1_Click()
Dim Tagz() As String
Dim TagCount As Integer
Dim StartPos As Long
Dim StopPos As Long
Dim DefText As String
Dim TagArrayPos As Long
Dim ParamArrayPos As Long
Dim defTag As String
Dim defParam As String
Dim defValue As String
Dim bStillSeekForChar As Boolean

List1.Clear
List2.Clear

DefText = ""

StartPos = InStr(1, Text1.Text, "<")
If StartPos > 0 Then
   StopPos = InStr(StartPos, Text1.Text, ">")
   If StopPos > StartPos Then
      DefText = Trim(Mid(Text1.Text, StartPos + 1, StopPos - StartPos - 1))
   End If
End If

If DefText = "" Then
   MsgBox "Can't find <>"
   Exit Sub
End If

TagArrayPos = 0
ParamArrayPos = 0

Text1.Text = DefText


'find end of tag name
StopPos = InStr(1, DefText, Chr(32))
If StopPos > 1 Then
defTag = Mid(DefText, 1, StopPos - 1)
MsgBox defTag
'---------------------------------------------------------------------------
Do While TagParamIndex(TagArrayPos, 0) <> Chr(0)
     If UCase(defTag) = TagParamIndex(TagArrayPos, 0) Then
       'here tag found
        If StopPos < Len(DefText) - 1 Then
'---------------------------------------------------------------------------
             DefText = Right(DefText, Len(DefText) - StopPos)
            Do While DefText <> ""
             Text1.Text = DefText
             StartPos = InStr(1, DefText, "=")
             'ok param name found
             If StartPos > 1 Then
                defParam = Trim(Mid(DefText, 1, StartPos - 1))
                ParamArrayPos = 1
                'determine parameter type; the scan stops on a match or on the Chr(0) terminator
                Do While TagParamIndex(TagArrayPos, ParamArrayPos) <> Chr(0)
                     If UCase(defParam) = TagParamIndex(TagArrayPos, ParamArrayPos) Then Exit Do
                     ParamArrayPos = ParamArrayPos + 1
                Loop
                'consume "name=" even for unknown names, so the outer loop always advances
                DefText = Right(DefText, Len(DefText) - StartPos)
                Text1.Text = DefText
                'get param value: it ends one word before the next "="
                bStillSeekForChar = True
                'search from position 1; the old StartPos could skip a nearby "="
                StartPos = InStr(1, DefText, "=")
                If StartPos > 0 Then
                   'skip one word back
                   StopPos = StartPos - 1
                   Do While StopPos > 0
                       If Mid(DefText, StopPos, 1) <> Chr(32) Then
                          bStillSeekForChar = False
                       Else
                          If Not (bStillSeekForChar) Then Exit Do
                       End If
                       StopPos = StopPos - 1
                   Loop
                Else
                   'last parameter: the value is the rest of the text
                   StopPos = Len(DefText)
                End If
                defValue = Trim(Replace(Left(DefText, StopPos), Chr(34), Chr(32)))
                defValue = Trim(Replace(defValue, Chr(39), Chr(32)))
                'add to the lists only if the name is known for this tag
                If TagParamIndex(TagArrayPos, ParamArrayPos) <> Chr(0) Then
                   List1.AddItem defParam
                   List2.AddItem defValue
                End If
                DefText = Right(DefText, Len(DefText) - StopPos)
                MsgBox defParam
             Else
                'no "name=" pair left; stop here, otherwise this loop never ends
                DefText = ""
             End If
            Loop
'---------------------------------------------------------------------------
        End If
       Exit Do
     End If
     TagArrayPos = TagArrayPos + 1
Loop
'---------------------------------------------------------------------------
End If

End Sub

Private Sub Form_Load()

Text1.Text = "<img src=" & Chr(34) & "hello.jpg" & Chr(34) & " border=0 width=" & Chr(34) & "2" & Chr(34) & " height=34 align=" & Chr(34) & "texttop" & Chr(34) & " alt=hey>"

TagParamIndex(0, 0) = "IMG"
TagParamIndex(0, 1) = "SRC"
TagParamIndex(0, 2) = "BORDER"
TagParamIndex(0, 3) = "WIDTH"
TagParamIndex(0, 4) = "HEIGHT"
TagParamIndex(0, 5) = "ALIGN"
TagParamIndex(0, 6) = "ALT"
TagParamIndex(0, 7) = "NAME"
TagParamIndex(0, 8) = "ID"
TagParamIndex(0, 9) = Chr(0)

TagParamIndex(1, 0) = "A"
TagParamIndex(1, 1) = "HREF"
TagParamIndex(1, 2) = "TARGET"
TagParamIndex(1, 3) = "NAME"
TagParamIndex(1, 4) = "ID"
TagParamIndex(1, 5) = Chr(0)

TagParamIndex(2, 0) = Chr(0)

End Sub
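
For comparison, here is a smaller sketch shaped like the StripTags call from the question: it takes one tag plus a comma-separated list of attribute names and returns the values separated by ",". GetTagValues and ReadAttr are made-up names. It handles double-quoted, single-quoted and unquoted values, but unlike the form code above it does not handle unquoted values that contain spaces, it expects no spaces around "=", and it can be fooled by a name=value look-alike inside a quoted value.

```vb
' Returns the values of the requested attributes from one tag, separated by commas.
' sTag   - the tag text, with or without the surrounding < >
' sNames - comma-separated attribute names, e.g. "src,border,alt"
' GetTagValues and ReadAttr are hypothetical names used for this sketch.
Public Function GetTagValues(ByVal sTag As String, ByVal sNames As String) As String
    Dim vNames As Variant
    Dim i As Long
    Dim sOut As String

    vNames = Split(sNames, ",")
    For i = LBound(vNames) To UBound(vNames)
        If i > LBound(vNames) Then sOut = sOut & ","
        sOut = sOut & ReadAttr(sTag, Trim$(vNames(i)))
    Next i
    GetTagValues = sOut
End Function

' Finds name=value in sTag and returns the value ("" if the name is absent).
Private Function ReadAttr(ByVal sTag As String, ByVal sName As String) As String
    Dim lPos As Long
    Dim lEnd As Long
    Dim sQuote As String

    ' Look for " name=" so that "src" does not match inside "lowsrc"
    lPos = InStr(1, sTag, " " & sName & "=", vbTextCompare)
    If lPos = 0 Then Exit Function
    lPos = lPos + Len(sName) + 2          ' first character of the value

    sQuote = Mid$(sTag, lPos, 1)
    If sQuote = """" Or sQuote = "'" Then
        ' Quoted value: runs to the matching quote and may contain spaces
        lEnd = InStr(lPos + 1, sTag, sQuote)
        If lEnd = 0 Then lEnd = Len(sTag) + 1
        ReadAttr = Mid$(sTag, lPos + 1, lEnd - lPos - 1)
    Else
        ' Unquoted value: runs to the next space or the closing ">"
        lEnd = lPos
        Do While lEnd <= Len(sTag)
            If Mid$(sTag, lEnd, 1) = " " Or Mid$(sTag, lEnd, 1) = ">" Then Exit Do
            lEnd = lEnd + 1
        Loop
        ReadAttr = Mid$(sTag, lPos, lEnd - lPos)
    End If
End Function
```

With the tag from the question and the name list "src,border,width,align,height,alt", GetTagValues returns hello.jpg,0,2,texttop,34,hey.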
 

Author Comment

by:Geo24
This code is more than EXCELLENT!!

Thank u man!