Solved

parsing HTML code

Posted on 2004-08-26
Last Modified: 2010-05-02
Hi - has anyone had experience taking HTML code, searching it for all possible graphics paths, changing those paths, and then re-saving the HTML?
Further explanation...
    I am creating a "packager" so that the "graphic guys and gals" can create an HTML page using graphics from several sources (i.e. paths) across our network.
    When the HTML file is put in "the packager", it would search through the HTML code, find all external references, copy those files locally, and strip the paths off within the HTML code. In other words, the HTML would run properly if all the graphic files were in the same directory as the HTML file (which is what I want).
    The program then makes a cab file of this HTML file and all the graphic files.

* What is this used for? Well, we send all types of graphic formats to be displayed on remote advertising machines, but these are all single files (mostly .swf files). We need to keep this methodology (one file), so that's why the "packaging".

I've got the interface done: they can drag and drop files or pick them from a file chooser.

I've got the cab file maker done.

I can probably search the HTML text for the obvious "src=" and "BACKGROUND=" and ".jpg" and ".gif" etc., then copy the files and strip out the paths, but I just wondered if anyone has done this kind of thing before?

(OR IS THERE A WAY THAT A WEB PAGE CAN BE PACKAGED WITH ALL GRAPHICS CONTAINED WITHIN?) Obviously I'm not a web page programmer :)

thanks in advance


Question by:bczingo
9 Comments
 
LVL 52

Expert Comment

by:Carl Tawn
ID: 11907971
I've done a couple of similar things. One was XML-based, converting absolute paths to relative paths before packaging the document and uploading it to the web. I've also done a fair bit of tag stripping and such with HTML pages.

I haven't come across a simple way of doing it. It's usually just a case of reading the page line by line, looking for the start and end of tags, and looking for "src=", "background=" and the rest.

Probably not what you wanted to hear :o)

 
LVL 3

Author Comment

by:bczingo
ID: 11908180
Not all I wanted but I appreciate the comment Carl
- what other tags are there that would point to external files?
 
LVL 52

Expert Comment

by:Carl Tawn
ID: 11910353
not many really.  

src and background obviously for images.  maybe <object> tags and <param> tags if you embed flash movies or anything like that.  possibly "includes" if you use external stylesheets or js files.

also, background-image (i think) if you use styles rather than just the "background" attribute.
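
For the background-image case, the path sits inside url(...) in a style rule rather than in a quoted value after an "=", so it needs its own scan. A minimal VB6 sketch of that scan follows; the function name FindCssUrls and its parameters are made up for illustration, and it makes no attempt to tell a real CSS url(...) from the same characters appearing in ordinary page text.

```vb
' Hypothetical helper (not from the thread): collects every path found inside
' url(...) in a block of HTML/CSS text. The paths are returned in found();
' the function result is the number of paths stored.
Public Function FindCssUrls(ByVal htmlText As String, ByRef found() As String) As Long
    Dim pos As Long
    Dim openLoc As Long
    Dim closeLoc As Long
    Dim onePath As String
    Dim numFound As Long

    pos = 1
    Do
        ' vbTextCompare makes the search case-insensitive (URL(, Url(, url()
        openLoc = InStr(pos, htmlText, "url(", vbTextCompare)
        If openLoc = 0 Then Exit Do
        closeLoc = InStr(openLoc + 4, htmlText, ")")
        If closeLoc = 0 Then Exit Do

        ' Text between "url(" and ")"
        onePath = Mid$(htmlText, openLoc + 4, closeLoc - openLoc - 4)
        ' url('x.gif') and url("x.gif") are both legal, so drop the quotes
        onePath = Replace(onePath, "'", "")
        onePath = Replace(onePath, Chr$(34), "")
        onePath = Trim$(onePath)

        If Len(onePath) > 0 Then
            ReDim Preserve found(numFound)
            found(numFound) = onePath
            numFound = numFound + 1
        End If
        pos = closeLoc + 1
    Loop

    FindCssUrls = numFound
End Function
```

Called with a dynamic array (Dim paths() As String), a cell such as <td style="background-image: url('../img/bg.gif')"> should yield one entry, ../img/bg.gif, which can then be copied and stripped like any src path.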
LVL 7

Expert Comment

by:Burbble
ID: 11919207
Do you have an example HTML file (or at least part of one)?

-Burbble
 
LVL 3

Author Comment

by:bczingo
ID: 11931273
Burbble
example wouldn't help really - could be anything at all - pick a page - any page :)


thanks again carl

(I thought I had posted this comment before; guess I didn't hit submit.)


 
LVL 7

Accepted Solution

by:
Burbble earned 250 total points
ID: 11931600
Ok, could you give an example of a line of HTML and how it should be changed? I'm just not entirely sure how you want to alter the path is all :-)

Like... <IMG SRC="http://www.mysite.com/images/file.jpg"> changes to <IMG SRC="images/file.jpg">?

Sorry about the confusion :/

-Burbble
 
LVL 3

Author Comment

by:bczingo
ID: 11947689
well I have it done :) $$$$$

Not foolproof, but it gets the job done splendidly.
- If anyone's interested in the rest of the code, give me a holler.

Here's just the parse stuff.
As you can see, if there are other tags to look for other than just
    " src"   " background"  " background-image"
then just put in another Case statement.
           

' Requires a project reference to "Microsoft Scripting Runtime" (FileSystemObject).
Private Type theFilesInStr
    startLoc As Long      ' position of the first character of the path in theString
    endLoc As Long        ' NOTE: holds the LENGTH of the path, not its end position
    FileName As String    ' file name with the path stripped off
End Type
Public tF() As theFilesInStr
Private Const WEBSPACE = "%20"
Private Const NUM_ATTRS = 3      ' number of Case branches below; raise it when adding one
Private theString As String
Private backupString As String

Public Function parseFindAttachments(sentfile As String) As Boolean
    Dim namePart As String
    Dim fileStartLoc As Long
    Dim fileLen As Long
    Dim strLoc As Long
    Dim equalsLoc As Long
    Dim quoteLoc1 As Long
    Dim quoteLoc2 As Long
    Dim startLook As Long
    Dim numFiles As Long
    Dim whatLookMain As String
    Dim whatLook As String
    Dim fsoFileSys As New FileSystemObject
    Dim theFile As String
    Dim whatCount As Long
    Dim f As TextStream

    Set f = fsoFileSys.OpenTextFile(sentfile, ForReading, False)
    theString = f.ReadAll
    backupString = theString
    'the string has the file!!
    numFiles = -1
    whatCount = 0
    Do
        ' pick the next attribute name to scan for
        whatCount = whatCount + 1
        Select Case whatCount
            Case 1
                whatLookMain = " src"
            Case 2
                whatLookMain = " background"
            Case 3
                whatLookMain = " background-image"
        End Select
        startLook = 1
        strLoc = 0
        Do
            equalsLoc = 0
            quoteLoc1 = 0
            quoteLoc2 = 0
            whatLook = whatLookMain
            ' find the attribute name, then the next "=", then the next pair of quotes
            strLoc = InStr(startLook, theString, whatLook, vbTextCompare)
            If strLoc > 0 Then
                startLook = strLoc + 1
                whatLook = "="
                equalsLoc = InStr(startLook, theString, whatLook, vbTextCompare)
                If equalsLoc > 0 Then
                    startLook = equalsLoc + 1
                    whatLook = Chr(34)                     ' quote sign
                    quoteLoc1 = InStr(startLook, theString, whatLook, vbTextCompare)
                    If quoteLoc1 > 0 Then
                        startLook = quoteLoc1 + 1
                        whatLook = Chr(34)
                        quoteLoc2 = InStr(startLook, theString, whatLook, vbTextCompare)
                        If quoteLoc2 > 0 Then
                            ' everything between the two quotes is the path
                            numFiles = numFiles + 1
                            fileStartLoc = quoteLoc1 + 1
                            fileLen = (quoteLoc2 - quoteLoc1) - 1
                            theFile = Mid(theString, fileStartLoc, fileLen)
                            ReDim Preserve tF(numFiles)
                            tF(numFiles).startLoc = fileStartLoc
                            tF(numFiles).endLoc = fileLen
                            ' fs.getNamePart and replaceChars_TSB are helpers from the
                            ' rest of the project (not shown here)
                            namePart = fs.getNamePart(theFile)
                            If InStr(namePart, WEBSPACE) > 0 Then
                                tF(numFiles).FileName = replaceChars_TSB(namePart, WEBSPACE, " ")
                            Else
                                tF(numFiles).FileName = namePart
                            End If
                         '   Debug.Print theFile
                        '    Debug.Print tF(numFiles).FileName
                            startLook = quoteLoc2 + 1
                        End If
                    End If
                End If
            End If
        Loop While quoteLoc2 > 0
    ' stop after the last Case; looping a fourth time would rescan with the
    ' previous attribute name and record the same files again
    Loop While whatCount < NUM_ATTRS

    f.Close

    ' assumed meaning of the return value: True if at least one reference was found
    parseFindAttachments = (numFiles >= 0)

End Function
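
The fs.getNamePart helper and the step that writes the shortened names back into the HTML are not shown in the post. The sketch below is one way those two pieces might look, not the author's actual code. GetNamePart and buildStrippedHtml are hypothetical names, and the second function assumes it lives in the same module as tF and theString, that at least one file was found (so tF is dimensioned), and that endLoc holds the path length as it does above.

```vb
' Hypothetical stand-in for fs.getNamePart: returns everything after the
' last "/" or "\" in a path or URL.
Public Function GetNamePart(ByVal fullPath As String) As String
    Dim slashLoc As Long
    Dim backslashLoc As Long
    Dim cutLoc As Long

    slashLoc = InStrRev(fullPath, "/")
    backslashLoc = InStrRev(fullPath, "\")
    If slashLoc > backslashLoc Then cutLoc = slashLoc Else cutLoc = backslashLoc

    ' cutLoc = 0 means no separator, so the whole string is already the name
    GetNamePart = Mid$(fullPath, cutLoc + 1)
End Function

' Hypothetical: returns theString with each recorded path replaced by its
' bare file name.
Public Function buildStrippedHtml() As String
    Dim order() As Long
    Dim i As Long, j As Long, tmp As Long
    Dim lastStart As Long
    Dim result As String

    ' Build an index into tF sorted by startLoc, highest first (insertion sort).
    ReDim order(LBound(tF) To UBound(tF))
    For i = LBound(tF) To UBound(tF)
        order(i) = i
    Next i
    For i = LBound(tF) + 1 To UBound(tF)
        tmp = order(i)
        j = i - 1
        Do While j >= LBound(tF)
            If tF(order(j)).startLoc >= tF(tmp).startLoc Then Exit Do
            order(j + 1) = order(j)
            j = j - 1
        Loop
        order(j + 1) = tmp
    Next i

    ' Replace from the end of the string backwards so that earlier positions
    ' stay valid while the string shrinks.
    result = theString
    lastStart = 0
    For i = LBound(order) To UBound(order)
        ' the same span can be recorded twice (" background" also matches
        ' " background-image"), so skip repeats
        If tF(order(i)).startLoc <> lastStart Then
            With tF(order(i))
                result = Left$(result, .startLoc - 1) & .FileName & _
                         Mid$(result, .startLoc + .endLoc)
                lastStart = .startLoc
            End With
        End If
    Next i

    buildStrippedHtml = result
End Function
```

With this, GetNamePart("http://www.mysite.com/images/file.jpg") gives file.jpg. Working back to front matters because the entries in tF are grouped by attribute name rather than by position, and each replacement changes the length of the string.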

 
LVL 7

Expert Comment

by:Burbble
ID: 11953731
Glad you got it solved :)

I don't think my comment should be the accepted answer though...

-Burbble
 
LVL 3

Author Comment

by:bczingo
ID: 11958275
'prolly not :)
don't know how to change it though
Question has a verified solution.
