Solved

parsing HTML code

Posted on 2004-08-26
Last Modified: 2010-05-02
Hi - has anyone had experience taking HTML code, searching it for all possible graphics paths, changing those paths, and then re-saving the HTML?
Further explanation...
    I am creating a "packager" so that the "graphic guys and gals" can create an HTML page using graphics from several sources (i.e. paths) across our network.
    When the HTML file is put in "the packager", it would search through the HTML code, find all external references, copy those files locally, and strip the paths off within the HTML code. In other words, the HTML would run properly if all the graphic files were in the same directory as the HTML file (which is what I want).
    The program then makes a cab file of this HTML file and all the graphic files.

* What is this used for? Well, we send all types of graphic formats to be displayed on remote advertising machines, but these are all single files (mostly .swf files). We need to keep this methodology (1 file), so that's why the "packaging".

I've got the interface done - they can drag and drop files and pick from a file chooser.

I've got the cab file maker done.

I can probably search the HTML text for the obvious "src=" and "BACKGROUND=" and ".jpg" and ".gif" etc., then copy the files and strip out the paths, but I just wondered if anyone has done this kind of thing before?

(Or is there a way that a web page can be packaged with all graphics contained within?) Obviously I'm not a web page programmer :)

Thanks in advance


Question by:bczingo
 
LVL 52

Expert Comment

by:Carl Tawn
ID: 11907971
I've done a couple of similar things.  One was XML based, converting absolute paths to relative paths before packaging the document and uploading it to the web.  I've also done a fair bit of tag stripping and such with HTML pages.

I haven't come across a simple way of doing it. It's usually just a case of reading the page line by line, looking for the start and end of tags, and looking for "src=", "background=" and the rest.

Probably not what you wanted to hear :o)
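
For what it's worth, the search described above can also be handed to an HTML parser, which avoids matching "src=" in ordinary page text. A minimal sketch in Python (not the VB used elsewhere in this thread), assuming only the `src` and `background` attributes matter; `RefCollector` and `find_refs` are made-up names for illustration:

```python
from html.parser import HTMLParser

# Attributes assumed (for this sketch only) to carry file references.
PATH_ATTRS = {"src", "background"}


class RefCollector(HTMLParser):
    """Collects the values of path-carrying attributes, in document order."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lower-cased.
        for name, value in attrs:
            if name in PATH_ATTRS and value:
                self.refs.append(value)


def find_refs(html):
    """Return every src/background value found inside a tag."""
    collector = RefCollector()
    collector.feed(html)
    collector.close()
    return collector.refs
```

For example, `find_refs('<body background="bg.gif"><img src="a/b.jpg">')` returns `['bg.gif', 'a/b.jpg']`.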

 
LVL 3

Author Comment

by:bczingo
ID: 11908180
Not all I wanted but I appreciate the comment Carl
- what other tags are there that would point to external files?
 
LVL 52

Expert Comment

by:Carl Tawn
ID: 11910353
not many really.  

src and background obviously for images.  maybe <object> tags and <param> tags if you embed flash movies or anything like that.  possibly "includes" if you use external stylesheets or js files.

also, background-image (i think) if you use styles rather than just the "background" attribute.
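
The `background-image` case is different from the attribute cases because the path sits inside a CSS `url(...)` rather than after an `=` sign. A rough Python sketch of pulling those out with a regular expression; `css_urls` is a hypothetical helper name, and the pattern is an assumption that covers the quoted and unquoted `url()` forms only:

```python
import re

# Matches url(x), url('x') and url("x"), in any letter case.
# Group 1 is the optional opening quote; group 2 is the path itself.
CSS_URL = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)


def css_urls(text):
    """Return the paths found in CSS url(...) references, in order."""
    return [match.group(2) for match in CSS_URL.finditer(text)]
```

Because it scans plain text, the same function can be run over a whole HTML file to catch both `style="..."` attributes and `<style>` blocks.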
 
LVL 7

Expert Comment

by:Burbble
ID: 11919207
Do you have an example HTML file (or at least part of one)?

-Burbble
 
LVL 3

Author Comment

by:bczingo
ID: 11931273
Burbble
example wouldn't help really - could be anything at all - pick a page - any page :)


thanks again carl

(I thought I had posted this comment before) guess I didn't hit submit


 
LVL 7

Accepted Solution

by:
Burbble earned 250 total points
ID: 11931600
Ok, could you give an example of a line of HTML and how it should be changed? I'm just not entirely sure how you want to alter the path is all :-)

Like... <IMG SRC="http://www.mysite.com/images/file.jpg"> changes to <IMG SRC="images/file.jpg">?

Sorry about the confusion :/

-Burbble
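
Going by the original question, where every graphic ends up in the same directory as the HTML file, the example above would reduce to `file.jpg` rather than `images/file.jpg`. A small Python sketch of that reduction, assuming references may be URLs, UNC paths, or drive paths; `local_name` is a hypothetical helper, not something from this thread:

```python
from urllib.parse import unquote, urlsplit


def local_name(ref):
    """Return just the file name from a URL, UNC path, or drive path."""
    # For URL-style references, drop any query string or fragment.
    path = urlsplit(ref).path if "://" in ref else ref
    # Treat backslashes like forward slashes so Windows paths split too.
    name = path.replace("\\", "/").rsplit("/", 1)[-1]
    # Decode escapes such as %20 back into literal characters.
    return unquote(name)
```

The `%20` handling matters because the file on disk is named with a real space, while the HTML may carry the escaped form.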
 
LVL 3

Author Comment

by:bczingo
ID: 11947689
Well, I have it done :) $$$$$

Not foolproof, but it gets the job done splendidly.
- If anyone's interested in the rest of the code, give me a holler.

Here's just the parse stuff.
As you can see, if there are other tags to look for besides
    " src"   " background"  " background-image"
 then just put in another Case statement.
           

Private Type theFilesInStr
    startLoc As Long
    endLoc As Long
    FileName As String
End Type
Public tF() As theFilesInStr
Private Const WEBSPACE = "%20"
Private theString As String
Private backupString As String

Public Function parseFindAttachments(sentfile As String) As Boolean
    Dim t As Double
    Dim namePart As String
    Dim fileStartLoc As Long
    Dim fileLen As Long
    Dim strLoc As Long
    Dim equalsLoc As Long
    Dim quoteLoc1 As Long
    Dim quoteLoc2 As Long
    Dim lastStrLoc As Long
    Dim startLook As Long
    Dim numFiles As Long
    Dim whatLookMain As String
    Dim whatLook As String
    Dim fsoFileSys As New FileSystemObject
    Dim theFile As String
    Dim whatCount As Long

    Dim f
    Set f = fsoFileSys.OpenTextFile(sentfile, ForReading, False)
    theString = f.ReadAll
    backupString = theString
    'the string has the file!!
    numFiles = -1
    whatCount = 0
    Do
        whatCount = whatCount + 1
        Select Case whatCount
            Case 1
                whatLookMain = " src"
            Case 2
                whatLookMain = " background"
            Case 3
                whatLookMain = " background-image"
        End Select
        startLook = 1
        strLoc = 0
        Do
            equalsLoc = 0
            quoteLoc1 = 0
            quoteLoc2 = 0
            whatLook = whatLookMain
            strLoc = InStr(startLook, theString, whatLook, vbTextCompare)
            If strLoc > 0 Then
                startLook = strLoc + 1
                whatLook = "="
                equalsLoc = InStr(startLook, theString, whatLook, vbTextCompare)
                If equalsLoc > 0 Then
                    startLook = equalsLoc + 1
                    whatLook = Chr(34)                     ' quote sign
                    quoteLoc1 = InStr(startLook, theString, whatLook, vbTextCompare)
                    If quoteLoc1 > 0 Then
                        startLook = quoteLoc1 + 1
                        whatLook = Chr(34)
                        quoteLoc2 = InStr(startLook, theString, whatLook, vbTextCompare)
                        If quoteLoc2 > 0 Then
                            numFiles = numFiles + 1
                            fileStartLoc = quoteLoc1 + 1
                            fileLen = (quoteLoc2 - quoteLoc1) - 1
                            theFile = Mid(theString, fileStartLoc, fileLen)
                            ReDim Preserve tF(numFiles)
                            tF(numFiles).startLoc = fileStartLoc
                            tF(numFiles).endLoc = fileLen   ' note: holds the length of the path, not an end position
                            namePart = fs.getNamePart(theFile)
                            If InStr(namePart, WEBSPACE) > 0 Then
                                tF(numFiles).FileName = replaceChars_TSB(namePart, WEBSPACE, " ")
                            Else
                                tF(numFiles).FileName = namePart
                            End If
                         '   Debug.Print theFile
                        '    Debug.Print tF(numFiles).FileName
                            startLook = quoteLoc2 + 1
                        End If
                    End If
                End If
            End If
        Loop While quoteLoc2 > 0
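    Loop While whatCount < 3    ' one pass per Case above; raise this when adding a Case

    f.Close

    ' Report whether any file references were found
    parseFindAttachments = (numFiles >= 0)
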
End Function
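
Once the start position and length of each path are known, as recorded in `tF` above, the rewrite step itself can be sketched as below. This is a Python illustration under assumptions, not the rest of the author's code: `apply_replacements` is a hypothetical name, and positions are 0-based here whereas VB's `InStr` and `Mid` are 1-based.

```python
def apply_replacements(text, spans):
    """Replace each (start, length, new_text) span in text.

    start is a 0-based offset into the original text.
    """
    # Work from the last span to the first, so that changing the length of
    # one path never shifts the offsets of spans still to be processed.
    for start, length, new_text in sorted(spans, reverse=True):
        text = text[:start] + new_text + text[start + length:]
    return text
```

Processing the spans back to front is the design choice that matters: going front to back would require adjusting every later offset after each replacement.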

 
LVL 7

Expert Comment

by:Burbble
ID: 11953731
Glad you got it solved :)

I don't think my comment should be the accepted answer though...

-Burbble
 
LVL 3

Author Comment

by:bczingo
ID: 11958275
'prolly not :)
don't know how to change it though