Solved

text data manipulation

Posted on 2009-05-14
Last Modified: 2013-11-15
I recently received a screen scrape of a website. The references to images in the content were scrambled. Fortunately, there is a pattern to the scramble, so I believe an Update statement or query can remedy the situation. I would appreciate the statement and/or string manipulation code needed to get this done.

Here is the broken pattern:   <IMG SRC="filename.pdf? images kb site>
Here is what it needs to be:  <IMG SRC="/kb/images/filename.pdf">

We have removed the "site" folder in our new system.
Question by:plord1234
10 Comments
 
LVL 65

Expert Comment

by:rockiroads
ID: 24389171
How about this?

It assumes the value always contains a ? and an opening double quote.

Replace(Left$(x, InStr(1, x, "?") - 1), Chr$(34), Chr$(34) & "/kb/images/") & Chr$(34) & ">"

where x is your fieldname (note the two occurrences of it)
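
Since the question asked for an Update statement, here is a minimal sketch of how the expression above might be wrapped in a function and called from an Access update query. The function name FixImgSrc, the table name tblArticles and the field name Content are hypothetical placeholders, not names from the original post. It keeps the same assumption as the expression: the value holds a single scrambled IMG tag, and everything from the first ? onward is discarded.

```vba
' Hypothetical helper: repairs a value that holds a single scrambled IMG tag.
' Place it in a standard module so that a query can call it.
Public Function FixImgSrc(ByVal x As Variant) As Variant
    Dim iQ As Long

    FixImgSrc = x                       ' default: return the value unchanged
    If IsNull(x) Then Exit Function

    iQ = InStr(1, x, "?")
    If iQ = 0 Then Exit Function        ' no ? found, nothing to repair

    ' Same logic as the one-line expression: cut at the ?, insert the
    ' folder prefix after the opening quote, then close the tag.
    FixImgSrc = Replace(Left$(x, iQ - 1), Chr$(34), Chr$(34) & "/kb/images/") _
                & Chr$(34) & ">"
End Function
```

Assuming those placeholder names, the query would look something like `UPDATE tblArticles SET Content = FixImgSrc([Content]);`. Try it on a copy of the table first, because the change is not reversible.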
 

Author Comment

by:plord1234
ID: 24389613
Thanks Rocki,

Your solution is valid, however:

I did not explain in my original question that the text could contain multiple images, and that a question mark (?) in the text does not necessarily follow an image file name. However, an image file name is always followed by a "?".
 
LVL 65

Expert Comment

by:rockiroads
ID: 24389701
Here is the broken pattern:   <IMG SRC="filename.pdf? images kb site>
Here is what it needs to be:  <IMG SRC="/kb/images/filename.pdf">

Right, multiple images. Can you provide an example? And are you saying

the image filename can be either before or after the ?


 

Author Comment

by:plord1234
ID: 24390065
Example in code snippet
<div class="box greyBox"><table cellspacing="0" cellpadding="3" width="50%" align="center" border="0"><tbody><TR><TD width="100%"><P align=center><B><FONT color=#000046 size=3 face=Arial>Rodeo Advantage  - FAQs</FONT></B></P></TD></TR><TR><TD><FONT size=1>All files and documentation are offered on an *AS IS* basis and you assume full responsibility for using them.</FONT></TD></TR></TBODY></TABLE><FONT size=2 face="arial, veranda"><TABLE border=0 cellPadding=5 width="40%" align=center><TBODY><TR bgColor=#aaaaaa><TH colSpan=2>Effects Tab - Turn ON / OFF Effects</TH></TR><TR><TD><PRE style="font-family: Trebuchet MS, Arial, Verdana;font-size: 10pt">Click on the tab to activate the controls <BR>and Virtual Surround Sound effects.<BR><BR><IMG SRC="aa_efare1.gif? kb_files images site><BR><IMG <IMG SRC="aa_efare2.gif? kb_files images site ><BR><BR>ENVIRONMENTS: 

 

Author Comment

by:plord1234
ID: 24390202
Rocki,

Also, what I meant was that a ? could appear in the text and be unrelated to an image file. For instance, a question could be asked in the text.
 
LVL 65

Expert Comment

by:rockiroads
ID: 24393821
Is the prefix always fixed as /kb/images?

If so, this will do it; otherwise a little more tweaking is required.
This reads a line and then prints the parsed image tags to the Immediate window.

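    Dim sHtml As String
    Dim i As Long
    Dim sSplitHtml() As String
    Dim sImages As String
    Dim iChev As Long
   
    sHtml = "MY HTML LINE GOES HERE"
   
    ' Each element after the first starts just past an <IMG SRC= marker
    sSplitHtml = Split(sHtml, "<IMG SRC=")
    For i = 1 To UBound(sSplitHtml)
        iChev = InStr(1, sSplitHtml(i), ">")
        If iChev > 0 Then
            ' Keep the tag contents up to the closing >, minus any double quotes
            sSplitHtml(i) = Replace(Trim$(Left$(sSplitHtml(i), iChev - 1)), Chr$(34), "")
            ' The filename is everything before the first space; drop its trailing ?
            sImages = "<IMG SRC=" & Chr$(34) & "/kb/images/" & Replace(Left$(sSplitHtml(i), InStr(1, sSplitHtml(i) & " ", " ") - 1), "?", "") & Chr$(34) & ">"

            'HERE ARE YOUR RESULTS
            Debug.Print sImages
        End If
    Next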


So from the last sample you gave, it comes up with

            <IMG SRC="/kb/images/aa_efare1.gif">
            <IMG SRC="/kb/images/aa_efare2.gif">

 

Author Comment

by:plord1234
ID: 24395671
This is great Rocki.

Could you modify your code slightly so that all of the html is returned with the repaired image references?
 
LVL 65

Expert Comment

by:rockiroads
ID: 24398622
That makes it a little more interesting; the worry is the length of the line. Also, since you don't want individual values but the whole line, it might need a rethink, as the current code probably won't support this.
Is there anything else, or is this it? It helps if you give as much info as possible.
 
LVL 65

Accepted Solution

by:
rockiroads earned 2000 total points
ID: 24398759
OK. I thought about how to make use of what has been done so far, and this is what I came up with.
The first pass goes through and sorts out each IMG SRC.
Then we loop again and display the data. Note that the loops start from different indexes; this is intentional.

    Dim sHtml As String
    Dim i As Long
    Dim sSplitHtml() As String
    Dim iChev As Long      ' Long, since InStr positions can exceed Integer's 32,767 limit on long HTML
    Dim sLine As String
   
   
   
'    <div class="box greyBox"><table cellspacing="0" cellpadding="3" width="50%" align="center" border="0"><tbody><TR><TD width="100%"><P align=center><B><FONT color=#000046 size=3 face=Arial>Rodeo Advantage  - FAQs</FONT></B></P></TD></TR><TR><TD><FONT size=1>All files and documentation are offered on an *AS IS* basis and you assume full responsibility for using them.</FONT></TD></TR></TBODY></TABLE><FONT size=2 face="arial, veranda"><TABLE border=0 cellPadding=5 width="40%" align=center><TBODY><TR bgColor=#aaaaaa><TH colSpan=2>Effects Tab - Turn ON / OFF Effects</TH></TR><TR><TD><PRE style="font-family: Trebuchet MS, Arial, Verdana;font-size: 10pt">Click on the tab to activate the controls <BR>and Virtual Surround Sound effects.<BR><BR><IMG SRC="aa_efare1.gif? kb_files images site><BR><IMG <IMG SRC="aa_efare2.gif? kb_files images site ><BR><BR>ENVIRONMENTS:

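    sHtml = "YOUR HTML LINE ABOVE HERE - HOWEVER YOU READ IT"
   
    ' Element 0 is the text before the first image; every later element
    ' starts just past an <IMG SRC= marker
    sSplitHtml = Split(sHtml, "<IMG SRC=")
   
    For i = 0 To UBound(sSplitHtml)
        Debug.Print "Before", sSplitHtml(i)
    Next i
   
    ' First pass: rebuild each image tag, keeping the HTML that follows it
    For i = 1 To UBound(sSplitHtml)
        iChev = InStr(1, sSplitHtml(i), ">")
        If iChev > 0 Then
            ' Tag contents up to the closing >, with double quotes removed
            sLine = Replace(Trim$(Left$(sSplitHtml(i), iChev - 1)), Chr$(34), "")
            ' Filename = text before the first space, minus the trailing ?
            sSplitHtml(i) = "<IMG SRC=" & Chr$(34) & "/kb/images/" & Replace(Left$(sLine, InStr(1, sLine & " ", " ") - 1), "?", "") & Chr$(34) & ">" & Mid$(sSplitHtml(i), iChev + 1)
        End If
    Next i

    ' Second pass: display the repaired pieces
    For i = 0 To UBound(sSplitHtml)
        Debug.Print "After", sSplitHtml(i)
    Next i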
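
The code above prints the repaired pieces one at a time. To get all of the HTML back as a single string, as requested earlier in the thread, the same approach could be wrapped in a function along these lines. This is a sketch only: the function name RepairImages is hypothetical, and it assumes every scrambled reference starts with the exact text <IMG SRC= (Split is case-sensitive by default) and that the prefix is always /kb/images/.

```vba
' Hypothetical wrapper around the two-pass approach above: returns the whole
' HTML string with every <IMG SRC= reference repaired.
Public Function RepairImages(ByVal sHtml As String) As String
    Dim sParts() As String
    Dim sLine As String
    Dim i As Long
    Dim iChev As Long

    If Len(sHtml) = 0 Then Exit Function    ' nothing to do; returns ""

    sParts = Split(sHtml, "<IMG SRC=")
    For i = 1 To UBound(sParts)
        iChev = InStr(1, sParts(i), ">")
        If iChev > 0 Then
            ' Tag contents up to the closing >, with double quotes removed
            sLine = Replace(Trim$(Left$(sParts(i), iChev - 1)), Chr$(34), "")
            ' Rebuild the tag, then re-attach the HTML that followed it
            sParts(i) = "<IMG SRC=" & Chr$(34) & "/kb/images/" & _
                        Replace(Left$(sLine, InStr(1, sLine & " ", " ") - 1), "?", "") & _
                        Chr$(34) & ">" & Mid$(sParts(i), iChev + 1)
        Else
            ' No closing > found: put the marker back and leave the piece as-is
            sParts(i) = "<IMG SRC=" & sParts(i)
        End If
    Next i

    ' Element 0 never had the marker in front of it, so a plain join restores the text
    RepairImages = Join(sParts, "")
End Function
```

Calling `Debug.Print RepairImages(sHtml)` in the Immediate window should show the full repaired text. Only the text inside each IMG tag is touched, so a ? elsewhere in the content is left alone. Two caveats: running it twice on the same text would add the /kb/images/ prefix twice, and a stray `<IMG ` fragment like the one in the sample is passed through unchanged.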
 
 

Author Comment

by:plord1234
ID: 24424219
Thanks Rocki!

This worked great. Is there any way for me to send you $?