Solved

text data manipulation

Posted on 2009-05-14
Last Modified: 2013-11-15
I recently received a screen scrape of a website. The references to images in the content were scrambled. Fortunately, there is a pattern to the scramble, so I believe an Update statement or query can remedy the situation. I would appreciate the statement and/or string manipulation code to get this done.

Here is the broken pattern:   <IMG SRC="filename.pdf? images kb site>
Here is what it needs to be:  <IMG SRC="/kb/images/filename.pdf">

We have removed the "site" folder in our new system.
Question by:plord1234
 
LVL 65

Expert Comment

by:rockiroads
ID: 24389171
how about this

Assuming there is always a ? and double quotes

Replace(Left$(x, InStr(1, x, "?") - 1), Chr$(34), Chr$(34) & "/kb/images/") & Chr$(34) & ">"

where x is your fieldname (note the two occurrences of it)
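
For a field that holds exactly one broken tag, the expression above could be wrapped in a small function so that a value without a "?" does not raise a run-time error. This is only a sketch: the name FixSingleImg is made up for illustration, and it assumes the only double quote before the "?" is the one that opens SRC.

```vba
' Hypothetical helper: repairs a value that contains a single broken IMG tag.
' Assumes the only double quote before the "?" is the one that opens SRC.
Public Function FixSingleImg(ByVal x As String) As String
    Dim lQ As Long

    lQ = InStr(1, x, "?")
    If lQ = 0 Then
        ' No "?" found, so leave the value untouched
        FixSingleImg = x
    Else
        ' Same expression as above: prefix the path, drop everything from "?" on
        FixSingleImg = Replace(Left$(x, lQ - 1), Chr$(34), Chr$(34) & "/kb/images/") _
                       & Chr$(34) & ">"
    End If
End Function
```

In the Immediate window, ? FixSingleImg("<IMG SRC=""filename.pdf? images kb site>") should print <IMG SRC="/kb/images/filename.pdf">.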
 

Author Comment

by:plord1234
ID: 24389613
thanks rocki,

Your solution is valid, however:

I did not explain in my original question that the text could contain multiple images, and that a question mark (?) in the text does not necessarily follow an image file name. However, an image file name is always followed by a "?".
 
LVL 65

Expert Comment

by:rockiroads
ID: 24389701
Here is the broken pattern:   <IMG SRC="filename.pdf? images kb site>
Here is what it needs to be:  <IMG SRC="/kb/images/filename.pdf">

Right, multiple images. Can you provide an example? And are you saying the image filename can be either before or after the "?"?


Author Comment

by:plord1234
ID: 24390065
Example in code snippet
<div class="box greyBox"><table cellspacing="0" cellpadding="3" width="50%" align="center" border="0"><tbody><TR><TD width="100%"><P align=center><B><FONT color=#000046 size=3 face=Arial>Rodeo Advantage  - FAQs</FONT></B></P></TD></TR><TR><TD><FONT size=1>All files and documentation are offered on an *AS IS* basis and you assume full responsibility for using them.</FONT></TD></TR></TBODY></TABLE><FONT size=2 face="arial, veranda"><TABLE border=0 cellPadding=5 width="40%" align=center><TBODY><TR bgColor=#aaaaaa><TH colSpan=2>Effects Tab - Turn ON / OFF Effects</TH></TR><TR><TD><PRE style="font-family: Trebuchet MS, Arial, Verdana;font-size: 10pt">Click on the tab to activate the controls <BR>and Virtual Surround Sound effects.<BR><BR><IMG SRC="aa_efare1.gif? kb_files images site><BR><IMG <IMG SRC="aa_efare2.gif? kb_files images site ><BR><BR>ENVIRONMENTS: 

 

Author Comment

by:plord1234
ID: 24390202
Rocki,

Also, what I meant was that a (?) could be in the text and be unrelated to an image file. For instance, a question could be asked in the text.
 
LVL 65

Expert Comment

by:rockiroads
ID: 24393821
Is the prefix always /kb/images?

If so, this will do that; otherwise a little more tweaking is required.
This reads a line and then prints the parsed image tags to the Immediate window.

    Dim sHtml As String
    Dim i As Long
    Dim sSplitHtml() As String
    Dim sImages As String
   
    sHtml = "MY HTML LINE GOES HERE"
   
    ' Each element after the first starts just past an <IMG SRC= delimiter
    sSplitHtml = Split(sHtml, "<IMG SRC=")
    For i = 1 To UBound(sSplitHtml)
       
        ' Keep the text up to the closing ">" and strip the double quotes
        sSplitHtml(i) = Replace(Trim$(Left$(sSplitHtml(i), InStr(1, sSplitHtml(i), ">") - 1)), Chr$(34), "")
        ' The file name is everything before the first space, minus the "?"
        sImages = "<IMG SRC=" & Chr$(34) & "/kb/images/" & Replace(Left$(sSplitHtml(i), InStr(1, sSplitHtml(i), " ") - 1), "?", "") & Chr$(34) & ">"
   
        ' HERE ARE YOUR RESULTS
        Debug.Print sImages
    Next


So from the last sample you gave, it comes up with

            <IMG SRC="/kb/images/aa_efare1.gif">
            <IMG SRC="/kb/images/aa_efare2.gif">

 

Author Comment

by:plord1234
ID: 24395671
This is great Rocki.

Could you modify your code slightly so that all of the html is returned with the repaired image references?
 
LVL 65

Expert Comment

by:rockiroads
ID: 24398622
That makes it a little more interesting; the worry is the length of the line. Also, as you don't want individual values but the whole line, it might need a rethink. The current code probably won't support this.
Anything else, or is this it? It helps if you give as much info as possible.
 
LVL 65

Accepted Solution

by:
rockiroads earned 500 total points
ID: 24398759
OK. I thought about how to make use of what has been done so far, and this is what I came up with.
The first pass goes through and sorts out the IMG SRC tags.
Then we loop again and display the data. Note that the loops start from different points. This is intentional: element 0 is the text before the first "<IMG SRC=", so it has no tag to repair.

    Dim sHtml As String
    Dim i As Long
    Dim sSplitHtml() As String
    Dim iChev As Long    ' Long, because InStr can return positions past 32,767 in long HTML
    Dim sLine As String
   
'    <div class="box greyBox"><table cellspacing="0" cellpadding="3" width="50%" align="center" border="0"><tbody><TR><TD width="100%"><P align=center><B><FONT color=#000046 size=3 face=Arial>Rodeo Advantage  - FAQs</FONT></B></P></TD></TR><TR><TD><FONT size=1>All files and documentation are offered on an *AS IS* basis and you assume full responsibility for using them.</FONT></TD></TR></TBODY></TABLE><FONT size=2 face="arial, veranda"><TABLE border=0 cellPadding=5 width="40%" align=center><TBODY><TR bgColor=#aaaaaa><TH colSpan=2>Effects Tab - Turn ON / OFF Effects</TH></TR><TR><TD><PRE style="font-family: Trebuchet MS, Arial, Verdana;font-size: 10pt">Click on the tab to activate the controls <BR>and Virtual Surround Sound effects.<BR><BR><IMG SRC="aa_efare1.gif? kb_files images site><BR><IMG <IMG SRC="aa_efare2.gif? kb_files images site ><BR><BR>ENVIRONMENTS:

    sHtml = "YOUR HTML LINE ABOVE HERE - HOWEVER YOU READ IT"
   
    sSplitHtml = Split(sHtml, "<IMG SRC=")
   
    For i = 0 To UBound(sSplitHtml)
        Debug.Print "Before", sSplitHtml(i)
    Next i
   
    For i = 1 To UBound(sSplitHtml)
       
        iChev = InStr(1, sSplitHtml(i), ">")
        If iChev > 0 Then
            ' Text of the tag up to the closing ">", with the double quotes stripped
            sLine = Replace(Trim$(Left$(sSplitHtml(i), iChev - 1)), Chr$(34), "")
            ' Without a space after the file name, Left$ below would fail; pad one on
            If InStr(1, sLine, " ") = 0 Then sLine = sLine & " "

            sSplitHtml(i) = "<IMG SRC=" & Chr$(34) & "/kb/images/" & Replace(Left$(sLine, InStr(1, sLine, " ") - 1), "?", "") & Chr$(34) & ">" & Mid$(sSplitHtml(i), iChev + 1)
   
        End If
    Next

    For i = 0 To UBound(sSplitHtml)
        Debug.Print "After", sSplitHtml(i)
    Next i
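
To hand back the whole HTML rather than print the pieces, the same pass could be wrapped in a function that glues the array back together with Join. This is a rough sketch built on the loop above, not a tested drop-in. The names RepairImgTags, tblArticles and Content are placeholders and do not come from the question.

```vba
' Sketch: returns sHtml with every <IMG SRC= tag rebuilt as /kb/images/<file>.
' RepairImgTags is a made-up name; the logic mirrors the loop above.
Public Function RepairImgTags(ByVal sHtml As String) As String
    Dim sParts() As String
    Dim sLine As String
    Dim i As Long, iChev As Long, iSpace As Long

    sParts = Split(sHtml, "<IMG SRC=")

    ' Element 0 is the text before the first tag, so start at 1
    For i = 1 To UBound(sParts)
        iChev = InStr(1, sParts(i), ">")
        If iChev > 0 Then
            sLine = Replace(Trim$(Left$(sParts(i), iChev - 1)), Chr$(34), "")
            iSpace = InStr(1, sLine, " ")
            If iSpace = 0 Then iSpace = Len(sLine) + 1   ' nothing after the file name
            sParts(i) = "<IMG SRC=" & Chr$(34) & "/kb/images/" & _
                        Replace(Left$(sLine, iSpace - 1), "?", "") & _
                        Chr$(34) & ">" & Mid$(sParts(i), iChev + 1)
        Else
            ' No closing ">" found: put the delimiter back unchanged
            sParts(i) = "<IMG SRC=" & sParts(i)
        End If
    Next i

    ' A zero-length delimiter concatenates the pieces with nothing between them
    RepairImgTags = Join(sParts, "")
End Function
```

If the function sits in a standard module, an Access update query along these lines could apply it to every row that contains an image tag (table and field names are assumptions):

    UPDATE tblArticles SET Content = RepairImgTags([Content]) WHERE [Content] Like "*<IMG SRC=*";

Note that a stray "<IMG " with no SRC, like the one in the sample line, is passed through untouched, and every "<IMG SRC=" tag that has a closing ">" gets the /kb/images/ prefix, so it is worth running against a copy of the data first.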
 
 

Author Comment

by:plord1234
ID: 24424219
Thanks Rocki!

This worked great. Is there any way for me to send you $?