Converting HTML to a string / page scraping

Posted on 2006-03-23
Last Modified: 2009-12-16
Hi all,

I'm new to PHP programming so please bear with me.

I'm developing a Java app for mobile phones that takes user input, sends it to a website, and gets the response.
The response is formatted HTML that unfortunately has more page elements than I need and does not display well on the phone.

I have no control over the page I send data to and receive data from.

What I was thinking is to write my own PHP script that receives the request from the user, sends it on to the correct page,
receives the result from that page, strips all the useless info, and returns a simple string, so that when the phone app
receives the string it is ready to display with no processing required.
Something similar to page scraping, I guess.

My reasons for trying it this way are:
1) minimize the overhead on the phone, leaving the processing to the server, which I think should be faster
2) minimize the data sent to and received from the phone
3) see the difference in lag time and data transmission size between the current implementation (which receives the whole HTML and scrapes it on the phone) and this implementation
4) minimize the size of the app on the phone

How do I go about doing this?

The user enters a string and that is transmitted to the site.

The response that I need always comes after a </form> tag.
Two elements later there will always be either
a) "<p align=\"center\">"
(indicating nothing found)
b) "<b>"
(indicating something found)
and the data continues until a </div> is encountered.

The rest is junk.

I know how to get the argument passed to the page from the app, but from there I'm kind of lost.

Thoughts, suggestions?

Any help is greatly appreciated!


Question by:sgaggerj

    Accepted Solution

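    This code parses out the response:

    // Test input: your own description of the page stands in for the real HTML.
    $_input = "I am using the <b>actual
    mail text</b> as a way <p>to test the reg ex.
    I know it isn't exactly what is in the
    html page.</form> tag
    two elements later will always be either
    a) \"<p align=\"center\">\"
    (indicating nothing found)
    b) \"<b>\"
    (indicating something found)
    and the data continues until a </div>";

    // After </form>, capture from the first <p align="center"> or <b> up to the next </div>.
    // s = dot matches newlines, i = case-insensitive.
    $_pattern = "/<\/FORM>(.*?(?:<P ALIGN=\"CENTER\">|<B>).*?)<\/DIV>/si";
    preg_match($_pattern, $_input, $_match);

    // Escape the captured HTML so the browser shows the tags instead of rendering them.
    echo "<PRE>\n";
    echo htmlspecialchars(print_r($_match, true));
    echo "</PRE>\n";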
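
    Since the phone only needs a plain string, here is a minimal sketch of turning the matched block into text. It assumes the layout described in the question (result after </form>, starting at <p align="center"> or <b>, ending at </div>). The function name extract_result() and the returned array keys are made up for illustration.

```php
<?php
// Hypothetical helper: extract_result() is illustrative, not an existing function.
// Returns null when the page does not match the assumed layout.
function extract_result($html)
{
    // After </form>, lazily skip to the first <p align="center"> or <b>,
    // then capture everything up to the next </div>.
    $pattern = '/<\/form>.*?(<p\s+align="center">|<b>)(.*?)<\/div>/si';

    if (!preg_match($pattern, $html, $match)) {
        return null;
    }

    // Per the question, <b> means something was found.
    $found = (strtolower($match[1]) === '<b>');

    // Turn <br> into spaces so adjacent words do not run together,
    // then drop the remaining tags and decode entities such as &amp;.
    $text = preg_replace('/<br\s*\/?>/i', ' ', $match[2]);
    $text = html_entity_decode(strip_tags($text), ENT_QUOTES, 'UTF-8');

    // Collapse runs of whitespace into single spaces.
    $text = trim(preg_replace('/\s+/', ' ', $text));

    return array('found' => $found, 'text' => $text);
}
```

    Your script could then echo only $result['text'] back to the phone. If the real page turns out to be less regular than described, a DOM parser may hold up better than a regex.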

    I'm not sure if you are already pushing data to the other site and
    getting back the response yet.

    If not, you can use cURL. Let me know, I can give you some examples.
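
    For reference, a rough sketch of the cURL side, assuming PHP's cURL extension is installed. The function names, the URL, and the 'q' field name are placeholders, and whether the remote form expects POST or GET is an assumption you would need to check against the real page.

```php
<?php
// Hypothetical helpers: build_curl_options() and fetch_page() are illustrative names.
function build_curl_options($url, array $fields)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,                        // assumes the form uses POST
        CURLOPT_POSTFIELDS     => http_build_query($fields),   // URL-encodes the user's input
        CURLOPT_RETURNTRANSFER => true,                        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,                        // follow redirects
        CURLOPT_CONNECTTIMEOUT => 5,                           // seconds
        CURLOPT_TIMEOUT        => 15,                          // seconds
    );
}

// Returns the response body as a string, or false on failure.
function fetch_page($url, array $fields)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url, $fields));

    $html = curl_exec($ch);
    if ($html === false) {
        error_log('cURL error: ' . curl_error($ch));
    }

    // The handle is released when $ch goes out of scope.
    return $html;
}

// Example call (placeholder URL and field name):
// $html = fetch_page('http://example.com/search', array('q' => 'user input'));
```

    The timeouts matter here because the phone is waiting on your script while your script waits on the other site.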

    Author Comment

    Thanks Brian, sorry it took me so long to get back to this question.

