PHP advice from Ray/experts

Hi Ray/experts,

I have been getting help from the experts at EE, and afterwards I look into the code and learn from it.

But my problem is that my company puts me in many tech areas, so I can't concentrate on one particular language.

My goal now is to learn enough to write the code to scrape data from a store myself.

Can you tell me which PHP functions and which HTML elements I need to master?

I have already gone through your old scripts posted on EE, but if I want to understand what is needed to get the job done, I need to learn the correct things. So please provide your kind support again so I can complete my goal.

Julian Hansen commented:
If you want to scrape a site - you need to load the remote page and then search it for relevant items.

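To load the page you can use either of these:
file_get_contents
The cURL library

Here is a simple file_get_contents sample (htmlspecialchars keeps the fetched markup from breaking out of the textarea)
<?php $x = file_get_contents(""); /* put the page URL between the quotes */ ?>
<textarea style="width: 100%; border: 1px solid black; height: 600px"><?php echo htmlspecialchars($x); ?></textarea>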


After running this, $x holds the HTML for the page.
If you need to send parameters to the site (login / POST variables etc.) then you would need to look at the cURL library.
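
As a rough illustration of sending POST variables with cURL, here is a minimal sketch. The helper name fetch_page_with_post, the URL and the field names are all made up for the example, and a real login flow will usually also need cookie handling, which is left out here.

```php
<?php
// Hypothetical helper (not from the thread): POST form fields to a URL
// and return the response body as a string.
function fetch_page_with_post(string $url, array $fields): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,                        // send a POST request
        CURLOPT_POSTFIELDS     => http_build_query($fields),   // URL-encoded form body
        CURLOPT_RETURNTRANSFER => true,                        // return the page instead of printing it
        CURLOPT_FOLLOWLOCATION => true,                        // follow redirects
        CURLOPT_TIMEOUT        => 30,                          // give up after 30 seconds
    ]);

    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $html;
}

// Example call (placeholder URL and field names, adjust to the real form):
// $html = fetch_page_with_post('https://example.com/login', ['user' => 'me', 'pass' => 'secret']);
```

Returning the page as a string means the result can be searched the same way as the $x from file_get_contents.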

file_get_contents
cURL

Once you have the page downloaded you need to find the bits you are looking for. To do this you need to look at these functions:
preg_match
preg_match_all

If you want to pull one item from the page - preg_match is the answer - if you want to pull a set of items that have the same pattern then use preg_match_all.

The difficult bit is constructing the regex pattern to match what you are looking for - but if you do that part correctly you should end up with an array with all the values in it you are looking for.
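
To make the difference between the two functions concrete, here is a small sketch. The HTML snippet, the class names and the patterns are invented for the example; a real page will need its own patterns.

```php
<?php
// Made-up HTML standing in for a downloaded page.
$html = '<ul>
  <li class="item"><span class="name">Red Mug</span> <span class="price">$4.50</span></li>
  <li class="item"><span class="name">Blue Mug</span> <span class="price">$5.25</span></li>
</ul>';

// preg_match stops at the first match: $m[1] is the first captured price.
$first = null;
if (preg_match('~<span class="price">\$([0-9.]+)</span>~', $html, $m)) {
    $first = $m[1];
}

// preg_match_all collects every match. PREG_SET_ORDER gives one array per match,
// holding the full match followed by each captured group.
preg_match_all(
    '~<span class="name">(.*?)</span>\s*<span class="price">\$([0-9.]+)</span>~s',
    $html,
    $rows,
    PREG_SET_ORDER
);

// Build a name => price array from the matches.
$items = [];
foreach ($rows as $row) {
    $items[$row[1]] = $row[2];
}

print_r($items);
```

Using ~ as the pattern delimiter avoids having to escape the / in closing tags.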

My advice - jump in the deep end and play around with the above functions until you are familiar with them.

Then fire up Google and type in "php screen scraping" - there should be a wealth of resources there to get you started.
Cornelia Yoder (Artist) commented:
I would suggest you get some good books on web development (HTML, PHP, and MySQL) and make the learning project into a real learning project, not just a quick tell-me-what-I-might-need.

Depending on your project, there are a whole lot of different functions that might be useful, and it's not really possible to just list a few to "master".

Here are some examples of really excellent tutorial books, all available on Amazon or in bookstores.

Web Design in a Nutshell, Jennifer Niederst Robbins, O'Reilly Publishing, 2006 (Covers HTML, CSS, PHP, and MySQL, plus several other important things)

PHP & MySQL For Dummies, Valade, Janet, Wiley Publishing, ISBN 0-470-09600

PHP Bible, Converse, Tim and Park, Joyce, Wiley Publishing, ISBN #0-7645-4955

SAMS Teach Yourself MySQL in 21 Days, Butcher, Anthony, SAMS Publishing, ISBN# 0-672-32392

PHP and MySQL Web Development, Welling, Luke and Thomson, Laura, SAMS Publishing, ISBN# 0-672-32672

Build Your Own Database Driven Website Using PHP & MySQL, Kevin Yank
Loganathan Natarajan (LAMP Developer) commented:
Also, go with object-oriented programming (PHP 5), then start learning framework-based development with CodeIgniter, Zend, CakePHP, Yii... Then PHP coding practices and other things. Finally, you can participate in EE and solve questions.
Ray Paseur commented:
Good "getting started" learning resources are available in this article.  The first I would send you to is the latest edition of the Welling/Thomson book.

The reason I recommend the latest edition is because technology is always advancing.  A book written in 2010 is obsolete today, so just don't go there.  If you have a book that is more than a couple of years old, go buy the newest edition and give the old copy to one of your enemies.

You might want to learn regular expressions.  Or not.  It just depends on whether it's worth your time to take on an arcane programming language made from nothing but punctuation.  Mostly it's not needed for string parsing when you're talking about HTML documents.  For a humorous, but mostly true, view of REGEX:

Instead, I would suggest you look for creative ways to use PHP string functions.  These are particularly useful; study the PHP man pages, especially the user-contributed notes and become familiar with them.  You will write a few more lines of code when you use these instead of regular expressions, but you will get good results faster.
strpos() and stripos()
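
As one possible way to use these, here is a sketch that pulls out the text between two markers without any regular expression. The helper name text_between and the sample HTML are made up for the example, and substr() and strlen() are added alongside the functions named above to cut out the piece once its position is known.

```php
<?php
// Hypothetical helper: return the text between the first $start marker and the
// next $end marker, or null if either marker is missing. Case-insensitive.
function text_between(string $haystack, string $start, string $end): ?string
{
    $from = stripos($haystack, $start);
    if ($from === false) {          // use ===, because position 0 is a valid match
        return null;
    }
    $from += strlen($start);        // skip past the start marker itself

    $to = stripos($haystack, $end, $from);
    if ($to === false) {
        return null;
    }
    return substr($haystack, $from, $to - $from);
}

// Made-up HTML standing in for a downloaded page.
$html = '<html><head><TITLE>Mug Store</TITLE></head><body><h1>Sale</h1></body></html>';

echo text_between($html, '<title>', '</title>'), "\n";
```

The strict comparison with false matters: strpos() and stripos() return 0 when the match is at the very start of the string, and a loose comparison would treat that as "not found".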

You can use your browser's "view source" to get a copy of the HTML document.  Store it on your server where you can experiment with a stable copy of the data.   Later, you can go back to the original URL and see if anything has changed.
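
A variant of the same idea, if you would rather let PHP save the copy than use "view source": the sketch below downloads the page once and reads the saved file on every later run. The helper name load_stable_copy, the URL and the file name are assumptions for the example.

```php
<?php
// Hypothetical helper: return the saved copy if there is one;
// otherwise download the page once and save it for later experiments.
function load_stable_copy(string $url, string $cacheFile): string
{
    if (is_file($cacheFile)) {
        return file_get_contents($cacheFile);   // stable local copy, no network access
    }

    $html = file_get_contents($url);
    if ($html === false) {
        throw new RuntimeException("Could not download $url");
    }

    file_put_contents($cacheFile, $html);       // keep it for the next run
    return $html;
}

// Example call (placeholder URL and file name):
// $html = load_stable_copy('https://example.com/store', __DIR__ . '/store-copy.html');
```

Deleting the saved file forces a fresh download, which is a simple way to check whether the original page has changed.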

A word of caution -- by now (2014) most web publishers who have data that is of any economic value have figured out that the world is full of hackers who will try to use screen scrapers to steal the data (D'Oh).  So they may not put the data into the HTML document, choosing instead to put placeholders.  Then, using JavaScript and AJAX, they can load the data dynamically after the HTML document is ready.  You may not find any good way to work around this issue, and it's a growing trend (Google stopped clear-text search results years ago).  So don't plan on building an application on such a shaky foundation.  Instead, go to the publisher of the data and ask for an API that exposes the information you need.  If they want to let you have it, they will give you an API (and they may expect you to pay for the use of the API).  An introduction to the concepts of APIs here:

HTH, ~Ray
magento (Author) commented:
Thanks to all the experts for taking your time to provide your advice. Thanks again for that.

Julian, your post was really good. I will surely go through those topics. I know looping and variables; my problem is that I haven't used them for a long time, e.g. I haven't coded anything for quite a while.

Now the rock of PHP, Mr. Ray - Sir, I have followed you from my start at EE. You are a great programmer. I have now gone through the newbie article and gathered the docs for my learning. I prefer to go through your ~60 articles first before going through the references you mentioned.

Also, I paid for Lynda; can you show me which video would be good for me?

Greetings magento, since you mentioned "html elements i need to master", I will say that this Mozilla developer site has many helpful things to use and learn -

It has Challenges (kind of like homework), some advanced HTML5, and a Demos page where you can see the page display you get using different HTML/CSS techniques. There are also CSS and JavaScript instructions for developers.

But you will need to spend the time to absorb all of the interconnected web tech (HTML - CSS - JavaScript - PHP - web images) that is in the current server versions and browser versions.