blueantz_cpw
 asked on

Scrape data from a website using php

I need to scrape data from websiteA and then upload it to websiteB's homepage.
websiteA URLs: http://www.joecigar.com and http://dailycigardeal.com

The data being scraped are daily deals and the content changes daily. This website currently does the exact same thing I need: http://www.cigarstash.com/cigar-deals.php

Please help. I hope that you can provide me with the code, since I have only a basic idea of web scraping, and especially of the regular expressions used for selective scraping.

Thank you!!!
PHP


8/22/2022 - Mon
Ray Paseur

The general design for something like this is to read the web page with file_get_contents() or cURL, then tease the HTML apart.  You would start your work with "view source" and build your scripts based on what you see in the HTML.  Every such application is a bespoke script, and they can break without notice at any time, so good error checking is vital.
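
To make that design concrete, here is a minimal sketch, not a finished scraper. The function names fetch_page() and extract_deals() and the CSS class "deal" are hypothetical; the real class or tag names have to come from "view source" on the target page. It swaps regular expressions for DOMDocument and DOMXPath, which tend to cope better with untidy HTML.

```php
<?php
// Fetch a page with cURL. Returns the body, or null on any failure,
// so the caller is forced to handle the error case.
function fetch_page(string $url, int $timeout = 10): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,   // follow redirects
        CURLOPT_CONNECTTIMEOUT => $timeout,
        CURLOPT_TIMEOUT        => $timeout,
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false || $status !== 200) {
        return null;
    }
    return $body;
}

// Pull the text of every element carrying the given class.
// ASSUMPTION: the class name "deal" is made up for illustration.
function extract_deals(string $html, string $class = 'deal'): array
{
    if (trim($html) === '') {
        return [];
    }

    // Real-world HTML is rarely valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc      = new DOMDocument();
    $loaded   = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return [];
    }

    $xpath = new DOMXPath($doc);
    // Match the class as a whole word inside the class attribute.
    $query = sprintf(
        '//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $deals = [];
    foreach ($xpath->query($query) as $node) {
        $deals[] = trim($node->textContent);
    }
    return $deals;
}
```

Typical use would be to call fetch_page() on the source URL, stop if it returns null, and otherwise pass the body to extract_deals(). An empty result usually means the site's markup has changed and the script needs to be revisited.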

Just looking at the site, I found this: http://www.joecigar.com/jcRss.asp and if that has the information you need, it will be much more stable and easier to process.  You can use the SimpleXML class to process RSS feeds.
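
As a rough sketch of the SimpleXML approach: the function name parse_rss_items() is hypothetical, and the code assumes the feed follows the usual RSS 2.0 layout (channel, then item, each with title, link and description). The actual feed should be inspected before relying on those element names.

```php
<?php
// Turn an RSS 2.0 document into a plain array of items.
// Returns an empty array if the XML is malformed or has no items.
function parse_rss_items(string $xml): array
{
    if (trim($xml) === '') {
        return [];
    }

    // Suppress parser warnings; failure is reported through the return value.
    $previous = libxml_use_internal_errors(true);
    $feed     = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($feed === false || !isset($feed->channel->item)) {
        return [];
    }

    $items = [];
    foreach ($feed->channel->item as $item) {
        $items[] = [
            'title'       => trim((string) $item->title),
            'link'        => trim((string) $item->link),
            'description' => trim((string) $item->description),
        ];
    }
    return $items;
}
```

The XML string would come from file_get_contents() or cURL pointed at the feed URL. Casting each element to string matters, because SimpleXML returns objects rather than plain text.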

Best regards, ~Ray
blueantz_cpw

ASKER
The RSS feed didn't seem to work. I'm hoping that someone can provide me with the code, since I have only a basic idea of web scraping. I'm a quick learner, though.

Thanks.
Ray Paseur

Well, as I wrote, this is a bespoke web application script, and it can break without notice at any time, so good error checking is vital.  You might want to consider hiring a professional developer to help you with this project.  It is not easy, and it is not a project for a quick study.  I would be glad to help if it could be done in a few lines, but this takes a few hundred lines of code, so that is not an option here.  You have to study string manipulation in some detail to be successful.  Or simply acknowledge that time is money and hire a developer.

Please consider the resources you can find when you make a Google search for "PHP developer."

Good luck, ~Ray
ASKER CERTIFIED SOLUTION
Ray Paseur

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
blueantz_cpw

ASKER
Ray,

That's exactly what I need. How do I incorporate this into my WordPress website? I use the Thesis theme and I put the code into the custom_functions.php file, but it didn't work.

Any ideas??
Ray Paseur

Yes, I do not know WP well enough to advise you, but I have an idea.  There is a WordPress Zone here at EE.  Use the Request Attention link up near the original question and ask a moderator to add this question to that zone.  Since it is the weekend, it may take a day or two to get that part of the question answered.  Or you could just ask a new question in the WP zone and include a link to this one.  You can ask in up to three zones at a time.

BTW, I didn't reply to your "hire me" email because we can use the Atom feed and frankly, this is just not something that I would expect to get paid for.  Too small a job!

Best regards, ~Ray
blueantz_cpw

ASKER
I'm getting the dreaded message: Warning: file_get_contents(http://www.joecigar.com/jcRss.asp) [function.file-get-contents]: failed to open stream: HTTP request failed!

Is there a way to test the file_get_contents() function first and then scrape the data?
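
One way to do that is to check the return value before parsing anything, since file_get_contents() returns false on failure. The sketch below is only an illustration: the function name safe_get_contents() and the User-Agent string are made up. Reading URLs this way also requires allow_url_fopen to be enabled in php.ini. Some servers reject requests that carry no User-Agent header, which may or may not be the cause of the warning here.

```php
<?php
// Wrap file_get_contents() so a failed request yields null instead of
// a warning followed by garbage input to the parser.
function safe_get_contents(string $url, int $timeout = 10): ?string
{
    $context = stream_context_create([
        'http' => [
            'timeout'    => $timeout,
            // ASSUMPTION: hypothetical User-Agent, chosen for illustration.
            'user_agent' => 'Mozilla/5.0 (compatible; DealFetcher/1.0)',
        ],
    ]);

    // @ hides the warning on screen; error_get_last() still records it.
    $body = @file_get_contents($url, false, $context);

    if ($body === false) {
        $error = error_get_last();
        error_log('Fetch failed for ' . $url . ': ' . ($error['message'] ?? 'unknown error'));
        return null;
    }
    return $body;
}
```

The calling script would then scrape only when the result is not null, and otherwise show yesterday's data or a short "deals unavailable" notice.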