Solved

script to get info over http

Posted on 2001-07-30
Last Modified: 2010-03-05
anyone ever see this?

http://www.experts-exchange.com/jsp/qShow.jsp?ta=suggestion&qid=10066793


How the hell did he do it?

Just give me some ideas...

Seems to me he would have to search through every PAQ.

Isn't that an insane overhead?
Question by:bebonham
LVL 8

Accepted Solution

by:
shlomoy earned 400 total points
ID: 6336686
This is the way to go:

Take the EE front page and extract the URLs of all the sections.

for each section url {
     get the HTML of that section
     extract all experts' names from the page
     for each expert (only the first time you see that expert, as you don't want to process any expert more than once) {
          get the member profile page
          for each section in the "answered questions" section {
               save the section's title along with the points
               be sure to get all the pages listing that section's questions by following the next XX link (this needs to be iterative: keep adding the points until you have gone through all of the section's questions)
          }
     }
     repeat this for the pages you can reach by following the next XX links
}



I think that's about it.
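
As a rough illustration of the loop described above, here is a minimal sketch in Python (the thread itself later settles on Perl and LWP). It assumes nothing about EE's real markup: `fetch`, `parse_section`, `parse_profile` and `profile_url` are hypothetical callables that the caller supplies. Each parser is assumed to return the items found on one page plus the URL behind that page's next XX link, or None on the last page.

```python
from collections import defaultdict


def follow_pages(first_url, fetch, parse):
    """Yield the items of first_url and of every page reached via its 'next' link.

    parse(html) is assumed to return (items, next_url), with next_url set to
    None on the last page.
    """
    url = first_url
    while url is not None:
        items, url = parse(fetch(url))
        yield from items


def crawl(section_urls, profile_url, fetch, parse_section, parse_profile):
    """Return {expert: {section_title: total_points}}.

    Each expert's profile is processed only once, no matter how many
    section pages list that expert.
    """
    seen = set()
    totals = defaultdict(lambda: defaultdict(int))
    for section_url in section_urls:
        for expert in follow_pages(section_url, fetch, parse_section):
            if expert in seen:
                continue  # already processed this expert
            seen.add(expert)
            pages = follow_pages(profile_url(expert), fetch, parse_profile)
            for title, points in pages:
                totals[expert][title] += points  # add up across all pages
    return totals
```

Passing the fetcher and parsers in as arguments keeps the control flow separate from the HTML scraping, so the deduplication and the "next" link handling can be checked against canned pages before any real request is sent.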
LVL 8

Author Comment

by:bebonham
ID: 6338823
thank you sir,

ouch.  That's a lot of requests...

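So you are saying use LWP, right?

Basically, get an array like @membernames through the process you described, and then

use LWP::Simple;

foreach my $name (@membernames)
{
    # "ee" is shorthand for the Experts Exchange host
    my $html = get("ee/jsp/memberProfile.jsp?mbr=$name");
    ## then process $html
}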

something like that?
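
For comparison, the same per-member loop can be sketched in Python, with `urllib` standing in for LWP's `get`. The base URL below is an assumption pieced together from the host in the link at the top of the question and the path in the snippet above; `profile_url` and `fetch_profiles` are hypothetical helper names, not anything EE provides.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL: host taken from the question's link, path from the snippet above.
BASE = "http://www.experts-exchange.com/jsp/memberProfile.jsp"


def profile_url(name, base=BASE):
    """Build the profile URL for one member, URL-encoding the name."""
    return base + "?" + urlencode({"mbr": name})


def http_get(url):
    """Plain HTTP GET that returns the response body as bytes."""
    with urlopen(url) as resp:
        return resp.read()


def fetch_profiles(names, fetch=http_get):
    """Return {name: page}, requesting each distinct name only once."""
    pages = {}
    for name in names:
        if name not in pages:
            pages[name] = fetch(profile_url(name))
    return pages
```

Encoding the member name matters once names contain spaces or punctuation, and skipping names already fetched keeps the request count at one per distinct member.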

I saw Interiot did a nice script for checking new questions too. You can see it on his profile.

thanks, shlomoy

regards,

Bob
LVL 8

Expert Comment

by:shlomoy
ID: 6339893
Many many requests.
You can dramatically reduce the number of requests if you have access to EE's database.
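
To illustrate why database access changes the picture: with direct access, the whole crawl collapses into one aggregate query. The sketch below uses Python's `sqlite3` and a purely hypothetical `answers(expert, section, points)` table; nothing in this thread describes EE's actual schema or database engine.

```python
import sqlite3


def points_by_section(conn):
    """Return (expert, section, total_points) rows, sorted by expert then section.

    Assumes a hypothetical table answers(expert TEXT, section TEXT, points INTEGER).
    """
    query = (
        "SELECT expert, section, SUM(points) "
        "FROM answers "
        "GROUP BY expert, section "
        "ORDER BY expert, section"
    )
    return conn.execute(query).fetchall()
```

One query like this replaces a request per section page plus a request per profile page.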

You can use LWP for doing "GET", sure :-)

You are right. You got the idea!


Sure.
Glad to help.

I'm actually very interested in such scripts which "data mine" sites.

LVL 8

Expert Comment

by:shlomoy
ID: 6339905
Can you give me a link to his script?
I couldn't find it from his profile.