Solved

How to fetch  live data from website ?

Posted on 2012-03-22
7
684 Views
Last Modified: 2012-03-24
How to  fetch  live data from website  ( without RSS feed / Webservice )?  

Do you  use any API ?

Example:  Live cricket scores


I want to write a standalone java program which will fetch  live data from  site. How do I start ?  What are the things I require ?

Could you throw some light .
0
Comment
Question by:cofactor
7 Comments
 
LVL 30

Expert Comment

by:IanTh
ID: 37751683
you need a database so the web site works and you can view the data remotely

Do you use any api

that depends on your setup I expect it will be linux and you can install mysql its usually installed anyway although it can be installed in windows

for remote view
see
http://www.softpedia.com/get/Internet/Servers/Database-Utils/Remote-MySQL-Viewer.shtml
0
 
LVL 40

Accepted Solution

by:
gurvinder372 earned 500 total points
ID: 37751860
you can invoke that URL using HTTPURLConnection
http://docs.oracle.com/javase/tutorial/networking/urls/readingWriting.html

once you have got the content of that URL, then you can parse it using Xpath
http://www.rgagnon.com/javadetails/java-0550.html
http://htmlparser.sourceforge.net/
0
 
LVL 40

Expert Comment

by:gurvinder372
ID: 37751864
you may have to use jTidy to purify HTML
http://jtidy.sourceforge.net/
0
How our DevOps Teams Maximize Uptime

Our Dev teams are like yours. They’re continually cranking out code for new features/bugs fixes, testing, deploying, responding to production monitoring events and more. It’s complex. So, we thought you’d like to see what’s working for us. Read the use case whitepaper.

 

Author Comment

by:cofactor
ID: 37752007
@gurvinder372
>>>you can invoke that URL using HTTPURLConnection

Is not that involve a  loop ?  something like this ...

while(true)
{
//HTTPURLConnection ...get data from site
//parse and get  required data
  Thread.sleep(30 secs)
}


OR   there  is some other smart and better approach exists I'm missing here ?


I  know a good package which  is capable of doing this kind of work neatly.  
http://httpunit.sourceforge.net/doc/faq.html

But I'm not sure whether you'll be using a loop to hit server every 30 secs interval to fetch latest updates  or  there is  better  a way ?


Please post you  comments
0
 
LVL 40

Expert Comment

by:gurvinder372
ID: 37752211
smarter approach would be to use web-services exposed by that site to use their data.

for example, google services, or facebook services, or twitter services
0
 
LVL 47

Expert Comment

by:for_yan
ID: 37755883
This is a simple working example of implementation of getting RSS feed form
the list of sites

It uses ROME API, let me find the appropriate link

import java.io.*;
import java.net.URL;
import java.util.ArrayList;
import java.util.Date;
import java.util.Iterator;
import java.util.StringTokenizer;

import com.sun.syndication.feed.synd.SyndEntry;
import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.io.SyndFeedInput;
import com.sun.syndication.io.XmlReader;

public class Reader {

 static String [] urls =  {"http://rss.cnn.com/rss/cnn_topstories.rss", "http://www.nytimes.com/services/xml/rss/nyt/World.xml",
 "http://www.nytimes.com/services/xml/rss/nyt/US.xml","http://newsrack.in/crawled.feeds/frontline.rss.xml"
};
public static void main(String[] args) throws Exception {                              

//URL url = new URL("http://viralpatel.net/blogs/feed");

  //  URL url = new URL("http://rss.cnn.com/rss/cnn_topstories.rss");
XmlReader reader = null;

try {

    while(true) {

long now = (new java.util.Date()).getTime();

    System.out.println("checking at " + (new Date(now)).toString());    
long week_before = now - (24L*3600L*7L*1000L);


ArrayList list = new ArrayList();

DataInputStream in = new DataInputStream(new FileInputStream("C:\\temp\\test\\visited.txt"));
PrintStream psout = new PrintStream(new FileOutputStream("C:\\temp\\test\\visited1.txt"));

String buff = null;

while((buff=in.readLine()) != null){
StringTokenizer t = new StringTokenizer(buff);
String ttime = t.nextToken();
if(Long.parseLong(ttime) <week_before)continue;
String llink = t.nextToken().trim();
list.add(llink);
psout.println(ttime + " " + llink);

}

          for (int jj=0; jj<urls.length; jj++){
            URL  url = new URL(urls[jj]);
reader = new XmlReader(url);
SyndFeed feed = new SyndFeedInput().build(reader);
//System.out.println("Feed Title: "+ feed.getAuthor());

for (Iterator i = feed.getEntries().iterator(); i.hasNext();) {
SyndEntry entry = (SyndEntry) i.next();
    String title = entry.getTitle();
     String link = entry.getUri().trim();
     if(list.contains(link))continue;
     Date date = entry.getPublishedDate();
// Problem here -->         **     SyndEntry source = item.getSource();
     String description;
     if (entry.getDescription()== null){
       description = "";
     } else {
       description = entry.getDescription().getValue();
     }
     String cleanDescription = description.replaceAll("\\<.*?>","").replaceAll("\\s+", " ");
        System.out.println(title);
System.out.println(link);
    System.out.println(cleanDescription);



//System.out.println(entry.getTitle());
//System.out.println(entry.getContents());
    System.out.println("");
     System.out.println("");
     psout.println(now + " " + link);
}

}

in.close();
psout.close();
File f0 = new File("C:\\temp\\test\\visited.txt");
File f1 = new File("C:\\temp\\test\\visited1.txt");
f0.delete();
f1.renameTo(f0);

Thread.currentThread().sleep(600000);

    }


} finally {
if (reader != null)
reader.close();
}
}
}
                                            

Open in new window

0
 
LVL 47

Expert Comment

by:for_yan
ID: 37755889
0

Featured Post

Master Your Team's Linux and Cloud Stack

Come see why top tech companies like Mailchimp and Media Temple use Linux Academy to build their employee training programs.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

After being asked a question last year, I went into one of my moods where I did some research and code just for the fun and learning of it all.  Subsequently, from this journey, I put together this article on "Range Searching Using Visual Basic.NET …
Introduction This article is the first of three articles that explain why and how the Experts Exchange QA Team does test automation for our web site. This article explains our test automation goals. Then rationale is given for the tools we use to a…
Viewers will learn about the different types of variables in Java and how to declare them. Decide the type of variable desired: Put the keyword corresponding to the type of variable in front of the variable name: Use the equal sign to assign a v…
The viewer will learn how to implement Singleton Design Pattern in Java.

860 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question