• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1237

Increasing my custom search engine's results?

Hi
I know that some searches made in Google can return millions of result items. In my Java custom search engine (CSE), I am at the point where it returns only 10 URLs per call, as stated in the documentation. How do I get a massive return on my calls? I'd like to process thousands!

Here is what works right now (appropriate setup done); it returns 10 URLs for the given search:

String searchString = "Michael Jordan";

List<Result> items = customsearch.cse().list(searchString).execute().getItems();

System.out.println("Search  for " + searchString + " , size = " + items.size());
StringArrayOfLinks = new String[items.size()];
int linkCount = 0;

for (Result item : items) {
    System.out.println(item.getTitle() + " (" + item.getLink() + ")");
    StringArrayOfLinks[linkCount] = item.getLink();
    linkCount++;
}

I'd like to be able to process way more than the 10 items returned from Google.
Right now, I get the same 10 links returned.  It should be a different 10 every time.
Ideas?
I'd like linkCount to be >1000, for sure.
thanks
Asked by: beavoid
1 Solution

nap0leon Commented:
If you want 10 different links every time the request runs... then you need to somehow create a bank of links for it to pick from.  What it is doing now is running the search and returning the top 10 items every time.  The only time you would see a difference in the search results is if that term's search results rankings have changed.
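A minimal sketch of such a "bank of links", assuming each API call hands back one page of link strings. The class name LinkBank and its methods are hypothetical and not part of the Custom Search client library; the idea is only to accumulate pages and drop duplicates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: accumulates unique links across many pages of results. */
class LinkBank {
    // LinkedHashSet drops duplicates while keeping first-seen order.
    private final Set<String> links = new LinkedHashSet<>();

    /** Adds one page of links and returns how many of them were new to the bank. */
    int addPage(List<String> pageLinks) {
        int added = 0;
        for (String link : pageLinks) {
            if (link != null && links.add(link)) {
                added++;
            }
        }
        return added;
    }

    /** Number of distinct links collected so far. */
    int size() {
        return links.size();
    }

    /** Snapshot of the collected links in first-seen order. */
    List<String> asList() {
        return new ArrayList<>(links);
    }

    public static void main(String[] args) {
        LinkBank bank = new LinkBank();
        // Placeholder URLs standing in for item.getLink() values from two calls.
        bank.addPage(Arrays.asList("http://a.example/", "http://b.example/"));
        bank.addPage(Arrays.asList("http://b.example/", "http://c.example/"));
        System.out.println(bank.size() + " unique links: " + bank.asList());
    }
}
```

In the question's loop, each item.getLink() value would go into the bank instead of a fixed-size array, so repeated calls only grow the collection when they return something new.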

mccarl (IT Business Systems Analyst / Software Developer) Commented:
"some searches made in Google can return millions of results items"
Well, not exactly!! They tell you that there are potentially millions of results available, but they only return ~10 results per page.

This brings me to another point... Probably the main reason why Google limits the number of results to 100 via the API... Ads! Google is a business and as such they need to make money to continue the service that they provide. When you search via the normal webpage, Google slip some ads in there and therefore they get paid some money by the company that the ad is for. When you do your search via the API, there is no such mechanism to get paid for their effort, and hence they limit the service and charge for higher usage of that service.

One thing I will clarify with you, since you have touched on it a couple of times now: are you interested in Google's estimate of the total number of search results (say, for ranking the popularity of a subject)? Because that IS something the API returns. Break out the line that executes the search and gets the results into two separate lines, and then you have access to the "total results" number, i.e.:
Search searchResult = customsearch.cse().list(searchString).execute();
System.out.println("About " + searchResult.getSearchInformation().getTotalResults() + " results available");
List<Result> items = searchResult.getItems();


beavoid (Author) Commented:
getTotalResults still returns 10.
I think that is the basic startup return count package
How does the retrieval count work? I read thousands are possible!
Thanks

mccarl (IT Business Systems Analyst / Software Developer) Commented:
When I search for the word "test" I get the following output...
About 341000000 results available
1:    Create Tests for Organizational Training and Certification Programs ... (http://www.test.com/)
2:    Speedtest.net - The Global Broadband Speed Test (http://www.speedtest.net/)
3:    Personality test based on C. Jung and I. Briggs Myers type theory (http://www.humanmetrics.com/cgi-win/jtypes2.asp)
4:    Speakeasy Speed Test (http://www.speakeasy.net/speedtest/)
5:    Test your IPv6. (http://test-ipv6.com/)
6:    Test - Wikipedia, the free encyclopedia (http://en.wikipedia.org/wiki/Test)
7:    The HTML5 test - How well does your browser support HTML5? (http://html5test.com/)
8:    Test cricket - Wikipedia, the free encyclopedia (http://en.wikipedia.org/wiki/Test_cricket)
9:    The Acid3 Test (http://acid3.acidtests.org/)
10:    Test (wrestler) - Wikipedia, the free encyclopedia (http://en.wikipedia.org/wiki/Test_(wrestler))


beavoid (Author) Commented:
Could you please attach this output's code file to a comment? Super
Thanks

mccarl (IT Business Systems Analyst / Software Developer) Commented:
Here is the code. You will need to set your own API key and cx values:
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.util.ArrayList;
import java.util.List;

import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
import com.google.api.client.json.jackson.JacksonFactory;
import com.google.api.services.customsearch.Customsearch;
import com.google.api.services.customsearch.Customsearch.Builder;
import com.google.api.services.customsearch.CustomsearchRequest;
import com.google.api.services.customsearch.CustomsearchRequestInitializer;
import com.google.api.services.customsearch.model.Result;
import com.google.api.services.customsearch.model.Search;

public class TestCustomSearchAPI {
    
    public static void main(String[] args) throws GeneralSecurityException, IOException {
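        List<Result> items = new ArrayList<Result>();
        // "start" is the 1-based index of the first result on each 10-item page.
        // As written, this loop runs once (start = 1) and fetches a single page;
        // raising the bound to "i <= 91" would request ten pages (up to 100 results).
        for (long i = 1; i <= 10; i += 10) {
            items.addAll(executeSearch("test", i));
        }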
        
        int i = 1;
        for (Result item : items) {
            System.out.println(i++ + ":    " + item.getTitle() + " (" + item.getLink() + ")");
        }
    }

    private static List<Result> executeSearch(String searchTerm, final Long start) throws GeneralSecurityException, IOException {
        Builder builder = new Customsearch.Builder(GoogleNetHttpTransport.newTrustedTransport(), new JacksonFactory(), null);
        builder.setApplicationName("Search Test");
        builder.setCustomsearchRequestInitializer(new CustomsearchRequestInitializer() {
            @Override
            protected void initializeCustomsearchRequest(CustomsearchRequest<?> request) throws IOException {
                request.setKey("###########");
                request.set("cx", "%%%%%%%%%%%%");
                request.set("start", start);
            }
        });
        Customsearch customsearch = builder.build();
        Search searchResult = customsearch.cse().list(searchTerm).execute();
        System.out.println("About " + searchResult.getSearchInformation().getTotalResults() + " results available");
        List<Result> items = searchResult.getItems();
        return items;
    }
}
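
The paging idea behind the "start" parameter in the listing above can be sketched on its own. This is only an illustration: the class PageOffsets and its constants are hypothetical names, and the figures of 10 results per call and 100 results overall are the ones discussed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: computes the "start" values needed to page through results. */
class PageOffsets {
    static final int PAGE_SIZE = 10;    // results returned per call, per this thread
    static final int MAX_RESULTS = 100; // overall cap via the API, per this thread

    /** Returns the 1-based start index of every page needed for 'wanted' results. */
    static List<Long> startOffsets(int wanted) {
        // Clamp the request to the range the API is said to serve.
        int capped = Math.max(0, Math.min(wanted, MAX_RESULTS));
        List<Long> starts = new ArrayList<>();
        for (long start = 1; start <= capped; start += PAGE_SIZE) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // Asking for 1000 results still yields only the ten pages within the cap.
        System.out.println(startOffsets(1000));
    }
}
```

Each value returned would be passed as the second argument to executeSearch, one call per page.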


beavoid (Author) Commented:
Thanks,

"About 340000000 results available"
Not too shabby.

Pity that I only get 10 results per search.

This page discusses bigger returns.
But it is always ten links returned, no matter what? That probably isn't the worst thing in the world, really. Just curious.

here

Thanks

mccarl (IT Business Systems Analyst / Software Developer) Commented:
"This page discusses bigger returns."
That page refers to the number of search requests, not the number of results that are achievable.

The problem is that Google don't provide an API for their plain old vanilla "Search" service. This is a "Custom Search Engine", whose intended purpose is to provide search facilities over a site (or a set of sites), and those sites would have X total pages. So returning more than 100 page results is probably not that important when X might not be much more than 100. But the way we are setting the CSE up is non-standard, which is why it isn't really geared to return large numbers of results.
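
To make the difference between request counts and result counts concrete, here is a rough arithmetic sketch. It assumes the figures mentioned in this thread (10 results per request, 100 results at most per query); the class QuotaMath and the quota value used in main are hypothetical and only there for illustration.

```java
/** Hypothetical helper: relates a request quota to the number of links obtainable. */
class QuotaMath {
    static final int RESULTS_PER_REQUEST = 10;    // per this thread
    static final int MAX_RESULTS_PER_QUERY = 100; // per this thread

    /** Requests needed to fetch 'wantedResults' for one query, after applying the cap. */
    static int requestsPerQuery(int wantedResults) {
        int capped = Math.max(0, Math.min(wantedResults, MAX_RESULTS_PER_QUERY));
        // Integer ceiling division: a partial page still costs a whole request.
        return (capped + RESULTS_PER_REQUEST - 1) / RESULTS_PER_REQUEST;
    }

    /** Upper bound on links per day if every request returns a full page. */
    static long maxLinksPerDay(long dailyRequestQuota) {
        return dailyRequestQuota * RESULTS_PER_REQUEST;
    }

    /** How many distinct queries can be paged all the way to the cap in one day. */
    static long fullyPagedQueriesPerDay(long dailyRequestQuota) {
        return dailyRequestQuota / requestsPerQuery(MAX_RESULTS_PER_QUERY);
    }

    public static void main(String[] args) {
        long quota = 1000; // assumed daily request quota, purely illustrative
        System.out.println("Requests for one fully paged query: " + requestsPerQuery(1000));
        System.out.println("Max links per day: " + maxLinksPerDay(quota));
        System.out.println("Fully paged queries per day: " + fullyPagedQueriesPerDay(quota));
    }
}
```

A bigger request quota therefore buys more distinct queries, not more results for any single query.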

beavoid (Author) Commented:
Thanks,
I'm still seeing only 10
What else can we fiddle with?

beavoid (Author) Commented:
Thanks
I'm interested in getting the most links possible returned by my code, even if they arrive in a very large number of batches of 10 links, to get close to the result counts they claim to have found.

I don't see a way around this. I have signed up for Google's billable search service to have a look. They claim to bill me only if I exceed a threshold, which I have not, but it still returns only the 10 links. Is there a place where I can stipulate massive returns? Might it return a new list of 10 links on successive calls, or is it always the best 10? I seem to remember seeing a results-count text entry somewhere on the panel. Or did you mention a way to still get many different replies in the old system?
Thanks

beavoid (Author) Commented:
What is the final piece of the puzzle to get large numbers?

thx

mccarl (IT Business Systems Analyst / Software Developer) Commented:
I think I have well and truly answered this enough times already. Here's one more.. it CAN'T be done! :) I thought you were getting somewhere with using Bing though?

beavoid (Author) Commented:
I think so, thanks for making me see that. I'll ask another question, just to test you all :)
Bing looks very promising. Their API is very straightforward; so is Google's, but I was impressed by Bing.
I have a question for them at Stack Exchange, because Bing had a control panel page where you could enter the required link count per response, and I can't find that panel. I entered 1 so I could get something working first. I want to try 100 or more. They expect payment for very large returns, not surprisingly. I'm allowed 5,000 free searches a day, so 500 * 100 found pages a day will keep me happy.

Thanks