Increasing my custom search engine's results?

Hi
I know that some Google searches can return millions of result items. My Java client for Google's Custom Search Engine (CSE) is at the point where it returns only 10 URLs per call, as the documentation states. How do I get a much larger return from my calls? I'd like to process thousands!

Here is what works right now (the appropriate setup is done). It returns 10 URLs for the given search:

String searchString = "Michael Jordan";

// Runs the search and takes the first page of results (10 items)
List<Result> items = customsearch.cse().list(searchString).execute().getItems();

System.out.println("Search  for " + searchString + " , size = " + items.size());
StringArrayOfLinks = new String[items.size()];
int linkCount = 0;

// Print each result and store its link
for (Result item : items) {
    System.out.println(item.getTitle() + " (" + item.getLink() + ")");
    StringArrayOfLinks[linkCount] = item.getLink();
    linkCount++;
}

I'd like to be able to process way more than the 10 items returned from Google.
Right now, I get the same 10 links returned.  It should be a different 10 every time.
Ideas?
I'd like linkCount to be >1000, for sure.
thanks
Asked by: beavoid
 
mccarl (IT Business Systems Analyst / Software Developer) commented:
I think I have well and truly answered this enough times already. Here's one more.. it CAN'T be done! :) I thought you were getting somewhere with using Bing though?
 
nap0leon commented:
If you want 10 different links every time the request runs... then you need to somehow create a bank of links for it to pick from.  What it is doing now is running the search and returning the top 10 items every time.  The only time you would see a difference in the search results is if that term's search results rankings have changed.
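The "bank of links" idea above could be sketched roughly as follows. This is only an illustration, not part of the Custom Search API: the LinkBank class and its methods are hypothetical names, and it assumes you have already collected links (from however many searches) as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical helper: keeps a pool ("bank") of links and hands out random samples.
class LinkBank {
    private final Set<String> links = new LinkedHashSet<>(); // a set, so duplicate links are ignored
    private final Random random;

    LinkBank(Random random) {
        this.random = random;
    }

    // Adds links gathered from one or more searches to the bank.
    void addAll(List<String> newLinks) {
        links.addAll(newLinks);
    }

    int size() {
        return links.size();
    }

    // Returns up to 'count' distinct links from the bank, in random order.
    List<String> pick(int count) {
        List<String> shuffled = new ArrayList<>(links);
        Collections.shuffle(shuffled, random);
        return new ArrayList<>(shuffled.subList(0, Math.min(count, shuffled.size())));
    }

    public static void main(String[] args) {
        LinkBank bank = new LinkBank(new Random());
        bank.addAll(Arrays.asList("http://a.example/", "http://b.example/", "http://c.example/"));
        System.out.println(bank.pick(2)); // two of the three links, order varies per run
    }
}
```

Each call to pick() can return a different selection, but only from links that are already in the bank. It does not make the search itself return anything new.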
 
mccarl (IT Business Systems Analyst / Software Developer) commented:
"some searches made in Google can return millions of results items"
Well, not exactly!! They tell you that there are potentially millions of results available, but they only return ~10 results per page.

This brings me to another point: probably the main reason why Google limits the number of results to 100 via the API is ads! Google is a business, and as such they need to make money to continue the service that they provide. When you search via the normal webpage, Google slips some ads in there and gets paid by the company that the ad is for. When you search via the API, there is no such mechanism for them to get paid for their effort, so they limit the service and charge for higher usage of it.

One thing I'd like to clarify, since you have touched on it a couple of times now: are you interested in finding out Google's estimate of the total number of search results (say, for ranking the popularity of a subject or something)? Because that IS something that the API returns. Break out the line that executes the search and gets the results into two separate lines, and you then have access to the "total results" number, i.e.
Search searchResult = customsearch.cse().list(searchString).execute();
System.out.println("About " + searchResult.getSearchInformation().getTotalResults() + " results available");
List<Result> items = searchResult.getItems();

 
beavoid (Author) commented:
getTotalResults still returns 10.
I think that is the basic startup return count package
How does the retrieval count work? I read thousands are possible!
Thanks
 
mccarl (IT Business Systems Analyst / Software Developer) commented:
When I search for the word "test" I get the following output...
About 341000000 results available
1:    Create Tests for Organizational Training and Certification Programs ... (http://www.test.com/)
2:    Speedtest.net - The Global Broadband Speed Test (http://www.speedtest.net/)
3:    Personality test based on C. Jung and I. Briggs Myers type theory (http://www.humanmetrics.com/cgi-win/jtypes2.asp)
4:    Speakeasy Speed Test (http://www.speakeasy.net/speedtest/)
5:    Test your IPv6. (http://test-ipv6.com/)
6:    Test - Wikipedia, the free encyclopedia (http://en.wikipedia.org/wiki/Test)
7:    The HTML5 test - How well does your browser support HTML5? (http://html5test.com/)
8:    Test cricket - Wikipedia, the free encyclopedia (http://en.wikipedia.org/wiki/Test_cricket)
9:    The Acid3 Test (http://acid3.acidtests.org/)
10:    Test (wrestler) - Wikipedia, the free encyclopedia (http://en.wikipedia.org/wiki/Test_(wrestler))

 
beavoid (Author) commented:
Could you please attach this output's code file to a comment? Super
Thanks
 
mccarl (IT Business Systems Analyst / Software Developer) commented:
Here is the code. You will need to set your own api key and cx values...
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.util.ArrayList;
import java.util.List;

import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
import com.google.api.client.json.jackson.JacksonFactory;
import com.google.api.services.customsearch.Customsearch;
import com.google.api.services.customsearch.Customsearch.Builder;
import com.google.api.services.customsearch.CustomsearchRequest;
import com.google.api.services.customsearch.CustomsearchRequestInitializer;
import com.google.api.services.customsearch.model.Result;
import com.google.api.services.customsearch.model.Search;

public class TestCustomSearchAPI {
    
    public static void main(String[] args) throws GeneralSecurityException, IOException {
        List<Result> items = new ArrayList<Result>();
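        // Each call fetches one page of 10 results; 'i' is the 1-based index of the page's first result.
        // With the bound at 10 this loop runs exactly once. Raising the bound to 91 should page
        // through up to 100 results (start = 1, 11, ..., 91), which is the most the API will give.
        for (long i = 1; i <= 10; i += 10) {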
            items.addAll(executeSearch("test", i));
        }
        
        int i = 1;
        for (Result item : items) {
            System.out.println(i++ + ":    " + item.getTitle() + " (" + item.getLink() + ")");
        }
    }

    private static List<Result> executeSearch(String searchTerm, final Long start) throws GeneralSecurityException, IOException {
        Builder builder = new Customsearch.Builder(GoogleNetHttpTransport.newTrustedTransport(), new JacksonFactory(), null);
        builder.setApplicationName("Search Test");
        builder.setCustomsearchRequestInitializer(new CustomsearchRequestInitializer() {
            @Override
            protected void initializeCustomsearchRequest(CustomsearchRequest<?> request) throws IOException {
                request.setKey("###########");
                request.set("cx", "%%%%%%%%%%%%");
                request.set("start", start);
            }
        });
        Customsearch customsearch = builder.build();
        Search searchResult = customsearch.cse().list(searchTerm).execute();
        System.out.println("About " + searchResult.getSearchInformation().getTotalResults() + " results available");
        List<Result> items = searchResult.getItems();
        return items;
    }
}
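To make the paging arithmetic behind the "start" parameter explicit, here is a minimal sketch that only computes which start values you would request. It makes no API calls. StartOffsets and compute are hypothetical names, and the cap of 100 is taken from the limit mentioned earlier in this thread rather than checked against the current documentation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: works out the 'start' values for a paged search.
class StartOffsets {
    // Returns the 1-based 'start' values needed to fetch 'wanted' results,
    // 'pageSize' per request, never reaching past 'cap' results in total.
    static List<Long> compute(int wanted, int pageSize, int cap) {
        List<Long> starts = new ArrayList<>();
        int limit = Math.min(wanted, cap);
        for (long start = 1; start <= limit; start += pageSize) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // Asking for 1000 results with an assumed cap of 100 still yields only ten requests.
        System.out.println(compute(1000, 10, 100));
        // prints [1, 11, 21, 31, 41, 51, 61, 71, 81, 91]
    }
}
```

Each value in the list would be passed as the second argument to executeSearch above, so asking for "thousands" still ends at the tenth page under that assumed cap.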

 
beavoid (Author) commented:
Thanks,

"About 340000000 results available"
Not too shabby.

Pity that I only get 10 results per search.

This page discusses bigger returns.
But it is always ten links returned, no matter what? That probably isn't the worst thing in the world, really. Just curious.

here

Thanks
 
mccarl (IT Business Systems Analyst / Software Developer) commented:
"This page discusses bigger returns."
That page refers to the number of search requests, not the number of results that are achievable.

The problem is that Google doesn't provide an API for their plain old vanilla "Search" service. This is a "Custom Search Engine", whose intended purpose is to provide search facilities over a site (or a set of sites), and those sites would have X total pages. So returning more than 100 page results is probably not that important when X might not be much more than 100. The way that we are setting the CSE up is non-standard, which is why it isn't really catered for returning large numbers of results.
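For collecting everything that is reachable under that limit, a loop along these lines might do. It is a sketch under assumptions: PagedCollector is a hypothetical name, the page fetcher is passed in as a function so the example runs without an API key, and the cap of 100 is the figure quoted in this thread. In real use the fetcher would wrap the executeSearch method posted earlier and map each Result to item.getLink().

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.LongFunction;

// Hypothetical helper: gathers links page by page until the cap or the results run out.
class PagedCollector {
    // 'fetchPage' takes a 1-based start index and returns that page's links.
    static List<String> collect(LongFunction<List<String>> fetchPage, int pageSize, int cap) {
        Set<String> links = new LinkedHashSet<>(); // keeps order, drops repeated links
        for (long start = 1; start <= cap; start += pageSize) {
            List<String> page = fetchPage.apply(start);
            if (page == null || page.isEmpty()) {
                break; // precaution: treat a missing or empty page as "no more results"
            }
            links.addAll(page);
        }
        return new ArrayList<>(links);
    }

    public static void main(String[] args) {
        // Stand-in fetcher that pretends only 25 results exist in total.
        LongFunction<List<String>> fake = start -> {
            List<String> page = new ArrayList<>();
            for (long i = start; i < start + 10 && i <= 25; i++) {
                page.add("http://example.com/" + i);
            }
            return page;
        };
        System.out.println(collect(fake, 10, 100).size()); // prints 25
    }
}
```

The loop stops early when a page comes back empty, so a search with fewer than 100 hits does not waste requests from the daily quota.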
 
beavoid (Author) commented:
Thanks,
I'm still seeing only 10
What else can we fiddle with?
 
beavoid (Author) commented:
Thanks
I'm interested in getting the most links possible returned by my code, even if they arrive as a very large number of batches of 10 links each, to get close to the result counts Google claims to have found.

I don't see a way around this. I have signed up for Google's billable search service to have a look. They claim to bill me only if I pass a threshold, which I have not passed, but it still returns only the 10 links. Is there a place where I can stipulate massive returns? Might it return a new list of 10 links on successive calls, or is it always the best 10? I seem to remember seeing a results-count text entry somewhere on the panel. Or did you mention a way to still get many different replies in the old system?
Thanks
 
beavoid (Author) commented:
What is the final piece of the puzzle to get large numbers?

thx
 
beavoid (Author) commented:
I think so, thanks for making me see that. I'll ask another question, just to test you all :)
Bing looks very promising. Their API is very straightforward, so is Google's, but I was impressed by Bing.
I have a question for them at Stack Exchange, because Bing had a control panel page where you could enter the required link count per response, and I can't find that panel. I entered 1, so I could get something working first. I want to try 100 or more. They expect payment for very large returns, not surprisingly. I'm allowed 5,000 free searches a day, so 500 * 100 found pages a day will keep me happy.

Thanks
Question has a verified solution.
