Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium


How to build a parallel search engine in java?

Posted on 2011-04-23
Medium Priority
Last Modified: 2012-05-11

       I was trying to make a parallel search engine in java, using a parallelized depth first search algorithm, but I didn't know where to start... are there any example programs I could start off with?

Thanks in advance!
Question by:errang
LVL 47

Expert Comment

ID: 35454309

Isn't this something related to what you are looking for:

Author Comment

ID: 35454343
Yea, I ran across that, but I was looking for something more basic, that would explain how your thread is going to move through a set of nodes, when the nodes don't contain any information.
LVL 47

Assisted Solution

for_yan earned 664 total points
ID: 35454400

Well, if you google like that:

parallelized depth first search algorithm java

there seems to be a lot of information with more
explanations and less explanations.
Well, if you happen to encounter some specialists here,
maybe you'd get some better explanation.
I'm not sure that algorithms zone here is as active as java zone.
Alternatively, perhaps some graph algorithms forum would be
your better choice. After all, it would probably not  matter for you too much
if this is in Java or C++

LVL 33

Accepted Solution

Paul Sauvé earned 668 total points
ID: 35454707
In fact, a parallel search engine algorithm requires a good knowledge of artificial intelligence.

To find an algorithm for a specific problem, the nature of the the search will depend on the nature of data being searched - i.e. numerical or text, the length of the data field or fields that are being searched, data base or flat file, and so forth.

You have surely seen these already, here is a link to a scholarly paper that has a theoretical algorithm, A Scalable Distributed Parallel Breadth-First Search, which mentions:
for distributed parallel BFS algorithms where the computation moves to the processor owning the data.  Obviously, the scalability of the distributed BFS algorithm for very large graphs becomes a critical issue, since the demand for local memory and inter-processor communication increases as the graph size increases.

This article Parallel Search concerns parallel searches as they apply to chess programming and contains explainations for these algorithms:
Shared Hash Table
Parallel Alpha-Beta
Principal Variation Splitting (PVS)

LVL 28

Assisted Solution

dpearson earned 668 total points
ID: 35465594
Are you clear on how to write the single threaded depth first search algorithm?  You walk the nodes of the tree going steadily deeper until you either find the goal of the search or hit a leaf of the tree, then you back up.

I'd make sure you know how to code that first before you move to parallelizing it.  Let's assume you have a method that implements single threads depth first search like this:

public void implementDFS(Node root) { ... }

To make it run in parallel, one approach would be to spawn a thread for each child of the top node of the tree.  For a binary tree this gives you 2 threads - not a huge increase in parallelism, but pretty easy to see how to do this I hope:

public void simpleParallelSeach(Node root) {
     Node firstChild = root.getLeftChild() ;
     Node secondChild = root.getRightChild() ;

     // Not the best way to start a thread (not even sure it'll compile), but you get the idea...
     new Thread() { public void run() { implementDFS(firstChild) ; } }.start();
     new Thread() {  public void run() { implementDFS(secondChild) ; } }.start() ;

If you can get that working now you've got 2 threads, each doing half of the work.  Depending on the work (e.g. looking up some data) you may need to combine the results.  It's probably fine to just do this combining step in a single thread at the end, as long as the search happens in parallel.

Now imagine you wanted 4 threads instead of 2.  I hope you can see that you could use the same approach, but start the threads one level deeper in the tree (again still assuming it's binary).  You'd locate all 4 grandchildren of the root and start the DFS on each of those 4 nodes.  (If the tree isn't binary you can still use the same method but for n-children per node you'd get n^2 grandkids and n^2 threads).

If you can get that all working, then you can probably start to see how you could pass in the target number of threads as a parameter and have it start that many threads.  You generally want about 1-2 per core of the machine you're running on (e.g. on a quad core CPU you want 4-8 threads generally) for maximum performance.

I hope that gets your started.


Featured Post

Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Introduction This article is the first of three articles that explain why and how the Experts Exchange QA Team does test automation for our web site. This article explains our test automation goals. Then rationale is given for the tools we use to a…
Whether you’re a college noob or a soon-to-be pro, these tips are sure to help you in your journey to becoming a programming ninja and stand out from the crowd.
The goal of the video will be to teach the user the difference and consequence of passing data by value vs passing data by reference in C++. An example of passing data by value as well as an example of passing data by reference will be be given. Bot…
I've attached the XLSM Excel spreadsheet I used in the video and also text files containing the macros used below. https://filedb.experts-exchange.com/incoming/2017/03_w12/1151775/Permutations.txt https://filedb.experts-exchange.com/incoming/201…
Suggested Courses

580 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question