binovpd

asked on

Apache Solr best performance - pull all data as stored or index only and pull data from DB

Hi All,

We are planning to use Apache Solr (via Sunspot) for our website, which is built with Ruby on Rails. We already have Solr set up and working through the Sunspot Solr gem for Rails.

When we first implemented Solr, our assumption was that both searching and data retrieval would go through Solr, but that's not necessarily the case. Solr is intended to be an index server, not a database. That said, there is an option to store field values in Solr so that the data displayed comes from the Solr server and not directly from the DB. I've read various articles that say it's OK to store the data in Solr as well as index it.

The main reasons we decided to use Solr were to improve search speed, to take some load off the database, and to get the faceted search capability that Solr offers.

We are a bit confused about what the best practice would be to achieve these goals, and more specifically to gain performance.

Should we:

1. Store all fields displayed in the search results in Solr?
2. Use a combination of both (some stored data from Solr and other data from the DB)?
3. Only use Solr for indexing and pull all data from the DB?
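
At the Solr HTTP level, the difference between options 1 and 3 comes down to which stored fields the query asks for in the `fl` parameter. The following is a minimal sketch, not our actual setup: the core URL and the field names (`id`, `name`, `price`, `category`) are hypothetical, and Sunspot normally generates its own suffixed field names.

```ruby
require "uri"

# Hypothetical Solr core URL, for illustration only.
SOLR_SELECT = "http://localhost:8983/solr/products/select"

# Build a Solr select URL. `fields` becomes the `fl` parameter, which limits
# the response to the listed stored fields.
def solr_select_url(query, fields:, facet_field: nil, rows: 20)
  params = {
    "q"    => query,
    "fl"   => fields.join(","),
    "rows" => rows,
    "wt"   => "json"
  }
  if facet_field
    params["facet"] = "true"
    params["facet.field"] = facet_field
  end
  "#{SOLR_SELECT}?#{URI.encode_www_form(params)}"
end

# Option 3: ask Solr for ids only, then load the rows from MySQL by id.
ids_only_url = solr_select_url("name:lamp", fields: ["id"])

# Option 1: ask Solr for every stored field the results page displays,
# plus facet counts, so no DB query is needed to render the page.
stored_url = solr_select_url("name:lamp",
                             fields: %w[id name price],
                             facet_field: "category")

puts ids_only_url
puts stored_url
```

Option 2 would sit in between: request the stored fields needed for the result list and go to the DB only for the remaining columns.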

Some additional info: our product database for the web search has been denormalized. This DB is used only for read-only purposes on the website and holds a little over 380,000 rows. We are on Rails 4.2, Ruby 2.2, and MySQL 5.5.41, so we are pretty current on the software side.
ASKER CERTIFIED SOLUTION
gheist

binovpd

ASKER

Hi gheist,

So are you saying it's best practice to use Solr only for indexing and not to store data in Solr, but instead pull the data from the DB? I wasn't clear on what you meant.
Solr inevitably stores some data in its search index, but it is not the best memory cache out there for storing all of your data.
binovpd

ASKER

Thank you for the reply. After researching this more, I found that you can use Solr as a datastore, but it depends on the application. Solr works best when the data comes from a denormalized database, which is the case in our scenario. We update a "read-only table" for our products with most of the information that our web application needs to retrieve. We store those fields in Solr, and reading them directly from Solr is much faster than going through ActiveRecord (Ruby on Rails), hitting the DB, and pulling the information.
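
To illustrate what reading "via Solr direct" looks like, here is a rough sketch that renders results and facet counts straight from a Solr JSON response, without touching ActiveRecord. The sample response, its field names, and the helper `parse_solr_response` are made up for the example. With Sunspot, the equivalent would presumably be declaring fields with `stored: true` and reading them through `hit.stored(:field)` on `search.hits` rather than calling `search.results`.

```ruby
require "json"

# Hypothetical Solr JSON response (default wt=json layout); the documents
# and counts are invented for illustration.
SAMPLE_RESPONSE = <<JSON
{
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {"id": "101", "name": "Desk lamp", "price": 24.5},
      {"id": "102", "name": "Floor lamp", "price": 79.0}
    ]
  },
  "facet_counts": {
    "facet_fields": {"category": ["lighting", 2, "furniture", 0]}
  }
}
JSON

# Extract the total hit count, the stored-field documents, and the facet
# counts. Solr returns each facet as a flat [value, count, value, count, ...]
# array by default, so pairs are regrouped into a hash.
def parse_solr_response(body)
  data = JSON.parse(body)
  facets = {}
  data.fetch("facet_counts", {}).fetch("facet_fields", {}).each do |field, flat|
    facets[field] = flat.each_slice(2).to_h
  end
  {
    total:  data.fetch("response").fetch("numFound"),
    docs:   data.fetch("response").fetch("docs"),
    facets: facets
  }
end

result = parse_solr_response(SAMPLE_RESPONSE)

# Everything needed for the results page comes from stored fields.
result[:docs].each { |doc| puts "#{doc['name']} (#{doc['price']})" }
```

The trade-off is that anything not stored in Solr (or stale in the index) still requires a DB lookup, so this works best when the index is rebuilt whenever the read-only table changes.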