binovpd
 asked on

Apache Solr best performance - pull all data as stored or index only and pull data from DB

Hi All,

We are planning to use Apache Solr (via Sunspot) for our website, which is built with Ruby on Rails. We already have Solr set up and working using the Sunspot Solr gem for Rails.

When we first implemented Solr, our assumption was that both indexing and data retrieval would go through Solr, but obviously that's not the case. Solr is intended to be an index server, not a database. That said, there is an option to store data in Solr so that the data is pulled from the Solr server and not directly from the DB. I've read various articles that say it's OK to store the data in Solr as well as index it.

The main reasons we decided to use Solr were to improve search speed, to take some load off the database, and to get the faceted search capability that Solr offers.

We are a bit confused about what the best practice is for achieving these goals, and specifically for gaining performance.

Should we:

1. Store all fields displayed in the search results in Solr?
2. Use a combination of both (some stored data from Solr and other data from the DB)?
3. Use Solr only for indexing and pull all data from the DB?
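
To make the difference between the options concrete, here is a minimal sketch of how they might look at the Solr query level, assuming a standard Solr `/select` handler. The host, core name (`products`) and field names are hypothetical and not taken from our setup. The `fl` parameter limits which stored fields Solr returns for each hit.

```ruby
require "uri"

# Hypothetical Solr endpoint and core name, for illustration only.
SOLR_SELECT = "http://localhost:8983/solr/products/select"

# Build a Solr select URL. `fl` lists the stored fields to return per hit.
def solr_select_url(query, fields:, rows: 20)
  params = {
    "q"    => query,
    "fl"   => fields.join(","),
    "rows" => rows,
    "wt"   => "json"
  }
  "#{SOLR_SELECT}?#{URI.encode_www_form(params)}"
end

# Option 1: all displayed fields are stored in Solr, so the results page
# can be rendered from the Solr response alone (field names are assumed).
stored_url = solr_select_url("name:widget", fields: %w[id name price image_url])

# Option 3: Solr returns only ids; the rows are then loaded from the DB.
ids_only_url = solr_select_url("name:widget", fields: %w[id])

puts stored_url
puts ids_only_url
```

Option 2 would sit in between: request the few stored fields needed for the result list and go to the DB only for whatever is not stored.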

Some additional info: our product database for the web search has been denormalized. This DB is used only for read-only purposes on the website. It has a little over 380,000 rows in total. We are on Rails 4.2, Ruby 2.2 and MySQL 5.5.41, so we are pretty current on the software side.
MySQL Server, Ruby, Apache Web Server

Last Comment
binovpd

8/22/2022 - Mon
ASKER CERTIFIED SOLUTION
gheist

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
binovpd

ASKER
Hi gheist,

So are you saying it's best practice to use Solr only for indexing and not to store data in Solr, but instead to pull the data from the DB? I wasn't clear on what you meant.
gheist

SOLR inevitably stores some data in its search index, but it is not the best memory cache out there for storing all data.
binovpd

ASKER
Thank you for the reply. After researching this more, I found that you can use SOLR as a datastore, but it depends on the application. SOLR works best if you have a denormalized database from which you are pulling the data. This is the case in our scenario. We update a "read-only table" for our products with most of the information that needs to be retrieved by our web application. We store those fields in SOLR, and reading them directly from SOLR is much faster than going through ActiveRecord (Ruby on Rails), hitting the DB and pulling the information.
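
As a rough illustration of reading stored fields straight from SOLR, the sketch below parses a response in the standard Solr JSON shape (`response` → `docs`) and builds display rows without touching the DB. The sample body and the field names (`name`, `price`) are made up for the example and are not our real schema.

```ruby
require "json"

# Hand-written sample in the shape of a Solr JSON response (wt=json).
# Field names and values are hypothetical.
SAMPLE_BODY = <<~JSON
  {"response": {"numFound": 2, "start": 0, "docs": [
    {"id": "101", "name": "Blue widget", "price": 9.99},
    {"id": "102", "name": "Red widget",  "price": 12.5}
  ]}}
JSON

# Turn the stored fields of each hit into a plain hash for the view layer.
# No ActiveRecord lookup happens here; everything comes from the response.
def result_rows(body)
  docs = JSON.parse(body).fetch("response").fetch("docs")
  docs.map do |doc|
    { id: doc["id"], name: doc["name"], price: doc["price"] }
  end
end

result_rows(SAMPLE_BODY).each do |row|
  puts "#{row[:id]}: #{row[:name]} (#{row[:price]})"
end
```

With Sunspot itself, the README describes declaring a field with `:stored => true` and reading it back with `hit.stored(:field_name)` on each search hit, which avoids the ActiveRecord load in a similar way; it is worth checking this against the gem version in use.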