Atomically updating documents in Solr using SolrJ

As of Solr 4.0, it is possible to atomically (or partially) update individual fields in a document. This article shows the operations available for atomic updating and how to set up your Solr instance to support them. One major motivation for atomic updating is being able to change part of a document without regenerating the entire document. If your documents are built from many different data sources, and fetching that data is expensive, atomic updating might be worth looking into.
 

Getting started

First, you must be using Solr 4.0 or later; older versions do not support atomic updates. Second, all fields in your schema.xml file must be set to stored (the one exception is copyField destinations, which should remain unstored). So if your schema file looked like this:

<field name="id" type="number" indexed="true" stored="true" required="true" />
<field name="title" type="text_en" indexed="true" stored="false"/>
<field name="submit_date" type="date" indexed="true" stored="false" />
<field name="views" type="number" indexed="true" stored="false" />


all stored="false" attributes must be changed to stored="true" so that your schema file looks like this:

<field name="id" type="number" indexed="true" stored="true" required="true" />
<field name="title" type="text_en" indexed="true" stored="true"/>
<field name="submit_date" type="date" indexed="true" stored="true" />
<field name="views" type="number" indexed="true" stored="true" />


With the schema configured this way, atomic updating should work out of the box. If it does not, the caveats and limitations section of the Solr wiki may have other helpful information.
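Before relying on atomic updates, it can help to confirm that no field was missed. The following is a minimal sketch, using only the JDK's XML parser, that lists the fields in a schema fragment that are not marked stored="true". The class and method names are hypothetical, and the check is a simplification: it treats a missing stored attribute as unstored, even though the real default can come from the field type, and it does not account for copyField destinations.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Solr or SolrJ.
class StoredFieldCheck {

    // Returns the names of all <field> elements whose stored attribute is not "true".
    // Assumption of this sketch: a missing attribute counts as unstored.
    static List<String> findUnstoredFields(String schemaXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            schemaXml.getBytes(StandardCharsets.UTF_8)));
            NodeList fields = doc.getElementsByTagName("field");
            List<String> unstored = new ArrayList<>();
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                if (!"true".equals(field.getAttribute("stored"))) {
                    unstored.add(field.getAttribute("name"));
                }
            }
            return unstored;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse schema XML", e);
        }
    }

    public static void main(String[] args) {
        // The "before" schema from this article, wrapped in a root element.
        String schema = "<fields>"
                + "<field name=\"id\" type=\"number\" indexed=\"true\" stored=\"true\" required=\"true\" />"
                + "<field name=\"title\" type=\"text_en\" indexed=\"true\" stored=\"false\"/>"
                + "<field name=\"submit_date\" type=\"date\" indexed=\"true\" stored=\"false\" />"
                + "<field name=\"views\" type=\"number\" indexed=\"true\" stored=\"false\" />"
                + "</fields>";
        System.out.println(findUnstoredFields(schema)); // prints [title, submit_date, views]
    }
}
```

An empty list means every field in the fragment is explicitly stored.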
 


Atomically updating fields in SolrJ

Once your instance of Solr is up and running and configured correctly, you can atomically update a document using SolrJ. When atomically updating a field, three modifiers are available, which cover four actions because set doubles as remove:

  • set - sets a value, or removes the field if null is used as the value (two actions in one modifier).
  • add - adds an additional value to a multi-valued field.
  • inc - increments a number field by a certain value, or decrements it when a negative value is used.
Below is an example of a method that increments a field in a document by a given number, using the document's ID to identify it.

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public void incrementDocumentByValue(String solrServerUrl, String documentId,
   String fieldToUpdate, int incrementValue)
   throws SolrServerException, IOException
{
   // create the SolrJ client
   HttpSolrServer client = new HttpSolrServer(solrServerUrl);

   try {
      // create a new document
      SolrInputDocument document = new SolrInputDocument();

      // create a map of the operation to be performed
      Map<String, Object> operation = new HashMap<>();

      // insert the operation into the map
      operation.put("inc", incrementValue);

      // unique identifier of the document to update
      document.addField("id", documentId);

      // add the atomic update
      document.addField(fieldToUpdate, operation);

      // send it to the Solr server; the change becomes
      // searchable after the next commit
      client.add(document);
   } finally {
      // always release the client's resources
      client.shutdown();
   }
}


The important part of the code above is the operation variable, which is a java.util.Map where the key is the operation to be performed and the value is the amount you want to increment the document's field by. So if we wanted to increase the views of a document by 5, the code might look like:

incrementDocumentByValue("http://localhost:8983/solr", "unique-id-0001", "views", 5);
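The set and add operations use the same one-entry map shape as inc. To make the semantics of each operation concrete without needing a running Solr server, here is a small in-memory sketch. It is an illustrative model only, not how Solr implements atomic updates: the class, the apply method, and the "tags" field are hypothetical, and treating a missing field as 0 for inc is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the set / add / inc operations.
class AtomicUpdateModel {

    // Builds a one-entry map such as {"inc": 5}, the same shape as the
    // operation map passed to SolrJ. A null value is allowed (used by set).
    static Map<String, Object> modifier(String operation, Object value) {
        return Collections.singletonMap(operation, value);
    }

    // Applies a modifier to a plain map standing in for the stored document.
    static void apply(Map<String, Object> storedDoc, String field,
                      Map<String, Object> modifier) {
        for (Map.Entry<String, Object> op : modifier.entrySet()) {
            Object value = op.getValue();
            switch (op.getKey()) {
                case "set":
                    // set with null removes the field, otherwise replaces it
                    if (value == null) {
                        storedDoc.remove(field);
                    } else {
                        storedDoc.put(field, value);
                    }
                    break;
                case "add":
                    // add appends to the existing values of a multi-valued field
                    List<Object> values = new ArrayList<>();
                    Object current = storedDoc.get(field);
                    if (current instanceof List) {
                        values.addAll((List<?>) current);
                    } else if (current != null) {
                        values.add(current);
                    }
                    values.add(value);
                    storedDoc.put(field, values);
                    break;
                case "inc":
                    // inc adds the (possibly negative) amount to the number;
                    // a missing field is treated as 0 in this sketch
                    Number existing = (Number) storedDoc.get(field);
                    long base = existing == null ? 0L : existing.longValue();
                    storedDoc.put(field, base + ((Number) value).longValue());
                    break;
                default:
                    throw new IllegalArgumentException(
                            "Unknown operation: " + op.getKey());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("id", "unique-id-0001");
        doc.put("title", "Old title");
        doc.put("views", 10);

        apply(doc, "views", modifier("inc", 5));           // views becomes 15
        apply(doc, "title", modifier("set", "New title")); // title replaced
        apply(doc, "tags", modifier("add", "solr"));       // tags becomes [solr]
        apply(doc, "title", modifier("set", null));        // title removed
        System.out.println(doc);
    }
}
```

With SolrJ, the map returned by modifier would be attached to a SolrInputDocument field in place of the inc map shown in the method above.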


 

Cost considerations

Being able to atomically update fields is not free: it comes with a cost in storage and perhaps in performance. Setting all fields to be stored will increase the size of the index. In my experience, the increase can be anywhere from 5% to 50%, depending on the data stored in those fields. If you have a large index of 10 gigabytes, expect your atomically updatable index to be roughly 10.5 to 15 gigabytes in size.
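For planning purposes, the growth range above can be applied to any index size. This is a back-of-the-envelope sketch only: the 5% and 50% bounds are the anecdotal figures from this article, not a guarantee, and the class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper for estimating index growth.
class IndexSizeEstimate {

    // Returns the index size in gigabytes after growing by growthPercent.
    static double grownSizeGb(double currentSizeGb, double growthPercent) {
        return currentSizeGb * (1.0 + growthPercent / 100.0);
    }

    public static void main(String[] args) {
        double currentSizeGb = 10.0;
        // prints "10.5 GB to 15.0 GB"
        System.out.printf(Locale.ROOT, "%.1f GB to %.1f GB%n",
                grownSizeGb(currentSizeGb, 5),
                grownSizeGb(currentSizeGb, 50));
    }
}
```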

As for query time, it would make sense that a larger index means more work for Solr when searching, but in my experience the query time is not far off from that of a regular index where most of the fields aren't stored. In the Solr Admin query interface, queries would sometimes return more quickly on the atomically updatable index than on the regular index. I tried looking for research or other people's experience with the performance of an atomically updatable index, but I wasn't able to find anything, so if you run into anything please let me know.

Atomically updating documents can be useful in the right situation. Setting up your index to be compatible with atomic updating is relatively simple, but the index will be larger than one that is not compatible with atomic updating. With SolrJ, updating individual fields of a document takes only a few lines of code.