<

Using Coldfusion With SOLR

Published on
8,177 Points
2,177 Views
Last Modified:
Approved
Sometimes databases have MILLIONS of records and we need a way to quickly query that table to return the results me need. Sure you could use CFQUERY but it takes too long when there are millions of records. That is why SOLR was invented.

Please note, this article was intended for those that have some idea of what Coldfusion and SOLR are, but lack the experience on how to use them together.

I will be writing a guide soon on how to install SOLR 4.3 with Coldfusion. While Coldfusion comes with SOLR 1.5 built in, 4.3 has much better capabilities.

First, we need to declare our server host, port, and version.

<cfset SOLR_SERVER_HOST = 'localhost'>
<cfset SOLR_SERVER_PORT = '8983'>
<cfset SOLR_SERVER_VERSION = '4.3'>

Open in new window


I also like to declare a global variable for each collection name.

<cfset endUserCollection = "end_user">

Open in new window


I created the function below called SOLR_SEARCH.

<cffunction name="SOLR_SEARCH" returnType="struct" output="yes">

  <cfargument name="collection" type="string" required="true" default="">
  <cfargument name="criteria" type="string" required="false" default="*:*">
  <cfargument name="startrow" type="string" required="false" default="0">
  <cfargument name="maxrows" type="string" required="false" default="10">
  <cfargument name="sort" type="string" required="false" default="">
  <cfargument name="columns" type="string" required="false" default="category,custom1,custom2,custom3,custom4,key,score,size,summary,title,spell">
  <cfargument name="facet" type="string" required="false" default="">
  <cfargument name="timeout" type="string" required="false" default="10">
  <cfargument name="timeAllowed" type="string" required="false" default="10">
  <cfargument name="filterQuery" type="string" required="false" default="">
  <cfargument name="statsField" type="string" required="false" default="">
  
  <cfset returnStruct = StructNew()>
  
  <cfif NOT IsDefined("timeout") OR NOT IsNumeric(timeout) OR #timeout# LT 1>
    <cfset timeout = 10>
  </cfif>

  <cfif NOT IsDefined("timeAllowed") OR NOT IsNumeric(timeAllowed) OR #timeAllowed# LT 1>
    <cfset timeAllowed = 10>
  </cfif>

  <cfif NOT IsDefined("startrow") OR NOT IsNumeric(startrow) OR #startrow# LT 0>
    <cfset startrow = 0>
  </cfif>

  <cfif NOT IsDefined("maxrows") OR NOT IsNumeric(maxrows) OR #maxrows# LT 1>
    <cfset maxrows = 1>
  </cfif>

  <cfif NOT IsDefined("criteria")>
    <cfset criteria = "*:*">
  </cfif>

  <cfset myURL = 'http://#SOLR_SERVER_HOST#:#SOLR_SERVER_PORT#/solr/#collection#/select/'>
  <cfset myURL = '#myURL#?q=#criteria#&version=#SOLR_SERVER_VERSION#&start=#startrow#&rows=#maxrows#&indent=off'>
  
  <cfif IsDefined("statsField") AND #statsField# NEQ "">
    <cfset myURL = '#myURL#&stats=true&stats.field=#statsField#'>
  </cfif>

  <cfif IsDefined("filterQuery") AND #filterQuery# NEQ "">
    <cfset myURL = '#myURL#&fq=#filterQuery#'>
  </cfif>

  <cfif IsDefined("sort") AND #sort# NEQ "">
    <cfset myURL = '#myURL#&sort=#sort#'>
  </cfif>

  <cfif IsDefined("columns") AND #columns# NEQ "">
    <cfset myURL = '#myURL#&fl=#columns#'>
  </cfif>

  <cfif IsDefined("facet") AND #facet# NEQ "">
    <cfset myURL = '#myURL#&facet=true&#facet#'>
  </cfif>  

  <cfif IsDefined("timeAllowed") AND #timeAllowed# NEQ "">
    <cfset myURL = '#myURL#&timeAllowed=#timeAllowed#'>
  </cfif>  

  <cftry>

    <cfhttp method="Get" url="#myURL#" resolveurl="Yes" timeout="#timeout#" throwonerror="yes" />
    
    <cfcatch>
    
      <cfset xmlqry3 = QueryNew("HTTP_ERROR,NULL_FIELDLIST,ERROR_OCCURRED")>
      <cfset xmlqryRows = QueryAddRow(xmlqry3,1)> 
      <cfset xmlqryCell = QuerySetCell(xmlqry3,"HTTP_ERROR","1",1)>    
      <cfset xmlqryCell = QuerySetCell(xmlqry3,"NULL_FIELDLIST","0",1)> 
      <cfset xmlqryCell = QuerySetCell(xmlqry3,"ERROR_OCCURRED","1",1)> 
      <cfset returnStruct[0] = xmlqry3>
      
      <cfreturn returnStruct>
    
    </cfcatch>
    
  </cftry>

  <cfset xmlfile=XMLParse(cfhttp.FileContent)>

  <cfset numResultsFound = #xmlfile.XmlRoot.XmlChildren[2].XmlAttributes.numFound#>
  <cfset numResultsReturned = #ArrayLen(xmlfile.XmlRoot.XmlChildren[2].XmlChildren)#>

  <cfset partialResults = 'false'>
  <cfset queryFieldList = ''>
  <cfset queryStatus = ''>
  <cfset queryTime = ''>

  <cfloop index="y" from="1" to="#ArrayLen(xmlfile.XmlRoot.XmlChildren[1].XmlChildren)#">  
    <cfif #xmlfile.XmlRoot.XmlChildren[1].XmlChildren[y].XmlAttributes.name# EQ "params">
      <cfloop index="z" from="1" to="#ArrayLen(xmlfile.XmlRoot.XmlChildren[1].XmlChildren[y].XmlChildren)#">
        <cfif #xmlfile.XmlRoot.XmlChildren[1].XmlChildren[y].XmlChildren[z].XmlAttributes.name# EQ "fl">
          <cfset queryFieldList = '#xmlfile.XmlRoot.XmlChildren[1].XmlChildren[y].XmlChildren[z].XmlText#'>
        </cfif>
      </cfloop>
    <cfelseif #xmlfile.XmlRoot.XmlChildren[1].XmlChildren[y].XmlAttributes.name# EQ "status">
      <cfset queryStatus = '#xmlfile.XmlRoot.XmlChildren[1].XmlChildren[y].XmlText#'>
    <cfelseif #xmlfile.XmlRoot.XmlChildren[1].XmlChildren[y].XmlAttributes.name# EQ "QTime">
      <cfset queryTime = '#xmlfile.XmlRoot.XmlChildren[1].XmlChildren[y].XmlText#'>
    <cfelseif #xmlfile.XmlRoot.XmlChildren[1].XmlChildren[y].XmlAttributes.name# EQ "partialResults">
      <cfset partialResults = '#xmlfile.XmlRoot.XmlChildren[1].XmlChildren[y].XmlText#'>      
    </cfif>       
  </cfloop>

  <cfif #queryFieldList# NEQ "">     
       
    <cfset xmlqry = QueryNew("#queryFieldList#")>  
        
    <cfif #numResultsReturned# NEQ 0>

      <cfset xmlqryRows = QueryAddRow(xmlqry,numResultsReturned)>

      <cfloop index="y" from="1" to="#numResultsReturned#">
          
        <cfif #ArrayLen(xmlfile.XmlRoot.XmlChildren[2].XmlChildren[y].XmlChildren)# NEQ 0>    
          
          <cfloop index="z" from="1" to="#ArrayLen(xmlfile.XmlRoot.XmlChildren[2].XmlChildren[y].XmlChildren)#">
            
            <cfloop list="#queryFieldList#" index="k">
             
              <cfset myElement = '#xmlfile.XmlRoot.XmlChildren[2].XmlChildren[y].XmlChildren[z].XmlAttributes#'>
              
   	      <cfif #ArrayLen(StructFindValue(myElement,k))# NEQ 0>
            
                <cfif #k# EQ "spell">
                  <cfset myElement = '#xmlfile.XmlRoot.XmlChildren[2].XmlChildren[y].XmlChildren[z].XmlChildren[1].XmlText#'>
                  <cfset xmlqryCell = QuerySetCell(xmlqry,"#k#",myElement,"#y#")>                
                <cfelseif #k# NEQ "category">
                  <cfset myElement = '#xmlfile.XmlRoot.XmlChildren[2].XmlChildren[y].XmlChildren[z].XmlText#'>
                  <cfset xmlqryCell = QuerySetCell(xmlqry,"#k#",myElement,"#y#")>
                <cfelse>
                  <cfset catList = '#xmlfile.XmlRoot.XmlChildren[2].XmlChildren[y].XmlChildren[z].XmlChildren[1].XmlText#'>
                  <cfset xmlqryCell = QuerySetCell(xmlqry,"#k#",catList,"#y#")>
                </cfif>
              
              </cfif>         
      
            </cfloop>
      
          </cfloop>
        
        </cfif>
    
      </cfloop>
      
    </cfif>

	<cftry>

	 <cfif #statsField# NEQ "">
          <cfset statsSection = #xmlfile.XmlRoot.XmlChildren[3].XmlChildren[1].XmlChildren[1].XmlChildren#>
          <cfset statsMax = #statsSection[2].XmlText#>
          <cfset statsMin = #statsSection[1].XmlText#>
          <cfset statsCount = #statsSection[3].XmlText#>
          <cfset statsMissing = #statsSection[4].XmlText#>
          <cfset statsSum = #statsSection[5].XmlText#>    
          <cfset statsSumOfSquares = #statsSection[6].XmlText#>
          <cfset statsMean = #statsSection[7].XmlText#>
          <cfset statsStdDev = #statsSection[8].XmlText#>
        <cfelse>
          <cfset statsMax = 'Not Available'>
          <cfset statsMin = 'Not Available'>
          <cfset statsCount = 'Not Available'>
          <cfset statsMissing = 'Not Available'>
          <cfset statsSum = 'Not Available'>  
          <cfset statsSumOfSquares = 'Not Available'>
          <cfset statsMean = 'Not Available'>
          <cfset statsStdDev = 'Not Available'>   
        </cfif>
        
        <cfcatch>
          <cfset statsMax = 'Not Available'>
          <cfset statsMin = 'Not Available'>
          <cfset statsCount = 'Not Available'>
          <cfset statsMissing = 'Not Available'>
          <cfset statsSum = 'Not Available'>  
          <cfset statsSumOfSquares = 'Not Available'>
          <cfset statsMean = 'Not Available'>
          <cfset statsStdDev = 'Not Available'>         
        </cfcatch>
        
    </cftry>
    
    <cfset xmlqry2 = QueryNew("RESULTS_FOUND,QUERY_STATUS,QUERY_TIME,RESULTS_RETURNED,PARTIAL_RESULTS,STATS_MAX,STATS_MIN,STATS_COUNT,STATS_MISSING,STATS_SUM,STATS_SUM_SQUARES,STATS_MEAN,STATS_STD_DEV")>
    <cfset xmlqryRows = QueryAddRow(xmlqry2,1)> 
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"RESULTS_FOUND","#numResultsFound#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"QUERY_STATUS","#queryStatus#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"QUERY_TIME","#queryTime#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"RESULTS_RETURNED","#numResultsReturned#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"PARTIAL_RESULTS","#partialResults#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"STATS_MAX","#statsMax#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"STATS_MIN","#statsMin#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"STATS_COUNT","#statsCount#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"STATS_MISSING","#statsMissing#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"STATS_SUM","#statsSum#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"STATS_SUM_SQUARES","#statsSumOfSquares#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"STATS_MEAN","#statsMean#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"STATS_STD_DEV","#statsStdDev#",1)>

    <cfset xmlqry3 = QueryNew("HTTP_ERROR,NULL_FIELDLIST,ERROR_OCCURRED")>
    <cfset xmlqryRows = QueryAddRow(xmlqry3,1)> 
    <cfset xmlqryCell = QuerySetCell(xmlqry3,"HTTP_ERROR","0",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry3,"NULL_FIELDLIST","0",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry3,"ERROR_OCCURRED","0",1)> 
     
    <cfset returnStruct[0] = xmlqry3>
    <cfset returnStruct[1] = xmlqry2>
    <cfset returnStruct[2] = xmlqry> 
     
  <cfelse> 

    <cfset xmlqry3 = QueryNew("HTTP_ERROR,NULL_FIELDLIST,ERROR_OCCURRED")>
    <cfset xmlqryRows = QueryAddRow(xmlqry3,1)> 
    <cfset xmlqryCell = QuerySetCell(xmlqry3,"HTTP_ERROR","0",1)>    
    <cfset xmlqryCell = QuerySetCell(xmlqry3,"NULL_FIELDLIST","1",1)> 
    <cfset xmlqryCell = QuerySetCell(xmlqry3,"ERROR_OCCURRED","1",1)> 
    <cfset returnStruct[0] = xmlqry3>
        
  </cfif> 
  
  <cfreturn returnStruct>
  
</cffunction>

Open in new window


The SOLR_SEARCH function has limitless possibilities.

For example, let's say we want to find out if a user is banned.

      <cfset username = '#HTMLEditFormat(Session.MM_Username)#'>

      <cfset args = StructNew()>
      <cfset args.collection = "#endUserCollection#">
      <cfset args.startrow = 0>
      <cfset args.maxrows = 1>
      <cfset args.columns = 'END_USER_ID,END_USER_USERNAME_BANNED'>
      <cfset args.sort = 'score+desc'>
      <cfset args.criteria = 'END_USER_USERNAME:#userName#'>
      <cfset args.timeout = 5>
      <cfset args.timeAllowed = 5000> 
    
      <cfinvoke method="SOLR_SEARCH" argumentCollection="#args#" returnVariable="END_USER_STRUCT">

      <cfif #END_USER_STRUCT[0].ERROR_OCCURRED# EQ 0 AND #END_USER_STRUCT[2].RecordCount# EQ 1>
	  
        <cfset userBanned = #END_USER_STRUCT[2].END_USER_USERNAME_BANNED#>
        
        <cfif #userBanned# EQ 1>
          <cflocation url="//www.#myDomain#/error/banned.cfm" addtoken="no"> 
        </cfif>
      
      </cfif>

Open in new window


What if we wanted to query a recordset and keep returning results until all rows are exhausted? Keep incrementing START_ROW below:

<cfset args = StructNew()>
<cfset args.collection = "#endUserCollection#">
<cfset args.startrow = #URL.START_ROW#>
<cfset args.maxrows = #PAGE_RESULTS#>
<cfset args.columns = 'END_USER_ID'>
<cfset args.sort = '#URL.ORDER_BY#'>
<cfset args.criteria = '#myCriteria#'>
<cfset args.timeout = 5>
<cfset args.timeAllowed = #URL.SEARCH_TIME#> 

<cfinvoke method="SOLR_SEARCH" argumentCollection="#args#" returnVariable="USER_STRUCT">

Open in new window


The "myCriteria" variable above can be set to '*:*' to return all results or like this below:

<cfset myCriteria = 'END_USER_TYPE_ID:1'>

<cfif #myCriteria# NEQ "" AND #URL.SKILL_ID# NEQ -1>
  <cfset myCriteria = '#myCriteria# AND CODER_SKILLS:#URL.SKILL_ID#'>
<cfelseif #URL.SKILL_ID# NEQ -1>
  <cfset myCriteria = 'CODER_SKILLS:#URL.SKILL_ID#'>
</cfif>

Open in new window


It checks if the user has a given user type (lets say member) and has a certain skill. You can make the that variable be as complex as you want it to be.

You might ask me well how fast is SOLR? How long does it take for results to return compared to CFQUERY? FRACTIONS OF A SECOND!!! It is as fast as a Google search.

My code also uses SOLR's built in capability of statistics. You can return the max, min, median, etc of a given result set. It's great for setting ranges of data.

So who uses this code? I have never released it to the public until now. However, you are free to do with it as you will. If you give me credit along the way, that's great. Just reference this article!

I hope that you enjoyed my article as there will be many more to come.

This article is contributed as part of the Coldfusion Development Project.
0
Comment
0 Comments

Featured Post

Cloud Class® Course: Microsoft Azure 2017

Azure has a changed a lot since it was originally introduce by adding new services and features. Do you know everything you need to about Azure? This course will teach you about the Azure App Service, monitoring and application insights, DevOps, and Team Services.

Join & Write a Comment

From store locators to asset tracking and route optimization, learn how leading companies are using Google Maps APIs throughout the customer journey to increase checkout conversions, boost user engagement, and optimize order fulfillment. Powered …
Learn the basics of Skype For Business in office 365
Other articles by this author

Keep in touch with Experts Exchange

Tech news and trends delivered to your inbox every month