Perhaps trail or Mrichmon will be on this...
Some of my PDF summaries (not all of them) have the spacing between words removed in the Verity search results. I've spent about 4.5 hours trying to figure this out. My db custom collection runs just peachy.
What I've looked at:
The summaries are not coming from metadata, as there is no description in the metadata of the PDF.
The PDF does have proper spacing (text can be selected, copied and pasted into Notepad).
The squished string does not come from any particular place in the document.
I do not have whitespace compression turned on. There appear to be spaces between sentences (the sentences are not together in the document).
I have the default settings in style.prm.
Code as follows:
<cfcollection action="create"
collection = "PubsSearch"
path = "D:\wwwroot\external_site\verity\">
<cfquery name="fileInfo" datasource="dsn">
Select top 20 'D:\wwwroot\external_site\pdffiles\' + pdf_location as filename,
'http://www.mysite.com/pubs/display.cfm?pubID=' + convert(varchar,c.pubID) as custom1,
briefSummary + '<br>' + convert(varchar,datename(month,pubDate)) + ' ' + convert(varchar,datepart(year,pubDate)) as custom2, fullTitle
table info not important
</cfquery>
<cfindex collection="PubsSearch"
action="refresh"
query="fileInfo"
key="fileName"
title="fullTitle"
type="file"
custom1="custom1"
custom2="custom2"
language="English"
>
<cfsearch collection="PubsSearch,PublicationSearch" name="getPubs" criteria="#searchText#">
<cfoutput query="getPubs" group="key">
<cfif isnumeric(key)>
<p><a href="http://www.mysite.com/pubs/display.cfm?pubID=#key#">#title#</a><br />
#summary#<br />
<cfoutput>#custom1#<br /></cfoutput>
#score#<br />
</p>
<cfelse>
<p><a href="#custom1#">#title#</a><br />
<cfoutput>#custom2#<br /></cfoutput>
#score#<br />
<!--- The line below is where the bad summary is coming out. --->
#summary#</p>
</cfif>
</cfoutput>
I've also posted this question in the PDF area, as I'm not sure whose responsibility it is. I have the alternative working just peachy: that is, not using #summary#, but using my custom2 as the actual summary with the date. However, this doesn't provide the 4-sentence, 500-max summary that contains the search phrase.
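
For anyone following along, here is a rough sketch of how that alternative could be limited to only the bad results, since not every PDF is affected. It is a stopgap under assumptions, not a fix for the underlying problem: the variable name looksSquished and the 40-character threshold are made up for illustration, and the check is only a heuristic (a long URL in a good summary would also trip it).

```cfml
<!--- Sketch only. looksSquished and the 40-character threshold are assumptions. --->
<cfoutput query="getPubs" group="key">
<cfif not isNumeric(key)>
<!--- A run of 40 or more non-whitespace characters suggests the spaces were stripped. --->
<cfset looksSquished = reFind("[^[:space:]]{40,}", summary) gt 0>
<p><a href="#custom1#">#title#</a><br />
<cfif looksSquished>
<!--- Fall back to the brief summary and date stored in custom2. --->
#custom2#<br />
<cfelse>
<!--- The Verity summary looks normal, so keep it. --->
#summary#<br />
</cfif>
#score#</p>
</cfif>
</cfoutput>
```

Results with a normal-looking summary would still show the Verity text containing the search phrase, and only the squished ones would drop back to custom2.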