Solved

Verity collection won't grab .pdf files

Posted on 2004-10-21
Last Modified: 2013-12-24
I have a policy search on my website http://www.fulton.cnyric.org/policies/default.cfm where you can search either by policy number or by topic. In the Administrator I have set up the collection with .doc and .pdf extensions; from what I can see that isn't the problem, so it must be in my code. The code for the search_policy.cfm page is below. Please let me know if you see what's wrong. Thanks!

<CFSEARCH
   name = "GetPolicy"
   collection = "policy"
   criteria = "#Form.Criteria_1#"
   maxRows = "1000"
   startRow = "#FORM.StartRow#">

<CFIF GetPolicy.RecordCount is 0>
   <B>No files were found.  Please try again.</B>
   <CFELSE>
   <!--- At least one file found --->
   <TABLE cellspacing=0 cellpadding=2>
   <TR bgcolor="cccccc">
      <TD><B>No</B></TD>
      <TD>&nbsp;</TD>
        <TD><B>Score</B></TD>
        <TD>&nbsp;</TD>
      <TD><B>File</B></TD>
     
   </TR>

   <CFOUTPUT query="GetPolicy" maxrows="#form.maxrows#">
   <TR bgcolor="#IIf(CurrentRow Mod 2, DE('ffffff'), DE('ffffcf'))#">

      <!--- current row information --->
      <TD>#Evaluate(Form.StartRow + CurrentRow - 1)#</TD>

      <TD>&nbsp;</TD>
        <TD>#Score#</TD>
        <TD>&nbsp;</TD>

      <!--- file name with the link returning the file --->
      <TD>
          <CFSET FileName=GetFileFromPath(Key)>
         <CFSET Ext=Right(FileName,Evaluate(Find(".", Reverse(FileName))-1))>
         <CFIF (Find(Ext,"doc,pdf") GT 0)>
            <!--- If it's a web doc, use URL returned --->
            <A target="_blank" HREF="#GetPolicy.URL#">#GetFileFromPath(Key)#</A>
         <CFELSE>
            <!--- It's not a web doc, use file path in KEY from result --->
            <A target="_blank" HREF="#Key#">#GetFileFromPath(Key)#</A>
         </CFIF>
      </TD>
     </TR>
   </CFOUTPUT>
   </TABLE>
<cfif getpolicy.recordcount GT form.maxrows>
<FORM action="search_policy.cfm" method="post">
      <CFOUTPUT>
         <INPUT type="hidden" name="Criteria_1"
          value="#Replace(Form.Criteria_1, """", "'", "ALL")#">
         <INPUT type="hidden" name="MaxRows" value="#Form.MaxRows#">
         <INPUT type="hidden" name="StartRow" value="#Evaluate(Form.StartRow + Form.MaxRows)#">
         <INPUT type="submit" value="     Next   ">
      </CFOUTPUT>
      </FORM></cfif>
 
   </CFIF>
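
One side note on the listing above, unrelated to the missing PDFs: `Find` is case-sensitive and matches substrings, so a file named `0000.PDF` would fall through to the second branch. A possible alternative using ColdFusion's list functions is sketched below. `sampleKey` is a made-up value standing in for the `Key` column of the search result.

```cfml
<!--- Sketch only: sampleKey is a hypothetical stand-in for the Key column of the search result. --->
<cfset sampleKey = "d:\CFfulton\Policies\0000.PDF">

<cfset FileName = GetFileFromPath(sampleKey)>
<!--- Treat the file name as a dot-delimited list; the last element is the extension.
      (A name with no dot at all would come back unchanged.) --->
<cfset Ext = ListLast(FileName, ".")>

<cfoutput>
<cfif ListFindNoCase("doc,pdf", Ext)>
   #FileName# is a doc/pdf file
<cfelse>
   #FileName# is some other file type
</cfif>
</cfoutput>
```

`ListFindNoCase` compares whole list elements and ignores case, which avoids both problems with `Find`.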
Question by:ahillman
35 Comments
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12371274
Can you post the code that creates your Verity collection?
 

Author Comment

by:ahillman
ID: 12372089
I do it through the Administrator.
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12372131
Try changing the criteria attribute to criteria = "*#LCASE(Form.Criteria_1)#*" and see what happens.
 

Author Comment

by:ahillman
ID: 12372191
where?
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12372205
<CFSEARCH
   name = "GetPolicy"
   collection = "policy"
   criteria = "*#LCASE(Form.Criteria_1)#*"
   maxRows = "1000"
   startRow = "#FORM.StartRow#">
 

Author Comment

by:ahillman
ID: 12372234
Tried that and it's still not happening.
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12372259
Try creating the Verity collection using code instead of the CF Administrator:
http://www.experts-exchange.com/Web/WebDevSoftware/ColdFusion/Q_21158967.html
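
For readers following along, creating and populating the collection in code might look roughly like the sketch below. It is not taken from the linked question. The collection storage path `d:\CFfulton\collections` and the `urlPath` value are assumed placeholders; the collection name and the document folder are the ones used elsewhere in this thread.

```cfml
<!--- Sketch only: the collection storage path and urlPath below are assumed placeholders. --->

<!--- Create the collection once; creating one that already exists throws an error, so trap it. --->
<cftry>
   <cfcollection
      action = "create"
      collection = "policy"
      path = "d:\CFfulton\collections"
      language = "english">
   <cfcatch type="any">
      <!--- Most likely the collection already exists; carry on and index it. --->
   </cfcatch>
</cftry>

<!--- Populate the collection from the policy folder. An exclusive named lock keeps
      two requests from indexing the same collection at the same time. --->
<cflock name="policies" type="exclusive" timeout="30">
   <cfindex
      collection = "policy"
      action = "refresh"
      type = "path"
      key = "d:\CFfulton\Policies"
      urlPath = "http://www.example.com/policies/"
      extensions = ".doc,.pdf"
      recurse = "yes"
      language = "english">
</cflock>
```

With `action="refresh"` the collection is cleared and rebuilt, so the template has to be run again whenever files are added or changed. The `urlPath` value is what CFSEARCH prefixes to file names to build the `URL` column of its result.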
 

Author Comment

by:ahillman
ID: 12372365
I am not using localhost; I am using IIS. Would I set the directory path to point to the wwwroot\folder to be indexed on the server?

Here is what I have tried, and it still didn't seem to work. Every time I tried to run it I got a site-wide error.

<cflock  type="exclusive" timeout="30" name="policies">

<cfindex
   collection="policy"
   action="refresh"
   type="path"
   key="d:\CFfulton\Policies"
   Extensions=".doc, .pdf"
   recurse="yes"
   language="english">

</cflock>
 

Author Comment

by:ahillman
ID: 12372397
Okay - now I don't get an error, but it hasn't changed anything. I still can't get the .pdfs to show up, and if I look up a policy such as 5303E.1 it says there isn't such a policy when I know there is.
Any ideas? Thanks!
 

Author Comment

by:ahillman
ID: 12372467
Could this have something to do with the way some PDFs were created? If you look up some policies, e.g. 6122, the PDF shows up; others, e.g. 0330, do not. What do you think? And why does looking up 0330E come back with nothing found?
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12372530
Can you post a link to another PDF that you know exists?
 
LVL 9

Accepted Solution

by:
CFDevHead earned 800 total points
ID: 12372555
I think the problem is that you are trying to search scanned-in docs. You cannot do that, because Adobe stores a scanned page as an image, not as text, so there is no text to index.
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12372705
Also, the search seems to match on whole words, not on part of a word.
For example, search for 42 and you don't get any PDFs, but search for 4200 and you do.

Good luck,
and I hope this helps.
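
Verity matches on whole words (and their stems) by default, so a wildcard has to be added to the criteria to match part of a word. A rough sketch, assuming the same form fields as the question's search_policy.cfm; the `searchTerm` variable is made up for illustration:

```cfml
<!--- Sketch only: searchTerm is an illustrative variable, not part of the original page. --->
<cfparam name="Form.Criteria_1" default="">
<cfparam name="Form.StartRow" default="1">

<!--- Lowercase the term: Verity treats all-lowercase (or all-uppercase) criteria as
      case-insensitive and mixed-case criteria as case-sensitive. --->
<cfset searchTerm = LCase(Trim(Form.Criteria_1))>

<cfif Len(searchTerm)>
   <!--- A trailing * lets "42" match "4200"; adding a leading * would also match "0420". --->
   <cfsearch
      name = "GetPolicy"
      collection = "policy"
      criteria = "#searchTerm#*"
      maxRows = "1000"
      startRow = "#Form.StartRow#">
</cfif>
```

Wildcards only help for documents whose text actually made it into the index; they cannot surface a file that was indexed without any searchable text.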
 

Author Comment

by:ahillman
ID: 12372722
I just created a new PDF for policy 0000, and a search for 0000 didn't find it. Scanned-in docs were my thought as well, and I do think that is part of the problem, but what about the new one I created? And I still have no idea why 0330 returns results but 0330E does not.
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12372764
After you created the file, did you reindex the Verity collection?
If not, your search will not find it.
 

Author Comment

by:ahillman
ID: 12372787
Yep - almost forgot too, but remembered at the last second! :)
 

Author Comment

by:ahillman
ID: 12372798
I'm going to up the points - this seems to be a bit harder than I had anticipated.
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12373005
Hi Aimee, good to talk to you again :)

There are a couple issues we can check out:

First, were the PDFs created on a Mac? I have had some issues with this, especially using Acrobat 5 & 5.5

Also, can you check the permissions on the PDFs and make sure Everyone has Full Control?

Thanks
TJ
 

Author Comment

by:ahillman
ID: 12373023
No Macs here - :(   Just PCs. Some were scanned in, so I can eliminate those from the picture (no pun intended), since I already know they won't work. Let me take a look at the permissions.
 

Author Comment

by:ahillman
ID: 12373052
Well - I took a look, and actually I don't want people to have full control. These are policies that have security on them: only read and execute, plus printing, are allowed. If you go under the Series # you can see 0000.pdf; it's only in the search that you don't get it.
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12373091
Hi Aimee

In your code, let's try temporarily changing

<CFIF (Find(Ext,"doc,pdf") GT 0)>
            <!--- If it's a web doc, use URL returned --->
            <A target="_blank" HREF="#GetPolicy.URL#">#GetFileFromPath(Key)#</A>
         <CFELSE>
            <!--- It's not a web doc, use file path in KEY from result --->
            <A target="_blank" HREF="#Key#">#GetFileFromPath(Key)#</A>
         </CFIF>

to just


#GetFileFromPath(Key)#

Are the results different?
 

Author Comment

by:ahillman
ID: 12373111
Did that - it still returns only the doc, with no PDF. Go figure - it's just not finding it for some reason.
 

Author Comment

by:ahillman
ID: 12373139
Just what you needed to spice up your day huh TJ.  :)
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12373163
hehe, uhhhh ya!

I am beginning to wonder if it isn't the PDFs themselves.

The ones that show up under 0000, do these come from the collection too?
 

Author Comment

by:ahillman
ID: 12373226
Well - The items on the left of the page that have a series # are just in folders and are listed out when the page is called.  The search on the right side is the collection.
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12373241
OK, that's what I thought. I do not believe the PDFs are being indexed, for some reason. I am going to create a test here and see what I come up with. You are still using CF 5, if I remember correctly?
 

Author Comment

by:ahillman
ID: 12373260
Nope - it's CFMX. Just so you know, I have a meeting to go to at 3:00 and probably won't be back for the day; I will check things out from home tonight. Maybe a little time away will make the answer blaze to the front! Thanks for thinking on your end, and I will definitely get back to you.
Much appreciated!
Aimee
 

Author Comment

by:ahillman
ID: 12373275
By the way - if they aren't being indexed then why do some of them show up?  ie 4200 and 6122?  Frustrating! :)
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12373382
Aimee, I think I understand why now. The text in those files is from a scanned image and is therefore not indexable as text. You can see here:

http://www.sanative.net/aimee

My search term is "is". This should bring up many docs because many of the ones I grabbed from your site contain the word "is". Only my document came up. The link is broken intentionally :)
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12373395
the way to fix this, by the way, is to have a REALLY good OCR program render them as text and create the PDF from that, or just recreate the documents as text before converting to PDF
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12374405
I just noticed CFDevHead had this answer before me. I should have read this previously. Sorry CFDevHead! Aimee, you should award the points to him :)
 

Author Comment

by:ahillman
ID: 12379883
final question - if the docs are actually images - then why are they showing up at all when a search is done?  like the 0330 one?
 

Author Comment

by:ahillman
ID: 12381683
Thanks a bunch CFDevHead for your help and you too TJ!
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12382219
the only ones that come up in the search for me are not scanned images :)
 

Author Comment

by:ahillman
ID: 12382527
Gotcha - it is def. the way the individual did things when they created the documents.  Thanks! :)