Solved

Verity collection won't grab .pdf files

Posted on 2004-10-21
Last Modified: 2013-12-24
I have a policy search on my website, http://www.fulton.cnyric.org/policies/default.cfm, where you can search either by policy number or by topic.  In the Administrator I built the collection with the .doc and .pdf extensions. From what I can see that isn't the problem, so it must be in my code.  The code for the search_policy.cfm page is below.  Please let me know if you see what's wrong.  Thanks!

<CFSEARCH
   name = "GetPolicy"
   collection = "policy"
   criteria = "#Form.Criteria_1#"
   maxRows = "1000"
   startRow = "#FORM.StartRow#">

<CFIF GetPolicy.RecordCount is 0>
   <B>No files were found.  Please try again.</B>
   <CFELSE>
   <!--- At least one file found --->
   <TABLE cellspacing=0 cellpadding=2>
   <TR bgcolor="cccccc">
      <TD><B>No</B></TD>
      <TD>&nbsp;</TD>
        <TD><B>Score</B></TD>
        <TD>&nbsp;</TD>
      <TD><B>File</B></TD>
     
   </TR>

   <CFOUTPUT query="GetPolicy" maxrows="#form.maxrows#">
   <TR bgcolor="#IIf(CurrentRow Mod 2, DE('ffffff'), DE('ffffcf'))#">

      <!--- current row information --->
      <TD>#Evaluate(Form.StartRow + CurrentRow - 1)#</TD>

      <TD>&nbsp;</TD>
        <TD>#Score#</TD>
        <TD>&nbsp;</TD>

      <!--- file name with the link returning the file --->
      <TD>
          <CFSET FileName=GetFileFromPath(Key)>
         <CFSET Ext=Right(FileName,Evaluate(Find(".", Reverse(FileName))-1))>
         <CFIF (Find(Ext,"doc,pdf") GT 0)>
            <!--- If it's a web doc, use URL returned --->
            <A target="_blank" HREF="#GetPolicy.URL#">#GetFileFromPath(Key)#</A>
         <CFELSE>
            <!--- It's not a web doc, use file path in KEY from result --->
            <A target="_blank" HREF="#Key#">#GetFileFromPath(Key)#</A>
         </CFIF>
      </TD>
     </TR>
   </CFOUTPUT>
   </TABLE>
<cfif getpolicy.recordcount GT form.maxrows>
<FORM action="search_policy.cfm" method="post">
      <CFOUTPUT>
         <INPUT type="hidden" name="Criteria_1"
          value="#Replace(Form.Criteria_1, """", "'", "ALL")#">
         <INPUT type="hidden" name="MaxRows" value="#Form.MaxRows#">
         <INPUT type="hidden" name="StartRow" value="#Evaluate(Form.StartRow + Form.MaxRows)#">
         <INPUT type="submit" value="     Next   ">
      </CFOUTPUT>
      </FORM></cfif>
 
   </CFIF>
Question by:ahillman
35 Comments
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12371274
Can you post the code that creates your verity?
0
 

Author Comment

by:ahillman
ID: 12372089
I do it through the Administrator.
0
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12372131
Try changing the criteria attribute to criteria = "*#LCASE(Form.Criteria_1)#*" and see what happens.
0
 

Author Comment

by:ahillman
ID: 12372191
where?
0
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12372205
<CFSEARCH
   name = "GetPolicy"
   collection = "policy"
   criteria = "*#LCASE(Form.Criteria_1)#*"
   maxRows = "1000"
   startRow = "#FORM.StartRow#">
0
 

Author Comment

by:ahillman
ID: 12372234
Tried that and it's still not happening.
0
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12372259
Try creating the Verity collection using code instead of the CF Administrator:
http://www.experts-exchange.com/Web/WebDevSoftware/ColdFusion/Q_21158967.html
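
Creating and populating a collection in code generally takes two tags: cfcollection to create it and cfindex to fill it. The sketch below is only an illustration. The collection name policy_test and both paths are made-up placeholders, not values from this thread, and would need to match your own server.

```cfm
<!--- Hypothetical collection name and paths; adjust to your server --->
<cflock name="policy_test_index" type="exclusive" timeout="30">

   <!--- Run this once: creating a collection that already exists
         is expected to fail --->
   <cfcollection
      action="create"
      collection="policy_test"
      path="c:\verity\collections"
      language="english">

   <!--- Index every .doc and .pdf under the folder, including subfolders --->
   <cfindex
      collection="policy_test"
      action="refresh"
      type="path"
      key="c:\inetpub\wwwroot\policies"
      urlpath="/policies/"
      extensions=".doc,.pdf"
      recurse="yes"
      language="english">

</cflock>
```

Here key is a file-system path on the ColdFusion server, while urlpath is the web path that cfsearch prepends to each file name in its URL column.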
0
 

Author Comment

by:ahillman
ID: 12372365
I am not using localhost; I am using IIS.  Would I set the directory path to point to the wwwroot\folder to be indexed on the server?

Here is what I tried, and it still didn't seem to work. Every time I tried to run it I got a site-wide error.

<cflock  type="exclusive" timeout="30" name="policies">

<cfindex
   collection="policy"
   action="refresh"
   type="path"
   key="d:\CFfulton\Policies"
   Extensions=".doc, .pdf"
   recurse="yes"
   language="english">

</cflock>
0
 

Author Comment

by:ahillman
ID: 12372397
OK, now I don't get an error, but nothing has changed. I still can't get the .pdf files to show up, and if I look up a policy such as 5303E.1 it says there is no such policy when I know there is.
Any ideas? Thanks!
0
 

Author Comment

by:ahillman
ID: 12372467
Could this have something to do with the way some of the PDFs were created?  If you look up some policies, e.g. 6122, the PDF shows up; others, e.g. 0330, do not.  What do you think?  And why does a search for 0330E come back with nothing found?
0
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12372530
Can you post a link to another PDF that you know exists?
0
 
LVL 9

Accepted Solution

by:
CFDevHead earned 800 total points
ID: 12372555
I think the problem is that you are trying to search scanned-in documents. You can't do that, because Adobe stores a scanned page as an image, not as text.
0
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12372705
Also, the search seems to match whole words only, not parts of a word.
For example, search for 42 and you don't get any PDFs, but search for 4200 and you do.

Good luck,
and I hope this helps.
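
If partial matches are wanted, one option is to append a Verity wildcard to the search term before it reaches cfsearch. This is a rough sketch under assumptions: the variable name term is hypothetical, and whether a trailing * behaves as expected depends on how the collection was indexed. It cannot help with files that contain no indexable text.

```cfm
<!--- Defaults so the page does not fail when a field is missing --->
<cfparam name="Form.Criteria_1" default="">
<cfparam name="Form.StartRow" default="1" type="numeric">

<!--- "term" is a hypothetical variable name used only for this sketch --->
<cfset term = Trim(Form.Criteria_1)>

<!--- Add a trailing wildcard so that 42 can also match 4200 --->
<cfif Len(term) GT 0 AND Right(term, 1) NEQ "*">
   <cfset term = term & "*">
</cfif>

<cfsearch
   name = "GetPolicy"
   collection = "policy"
   criteria = "#LCase(term)#"
   maxRows = "1000"
   startRow = "#Form.StartRow#">
```

Lowercasing the criteria keeps the search case-insensitive, since Verity treats mixed-case criteria as case-sensitive.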
0
 

Author Comment

by:ahillman
ID: 12372722
I just created a new PDF for policy 0000, and a search for 0000 didn't find it.  Scanned-in docs were my thought as well, and I do think that is part of the problem. But what about the new one I created? And I still have no idea why 0330 returns results but 0330E does not.
0
 
LVL 9

Expert Comment

by:CFDevHead
ID: 12372764
After you created the file, did you reindex the Verity collection?
If not, your search will not find it.
0
 

Author Comment

by:ahillman
ID: 12372787
Yep, I almost forgot, but remembered at the last second! :)
0
 

Author Comment

by:ahillman
ID: 12372798
I'm going to up the points. This seems to be a bit harder than I had anticipated.
0
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12373005
Hi Aimee, good to talk to you again :)

There are a couple issues we can check out:

First, were the PDFs created on a Mac? I have had some issues with this, especially using Acrobat 5 & 5.5

Also, can you check the permissions on the PDFs and make sure Everyone has Full Control?

Thanks
TJ
0
 

Author Comment

by:ahillman
ID: 12373023
No Macs here :(   Just PCs.  Some were scanned in, so I can eliminate them from the picture (no pun intended), since I already know they won't work.  Let me take a look at the permissions.
0
 

Author Comment

by:ahillman
ID: 12373052
Well, I took a look, and actually I don't want people to have Full Control. These are policies that have security on them: only Read & Execute and printing are allowed.  If you go under the Series # you can see 0000.pdf; it's only in the search that you don't get it.
0
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12373091
Hi Aimee

In your code, let's try temporarily changing

<CFIF (Find(Ext,"doc,pdf") GT 0)>
            <!--- If it's a web doc, use URL returned --->
            <A target="_blank" HREF="#GetPolicy.URL#">#GetFileFromPath(Key)#</A>
         <CFELSE>
            <!--- It's not a web doc, use file path in KEY from result --->
            <A target="_blank" HREF="#Key#">#GetFileFromPath(Key)#</A>
         </CFIF>

to just


#GetFileFromPath(Key)#

Are the results different?
0
 

Author Comment

by:ahillman
ID: 12373111
Did that. It still returns only the .doc, with no PDF. Go figure. It's just not finding it for some reason.
0
 

Author Comment

by:ahillman
ID: 12373139
Just what you needed to spice up your day, huh TJ?  :)
0
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12373163
hehe, uhhhh ya!

I am beginning to wonder if it isn't the PDFs themselves.

The ones that show up under 0000, do these come from the collection too?
0
 

Author Comment

by:ahillman
ID: 12373226
Well - The items on the left of the page that have a series # are just in folders and are listed out when the page is called.  The search on the right side is the collection.
0
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12373241
OK, that's what I thought. I do not believe the PDFs are being indexed, for some reason. I am going to create a test here and see what I come up with. You are still using CF 5, if I remember correctly. Is that so?
0
 

Author Comment

by:ahillman
ID: 12373260
Nope, it's CFMX. Just so you know, I have a meeting to go to at 3:00 and probably won't be back for the day. I will check things out from home tonight.  Maybe a little time away will make the answer blaze to the front!  Thanks for thinking on your end, and I will definitely get back to you.
Much appreciated!
Aimee
0
 

Author Comment

by:ahillman
ID: 12373275
By the way, if they aren't being indexed, then why do some of them show up, e.g. 4200 and 6122?  Frustrating! :)
0
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12373382
Aimee, I think I understand why now. The text in those files is from a scanned image, and is therefore not indexable as text. You can see here:

http://www.sanative.net/aimee

My search term is "is". This should bring up many docs because many of the ones I grabbed from your site contain the word "is". Only my document came up. The link is broken intentionally :)
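
The same test can be sketched against the real collection: search for a very common word and list the PDFs that never appear in the results. This is a hedged illustration, not a definitive check. The collection name and folder are the ones quoted earlier in this thread, the query names HasText and PolicyFiles and the variable hits are hypothetical, and a text PDF that simply lacks the word would be flagged too. cfdirectory here lists only the top-level folder.

```cfm
<!--- Search the collection for a word almost every text document contains --->
<cfsearch name="HasText" collection="policy" criteria="is" maxRows="1000">

<!--- Build a lowercase list of the file names that matched --->
<cfset hits = "">
<cfloop query="HasText">
   <cfset hits = ListAppend(hits, LCase(GetFileFromPath(HasText.Key)))>
</cfloop>

<!--- List the PDFs on disk (top-level folder only) --->
<cfdirectory
   action="list"
   directory="d:\CFfulton\Policies"
   name="PolicyFiles"
   filter="*.pdf">

<!--- Report every PDF that the search did not return --->
<cfoutput query="PolicyFiles">
   <cfif ListFind(hits, LCase(PolicyFiles.Name)) EQ 0>
      #HTMLEditFormat(PolicyFiles.Name)# did not match; it may be a scanned image.<br>
   </cfif>
</cfoutput>
```

Files reported here are candidates for OCR or for being recreated from their source text.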
0
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12373395
The way to fix this, by the way, is to have a REALLY good OCR program render them as text and create the PDF from that, or to recreate the documents as text before converting them to PDF.
0
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12374405
I just noticed CFDevHead had this answer before me. I should have read this previously. Sorry CFDevHead! Aimee, you should award the points to him :)
0
 

Author Comment

by:ahillman
ID: 12379883
Final question: if the docs are actually images, then why do they show up at all when a search is done?  Like the 0330 one?
0
 

Author Comment

by:ahillman
ID: 12381683
Thanks a bunch CFDevHead for your help and you too TJ!
0
 
LVL 2

Expert Comment

by:KoldFuzun
ID: 12382219
The only ones that come up in the search for me are not scanned images :)
0
 

Author Comment

by:ahillman
ID: 12382527
Gotcha. It is definitely down to the way the individual created the documents.  Thanks! :)
0

Question has a verified solution.