Solved

Can anyone scan/convert this document into Word?

Posted on 2014-04-29
17
450 Views
Last Modified: 2014-05-05
I've been using http://www.abbyyonline.com/ to convert PDF files for a course into Word to make them searchable, since the original PDF files are not.

I've never had any problems until the attached document, which just doesn't seem to want to convert.

If someone could convert the PDF document into a searchable Word document I'd appreciate it, as it makes looking things up much easier obviously.
0
Question by:purplesoup
17 Comments
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 40031093
Graham,
I think that conversion to a searchable form for private use of the individual with no intent to distribute or transmit is permitted under Fair Use. I'm not an intellectual property lawyer, but since the University has provided the document as a PDF file to students, it seems within Fair Use to make it a searchable PDF file or a searchable Word file. If you agree, I will post a searchable PDF file and a searchable Word file, which I have already created and was about to post when I saw your comment. I certainly do not want to be in violation of copyright law, so please let me know if you think this comes under the Fair Use provision.

purplesoup,
While we're waiting to hear back from Graham (who is an EE Admin), you should contact the copyright holder (The Open University) and request written permission to create a searchable PDF and/or searchable Word file. If you get written permission, that will remove the need to make an interpretation based on the Fair Use provision.

Regards, Joe
0
 

Author Comment

by:purplesoup
ID: 40031161
Well there have been a number of forum posts on the topic in which students - on the OU website - share converted PDFs, however there aren't any of this particular file. Tutors participate in the forums and have apologised that these documents are so old they aren't searchable and have thanked students who share searchable versions of the documents - which seems a sensible attitude.

I can post copies of the forum posts and links to already converted files, but I expect the "admin" person wouldn't like that.

Alternatively perhaps Joe you could email the files to bilbo039-eeATyahoo.co.uk - a temporary email address I've just set up?
0
 

Author Comment

by:purplesoup
ID: 40031162
Incidentally Graham could you point out exactly which policy you are referring to?
0

 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 40031181
In case Graham is gone for the evening, I'll jump in with an answer to your question. He's talking about the Experts Exchange Terms of Use:
http://www.experts-exchange.com/terms.jsp

Take a look at Article 6, Code of Conduct:
You shall not engage in any of the following activities, which are strictly prohibited under the Code of Conduct.
and then
11. Posting any content that infringes any third party’s intellectual property rights or violates any confidentiality Agreements, contracts of employment, licenses, “Terms of Use”, or copyright.
Regards, Joe
0
 
LVL 76

Accepted Solution

by:
GrahamSkan earned 500 total points
ID: 40031397
The reason that PDF converters would have difficulty with the document is that it comprises an image. The text is not digitised.

This means that it would require a fully functional OCR program to interpret it. I haven't used ABBYY, but, from my experience with Nuance's Omnipage, I suspect that it would be a lengthy process.

The image quality is good, but there are many mathematical formulae with special symbols that would require manual intervention.
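
As an illustration of the point above, here is a minimal sketch of how one might check whether a PDF is image-only (has no text layer) before sending it to a converter. It assumes the third-party pypdf package is installed; the function names and the 20-character threshold are hypothetical choices for this example, not anything taken from ABBYY or Nuance.

```python
from typing import Iterable, List


def looks_image_only(page_texts: Iterable[str], min_chars: int = 20) -> bool:
    """Return True when no page yields at least min_chars of extractable text."""
    return all(len(text.strip()) < min_chars for text in page_texts)


def extract_page_texts(pdf_path: str) -> List[str]:
    """Read the text layer of each page (assumes pypdf is installed)."""
    from pypdf import PdfReader  # third-party; imported lazily on purpose

    reader = PdfReader(pdf_path)
    # Pages that are pure scans typically yield an empty string here.
    return [page.extract_text() or "" for page in reader.pages]
```

Calling `looks_image_only(extract_page_texts("course.pdf"))` (the file name is a placeholder) would return True for a scan like the one described, which means an OCR step is needed before any keyword search can work.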
0
 

Author Comment

by:purplesoup
ID: 40032138
I've requested that this question be deleted for the following reason:

I believe this question may have infringed the EE third-party copyright policy so please delete it.
0
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 40032176
Graham,
You're welcome...happy to help...and always want to comply with copyright law and, more generally, EE's Terms of Use.

On the technical front, I tried it with three Nuance products, which did well overall, but, of course, were not perfect. I used the latest version of PaperPort Professional (14.5), which has the OmniPage OCR engine under the covers (OP18 capture SDK), to create a searchable PDF. I then tried the latest version of OmniPage Professional, which is really OP19 in the numbering sequence (but Nuance decided to call it OmniPage Ultimate, for whatever reason), to OCR it and create a searchable DOCX file. I then used Nuance's latest-and-greatest piece of software, Power PDF Advanced — also to OCR it and create a searchable DOCX file.

You're right about the formulas. For example, this formula:

[formula image]

was OCRed as follows:

PP14
H = {e,cio} = Cl UooC1

OP19
H = {e, cro} = CiU croCi

PPA
H = {e,cro} = Ci U croCi

I told all three products to ignore errors, so the process was fast, but I'm sure you're right that it would take a long time to intervene in order to get all of the formulas correct. I also have ABBYY FineReader (and some other OCR products), but figured those three tests were enough.
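
To put a rough number on how much the three results above disagree, here is a small sketch using Python's standard difflib module. It only measures agreement between the engines' outputs; the original formula survives here only as an image, so there is no ground truth to score against. The dictionary keys reuse the labels from this comment, and `pairwise_similarity` is a hypothetical helper name.

```python
from difflib import SequenceMatcher
from itertools import combinations

# OCR results copied verbatim from the three tests above.
ocr_outputs = {
    "PP14": "H = {e,cio} = Cl UooC1",
    "OP19": "H = {e, cro} = CiU croCi",
    "PPA": "H = {e,cro} = Ci U croCi",
}


def pairwise_similarity(outputs):
    """Map each pair of engine labels to a 0..1 similarity ratio."""
    scores = {}
    for (name_a, text_a), (name_b, text_b) in combinations(outputs.items(), 2):
        scores[(name_a, name_b)] = SequenceMatcher(None, text_a, text_b).ratio()
    return scores


for pair, score in pairwise_similarity(ocr_outputs).items():
    print(pair, round(score, 3))
```

A high ratio between two engines does not mean either is right, only that they made similar guesses, so formulas like this one still need a manual check against the original PDF.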

I presume the screen capture of that one formula falls under Fair Use, but if you disagree, feel free to remove it. Regards, Joe
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 40032797
Joe,
Thank you for your additional information and support.

purplesoup,
Thank you for accepting the situation. I wish you luck with your studies.
0
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 40032818
Graham,
You're welcome...happy to help. Regards, Joe
0
 

Author Comment

by:purplesoup
ID: 40032917
Graham, if it were possible, perhaps it would be fairer if Joe got the points?

Note on formulas - I've been using the conversions just to search for keywords, I don't trust the formulas and go back to the original PDF if I need to check.
0
 

Author Comment

by:purplesoup
ID: 40032919
Also Graham if you just wanted to remove the attached file and leave this discussion that would be fine too.
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 40041487
Thank you, Netminder. I'll remember that in future.
0
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 40043738
Netminder,
Thanks for doing that.

purplesoup,
Don't worry about reassigning the points. I'm fine with Graham getting them.

Regards, Joe
0
