Solved

Convert PowerPoint to Text/HTML from the command line

Posted on 2008-10-25
Last Modified: 2009-10-06
I didn't know quite how to categorise this question as it's an odd one, so apologies if I've added it to the wrong category. And if I have please point me to the correct category so I can ask the question again.

I have been using catppt in batch files to extract text from PowerPoint files on my Windows 2003 server. It works fine, EXCEPT there seems to be a bug in it: the -b option, which is supposed to let you specify a character string to be placed at the end of each slide, is unrecognised, and the default behaviour of inserting a form feed character doesn't work either.

Up until now it's been OK because I have only been cataloguing the text for searching purposes. However I've just been handed a new project for creating a text preview for these files and of course I've no way of splitting the text into separate slides.
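For illustration, here is a minimal Python sketch of what the preview step could look like if the extractor did emit a separator between slides. It assumes the extracted text contains a form feed (or some other known string) after each slide, which is exactly the behaviour that is not working here; the function names split_slides and preview are hypothetical and are not part of catppt.

```python
# Hypothetical helpers: they assume the extracted text has a known
# separator (by default a form feed) after each slide.
FORM_FEED = "\f"


def split_slides(text, separator=FORM_FEED):
    """Split extracted text into one string per slide, dropping empty chunks."""
    chunks = text.split(separator)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


def preview(slides, max_chars=60):
    """Build a one-line preview per slide from its first line of text."""
    lines = []
    for number, slide in enumerate(slides, start=1):
        first_line = slide.splitlines()[0]
        lines.append(f"Slide {number}: {first_line[:max_chars]}")
    return lines


if __name__ == "__main__":
    # Made-up sample text standing in for real extractor output.
    sample = "Title\nIntro text\fAgenda\nPoint one\nPoint two\f"
    for line in preview(split_slides(sample)):
        print(line)
```

Without a reliable separator in the output there is nothing for a step like this to split on, which is why the missing -b behaviour matters for the preview project.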

I've been googling for a couple of days on the subject and have drawn a complete blank. There are plenty of solutions for doing it in the Windows GUI but not at the command line. So I thought I would throw it open to the experts.

Does anyone know of a command line DOS/Windows programme which can extract text or HTML from a PowerPoint file?

500 points as ever :-)
Question by:Dave6969
LVL 44

Expert Comment

by:scrathcyboy
ID: 22806136
catppt is the only one I have seen --
http://www.s2services.com/powerpoint-text-extractors.htm

But you can also check this list --
http://www.google.com/search?num=30&q=DOS+extract+text+from+.PPT

However, remember that PPT is a graphics-based format like PDF, so you could convert all of them to PDF and run a text extractor on the PDF files.  If there are fewer than 50 files, it's probably easier to do it by hand.

Author Comment

by:Dave6969
ID: 22807259
Cheers for that.

Yeah, catppt would suit my purposes perfectly -- if only it weren't broken in the important respect of reliably splitting up slides. :-(  I can't believe the guy who wrote it missed that out when he managed it in the companion apps! And the app is pretty hard to track down (even the link on that page doesn't work), so I'm guessing the guy isn't developing it any further.

It did occur to me that if I could change the ppt into a pdf with a page for every slide, I could extract the text that way. But how do I convert it to a pdf with a command line app? Again I've looked, but all the ones I've seen have a GUI (and not often a very good one either!)

Unfortunately there are hundreds of the things, so doing it by hand isn't an option I'm afraid.
LVL 44

Expert Comment

by:scrathcyboy
ID: 22808761
"but how to convert it to a pdf with a command line app?"

It's no easier.  You have to take what you can get, or convert them all manually to PDF, and then you can select the text on each one in PDF.  You can't call that person's app "broken" just because he decided to do it one way.  That is the decision of independent developers, especially when you get it for FREE!!!!
LVL 21

Expert Comment

by:GlennaShaw
ID: 22809021
Here's this one, but I can't attest to its effectiveness:
http://www.softpedia.com/get/Office-tools/Other-Office-Tools/All2Txt.shtml

Accepted Solution

by:
Dave6969 earned 0 total points
ID: 22822264
OK. Well thanks.

I did try All2txt, but to be honest it was even worse because all the text came out on a single line, whereas catppt at least keeps each line of text on its own line. I guess I'll just have to do the best I can with catppt.

Oh, and I can call that app broken. Every man page I've seen for it specifically documents a -b option for specifying a string to place at the end of each slide. I know it's free and I'm grateful to the guy, but why put in a useful option and then not ensure it works?
