  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 934

I have a small problem reading a Word file using PHP

Dear friends,

How can I read a Word file (.doc) using PHP? I have an Excel reader, and with it I can read all the content of Excel files.

Can anyone help me?
Asked: srinut31
1 Solution
 
den4b Commented:
There are a few open-source tools on the net that can extract plain text from most MS Office formats. You can call them from PHP to parse *.doc files.

* catdoc: http://wagner.pp.ru/~vitus/software/catdoc/
* word2x: http://word2x.sourceforge.net/

Both of these come as C/C++ source code, but catdoc also has a compiled version for DOS real mode, which runs fine on Windows.

* catdoc for DOS: http://ftp.wagner.pp.ru/pub/catdoc/catdoc-0.94.2.zip

(Make sure you read the notes; for example, catdoc does not support long file names.)

srinut31 Author Commented:
Thanks for your reply. I need to read a .doc file using PHP code.

GuanoFun Commented:
Pretty sure you can't do that, since PHP has no way of recognizing Microsoft document files directly.
 
srinut31 Author Commented:
Is it possible to do this with Ajax?

GuanoFun Commented:
Well... you can always install something like what den4b suggested and execute it silently from PHP.
 
den4b Commented:
You can use the tools listed above from PHP via its program execution functions:

http://php.net/exec

string exec ( string $command [, array &$output [, int &$return_var ]] )
string system ( string $command [, int &$return_var ] )
string shell_exec ( string $cmd )

The example below prints the text content of "example.doc":
// exec() fills $output with an array of output lines, so join them before printing
exec("catdoc.exe example.doc", $output);
echo implode("\n", $output);
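
A slightly more defensive variant of the example above could look like the sketch below. The helper name `extractDocText` and its `$tool` parameter are made up for illustration, and the code assumes that catdoc is installed and reachable on the PATH. It quotes the file name, checks the exit code, and returns the text as a single string.

```php
<?php
// Hypothetical helper: run an external text extractor (catdoc is assumed
// to be installed and on the PATH) and return its output as one string.
function extractDocText(string $docPath, string $tool = 'catdoc'): ?string
{
    // Escape the command name and quote the argument so that spaces or
    // shell metacharacters in the file name cannot alter the command.
    $command = escapeshellcmd($tool) . ' ' . escapeshellarg($docPath);

    $lines = [];
    $exitCode = 0;
    exec($command, $lines, $exitCode);

    // A non-zero exit code means the tool failed or could not be started.
    if ($exitCode !== 0) {
        return null;
    }

    // exec() collects the output as an array of lines; join them again.
    return implode("\n", $lines);
}

$text = extractDocText('example.doc');
echo $text ?? "Could not extract text from example.doc\n";
```

Checking the exit code matters because exec() does not raise an error when the external program is missing or fails; without the check, an empty result is indistinguishable from an empty document.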
 
den4b Commented:
I use this exec method in PHP to index the content of PDF, DOC, and PPT files. It works perfectly. I doubt that you will ever find a pure-PHP solution that can extract the content of *.doc files; I have been looking for such code for a very long time without any luck.

Loganathan Natarajan (LAMP Developer) Commented:
You cannot read and use MS Word documents directly, because you cannot identify the structure of a Word document from PHP. I would suggest going through XML instead: convert the document into an XML file, then read the data from that file. That will give you exact data to read.
 
srinut31 Author Commented:
Hey logudotcom, can you explain how I can do this using XML, so that I can understand it?

Loganathan Natarajan (LAMP Developer) Commented:
Just save the .doc as .xml in Word, then read the XML file through PHP.
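
As a rough sketch of that approach: the code below assumes the file was saved from Word in the "Word 2003 XML" format, where the visible text sits in `w:t` elements inside `w:p` paragraphs. The helper name `wordXmlToParagraphs` and the inline sample document are invented for illustration, and PHP's DOM extension must be available.

```php
<?php
// Assumption: the document was saved from Word as "Word 2003 XML",
// where visible text lives in <w:t> elements under this namespace.
const WORDML_NS = 'http://schemas.microsoft.com/office/word/2003/wordml';

// Hypothetical helper: return one string per paragraph (<w:p>).
function wordXmlToParagraphs(string $xml): array
{
    $dom = new DOMDocument();
    if (!$dom->loadXML($xml)) {
        return []; // not well-formed XML
    }

    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace('w', WORDML_NS);

    $paragraphs = [];
    foreach ($xpath->query('//w:p') as $p) {
        $text = '';
        // A paragraph is split into runs; concatenate their text in order.
        foreach ($xpath->query('.//w:t', $p) as $t) {
            $text .= $t->textContent;
        }
        $paragraphs[] = $text;
    }
    return $paragraphs;
}

// Tiny stand-in for a saved document, so the sketch runs on its own.
$sample = '<w:wordDocument xmlns:w="' . WORDML_NS . '"><w:body>'
    . '<w:p><w:r><w:t>Hello </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>'
    . '<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>'
    . '</w:body></w:wordDocument>';

print_r(wordXmlToParagraphs($sample));
```

For a real file, the call would be something like `wordXmlToParagraphs(file_get_contents('example.xml'))`, where `example.xml` is a placeholder name.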
 
Loganathan Natarajan (LAMP Developer) Commented:
If you don't have formatting in the .doc, just save it as a .txt file and read the contents.

Loganathan Natarajan (LAMP Developer) Commented:
It is better to consider whether you really need to read the .doc file, because reading and processing .doc details through PHP is very risky. PHP has no built-in functions for parsing the .doc format; it only offers facilities to open, read, and close the file.
 
CWS (haripriya) Commented:
No comment has been added to this question in more than 21 days, so it is now classified as abandoned.

I will leave the following recommendation for this question in the Cleanup topic area:
  Delete - no points refunded

Any objections should be posted here in the next 4 days. After that time, the question will be closed.

cyberwebservice
Experts Exchange Cleanup Volunteer

Loganathan Natarajan (LAMP Developer) Commented:
A possible solution was already given for this question.
 
srinut31 Author Commented:
Dear logudotcom,

The problem with reading the .doc format is that it relies on COM+ services, and I don't think Unix has those services. I am still searching for a solution to that.
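
For reference, the COM route mentioned above might look like the following sketch. It only works on Windows with PHP's com_dotnet extension enabled and Microsoft Word installed, which is exactly why it is no help on Unix. The helper name `extractDocTextViaCom` is made up for illustration, and error handling is minimal: if Word cannot be started, `new COM(...)` throws.

```php
<?php
// Hypothetical helper. Assumes Windows, PHP's com_dotnet extension, and an
// installed copy of Microsoft Word; returns null when COM is unavailable.
function extractDocTextViaCom(string $docPath): ?string
{
    if (!class_exists('COM')) {
        return null; // e.g. on Unix, where the COM extension does not exist
    }

    $fullPath = realpath($docPath); // Word expects an absolute path
    if ($fullPath === false) {
        return null; // file not found
    }

    $word = new COM('Word.Application');
    try {
        $word->Visible = false;
        // Arguments: file name, ConfirmConversions = false, ReadOnly = true
        $doc = $word->Documents->Open($fullPath, false, true);
        $text = (string) $doc->Content->Text;
        $doc->Close(false); // close without saving changes
        return $text;
    } finally {
        $word->Quit(); // always shut down the Word instance
    }
}

$text = extractDocTextViaCom('example.doc');
echo $text ?? "COM/Word not available, or example.doc not found\n";
```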
 
den4b Commented:
The only 100% working solution for extracting the content of DOC files is to use command-line tools like catdoc.exe, as demonstrated in my previous posts.

GuanoFun Commented:
more or less /qft
