Solved

Remove Non-Words

Posted on 2010-09-16
Last Modified: 2012-05-10
Hi,

I'm looking for a way to remove non-English (non-dictionary) words from a file that has 3 fields:

Example:

word      xxword      x      xa      worlds

I'm looking to output only dictionary words:

word      hello        worlds

I'm pretty sure this could be done with the dictionary word list that ships with Unix: compare the two files, output only the words that match, and keep the formatting.

Thank you
Question by:faithless1

Accepted Solution

by:
shanikawm earned 450 total points
ID: 33699184
You can use PHP's Pspell functions.

e.g.:

$ cat file.txt
penn pencil eraser
black bleu red
monitor key muose

$ php spell.php
pencil eraser
black red
monitor key

Contents of spell.php:

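<?php
// Open an English dictionary; pspell_new() returns false on failure.
$pspell_link = pspell_new("en");
if ($pspell_link === false)
{
        exit("Could not open the \"en\" dictionary\n");
}
$lines = file('file.txt');
foreach ($lines as $line)
{
        // Split the line on runs of whitespace, skipping empty tokens.
        $words = preg_split('/\s+/', trim($line), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($words as $word)
        {
                // Print only the words the dictionary recognizes.
                if (pspell_check($pspell_link, $word))
                {
                        echo $word, ' ';
                }
        }
        echo "\n";
}
?>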


Assisted Solution

by:Ray Paseur
Ray Paseur earned 50 total points
ID: 33701333
See the notes here:
http://us.php.net/manual/en/pspell.installation.php

You can run this script to find out whether the Pspell extension is installed:
<?php phpinfo(); ?>
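
If you would rather not read through the whole phpinfo() page, a narrower check is possible. This is a minimal sketch; the helper name pspell_available() is made up for illustration and is not part of PHP:

```php
<?php
// Hypothetical helper (not part of PHP): reports whether the Pspell
// extension is loaded in the PHP build that runs this script.
function pspell_available(): bool
{
    return extension_loaded('pspell') && function_exists('pspell_new');
}

echo pspell_available()
    ? "Pspell is available\n"
    : "Pspell is not installed\n";
```

Keep in mind that the command-line PHP and the web server's PHP can load different configurations, so run the check in the same environment where the spell-check script will run.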

If you do not have the extension installed, this search may turn up some good examples:
http://lmgtfy.com?q=PHP+spell+checking
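
Without the extension, the idea from the question (matching against the word list that ships with many Unix systems) can be sketched in plain PHP. Treat this as a sketch under assumptions, not a drop-in replacement for Pspell: /usr/share/dict/words is a common location for the word list but may not exist on your system, the function name filter_dictionary_words() is made up, and the lookup is a case-insensitive exact match, so any word form missing from the list is dropped.

```php
<?php
// Hypothetical helper: keeps only the words of each line that appear in
// $dictionary (a list of known words). Matching is case-insensitive.
function filter_dictionary_words(array $lines, array $dictionary): array
{
    // Flip the word list into a lookup table keyed by lowercase word.
    $known = array_flip(array_map('strtolower', $dictionary));
    $result = [];
    foreach ($lines as $line) {
        // Split on runs of whitespace, skipping empty tokens.
        $words = preg_split('/\s+/', trim($line), -1, PREG_SPLIT_NO_EMPTY);
        $kept = array_filter($words, function ($word) use ($known) {
            return isset($known[strtolower($word)]);
        });
        $result[] = implode(' ', $kept);
    }
    return $result;
}

// Assumed paths: adjust both for your system.
$dictFile = '/usr/share/dict/words';
$inputFile = 'file.txt';
if (is_readable($dictFile) && is_readable($inputFile)) {
    $dictionary = file($dictFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    $lines = file($inputFile, FILE_IGNORE_NEW_LINES);
    echo implode("\n", filter_dictionary_words($lines, $dictionary)), "\n";
}
```

Each output line holds the surviving words of the matching input line, joined by single spaces, so the original column spacing is not preserved.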