Solved

discover sentences

Posted on 2001-07-16
Last Modified: 2006-11-17
Hello.

I have some English text which I want to process.
I seem to be having difficulties splitting the text into sentences.

Can someone here suggest some code for that?
Question by:huitema

Expert Comment

by:shlomoy
ID: 6288655
Check out a Perl module I wrote, which is available on CPAN:
http://search.cpan.org/search?dist=Lingua-EN-Sentence

The module's name is Lingua::EN::Sentence

Note that there is another module on CPAN that tries to do the same thing, named Text::Sentence. In my opinion it fails in many places where mine doesn't.

Accepted Solution

by:
shlomoy earned 300 total points
ID: 6288657
SYNOPSIS

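        use strict;
        use warnings;
        use Lingua::EN::Sentence qw( get_sentences add_acronyms );

        my $text = 'Lt. Gen. Smith arrived. He was late.';   ## sample input; use your own text here

        add_acronyms('lt','gen');               ## adding support for 'Lt. Gen.'
        my $sentences=get_sentences($text);     ## Get the sentences (an array reference).
        foreach my $sentence (@$sentences) {
                ## do something with $sentence
        }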
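
To show why the abbreviation list in the SYNOPSIS matters, here is a rough sketch of the general idea in core Perl only. It is not how Lingua::EN::Sentence is implemented. The function name `naive_sentences` and the small `%abbreviations` list are made up for illustration. It treats a word ending in '.', '!' or '?' as a sentence end unless the word is a listed abbreviation or the next word does not start with a capital letter.

```perl
use strict;
use warnings;

# Hypothetical sample list, assumed for this sketch only.
my %abbreviations = map { $_ => 1 } qw( mr mrs dr lt gen );

# Hypothetical helper, not part of Lingua::EN::Sentence.
# Returns a reference to an array of sentences.
# Runs of whitespace in the input are collapsed to single spaces.
sub naive_sentences {
    my ($text) = @_;
    my @words = split ' ', $text;    # split on runs of whitespace
    my (@sentences, @current);

    for my $i (0 .. $#words) {
        my $word = $words[$i];
        push @current, $word;

        # Candidate boundary: the word ends in '.', '!' or '?'.
        next unless $word =~ /[.!?]$/;

        # A period after a known abbreviation does not end the sentence.
        (my $bare = lc $word) =~ s/\W+//g;
        next if $word =~ /\.$/ && $abbreviations{$bare};

        # If another word follows, it must start with an uppercase letter
        # (optionally preceded by a quote or an opening parenthesis).
        my $next = $words[$i + 1];
        next if defined $next && $next !~ /^["'(]?[A-Z]/;

        push @sentences, join ' ', @current;
        @current = ();
    }

    # Keep any trailing text that had no closing punctuation.
    push @sentences, join ' ', @current if @current;
    return \@sentences;
}

my $text = "Lt. Gen. Smith arrived at 9 a.m. today. He was late! Was anyone surprised?";
print "$_\n" for @{ naive_sentences($text) };
```

A heuristic like this breaks on plenty of real text (an abbreviation followed by a capitalized name that is not in the list, for instance), which is the reason to prefer a maintained module and extend its abbreviation list with add_acronyms.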
