Solved

Get the first sentence of a paragraph

Posted on 2006-07-24
Last Modified: 2008-02-26
Shouldn't be too tough, but I just don't have time right now to figure it out.

I need to be able to split a paragraph so that the first sentence goes into one string variable and the rest of the paragraph (sans the first sentence) goes into a second string variable. I need to take into account the common ways a sentence can end ("!","?",".").

Thank you.
Question by:stengelj
LVL 24

Accepted Solution

by:
Justin_W earned 500 total points
ID: 17170152
Determining natural language sentence boundaries is actually a very difficult thing to do. See the following links for additional info and reference:
http://www.cs.umd.edu/Honors/reports/Nilani.pdf
http://www.gnu.org/software/emacs/emacs-lisp-intro/html_node/sentence-end.html
http://computing.fnal.gov/docs/products/xemacs/v21_1/lispref.info,.StandardRegexps.html
http://www.codeproject.com/dotnet/RegexTutorial.asp

Your best bet for easily achieving something fairly accurate would be to use .NET's regular expressions to search for something like this:
    "[.?!][]\"')}]*\s*"
     This means a period, question mark, or exclamation mark, followed
     by zero or more closing characters (bracket, double quote, apostrophe,
     parenthesis, or brace), followed by optional whitespace.

You would then split the original string at the first match of the expression: everything up to the end of the match is the first sentence, and everything after it is the rest of the paragraph.
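As a rough sketch of locating that split point in VB.NET (the module and function names are made up for illustration, and inside a VB string literal the double quote is written twice rather than backslash-escaped):

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SentenceEndDemo
    ' The pattern from above, written as a VB string literal ("" is one double quote).
    Public Const SentenceEnd As String = "[.?!][]""')}]*\s*"

    ' Hypothetical helper: returns the index just past the first sentence
    ' terminator (including any closing characters and trailing whitespace),
    ' or -1 when the text contains no terminator at all.
    Public Function FirstSentenceEnd(ByVal paragraph As String) As Integer
        Dim m As Match = Regex.Match(paragraph, SentenceEnd)
        If Not m.Success Then Return -1
        Return m.Index + m.Length
    End Function

    Sub Main()
        Dim paragraph As String = "He said ""Stop!"" Then he left."
        Dim cut As Integer = FirstSentenceEnd(paragraph)
        If cut >= 0 Then
            Console.WriteLine(paragraph.Substring(0, cut)) ' first sentence
            Console.WriteLine(paragraph.Substring(cut))    ' rest of the paragraph
        End If
    End Sub
End Module
```

Using Index + Length rather than Index alone keeps a closing quote or bracket with the sentence it belongs to. An abbreviation such as "Dr." would still be treated as a sentence end by this pattern.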
 
LVL 9

Author Comment

by:stengelj
ID: 17170192
Thanks.  I'll try it out.
 
LVL 9

Author Comment

by:stengelj
ID: 17170517
Perfect. I think the only thing that might screw it up would be a punctuated abbreviation, but that shouldn't be a problem for what I'm doing. Here's my function:

Protected Function SplitAnno(ByVal myStr As String, ByVal myType As String) As String
        Dim i As Integer
        Dim s As String
        i = Regex.Match(myStr, "[.?!][]\""')}]*\s*").Index
        s = myStr.Substring(0, i + 1) '+1 to get the punctuation
        Select Case myType
            Case "Summary"
                Return Trim(s)
            Case "Body"
                Return Trim(myStr.Substring(s.Length + 1)) '+1 to go past the punctuation
            Case Else
                Return ""
        End Select

Thanks for the quick help!
    End Function
 
LVL 9

Author Comment

by:stengelj
ID: 17170523
Oops! I screwed up my function at the end with my thank you.
 
LVL 24

Expert Comment

by:Justin_W
ID: 17170571
You're welcome. However, be advised that your function may also fail for strings that don't have any matches. Also, your "'+1 to get the punctuation" strategy doesn't take the optional trailing parenthetical characters or whitespace into account.
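A sketch of how SplitAnno might be adjusted for both points, keeping the original signature. This is only one possible revision, not the poster's final code: the module name is invented, and treating a paragraph with no terminator as all "Summary" is an assumption.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AnnotationHelpers
    ' Revised sketch of SplitAnno; same parameters and return values as the original.
    Public Function SplitAnno(ByVal myStr As String, ByVal myType As String) As String
        If myStr Is Nothing Then Return ""

        Dim m As Match = Regex.Match(myStr, "[.?!][]""')}]*\s*")

        ' Assumption: with no sentence terminator, the whole text is the summary.
        Dim cut As Integer = myStr.Length
        ' Index + Length lands after the punctuation, any closing characters,
        ' and the trailing whitespace, so no "+1" arithmetic is needed.
        If m.Success Then cut = m.Index + m.Length

        Select Case myType
            Case "Summary"
                Return myStr.Substring(0, cut).Trim()
            Case "Body"
                Return myStr.Substring(cut).Trim()
            Case Else
                Return ""
        End Select
    End Function
End Module
```

For example, SplitAnno("Is it done? (Yes.) Good.", "Summary") should return "Is it done?" and the "Body" call should return "(Yes.) Good.".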