Get the first sentence of a paragraph

Shouldn't be too tough, but I just don't have time right now to figure it out.

I need to be able to split a paragraph so that the first sentence goes into one string variable and the rest of the paragraph (sans the first sentence) goes into a second string variable. I need to take into account the common ways a sentence can end ("!","?",".").

Thank you.
LVL 9
stengeljAsked:
Who is Participating?
 
Justin_WCommented:
Determining natural language sentence boundaries is actually a very difficult thing to do. See the following links for additional info and reference:
http://www.cs.umd.edu/Honors/reports/Nilani.pdf
http://www.gnu.org/software/emacs/emacs-lisp-intro/html_node/sentence-end.html
http://computing.fnal.gov/docs/products/xemacs/v21_1/lispref.info,.StandardRegexps.html
http://www.codeproject.com/dotnet/RegexTutorial.asp

Your best bet for easily achieving something fairly accurate would be to use .NET's regular expressions to search for something like this:
    "[.?!][]\"')}]*\s*"
     This means a period, question mark or exclamation mark, followed
     optionally by a closing parenthetical character, followed by optional whitespace.

And then you would split the original string based on the index of the first match of the expression.
0
 
stengeljAuthor Commented:
Thanks.  I'll try it out.
0
 
stengeljAuthor Commented:
Perfect.   I think the only thing that might screw it up would be a punctuated abbreviation but, that should be a problem for what I'm doing. Here's my function:

Protected Function SplitAnno(ByVal myStr As String, ByVal myType As String) As String
        Dim i As Integer
        Dim s As String
        i = Regex.Match(myStr, "[.?!][]\""')}]*\s*").Index
        s = myStr.Substring(0, i + 1) '+1 to get the punctuation
        Select Case myType
            Case "Summary"
                Return Trim(s)
            Case "Body"
                Return Trim(myStr.Substring(s.Length + 1)) '+1 to go past the punctuation
            Case Else
                Return ""
        End Select

Thanks for the quick help!
    End Function
0
 
stengeljAuthor Commented:
Oops! I screwed up my function at the end with my thank you.
0
 
Justin_WCommented:
You're welcome. However, be advised that your function may also fail for strings that don't have any matches. Also, your "'+1 to get the punctuation" strategy doesn't take the optional trailing parenthetical characters or whitespace into account.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.