Text answers processing and categorisation

Posted on 2004-12-01
Last Modified: 2010-04-17
Hi all,

I'm due to program a natural-language categorisation engine soon and could really do with some pointers. People enter free-text answers to questions, and the system must categorise each answer and then rate it as positive or negative.


Q: What do you like about this website?
A1: I like the colours.
A2: Content Rocks!
A3: Nothing at all.

I need to categorise the answers into specific areas, such as layout or content, and tally the positive and negative feedback for each.

This is a rather simplistic example, but if I can develop this then I can expand on the categories etc.

How could I implement this solution? Any ideas?

I've done natural language processing before (computational linguistics) and I've also *heard* about some Bayes' theorem stuff, but sadly I wouldn't know where to start with the programming for these. A friend has also mentioned a content-management rating system for another website, but I simply don't know where to start researching these issues.

I'm an MSSQL / C# / VB6 developer with lots of experience, so ideally I'd love some pointers in these technologies. I'm not great at the maths end of things, so I'd prefer some practical examples of the ideas if possible.

Some source code or even a step-by-step description of a project implemented using these ideas would be absolutely fantastic.

I'm really depending on you for this one guys! Thanks...

Question by:DaveyByrne
    LVL 7

    Expert Comment

    With regard to "*heard* about some Bayes theorem stuff", you may want to take a look at [link missing]. It is an open-source Python implementation of Bayes' theorem for classifying your email as either spam (bad) or ham (good). If you were to implement a similar method in your web application, you'd have to "train" it first on answers that have already been labelled.

    Hope this helps,
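
To make "training" a bit more concrete, here is a minimal sketch in plain Python. It is not taken from any existing project: the training answers, the labels, and the function names are all made up for illustration. Training here just means counting which words appear in answers of each label; Bayes' theorem then turns those counts into "how likely is this label, given that the answer contains this word?".

```python
from collections import Counter

# Hypothetical hand-labelled answers (illustrative only).
TRAINING = [
    ("I like the colours", "positive"),
    ("Content rocks", "positive"),
    ("I like the content", "positive"),
    ("Nothing at all", "negative"),
    ("I hate the colours", "negative"),
]

def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()

def train(examples):
    """Count, per label, how many answers contain each word."""
    label_counts = Counter()   # label -> number of answers
    word_counts = {}           # label -> Counter(word -> answers containing it)
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(set(tokenize(text)))
    return label_counts, word_counts

def prob_label_given_word(word, label, label_counts, word_counts):
    """Bayes' theorem: P(label | word) = P(word | label) * P(label) / P(word)."""
    total = sum(label_counts.values())
    p_label = label_counts[label] / total
    p_word_given_label = word_counts[label][word] / label_counts[label]
    p_word = sum(word_counts[l][word] for l in label_counts) / total
    if p_word == 0:
        return p_label  # word never seen in training: fall back to the prior
    return p_word_given_label * p_label / p_word

label_counts, word_counts = train(TRAINING)
print(prob_label_given_word("like", "positive", label_counts, word_counts))
print(prob_label_given_word("colours", "positive", label_counts, word_counts))
```

In this toy data "like" only ever appears in positive answers, so it is strong evidence for "positive", while "colours" appears once in each label and so tells you nothing either way. A real classifier combines the evidence from every word in the answer rather than looking at one word at a time.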
    LVL 1

    Author Comment

    Thanks Jake,

    Do you have any experience with Bayes' theorem? I just need someone to explain it to me in layman's terms, and I reckoned the best way to do this was to look at someone else's implementation.

    Sadly this isn't as easy as I thought.

    Any chance you could explain it to me in plain English?


    LVL 7

    Accepted Solution

    Sorry, I have no experience with the theorem itself, but I do know the spambayes implementation works excellently for detecting spam. One issue I can foresee is that email has only two categories, "ham" and "spam", whereas your specification has a variable number of categories. I'll grind some more grey matter out tomorrow.
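
On the "variable number of categories" point: a naive Bayes classifier is not limited to two labels. The sketch below is a generic multinomial naive Bayes with add-one smoothing, written from scratch for illustration. It is not how spambayes works internally, and the class name, category names, and training answers are all hypothetical.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # label -> number of training answers
        self.word_counts = defaultdict(Counter)  # label -> word -> occurrences
        self.vocab = set()                       # every word seen in training

    @staticmethod
    def tokenize(text):
        # Lower-case and strip basic punctuation; a real system would do more.
        words = (w.strip(".,!?") for w in text.lower().split())
        return [w for w in words if w]

    def train(self, text, label):
        self.doc_counts[label] += 1
        for word in self.tokenize(text):
            self.word_counts[label][word] += 1
            self.vocab.add(word)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in self.doc_counts:
            # Sum logs instead of multiplying many small probabilities.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word] + 1  # add-one smoothing
                score += math.log(count / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)

# Hypothetical hand-labelled answers; any number of labels is allowed.
topic = NaiveBayes()
topic.train("I like the colours", "layout")
topic.train("The page design is clean", "layout")
topic.train("Content rocks!", "content")
topic.train("The articles are useful", "content")
topic.train("Pages load quickly", "speed")

print(topic.classify("Nice colours and design"))
print(topic.classify("Great articles"))
```

For the original question, one option would be to train two separate instances: one with category labels (layout, content, ...) and one with "positive" / "negative" labels, then run every answer through both and tally the pairs. With only a handful of training answers the results will be unreliable, so treat this as a starting point rather than a finished design.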

    LVL 1

    Author Comment

    Thanks for the tip anyway Jake.
