Text answers processing and categorisation

Posted on 2004-12-01
Medium Priority
Last Modified: 2010-04-17
Hi all,

I'm due to program a natural-language categorisation engine soon and could really do with some pointers. The system basically involves people putting in free-text answers to questions and the system must successfully categorise the answers, and then rate them as positive / negative.


Q: What do you like about this website?
A1: I like the colours.
A2: Content Rocks!
A3: Nothing at all.

I need to categorise the answers into specific areas like layout / content - and tally positive and negative feedback.

This is a rather simplistic example, but if I can develop this then I can expand on the categories etc.

How could I implement this solution? Any ideas?

I've done natural language processing before (computational linguistics) and I've also *heard* about some Bayes theorem stuff, but sadly I wouldn't know where to start with the programming for these. A friend has also mentioned a content management rating system used on another website, but I simply don't know where to start researching these issues.

I'm an MS SQL / C# / VB.NET / VB6 developer with lots of experience, so ideally I'd love some pointers in these technologies. I'm not great at the maths end of things, so I'd prefer some practical examples of the ideas if possible.

Some source code or even a step-by-step description of a project implemented using these ideas would be absolutely fantastic.

I'm really depending on you for this one guys! Thanks...

Question by:DaveyByrne

Expert Comment

ID: 12721480
With regard to "*heard* about some Bayes theorem stuff", you may want to take a look at www.spambayes.org. It is an open-source Python implementation of Bayes' theorem for classifying your email as either spam (bad) or ham (good). If you were to implement a similar method in your web application, you'd have to "train" it first.
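To give a rough idea of what "training" means here, below is a minimal Python sketch. It is not taken from spambayes; it only counts how often each word shows up in answers that a person has already labelled, then applies Bayes' theorem to a single word: P(positive | word) = P(word | positive) * P(positive) / P(word). The function names, the "positive" / "negative" labels and the training answers are all made up for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())

def train(examples):
    """Count, per label, how many labelled answers contain each word."""
    word_counts = {"positive": Counter(), "negative": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        # set() so a word is counted once per answer
        word_counts[label].update(set(tokenize(text)))
    return word_counts, doc_counts

def p_positive_given_word(word, word_counts, doc_counts):
    """Bayes' theorem: P(pos | word) = P(word | pos) * P(pos) / P(word)."""
    total = sum(doc_counts.values())
    p_pos = doc_counts["positive"] / total
    p_word_given_pos = word_counts["positive"][word] / doc_counts["positive"]
    p_word = (word_counts["positive"][word] + word_counts["negative"][word]) / total
    if p_word == 0:
        # Word never seen in training: fall back to the overall positive rate.
        return p_pos
    return p_word_given_pos * p_pos / p_word

# Hypothetical hand-labelled answers (assumed for this sketch only).
examples = [
    ("I like the colours.", "positive"),
    ("Content Rocks!", "positive"),
    ("I like the layout.", "positive"),
    ("Nothing at all.", "negative"),
    ("I hate the colours.", "negative"),
]
word_counts, doc_counts = train(examples)
p_like = p_positive_given_word("like", word_counts, doc_counts)
p_colours = p_positive_given_word("colours", word_counts, doc_counts)
p_nothing = p_positive_given_word("nothing", word_counts, doc_counts)
```

With this tiny training set, "like" only ever appears in positive answers, "colours" appears equally often in both, and "nothing" only in negative ones, so the three probabilities come out high, middling and low respectively. A real filter combines the evidence from every word in an answer and needs far more training data than this.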

Hope this helps,

Author Comment

ID: 12747640
Thanks Jake,

Do you have any experience with the Bayes Theorem? I just need someone to explain it to me in layman's terms - and I reckoned the best way to do this was to look at someone else's implementation.

Sadly this isn't as easy as I thought.

Any chance you could explain it to me in plain English?



Accepted Solution

jacobhoover earned 1000 total points
ID: 12760094
Sorry, I have no experience with the theorem itself, but I do know the spambayes implementation works excellently for determining spam. One issue I can foresee is that email has only two categories, "ham" and "spam", while your specification has a variable number of categories. I'll grind some more grey matter out tomorrow.
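On the variable number of categories: the general naive Bayes approach is not limited to two labels, since you can score an answer against every category and keep the best one. Here is a minimal Python sketch of that idea, assuming add-one smoothing and log-probabilities. It is not how spambayes itself works, and the class name, the category labels and the training answers are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lower-case the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    """Toy multi-category naive Bayes text classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of answers
        self.vocab = set()                       # every word seen in training

    def train(self, text, label):
        words = tokenize(text)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        """Return the label with the highest log-probability score."""
        total_docs = sum(self.doc_counts.values())
        words = tokenize(text)
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Start from the prior: how common this label is overall.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                # Add-one smoothing so unseen words do not zero the score.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical training answers and category names (assumed for this sketch).
topic = NaiveBayes()
topic.train("I like the colours", "layout")
topic.train("The page design is clean", "layout")
topic.train("Content rocks", "content")
topic.train("The articles are useful", "content")
topic.train("Nothing at all", "none")

first = topic.classify("great colours and design")
second = topic.classify("useful articles")
```

One possible way to cover both requirements is to train two separate instances on the same answers, one labelled by topic (layout / content / ...) and one labelled positive / negative, and tally the pair of labels each answer receives.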


Author Comment

ID: 12786123
Thanks for the tip anyway Jake.
