Processing and categorising free-text answers
Posted on 2004-12-01
I'm due to program a natural-language categorisation engine soon and could really do with some pointers. People enter free-text answers to questions, and the system must categorise each answer and then rate it as positive or negative. For example:
Q: What do you like about this website?
A1: I like the colours.
A2: Content Rocks!
A3: Nothing at all.
I need to sort the answers into specific areas such as layout or content, and tally the positive and negative feedback.
This is a rather simplistic example, but if I can get this working then I can expand the categories later.
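To make the desired output concrete, here is a naive keyword-lookup sketch of the behaviour I'm after. It's in Python rather than .NET purely for brevity, and the category names, the keyword lists and the `categorise` function are all made up for illustration. I assume a real solution needs something much smarter than word lookup.

```python
import re
from collections import Counter

# Hypothetical keyword lists, invented for illustration only.
CATEGORY_KEYWORDS = {
    "layout": {"colours", "colors", "layout", "design"},
    "content": {"content", "articles", "text"},
}
POSITIVE_WORDS = {"like", "love", "rocks", "great"}
NEGATIVE_WORDS = {"nothing", "hate", "bad", "poor"}


def categorise(answer):
    """Return a (category, sentiment) pair for one free-text answer."""
    # Lower-case the answer and split it into a set of words.
    words = set(re.findall(r"[a-z']+", answer.lower()))

    # Pick the first category that shares a keyword with the answer.
    category = next(
        (name for name, keys in CATEGORY_KEYWORDS.items() if words & keys),
        "uncategorised",
    )

    # Compare counts of positive and negative words to rate the answer.
    pos = len(words & POSITIVE_WORDS)
    neg = len(words & NEGATIVE_WORDS)
    if pos > neg:
        sentiment = "positive"
    elif neg > pos:
        sentiment = "negative"
    else:
        sentiment = "neutral"
    return category, sentiment


answers = ["I like the colours.", "Content Rocks!", "Nothing at all."]

# Tally how many answers fall into each (category, sentiment) pair.
tally = Counter(categorise(a) for a in answers)
for (category, sentiment), count in tally.items():
    print(category, sentiment, count)
```

Obviously this falls over as soon as someone writes "I don't like the colours", which is exactly why I'm asking about better approaches.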
How could I implement this solution? Any ideas?
I've done natural language processing before (computational linguistics) and I've also *heard* about some Bayes theorem stuff, but sadly I wouldn't know where to start with programming either of these. A friend has also mentioned a content management rating system for another website, but I simply don't know where to start researching these issues.
I'm an MS SQL / C# / VB.NET / VB6 developer with lots of experience, so ideally I'd love some pointers in these technologies. I'm not great at the maths end of things, so I'd prefer practical examples of the ideas if possible.
Some source code or even a step-by-step description of a project implemented using these ideas would be absolutely fantastic.
I'm really depending on you for this one, guys! Thanks...