  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 209

How should I use Artificial Intelligence to sort out relevant statements from non-relevant ones?

I'm trying to build a program that sorts a stream of statements into relevant and non-relevant ones with regard to a particular subject domain. What algorithms and frameworks would be helpful?

I shall clarify further with an example.

Let me pick a subject like economics. Given a group of sentences and phrases, I should be able to decide for each one whether it belongs to the field of economics. If I see something about cooking or the weather, I should put it in the irrelevant category, and if I see something about profits and GDP, I should put it in the relevant category. I understand that I will need some sort of knowledge base for that particular domain, i.e. economics.

I need pointers to where I can start.
How do I go about collecting the domain data?
What basic process structure should the system have?
I'm planning to use Java for the implementation.

Tutorials would also be very much appreciated.
Asked by: Cynthia Wasonga
1 Solution
 
dpearson Commented:
To solve that generally is a pretty hard problem.  You'd want to start with a natural language parser (to understand the English text) and then categorize its outputs.

However, if you want a simpler shortcut you could also look at WordNet (https://wordnet.princeton.edu/), which is a semantic network for words. That means given "profit" you can look up what "type" of word it is (or get a list of options) and see that it can be related to economics.

Might give you what you need without getting into a full natural language processor.
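
As a rough illustration of the word-lookup shortcut, here is a minimal Java sketch. It does not call WordNet: the `ECONOMICS_TERMS` set is a small hand-made list standing in for whatever a real lookup would return, and the class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: a tiny hand-made lexicon stands in for WordNet lookups.
final class KeywordRelevance {
    // Assumed seed terms for the economics domain; a real list would be far larger.
    static final Set<String> ECONOMICS_TERMS = new HashSet<>(Arrays.asList(
            "profit", "profits", "gdp", "inflation", "market", "markets", "tax", "taxes"));

    // A statement counts as relevant if at least one token is a known domain term.
    static boolean isRelevant(String statement) {
        // Lowercase, then split on anything that is not a letter.
        for (String token : statement.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (ECONOMICS_TERMS.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isRelevant("Quarterly profits rose as GDP grew.")); // true
        System.out.println(isRelevant("Simmer the sauce for ten minutes."));   // false
    }
}
```

A single matching word is a crude rule; counting matches or requiring a minimum share of domain words would likely cut down on false positives.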

Doug
 
Cynthia Wasonga (Author) Commented:
Thanks dpearson,

I've found several tools to use for Natural Language processing at this site: https://opensource.com/business/15/7/five-open-source-nlp-tools

I'll look into those after doing some more research on NLP as a whole. I want to understand the details. WordNet also looks good. It might come in handy when learning about NLP.

Again, thanks.
 
dpearsonCommented:
Yes, if you're game to really jump into the solutions, those should all get you started (although Lucene shouldn't really be on the list in my opinion).

If you're not familiar with NLP - basically when you run these tools they'll give you a parse graph - the logical structure of a sentence broken up into grammatical elements.  Then once you have identified the nouns (likely the most important for classifying the sentences according to relevant or not) you can look up the semantic meaning - either as part of the NLP processor (some may include this) or via an external tool like WordNet.
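
The noun-then-lookup flow described above might be sketched in Java roughly as follows. This is only an illustration under assumptions: the part-of-speech tags are supplied by hand rather than produced by a real parser, the tag name `"NOUN"` is assumed, and the `NOUN_DOMAIN` table is a hypothetical stand-in for a semantic lookup such as WordNet.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: keep the nouns, look up their domain, score the sentence.
final class NounDomainScorer {
    // Minimal token type; in practice the tag would come from an NLP parser.
    static final class TaggedToken {
        final String word;
        final String tag; // assumed tag names such as "NOUN" or "VERB"
        TaggedToken(String word, String tag) {
            this.word = word;
            this.tag = tag;
        }
    }

    // Made-up noun-to-domain table standing in for a real semantic lookup.
    static final Map<String, String> NOUN_DOMAIN = new HashMap<>();
    static {
        NOUN_DOMAIN.put("profit", "economics");
        NOUN_DOMAIN.put("gdp", "economics");
        NOUN_DOMAIN.put("recipe", "cooking");
        NOUN_DOMAIN.put("rain", "weather");
    }

    // Fraction of the sentence's nouns whose looked-up domain matches the target.
    static double domainScore(List<TaggedToken> tokens, String domain) {
        int nouns = 0;
        int hits = 0;
        for (TaggedToken t : tokens) {
            if (!"NOUN".equals(t.tag)) {
                continue; // only nouns are used for classification here
            }
            nouns++;
            if (domain.equals(NOUN_DOMAIN.get(t.word.toLowerCase(Locale.ROOT)))) {
                hits++;
            }
        }
        return nouns == 0 ? 0.0 : (double) hits / nouns;
    }

    public static void main(String[] args) {
        List<TaggedToken> sentence = Arrays.asList(
                new TaggedToken("Profit", "NOUN"),
                new TaggedToken("rose", "VERB"),
                new TaggedToken("with", "ADP"),
                new TaggedToken("GDP", "NOUN"));
        System.out.println(domainScore(sentence, "economics")); // 1.0
    }
}
```

A sentence could then be called relevant when its score passes a threshold of your choosing; the right cutoff is something to tune against real examples.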

Have fun exploring - it's very relevant stuff to learn about in the modern world.

Doug