Solved

How should I use Artificial Intelligence to sort out relevant statements from non-relevant ones?

Posted on 2016-09-05
4
80 Views
Last Modified: 2016-09-07
I'm trying to build a program to sort out a stream of statements into relevant and non-relevant statements with regards to a particular domain name. What algorithms and frameworks would be helpful?

I shall clarify further with an example.

 Let me pick a subject like economics. For a given group of sentences and phrases, I should be able to sort out each of those to determine whether they belong to the field of economics or otherwise. If I see something regarding cooking or the weather, I should put that in the irrelevant category, and if I see something with regards to profits and GDP, I should include that in the relevant category. I understand that I should have some sort of knowledge base for that particular domain ie. economics.

I need pointers to where I can start.
How do I go about collecting the domain data?
What basic process structure should the system have?
I'm planning to use Java for the implementation.

Tutorials would also be very much appreciated.
0
Comment
Question by:Cynthia Wasonga
  • 2
4 Comments
 
LVL 26

Expert Comment

by:dpearson
ID: 41787242
To solve that generally is a pretty hard problem.  You'd want to start with a natural language parser (to understand the English text) and then categorize its outputs.

However if you want a simpler short cut you could also look at WordNet (https://wordnet.princeton.edu/) which is a semantic network for words - which means given "profit" you can look up what "type" of word this is (or list of options) and see that it can be related to economics.

Might give you what you need without getting into a full natural language processor.

Doug
1
 

Author Comment

by:Cynthia Wasonga
ID: 41787283
Thanks dpearson,

I've found several tools to use for Natural Language processing at this site: https://opensource.com/business/15/7/five-open-source-nlp-tools

I'll look into those after doing some more research on NLP as a whole. I want to understand the details. Wordnet also looks good. Might come in handy when learning about NLP.

Again, thanks.
0
 
LVL 26

Accepted Solution

by:
dpearson earned 500 total points
ID: 41787288
Yes if you're game to really jump into the solutions those should all get you started (although Lucerne shouldn't really be on the list in my opinion).

If you're not familiar with NLP - basically when you run these tools they'll give you a parse graph - the logical structure of a sentence broken up into grammatical elements.  Then once you have identified the nouns (likely the most important for classifying the sentences according to relevant or not) you can look up the semantic meaning - either as part of the NLP processor (some may include this) or via an external tool like WordNet.

Have fun exploring - it's very relevant stuff to learn about in the modern world.

Doug
1

Featured Post

Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

Join & Write a Comment

Since upgrading to Office 2013 or higher installing the Smart Indenter addin will fail. This article will explain how to install it so it will work regardless of the Office version installed.
From implementing a password expiration date, to datatype conversions and file export options, these are some useful settings I've found in Jasper Server.
In this fourth video of the Xpdf series, we discuss and demonstrate the PDFinfo utility, which retrieves the contents of a PDF's Info Dictionary, as well as some other information, including the page count. We show how to isolate the page count in a…
In this fifth video of the Xpdf series, we discuss and demonstrate the PDFdetach utility, which is able to list and, more importantly, extract attachments that are embedded in PDF files. It does this via a command line interface, making it suitable …

747 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now