Solved

How should I use Artificial Intelligence to sort out relevant statements from non-relevant ones?

Posted on 2016-09-05
Last Modified: 2016-09-07
I'm trying to build a program that sorts a stream of statements into relevant and non-relevant ones with regard to a particular subject domain. What algorithms and frameworks would be helpful?

I shall clarify further with an example.

Let me pick a subject like economics. For a given group of sentences and phrases, I should be able to decide for each one whether it belongs to the field of economics or not. If I see something about cooking or the weather, I should put it in the irrelevant category, and if I see something about profits and GDP, I should put it in the relevant category. I understand that I will need some sort of knowledge base for that particular domain, i.e., economics.

I need pointers to where I can start.
How do I go about collecting the domain data?
What basic process structure should the system have?
I'm planning to use Java for the implementation.

Tutorials would also be very much appreciated.
Question by:Cynthia Wasonga
LVL 28

Expert Comment

by:dpearson
ID: 41787242
To solve that generally is a pretty hard problem.  You'd want to start with a natural language parser (to understand the English text) and then categorize its outputs.

However, if you want a simpler shortcut you could also look at WordNet (https://wordnet.princeton.edu/), which is a semantic network for words. That means that given "profit" you can look up what "type" of word it is (or get a list of options) and see that it can be related to economics.

Might give you what you need without getting into a full natural language processor.
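As a rough illustration of the shortcut idea, here is a minimal Java sketch that flags a statement as relevant when any of its words appears in a domain vocabulary. It does not call WordNet at all: the `ECONOMICS_TERMS` set is a hand-picked, hypothetical stand-in for whatever a WordNet lookup or a collected domain corpus would provide, and the class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class KeywordRelevanceFilter {
    // Hypothetical, hand-picked vocabulary for "economics". A real system
    // would derive this from WordNet or from collected domain text.
    static final Set<String> ECONOMICS_TERMS = new HashSet<>(Arrays.asList(
            "profit", "profits", "gdp", "inflation", "market", "markets",
            "trade", "tax", "taxes"));

    // Returns true if at least one word of the statement is a domain term.
    static boolean isRelevant(String statement, Set<String> domainTerms) {
        // Lowercase, then split on every run of non-letter characters.
        for (String token : statement.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (domainTerms.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> statements = Arrays.asList(
                "Quarterly profits rose while GDP growth slowed.",
                "Simmer the sauce for ten minutes.",
                "Expect light rain this afternoon.");
        for (String s : statements) {
            String label = isRelevant(s, ECONOMICS_TERMS) ? "relevant:   " : "irrelevant: ";
            System.out.println(label + s);
        }
    }
}
```

Exact word matching like this misses synonyms and inflected forms that are not listed, which is the gap a semantic network such as WordNet is meant to fill.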

Doug

Author Comment

by:Cynthia Wasonga
ID: 41787283
Thanks dpearson,

I've found several tools to use for Natural Language processing at this site: https://opensource.com/business/15/7/five-open-source-nlp-tools

I'll look into those after doing some more research on NLP as a whole. I want to understand the details. WordNet also looks good. Might come in handy when learning about NLP.

Again, thanks.
LVL 28

Accepted Solution

by:dpearson (earned 2000 total points)
ID: 41787288
Yes, if you're game to really jump into the solutions, those should all get you started (although Lucene shouldn't really be on the list in my opinion).

If you're not familiar with NLP - basically when you run these tools they'll give you a parse graph - the logical structure of a sentence broken up into grammatical elements.  Then once you have identified the nouns (likely the most important elements for classifying a sentence as relevant or not) you can look up their semantic meaning - either as part of the NLP processor (some may include this) or via an external tool like WordNet.
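The two steps described above (pull out the nouns, then look up their meaning) can be sketched as a small pipeline with pluggable stages. This is only a structural outline under loud assumptions: `RelevancePipeline` and `naiveNouns` are hypothetical names, the "noun extractor" is a placeholder that merely drops a few stop words rather than running a real parser, and the lookup is a plain set of terms instead of WordNet. In a real system each stage would wrap one of the NLP tools or a semantic lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

class RelevancePipeline {
    // Stage 1: sentence -> candidate nouns (would wrap an NLP parser).
    private final Function<String, List<String>> nounExtractor;
    // Stage 2: word -> "belongs to the domain?" (would wrap WordNet or similar).
    private final Predicate<String> domainLookup;

    RelevancePipeline(Function<String, List<String>> nounExtractor,
                      Predicate<String> domainLookup) {
        this.nounExtractor = nounExtractor;
        this.domainLookup = domainLookup;
    }

    // A sentence is relevant if any extracted noun is recognised by the lookup.
    boolean isRelevant(String sentence) {
        return nounExtractor.apply(sentence).stream().anyMatch(domainLookup);
    }

    // Placeholder extractor, NOT a real parser: it keeps every word that is
    // not in a small stop list and treats it as a candidate noun.
    static List<String> naiveNouns(String sentence) {
        Set<String> stopWords = new HashSet<>(Arrays.asList(
                "the", "a", "an", "is", "are", "for", "this", "of", "and", "to", "in"));
        List<String> candidates = new ArrayList<>();
        for (String word : sentence.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!word.isEmpty() && !stopWords.contains(word)) {
                candidates.add(word);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        // Hypothetical domain vocabulary standing in for a semantic lookup.
        Set<String> economics = new HashSet<>(Arrays.asList(
                "profit", "gdp", "inflation", "market"));
        RelevancePipeline pipeline =
                new RelevancePipeline(RelevancePipeline::naiveNouns, economics::contains);
        System.out.println(pipeline.isRelevant("The GDP of the region is growing.")); // prints true
        System.out.println(pipeline.isRelevant("The weather is cloudy."));            // prints false
    }
}
```

Keeping the two stages behind `Function` and `Predicate` means the placeholder pieces can be swapped for a real parser and a real semantic lookup later without changing the classification logic.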

Have fun exploring - it's very relevant stuff to learn about in the modern world.

Doug