Solved

SQL Query to find keywords and count

Posted on 2014-02-27
3
354 Views
Last Modified: 2014-03-30
Looking for a good sql query that will pull out keywords from a paragraph and count how many times they occur.

For example:

KeywordTable:

Word
the
and
a
like

Paragraph:
Conscientiousness is one of the five traits of the Five Factor Model of personality, and is an aspect of what has traditionally been referred to as having character. Conscientious individuals are generally hard working and reliable. When taken to an extreme, they may also be "workaholics", perfectionists, and compulsive in their behavior. People who score low on conscientiousness tend to be more laid back, less goal-oriented, and less driven by success; they also are more likely to engage in antisocial and criminal behavior.

Outcome:
the, 5
and, 3
etc.

SELECT Word, COUNT(Word) AS count FROM KeywordTable WHERE....(any of the words in the keyword table are in the paragraph and their count)

Thanks in advance!!!

D-
0
Comment
Question by:MaxKroy
3 Comments
 
LVL 39

Expert Comment

by:lcohan
ID: 39892886
I guess you could use this sample code below and plug in your keywords to get something working for you - I know is not complete answer but hopefully more than nothing...

DECLARE @Text AS VarChar(MAX)
SET @Text = 'This is a sentence'
PRINT  'Number of words: ' + CAST(LEN(@Text) - LEN(REPLACE(@Text, ' ', '')) +1 AS VarChar(3))
0
 
LVL 35

Expert Comment

by:David Todd
ID: 39896901
Hi,

Generally SQL is the wrong tool for this. You can use a hammer to pound in screws, and a screwdriver to pound nails, but neither is very successful at that.

In what form would the paragraph be? Individual words in a table, or as a one value in a varchar( max ) column or variable?

Maybe a CLR function using the .net libraries for string manipulation would be better here.

HTH
  David
0
 
LVL 40

Accepted Solution

by:
Sharath earned 500 total points
ID: 39927587
try this query.
declare @Paragraph varchar(max)
select @Paragraph = 'Conscientiousness is one of the five traits of the Five Factor Model of personality, and is an aspect of what has traditionally been referred to as having character. Conscientious individuals are generally hard working and reliable. When taken to an extreme, they may also be "workaholics", perfectionists, and compulsive in their behavior. People who score low on conscientiousness tend to be more laid back, less goal-oriented, and less driven by success; they also are more likely to engage in antisocial and criminal behavior.'

;with  cte0 as (select 1 as c union all select 1),        
       cte1 as (select 1 as c from cte0 a, cte0 b),        
       cte2 as (select 1 as c from cte1 a, cte1 b),        
       cte3 as (select 1 as c from cte2 a, cte2 b),        
       cte4 as (select 1 as c from cte3 a, cte3 b),        
       nums as (select row_number() over (order by c) as n from cte4)       
SELECT kt.Word,COUNT(*) cnt
  FROM KeywordTable kt 
       JOIN (SELECT n,LTRIM(SUBSTRING(Paragraph, n, CHARINDEX(' ', Paragraph + ' ', n) - n)) AS Word
               FROM nums 
                    CROSS JOIN (SELECT @Paragraph Paragraph) T1 
              WHERE SUBSTRING(' ' + Paragraph, n, 1) = ' ' 
                AND n < LEN(Paragraph) + 1) p 
         ON kt.Word = p.Word 
 GROUP BY kt.Word 

Open in new window

http://sqlfiddle.com/#!3/59cf12/3
0

Featured Post

Maximize Your Threat Intelligence Reporting

Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

Join & Write a Comment

Suggested Solutions

Everyone has problem when going to load data into Data warehouse (EDW). They all need to confirm that data quality is good but they don't no how to proceed. Microsoft has provided new task within SSIS 2008 called "Data Profiler Task". It solve th…
Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
Via a live example, show how to backup a database, simulate a failure backup the tail of the database transaction log and perform the restore.
Via a live example, show how to shrink a transaction log file down to a reasonable size.

758 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

21 Experts available now in Live!

Get 1:1 Help Now