Go Premium for a chance to win a PS4. Enter to Win

x
?
Solved

SQL Query to find keywords and count

Posted on 2014-02-27
3
Medium Priority
?
387 Views
Last Modified: 2014-03-30
Looking for a good sql query that will pull out keywords from a paragraph and count how many times they occur.

For example:

KeywordTable:

Word
the
and
a
like

Paragraph:
Conscientiousness is one of the five traits of the Five Factor Model of personality, and is an aspect of what has traditionally been referred to as having character. Conscientious individuals are generally hard working and reliable. When taken to an extreme, they may also be "workaholics", perfectionists, and compulsive in their behavior. People who score low on conscientiousness tend to be more laid back, less goal-oriented, and less driven by success; they also are more likely to engage in antisocial and criminal behavior.

Outcome:
the, 5
and, 3
etc.

SELECT Word, COUNT(Word) AS count FROM KeywordTable WHERE....(any of the words in the keyword table are in the paragraph and their count)

Thanks in advance!!!

D-
0
Comment
Question by:MaxKroy
3 Comments
 
LVL 40

Expert Comment

by:lcohan
ID: 39892886
I guess you could use this sample code below and plug in your keywords to get something working for you - I know is not complete answer but hopefully more than nothing...

DECLARE @Text AS VarChar(MAX)
SET @Text = 'This is a sentence'
PRINT  'Number of words: ' + CAST(LEN(@Text) - LEN(REPLACE(@Text, ' ', '')) +1 AS VarChar(3))
0
 
LVL 35

Expert Comment

by:David Todd
ID: 39896901
Hi,

Generally SQL is the wrong tool for this. You can use a hammer to pound in screws, and a screwdriver to pound nails, but neither is very successful at that.

In what form would the paragraph be? Individual words in a table, or as a one value in a varchar( max ) column or variable?

Maybe a CLR function using the .net libraries for string manipulation would be better here.

HTH
  David
0
 
LVL 41

Accepted Solution

by:
Sharath earned 2000 total points
ID: 39927587
try this query.
declare @Paragraph varchar(max)
select @Paragraph = 'Conscientiousness is one of the five traits of the Five Factor Model of personality, and is an aspect of what has traditionally been referred to as having character. Conscientious individuals are generally hard working and reliable. When taken to an extreme, they may also be "workaholics", perfectionists, and compulsive in their behavior. People who score low on conscientiousness tend to be more laid back, less goal-oriented, and less driven by success; they also are more likely to engage in antisocial and criminal behavior.'

;with  cte0 as (select 1 as c union all select 1),        
       cte1 as (select 1 as c from cte0 a, cte0 b),        
       cte2 as (select 1 as c from cte1 a, cte1 b),        
       cte3 as (select 1 as c from cte2 a, cte2 b),        
       cte4 as (select 1 as c from cte3 a, cte3 b),        
       nums as (select row_number() over (order by c) as n from cte4)       
SELECT kt.Word,COUNT(*) cnt
  FROM KeywordTable kt 
       JOIN (SELECT n,LTRIM(SUBSTRING(Paragraph, n, CHARINDEX(' ', Paragraph + ' ', n) - n)) AS Word
               FROM nums 
                    CROSS JOIN (SELECT @Paragraph Paragraph) T1 
              WHERE SUBSTRING(' ' + Paragraph, n, 1) = ' ' 
                AND n < LEN(Paragraph) + 1) p 
         ON kt.Word = p.Word 
 GROUP BY kt.Word 

Open in new window

http://sqlfiddle.com/#!3/59cf12/3
0

Featured Post

Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

An alternative to the "For XML" way of pivoting and concatenating result sets into strings, and an easy introduction to "common table expressions" (CTEs). Being someone who is always looking for alternatives to "work your data", I came across this …
Simulator games are perfect for generating sample realistic data streams, especially for learning data analysis. It is even useful for demoing offerings such as Azure stream analytics, PowerBI etc.
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Viewers will learn how the fundamental information of how to create a table.

885 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question