Solved

Counting Unique Occurances of Words in a Table

Posted on 2009-04-07
5
581 Views
Last Modified: 2012-05-06
Hey Guys,

I have a table that contains vast quantities of data relating to helpdesk jobs. I'm trying to write a query that will go through a column called Symptom in a table called Faults and count the total number of times each word occurs in the table.

E.g. Say I have five column such as below

Symptoms Column                       |
----------------------------------------
back up reporting failed.
back up reporting failed
back up reporting failed.
drive space c showing 1.45GB
back up reporting failed.

I would like it to step through the table counting each unique occurance of every word it comes across and then produce a view showing this such as below:

back 4
up 4
reporting 4
failed 4
drive 1
space 1
c 1
showing 1

and so on. Anyone got any ideas??
0
Comment
Question by:barrymulholland
  • 3
  • 2
5 Comments
 
LVL 21

Expert Comment

by:JestersGrind
ID: 24089568
You could do something like this.  You need a tally table though.  A tally table is simply a table with ascending sequential numbers.  It's used to avoid looping.  In my example, I called the table Test and the column of text, Textdata.
Greg
 
 

SELECT SUBSTRING(Textdata + ' ', N, CHARINDEX(' ', Textdata + ' ', N)-N) AS Word, COUNT(*) AS Total

FROM Tally a (NOLOCK) CROSS JOIN

	Test b

WHERE N > 0 AND N<LEN(Textdata)+LEN(' ') AND SubString(' ' + Textdata + ' ', N, 1) = ' '

GROUP BY SUBSTRING(Textdata + ' ', N, CHARINDEX(' ', Textdata + ' ', N)-N)

Open in new window

0
 
LVL 4

Author Comment

by:barrymulholland
ID: 24095312
Hi JestersGrind,

Thanks for the prompt response!

I created a tally table called Sequence containing 20000 sequential numbers and then attempted to use your query (copied below), but it doesn't seem to work, I get an error stating that Sequence is an invalid object. Though if I do a simple SELECT * FROM Sequence it works fine.

Any idea's?
SELECT SUBSTRING(Symptom + ' ', N, CHARINDEX(' ', Symptom + ' ', N)-N) AS Word, COUNT(*) AS Total

FROM Faults a (NOLOCK) CROSS JOIN

Sequence b

WHERE N > 0 AND N <LEN(Symptom)+LEN(' ') AND SubString(' ' + Symptom + ' ', N, 1) = ' '

GROUP BY SUBSTRING(Symptom + ' ', N, CHARINDEX(' ', Symptom + ' ', N)-N)

Open in new window

0
 
LVL 21

Accepted Solution

by:
JestersGrind earned 500 total points
ID: 24096218
I just created tables with the same names as your example above and it worked fine for me.  Make sure that the Sequence table is in the same database as Faults or fully qualify it.  And the ascending value column in the Sequence table is called N or replace N in the query above with the correct name of the column.

Lastly to get the exact same output as your example, you have to add ORDER BY Total DESC, Word.

Greg


0
 
LVL 4

Author Comment

by:barrymulholland
ID: 24096863
Hi Jesters,

Thanks for some reason I had to fully qualify that table (but not the other, even though they're in the same database, go figure!) I also had to cast the symptom field as it was ntext which doesn't work with + operator, changed it to nvarchar. The query is running just now, I'll let you know how I got on when it finishes :-)

Thanks,

Barry
0
 
LVL 4

Author Closing Comment

by:barrymulholland
ID: 31567578
Worked like a charm, thank you very much!
0

Featured Post

Netscaler Common Configuration How To guides

If you use NetScaler you will want to see these guides. The NetScaler How To Guides show administrators how to get NetScaler up and configured by providing instructions for common scenarios and some not so common ones.

Join & Write a Comment

Suggested Solutions

In this article I will describe the Backup & Restore method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
Using SQL Scripts we can save all the SQL queries as files that we use very frequently on our database later point of time. This is one of the feature present under SQL Workshop in Oracle Application Express.
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

22 Experts available now in Live!

Get 1:1 Help Now