Solved

Counting Unique Occurances of Words in a Table

Posted on 2009-04-07
5
612 Views
Last Modified: 2012-05-06
Hey Guys,

I have a table that contains vast quantities of data relating to helpdesk jobs. I'm trying to write a query that will go through a column called Symptom in a table called Faults and count the total number of times each word occurs in the table.

E.g. Say I have five column such as below

Symptoms Column                       |
----------------------------------------
back up reporting failed.
back up reporting failed
back up reporting failed.
drive space c showing 1.45GB
back up reporting failed.

I would like it to step through the table counting each unique occurance of every word it comes across and then produce a view showing this such as below:

back 4
up 4
reporting 4
failed 4
drive 1
space 1
c 1
showing 1

and so on. Anyone got any ideas??
0
Comment
Question by:barrymulholland
  • 3
  • 2
5 Comments
 
LVL 21

Expert Comment

by:JestersGrind
ID: 24089568
You could do something like this.  You need a tally table though.  A tally table is simply a table with ascending sequential numbers.  It's used to avoid looping.  In my example, I called the table Test and the column of text, Textdata.
Greg
 
 

SELECT SUBSTRING(Textdata + ' ', N, CHARINDEX(' ', Textdata + ' ', N)-N) AS Word, COUNT(*) AS Total
FROM Tally a (NOLOCK) CROSS JOIN
	Test b
WHERE N > 0 AND N<LEN(Textdata)+LEN(' ') AND SubString(' ' + Textdata + ' ', N, 1) = ' '
GROUP BY SUBSTRING(Textdata + ' ', N, CHARINDEX(' ', Textdata + ' ', N)-N)

Open in new window

0
 
LVL 4

Author Comment

by:barrymulholland
ID: 24095312
Hi JestersGrind,

Thanks for the prompt response!

I created a tally table called Sequence containing 20000 sequential numbers and then attempted to use your query (copied below), but it doesn't seem to work, I get an error stating that Sequence is an invalid object. Though if I do a simple SELECT * FROM Sequence it works fine.

Any idea's?
SELECT SUBSTRING(Symptom + ' ', N, CHARINDEX(' ', Symptom + ' ', N)-N) AS Word, COUNT(*) AS Total
FROM Faults a (NOLOCK) CROSS JOIN
Sequence b
WHERE N > 0 AND N <LEN(Symptom)+LEN(' ') AND SubString(' ' + Symptom + ' ', N, 1) = ' '
GROUP BY SUBSTRING(Symptom + ' ', N, CHARINDEX(' ', Symptom + ' ', N)-N)

Open in new window

0
 
LVL 21

Accepted Solution

by:
JestersGrind earned 500 total points
ID: 24096218
I just created tables with the same names as your example above and it worked fine for me.  Make sure that the Sequence table is in the same database as Faults or fully qualify it.  And the ascending value column in the Sequence table is called N or replace N in the query above with the correct name of the column.

Lastly to get the exact same output as your example, you have to add ORDER BY Total DESC, Word.

Greg


0
 
LVL 4

Author Comment

by:barrymulholland
ID: 24096863
Hi Jesters,

Thanks for some reason I had to fully qualify that table (but not the other, even though they're in the same database, go figure!) I also had to cast the symptom field as it was ntext which doesn't work with + operator, changed it to nvarchar. The query is running just now, I'll let you know how I got on when it finishes :-)

Thanks,

Barry
0
 
LVL 4

Author Closing Comment

by:barrymulholland
ID: 31567578
Worked like a charm, thank you very much!
0

Featured Post

What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Entering a date in Microsoft Access can be tricky. A typo can cause month and day to be shuffled, entering the day only causes an error, as does entering, say, day 31 in June. This article shows how an inputmask supported by code can help the user a…
Using SQL Scripts we can save all the SQL queries as files that we use very frequently on our database later point of time. This is one of the feature present under SQL Workshop in Oracle Application Express.
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

831 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question