Link to home
Start Free TrialLog in
Avatar of saturation
saturation

asked on

SQL Query question - get most common words/string

I am looking to get, say the 30 most commonly used strings/phrases on a column called "item" in my database.  How can I write a query that would not only return the top 30 words/string, but also tell me how many of them there are of each?  I'm not a SQL genius, so any help is appreciated.  Here's some sample data:

ITEM
Mustang
Mustang GT
Fiero
Camaro IROC
Firebird
Firebird
Sport TS
Sunbird GTS
Maxima
Altima
Altima GX
Avatar of WayneATaylor
WayneATaylor
Flag of United Kingdom of Great Britain and Northern Ireland image

As long as you want to just look at the whole field contents as a whole then the following will work

select top 30 item,count(*) from TABLE group by item order by count(*) desc

Just replace the table name in TABLE

Wayne
Avatar of saturation
saturation

ASKER

What if I'm looking for any string more than 2 characters in the column rather than the whole column?
Much more complicated!

One way that I have done type of thing in the passed is have some Transact SQL code that goes through the whole table row by row, then goes through each fields contents to pick the words out (over two characters anyway) and then creates a temp table that has two fields a WORD field (Although can't be called word because of naming restrictions!) and a count field (Same naming restriction!) and then either creates an entry if there isn't one with a count of 1, or if there is already an entry it adds one to the count.
At the end of the code you can then just do the same query but without the GROUP so just

select top 30 word,count rom TABLE order by count desc

Does this make sense.  I might have some sample code for the transact SQL if required.

Wayne
I would love some sample code--I need all the help I can get!  Thanks in advance!
As an overview the following code will give you an idea of how to use a cursor to step through the table.

The lines with // are comments...

//Firstly declare a cursor using
declare cursorname CURSOR for select item from TABLE

//Declare a field to use for the field contents.  Set the number to be the max size of the field
declare @fieldcontents as varchar(100)

//Then open the cursor
OPEN cursorname

//Then fetch the first entry
FETCH NEXT FROM cursorname INTO @fieldcontents

WHILE @@fetch_status=0
BEGIN

//You will at this point have the while contents of the field in the @fieldcontents variable
//You will need them to have the code to go through the contents and write to the temp table

      FETCH NEXT FROM cursorname INTO @fieldcontents
END
CLOSE Updategm
DEALLOCATE Updategm


I guess this is about half of what you need, but should give you a good start!

Wayne
In fact I forgot to change the cursor name in the last few lines, you need what is your cursor name

Wayne
ASKER CERTIFIED SOLUTION
Avatar of Scott Pletcher
Scott Pletcher
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
ScottPletcher,

I understand all of this except for the "ss.value"--what is that supposed to represent (by the way, thanks!)?
That's the column name for the values returned by the function.

For example, if the original table had string:

'ab,cde,fghij,k'

then the function would return a table that looks like this:

value_seq, value:
1, ab
2, cde
3, fghij
4, k

Just run the function by itself against the table, with a SELECT * instead of a GROUP BY, and you'll see the new column "value" coming from the function, with a single value from the string.
Oh--this is perfect.  Exactly what I needed--thanks!