Link to home
Start Free TrialLog in
Avatar of andrewh123
andrewh123

asked on

Count distinct words in a text field, group and order by the result

Hi,
I have some articles that I want to examine with regard to frequency of words.  They are held in MS Sql database, and I would be using Query Analyser to get the results, or possibly a stored procedure.

I only need to do this one at a time but I have:
id, article_title, article text  (as int,text,text respectively).  The article_text field can have a long article in of several thousand words.  What I want is to be able to count the number of times each word appears in the article and then rank by the result.

Ideally for bells and whistles I'd like to alos be able to omit a list of common words (eg the, and etc), omit punctuation and have the option to count words of more than 4 characters.  But the answer for the first part only would be excellent.

A word is defined by being separated by a space.

Thanks in advance,

andrew
ASKER CERTIFIED SOLUTION
Avatar of Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3]
Flag of Luxembourg image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
ptjcb, You don't even need to export the data to a text file. Scripting languages have the ability to connect to a DB and read in a dataset.
True, I just like less load on the server and something that runs through every word could be fairly intensive (depending on how well the app is written).
I guess it is a matter of preference, but I think it would be better just making a script call because you only need one transfer of data instead of two. (SQL to VBScript  as opposed to SQL to TextFile To VBScript).
Avatar of andrewh123
andrewh123

ASKER

Thanks guys - this has saved me the time of struggling with something that was never going to work and scripting an export to a text file would seem like a good way of doing it.  I'll try and split the points up as you've each given me something helpful, but as angelIII was first, I'll give the lions share there.