matija_ asked:

Classic ASP - group multiple keywords in database by popularity

My database has the entries "john doe", "john", "chicago", "new york", "new hampshire" in the table "keywords". Each entry is in its own row.
I want to display these entries by popularity like:

2x john
2x new
1x doe
1x chicago
1x york
1x hampshire

How to do it? Thanks
Avatar of FreakyEddie
FreakyEddie

Is the divider a space? I.e., should an entry with two words, with a space in between, be seen as two separate keywords?

Do you want it to be a solution in SQL or in ASP?
I would probably use TOP 100 or something like that and then sum them by number of occurrences.

For instance, let's say that you have a field (and I think you should) called Occurrance in your table.

Then, in your insert logic, insert the initial record with a count of 1, and whenever the same name is inserted again, increment the count instead, like:

DIM intCount
intCount = objRS("Occurrance")
objRS("Occurrance") = intCount + 1

SQL = "SELECT TOP 100 name, SUM(Occurrance) as Occurrance " & _
"FROM YourTable " & _
"GROUP BY name " & _
"ORDER BY SUM(Occurrance) DESC"

Then that will give you the result.

You will use your UI to present it as 2x John, etc

Hope this helps
Avatar of matija_

ASKER

@FreakyEddie
Yes, the divider is a space, and an entry with two or more words should be seen as separate keywords. It doesn't matter whether the solution is in VBScript or a SQL query.

@sammySeltzer
Thanks for the suggestion, but there are already several thousand records written to the database, and that is what I'm trying to deal with now.
If you write all the keywords separately to another table first, it makes things much easier.
You could then use:
"select keyword, count(keyword) as mycount from table group by keyword order by count(keyword) desc"
Avatar of matija_

ASKER

@Surone1
I am aware of that, thanks, but as I wrote, the keywords are already written.

So far I've managed to pull all the keywords out of the database, split them on spaces, and place them in an array, but I cannot count the occurrences in that array :( Hence my question: there might be an easier way of doing this while preserving the database structure and the records already written.
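
For readers stuck at the same step, here is a minimal sketch of the split-and-count logic. It is written in Python only to illustrate the idea; the function name `count_keywords` and the hard-coded sample rows are assumptions for the example, not part of the original script. In VBScript the same tally could be kept in a `Scripting.Dictionary` keyed by word.

```python
from collections import Counter


def count_keywords(entries):
    """Split each entry on whitespace and tally the individual words."""
    counts = Counter()
    for entry in entries:
        # "john doe" contributes one count to "john" and one to "doe"
        counts.update(entry.split())
    return counts


# Sample rows from the question; in practice these would be read from the table.
entries = ["john doe", "john", "chicago", "new york", "new hampshire"]
counts = count_keywords(entries)

# Print the words from most to least frequent, in the "2x john" style.
for word, n in counts.most_common():
    print(f"{n}x {word}")
```

The point of the design is that no second table is needed: the rows stay as they are, and the grouping happens on the split words in memory.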
use a temporary table?
I think the concept is the same.

SQL = "SELECT name, COUNT(name) AS Occurrance " & _
"FROM YourTable " & _
"GROUP BY name " & _
"ORDER BY COUNT(name) DESC"

Then on the ASP side of things, you can do:

' rs is assumed to be a recordset already opened on the query above
Do While Not rs.EOF
    ' One line per name, e.g. "2x john"
    Response.Write "<font size=""2"">" & rs("Occurrance") & "x " & rs("name") & "</font><br>"
    rs.MoveNext
Loop
rs.Close
Set rs = Nothing
objConn.Close
Set objConn = Nothing


Something like this.

Not tested, of course.
Avatar of matija_

ASKER

@sammySeltzer
You've missed the point. I know how to write a simple script that counts and groups identical entries and displays them, but I need the entries to be split into separate keywords first and then counted and grouped.
I don't want to create new tables or anything. I wrote a snippet that separates the keywords and puts them all in one giant array, but I'm stuck on how to group them, hence the original question.
OK, got it.
What's the maximum number of spaces you have within one field? Two, three, or ten?

With two or three we can do it in a query; with ten the query is going to be a bit long.
Avatar of matija_

ASKER

@FreakyEddie
Let's try with 3 spaces; I can easily modify the script if there are more than that. There is no limit in the working example.
I'll do my best.
ASKER CERTIFIED SOLUTION
Avatar of FreakyEddie
FreakyEddie

This should do the trick for 3 keywords.
SELECT     COUNT(*) AS Expr1, field1
FROM         (SELECT     field1
                       FROM          (SELECT     SUBSTRING(strSingleKeyword, 0, CHARINDEX(' ', strSingleKeyword)) AS field1
                                               FROM          tableKeywords
                                               UNION ALL
                                               SELECT     LTRIM(RTRIM(SUBSTRING(strSingleKeyword, CHARINDEX(' ', strSingleKeyword) + 1, CHARINDEX(' ', strSingleKeyword, CHARINDEX(' ', strSingleKeyword) + 1) 
                                                                     - CHARINDEX(' ', strSingleKeyword)))) AS field1
                                               FROM         (SELECT     LTRIM(RTRIM(strSingleKeyword)) AS strSingleKeyword
                                                                      FROM          tableKeywords AS der_table1) AS derivedtbl_1
                                               WHERE     (strSingleKeyword LIKE '%  %')) AS derived_table3
                       UNION ALL
                       SELECT     LTRIM(RTRIM(SUBSTRING(strSingleKeyword, CHARINDEX(' ', strSingleKeyword, CHARINDEX(' ', strSingleKeyword) + 1) + 1, CHARINDEX(' ', strSingleKeyword, CHARINDEX(' ', 
                                             strSingleKeyword, CHARINDEX(' ', strSingleKeyword) + 1) + 1) - CHARINDEX(' ', strSingleKeyword, CHARINDEX(' ', strSingleKeyword) + 1)))) AS field1
                       FROM         (SELECT     LTRIM(RTRIM(strSingleKeyword)) AS strSingleKeyword
                                              FROM          tableKeywords AS der_table1) AS derivedtbl_1_1
                       WHERE     (CHARINDEX(' ', strSingleKeyword, CHARINDEX(' ', strSingleKeyword) + 1) > LEN(strSingleKeyword))) AS derived_table4
GROUP BY field1
ORDER BY field1

Perhaps these articles will help you:
SQLTips
WeAsk
 
It explains and shows examples of the CharIndex function in SQL
 
Good luck!
Avatar of Anthony Perkins
Since you are using MS SQL Server you may want to consider using Full-Text Search.
Avatar of matija_

ASKER

I'm using MS Access database :\
The query didn't work out for you?
Avatar of matija_

ASKER

@FreakyEddie
I have rewritten the MSSQL string functions to their Access equivalents, but the query is inscrutable to me. What is "strSingleKeyword"? My table has only 1 column, named "keywords", which should match "field1" from your query.
strSingleKeyword is the name of the field.
tableKeywords is the name of the table.

So if you change strSingleKeyword to the field name in your own database and change tableKeywords to the name of your table, it should work.
Avatar of matija_

ASKER

I have converted it to Access syntax, but it's not counting right. Can you take a look at my query, please:

SQL = "SELECT Keyword, COUNT(*) FROM (SELECT MID(strSingleKey, 1, INSTR(strSingleKey,' ')) AS Keyword FROM tableKeywords "
SQL = SQL & "UNION ALL SELECT LTRIM(RTRIM(MID(strSingleKey, INSTR(strSingleKey,' ') + 1, INSTR(INSTR(strSingleKey,' ') + 1, strSingleKey, ' ') - INSTR(strSingleKey,' ')) )) AS Keyword "
SQL = SQL & "FROM (SELECT LTRIM(RTRIM(strSingleKey)) AS strSingleKey FROM tableKeywords) WHERE (strSingleKey LIKE '%  %')) "
SQL = SQL & "GROUP BY Keyword ORDER BY COUNT(*) DESC"
There's a double space between the % signs in WHERE (strSingleKey LIKE '%  %')) "
The WHERE clause is only there because taking a substring of a string that doesn't have a space returns an error. So you should also include the strings that don't have spaces.

Use another SELECT for that, like:
SELECT strSingleKey AS Keyword FROM tableKeywords WHERE NOT (strSingleKey LIKE '% %')
UNION ALL
other substrings

or something like that
Avatar of matija_

ASKER

I've left the double space in "LIKE '%  %'" because it returns an error otherwise.
I think the problem lies within this line:

SQL = SQL & "UNION ALL SELECT LTRIM(RTRIM(MID(SearchTerm, INSTR(SearchTerm,' ')+1, INSTR(SearchTerm, ' ', INSTR(SearchTerm, ' ')+1) - INSTR(SearchTerm, ' ') ))) AS Keyword "

I can get results, but it's counting wrong. E.g., I have the following rows:
"new york"
"new"
"york"
"york new"
"new york new"

and it returns:
2x "new"
2x ""
1x "york"

I'm a bit lost in this query :(
Check this site.
This one is a bit better explained than mine, but it uses - instead of spaces:

http://stackoverflow.com/questions/630907
But I think you'll manage.
Avatar of matija_

ASKER

Thanks for all the inspiration @FreakyEddie, I've managed to rewrite your code snippet (using small hacks) to work flawlessly with MS Access database.

Final code:
SQL = "SELECT Keyword, COUNT(Keyword) FROM ("
SQL = SQL & "SELECT strColumnName AS Keyword FROM strTable WHERE NOT strColumnName LIKE '% %' "
SQL = SQL & "UNION ALL "
SQL = SQL & "SELECT MID(strColumnName, 1, INSTR(strColumnName, ' ')) AS Keyword FROM strTable WHERE strColumnName LIKE '% %' "
SQL = SQL & "UNION ALL "
SQL = SQL & "SELECT MID(strColumnName, INSTR(strColumnName, ' ') + 1, INSTR(INSTR(strColumnName + ' ', ' ') + 1, strColumnName + ' ', ' ') - INSTR(strColumnName + ' ', ' ')) AS Keyword FROM "
SQL = SQL & "(SELECT strColumnName AS strColumnName FROM strTable WHERE strColumnName LIKE '% %') "
SQL = SQL & ") GROUP BY Keyword ORDER BY COUNT(Keyword) DESC"

It's parsing only the first 2 words for now, but with an easy modification, more words can be added. Hope it helps someone in the future.
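
To make the position arithmetic in the final query easier to follow, here is a rough Python emulation of its three branches. The helpers `instr` and `mid` are hypothetical stand-ins that mimic the 1-based behaviour of Access `InStr` and `Mid`; `first_word`, `second_word`, and the sample rows are likewise assumptions for illustration. Note that the first-word expression keeps the trailing space, so this sketch strips it before counting; whether the database groups "john " with "john" is not something the thread settles.

```python
from collections import Counter


def instr(start, text, sub):
    """1-based position of sub in text at or after start; 0 if absent (like InStr)."""
    return text.find(sub, start - 1) + 1


def mid(text, start, length):
    """Up to `length` characters of text from 1-based position `start` (like Mid)."""
    return text[start - 1:start - 1 + max(length, 0)]


def first_word(col):
    # Second SELECT of the query: the result keeps the trailing space.
    return mid(col, 1, instr(1, col, " "))


def second_word(col):
    # Third SELECT: pad with a space so a second separator always exists.
    padded = col + " "
    first_space = instr(1, padded, " ")
    next_space = instr(first_space + 1, padded, " ")
    return mid(col, first_space + 1, next_space - first_space)


rows = ["john doe", "john", "chicago", "new york", "new hampshire"]
keywords = []
for col in rows:
    if " " not in col:
        keywords.append(col)  # first SELECT: single-word rows pass through
    else:
        keywords.append(first_word(col))
        keywords.append(second_word(col))

# Assumption: trailing spaces are stripped so "john " and "john" count together.
counts = Counter(k.strip() for k in keywords)
```

A third word would follow the same pattern: search for the next space starting from `next_space + 1` and take the characters in between.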
You should really award the points to FreakyEddie :-)
Avatar of matija_

ASKER

Why 0 points to FreakyEddie's comment #33647969 when I selected 500pts x A multiplier?
Perhaps something went wrong?
At least your intentions are clear; as soon as a zone advisor shows up, they can close it correctly.
Avatar of matija_

ASKER

I have chosen FreakyEddie's answer as Best Solution which solved the question, entered 500 pts and chose A grade multiplier, but it says 0 points awarded when I close it... wtf?
Don't worry, you supplied all the information they need to fix it. It will just take a while longer now.
>>I'm using MS Access database :\<<
Which of course begs the question as to why you would accept a solution that is unusable as is in MS Access.
Avatar of matija_

ASKER

@acperkins
It only begs the question if you haven't read my final comment #33664458. I awarded FreakyEddie's comment because it inspired me to rewrite his MSSQL solution for MS Access, which I have shared with the rest of you. I wanted to accept my own comment #33664458 as an assisted solution (with no points given, of course), but EE glitched at that point and gave no points to FreakyEddie.