Link to home
Start Free TrialLog in
Avatar of Qsorb
QsorbFlag for United States of America

asked on

Find matching id in comma delimited list (URGENT)

<!--- AreasInfo contains list of all news areas with descriptions, indexed by the area_id (pk int) --->
 <cfquery name="AreasInfo" datasource="news1">
 select *
 from areas
 where area_id = 2
</cfquery>


<!--- CatInfo is a simple list of all possible news category keywords. Each keyword (column named catword) is saved separatly, NOT a comma delimited list --->
<!--- CatInfo.id is pk int --->
<cfquery name="CatInfo" datasource="news1">
 select id, catword
 from category
</cfquery>


<!--- I'm attempting to parse through the comma delimited list below (cKeywords.catword) of saved keywords for each area_id, and show the corresponding area_id for each, to be used as a reference later --->

<cfoutput query="CatInfo">
    <!--- Keywords matching --->
    <cfquery name="cKeywords" datasource="news1"> <!--- cKeywords.catword is saved in comma delimited varchar field)--->
       select area_id,catword
       from category_keywords
       where catword = '#CatInfo.catword#' <!--- catwords saved in comma delimited varchar field)--->
       and area_id = '#AreasInfo.area_id#' <!--- area_id = pk int --->
     </cfquery>

<!--- Show Catinfo.ID for each saved --->
<!--- Not sure if I need a cfloop somewhere to grab each keyword from the comma delimited list (cKeywords.catword) then find and list each area_id for that separate keyword. --->

<!--- Show output here ---><cfdump var="??">

</cfoutput>

Open in new window


I'm attempting to compare a comma delimited list to another table containing matching strings, and get the matching area id.

Please see my code snippet below to show me what I'm missing, most likely I need a cfloop but cannot figure out where.
Avatar of _agx_
_agx_
Flag of United States of America image

Edit: Gah... never mind, I missed the comment about "category" NOT storing a comma delimited list.
Avatar of Qsorb

ASKER

Close but not quite. Looks like I did a horrible job of explaining.

Category table has only one keyword per id.

Table: category
ID, CatWord
  1  | world
  2  | local
  3  | international
  4  | chicago
  5  | sports


Right. Category_keyword has multiple keywords for its catword field, comma delimited. There may be a couple dozen or do stored. But I'm only interested in one at a time that matches the single keyword found in the Category table.

Table: category_keywords
 area_id, catword
  22 | planetary, global, worldwide
  23 | obama, kennedy, political, political debate
  24 | general, taxes,murder
  28 | foo, bar, baz, qux


When parsing through via cfoutput query or cfloop, I need to know what id was found for the category when it matches one comma delimited category_keywords.
Avatar of Qsorb

ASKER

You must have deleted your suggestion just before I saved my reply.

The reason I'm trying to get this to work is that I'd rather story the CATEGORY.ID instead of the CATWORD because I may wish to change the keyword name at some point, so always using the PK INT ID seems to make sense. Make it more flexible. Once I have this working, I can store the CATAGORY.ID as well as the name, and in the future just reference the ID to get the word.

In my lame attempt to get this working I also tried this:

<CFLOOP query="cKeywords">

 <CFQUERY NAME="CatInfo" DATASOURCE="ADDFILES">
   select *
   from category
   where catword = '#cKeywords.catword#'
 </CFQUERY>

  <cfif cKeywords.catword eq CatInfo.catword>
   <cfoutput>#cKeywords.catword#  |  #CatInfo.catword#</cfoutput>
  </cfif>

</CFLOOP>

Some day I'm really going to have to study and learn CF, cfloops, elements, lists, etc. It's my achilles heel. Among other things ...

So, hope this sheds more light on what I'm trying to do.
Avatar of Qsorb

ASKER

To demonstrate my desperation (and ineptitude) I also tried this:

<CFOUTPUT query="CatInfo">

 <CFQUERY NAME="cKeywords" DATASOURCE="ADDFILES">
   select *
   from category_keywords
   where area_id = '#AreasInfo.Area_id#'
   <cfloop condition = "CatInfo.catword EQUAL TO cKeywords.catword"></cfloop>  
 </CFQUERY>

 #cKeywords.catword#  |  #CatInfo.catword#

</CFOUTPUT>


If nothing else, that should  give you more hints at what I'm trying to do.
Sorry our posts clashed...

The reason I'm trying to get this to work is that I'd rather story the CATEGORY.ID instead of the CATWORD because I may wish to change the keyword name at some point, so always using the PK INT ID seems to make sense.

So ultimately are you trying to switch the list of words

         ie "planetary, global, worldwide"

.. to a list of id's instead?  (The end goal is still a little fuzzy.)

         ie   "22,35,66"

Using an ID is better, but you're still stuck with the inherent problems that come with storing lists.  You can't easily query them as you've seen.  Normally you want to store each relationship in a separate record.

But for now let's say you have to stick with your current structure. Using this data as an example:

           ID, CatWord
             1  | world
             2  | local
             3  | international
             4  | chicago
             5  | sports


           area_id, catword
             22 | planetary, global, world
             23 | obama, kennedy, political, political debate
             24 | general, taxes, murder
             28 | foo, bar, baz, qux
             32 | general, local
             36 | athletics, sports, olympics

...  what do you want the results to look like?  That part's not real clear.  Lots of formats are possible.  But it'd help to see the desired output so I know what code to recommend.

Also, what do you want to do if a single "catword" is associated with multiple "area_id's" (which seems possible).
Avatar of Qsorb

ASKER

>Also, what do you want to do if a single "catword" is associated with multiple "area_id's" (which seems possible).

I'm only looking at one AREA_ID at a time so multiple "area_id's" should never be a problem.

> what should the desired result look like?
Not sure what you mean, "look like". If you mean how do I want to output the result, that is, the matching keyword, or the ID, well, that's not important. I just want the ID for the word found in the CATEGORY table that matches one in the other list. That is, does the category word match any of the comma delimited list, and if so, show me the Category ID number. I just don't know how to set it p and cfloop through the keywords to get the match.


Each page always uses the same AREA_ID so assume the area_id is available. The CATEGORY table does not use AREA_ID. It's just a list, no conditions.

So, once I know that the common keyword,  "taxes" for example is found as one single word in a comma delimited list stored in my CATEGORY_KEYWORDS table (CKEYWORD.CATWORD), for any particular AREA_ID, and I also know the "CATEGORY.ID" in the CATEGORY table by using the code you'll hopefully show me tonight, I should be able to figure out anything further. Does that answer your question?
Oooh... duh! Sorry to be slow on the uptake .. it's been a long week!

To find a single category word within a list, I think all you need is a LIKE clause.  LIKE doesn't have great performance, but it's pretty much the only option when storing lists.  Also since you're looping, I'd recommend using cfqueryparam. It can boost performance by helping the db reuse the same plan.

          <1--- ms sql specific syntax --->
         <cfoutput query="CatInfo">
                   <cfquery name="cKeywords" datasource="news1">
                              SELECT area_id,catword
                              FROM    category_keywords
                              WHERE  area_id = '#AreasInfo.area_id#'
                              AND       ','+ catword +','  LIKE  '%,#CatInfo.catword#,%'
                   </cfquery>

                   <!--- do stuff w/results here ...--->
        </cfoutput>

So say the loop searches for "WORLD", the query would find any of these:

             planetary, global,world              <=== ends with  
             world,news                                   <=== starts with
             foo,world,bar                                <===  contains
            world                                                <===  equals
I've got to take off. But the last query is tested w/ms sql and I think it should do the trick.  If not, feel free to open a new question - as it's urgent.  (You can always delete this one later if it didn't end up solving the problem .)
Avatar of Qsorb

ASKER

That is not working for me Agx.

I tried a variation of this a while back but after seeing your suggestion thought I must be doing something wrong. Still must be but using your suggestion this is what I'm getting:


   6  | local,national,world,chicago,sports
  13 | local,national,world,chicago,sports
  43 | local,national,world,chicago,sports
  86 | local,national,world,chicago,sports
  92 | local,national,world,chicago,sports
100 | local,national,world,chicago,sports


As you see, I'm getting all the discovered keywords on one row,  per ID. What I need is just one keyword per id, like this:


 6  | local
13 | national
43 | world
86 | chicago
92 | sports

For any keyword (catword) in my cKeywords table, I'll know the CATEGORY.ID

I also tried removing the comma search string to make the
consition simpler like this:

AND catword LIKE '%#CatInfo.catword#%'

Got the same result.



Here's the code I used:

<cfoutput query="CatInfo">

 <cfquery name="cKeywords" datasource="NEWS1">
   SELECT *
   FROM category_keywords
   WHERE area_id = '#AreasInfo.area_id#'
   AND ','+ catword +',' LIKE '%,#CatInfo.catword#,%'  
</cfquery>

<cfif cKeywords.recordcount GT 0>
 #CatInfo.id# | #cKeywords.catword#<br />
</cfif>
 
</cfoutput>


 Anyone have any suggestions to get just the one keyword like this?
 
 6  | local
13 | national
43 | world
86 | chicago
92 | sports
> What I need is just one keyword per id, like this:

Oh.. that's what I meant when I kept asking for a sample of the desired results. A picture's worth a thousand words (or in this case a data sample). What I posted was designed to produce totally different results. Anyway, were you able to figure this out?
If not, I think you may be approaching it from the wrong angle.  

You said the script only queries a single area_id.  Say you queried "category_keywords" on area_id=22, should it return multiple records - OR- only one? If it's a single record, you should be querying "category_keywords" instead.   First look up the category list for that area:

           <cfquery name="getAreaCategoryList" ...>
            SELECT area_id, catword FROM  category_keywords WHERE area_id = 22

           results:
            ==============
            area_id | catword
             22         | planetary,global,world              

... and use the list to get the individual categories.  

             SELECT    ID, CatWord
             FROM      category
             WHERE    CatWord IN  (  <cfqueryparam  value="#getAreaCategoryList.catword#"  
                                          cfsqltype="cf_sql_varchar"  
                                          list="true">
                               )

.. translated it's the same as using:

               WHERE CatWord IN ('planetary','global','world')

Assuming all of 3 words are in the category table, the query will return this:

           results:
            ==============
             ID, CatWord
             1     | world
             xx   | global
             xx   | planetary
Avatar of Qsorb

ASKER

>Anyway, were you able to figure this out?

No. Still trying desperately to get it working. What do I need to change on the query arguments to get just one word per id?

Are you suggesting I remove this:

AND ','+ catword +',' LIKE '%,#CatInfo.catword#,%'  

and substitute it for this:


AND CatWord IN ('#CatInfo.catword#')

I'm not getting any result there. So, not sure exactly what you suggest I change and how.
I added an edit just before you responded.  Can you take a look and see if it helps?
Avatar of Qsorb

ASKER

The problem is not knowing exactly what you want me to do. So without the exact code, I've tried every combination I can imagine, including this:

 <cfquery name="cKeywords" datasource="NEWS1">
   SELECT *
   FROM category_keywords
   WHERE area_id = '#AreasInfo.area_id#'
   WHERE    CatWord IN  (  <cfqueryparam  value="#Category.catword#"  
                                          cfsqltype="cf_sql_varchar"  
                                          list="true">
</cfquery>


This gives me the error:

Element CATWORD is undefined in CATEGORY.

> You said the script only queries a single area_id.  Say you queried "category_keywords"
> on area_id=22, should it return multiple records - OR- only one?

Yes, should return just one keyword. I'll see if I can turn the query around to query cKeywords as the output query. An example from you tho might be much faster because as it is now, I'm not sure which suggestion to change around, the first example, or your last suggestion.
ASKER CERTIFIED SOLUTION
Avatar of _agx_
_agx_
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
              > You said the script only queries a single area_id.  Say you
               > queried "category_keywords" on area_id=22, should it return
               > multiple records - OR- only one?

              Yes, should return just one keyword.

No, that's not quite what I was asking.  I wanted to know if the category_keywords table *only* contains one record per area_id:

             area_id, catword
             22 | planetary, global, world
 
OR if can it have multiple records for the same area_id.  Notice there's two different lists of keywords for area_id=22?

             area_id, catword
             22 | planetary, global, world
             22 | xxxx, yyyy, aaaa

If it's one record,  the code I posted is fine.  If it's multiple - you'll need slightly different code.
Avatar of Qsorb

ASKER

> the category_keywords table *only* contains one record per area_id:
It contains multiple records in comma delimited format for the same area_id.

I need one keyword record only. I never want to see multiple keywords. I only need to know the category.id record that matches the comma delimited list of keywords (catword) corresponding to each table.

The CATEGORY table just stores all possible keywords (single keywords) and has the pk int ID associated with that one word keyword. The column name for the storing the keyword is named CATWORD.

The CATEGORY_KEYWORDS table holds the same identical associated keywords in comma delimited format for each area_id. They were originally taken from the CATEGORY table. The column which holds those comma delimited words is also named CATWORD. I'll probably change this dumbass move later.


To help you understand what I'm trying to do, I'll give you more detail, which usually seems to confuse many experts or keep them from attempting to help. It's my fault, I know.

Maybe this background will help:

To make future searching easier, and allow me to change the name of the keywords but keeping the same ID record, I decided to create another column in the CATEGORY_KEYWORDS table, next to the CATWORD table column and name it CATNUMBER.  (varchar 1024).

My intention was to store the ID from the CATEGORY table that corresponds to the CATNAME.

The id numbers stored in CATNUMBER would also be in comma delimited format, and correspond exactly to CATEGORY_KEYWORDS.CATWORD. I figure I could look them up and find the exact keyword and id in the CATEGORY table.

I'd still prefer to use comma delimited lists because of simplification of storage. There are a half million pages of news to index and growing all the time.

So, here's what I tried up to this point, using your suggestions. This is all the test code, on one page:

<cfquery name="AreasInfo"  datasource="NEWS1">
 select *
 from areas
 where area_id = 2
</cfquery>


<cfquery name="getAreaCategoryList"  datasource="NEWS1">
            SELECT area_id, catword  
            FROM   category_keywords
            WHERE area_id = '#AreasInfo.area_id#'
</cfquery>



<cfquery name="getIndividualCategories" datasource="NEWS1">
             SELECT    ID, CatWord
             FROM      category
             WHERE    CatWord IN  (  <cfqueryparam  value="#getAreaCategoryList.catword#"  
                                          cfsqltype="cf_sql_varchar"  
                                          list="true">
                               )
</cfquery>


<cfif getIndividualCategories.RecordCount GT 0>
<cfoutput>#getIndividualCategories.id# | #getAreaCategoryList.catword#</cfoutput>
</cfif>

Unfortunately, this is still giving me multiple keywords for each id:

6  | local,national,world,chicago,sports

Not to confuse you, but for further reference, the CATEGORY table will have a couple of other columns besides CATWORD, such as synonym1, synonym2 for words similar to the single keyword. That's another reason for knowing the CATEGORY.ID.

So,  my hope, is that you now understand what I'm attempting to do, and why, and see what has to be changed in the example you gave me, the one which still gives me a multiple list of keywords. Again, the category table contains all the keywords. The other table contains selected keywords per area_id in comma delimited format.

Once this is all working, I'll perform a cfquery  update of the table and insert the category.id's in comma delimited format, into the category_keywords table.  I imagine I'll then need to do some kind of cfloop to retrieve them, show the keyword and its id. But that's another question.
To help you understand what I'm trying to do, I'll give you more detail, which usually seems to confuse many experts or keep them from attempting to help.

Yep, that all makes complete sense.  Honestly, I think we were on the same page once you posted the sample of the results here.


       Unfortunately, this is still giving me more than one keyword for each id:
       <cfoutput>#getIndividualCategories.id# | #getAreaCategoryList.catword#</cfoutput>


That's just because you're outputting the query wrong column.  It should be the column from getIndividualCategories.  

               ie #getIndividualCategories.CatWord#


I need one keyword record only. I never want to see multiple keywords. I find myself repeating this. I only need to know the category.id record that matches the catword for each table.

Yeah, I share your frustration.  Just for future reference the problem with the some of the psuedo examples you posted earlier is they were incomplete or totally different. So the data didn't always illustrate the relationships or most importantly - the end result you're trying reach.  

We understand you can't always post real data, but we need a sample that accurately represents the relationships and problem you're trying to solve so we can help you get there.  *Especially* a sample of the desired results.  

Like I said, a picture's worth a 1000 words.  That's why I kept trying to stick with a single data sample, rather than jumping around,  and kept asking what the desired results should look like.  So we could get to the solution faster.  As soon as you did that,  the answer was pretty clear.   Hopefully next time we'll get there sooner :/
Avatar of Qsorb

ASKER

WORKS!!

Okay, after substituting #getIndividualCategories.CatWord#, I'm now getting just one keyword for each id. Seems to be working, as you said.

Note I had removed the frustrations from my post before you replied. I cannot take the risk of getting you frustrated. You and one or two others are who have saved me through the years here. So, thank you for staying beside me on this.

Any simple way for me to get at each keyword in the list, one at a time, so I can save the id for each in a comm delimited list? If not, this will be another question.
> Note I had removed the frustrations from my post before you replied.

I figured we both did some of that  ;-) No worries.  Even people that often work well together have off days or occasional miscommunications.

        >  Any simple way for me to get at each keyword in the list, one at a time, so I
       > can save the id for each in a comm delimited list?

What about just using valueList? That would give you all the id's from the query in a single comma delimited list

                <cfset idList = valueList(getIndividualCategories.id)>
Avatar of Qsorb

ASKER

Great. Complete. Done.

<cfset idList = valueList(getIndividualCategories.id)>
<cfset CatList = valueList(getAreaCategoryList.catword)>

I created two lists to manage each better but first had to add an ORDER BY CATWORD to both queries to they would match each other.

Not that it's your goal, but that puts you solidly at the top of the Genius Certification list.

Thanks for sticking with me. You and others like you is what makes Experts Exchange so useful.
Great, glad it's finally solved :)

Just FYI if you're going update the new category_keywords column with the #idList# it's better to generate both lists from the getIndividualCat query to guarantee they'll always be in the same order (if that matters).

            ie  <cfset idList = valueList(getIndividualCategories.id)>
                  <cfset CatList = valueList(getIndividualCategories.catword)>