Solved

Coldfusion datastructres to compare, identify and populate duplicate values into a new structre

Posted on 2012-03-24
18
382 Views
Last Modified: 2012-03-26
Hi Experts,

I'm facing a programming challenge that I'm not able to get my head around and am hoping the CF gurus here could help me out.

Scenario:
********
I have a .cfm page that has several controls populated by a query. The data that is sent to the server after the user has made the selections and submits are as following:
   School Name                   Athletics                               Swim Team
**************          ************                        ***********************
1. Hischool_nm_A            student_nm_Joe               student_nm_Bob, student_nm_Billy
2. Hischool_nm_B            student_nm_Joe               student_nm_Bob
3. Hischool_nm_C            student_nm_John              student_nm_Bob, student_nm_Billy
4. Hischool_nm_D            student_nm_John              student_nm_Bob, student_nm_Billy
5. Hischool_nm_E            student_nm_Joe               student_nm_Bob, student_nm_Billy
6. Hischool_nm_F             student_nm_Joe              
7. Midschool_nm_X          student_nm_susan               student_nm_Bob, student_nm_Billy
8. Midschool_nm_Y          student_nm_Joe                    student_nm_Jim, student_nm_Billy
9. Midschool_nm_Z          student_nm_susan               student_nm_Bob, student_nm_Billy

In the above data, the main distinctions are high schools ("Hischool_nm_") and middle schools ((Midschool_nm_").
For the Highschools, you'll see that row 1 & row 5 are identical [the Athletics and Swim teams for both schools have the exact same student names]. This is a duplicate. So are rows 3 & 4 - they are identical as well.
For the middle schools, rows 7 & 9 are identical [school X & school Z have the same students in athletics, and also have the same students on the swim team.

With that in mind, I need to loop through the above structure, pick out the duplicate sets, and populate another structure that contains just the duplicates.I need a different duplicate structure for every set of duplicates. How should I go about accomplishing this?
0
Comment
Question by:roger_v
  • 11
  • 7
18 Comments
 
LVL 39

Expert Comment

by:gdemaria
Comment Utility
The first thing I would do is create a "key" from the 3 values combined.  That way you can easily test to see if they are duplicated.

<cfloop ... as you loop the form fields...>
   <!---- create a single value from all the values --->
  <cfset key = form.school_x & "|" & form.altheletes_x & "|" & form.swimTeam_x>

   Then put the key in a structure, but first test to see if it's already there..
   <cfif structKeyExists(allKeys, key)>
        ... this is a duplicate, key already exists
        <cfset dupKeys[key] = form.primaryKey ? or rowNumber?>
   <cfelse>
       <cfset allKeys[key] = form.PrimaryKey?  or rowNumber?
    </cfif>
</cfloop>

Now you have a structure identifying the unique records (allKeys) and the duplicate records (dupKeys)
0
 
LVL 1

Author Comment

by:roger_v
Comment Utility
@GDE,

That's it - key - I knew I needed something like that but wasn't sure what the heck it was! I think I'm following your logic here. Let me give this a whirl and report back.
0
 
LVL 1

Author Comment

by:roger_v
Comment Utility
I'm getting an error:

Variable ALLKEYS is undefined

I need to create the structures at some point. Do I have to create these outside the cfloop? And if I do, then the code wouldn't work.
0
 
LVL 1

Author Comment

by:roger_v
Comment Utility
Okay, I create the structs just before the loop but now the cfif structkeyexists doesn't work because both the structs are empty when I do cfdump.
0
 
LVL 1

Author Comment

by:roger_v
Comment Utility
@GDE, I got your code working but there's a problem: I'm able to capture the first occurence of the duplicate, but I need to display a list of ALL the duplicates. Now, I can only display the last occurence. The code snippet I have now:

<cfif isdefined("form.submitbutton")>
<cfset allKeys = structnew() />
<cfset dupKeys = structnew() />
<cfset uniqKeys = structnew() />
<cfloop index="loopCount" from="1" to="#val(form.myCount)#">
      <cfset variables.schoolName = form["schoolname#LoopCount#"]>
      <cfset variables.athletes = form["sel_athlete_#LoopCount#"]>
      <cfset variables.swimTeam = form["sel_swim_#LoopCount#"]>
      <cfset key = variables.athletes & "|" & variables.swimTeam />
 <cfif structkeyexists(allKeys, key)>
       <h3>inside if...</h3>
   
       <cfset dupKeys[key] = variables.schoolname & "|" & key />

    <cfdump var="#dupKeys#" label="uniques struct">
 <!---dupe not found--->
 <cfelse>
       <h3>inside else...</h3>
    <cfset uniqKeys[key] = variables.schoolname & "|" & key />
       <cfset allKeys[key] = key />
    <cfdump var="#allKeys#" label="uniques struct">
    <!---<cfoutput>#key#</cfoutput>--->
 </cfif>
</cfloop>
</cfif>

<cfif IsDefined("allkeys")>
     <cfdump var="#allkeys#">
</cfif>
<cfif IsDefined("dupkeys")>
     <cfdump var="#dupkeys#">
</cfif>
0
 
LVL 39

Expert Comment

by:gdemaria
Comment Utility
Hmmm, you need to display all the duplicates.. but what makes a duplicate different from another?   In other words, if you were looking for duplicate contact names and displayed them all.. you would just show:   Bob Smith,  Bob Smith,  Bob Smith.  

Do you have an ID or something to differentiate on duplicate from another, or perhaps it's the school name?   If it is school name, then you could do this..

   <cfset dupKeys[key] = listAppend(dupKeys[key],variables.schoolname)>

This would create a list of schools that use that key.

Note that I removed the & Key part of this... it seems redundant to save it again.    
  <cfset dupKeys[key] = variables.schoolname & "|" & key />

Now you get a list of just school names

Does that help?
0
 
LVL 1

Author Comment

by:roger_v
Comment Utility
-" but what makes a duplicate different from another?   "

The school name is what makes it different from one another. For eg from the data that I posted the following are dups:

1. Hischool_nm_A            student_nm_Joe               student_nm_Bob, student_nm_Billy
5. Hischool_nm_E            student_nm_Joe               student_nm_Bob, student_nm_Billy

Because columns 2 & 3 are dups. I need to be able to highlight all the dups (any row with a different school name but the same 2nd and 3rd columns). Don't I need to create another struct having this info?
0
 
LVL 1

Author Comment

by:roger_v
Comment Utility
-"<cfset dupKeys[key] = listAppend(dupKeys[key],variables.schoolname)>"

I tried this but it errors out. Note: I create key with only the athletics and swim_team fields. That is because those are the only fields that may be duplicated, not the school name.
0
 
LVL 39

Expert Comment

by:gdemaria
Comment Utility
it probably errors if they key doesn't already exist..  

So you need one structure of everything and one with just duplicates (but all duplicates)

Perhaps, we make one structure of everything and just maintain a list of the keys that are duplicated.  That may be the most efficient...  I will play with it..
0
How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

 
LVL 39

Expert Comment

by:gdemaria
Comment Utility
How about this...   result is allKeys which is either one school name or a list of school names for each key  and dupKeys which only has keys for duplicates; the value is a list of school names.   So, allKeys includes dupKeys...

<cfif isdefined("form.submitbutton")>
<cfset allKeys = structnew() />
<cfset dupKeys = structnew() />
<cfset uniqKeys = structnew() />
<cfloop index="loopCount" from="1" to="#val(form.myCount)#">

      <cfset variables.schoolName = form["schoolname#LoopCount#"]>
      <cfset variables.athletes = form["sel_athlete_#LoopCount#"]>
      <cfset variables.swimTeam = form["sel_swim_#LoopCount#"]>

      <cfset key = variables.athletes & "|" & variables.swimTeam />

      <cfif structkeyexists(allKeys, key)>
          <!--- this is a duplicate, add to all keys and copy onto the duplicate key structure  ----->
          <cfset allKeys[key] = listAppend(allKeys[key],variables.schoolname)>
          <cfset dupKeys[key] = allKeys[key]>
      <cfelse>
          <cfset allKeys[key] = variables.schoolname>
      </cfif>
</cfloop>
</cfif>
0
 
LVL 1

Author Comment

by:roger_v
Comment Utility
-"So you need one structure of everything and one with just duplicates (but all duplicates)"

Yes, I think. Essentially, by the time the cfloop does it's thing, at the end I need to have a list (or structure) that has a list of any and all duplicates, one after the other sequentially. That way, I can display to the user with a message : "The following schools have duplicates in swim_team and/or athletes"; then highlight all schools with different colors.

So if school_A & school_B have dups, highlight that row with blue, if school_f, school_g and school_h have dups, highlight that row with red, if school_k, school_m, school_n & school_x have dups, highlight them with yellow.
0
 
LVL 1

Author Comment

by:roger_v
Comment Utility
OK that seems to work GDE! :)

But now I have another issue. How do I check this struct (dupKeys) when I want to highlight the row:

<table>
<cfoutput query="schoolInfo">
        <tr<!---I need to check if the dupKeys struct exists, then see if this particular school_name is present in the dupKeys struct, and accordingly set the css background-color--->>
               <td>#school_name#</td>
        </tr>
</cfoutput>
0
 
LVL 39

Expert Comment

by:gdemaria
Comment Utility
are you highlighting the school?  Or are you trying to highlight a particular entry within the school?  (ie athlete/team)
0
 
LVL 1

Author Comment

by:roger_v
Comment Utility
Highlighting the entire row. Currently this is what I have:

<cfoutput query="schoolInfo">
        <tr<!---I need to check if the dupKeys struct exists, then see if this particular school_name is present in the dupKeys struct, and accordingly set the css background-color--->>
               <td>#school_name#</td>
               <td><select name="sel_athletes_#schoolInfo.currentrow#"><option value="#athletes#">#athlete_name#</option></select></td>
               <td><select multiple="multiple" name="sel_swimteam_#schoolInfo.currentrow#"><option value="#swimteam#">#swim_team_name#</option></select></td>
        </tr>
</cfoutput>
0
 
LVL 39

Expert Comment

by:gdemaria
Comment Utility
> Highlighting the entire row

ok, if a school name is unique to a row, then there is really no need for us to have stored anything more than just the school...  Just knowing the school has a duplicate would be enough, wouldn't it?   We don't need to know which "key" was repeated, only that it exists...

Look at "dupSchools" in the code below... the result would be a comma delimited list of school names...

<cfif isdefined("form.submitbutton")>
  <cfset dupSchools = "">
  <cfloop index="loopCount" from="1" to="#val(form.myCount)#">
      <cfset variables.schoolName = form["schoolname#LoopCount#"]>
      <cfset variables.athletes = form["sel_athlete_#LoopCount#"]>
      <cfset variables.swimTeam = form["sel_swim_#LoopCount#"]>

      <cfset key = variables.athletes & "|" & variables.swimTeam />

      <cfif structkeyexists(allKeys, key)>
          <!--- this is a duplicate, add to all keys and copy onto the duplicate key structure  ----->
          <cfset allKeys[key] = listAppend(allKeys[key],variables.schoolname)>
          <cfset dupSchools = listAppend(dupSchools,variables.schoolname)>
      <cfelse>
          <cfset allKeys[key] = variables.schoolname>
      </cfif>
  </cfloop>
</cfif>


Of course, I would much prefer if we were dealing with school IDs and not names...
0
 
LVL 1

Author Comment

by:roger_v
Comment Utility
GDE, now when I do the list append, it doesn't capture the first occurence of the duplicate. It captures in the list only the rest of the occurences. For eg, if school_a, school_d & school_x have duplicate swim_team and athletes, the list only has school_d & school_x, but doesn't show school_a.
0
 
LVL 39

Accepted Solution

by:
gdemaria earned 500 total points
Comment Utility
Oh right, same as before..  please change it like this..

      <cfif structkeyexists(allKeys, key)>
          <!--- this is a duplicate, add to all keys and copy onto the duplicate key structure  ----->
          <cfset allKeys[key] = listAppend(allKeys[key],variables.schoolname)>
          <cfset dupSchools = listAppend(dupSchools,allKeys[key])>
      <cfelse>
          <cfset allKeys[key] = variables.schoolname>
      </cfif>
0
 
LVL 1

Author Closing Comment

by:roger_v
Comment Utility
Awesome as usual GDE! :)
0

Featured Post

IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

The Confluence of Individual Knowledge and the Collective Intelligence At this writing (summer 2013) the term API (http://dictionary.reference.com/browse/API?s=t) has made its way into the popular lexicon of the English language.  A few years ago, …
I spent nearly three days trying to figure out how incorporate OAuth in Coldfusion for the Eventful API. Hopefully, this article will allow Coldfusion Programmers to buzz through the API when they need to. Basically, what this script does is authori…
The viewer will learn the basics of jQuery, including how to invoke it on a web page. Reference your jQuery libraries: (CODE) Include your new external js/jQuery file: (CODE) Write your first lines of code to setup your site for jQuery.: (CODE)
The viewer will learn the basics of jQuery including how to code hide show and toggles. Reference your jQuery libraries: (CODE) Include your new external js/jQuery file: (CODE) Write your first lines of code to setup your site for jQuery…

771 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

14 Experts available now in Live!

Get 1:1 Help Now