Solved

Coldfusion datastructres to compare, identify and populate duplicate values into a new structre

Posted on 2012-03-24
18
389 Views
Last Modified: 2012-03-26
Hi Experts,

I'm facing a programming challenge that I'm not able to get my head around and am hoping the CF gurus here could help me out.

Scenario:
********
I have a .cfm page that has several controls populated by a query. The data that is sent to the server after the user has made the selections and submits are as following:
   School Name                   Athletics                               Swim Team
**************          ************                        ***********************
1. Hischool_nm_A            student_nm_Joe               student_nm_Bob, student_nm_Billy
2. Hischool_nm_B            student_nm_Joe               student_nm_Bob
3. Hischool_nm_C            student_nm_John              student_nm_Bob, student_nm_Billy
4. Hischool_nm_D            student_nm_John              student_nm_Bob, student_nm_Billy
5. Hischool_nm_E            student_nm_Joe               student_nm_Bob, student_nm_Billy
6. Hischool_nm_F             student_nm_Joe              
7. Midschool_nm_X          student_nm_susan               student_nm_Bob, student_nm_Billy
8. Midschool_nm_Y          student_nm_Joe                    student_nm_Jim, student_nm_Billy
9. Midschool_nm_Z          student_nm_susan               student_nm_Bob, student_nm_Billy

In the above data, the main distinctions are high schools ("Hischool_nm_") and middle schools ((Midschool_nm_").
For the Highschools, you'll see that row 1 & row 5 are identical [the Athletics and Swim teams for both schools have the exact same student names]. This is a duplicate. So are rows 3 & 4 - they are identical as well.
For the middle schools, rows 7 & 9 are identical [school X & school Z have the same students in athletics, and also have the same students on the swim team.

With that in mind, I need to loop through the above structure, pick out the duplicate sets, and populate another structure that contains just the duplicates.I need a different duplicate structure for every set of duplicates. How should I go about accomplishing this?
0
Comment
Question by:roger_v
  • 11
  • 7
18 Comments
 
LVL 39

Expert Comment

by:gdemaria
ID: 37760675
The first thing I would do is create a "key" from the 3 values combined.  That way you can easily test to see if they are duplicated.

<cfloop ... as you loop the form fields...>
   <!---- create a single value from all the values --->
  <cfset key = form.school_x & "|" & form.altheletes_x & "|" & form.swimTeam_x>

   Then put the key in a structure, but first test to see if it's already there..
   <cfif structKeyExists(allKeys, key)>
        ... this is a duplicate, key already exists
        <cfset dupKeys[key] = form.primaryKey ? or rowNumber?>
   <cfelse>
       <cfset allKeys[key] = form.PrimaryKey?  or rowNumber?
    </cfif>
</cfloop>

Now you have a structure identifying the unique records (allKeys) and the duplicate records (dupKeys)
0
 
LVL 1

Author Comment

by:roger_v
ID: 37761048
@GDE,

That's it - key - I knew I needed something like that but wasn't sure what the heck it was! I think I'm following your logic here. Let me give this a whirl and report back.
0
 
LVL 1

Author Comment

by:roger_v
ID: 37762950
I'm getting an error:

Variable ALLKEYS is undefined

I need to create the structures at some point. Do I have to create these outside the cfloop? And if I do, then the code wouldn't work.
0
How our DevOps Teams Maximize Uptime

Our Dev teams are like yours. They’re continually cranking out code for new features/bugs fixes, testing, deploying, responding to production monitoring events and more. It’s complex. So, we thought you’d like to see what’s working for us. Read the use case whitepaper.

 
LVL 1

Author Comment

by:roger_v
ID: 37762957
Okay, I create the structs just before the loop but now the cfif structkeyexists doesn't work because both the structs are empty when I do cfdump.
0
 
LVL 1

Author Comment

by:roger_v
ID: 37763490
@GDE, I got your code working but there's a problem: I'm able to capture the first occurence of the duplicate, but I need to display a list of ALL the duplicates. Now, I can only display the last occurence. The code snippet I have now:

<cfif isdefined("form.submitbutton")>
<cfset allKeys = structnew() />
<cfset dupKeys = structnew() />
<cfset uniqKeys = structnew() />
<cfloop index="loopCount" from="1" to="#val(form.myCount)#">
      <cfset variables.schoolName = form["schoolname#LoopCount#"]>
      <cfset variables.athletes = form["sel_athlete_#LoopCount#"]>
      <cfset variables.swimTeam = form["sel_swim_#LoopCount#"]>
      <cfset key = variables.athletes & "|" & variables.swimTeam />
 <cfif structkeyexists(allKeys, key)>
       <h3>inside if...</h3>
   
       <cfset dupKeys[key] = variables.schoolname & "|" & key />

    <cfdump var="#dupKeys#" label="uniques struct">
 <!---dupe not found--->
 <cfelse>
       <h3>inside else...</h3>
    <cfset uniqKeys[key] = variables.schoolname & "|" & key />
       <cfset allKeys[key] = key />
    <cfdump var="#allKeys#" label="uniques struct">
    <!---<cfoutput>#key#</cfoutput>--->
 </cfif>
</cfloop>
</cfif>

<cfif IsDefined("allkeys")>
     <cfdump var="#allkeys#">
</cfif>
<cfif IsDefined("dupkeys")>
     <cfdump var="#dupkeys#">
</cfif>
0
 
LVL 39

Expert Comment

by:gdemaria
ID: 37765701
Hmmm, you need to display all the duplicates.. but what makes a duplicate different from another?   In other words, if you were looking for duplicate contact names and displayed them all.. you would just show:   Bob Smith,  Bob Smith,  Bob Smith.  

Do you have an ID or something to differentiate on duplicate from another, or perhaps it's the school name?   If it is school name, then you could do this..

   <cfset dupKeys[key] = listAppend(dupKeys[key],variables.schoolname)>

This would create a list of schools that use that key.

Note that I removed the & Key part of this... it seems redundant to save it again.    
  <cfset dupKeys[key] = variables.schoolname & "|" & key />

Now you get a list of just school names

Does that help?
0
 
LVL 1

Author Comment

by:roger_v
ID: 37766569
-" but what makes a duplicate different from another?   "

The school name is what makes it different from one another. For eg from the data that I posted the following are dups:

1. Hischool_nm_A            student_nm_Joe               student_nm_Bob, student_nm_Billy
5. Hischool_nm_E            student_nm_Joe               student_nm_Bob, student_nm_Billy

Because columns 2 & 3 are dups. I need to be able to highlight all the dups (any row with a different school name but the same 2nd and 3rd columns). Don't I need to create another struct having this info?
0
 
LVL 1

Author Comment

by:roger_v
ID: 37766662
-"<cfset dupKeys[key] = listAppend(dupKeys[key],variables.schoolname)>"

I tried this but it errors out. Note: I create key with only the athletics and swim_team fields. That is because those are the only fields that may be duplicated, not the school name.
0
 
LVL 39

Expert Comment

by:gdemaria
ID: 37766919
it probably errors if they key doesn't already exist..  

So you need one structure of everything and one with just duplicates (but all duplicates)

Perhaps, we make one structure of everything and just maintain a list of the keys that are duplicated.  That may be the most efficient...  I will play with it..
0
 
LVL 39

Expert Comment

by:gdemaria
ID: 37766948
How about this...   result is allKeys which is either one school name or a list of school names for each key  and dupKeys which only has keys for duplicates; the value is a list of school names.   So, allKeys includes dupKeys...

<cfif isdefined("form.submitbutton")>
<cfset allKeys = structnew() />
<cfset dupKeys = structnew() />
<cfset uniqKeys = structnew() />
<cfloop index="loopCount" from="1" to="#val(form.myCount)#">

      <cfset variables.schoolName = form["schoolname#LoopCount#"]>
      <cfset variables.athletes = form["sel_athlete_#LoopCount#"]>
      <cfset variables.swimTeam = form["sel_swim_#LoopCount#"]>

      <cfset key = variables.athletes & "|" & variables.swimTeam />

      <cfif structkeyexists(allKeys, key)>
          <!--- this is a duplicate, add to all keys and copy onto the duplicate key structure  ----->
          <cfset allKeys[key] = listAppend(allKeys[key],variables.schoolname)>
          <cfset dupKeys[key] = allKeys[key]>
      <cfelse>
          <cfset allKeys[key] = variables.schoolname>
      </cfif>
</cfloop>
</cfif>
0
 
LVL 1

Author Comment

by:roger_v
ID: 37766976
-"So you need one structure of everything and one with just duplicates (but all duplicates)"

Yes, I think. Essentially, by the time the cfloop does it's thing, at the end I need to have a list (or structure) that has a list of any and all duplicates, one after the other sequentially. That way, I can display to the user with a message : "The following schools have duplicates in swim_team and/or athletes"; then highlight all schools with different colors.

So if school_A & school_B have dups, highlight that row with blue, if school_f, school_g and school_h have dups, highlight that row with red, if school_k, school_m, school_n & school_x have dups, highlight them with yellow.
0
 
LVL 1

Author Comment

by:roger_v
ID: 37767049
OK that seems to work GDE! :)

But now I have another issue. How do I check this struct (dupKeys) when I want to highlight the row:

<table>
<cfoutput query="schoolInfo">
        <tr<!---I need to check if the dupKeys struct exists, then see if this particular school_name is present in the dupKeys struct, and accordingly set the css background-color--->>
               <td>#school_name#</td>
        </tr>
</cfoutput>
0
 
LVL 39

Expert Comment

by:gdemaria
ID: 37767110
are you highlighting the school?  Or are you trying to highlight a particular entry within the school?  (ie athlete/team)
0
 
LVL 1

Author Comment

by:roger_v
ID: 37767295
Highlighting the entire row. Currently this is what I have:

<cfoutput query="schoolInfo">
        <tr<!---I need to check if the dupKeys struct exists, then see if this particular school_name is present in the dupKeys struct, and accordingly set the css background-color--->>
               <td>#school_name#</td>
               <td><select name="sel_athletes_#schoolInfo.currentrow#"><option value="#athletes#">#athlete_name#</option></select></td>
               <td><select multiple="multiple" name="sel_swimteam_#schoolInfo.currentrow#"><option value="#swimteam#">#swim_team_name#</option></select></td>
        </tr>
</cfoutput>
0
 
LVL 39

Expert Comment

by:gdemaria
ID: 37767362
> Highlighting the entire row

ok, if a school name is unique to a row, then there is really no need for us to have stored anything more than just the school...  Just knowing the school has a duplicate would be enough, wouldn't it?   We don't need to know which "key" was repeated, only that it exists...

Look at "dupSchools" in the code below... the result would be a comma delimited list of school names...

<cfif isdefined("form.submitbutton")>
  <cfset dupSchools = "">
  <cfloop index="loopCount" from="1" to="#val(form.myCount)#">
      <cfset variables.schoolName = form["schoolname#LoopCount#"]>
      <cfset variables.athletes = form["sel_athlete_#LoopCount#"]>
      <cfset variables.swimTeam = form["sel_swim_#LoopCount#"]>

      <cfset key = variables.athletes & "|" & variables.swimTeam />

      <cfif structkeyexists(allKeys, key)>
          <!--- this is a duplicate, add to all keys and copy onto the duplicate key structure  ----->
          <cfset allKeys[key] = listAppend(allKeys[key],variables.schoolname)>
          <cfset dupSchools = listAppend(dupSchools,variables.schoolname)>
      <cfelse>
          <cfset allKeys[key] = variables.schoolname>
      </cfif>
  </cfloop>
</cfif>


Of course, I would much prefer if we were dealing with school IDs and not names...
0
 
LVL 1

Author Comment

by:roger_v
ID: 37767467
GDE, now when I do the list append, it doesn't capture the first occurence of the duplicate. It captures in the list only the rest of the occurences. For eg, if school_a, school_d & school_x have duplicate swim_team and athletes, the list only has school_d & school_x, but doesn't show school_a.
0
 
LVL 39

Accepted Solution

by:
gdemaria earned 500 total points
ID: 37767534
Oh right, same as before..  please change it like this..

      <cfif structkeyexists(allKeys, key)>
          <!--- this is a duplicate, add to all keys and copy onto the duplicate key structure  ----->
          <cfset allKeys[key] = listAppend(allKeys[key],variables.schoolname)>
          <cfset dupSchools = listAppend(dupSchools,allKeys[key])>
      <cfelse>
          <cfset allKeys[key] = variables.schoolname>
      </cfif>
0
 
LVL 1

Author Closing Comment

by:roger_v
ID: 37767659
Awesome as usual GDE! :)
0

Featured Post

Webinar: Aligning, Automating, Winning

Join Dan Russo, Senior Manager of Operations Intelligence, for an in-depth discussion on how Dealertrack, leading provider of integrated digital solutions for the automotive industry, transformed their DevOps processes to increase collaboration and move with greater velocity.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

jQuery is a JavaScript library that greatly simplifies JavaScript programming. AJAX is an acronym formed from "Asynchronous JavaScript and XML."  AJAX refers to any communication between client and server, when the human client does not observe a…
This article demonstrates how to create a simple responsive confirmation dialog with Ok and Cancel buttons using HTML, CSS, jQuery and Promises
The viewer will learn the basics of jQuery, including how to invoke it on a web page. Reference your jQuery libraries: (CODE) Include your new external js/jQuery file: (CODE) Write your first lines of code to setup your site for jQuery.: (CODE)
The viewer will learn the basics of jQuery including how to code hide show and toggles. Reference your jQuery libraries: (CODE) Include your new external js/jQuery file: (CODE) Write your first lines of code to setup your site for jQuery…

839 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question