Solved

Condense DB

Posted on 2014-04-02
Last Modified: 2014-04-02
I have a database of race participants.  In the past, every entry was treated as a unique entity, even though many participants were already in the database from a race they ran earlier.  Further, every participant is associated with several other tables (not all of which are FK-PK related).  I want to identify the duplicates, change the unique id in the related tables, and then delete the duplicate from the participant table.

I would like to do this via my classic asp portal so that I can re-use the utility in the future (although I am re-writing my code to check for existence when entering participants).

What's the best way to do this?  Here is my thought:
1) Do it one letter at a time (last name).
2) Order by gender, last name, first name.
3) Write existing participants to an array sorted as above.
4) Cycle through the array looking for matches (I can select the fields to look for matches on and I can see the list of participants).
5) When a match is found call a function that changes the participant id on the related tables.
6) Delete the duplicate entry from the participant table.
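
Steps 5 and 6 above can be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database, not classic ASP; the table and column names (participant, result, participant_id) and the function name merge_participant are hypothetical. The point it shows is that the re-pointing and the delete should happen in one transaction, so a failure part-way through does not leave orphaned rows.

```python
import sqlite3

def merge_participant(conn, keep_id, dup_id, related):
    """Re-point rows from dup_id to keep_id, then delete the duplicate.

    `related` is a list of (table, column) pairs. These names are
    hypothetical and must come from a trusted, hard-coded list because
    they are interpolated into the SQL text.
    """
    with conn:  # one transaction: commit on success, roll back on error
        for table, column in related:
            conn.execute(
                f"UPDATE {table} SET {column} = ? WHERE {column} = ?",
                (keep_id, dup_id),
            )
        conn.execute("DELETE FROM participant WHERE id = ?", (dup_id,))

# Tiny made-up data set: participant 2 is a duplicate of participant 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE participant (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE result (id INTEGER PRIMARY KEY, participant_id INTEGER, race TEXT);
    INSERT INTO participant VALUES (1, 'Ann', 'Lee'), (2, 'Ann', 'Lee');
    INSERT INTO result VALUES (10, 1, 'Spring 5K'), (11, 2, 'Fall 10K');
""")
merge_participant(conn, keep_id=1, dup_id=2, related=[("result", "participant_id")])
```

Because not all of the related tables are FK-PK related, the list of (table, column) pairs has to be maintained by hand; the database will not tell you which tables still reference the old id.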

I will also write the utility to compare one-at-a-time and condense manually.  I just want a way to make a couple of passes taking care of the obvious ones.

What am I missing?  Is there an easier way?
Question by:Bob Schneider
3 Comments
 
LVL 54

Accepted Solution

by:
Scott Fell,  EE MVE earned 2000 total points
ID: 39972846
Finding duplicates is a lot harder than it seems.   Think about misspellings, different spacing, one record with a period in the name and the other without, and two different people with the same name.

For deduping addresses in small databases, I will typically look through the data and test some ideas out.   Perhaps take the first 3 letters of the last name, the first 7 characters of the address, the city, and the zip, and concatenate them into a new field as a key.  It's not perfect, but by taking just the first few characters we eliminate a lot of spelling errors, and using multiple fields like address and zip helps ensure we have the right person.

I would also test using all lower case (or all upper case) so that differences in capitalization don't hide a match.  You can use lower(mydata) to get that.
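
A minimal sketch of the key idea described above, in Python for illustration. The function names match_key and group_candidates, the field order, and the sample rows are all made up. It also strips spaces and punctuation before taking the prefixes, which goes slightly beyond what is described above, so treat that as an assumption to test against your own data.

```python
from collections import defaultdict

def match_key(last_name, address, city, zip_code):
    """Build a rough dedupe key from lower-cased, stripped field prefixes."""
    def clean(value, length=None):
        # Lower-case and keep only letters and digits, then optionally truncate.
        text = "".join(ch for ch in value.lower() if ch.isalnum())
        return text if length is None else text[:length]
    return "|".join(
        [clean(last_name, 3), clean(address, 7), clean(city), clean(zip_code)]
    )

def group_candidates(rows):
    """Group ids that share a key; rows are (id, last_name, address, city, zip)."""
    groups = defaultdict(list)
    for pid, last_name, address, city, zip_code in rows:
        groups[match_key(last_name, address, city, zip_code)].append(pid)
    # Only keys shared by more than one row are duplicate candidates.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

rows = [
    (1, "Schneider", "123 Main St.", "St. Paul", "55101"),
    (2, "schneider", "123 Main Street", "St Paul", "55101"),
    (3, "Schmidt", "9 Oak Ave", "Duluth", "55802"),
]
candidates = group_candidates(rows)
```

The groups this produces are candidates only. Two different people can share a key, so a manual review pass is still worth having before anything is merged.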


It does sound like you have a db design issue.  You shouldn't have to keep running these dedupes.

I would have one table of contacts and one transaction table holding a row for each race entry.  The transaction table would only have the ID, ContactID, RaceID, Time, Timestamp updated, Timestamp created.  When adding people to a race, you would choose contacts from the contact table and a race id from the scheduled races.

You can think of races as being just like an ecommerce transaction, with an invoice header and invoice detail.   The header in this case is the race: the ID (raceID), event name, scheduled order, anything else.  The race transaction would then contain the race id for linking, the contact id from the contact table, and the times.
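
One way the contact / race / transaction layout above might look, sketched with Python's built-in sqlite3 module. The table names (contact, race, race_entry), the column names, and the UNIQUE constraint are my assumptions for illustration, not a prescription; adjust types and names for your own database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE contact (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE race (id INTEGER PRIMARY KEY, event_name TEXT);
    CREATE TABLE race_entry (
        id INTEGER PRIMARY KEY,
        contact_id INTEGER NOT NULL REFERENCES contact(id),
        race_id INTEGER NOT NULL REFERENCES race(id),
        finish_time TEXT,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP,
        updated_at TEXT DEFAULT CURRENT_TIMESTAMP,
        UNIQUE (contact_id, race_id)  -- assumption: one entry per person per race
    );
    -- One contact row, reused for every race that person runs.
    INSERT INTO contact VALUES (1, 'Ann', 'Lee');
    INSERT INTO race VALUES (1, 'Spring 5K'), (2, 'Fall 10K');
    INSERT INTO race_entry (contact_id, race_id, finish_time)
        VALUES (1, 1, '00:24:10'), (1, 2, '00:51:32');
""")

# A person's race history is a join, with no duplicate contact rows needed.
history = conn.execute("""
    SELECT c.last_name, r.event_name, e.finish_time
    FROM race_entry AS e
    JOIN contact AS c ON c.id = e.contact_id
    JOIN race AS r ON r.id = e.race_id
    ORDER BY r.id
""").fetchall()
```

With this shape, entering a returning runner means adding a race_entry row that points at the existing contact, which is what removes the need for repeated dedupe passes.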
 

Author Closing Comment

by:Bob Schneider
ID: 39973244
I agree with all points, including the one on db design.  I started this 12 years ago when I knew a lot less than I know now...not that I am an expert now.  :)  The insight is helpful.  I am piecing a script together that seems to be effective but...

Thanks a ton!
 
LVL 54

Expert Comment

by:Scott Fell, EE MVE
ID: 39973253
Looking at my own old code makes me cringe....

Question has a verified solution.
