lbarnett419
asked on
Identify more than 1 person at same address and assign family id
The table contains:
I need to create family ids where 1 or more students live at the same address (streetaddress) and have the same motheremail.
For example, there are 3 students with the same last name living at 111 Main Street.
Once I identify them, I need to assign them a familyid such as 145-1
There are also 3 students who do not have the same last name but live at 121 Spruce Street. I still need to assign them a familyid such as 145-2.
Familyid is not in the table, but I could add it as a column, if need be.
I added some sample data in the spreadsheet.
Thank you for your help!
EESampleDataFamilyID.xls
unique student id
lastname
firstname
streetaddress
motheremail
lastname
firstname
streetaddress
motheremail
I need to create family ids where 1 or more students live at the same address (streetaddress) and have the same motheremail.
For example, there are 3 students with the same last name living at 111 Main Street.
Once I identify them, I need to assign them a familyid such as 145-1
There are also 3 students who do not have the same last name but live at 121 Spruce Street. I still need to assign them a familyid such as 145-2.
Familyid is not in the table, but I could add it as a column, if need be.
I added some sample data in the spreadsheet.
Thank you for your help!
EESampleDataFamilyID.xls
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Hi Dvz:
(1) No address can exist in the system if no one currently lives there. It is a requirement.
(2) A person or persons(students) can belong to only one location. These are not students in a college setting and are all under 18, so there is no movement out to a dorm part of the time, etc. It doesn't matter if it is called FamilyID or LocationId.
(3) If 3 students live in the same household, 2 may be from a previous marriage (Smith) and 1 from the current marriage (Jones).
(4) You are right, the family surname & streetaddress cannot uniquely differentiate between data sets. I was hoping if we added the motheremail field to the mix, it might help. The motheremail is the same for all students in the same household, regardless of the students' last name.
Unfortunately, I get asked to do things that take a long time manually and was trying to find a way to automate.
I know the most basic way is to order the data by either address or motheremail, eyeball the student names, then assign the numbers manually, but I was hoping for a workaround.
Thank you for your consideration.
(1) No address can exist in the system if no one currently lives there. It is a requirement.
(2) A person or persons(students) can belong to only one location. These are not students in a college setting and are all under 18, so there is no movement out to a dorm part of the time, etc. It doesn't matter if it is called FamilyID or LocationId.
(3) If 3 students live in the same household, 2 may be from a previous marriage (Smith) and 1 from the current marriage (Jones).
(4) You are right, the family surname & streetaddress cannot uniquely differentiate between data sets. I was hoping if we added the motheremail field to the mix, it might help. The motheremail is the same for all students in the same household, regardless of the students' last name.
Unfortunately, I get asked to do things that take a long time manually and was trying to find a way to automate.
I know the most basic way is to order the data by either address or motheremail, eyeball the student names, then assign the numbers manually, but I was hoping for a workaround.
Thank you for your consideration.
That helps -- somewhat :) You've clarified that there must be one and only one mothermail per housing unit (address, whatever). Can the mothermail ever be modified, as in changing carriers?
Rather than a hyphenated string, I'm thinking of a compound key (two or more columns).
A big missing piece is how to differentiate between family and non-family in the unit? What fact is unique?
Rather than a hyphenated string, I'm thinking of a compound key (two or more columns).
A big missing piece is how to differentiate between family and non-family in the unit? What fact is unique?
SQL> select * from family;
STUDENTID LASTNAME FIRSTNAME STREETADDRESS MOTHEREMAIL
---------- --------- --------- -------------------- -------------------
19854 Sampson King 121 Spruce Street mamasmith@sst.org
45123 Jones Howard 111 Main Street mamajones@sst.org
49800 Bowen Brandon 121 Spruce Street mamasmith@sst.org
65122 Jones Susie 111 Main Street mamajones@sst.org
79652 Duck Daffy 1435 False Court mamaduck@sst.org
85121 Smith Johnny 121 Spruce Street mamasmith@sst.org
85121 Jones Mary 111 Main Street mamajones@sst.org
98985 Mouse Mickey 4569 True Lane mamamouse@sst.org
You said >>The motheremail is the same for all students in the same household<<
Then perhaps something like this works
SQL> select studentid, lastname, firstname, streetaddress, motheremail,
2 dense_rank() over (order by motheremail) * 100||'-'||
3 row_number() over (partition by motheremail order by studentid) familyid
4 from family;
STUDENTID LASTNAME FIRSTNAME STREETADDRESS MOTHEREMAIL FAMILYID
--------- --------- --------- -------------------- -------------------- --------
79652 Duck Daffy 1435 False Court mamaduck@sst.org 100-1
45123 Jones Howard 111 Main Street mamajones@sst.org 200-1
65122 Jones Susie 111 Main Street mamajones@sst.org 200-2
85121 Jones Mary 111 Main Street mamajones@sst.org 200-3
98985 Mouse Mickey 4569 True Lane mamamouse@sst.org 300-1
19854 Sampson King 121 Spruce Street mamasmith@sst.org 400-1
49800 Bowen Brandon 121 Spruce Street mamasmith@sst.org 400-2
85121 Smith Johnny 121 Spruce Street mamasmith@sst.org 400-3
STUDENTID LASTNAME FIRSTNAME STREETADDRESS MOTHEREMAIL
---------- --------- --------- -------------------- -------------------
19854 Sampson King 121 Spruce Street mamasmith@sst.org
45123 Jones Howard 111 Main Street mamajones@sst.org
49800 Bowen Brandon 121 Spruce Street mamasmith@sst.org
65122 Jones Susie 111 Main Street mamajones@sst.org
79652 Duck Daffy 1435 False Court mamaduck@sst.org
85121 Smith Johnny 121 Spruce Street mamasmith@sst.org
85121 Jones Mary 111 Main Street mamajones@sst.org
98985 Mouse Mickey 4569 True Lane mamamouse@sst.org
You said >>The motheremail is the same for all students in the same household<<
Then perhaps something like this works
SQL> select studentid, lastname, firstname, streetaddress, motheremail,
2 dense_rank() over (order by motheremail) * 100||'-'||
3 row_number() over (partition by motheremail order by studentid) familyid
4 from family;
STUDENTID LASTNAME FIRSTNAME STREETADDRESS MOTHEREMAIL FAMILYID
--------- --------- --------- -------------------- -------------------- --------
79652 Duck Daffy 1435 False Court mamaduck@sst.org 100-1
45123 Jones Howard 111 Main Street mamajones@sst.org 200-1
65122 Jones Susie 111 Main Street mamajones@sst.org 200-2
85121 Jones Mary 111 Main Street mamajones@sst.org 200-3
98985 Mouse Mickey 4569 True Lane mamamouse@sst.org 300-1
19854 Sampson King 121 Spruce Street mamasmith@sst.org 400-1
49800 Bowen Brandon 121 Spruce Street mamasmith@sst.org 400-2
85121 Smith Johnny 121 Spruce Street mamasmith@sst.org 400-3
ASKER
(1) Yes, the motheremail may change, but it will attach to all the children at that one address.
(2) I don't care about the hyphen....just so ONE familyid is assigned to all the children at that one address.
Which means, the family id would be 4001 (or whatever works) for all children at the 121 Spruce Street address. It doesn't matter the format--i used the hyphen as an example. You mentioned locationid earlier...that kind of thing.
Thanks!
(2) I don't care about the hyphen....just so ONE familyid is assigned to all the children at that one address.
Which means, the family id would be 4001 (or whatever works) for all children at the 121 Spruce Street address. It doesn't matter the format--i used the hyphen as an example. You mentioned locationid earlier...that kind of thing.
Thanks!
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Awking: Thanks, works perfectly!
Dvz: Thanks for the analysis.
Dvz: Thanks for the analysis.
I just noticed that Mary Jones and Johnny Smith have the same studentID. Is that a typo or can that actually occur?
ASKER
I need to assign every student a family id. Therefore if 3 students live at the same address, they all have the same familyid.
If one student lives at an address, he/she needs to be assigned a family id.
Thanks