Link to home
Start Free TrialLog in
Avatar of BHUC
BHUCFlag for United States of America

asked on

Check to see if two fields match in php loop

I have a page that displays all appointments for a day. Some people have two procedures done by the same person and some people have two procedures done by different people.

I need the system to look and see if someone is having two procedures done by the same person and if so, only display their info once. If the someone is having two procedures by different people, both rows should appear.

code is here: two fields to compare are PATNUM and SURGEON - if PATNUM and SURGEON are the same in two columns, only show patient once. If PATNUM is same, but SURGEON is different - show the patient each time.

$nurse_query = "SELECT * FROM patientinfo WHERE SCHDT = '$currentdate' ";

$nurse_query_result = mysql_query($nurse_query);

if (mysql_num_rows($nurse_query_result) > 0)


{
      while ($nurse_row = mysql_fetch_assoc($nurse_query_result))

      {

                $patient_id = $nurse_row['ID'];
            $PATNUM = $nurse_row['PATNUM'];
            $SCHDT = $nurse_row['SCHDT'];
            $SCHTM = $nurse_row['SCHTM'];
            $SURGEON = $nurse_row['SURGEON'];

echo "$PATNUM - $SURGEON<br />\n";
}
}
Avatar of Ray Paseur
Ray Paseur
Flag of United States of America image

Wow, a lot of issues here!

It sounds like the design pattern you want may be a resource scheduling calendar.  It may also be that you want to use the GROUP BY clause in your query.  I'm guessing that you would want to ORDER BY surgeon and GROUP BY surgeon.  And it's not part of the answer, but it's important for you to get off the MySQL extension.  PHP is doing away with MySQL, but there are alternatives available.

We can probably give you a more specific answer if you show us the CREATE TABLE statement for patientinfo
Avatar of BHUC

ASKER

Thanks for the comments Ray. I will start reading the article on the MySQL Extension.

Not following what you mean by show us the create table statement.

The office has an excel sheet of all cases for the next day. They upload it to the database at the end of the day so it displays for the employees.

I have attached an excel file with some sample rows. In this example - Patient A would appear on the list twice - once for Dr. Burgess and once for Dr. Dykstra, then patient B and C would both appear.
Sample-Data.xlsm
OK, let me look at the spread sheet - it should probably have enough information.  Most database tables mirror spreadsheets closely.  Back in a few minutes.  If you want to see the CREATE TABLE statement you can run a query like SHOW CREATE TABLE patientinfo.
https://dev.mysql.com/doc/refman/5.0/en/show-create-table.html
Avatar of BHUC

ASKER

Ok.. Yes - all the table fields match what is in the excel document.
ASKER CERTIFIED SOLUTION
Avatar of Ray Paseur
Ray Paseur
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of BHUC

ASKER

Thank you Ray. I will try this and let you know how it goes - also tons to learn from this code. Greatly appreciate it.
Yeah, most of it is setup for running the query :-)  I just wanted to make sure I had a good test before I posted something half-witted!
Avatar of Julian Hansen
I'm a bit of a late comer to this thread but wouldn't a simple DISTINCT do the trick?

SELECT DISTINCT patnum, surgeon FROM patientinfo WHERE SCHDT = '$currentdate'

Open in new window

wouldn't a simple DISTINCT do the trick?
Maybe, this is one of those 6+3=9, 7+2=9 things.  There are always several paths to the same solution.  You can test it using my code sample here.
Avatar of BHUC

ASKER

I am going to test this today - didn't have time yesterday. With the DISTINCT - what if I need all the columns from the table and not just those two?
SELECT * will get all the columns.
Avatar of BHUC

ASKER

Ray, thanks for the fast reply, and I have the SELECT * FROM in their now. My question was can I use SELECT DISTINCT patnum, surgeon FROM patientinfo   and still get all the columns? I can't do both can I?

I will be using your code Ray as I need to learn to get it in the new format. Greatly appreciate all the help.
You could experiment with DISTINCT using the script I posted.  I did not find any quick or easy solution using DISTINCT, but I already have a tested and working solution, so I'm not going to run that to ground.  Feel free to experiment with the code -- that's why I set up test cases -- so I can get to the "fail fast" point in my testing!  You want to be able to test many ideas quickly.  This strategy works better with software than skydiving.
http://www.forbes.com/sites/quora/2014/12/16/should-innovative-companies-really-move-fast-and-break-things/
http://www.explainxkcd.com/wiki/index.php/1428:_Move_Fast_and_Break_Things
What other columns do you need?

Here is the thing - if you are going to eliminate rows that are duplicates based on the values in two columns - then you are going to be throwing away the other information that exists for those discarded rows.

Is it therefore safe to assume that the discarded information would be duplicated anyway in the row that is retained and the one that is discarded.

If it is then you can use distinct in front of all the fields you want to retrieve.
If not you have to answer the question of what is the side effect of choosing one row over another when the data that is needed is not part of the criteria for determining if the row is a duplicate or not?
In a perfect universe, everyone would use normalized data base tables, and all the queries would be full of pristine relationships, but this planet is not in that universe.  No data is being thrown away -- it's just being organized in a way that makes sense for the Author's needs.  You're right in that there is duplicated information in some of the rows.  Some of them have the same patient number and surgeon because the surgeon is performing more than one procedure for the patient.  As I understand the question, that's the point of the exercise - find those and list them once instead of multiple times.  When I run the code example I tested, this is what I get.
bool(true)
MySQLI INSERTED A ROW CONTAINING Patient A Burgess WITH AUTO_INCREMENT ID = 1
MySQLI INSERTED A ROW CONTAINING Patient A Burgess WITH AUTO_INCREMENT ID = 2
MySQLI INSERTED A ROW CONTAINING Patient B Dykstra WITH AUTO_INCREMENT ID = 3
MySQLI INSERTED A ROW CONTAINING Patient C Dykstra WITH AUTO_INCREMENT ID = 4
MySQLI INSERTED A ROW CONTAINING Patient A Dykstra WITH AUTO_INCREMENT ID = 5


QUERY: SELECT *, CONCAT(patnum,surgeon) AS my_pair FROM patientinfo GROUP BY my_pair ORDER BY patnum, surgeon 
FOUND 4 ROWS OF DATA 

USING MySQLi_Result::Fetch_Object(): 
stdClass Object
(
    [id] => 1
    [patnum] => 12345
    [patname] => Patient A
    [surgeon] => Burgess
    [xwhen] => 2015-06-18 13:04:47
    [my_pair] => 12345Burgess
)

stdClass Object
(
    [id] => 5
    [patnum] => 12345
    [patname] => Patient A
    [surgeon] => Dykstra
    [xwhen] => 2015-06-18 13:04:47
    [my_pair] => 12345Dykstra
)

stdClass Object
(
    [id] => 3
    [patnum] => 12346
    [patname] => Patient B
    [surgeon] => Dykstra
    [xwhen] => 2015-06-18 13:04:47
    [my_pair] => 12346Dykstra
)

stdClass Object
(
    [id] => 4
    [patnum] => 12347
    [patname] => Patient C
    [surgeon] => Dykstra
    [xwhen] => 2015-06-18 13:04:47
    [my_pair] => 12347Dykstra
)

Open in new window

Avatar of BHUC

ASKER

Thanks for your help Ray! Appreciate it.
No data is being thrown away
I think you missed my point

Consider the following

       Patnum  Surgeon field_1 field2
Row 1: 1       Bob     123     456
Row 2: 1       Bov     789     abc

Open in new window


In the two rows above where patnum and surgeon are the same you have field1 and field 2 with different values in the row. If the data is as you say then based on an arbitary selection of a row to discard due to duplicate Patnum / Surgeon - which one do you choose? If the choice is arbitary then values for field1 and field2 don't matter in which case why include them in the result set

In the case where they are not arbitary then they would have to be the same in order for their selection to make sense. Therefore with data
       Patnum  Surgeon field_1 field2
Row 1: 1       Bob     123     456
Row 2: 1       Bov     123     456

Open in new window

The following query will yield the correct results
SELECT DISTINCT PATNUM, Surgeon, field_1, field_2 FROM table

Open in new window

^^ This might list surgeon = Burgess and patient = Patient A twice.  That's what we want to avoid.  It's brittle because the other columns are not predictable.  You can test this stuff, just copy my script and experiment with it.  Try adding different columns from the patientinfo table into the "field_1, field_2" part of the query to see where it breaks down.
because the other columns are not predictable
Ray - did you read my post? If the other columns are not predictable and you are discarding duplicates based on just 2 columns - where is the logic to decide which of the duplicate rows to keep?

Regarding your code
1. All projected fields are the same with respect to the selection criteria - with your example a DISTINCT will return the same results.
2. Your CONCAT() statement is superfluous as a GROUP BY patnum,surgeon returns identical results

Your code does not deal with the situation where two rows have the same patnum / surgeon data but DIFFERENT values in the required supplementary fields. This is the point I am trying to make.

a) If those supplementary fields contain different values per like row (on patnum / surgeon) then what is the selection criteria for which values to use

b) If they are the same then DISTINCT works and the code reduces to a simple SELECT with a loop to display the results.

Consider the following.

Query: [Your posted query]
SELECT *, CONCAT(patnum,surgeon) AS my_pair FROM patientinfo GROUP BY my_pair ORDER BY patnum, surgeon;

Open in new window

Results:
"1"	"12345"	"Patient A"	"Burgess"	"2015-06-19 08:41:36"	"12345Burgess"
"5"	"12345"	"Patient A"	"Dykstra"	"2015-06-19 08:41:36"	"12345Dykstra"
"3"	"12346"	"Patient B"	"Dykstra"	"2015-06-19 08:41:36"	"12346Dykstra"
"4"	"12347"	"Patient C"	"Dykstra"	"2015-06-19 08:41:36"	"12347Dykstra"

Open in new window

Query:[Same query but without the CONCAT]
SELECT * FROM patientinfo GROUP BY patnum,surgeon ORDER BY patnum, surgeon;

Open in new window

Results:
"1"	"12345"	"Patient A"	"Burgess"	"2015-06-19 08:41:36"
"5"	"12345"	"Patient A"	"Dykstra"	"2015-06-19 08:41:36"
"3"	"12346"	"Patient B"	"Dykstra"	"2015-06-19 08:41:36"
"4"	"12347"	"Patient C"	"Dykstra"	"2015-06-19 08:41:36"

Open in new window

Query:[Using DISTINCT]
SELECT DISTINCT patnum,patname, surgeon, xwhen FROM patientinfo;

Open in new window

Results:
"12345"	"Patient A"	"Burgess"	"2015-06-19 08:41:36"
"12346"	"Patient B"	"Dykstra"	"2015-06-19 08:41:36"
"12347"	"Patient C"	"Dykstra"	"2015-06-19 08:41:36"
"12345"	"Patient A"	"Dykstra"	"2015-06-19 08:41:36"

Open in new window

Results are identical with one exception - the DISTINCT query does not return the record id - which highlights my point exactly - if record id was important - what logic is used to choose record 1 instead of record 2 - how do we know that we did not want record 2? Given we don't know - the field is arbitary and should be excluded from the query - which then implies that the GROUP BY and DISTINCT queries in this context are identical.

In the context of the question asked - the asker is looking for DISTINCT rows.

There are situations where the GROUP BY makes more sense - for instance if the earliest appointment time was required as part of the output then a MIN() function with GROUP BY would yield the correct results - however - neither the question posed nor solution offered suggests this is the case.