Avatar of Pawel_Kowalski
Pawel_Kowalski
 asked on

Getting Data From 3 mySQL Tables

Hello, I have a mySQL database with the following tables:

zSS_ndc_medid
RMIID1_MED
zSS_formulary_list

The database structure looks a bit like this:

zSS_ndc_medid:

MEDID      |NDC
-------------
1      |2222
1      |3333
1      |4444
2      |5555

RMIID1_MED:

MEDID      |MED_MEDID_DESC
-----------------------
1      |Tylenol
2      |Bayer

zSS_formulary_list

NDC      |formulary_status      |list_id      |provider      |MEDID
----------------------------------------------------------------------
22222      |2                  |FSL101            |PBM            |1
22222      |3                  |FSL101            |PBM            |1


So far with some great help from here I have the following query in my PHP script:

SELECT 
    a.NDC, a.MEDID, a.BN, a.GNI, b.MED_STRENGTH, b.MED_STRENGTH_UOM, b.MED_MEDID_DESC, b.MEDID, b.MED_REF_FED_LEGEND_IND, b.GENERIC_MEDID, c.NDC, c.provider, c.list_id, c.formulary_status
FROM zSS_ndc_medid a
JOIN RMIID1_MED b ON a.MEDID = b.MEDID
LEFT JOIN zSS_formulary_list c ON c.NDC = a.NDC
   AND c.list_id = 'FSL101' AND c.provider = 'RXHUBPBM' 

Open in new window


This works fine most of the time. However, when I have a case such as the one above where there are multiple NDC values with different status in the zSS_formulary_list table I can only get one status, unfortunately I need to get all of them.

Looping through each record isn't really an option since I am dealing with large amounts of data. Is there a way to devise a SQL query that can do this all in a single query? Any help would be much appreciated. Thank you.
DatabasesPHPMySQL Server

Avatar of undefined
Last Comment
Pawel_Kowalski

8/22/2022 - Mon
Guy Hengel [angelIII / a3]

what about using GROUP BY and GROUP_CONCAT ?
not sure what you actually get and what you need?
Pawel_Kowalski

ASKER
Thanks for the tip. I am looking at that now.

What I need to do is loop through the zSS_ndc_medid table and for each NDC I need to look for all associated NDCs in the zSS_formulary_list table and select the formulary_status column for each NDC found. I also need to go in the RMIID1_MED table based on the MEDID from the zSS_ndc_medid table and select various columns in that. Hope that clarifies what I'm trying to do. Thanks.
SOLUTION
AntonioAlmeida

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
GET A PERSONALIZED SOLUTION
Ask your own question & get feedback from real experts
Find out why thousands trust the EE community with their toughest problems.
dqmq

With your sample data, your query returns 0 rows because of the c.provider = 'RXHUBPBM'  
condition.  Remove that condition and it returns both rows.  

Not sure what you are expecting?????


BTW, zSS_formulary_list has three relationships that appear suitable for the join on clause.  Of course, we don't know the meaning of those relationships so there is not much to say except that that's sort of unusual and perhaps not the best idea. I hope you understand those relationships and are using the intended one.
Experts Exchange is like having an extremely knowledgeable team sitting and waiting for your call. Couldn't do my job half as well as I do without it!
James Murphy
dqmq

For example:

SELECT  
    a.NDC, a.MEDID, a.BN, a.GNI, b.MED_STRENGTH, b.MED_STRENGTH_UOM, b.MED_MEDID_DESC, b.MEDID, b.MED_REF_FED_LEGEND_IND, b.GENERIC_MEDID, c.NDC, c.provider, c.list_id, c.formulary_status
FROM zSS_ndc_medid a
JOIN RMIID1_MED b ON a.MEDID = b.MEDID
LEFT JOIN zSS_formulary_list c ON c.NDC = a.NDC
WHERE c.list_id = 'FSL101' AND c.provider = 'PBM'  
Pawel_Kowalski

ASKER
Hi Antonio,

Is there a way to combine a left and a right join so you get both the results from table a and all the results on table b? I'm not sure your query did that.

First let me apologize that I had to change the tables I am using around a bit, this came to my attention this morning. Instead of using zSS_ndc_medid I eliminated that and will instead be looking at RNDC14_NDC_MSTR which looks like this:

NDC	|GCN_SEQNO
------------------
111	|8743
112	|8743
113	|8754

Open in new window


GCNSEQNO is the exact same thing as MEDID in principle but is a different value I have to search on.

In my zSS_formulary_list table I also have a GCNSEQNO column (instead of the MEDID column before). Everything else is pretty much the same but just as an example here is a structure again:

NDC      |formulary_status      |list_id      |provider      |GCNSEQNO
----------------------------------------------------------------------
22222      |2                  |FSL101            |PBM            |1
22222      |3                  |FSL101            |PBM            |1

Open in new window


I will also be doing a join on a table called RMINDC1_NDC_MEDID which I don't think is all that relevent here but looks like this:

NDC	|MEDID
------------------
111	|2774
112	|2774
113	|4684

Open in new window



I took the query you posted above and modified it for this new table layout:

SELECT 
    a.NDC, a.GCN_SEQNO, b.MEDID, c.list_id, c.provider, c.formulary_status
FROM RNDC14_NDC_MSTR a
JOIN RMINDC1_NDC_MEDID b ON a.NDC = b.NDC
LEFT JOIN zSS_formulary_list c ON c.GCNSEQNO = a.GCN_SEQNO
WHERE c.list_id = 'FSL101' AND c.provider = 'RXHUBPBM'  

Open in new window


Yet my results only show all the results from the RNDC14_NDC_MSTR table, the results from zSS_formulary_list are returned only once for each record eventhough for one record in the RNDC14_NDC_MSTR table there might be 10 in the zSS_formulary_list table that have that GCNSEQNO #.

Here is a sample result:

Array (
        [NDC] => 00002060440 
        [GCN_SEQNO] => 9319 
        [MEDID] => 254034 
        [list_id] => FSL101 
        [provider] => RXHUBPBM 
        [formulary_status] => 2 
) 

Open in new window


The goal I am trying to accomplish would actually have multiple values for formulary_status here since again the zSS_formulary_list table will have multiple rows returned for each GCN_SEQNO number.

Thanks for all your help and again my apologies for changing the table layout on you.
ASKER CERTIFIED SOLUTION
dqmq

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
Pawel_Kowalski

ASKER
Thank you, seems to be getting closer. But how do the results on this work? My RNDC14_NDC_MSTR table has 131,233 records in it. By running the above query I actually get an output of 366,596 rows which matches what I have in the zSS_formulary_list table. But really the output I am looking for is the 131,233 rows from the RNDC14_NDC_MSTR table with the associated formulary_status column(s) from the zSS_formulary_list table that match the GCNSEQNO number. I guess this makes sense since mySQL returns rows but that makes it a bit hard for me to process on the program side.

If I could just pick out the status column from zSS_formulary_list which had the highest value out of all of them would this simplify things and return the expected 131,233 records? I wanted to get all of the values for future uses but this isn't actually a requirement and I can get away with just picking the highest value out. Is there a query that could do that? I looked at the aggregate functions for GROUP BY but am having a hard time writing such a query with this many different tables.
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
dqmq

I'm confused.  I thought you wanted records from table B, even if there were no matches in A.  That alone, increases the row count beyond Table A alone.  Also, if table a has 131,233 records and some of them are associated with more than one formulary status, then you either get more rows or "throw away" all but one of the statuses.

Anyway, for all rows in A times all rows in B, consolidating C:
SELECT  
    a.NDC, a.GCN_SEQNO, b.MEDID, c.list_id, c.provider, c.formulary_status
FROM RNDC14_NDC_MSTR a 
LEFT JOIN 
RMINDC1_NDC_MEDID b ON a.NDC = b.NDC 
LEFT JOIN
(Select max(c.formulary_status) formulary_status, max(list_id) List_ID, max(provider) Provider , gcnseqno, 
from zSS_formulary_list
where  c.list_id = 'FSL101' and c.provider = 'RXHUBPBM'
group by gcnseqno) c 
ON c.GCNSEQNO = a.GCN_SEQNO

Open in new window



For all rows in A, consolidating rows in B and C:
SELECT  
    a.NDC, a.GCN_SEQNO, b.MEDID, c.list_id, c.provider, c.formulary_status
FROM RNDC14_NDC_MSTR a 
LEFT JOIN
(Select ndc, max(medid) medid from RMINDC1_NDC_MEDID group by ndc) b ON a.NDC = b.NDC 
LEFT JOIN
(Select max(c.formulary_status) formulary_status, max(list_id) List_ID, max(provider) Provider , gcnseqno, 
from zSS_formulary_list
where  c.list_id = 'FSL101' and c.provider = 'RXHUBPBM'
group by gcnseqno) c 
ON c.GCNSEQNO = a.GCN_SEQNO

Open in new window



Pawel_Kowalski

ASKER
Thanks again, but I'm getting a SQL error:

#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'from zSS_formulary_list where c.list_id = 'FSL101' and c.provider = 'RXHUBPBM' ' at line 8

Open in new window


What I am trying to do is basically this.

Loop through each record in RNDC14_NDC_MSTR (131,233 total records).

For each record go in to the RMINDC1_NDC_MEDID table and pick out an MEDID based on the NDC from the previous table (there will only be one match on the NDC)

- For each record from RNDC14_NDC_MSTR I also need to access the zSS_formulary_list and look for any record that has a matching GCN_SEQNO #, provider=RXHUBPBM, and list_id=FSL101

- In zSS_formulary_list there might be multiple matches for GCN_SEQNO for each row in RNDC14_NDC_MSTR

- I want to take the highest match in zSS_formulary_list and return only those.

- To clarify the GCN_SEQNO column is named  GCN_SEQNO in RNDC14_NDC_MSTR while it is named GCNSEQNO in zSS_formulary_list

This process should hopefully produce the exact amount of records found in RNDC14_NDC_MSTR which is 131,233.

If your query already does that my apologies for spelling all this out again I just wanted to make sure I am clear on what I am trying to do, I know my questions have been confusing since I rethinked what I wanted to do half way through. Again I can't thank you enough for the great help.
Pawel_Kowalski

ASKER
To clarify, when I say "I want to take the highest match in zSS_formulary_list and return only those" I mean the highest match on the column formulary_status.
I started with Experts Exchange in 2004 and it's been a mainstay of my professional computing life since. It helped me launch a career as a programmer / Oracle data analyst
William Peck
dqmq

Now we have a good explantion to work from. :>)

So, here's my stab at it.  Since I'm only deskchecking this, don't be alarmed by any syntax errors--we can work through those.  The crux is the in line select which is bounded by dashed lines.  You may want to get just that part working first.  That part should return the formulary status row with the max status.

SELECT a.NDC, a.GCN_SEQNO, b.MEDID, c.list_id, c.provider, c.formulary_status 
FROM RNDC14_NDC_MSTR a  
LEFT JOIN 
RMINDC1_NDC_MEDID b ON a.NDC = b.NDC  
LEFT JOIN
(
------------
Select v1.formulary_status, v1.list_id, v1.provider, v1.gcnseqno   
  from zSS_formulary_list v1
 where v1.list_id = 'FSL101' and v1.provider = 'RXHUBPBM' 
   and v1.formulary_staus in 
    (select max(formulary_status) from zSS_formulary_list v2
     where v2.GCNSEQNO = v1.gcnseqno
      and v2.list_id = v1.list_id 
      and v2.provider = v1.provider)
group by v1.formulary_status, v1.list_id, v1.provider, v1.gcnseqno
-------------
) c on c.gcnseqno = a.GCN_SEQNO

Open in new window


Have a go...
Pawel_Kowalski

ASKER
There was a small typo which I fixed and the query is executing. However, I am not actually able to process it since it's running really slow.

Would you have any tips on setting up the proper indexes on this data?
Pawel_Kowalski

ASKER
Thanks for all the help, I got a solution in another question about this. I'm sorry I didn't make what I was trying to do clearer in this question.
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.