Solved

massive inner query- your feedback please

Posted on 2013-11-09
7
220 Views
Last Modified: 2013-11-13
Hi,

Can you suggest/ review the below code if it is optimal or if there are any recommendations that will benefit.

i will inserted in the comments the # of records in each table for the query.

EDP_ITEM_FACT & EDP_ITEM_PERM are almost same table (fields wise.. but differ a little..EDP_ITEM_FACT has more data in it than EDP_ITEM_PERM ).. these are the 2 tables in main focus here....

Both EDP_ITEM_FACT & EDP_ITEM_PERM related to WAREHOUSE table based on WAREHOUSE_KEY, hence i joined only one of them.
Both EDP_ITEM_FACT & EDP_ITEM_PERM relate to LOCATION table, based on LOCATION KEY, but the FKs are different.
ITEM_HIST is the history table and it is huge. I joined only one of the tables, since that should suffice through ITEM_KEY.
Both EDP_ITEM_FACT & EDP_ITEM_PERM relate to EDP_PICK through PICK_KEY (i joined both tables to this EDP_PICK)
Both EDP_ITEM_FACT & EDP_ITEM_PERM relate to DOT table through DOT_KEY, but the FKs are different- so joined both tables to DOT table.
Only EDP_ITEM_PERM has a necessary relation with EDP_BUSI_PERM.

the main aim is to see where there is a minute difference between records in
EDP_ITEM_FACT & EDP_ITEM_PERM
for the same ITEM KEY. (as seen in the last condition - the OR statement within the bracket)

The expected results (# of records returned by the SELECT) will/expected to be small (generally lower 3 digit #)

I would appreciate any feedback on the efficiency of the query as you see it. thanks.
0
Comment
Question by:25112
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 2
7 Comments
 
LVL 5

Author Comment

by:25112
ID: 39636523
the query is

SELECT
c.WAREHOUSE_CODE,
c.WAREHOUSE_NAME,
f.ITEM_ID,
e.LOCATION_ID,
e.LOCATION_NAME,
h.PICK_CODE,
h.PICK_DESC,
h.DOHTEM_CODE,
h.DOHTEM_DESC,
d.LOCATION_ID,
d.LOCATION_NAME,
g.PICK_CODE,
g.PICK_DESC,
g.DOHTEM_CODE,
g.DOHTEM_DESC,
f.EXERFM,
f.EXERLM

 FROM
 
dbo.EDP_ITEM_FACT a /* EDP_ITEM_FACT has approx 1/2 Mil records */
JOIN dbo.EDP_ITEM_PERM b on a.ITEM_KEY=b.ITEM_KEY  /* EDP_ITEM_PERM has approx 100K records */
JOIN dbo.WAREHOUSE c ON c.WAREHOUSE_KEY=b.WAREHOUSE_KEY  /* WAREHOUSE has 990 records */
JOIN dbo.LOCATION d ON a.PICK_LOCATION_KEY = d.LOCATION_KEY  /* LOCATION has approx 4k records */
JOIN dbo.LOCATION e ON b.ITEM_LOCATION_KEY = e.LOCATION_KEY
JOIN dbo.ITEM_HIST f on a.ITEM_KEY = f.ITEM_KEY /* ITEM_HIST has 62 Mil records */
JOIN dbo.EDP_PICK g on a.PICK_KEY = g.PICK_KEY/* EDP_PICK has approx 1000 records */
JOIN dbo.EDP_PICK h on b.PICK_KEY = h.PICK_KEY
JOIN dbo.DOT i on a.SSRSDT_DOT_KEY = i.DOT_KEY /* DOT has approx 6000 records */
JOIN dbo.DOT j on b.PERM_WAGE_DOT_KEY = j.DOT_KEY
JOIN dbo.EDP_BUSI_PERM k on b.BUSI_PERM_KEY = k.BUSI_PERM_KEY /* EDP_BUSI_PERM has 2008 records */

WHERE
a.REF_DATE = '2012-05-31' and
f.REF_DATE_HIST =  '2012-05-30' and  
g.DOHTEM_CODE <> 'A6' and
h.DOHTEM_CODE <> 'A6'  and
f.MFG_CODE <> 'HP' and
(d.LOCATION_ID<>e.LOCATION_ID or g.PICK_CODE <>h.PICK_CODE or g.DOHTEM_CODE <> h.DOHTEM_CODE)/*This line is the heart of the code */


ORDER BY
c.WAREHOUSE_NAME,
e.LOCATION_ID,
h.PICK_DESC,
h.DOHTEM_DESC,
f.EXERFM,
f.EXERLM
0
 
LVL 35

Assisted Solution

by:David Todd
David Todd earned 334 total points
ID: 39636595
Hi,

I don't' see the point of table alias i - k.

Generally table aliases are a little more relevant than a - m.

Generally, these columns should have some form of indexing a.item_key, b.item_key, a.pick_location_key, b.item_location_key, a.pick_key, b_pick_key.

Is this what you wanted to know?

Regards
  David

I'm guessing that this is a star schema, but am wondering about the difference and apparent duplication between edp_item_fact and edp_item_perm
0
 
LVL 13

Assisted Solution

by:magarity
magarity earned 166 total points
ID: 39637243
The only potential trouble spot is item hist. You've specified a single date in the where clause but is there only one history record per item per day? If not, this will cause multiples and potentially make the working set quite large. You may need a subquery with an aggregate function to get one per day but I can't tell from here.
Also, a lot of the record counts are unequal so make sure none of them ought to be left joins instead.
Otherwise dtodd is correct that this is a pretty straightforward case of just indexing the various key fields.
0
Monthly Recap

May was a big month for new releases from Linux Academy! Take a look at what our team built recently in our blog. You can access the newest releases from our blog.

 
LVL 5

Author Comment

by:25112
ID: 39640482
David, do you mean one non clustered index in the below?: (thank you- i am changing the alias to more meaningful; yes duplication to some extent because of warehouse, but in this case DOHTEM_CODE, LOCATION_ID, PICK_CODE vary and this query is mainly to see which ones differ)

EDP_ITEM_FACT.ITEM_KEY
EDP_ITEM_FACT.PICK_KEY
EDP_ITEM_FACT.PICK_LOCATION_KEY
EDP_ITEM_FACT.SSRSDT_DOT_KEY

EDP_ITEM_PERM.ITEM_KEY
EDP_ITEM_PERM.PICK_KEY
EDP_ITEM_PERM.WAREHOUSE_KEY
EDP_ITEM_PERM.ITEM_LOCATION_KEY
EDP_ITEM_PERM.PERM_WAGE_DOT_KEY
EDP_ITEM_PERM.BUSI_PERM_KEY

WAREHOUSE.WAREHOUSE_KEY

LOCATION.LOCATION_KEY

ITEM_HIST.ITEM_KEY

EDP_PICK.PICK_KEY

DOT.DOT_KEY

EDP_BUSI_PERM.BUSI_PERM_KEY
0
 
LVL 5

Author Comment

by:25112
ID: 39640485
magarity,
the result set is generally 100-140.. so it is OK and expected.

i did not understand the below.. can you pl clarify:
"
Also, a lot of the record counts are unequal so make sure none of them ought to be left joins instead.
"
0
 
LVL 35

Accepted Solution

by:
David Todd earned 334 total points
ID: 39640622
Hi,

I don't think that item_key is appropriate for the clustered index. Generally to avoid duplication, or rather, allow de-duplication, need to have a separate ID. But given the joins on item_key etc, these fields imho should be indexed.

Regards
  David
0
 
LVL 5

Author Comment

by:25112
ID: 39644798
thanks.
0

Featured Post

Webinar: Aligning, Automating, Winning

Join Dan Russo, Senior Manager of Operations Intelligence, for an in-depth discussion on how Dealertrack, leading provider of integrated digital solutions for the automotive industry, transformed their DevOps processes to increase collaboration and move with greater velocity.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Occasionally there is a need to clean table columns, especially if you have inherited legacy data. There are obviously many ways to accomplish that, including elaborate UPDATE queries with anywhere from one to numerous REPLACE functions (even within…
In part one, we reviewed the prerequisites required for installing SQL Server vNext. In this part we will explore how to install Microsoft's SQL Server on Ubuntu 16.04.
If you’ve ever visited a web page and noticed a cool font that you really liked the look of, but couldn’t figure out which font it was so that you could use it for your own work, then this video is for you! In this Micro Tutorial, you'll learn yo…
Add bar graphs to Access queries using Unicode block characters. Graphs appear on every record in the color you want. Give life to numbers. Hopes this gives you ideas on visualizing your data in new ways ~ Create a calculated field in a query: …

688 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question