Swaminathan K
asked on
Data warehouse query tuning
Hi Team,
I need help with the query below. I have attached an Excel sheet that contains the data and the expected output. I wrote the query below for the requirement in the sheet, and I am looking for better ways of writing it, since the fact table contains millions of rows.
Requirement for the data in the excel sheet:
Write a parameterized SQL query to find which store made the highest sales of ‘Iphone-6’ till the end of Oct-2016, in the format given in ‘Expected Result’. ‘FACT_SALES’ has millions of records, so the SQL query is expected to be fine-tuned and shouldn’t give any performance issues.
Below is the query I have written for the above requirement. Any help in this regard is really appreciated.
With sales_data as (
Select
fs.store_id, fs.prod_id , fs.zone_id,sum(qty_sold) qty_sold,
dense_rank() over (order by max(qty_sold) desc ) rk
from fact_sales fs INNER JOIN DIM_PROD dp on (dp.prod_id=fs.prod_id)
where to_Char(fs.sale_date,'MON-YYYY') <= to_char(to_date('&1','YYYYMMDD'),'MON-YYYY')
and dp.prod_name='&2'
group by fs.store_id, fs.prod_id , fs.zone_id
)
Select dp.prod_name, ds.store_name , dz.zone_code , sd.qty_sold * dp.Unit_Price_Rs HIGHEST_SALE ,(dp.Unit_Price_Rs * (sd.qty_sold * dp.Unit_Price_Rs)) Total_Unit_Price_Rs
from sales_data sd
JOIN DIM_STORES ds on (ds.store_id=sd.store_id and ds.zone_id=sd.zone_id)
JOIN DIM_ZONE dz on (dz.zone_id=ds.zone_id)
JOIN DIM_PROD dp on (dp.prod_id=sd.prod_id)
where rk=1;
ASKER
Hi Portlet Paul,
Can you help me with the condition below? How should I change it, or is there a better way of writing it?
where to_Char(fs.sale_date,'MON-YYYY') <= to_char(to_date('&1','YYYYMMDD'),'MON-YYYY')
ASKER
The date parameter would be entered as OCT-2016, i.e. in MON-YYYY format.
IPhone_Sales.xlsx
Just change the format pattern of to_date to match your parameter.
If your parameter is always in that format:
WHERE fs.sale_date < add_months(to_date('OCT-2016', 'MON-YYYY'),1)
which INCLUDES October 2016 in the data to search
===
nb: I do NOT recommend using language-dependent month names, but that is your decision, and if you do change the format, just change the to_date pattern to suit.
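As a rough, untested sketch of how that predicate could sit inside the aggregation from the original query: the bind variable names :p_month and :p_prod are mine, not from the question, and I am assuming the month always arrives as MON-YYYY with English month names. The optional third argument to to_date pins the language so the session settings cannot change how 'OCT' is read.

```sql
-- Sketch only: :p_month and :p_prod are hypothetical bind variable names.
-- Assumes :p_month is supplied as e.g. 'OCT-2016'.
SELECT fs.store_id,
       fs.prod_id,
       fs.zone_id,
       SUM(fs.qty_sold) AS qty_sold
FROM   fact_sales fs
       INNER JOIN dim_prod dp ON dp.prod_id = fs.prod_id
-- TO_DATE with a month/year mask returns the first day of that month,
-- so ADD_MONTHS(..., 1) is the first day of the following month.
WHERE  fs.sale_date < ADD_MONTHS(
                        TO_DATE(:p_month, 'MON-YYYY', 'NLS_DATE_LANGUAGE=ENGLISH'),
                        1)
AND    dp.prod_name = :p_prod
GROUP  BY fs.store_id, fs.prod_id, fs.zone_id;
```

The point is that fs.sale_date is left untouched on the left-hand side; all the conversion work is done once, on the parameter.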
ASKER
Will this condition perform better on the fact table with millions of rows?
I cannot test any assertion I make here - only you can do that.
I also can only make assumptions about the indexing in your data, and this is critical to the execution efficiency.
So I cannot KNOW it will be quicker, but it SHOULD be quicker.
You should also learn to use explain plans. These are the best tool to use to investigate what indexes are being used (or not) and to identify performance problems.
So a long answer. Yes it will be better, but maybe there are still problems.
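If you have not used explain plans before, a minimal way to get one in Oracle looks something like this. It is a sketch against your table and column names as posted; I am assuming nothing about which indexes exist, so I cannot say what the plan will show.

```sql
-- Ask the optimizer for its plan without running the query.
EXPLAIN PLAN FOR
SELECT COUNT(*)
FROM   fact_sales fs
WHERE  fs.sale_date < ADD_MONTHS(TO_DATE('OCT-2016', 'MON-YYYY'), 1);

-- Display the plan that was just generated.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Run it once with the to_char version of the condition and once with the date version, then compare the access path reported for fact_sales (for example a full table access versus an index range scan). Bear in mind that a "till end of month" condition may cover most of the table, in which case a full scan can be the right choice.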
ASKER
awesome. thank you
Great. Pleased it helped.
Keep this in mind: avoid using functions on data to suit a where clause condition
NEVER use function on data to suit the where clause; instead adjust the where clause to suit the data
ALSO, when dealing with dates don't compare them to char values, compare dates to dates
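To see why comparing char values goes wrong, here is a small illustration you can run against dual. The dates are made up for the example: a sale in April 2017 should not count as "till end of Oct-2016".

```sql
-- Character comparison: 'APR-2017' sorts before 'OCT-2016' because 'A' < 'O',
-- so the April 2017 sale wrongly passes the test. Returns 'included'.
SELECT CASE WHEN 'APR-2017' <= 'OCT-2016'
            THEN 'included' ELSE 'excluded' END AS char_compare
FROM   dual;

-- Date comparison: 15 April 2017 is not before 1 November 2016,
-- so the sale is correctly left out. Returns 'excluded'.
SELECT CASE WHEN DATE '2017-04-15' < ADD_MONTHS(DATE '2016-10-01', 1)
            THEN 'included' ELSE 'excluded' END AS date_compare
FROM   dual;
```

So the to_char version is not only slower, it can also give wrong results, because month names compare alphabetically rather than chronologically.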