MariaHalt asked:

Plan statistics - Am I reading this correctly???

Hi Experts, I hope you're looking for a challenge today.  
This is my query.  The values in brackets are dummies to protect the innocent :)  

with 
cte_1 as (
select [a few fields] 
from [a really big table] 
where status = [some status] 
and [date_field]  >= sysdate - 365
), 
cte_2 as (
select [a few fields] 
from [the same really big table] 
where status = [some other status] 
and [date_field]  >= sysdate - 365
)
select /*+ gather_plan_statistics */
[a field], count(*) as rcdcnt
from cte_1 
inner join cte_2 on cte_1.<field xyz> = cte_2.<field xyz>
where cte_2.[date_field]  <  cte_1.[date_field] 
group by [a field]
order by [a field]



This runs painfully s-l-o-w.  The status and date_field are indexed separately.  An extended statistic has been created on [status] and [date_field].  

The execution plan, with my commentary, looks like this:

I believe execution starts with plan lines 3 and 4, sends those results to line 2, then goes to line 5 and sends those back to line 2 as well.  And then up the tree from there.  

My first question is how do I know it's using the extended statistic?  Should I have used a hint for that in my query?  By the way, this is what the extended statistic looks like:

My second question is what can I do to get the estimated rows down?  Please feel free to educate me.  
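On the first question, a sketch of two checks that may help (the table name below is a stand-in for the redacted one; both queries assume they are run in the owning schema and session). USER_STAT_EXTENSIONS shows whether the column group exists, and DBMS_XPLAN.DISPLAY_CURSOR with the ALLSTATS LAST format, which the gather_plan_statistics hint enables, lets you compare estimated rows (E-Rows) against actual rows (A-Rows):

```sql
-- Sketch only: MY_BIG_TABLE stands in for [a really big table].
-- 1) Confirm the extended (column group) statistic exists:
select extension_name, extension, creator
from   user_stat_extensions
where  table_name = 'MY_BIG_TABLE';

-- 2) Right after running the hinted query in the same session, pull the
--    runtime plan and compare E-Rows vs. A-Rows on the filtered table line:
select *
from   table(dbms_xplan.display_cursor(format => 'ALLSTATS LAST'));
```

If the estimate on the (status, date) filter line tracks the actual count closely, the column group is being used; the optimizer normally picks it up without any hint.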
Tags: Oracle Database, Oracle DBA

Last comment: MariaHalt, 8/22/2022
ASKER
MariaHalt

Anyone???
slightwv (䄆 Netminder)

In the plan, lines 3 and 5 are passed up to the hash join.  You can tell because they are indented at the same level.

https://blogs.oracle.com/oraclemagazine/how-to-read-an-execution-plan

>>My first question is how do I know it's using the extended statistic?

I believe the extended statistics allow you to see the actual rows and a few other things.  I've only looked at them a few times in roughly 30 years with Oracle, so I've not found them to be much use.  Others who make careers out of tuning might use them a lot more.

I would look at rewriting the query to only access [a really big table] once.

If you can dummy up a table with dummy data and expected results that represents what you are trying to get from your query, we can try to help you rewrite the query.

For the expected results, it would be great if you took your original query and used it with the dummy table so we can actually see it run to produce the results.
Alex [***Alex140181***]

I totally agree with slightwv: you should definitely rewrite your query! You're hitting that big table with both CTEs, which is obviously no good ;-) Try to "merge" them into one CTE..
ASKER
MariaHalt

I got it working in one pass of [a really big table] but performance got way worse!  Some more information, more than half the records in the table meet the criteria for [some other status].  Any thoughts about this?

--file_date is a date datatype
with cte as (
select
  a.[field a]  
, a.[field xyz]
, a.status
, a.file_date
, case      
    when a.status = [some status] then 'NOT ARCHIVED'
    when a.status = [some other status] and a.file_date >= sysdate - 365 then 'ARCHIVED'
    else null
  end archived_state
from [a really big table] a
where a.status in ([some status], [some other status])
)
select ap.[field a], count(*) as rcdcnt
from cte ap
inner join cte aa on ap.[field xyz] = aa.[field xyz]
where ap.archived_state = 'NOT ARCHIVED'
and aa.archived_state = 'ARCHIVED'
and ap.file_date < aa.file_date
group by ap.[field a]
order by ap.[field a];


slightwv (䄆 Netminder)

How many rows are in the big table?

Post the SQL you have.

A single full table scan is probably about the best you can do but we'll see.
ASKER
MariaHalt

21 million records +
slightwv (䄆 Netminder)

21 million isn't that large to some.

How fast does it run and how fast are you trying to make it?  Some tuning just isn't possible to achieve without physical changes and even then some aren't possible.

I also need the SQL you currently have.
Alex [***Alex140181***]

inner join cte aa on ap.[field xyz] = aa.[field xyz]


You're (still) hitting the table twice due to this self join! I don't think that's necessary ;-)

Statistics are up to date (table, columns and indexes)?
Are you on EE? If yes, you might consider partitioning the table ;-)
slightwv (䄆 Netminder)

Oops, I missed the SQL.

Yes, using the CTE twice almost always means hitting the table twice.  The execution plan should show this.

Can you please mock up a simple test case so we can help you rewrite the SQL?

>>Are you on EE? If yes, you might consider partitioning the table ;-)

Enterprise Edition is only part of that.  Partitioning is an extra cost option.

If you are licensed for it, it might help but might hurt other queries.

Right now, I'm not sure it is needed.
ASKER
MariaHalt


Number of rows = 22,644,472
Blocks = 3,751,485
Statistics last collected at 2:45pm this afternoon.
Average row length = 1,175
There are no partitions.

The two statuses I'm looking at account for 22,440,543 records.  So almost all.
slightwv (䄆 Netminder)

Stats are fine but don't help much in this case.

If you generate the execution plan for your latest query you'll probably see two accesses of the big table.  So just because it looks like one in the SQL, it really isn't.

We've both mentioned that you probably don't need two but without a test case we cannot say for sure.

I'm pretty confident that you can get it down to a single table access but don't know your data and requirements.

Without the sample test case I have no way to help.
Alex [***Alex140181***]

There are no partitions.
And you don't consider making use of this (if possible)?!
Alex [***Alex140181***]

where a.status in ([some status], [some other status])


What's the approx. amount expected here?! Rather "all" rows or just some (in relation to all rows)?

What does the following SQL return, regarding the counts of "[some status], [some other status]"?
select a.status, 
       count(*) 
  from [a really big table] a
 group by a.status


ASKER
MariaHalt

Here you go, [some status] = 9 and [some other status] = 10
 
status  count(*)
------  --------
     0       117
     1      1998
     2      9276
     3     21726
     4     22216
     5     45259
     6     91002
     7    120418
     8    217418
     9  10967107
    10  11650616

ASKER
MariaHalt

There are several indexes on the table.  Are partitions seamless?  What are the downsides?  I'm not a dba.  Could it adversely affect production?
slightwv (䄆 Netminder)

>>There are several indexes on the table.

It doesn't matter if the statuses you are pulling account for roughly more than 20% of the total rows.  The optimizer will opt for a full table scan.

>>Are partition seamless?  

Sort of but requires rebuilding the tables and indexes.

The most important issue is whether you are licensed to use them.  Partitioning is a pretty expensive add-on to your licenses.

If you don't have the license, no need to even consider it.

Your DBA should be able to tell you if you have the license.

>>What are the down sides?

It is more to consider.  Especially with indexing.

>> I'm not a dba.  Could it adversely affect production?

It could.  Depends on how other queries access the table.  Most probably won't notice the difference if everything is set up properly.  Some might.

Partitioning isn't a silver bullet.  It's a tool.  When used properly, it is great.  When used improperly, not so much.


Is there a reason you aren't mocking up a test case for us?

It doesn't have to be real data.  Just a table that has similar columns and a few rows then expected results from that table.

Just make it close enough that when we provide SQL based on it, you can apply that SQL against your actual data.

Should only take a few minutes.
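As one possible shape for such a mock-up (every name below is a placeholder mirroring the redacted query, not the real schema):

```sql
-- Placeholder names throughout; adjust to the real columns and statuses.
create table big_table_mock (
  field_a    number,
  field_xyz  number,
  status     number,
  date_field date
);

-- One key pair that should qualify (the status-10 date is older) ...
insert into big_table_mock values (1, 100,  9, sysdate - 10);
insert into big_table_mock values (1, 100, 10, sysdate - 20);
-- ... and one that should not (the status-10 date is newer).
insert into big_table_mock values (2, 200,  9, sysdate - 40);
insert into big_table_mock values (2, 200, 10, sysdate - 30);
commit;
```

Running the original query against a mock like this, together with the expected result, is what gives the Experts something concrete to rewrite against.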
Alex [***Alex140181***]


The optimizer will opt for a Full Table Scan. 
Yes... Way "too many" rows for those 2 statuses, no need for the CBO to look for any other way than a full table scan...

Bottom line still remains:
Try to "simplify" your query! Try not to hit big tables more often than necessary! And I guess in your case it is not necessary ;-)
It may be worth trying to create a materialized view (maybe semi-auto refreshable), but I'm no fortune teller though ;-)
ASKER
MariaHalt

Hi slightwv, If I told you the table had 700+ columns, mostly empty, would you want that in the mock up?  Or just the handful I'm returning/filtering on?  What format does the data need to be?  Is a csv file ok?  


Alex [***Alex140181***]

700+ columns
Wow!! You're almost hitting the cap at 1000 columns!! May I ask you -> why? This looks like a design "error"...
ASKER
MariaHalt

All I can say is the table ingests lots of data from various sources.  It's just the way it is.  And my team has to make it work.  I'm reading up on materialized views.  
slightwv (䄆 Netminder)

>> Or just the handful I'm returning/filtering on?  What format does the data need to be?  Is a csv file ok?  

We just need enough information that you can take our SQL and apply it to your system.

If you can do that with 5 columns or 50, whatever it takes.

CSV is fine.  Just something we can use to create the table and load the data.

Don't forget the expected results with whatever data you provide so we can ensure the SQL works.  Then you can take that and test against your system.

>>I'm reading up on materialized views.  

Just like partitioning:  They can be a blessing or a curse.  If not used properly, they can cause as many issues as they solve.

>>If I told you the table had 700+ columns, mostly empty

I would tell you to look for a replacement for the person that designed that.  There is no way that can be normalized.
ASKER
MariaHalt

LOL!!!  It wasn't me :)
ASKER
MariaHalt

The csv is bare bones, just the required fields and a mere 1000 records of each status. The query should return one record:  dept_id = 219 and rcdcnt = 5.  Thanks!  

with
cte_1 as (
select dept_id, dept_stuff, status, file_date
from ee
where status =  9
and file_date >= sysdate - 365
),
cte_2 as (
select dept_id, dept_stuff, status, file_date
from ee
where status = 10
and file_date  >= sysdate - 365
)
select
cte_1.dept_id, count(*) as rcdcnt
from cte_1
inner join cte_2 on cte_1.dept_id = cte_2.dept_id and cte_1.dept_stuff = cte_2.dept_stuff
where cte_2.file_date  <  cte_1.file_date
group by cte_1.dept_id
order by cte_1.dept_id



ee.csv
slightwv (䄆 Netminder)

Thanks for the data however the query you posted doesn't return any data for me after I loaded the csv.
Alex [***Alex140181***]

I suppose, this is just a tiny part of the whole data, that's why the query doesn't return any rows. The data's just missing...
Nevertheless, have you tried the following?!
select a.dept_id,
       count(*)
  from ee a
 where a.status in (9, 10)
   and a.file_date >= sysdate - 365
   and exists (select null
          from ee x
         where x.dept_id = a.dept_id
           and x.dept_stuff = a.dept_stuff
           and x.file_date < a.file_date)
 group by a.dept_id;


Otherwise, we need more/the correct data to test it...
slightwv (䄆 Netminder)

>>The data's just missing...

She posted "The query should return one record:  dept_id = 219 and rcdcnt = 5" so I kind of expected the provided data to return that.
ASKER
MariaHalt

Ugh, sorry.  Try this one.

ee.csv
ASKER
MariaHalt

Hi Alex, The file_date filter only applies to status = 10.  Records with status 9 need not be filtered for file_date >= sysdate - 365.  I changed it to this, and it's just as slow as the original:

select a.dept_id,
       count(*)
  from ee a
 where a.status = 10
 and  a.file_date >= sysdate - 365
   and exists (select null
          from ee x
         where x.status = 9
           and x.dept_id = a.dept_id
           and x.dept_stuff = a.dept_stuff
           and x.file_date < a.file_date)
 group by a.dept_id;
slightwv (䄆 Netminder)

>>Records with status 9 need not be filtered for file_date >= sysdate- 365.

But you do that in your original query.  Does it hurt anything to do it?

While we wait for updates on the status 9 filter issue, this returns the same results on the same data and only accesses the table once.  It does use an additional GROUP BY so I'm not sure how that will affect your performance on 21+ million rows, but give it a try and see how it does:
select dept_id, count(*)
from (
	select dept_id, dept_stuff,
		max(case when status=9 then file_date end) status_9_date,
		max(case when status=10 then file_date end) status_10_date
	from ee
	where status in (9,10)
		and file_date >= sysdate - 365
	group by dept_id, dept_stuff
)
where status_10_date < status_9_date
group by dept_id
;


ASKER
MariaHalt

I apologize, I did have that in my original query.  I should not have. 
Alex [***Alex140181***]

Hi Alex, The file_date only applies status = 10.  Records with status 9 need not be filtered for file_date >= sysdate- 365.                                  
Then even your CTE query is wrong ;-)
Alex [***Alex140181***]

select dept_id,
       count(*)
  from (select dept_id,
               dept_stuff,
               max(case
                     when status = 9 then
                      file_date
                   end) status_9_date,
               max(case
                     when status = 10 then
                      file_date
                   end) status_10_date
          from ee
         where (status = 10 and file_date >= sysdate - 365)
            or status = 9
         group by dept_id,
                  dept_stuff)
 where status_10_date < status_9_date
 group by dept_id;


Alex [***Alex140181***]

select a.dept_id,
       count(*)
  from ee a
 where ((a.status = 10 and a.file_date >= sysdate - 365) or a.status = 9)
   and exists (select null
          from ee x
         where x.dept_id = a.dept_id
           and x.dept_stuff = a.dept_stuff
           and x.file_date < a.file_date)
 group by a.dept_id;



You should just check our suggested queries against your live system and its indexes and stats, so you may choose what suits you best ;-)
ASKER
MariaHalt

Hi Alex, I just did.  It's picking up the correct record, plus 3 more.  It's taking about the same time as the original.  Will check out the statistics.  Thanks for trying.
slightwv (䄆 Netminder)

Which one did you try?  The second one he posted also accesses the table twice.

The first one he posted seems to be based off mine but removes the status-9 date check.

I'm away from my main machine right now so couldn't adjust mine.

This one here: #a43231051

It should only access the table once.
Alex [***Alex140181***]

I'm away from my main machine right now so couldn't adjust mine.
I've just adjusted yours ;-) (with the status 9 check)
ASKER
MariaHalt

I tried the second one.
slightwv (䄆 Netminder)

Try the first one (adjusted from mine).  It only hits the table once....................
ASKER
MariaHalt

I tried Alex's first query and it runs 40% faster, which is good!  However, it's picking up additional records, where dept_stuff is null.  Tried a few things to filter out the null dept_stuff records, but it slows down again.
ASKER
MariaHalt

Any thoughts?

slightwv (䄆 Netminder)

nulls weren't part of the test data or any of your requirements.

Not sure what you tried but try this:
select dept_id, count(*)
from (
	select dept_id, dept_stuff,
		max(case when status=9 then file_date end) status_9_date,
		max(case when status=10 then file_date end) status_10_date
	from ee
	where
		(
		(status = 10 and file_date >= sysdate - 365)
		or status = 9
		)
		and dept_stuff is not null
	group by dept_id, dept_stuff
)
where status_10_date < status_9_date
group by dept_id
;



Before you try your 'null' code, get an execution plan for it.  Then the other SQL.

Compare and contrast the plans to see where the differences are.
slightwv (䄆 Netminder)

How many rows with status of 9 and 10 have null dept_stuff?

With only 2000 rows, this is pretty much the same execution plan.  No idea if it will be better or worse against the large table.
with cte as (
	select dept_id, dept_stuff, status, file_date,
		lead(file_date) over(partition by dept_id, dept_stuff order by status) next_file_date
	from ee
	where 
		(
			(status = 10 and file_date >= sysdate - 365)
			or status = 9
		)
		and dept_stuff is not null
)
select dept_id, count(*)
from cte
where next_file_date < file_date
group by dept_id
;

slightwv (䄆 Netminder)

Here's my favorite.  Probably not any faster but who knows?

It is a whole lot cleaner to look at!

select dept_id, count(*)
from ee
pivot (
	max(file_date)
	for status in (9 stat9_date,10 stat10_date)
)
where dept_stuff is not null
	and stat10_date >= sysdate-365
	and stat10_date < stat9_date
group by dept_id
;


ASKER
MariaHalt

The dev environment had no nulls; when I tried it out in prod, I picked up the nulls.  
I tried a43231186 and variations of it with no improved performance from the original.  
I also tried a43231190, I like the use of the analytical function, lead(), but it also ran slow.  
The nulls aren't that many:
status 9: 1621 out of 11666012 are null
status 10: 19292 out of 10988338 are null


ASKER
MariaHalt

The query with the pivot is slow as well.
slightwv (䄆 Netminder)

What about the pivot?  It will probably be about the same but it is my favorite.

>>variations of it with no improved performance from the original.  

Which original?  It has to be faster than the original posted in the question.

You mean the same as the slowdown after the 40% increase?

See how this is getting confusing?

We, well at least, me, would like to see the execution plans.

Run the last one I posted using the pivot and post its execution plan.


I also never saw an answer to:
How fast does it run and how fast are you trying to make it?  Some tuning just isn't possible to achieve without physical changes and even then some aren't possible.
slightwv (䄆 Netminder)

>>The query with the pivot is slow as well.

Sorry, we cross posted.  I didn't refresh after I was done typing.

Please post the plan and define "slow" vs what you are thinking it should be.
ASKER
MariaHalt

I canceled the pivot query, it is still in the process of canceling.
Alex [***Alex140181***]

So, every time you add the where clause for the null check, the query slows down, independently of which SQL you use (pivot, mine, slightwv's)?!
Could you please post the execution plans for the pivot thing, one with the null check, the other without?!
If some index kicks in due to the null check, you may want to try to "disable" it:
select dept_id, count(*)
from ee
pivot (
   max(file_date)
   for status in (9 stat9_date,10 stat10_date)
)
where dept_stuff || '' is not null
   and stat10_date >= sysdate-365
   and stat10_date < stat9_date
group by dept_id
;



ASKER
MariaHalt

Alex, Correct.  The pivot was painful, sqldeveloper froze up on me, I'd prefer not to do that during the workday.  At the moment, I'm testing a partition to see if it's better.  Just worried I might be swapping one problem for many others :(

slightwv (䄆 Netminder)

You can generate plans w/o actually running the query.  They take a second to generate.

You should do that to get an idea if something is "better" or at least not worse.

I would like to see the plans for these three:
#a43231198
#a43231190
and
#a43231186
slightwv (䄆 Netminder)

I guess you have confirmed you are licensed for partitioning.

What are you looking to partition on?
Alex [***Alex140181***]

What are you looking to partition on?                                  
I strongly advise to solve this step by step and NOT to be too hasty here.
Just do as slightwv suggested: compare those execution plans beforehand ;-)

But please add this one to the list:
https://www.experts-exchange.com/questions/29205815/Plan-statistics-Am-I-reading-this-correctly.html#a43231338                                   

ASKER
MariaHalt

I made a copy of my table and partitioned it on status, since the 9 and 10 are almost evenly split.  
Alex [***Alex140181***]

I'd have taken "date_field" as the "main"/"root" partition qualifier.  Maybe status as a sub-partition...
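If that route is explored, a sketch of the DDL could look like this (all names, the monthly interval, and the initial boundary are illustrative; it needs the separately licensed Partitioning option and a rebuild of the table and its indexes):

```sql
-- Illustrative only: monthly interval partitions on file_date (the "main"
-- qualifier), list sub-partitions on status. Partitioning option required.
create table ee_part
partition by range (file_date) interval (numtoyminterval(1, 'MONTH'))
subpartition by list (status)
subpartition template (
  subpartition sp_status_9  values (9),
  subpartition sp_status_10 values (10),
  subpartition sp_other     values (default)
)
(
  partition p_initial values less than (date '2020-01-01')
)
as select * from ee;
```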
ASKER
MariaHalt

Alex [***Alex140181***]

Please do NOT execute the queries we mentioned, just generate the execution plans with "explain plan for...", like it's shown here:
https://blogs.oracle.com/optimizer/how-to-generate-a-useful-sql-execution-plan
slightwv (䄆 Netminder)

>>I made a copy of my table and partitioned it on status, since the 9 and 10 are almost evenly split.  

You need to think about your goal, which we still do not know because you haven't provided your tuning goals after I've asked for them a few times.  Remember "faster" is never a goal.

So, you partition on status.  Using the numbers you posted above you are now looking at 22,617,723 instead of 23,147,153.

Are you really expecting that to make a difference?  I'll predict no now and could have saved you however much time you have taken to try it.

Also, when creating partitions, you need to plan on using local or global indexes...  Not that indexes will help.  Maybe as Alex suggested, on the date field but even with that, you need to understand the cardinalities in the data.

Again, you can do all that long before actually "trying" anything.  If more than 20% (general rule of thumb) of the data will be accessed, indexes don't do any good.
ASKER
MariaHalt

There is a tweak I've been meaning to mention:
and stat10_date < stat9_date 
 should be the reverse
and stat9_date < stat10_date 


 I ran autotrace in my dev environment.  DEV_AUTOTRACE.jpg
ASKER
MariaHalt

I nixed the partition idea...it was a good exercise though.
slightwv (䄆 Netminder)

Autotrace must execute the query.

What we've been asking for is an execution plan that is generated and DOES NOT execute the query.  It takes under a second to generate and display.

The ways to do that are in the link Alex posted.

Here is a simple way to do it and how I do it:
explain plan for
select ....the rest of your query
;

--then to see the plan:
select * from table(dbms_xplan.display);



The text "explain plan for" before the statement is required.

From the attached image, I see a single table access, which is good.  Now what?

What about the plans for the others we have asked for?

AGAIN, it has taken me longer to type this reply than it takes to generate the plans....  why aren't you providing them?

You also still haven't provided your tuning goals.

Please provide them.
slightwv (䄆 Netminder)

Something else I'm curious about:
Out of the 700+ columns in the table, how many is your query accessing?

Not just selecting but using in total (includes the where clause).

For example, using one of my queries above:
select dept_id, count(*)
from (
	select dept_id, dept_stuff,
		max(case when status=9 then file_date end) status_9_date,
		max(case when status=10 then file_date end) status_10_date
	from ee
	where
		(
		(status = 10 and file_date >= sysdate - 365)
		or status = 9
		)
		and dept_stuff is not null
	group by dept_id, dept_stuff
)
where status_10_date < status_9_date
group by dept_id
;

Open in new window


It accesses 4 columns:  dept_id, dept_stuff, file_date, status
ASKER
MariaHalt

I have more time today to do these things.  The actual query accesses 5 columns; the fifth is dept_stuff_2.
slightwv (䄆 Netminder)

>>The actual query accesses 5 columns, dept_stuff_2.

Once we nail down the specific logic and what is likely going to be the best select, we can try looking at existing indexes.

With only 5 columns used, we might be able to come up with a new index or modify an existing one using all 5 columns in such a way that your query can avoid accessing the table and only use the index.

There are concerns doing that:  It will affect DML on the table and can affect other queries and system processes.

OVER INDEXING is a VERY BAD practice in general.

The system I work on now has this issue.  Total index size is twice the actual data size.  It is WAY over indexed and what I just proposed is the reason.....

Too many years of developers needing to fix "their query" so they create an index for it.

Well, with every developer doing that over the years, you get the problem I have now.
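A sketch of that covering-index idea (the index name and column order are hypothetical; as the warning above says, weigh the DML and storage cost before creating anything):

```sql
-- Hypothetical: the equality-filtered column first, then the join/group
-- keys, then the date, so the query could be answered from the index
-- alone without touching the wide table rows.
create index ee_cover_ix
  on ee (status, dept_id, dept_stuff, file_date);
```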
ASKER
MariaHalt

I spoke too soon about having time today!  But I got it done.  With reference to the query in #a43231338 I ran the explain plan with and without the null check.  See images.

PIVOT WITHOUT NULL CHECK.jpg

PIVOT WITH NULL CHECK.jpg
slightwv (䄆 Netminder)

That is great for what Alex asked for.

In the future, can you copy/paste the plans in a code block here?  Images are hard to deal with.  They are also more work for you....

I think we already know the PIVOT is a bad way to go from your previous test of it.  Notice the 10G and 12G of temp space?  Shouldn't need that much but without the other plans we have no way of knowing.

What about everything I've asked for.  Some several times......

Until I get the information I've asked for, there isn't anything else I can do to assist you.
ASKER
MariaHalt

ok....gotcha...working on it.

slightwv (䄆 Netminder)

It took longer for me to open my SQLDeveloper than it took to run.

I took all the SQL I wanted to see the plans for, pasted the following in the SQLDeveloper worksheet.

Clicked "Execute Script".
Task completed in  0.772 seconds.

The results were in the script output window.

Click in that window, Ctrl-A to highlight it all, Ctrl-C to copy it.  Open a code block here and the results are below the worksheet code.

Again, MUCH longer to type all this than to actually do it......

explain plan for 
select dept_id, count(*)
from ee
pivot (
	max(file_date)
	for status in (9 stat9_date,10 stat10_date)
)
where dept_stuff is not null
	and stat10_date >= sysdate-365
	and stat10_date < stat9_date
group by dept_id
;

select * from table(dbms_xplan.display);

explain plan for 
with cte as (
	select dept_id, dept_stuff, status, file_date,
		lead(file_date) over(partition by dept_id, dept_stuff order by status) next_file_date
	from ee
	where 
		(
			(status = 10 and file_date >= sysdate - 365)
			or status = 9
		)
		and dept_stuff is not null
)
select dept_id, count(*)
from cte
where next_file_date < file_date
group by dept_id
;

select * from table(dbms_xplan.display);

explain plan for 
select dept_id, count(*)
from (
	select dept_id, dept_stuff,
		max(case when status=9 then file_date end) status_9_date,
		max(case when status=10 then file_date end) status_10_date
	from ee
	where
		(
		(status = 10 and file_date >= sysdate - 365)
		or status = 9
		)
		and dept_stuff is not null
	group by dept_id, dept_stuff
)
where status_10_date < status_9_date
group by dept_id
;

select * from table(dbms_xplan.display);




My plans:
Explained.


PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 3203541740
 
-------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |  2002 | 26026 |     8  (13)| 00:00:01 |
|   1 |  HASH GROUP BY         |      |  2002 | 26026 |     8  (13)| 00:00:01 |
|   2 |   VIEW                 |      |  2002 | 26026 |     8  (13)| 00:00:01 |
|*  3 |    FILTER              |      |       |       |            |          |
|   4 |     HASH GROUP BY PIVOT|      |  2002 |   111K|     8  (13)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL | EE   |  2002 |   111K|     7   (0)| 00:00:01 |
-------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter(MAX(CASE  WHEN ("STATUS"=10) THEN "FILE_DATE" END 
              )>=SYSDATE@!-365 AND MAX(CASE  WHEN ("STATUS"=10) THEN "FILE_DATE" END 
              )<MAX(CASE  WHEN ("STATUS"=9) THEN "FILE_DATE" END ))
   5 - filter("EE"."DEPT_STUFF" IS NOT NULL)
 
Note
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----
   - dynamic statistics used: dynamic sampling (level=2)

24 rows selected. 


Explained.


Plan hash value: 743967200

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  1998 | 61938 |     8  (13)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT|      |  1998 | 61938 |     8  (13)| 00:00:01 |
|*  2 |   VIEW               |      |  1998 | 61938 |     8  (13)| 00:00:01 |
|   3 |    WINDOW SORT       |      |  1998 |   111K|     8  (13)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL| EE   |  1998 |   111K|     7   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("NEXT_FILE_DATE"<"FILE_DATE")
   4 - filter("DEPT_STUFF" IS NOT NULL AND ("STATUS"=9 OR "STATUS"=10
              AND "FILE_DATE">=SYSDATE@!-365))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)



Plan hash value: 14290771

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |  1998 | 25974 |     8  (13)| 00:00:01 |
|   1 |  HASH GROUP BY        |      |  1998 | 25974 |     8  (13)| 00:00:01 |
|   2 |   VIEW                |      |  1998 | 25974 |     8  (13)| 00:00:01 |
|*  3 |    FILTER             |      |       |       |            |          |
|   4 |     HASH GROUP BY     |      |  1998 |   111K|     8  (13)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL| EE   |  1998 |   111K|     7   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(MAX(CASE "STATUS" WHEN 10 THEN "FILE_DATE" END )<MAX(CASE
              "STATUS" WHEN 9 THEN "FILE_DATE" END ))
   5 - filter("DEPT_STUFF" IS NOT NULL AND ("STATUS"=9 OR "STATUS"=10
              AND "FILE_DATE">=SYSDATE@!-365))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


slightwv (䄆 Netminder)

and MORE important:
I'm waiting for existing execution time and your tuning goals.
ASKER
MariaHalt

This is my query
with
cte_1 as (
    select dept_id, dept_stuff_1, dept_stuff_2, file_date
    from   [my big table]
    where  status = 10
    and    file_date  >= sysdate - 365
),
cte_2 as (
    select dept_id, dept_stuff_1, dept_stuff_2, file_date
    from   [my big table]
    where  status = 9
)
select dept_id, count(*) as rcdcnt
from cte_1
inner join cte_2 on cte_1.[dept_stuff_1] = cte_2.[dept_stuff_1] and cte_1.[dept_stuff_2] = cte_2.[dept_stuff_2]
where cte_2.[file_date]  <  cte_1.[file_date]
group by dept_id
order by dept_id
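
Since the query carries the gather_plan_statistics hint, a quick way to compare estimated vs. actual rows is to pull the runtime statistics for the last cursor executed in the session (a sketch; nothing here is specific to this table):

```sql
-- run the query above first, then, in the same session:
select *
  from table(dbms_xplan.display_cursor(format => 'ALLSTATS LAST'));
```

The E-Rows vs. A-Rows columns show where the optimizer's estimates drift from reality.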



This is the explain plan for it
 
----------------------------------------------------------------------------------------------
| Id  | Operation           | Name         | Rows   | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |              | 212268 |       |       |  146K (14) | 00:00:06 |
|   1 |  SORT GROUP BY      |              | 212268 |       |       |  146K (14) | 00:00:06 |
|*  2 |   HASH JOIN         |              |    39M | 4058M |   99M |  137K  (8) | 00:00:06 |
|*  3 |    TABLE ACCESS FULL| MY BIG TABLE |  1634K |   81M |       | 65861  (7) | 00:00:03 |
|*  4 |    TABLE ACCESS FULL| MY BIG TABLE |  2004K |  107M |       | 66199  (8) | 00:00:03 |
----------------------------------------------------------------------------------------------

 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("A"."DEPT_STUFF_1"="A"."DEPT_STUFF_1" AND  "A"."DEPT_STUFF_2"="A"."DEPT_STUFF_2")
       filter("A"."FILE_DATE"<"A"."FILE_DATE")
   3 - filter("A"."STATUS"=9 AND "A"."DEPT_STUFF_1" IS NOT NULL)
   4 - filter("A"."STATUS"=10 AND "A"."DEPT_STUFF_1" IS NOT NULL AND
              "A"."FILE_DATE">=SYSDATE@!-365)



There are separate indexes on dept_id, file_date, dept_stuff_1, and status.
Note, dept_stuff_2 is part of a composite index with other columns (that are not part of this query).
slightwv (䄆 Netminder)

You don't seem to understand what I'm asking for, and you're providing random information.

Didn't Alex and I both agree that your original query wasn't the best approach?  Why did you provide the plan for it when it isn't what I've been asking for?  It serves no purpose.

I gave you three links that have the three queries I would like plans for.

I have also asked several times for your tuning goals.

Without the information I'm asking for, I cannot help you.

This is my last post until you provide the information I've requested.

If you cannot provide it, hopefully Alex can help you.
ASKER
MariaHalt

I was hoping to make this alternative query work since it only does 1 table scan, but it ran slower than the query in the previous comment.
 select dept_id,
       count(*)
  from (select dept_id,
               dept_stuff_1,
               dept_stuff_2,
               max(case when status = 9  then file_date end) status_9_date,
               max(case when status = 10 then file_date end) status_10_date
          from [my big table]
         where (status = 9 or (status = 10 and file_date >= sysdate - 365))
         and dept_stuff_1 is not null --> needed otherwise nulls are picked up
         group by dept_id,
                  dept_stuff_1,
                  dept_stuff_2)
 where status_9_date < status_10_date
 group by dept_id
 ;


 
------------------------------------------------------------------------------------------------
| Id  | Operation              | Name         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |              | 21168 |       |       | 71843  (7) | 00:00:03 |
|   1 |  HASH GROUP BY         |              | 21168 |       |       | 71843  (7) | 00:00:03 |
|   2 |   VIEW                 |              | 86318 |  674K |       | 71843  (7) | 00:00:03 |
|*  3 |    FILTER              |              |       |       |       |            |          |
|   4 |     HASH GROUP BY      |              | 86318 | 4720K |  119M | 71843  (7) | 00:00:03 |
|*  5 |      TABLE ACCESS FULL | MY BIG TABLE | 1726K |   92M |       | 65910  (7) | 00:00:03 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id): 
   3 - filter(MAX(CASE "STATUS" WHEN 9 THEN "FILE_DATE" END )<MAX(CASE "STATUS"
              WHEN 10 THEN "FILE_DATE" END ))
   5 - filter("DEPT_STUFF_1" IS NOT NULL AND ("STATUS"=9 OR "STATUS"=10 AND
              "FILE_DATE">=SYSDATE@!-365))


ASKER
MariaHalt

I must have missed that post with the 3 links, my apologies.  
My tuning goal is to make it run quicker.  I realize that is vague; the current execution time is painful, > 5 minutes.
Please know, I appreciate your help!  

#a43231198 (please note that line 9 needs to be changed to stat9_date < stat10_date)
------------------------------------------------------------------------------------------------
| Id  | Operation               | Name         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |              | 21210 |       |       |  273K  (7) | 00:00:11 |
|   1 |  HASH GROUP BY          |              | 21210 |       |       |  273K  (7) | 00:00:11 |
|   2 |   VIEW                  |              | 10425 |  101K |       |  273K  (7) | 00:00:11 |
|*  3 |    FILTER               |              |       |       |       |            |          |
|   4 |     HASH GROUP BY PIVOT |              | 10425 |   10M |   10G |  273K  (7) | 00:00:11 |
|*  5 |      TABLE ACCESS FULL  | MY BIG TABLE | 4169K | 4310M |       | 76517 (20) | 00:00:03 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter(MAX(CASE  WHEN ("STATUS"=10) THEN "FILE_DATE" END )>=SYSDATE@!-365 
              AND MAX(CASE  WHEN ("STATUS"=9) THEN "FILE_DATE" END )<MAX(CASE  WHEN 
              ("STATUS"=10) THEN "FILE_DATE" END ))
   5 - filter("MY BIG TABLE"."DEPT_STUFF" IS NOT NULL)



#a43231190  (NOTE:  this queries the EE table which was the smaller dataset)
---------------------------------------------------------------------------------
| Id  | Operation            | Name | Rows      | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      6132 |       |     4  (25)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT|      |      6132 |       |     4  (25)| 00:00:01 |
|*  2 |   VIEW               |      | 149732934 |       |     4  (25)| 00:00:01 |
|   3 |    WINDOW SORT       |      | 149783832 |       |     4  (25)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL| EE   | 149783832 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("NEXT_FILE_DATE"<"FILE_DATE")
   4 - filter(("STATUS"=9 OR "STATUS"=10 AND
              "FILE_DATE">=SYSDATE@!-365) AND "DEPT_STUFF" IS NOT NULL)




 #a43231186(NOTE:  this queries the EE table which was the smaller dataset)
----------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows      | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |       648 |       |     4  (25)| 00:00:01 |
|   1 |  HASH GROUP BY        |      |       648 |       |     4  (25)| 00:00:01 |
|   2 |   VIEW                |      |     75600 |       |     4  (25)| 00:00:01 |
|*  3 |    FILTER             |      |           |       |            |          |
|   4 |     HASH GROUP BY     |      |    754200 |       |     4  (25)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL| EE   | 149783832 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter(MAX(CASE "STATUS" WHEN 10 THEN "FILE_DATE" END )<MAX(CASE
              "STATUS" WHEN 9 THEN "FILE_DATE" END ))
   5 - filter(("STATUS"=9 OR "STATUS"=10 AND
              "FILE_DATE">=SYSDATE@!-365) AND "DEPT_STUFF" IS NOT NULL)


ASKER CERTIFIED SOLUTION
Alex [***Alex140181***]

slightwv (䄆 Netminder)

>> (NOTE:  this queries the EE table which was the smaller dataset)

Doesn't help.  Need to see the stats on the problem table.  Apples and Oranges.

How am I supposed to know which one might be better if you don't show me the plans?

It's like asking you to redesign my living room without letting you look at it, while telling me I'm free to go look at my neighbor's living room, which isn't the same size or shape, for ideas.

>>My tuning goals are to make it run quicker.  I realize that is vague,

I've already said "faster" is not a goal.  Pretty sure at least one of the queries we've posted is faster than the original one you posted.  Since it is faster, we are done, right?

>>the current execution time is painful > 5 minutes.
ASKER
MariaHalt

I followed the links to the 3 queries you mentioned.  It makes sense that you would want it against my table.  See below.

 #a43231190
-----------------------------------------------------------------------------------------------
| Id  | Operation            | Name         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |              | 21462 |       |       | 99882  (6) | 00:00:04 |
|   1 |  SORT GROUP BY NOSORT|              | 21462 |       |       | 99882  (6) | 00:00:04 |
|*  2 |   VIEW               |              | 1726K |   36M |       | 99882  (6) | 00:00:04 |
|   3 |    WINDOW SORT       |              | 1726K |   92M |  119M | 99882  (6) | 00:00:04 |
|*  4 |     TABLE ACCESS FULL| MY BIG TABLE | 1726K |   92M |       | 66098  (7) | 00:00:03 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
   2 - filter("FILE_DATE"<"NEXT_FILE_DATE")
   4 - filter("DEPT_STUFF" IS NOT NULL AND ("STATUS"=9 OR "STATUS"=10 AND
              "FILE_DATE">=SYSDATE@!-365))


 
 #a43231186 
-----------------------------------------------------------------------------------------------
| Id  | Operation            | Name         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |              | 21462 |       |       | 99882  (6) | 00:00:04 |
|   1 |  SORT GROUP BY NOSORT|              | 21462 |       |       | 99882  (6) | 00:00:04 |
|*  2 |   VIEW               |              | 2853K |   59M |       | 99882  (6) | 00:00:04 |
|   3 |    WINDOW SORT       |              | 2853K |  152M |  197M | 99882  (6) | 00:00:04 |
|*  4 |     TABLE ACCESS FULL| MY BIG TABLE | 2853K |  152M |       | 66098  (7) | 00:00:03 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
   2 - filter("FILE_DATE"<"NEXT_FILE_DATE")
   4 - filter("DEPT_STUFF" IS NOT NULL AND ("STATUS"=9 OR "STATUS"=10 AND
              "FILE_DATE">=SYSDATE@!-365))



In all honesty, my tuning goal is to make the client happier.  
I'm afraid the queries provided were not faster.  This one was quicker, #a43231051, but when I ran it in production it picked up nulls.  When I adjusted it for the nulls, it slowed down.

Thanks for your feedback and time.
ASKER
MariaHalt

In response to Alex's comment.

 #a43232528 
-----------------------------------------------------------------------------------------------
| Id  | Operation            | Name         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |              | 21462 |       |       | 99882  (6) | 00:00:04 |
|   1 |  SORT GROUP BY NOSORT|              | 21462 |       |       | 99882  (6) | 00:00:04 |
|*  2 |   VIEW               |              | 2853K |   59M |       | 99882  (6) | 00:00:04 |
|   3 |    WINDOW SORT       |              | 2853K |  152M |  197M | 99882  (6) | 00:00:04 |
|*  4 |     TABLE ACCESS FULL| MY BIG TABLE | 2853K |  152M |       | 66098  (7) | 00:00:03 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
   2 - filter("FILE_DATE"<"NEXT_FILE_DATE")
   4 - filter("DEPT_STUFF_1" IS NOT NULL AND ("STATUS"=9 OR "STATUS"=10 AND
              "FILE_DATE">=SYSDATE@!-365))


The statistics for the dev table were last analyzed 26-JAN-21 10:28:49 PM.

Thanks for the links, I will try to read them today.  
slightwv (䄆 Netminder)

Thanks for that.

Aside from the PIVOT one, the others seem to be about the same plans.

So, since it is the closest to your results, let's focus on #a43232528 moving forward.

For example, it added dept_stuff_2 to the query.  Do you need null checks on that as well?

Now can you provide the actual indexes, with their columns, that currently exist on the table?

I want to see what is currently there before we just start suggesting new indexes.

>>In response to Alex's comment.

It doesn't look like it went parallel.  Parallel query can greatly help with Full Table Scans.

Can you provide the results of:
select name, value from v$parameter where name like 'parallel%';
Alex [***Alex140181***]

If she's on EE, then parallelism is included, but yes, asking for the output of the param makes sense ;-)
Especially in this case...

@slightwv: just out of curiosity: You say "I've already said "faster" is not a goal." -> why and what are goals then from your POV?
slightwv (䄆 Netminder)

>>If she's on EE, then parallelism is included

I don't recall implying it wasn't.

>>why and what are goals then from your POV?

If faster is the goal, then we are done here.  One FTS has to be faster than two.

If faster was the goal, how do you know when you are done?

I mentioned earlier that if the query takes 5 minutes and the goal is 10 seconds then the goal likely isn't possible without some MAJOR re-engineering and trying to rewrite queries is a complete waste of time.
Alex [***Alex140181***]

Ok, I meant this one, sorry:
>>Remember "faster" is never a goal.
Can you elaborate on this a bit more?

>>I don't recall implying it wasn't.
I didn't say that ;-) I just thought "hey, if she tells us, she's on EE, there is no need to look at the v$parameter" ;-)
slightwv (䄆 Netminder)

>>Can you elaborate on this a bit more?

I did elaborate.

>>I just thought "hey, if she tells us, she's on EE, there is no need to look at the v$parameter" ;-)

Sure there is.  The plan she posted didn't show it going parallel even though the hint was there.

So, either the SQL she provided the plan for didn't have the hint
or
something prevented it from going parallel.
slightwv (䄆 Netminder)

>>Can you elaborate on this a bit more?

Just for fun....  even in the 8i docs:
https://docs.oracle.com/cd/A87860_01/doc/server.817/a76992/ch2_meth.htm

Set Clear Goals for Tuning

Never begin tuning without having first established clear objectives: you cannot succeed without a definition of "success."

"Just make it go as fast as you can" may sound like an objective, but it is very difficult to determine whether this has been achieved. It is even more difficult to tell whether your results have met the underlying business requirements. A more useful objective is: "We need to have as many as 20 operators, each entering 20 orders per hour, and the packing lists must be produced within 30 minutes of the end of the shift."

Keep your goals in mind as you consider each tuning measure. Consider its performance benefits in light of your goals.

Also remember that your goals may conflict. For example, to achieve best performance for a specific SQL statement, you may need to sacrifice the performance of other SQL statements running concurrently on your database.
Alex [***Alex140181***]

Well, you stated:
>>Remember "faster" is never a goal.
... which is -imho- way too general...
ASKER
MariaHalt

We're on EE.  Is there a specific parallel value you are interested in?
There's a PK on the table and these indexes.
INDEX_NAME              TYPE    UNIQUENESS  COLUMN
IDX_DEPTID              NORMAL  NONUNIQUE   DEPT_ID
IDX_FILEDATE            NORMAL  NONUNIQUE   FILE_DATE
IDX_DEPTSTUFF1          NORMAL  NONUNIQUE   DEPT_STUFF_1
CIDX_DEPTID_DEPTSTUFF2  NORMAL  NONUNIQUE   DEPT_STUFF_2
CIDX_DEPTID_DEPTSTUFF2  NORMAL  NONUNIQUE   DEPT_ID

ASKER
MariaHalt

I should make you aware that there is a lot of delete activity on this table.  

slightwv (䄆 Netminder)

>>Is there a specific parallel value you are interested in?

Just like to take a look at all that are set.  Curious why the plan didn't show it using the parallel hint.
ASKER
MariaHalt

parallel_min_percent             0
parallel_min_servers             32
parallel_max_servers             64
parallel_instance_group
parallel_execution_message_size  16384
parallel_degree_policy           MANUAL
parallel_adaptive_multi_user     FALSE
parallel_threads_per_cpu         1
parallel_min_time_threshold      AUTO
parallel_degree_limit            CPU
parallel_force_local             FALSE
parallel_servers_target          64
parallel_min_degree              1


slightwv (䄆 Netminder)

Just for grins, post the plan for this:
explain plan for select /*+ parallel */ * from user_objects;
select * from table(dbms_xplan.display);


slightwv (䄆 Netminder)

Cool.   See the "PX COORDINATOR" stuff?  That means it went parallel.

Please confirm the query in #a43232528 still doesn't.

You might try removing the table name and degree and just use the simple /*+ parallel */ hint.
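
A minimal sketch of what that would look like on the query from #a43232528 (the inline view is elided; only the hint placement matters):

```sql
select /*+ parallel */
       dept_id, count(*) as rcdcnt
  from ( /* existing inline view, unchanged */ )
 group by dept_id
 order by dept_id;
```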
Alex [***Alex140181***]

And maybe you put the parallel hint onto the outermost select from...
slightwv (䄆 Netminder)

>>And maybe you put the parallel hint onto the outermost select from...

Shouldn't matter. My guess is she left the '[]' in the hint or had some other syntax issue with the hint.



I loaded up my ee table with over a million junk rows and added 13 other dummy columns with data to make it larger.

This index is used when I run the SQL in #a43232528

drop index test_idx;
create index test_idx on ee(dept_id, status, file_date, dept_stuff_1, dept_stuff_2);



Parallel can only help more.
ASKER
MariaHalt

We have a winner!!!!
with parallel (on prod):  46.607 seconds
without parallel (on prod): 295.669 seconds

May I ask a stupid question: why didn't the optimizer go parallel without the hint?  It's obviously faster.
ASKER
MariaHalt

With the parallel hint on the outer most select, it went down to 33.241 seconds.
ASKER
MariaHalt

Another stupid question.  I'm sure the DBA will grill me... how can I defend the use of the parallel hint?
slightwv (䄆 Netminder)

I would be interested to see the results with my suggested index.

Just moving the hint to the outer query shouldn't have mattered but happy it worked.

>>why didn't the optimizer go parallel, without the hint?  

There can be several reasons.  My guess is the tables were never set up with the option.

You can read about parallelism here:
https://docs.oracle.com/en/database/oracle/oracle-database/12.2/vldbg/degree-parallel.html#GUID-1005E1A3-089F-4027-8B2A-EC7297543624
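
If the table was never given a default degree of parallelism, a sketch of how to check and set one (`my_big_table` is a placeholder name; the ALTER requires appropriate privileges):

```sql
-- check the table's default degree of parallelism
select table_name, degree
  from user_tables
 where table_name = 'MY_BIG_TABLE';

-- let the optimizer consider parallel plans without a hint
alter table my_big_table parallel 4;

-- revert to serial if needed
alter table my_big_table noparallel;
```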

>>I'm sure the dba will grill me.....how can I defend the use of the parallel hint?

The response times speak for themselves.
ASKER
MariaHalt

I don't have permission to add an index to production.  I'll try in the dev environment, but the execution times will be different.

slightwv (䄆 Netminder)

>> I'll try in the dev environment, but the execution times will be different.

Get times and execution plans before and after the index.  Make sure the after plan uses the index.

You should be able to gauge the effectiveness.
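One way to confirm which plan was actually executed is `DBMS_XPLAN.DISPLAY_CURSOR`.  A sketch (run it in the same session, immediately after the query you want to inspect):

```sql
-- 'ALLSTATS LAST' shows estimated vs. actual rows for the last execution;
-- actual rows require the gather_plan_statistics hint (or statistics_level = ALL)
select * from table(dbms_xplan.display_cursor(format => 'ALLSTATS LAST'));
```

Look for the index name (e.g. TEST_IDX) in the Operation column to verify it was used.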
ASKER
MariaHalt

There was barely any difference on dev with both parallel and the index versus just parallel.  I'm super happy!  It has been a crappy week.  This made me forget that.  Thank you both very much!
slightwv (䄆 Netminder)

Did the plan show it used the index on dev?

How many rows are in dev?
slightwv (䄆 Netminder)

As much as I appreciate the points, it was Alex that suggested parallel first.
Alex [***Alex140181***]

@slightwv: Thank you very much :-) That is indeed a very honorable behavior, wow!

@Maria: If you're interested, I'd give you some insight on the parallel stuff...
ASKER
MariaHalt

You were both soooo helpful!  I'm new to Oracle after many years of MS SQL Server.  If either of you is on Wyzant, or a similar platform, I'd be interested in learning more.
Alex [***Alex140181***]

@Maria: Thank you very much for re-distributing the point ;-) But it would have been fine for me either way...

Here are some more resources on the subject:
https://datacadamia.com/db/oracle/parallel_enable
http://www.dba-oracle.com/t_parallel_hint.htm
http://www.dba-oracle.com/t_parallel_automatic_degree.htm

https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/PARALLEL_DEGREE_POLICY.html#GUID-BF09265F-8545-40D4-BD29-E58D5F02B0E5
!! Before fiddling with this parameter, I strongly advise testing its behavior thoroughly ;-)
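A low-risk way to experiment with that parameter is at session level rather than system-wide; the values shown are the ones documented for PARALLEL_DEGREE_POLICY:

```sql
-- Let Oracle decide the degree of parallelism automatically, this session only
alter session set parallel_degree_policy = AUTO;

-- Back to the default: hints and table settings decide, no auto DOP
alter session set parallel_degree_policy = MANUAL;
```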

>>why didn't the optimizer go parallel, without the hint?
>>Can be several reasons.  My guess is the tables were never set up with the option?
Yes, there are quite a few prerequisites to fulfill in order to gain true auto-parallelism...
In the end, it's up to the CBO (the optimizer) to choose the path to the dark or light side ;-)
With the Oracle RDBMS set up properly, this little CBO Jedi should be right in virtually all cases...

May the force be with you...
ASKER
MariaHalt

The dbas totally shot me down.  No parallel hints.  I'll be adding a composite index instead.  Just wanted to vent.  Thanks again, maybe one day I'll get to use it.
slightwv (䄆 Netminder)

>>The dbas totally shot me down.  No parallel hints.

Then the poor performance isn't your problem, now is it?

Feel free to inform them that over-indexing can be bad for performance as well.

Then again, if they actually cared about performance, they wouldn't have a table like that in the first place!!!!!

If they disagree, feel free to have them come talk to me.......  ;)

>> I'll be adding a composite index instead

Just make sure the query uses it!

Execution plans are your best friend!

BUT just like besties, you can't always believe them...  Trust but verify.  Sometimes the optimizer is wrong.
Alex [***Alex140181***]

If they disagree, feel free to have them come talk to me.......  ;)
... and if they want some more, send them to me ;-)

It's a pity that DBAs act like this :-( So sorry for you...
ASKER
MariaHalt

:)