Performance optimization, Sybase

I have to improve the query performance of the below query as it is taking more than 4 minutes to return data

LabResult has 46 million records!

trnnx_IE1LabResult       is on two columns (PatientId, Test) and it is clustered

set showplan on
select      CollectionDate,
from LabResult

LastChangedDate >=  '10/01/2011'
and LastChangedDate <=  getdate()
and ResultValue is not null
and CollectionDate is not null

and patientid in



    STEP 1
        The type of query is DECLARE.

Total estimated I/O cost for statement 1 (at line 0): 0.

Optimized using Serial Mode

    STEP 1
        The type of query is SELECT.

    2 operator(s) under root

       |ROOT:EMIT Operator (VA = 2)
       |   |RESTRICT Operator (VA = 1)(4)(0)(0)(0)(0)
       |   |
       |   |   |SCAN Operator (VA = 0)
       |   |   |  FROM TABLE
       |   |   |  LabResult
       |   |   |  Using Clustered Index.
       |   |   |  Index : trnnx_IE1LabResult
       |   |   |  Forward Scan.
       |   |   |  Positioning by key.
       |   |   |  Keys are:
       |   |   |    PatientId ASC
       |   |   |  Using I/O Size 16 Kbytes for data pages.
       |   |   |  With LRU Buffer Replacement Strategy for data pages.

Total estimated I/O cost for statement 2 (at line 2): 10178.

Let me know how can I proceed.
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Tomas Helgi JohannssonCommented:

For this query create an index on these columns
(LastChangedDate, patientid, ResultValue, CollectionDate)

And you will see some performance improvements.

Remember. Either use the columns in an index in your queries where clauses or
create index that matches those columns.

     Tomas Helgi
Joe WoodhousePrincipal ConsultantCommented:


Put the ~200 patient IDs in a table (temp table is ok) and join to that table. Put a clustered index on the temp table too.


Make sure you've run update index statistics on LabResult. On 46 million rows this won't be fast but it won't take all weekend either.


Per the answer above, if you create that second index, make sure it is nonclustered.


Why would ResultValue or CollectionDate ever be NULL? If they are, are they really adding any information here to this table? Consider making this into two tables, one containing only fully useful data, and the other containing rows that have a NULL in either of these columns. You could always build a VIEW that does a UNION ALL on the two tables if you really do need to be able to see both data sets at the same time.


What's your performance target?
Ryan McCauleyData and Analytics ManagerCommented:
Clustered indexes should only be placed on columns that are perpetually ascending - I don't know much about your database, but I suspect that PatientID isn't a good candidate for clustering. If the clustering key isn't perpetually ascending, it can cause a large amount of data movement as you add values later, or can lead to a large amount of index fragmentation as values are added to the "middle" of the index, because they'll assumably spread across the PateintID range.

Though this article is Microsoft SQL Server specific, the considerations about clustered indexes hold across any database system, and you should consider the suggestions when choosing the cluster key:

That said, the first expert recommended clustering starting with LastChangedDate, which is presumably much more upward-marching than just PatientID alone. While still not an ideal cluster key, it would help keep the data much more organized than the current key. In addition to better organization, the recommended key would quicken this query quite a bit because it makes filtering on PatientID quick as well.
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

rbhargawFounderAuthor Commented:

I have recently joined this company 1 week back and this is the task I have been assigned to.
Actually this query is part of bigger problem. Basically there is one application which  calls a query to get the list of patient id and then each patient id will call with another query(labresult) to get the data.This is taking long time , around 8 minutes. So I suggested to combine 2 queries into Stored procedure so to get all data with one hit to database , rather than 200 times and then rewriting the poorly written 2nd query properly(labresult)

The second query (labresult) was used under  union at 5 places,so I can combine all 5 labresult table union into one temp table and then iterate through lesser number of rows, To get less rows is the query I have placed in this question!! So I was thinking of non clustered indexes on all columns as suggested and putting patientid in temp table and looping the query to get the lab results faster.So I am trying to get idea of ways to optimize the queries without touching much business logic . I need to check about NULLS in check value/Collection date, since this is old code so probably they had null values. I will check on Monday about the index statistics as I don't have weekend access :) Let me know if I need to post Stored proc too?
rbhargawFounderAuthor Commented:

 I am assuming patientid is incremental but will let you know on Monday as I cant check in weekend and thanks for the link, I will go though them!
Ryan McCauleyData and Analytics ManagerCommented:
PatientID is probably incremented with each new patient, but in a table that contains test results, new results can belong to any patient, so the new rows aren't always added to the end of the table. In the "Patients" table, it might make sense to cluster based on PatientID since it's constantly increasing in that context and new rows aren't ever added using smaller PatientID values.
Joe WoodhousePrincipal ConsultantCommented:
Recall, too, that clustered indexes are the best for supporting joins, and I am guessing PatientID is the most frequent join column here.
Tomas Helgi JohannssonCommented:

If I would have recommended to change the clustering index
I would have said so in my first comment.  
I only recommended adding an index (non clustering) on these
columns (LastChangedDate, patientid, ResultValue, CollectionDate)  to speed up this and hopefully other queries.
So I think ryanmccauley has misunderstood my first comment and I would have to
agree with Joe_Woodhouse's last comment.

    Tomas Helgi
rbhargawFounderAuthor Commented:
Even though I have created non clustered index on (LastChangedDate, patientid, ResultValue, CollectionDate), running them inside the stored procedure is calling the clustered index instead and giving below information.

 -- Optimized at runtime using Deferred Compilation.

If I run just the select query, I can use non clustered index, do I need to move the small query to another procedure?
Tomas Helgi JohannssonCommented:
Either in create procedure or execute procedure
you should use the parameter "with recompile" if you have created
the procedure before the index to instruct the Adaptive Server to compile a new plan
which is used in subsequent executions.

      Tomas Helgi
rbhargawFounderAuthor Commented:

I do have "with recompile" option in the create stored procedure
Joe WoodhousePrincipal ConsultantCommented:
Try the other suggestions we've made in this thread. So long as update index statistics has been run recently on the table, ASE is telling you this is the best index to use. You're almost certainly looking at a SQL rewrite...
rbhargawFounderAuthor Commented:
As per the Database guys , update index statistics is run on every weekend.

I have asked them to put non clustered index, lets see if that improves any performance, will know by tomorrow, as the changes will be done in night
rbhargawFounderAuthor Commented:
Hey Guys, The index on these columns did not make much difference, Do you want me to put down the stored procedure?
Ryan McCauleyData and Analytics ManagerCommented:
The index may not change performance much, but can you get a new copy of the execution plan so you can see if it's even being used? I'm not sure about the specifics of Sybase, but I know SQL Server is particular about the tipping points where it shifts from an index seek/scan into a table scan, which can lead to some unexpected results. I'd be curious to see if the execution plan is the same with the index in place, meaning the DB is ignoring it - maybe a query hint or a slight restructure of the query could improve things.
rbhargawFounderAuthor Commented:
The newly created index is getting used, however I am checking the stored proc and problem could be in different area. There is one more table with 26 million rows which could be the bottleneck. I initially assumed that has only thousands of rows, should have checked in production!Sorry new to this environment and taking time.I will get back on this issue.
Ryan McCauleyData and Analytics ManagerCommented:
a table with 10,000x as many rows as you expected could definitely cause some problems :) That said, even billions of rows isn't too many as long as your query is selecting them appropriately - in this case, doing an index seek for the rows it wants as opposed to doing a index/table scan to find rows that match the criteria given.
Tomas Helgi JohannssonCommented:

I agree, joining two large tables together requires that suitable indexes are on both tables (A and B ) otherwise you will end up having a tablescan on the one that doesn't have the right index matching your query (let say it is the table B).
In the above case you have a tablescan in table B for each row that is fetched in table A
resulting in poor performance.

    Tomas Helgi
rbhargawFounderAuthor Commented:
I have attached the Stored procedure, can you please have a look and let me know if this can be improved.

#temp_Patient  will contain around 150-200 rows
trntb_LabResult  contain 46 million rows
trntb_TxEventOrder  contain 26 million rows

Index on trntb_TxEventOrder

trnnx_IF1220TxEventOrder       PatientId      clustered
trnnx_IF1025TxEventOrder       ChecklistInstanceItemId      nonclustered
trnnx_IF1219TxEventOrder       TxEventId      nonclustered
trnnx_PKTxEventOrder       TxEventOrderId      nonclustered, unique
trnnx_TxEventOrderDate       OrderDate      nonclustered

Index on trntb_LabResult

trnnx_IE1LabResult       PatientId, Test      clustered
trnnx_IF1021LabResult       TxEventOrderId      nonclustered
trnnx_PKLabResult       LabResultId      nonclustered, unique
trnnx_TestAbbrevCollectionDate       TestAbbreviation, CollectionDate      nonclustered
trnnx_TestCollectionDate       Test, CollectionDate      nonclustered
trnnx_LastChangedDate       LastChangedDate      nonclustered
IDX_LastChangedDate       LastChangedDate, PatientId, ResultValue, CollectionDate      nonclustered
rbhargawFounderAuthor Commented:
Attaching the stored procedure
Joe WoodhousePrincipal ConsultantCommented:
General comments:

Don't use derived tables when you don't need to. (SELECT FROM (SELECT FROM ...) That forces a join order (optimiser must resolve the derived table first) which may not be ideal. Every time the code does that, it doesn't need to.

Which version of ASE is this? If you're using anything earlier than ASE 15.0.2, the optimiser isn't going to have much information about those temp tables and the indexes on them. If you're in an old version there are workarounds we should talk about.

Lots of hard-coded IN lists... I don't know how much that hurts, but I know it's a serial lookup on each row and for the rowcounts you're talking about, saving even 0.01s per row is something you want.

Can those UNIONs be made into UNION ALLs? Is there any possibility of duplicates between each arm of the UNIONs?

Suggest CREATE TABLE #temp table... INSERT SELECT be instead turned into SELECT INTO.

Those are the quick easy comments without knowing what the proc does and why. :)

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Tomas Helgi JohannssonCommented:

Do you have an index like this (PatientId, TxEventId) on trntb_TxEvent.
If not then I suggest you add one to improve the query that you use in
the insert in #temp_patient.

And if you are going to use this query

select      tlb.CollectionDate,
from trntb_LabResult tlb, #temp_Patient tp
where  tlb.Patientid = tp.patientID
and tlb.LastChangedDate >=  @startDate
and tlb.LastChangedDate <=  getdate()
and tlb.ResultValue is not null
and tlb.CollectionDate is not null

 instead of the query that you first provided then the index which I suggested earlier should be a little bit modified
( patientid,LastChangedDate, ResultValue, CollectionDate) nonclustered

And on the table trntb_TxEventOrder there should be an index like this

(PatientId,OrderResultCollectionDate, OrderResultValue,CheckListTestCode) nonclustered

    Tomas Helgi
rbhargawFounderAuthor Commented:
The above recommendation helped a lot in reducing couple of minutes! I could not proceed much with the issue as this went into backlog from business but got a fair amount of idea from EE to work later on the issue. Thanks to everyone involved!
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.