A discussion on remote database administration (remote dba) & other news items that catch my attention

Current Articles | RSS Feed RSS Feed

The Sin of Band-Aids in Database Tuning

Posted by Michael Corey on Mon, Jun 23, 2008 @ 08:37 AM

I read an interesting blog entry titled “The Sin of Band-Aids” by the Oracle Alchemist. It started like this…..

Let’s take a trip to the past…we weren’t actually there, but we can pretend. Let’s look into a conversation between a DBA, developers, and management at some time ‘way back when.’

    Management: Our system has been performing poorly for months now.
    Developers: We didn’t change anything.
    DBA: I didn’t change anything.
    Management: We need a fix.
    Developers: We don’t have time to go through all our code and tune it.
    DBA: *frantically types: alter system set optimizer_index_caching = 35; alter system set optimizer_index_cost_adj = 25;* Fixed!


God does that sound familiar. How many times as a DBA, have we made an adjustment to the database as a Band-Aid fix to an application problem. We tell management it’s a band-aid, that eventually wont work, but thed don’t here that part.

The blog entry goes on to say…

Fast forward…

Queries are not performing as expected. Sometimes a query will run fine, sometimes it runs horribly. Sometimes a combination of queries and users results in massive buffer busy waits or ‘latch: cache buffers chains’ waits. Nothing is working and so finally, a consultant is called.

You’ve probably guessed that I was the consultant. Move to the head of the class! In particular, there were three or four main offenders; queries that would sometimes perform well, then all of a sudden ‘turn horrible.’ The core of the problem was truly unstable execution plans.

Using ASH I was able to go through different times of day and compare reports from times where the system performed well with times the system performed horribly. Doing so allowed me to compare SQL_IDs between different times of day to see which queries seemed to be going through dramatic changes. Once a SQL_ID was found that experienced such a dramatic change, I used DBMS_XPLAN.DISPLAY_AWR to find all the different iterations of the query.

In case you’ve never used it, DBMS_XPLAN.DISPLAY_AWR is very useful. It allows you to input only a SQL_ID, and Oracle will show you the explain plans for that SQL_ID as recorded in the AWR. For instance:
select * from TABLE(dbms_xplan.display_awr(’95t7cmj84u4jz’));

The next step was to look through the results to find any inconsistencies large enough to result in huge differences in performance. In this situation, I found that ‘good’ explain plans were using decent indexes, NESTED LOOPs, and low overall bytes. The ‘bad’ explain plans were using index joins, where an index fast full scan would join with an index range scan into a view (e.g. index$_join$_003). These plans resulted in many more block touches and much more CPU used.

Using Grid Control, I looked at the timeline for each of the hash values of the SQL_ID and saw that the poorly performing plans lined up with their bad performance periods.

Of course, now the burning question: what can be done about this?

Well it’s the Oracle Alchmist blog entry not mine, only right that you go to the source to see the answer to the question and also his conclusions.

http://www.oraclealchemist.com/oracle/the-sin-of-band-aids/

 

Posted by Michael Corey

www.ntirety.com

 

 

 

Tags: ,

COMMENTS

On Plaxo, Mark Farnham Made the following comment on this blog Entry. I think its dead on right... 
 
I agree with most of the sentiments in this article, but I would tweak the title a bit. I think the real sin is calling the band-aid a fix, as if you're done. (And sometimes the band-aid is more like a tourniquet.) So at the point where the declaration "FIXED" was made, if we continue the ER analogy, the Doctor should be saying, okay, we've stopped the bleeding for now. We need to discuss options for preventing infections and saving the leg. As for how to proceed after that to provide future performance stability and predictability, there are books full of advice out there. Capacity and scaling planning and workload management can help you navigate away from getting wounds in the first place, and Method-R (Millsap, et. al) is probably the best way to choose which wounds to fix and what will be the most effective way to fix them when you have an active performance problem (wound). 

posted @ Monday, June 30, 2008 11:17 AM by Michael Corey


Post Comment
Name
 *
Email
 *
Website (optional)
Comment
 *

Allowed tags: <a> link, <b> bold, <i> italics

Receive email when someone replies.