High Availbility

OS & Virtualization

Sunday, August 24, 2008

Understanding Statspack

This section shows a great deal of information, in a very small amount of space. We can see how much REDO is generated on average every second and for every transaction. Here, I can see that I generate about 5 to 6 KB of redo per second. My average transaction generates just 13 KB of redo. The next bit of information has to do with logical and physical I/O. I can see here that about 1 percent of my logical reads resulted in physical I/O – that is pretty good. I can also see that on average, my transactions perform almost 4,000 logical reads. Whether that is high or not depends on the type of system you have.
In my case, there were some large background jobs executing, so a high read count is acceptable.
Now for the really important information: my parse-related statistics. Here I can see that I do about 16 parses per second and about 0.17 of those are hard parses (SQL that never existed before). Every six
seconds or so, my system is parsing some bit of SQL for the very first time. That is not bad. However, I would prefer a count of zero in this column in a finely tuned system that has been running for a couple of days. All SQL should be in the shared pool after some point in time.

The next section in the above shows us some interesting numbers. The % Blocks Changed per Read shows us that in this case, 99 percent of the logical reads we do are for blocks that are only read, not updated. This system updates only about 1 percent of the blocks retrieved. The Recursive Call % is very high – over 97 percent. This does not mean that 97 percent of the SQL executed on my system is due to 'space management' or parsing. If you recall from our analysis of the raw trace file earlier from SQL_TRACE, SQL executed from PL/SQL is considered 'recursive SQL'. On my system, virtually all work is performed using PL/SQL, other than mod_plsql (an Apache web server module) and an occasional background job, everything is written in PL/SQL on my system. I would be surprised if the Recursive Call % were low in this case.
The percentage of transactions that rolled back (Rollback per transaction %)is very low, and that is a good thing. Rolling back is extremely expensive. First, we did the work, which was expensive.
Then, we undid the work and again, this is expensive. We did a lot of work for nothing. If you find that most of your transactions roll back, you are spending too much time doing work and then immediately undoing it. You should investigate why you roll back so much, and how you can rework your application to avoid that. On the system reported on, one out of every 345 transactions resulted in a
rollback – this is acceptable.

In my mind, the most important ratios are the parse ratios – they get my attention immediately. The soft parse ratio is the ratio of how many soft versus hard parses we do. 99 percent of the parses on this system are soft parses (reused from the shared pool). That is good. If we see a low soft parse ratio, this would be indicative of a system that did not use bind variables. I would expect to see a very high ratio in this field regardless of tools or techniques used. A low number means you are wasting resources and introducing contention. The next number to look at is the Parse CPU to Parse Elapsd. Here, I show about 88 percent. This is a little low; I should work on that. In this case for every CPU second spent parsing we spent about 1.13 seconds wall clock time. This means we spent some time waiting for a resource – if the ratio was 100 percent, it would imply CPU time was equal to elapsed time and we processed without any waits. Lastly, when we look at Non-Parse CPU, this is a comparison of time spent doing real work versus time spent parsing queries. The report computes this ratio with round(100*(1- PARSE_CPU/TOT_CPU), 2). If the TOT_CPU is very high compared to the PARSE_CPU (as it should be), this ratio will be very near 100 percent, as mine is. This is good, and indicates most of the work performed by the computer was work done to execute the queries, and not to parse them.
All in all, in looking at the above section, my recommendation would be to reduce the hard parses even further. There obviously are a couple of statements still not using bind variables somewhere in the system (every six seconds a new query is introduced). This in turn would reduce the overall number of parses done because a hard parse has to perform a lot of recursive SQL itself. By simply removing a single hard parse call, we'll reduce the number of soft parses we perform as well. Everything else in that section looked acceptable. This first section we just reviewed is my favorite part of the StatsPack report, at a glance it gives a good overview of the relative 'health' of your system. Now, onto the rest of the report:

This little snippet gives us some insight into our shared pool utilization. The details shown above are:
❑ Memory Usage – The percentage of the shared pool in use. This number should stabilize in mid-70 percent to less than 90 percent range over time. If the percentage is too low, you are wasting memory. If the percentage is too high, you are aging components out of the shared pool, this will cause SQL to be hard parsed if it is executed again. In a right-sized system, your shared pool usage will stay in the 75 percent to less than 90 percent range.

❑ SQL with executions>1 – This is a measure of how many SQL statements were found in
the shared pool that have been executed more than once. This number must be considered carefully in a system that tends to run in cycles, where a different set of SQL is executed during one part of the day versus another (for example, OLTP during the day, DSS at night).
You'll have a bunch of SQL statements in your shared pool during the observed time that were not executed, only because the processes that would execute them did not run during the period of observation. Only if your system runs the same working set of SQL continuously will this number be near 100 percent. Here I show that almost 80 percent of the SQL in my shared pool was used more than once in the 13 minute observation window. The remaining 20 percent was there already probably – my system just had no cause to execute it.

❑ Memory for SQL w/exec>1 – This is a measure of how much of the memory the SQL you
used frequently consumes, compared to the SQL you did not use frequently. This number will in general be very close to the percentage of SQL with executions greater than one, unless you have some queries that take an inordinate amount of memory. The usefulness of this particular value is questionable.
So, in general you would like to see about 75 to 85 percent of the shared pool being utilized over time in a steady state. The percentage of SQL with executions greater than one should be near 100 percent if the time window for the StatsPack report is big enough to cover all of your cycles. This is one statistic that is affected by the duration of time between the observations. You would expect it to increase as the amount of time between observations increases.