May 4, 2014

Scaling an application with paginated queries using ROWNUM

There was a case where one of the application modules was getting slower day by day as usage and data increased. The module is a mail-like system where we show 20 items per page. The initial design was not good: we fetched all the items from the database and then paginated in the application layer, 20 items per page. This caused slow inbox loading, especially for people who had a large number of items.

We thought this could be improved by fetching only a selective number of records from the database. We show 20 records per page, so we should bring back only 20 records at a time, not all of them up front. For the first page we fetch records 1 to 20, for the second page records 21 to 40, and so on. This helped a lot: the initial response time dropped from several minutes to just a few seconds! I loved the response time graphs (see below) after we made the changes.


/** Old List query **/
SELECT mbox.id mbox_id, mbox.mailbox_type mbox_mailbox_type, mbox.read_date mbox_read_date, mbox.status mbox_status,...
FROM mailbox mbox
LEFT JOIN message msg on (mbox.message_id = msg.id)
LEFT JOIN user u ON (u.id = msg.sender_id )
LEFT JOIN recipient r on (r.message_id = msg.id AND r.idx=0)
LEFT JOIN user ur ON (ur.id = r.target_user_id)
WHERE mbox.user_id= :user
AND mbox.folder_id= :folder
AND msg.test = :test
AND msg.updated >= :dt
ORDER BY msg_updated DESC;



/** New paginated List query **/
SELECT *
  FROM (select /*+ FIRST_ROWS(20) */
  a.*, ROWNUM rnum
      from ( SELECT mbox.id mbox_id, mbox.mailbox_type mbox_mailbox_type, mbox.read_date mbox_read_date, mbox.status mbox_status,...
              FROM mailbox mbox
              LEFT JOIN message msg on (mbox.message_id = msg.id)
              LEFT JOIN user u ON (u.id = msg.sender_id )
              LEFT JOIN recipient r on (r.message_id = msg.id AND r.idx=0)
              LEFT JOIN user ur ON (ur.id = r.target_user_id)
              WHERE mbox.user_id= :user
              AND mbox.folder_id= :folder
              AND msg.test = :test
              AND msg.updated >= :dt
              ORDER BY msg_updated DESC ) a
      where ROWNUM <= 20 )
WHERE rnum  >= 1;
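On the application side, mapping a page number to its ROWNUM bounds is simple arithmetic. Here is a minimal Python sketch; the function names and the query-wrapping helper are illustrative, not from our codebase:

```python
def page_bounds(page, page_size=20):
    """Inclusive (low, high) row numbers for a 1-based page number."""
    high = page * page_size
    low = high - page_size + 1
    return low, high

def paginate(inner_sql, page, page_size=20):
    """Wrap an ordered query in the classic Oracle ROWNUM pagination pattern."""
    low, high = page_bounds(page, page_size)
    return (
        "SELECT * FROM ("
        " SELECT a.*, ROWNUM rnum"
        f" FROM ( {inner_sql} ) a"
        f" WHERE ROWNUM <= {high} )"
        f" WHERE rnum >= {low}"
    )

print(page_bounds(2))  # → (21, 40): the second page covers rows 21..40
```

Note that the `ROWNUM <= high` filter has to sit outside the ordered subquery, because ROWNUM is assigned before ORDER BY within the same query block - that is why the pagination pattern needs the nesting shown above.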


Similarly, the Count queries were improved by capping the count at 100: if there are more than 100 new items in the inbox, we simply say 100+.

/** Old Count query **/
SELECT COUNT(id) inbox_lo
FROM mailbox mbx 
JOIN message msg ON (mbx.message_id = msg.id)
LEFT JOIN user l ON (l.id = msg.sender_id)
WHERE mbx.login_id = :login
AND mbx.folder_id = :folder
AND mbx.mailbox_type = :mailbox
AND mbx.status= :status
AND msg.test = :test
AND mbx.created >= :dt;


/** New Count query **/
SELECT /*+ FIRST_ROWS(100) */ COUNT(id) inbox_lo
FROM mailbox mbx 
JOIN message msg ON (mbx.message_id = msg.id)
LEFT JOIN user l ON (l.id = msg.sender_id)
WHERE mbx.login_id = :login
AND mbx.folder_id = :folder
AND mbx.mailbox_type = :mailbox
AND mbx.status= :status
AND msg.test = :test
AND mbx.created >= :dt
AND ROWNUM <= 100;
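The display side of the capped count is then trivial; a tiny Python sketch (names illustrative):

```python
def display_count(count, cap=100):
    """Render an inbox count whose query was capped at `cap` rows."""
    return f"{cap}+" if count >= cap else str(count)

print(display_count(37))   # → 37
print(display_count(100))  # → 100+
```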


The response time improved amazingly - users now feel the difference!

***

Apr 28, 2014

TNS-12599: TNS:cryptographic checksum mismatch

After upgrading to 11.2.0.3.0, we had been receiving this error "TNS-12599: TNS:cryptographic checksum mismatch" quite frequently.

Found related information on Oracle Support. The thing is - when Enterprise Manager logs in to or out of an 11.1 (11gR1) or 11.2 (11gR2) target database, or of the OMS repository database, it causes several TNS-12599 errors in the alert.log of the respective database.

This is bug 9953045.

Workaround:

1. On the target database side, set SQLNET.ENCRYPTION_TYPES_SERVER= (3DES168) in sqlnet.ora

2. Alternatively, apply the patch for bug 9953045, which fixes the problem.



Apr 2, 2014

Replacing SQL literals with Global Temporary Table backfired at the beginning

I always emphasize understanding the data and the business domain for query optimization. One must combine technical skills with domain knowledge; without that blend, tuning at this level is hardly possible.

We planned to tune some queries where we used literals and could not use bind variables: the values sit inside an IN clause, and we do not know in advance how many there will be (they are added dynamically by the application). We decided to replace the literals in the IN clause with a Global Temporary Table (GTT). That change made every other query perform better, or at least the same as before, except the SQL I discuss below. For that one, the GTT caused the optimizer to come up with a very wrong execution plan and hurt our system's stability - the SQL executed very frequently, and we could not afford the bad plan.
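The application-side flow for the GTT approach can be sketched as follows. This is a hedged Python sketch against a generic DB-API cursor: `gtt_in_list_2` and its `value` column are from our schema, but the function name is mine, and it assumes the GTT was created with ON COMMIT DELETE ROWS:

```python
def run_with_gtt(cursor, role_ids, pgm_id):
    """Load the dynamic IN-list into the GTT, then query with binds only.

    Assumes gtt_in_list_2 is a Global Temporary Table created with
    ON COMMIT DELETE ROWS, so each transaction sees only its own rows.
    """
    # One bound INSERT per IN-list value instead of inlined literals
    cursor.executemany(
        "INSERT INTO gtt_in_list_2 (value) VALUES (:1)",
        [(r,) for r in role_ids],
    )
    cursor.execute(
        "SELECT DISTINCT login_id"
        " FROM client_access_role car"
        " WHERE car.pgm_id = :pgm"
        " AND car.role IN (SELECT value FROM gtt_in_list_2)",
        {"pgm": pgm_id},
    )
    return cursor.fetchall()
```

With this, the SQL text is identical regardless of how many values the application supplies, so there is a single parent cursor instead of one per literal combination.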

After analyzing the plan, I found that the SQL with the Global Temporary Table (GTT) was doing an unexpected Nested Loop very early in the execution order, which forced the later steps to process far more data. We know the join with the GTT should happen at a late (or the last) stage of the execution order, and the best starting point (for the best selectivity) is the underlying table mv_caseload_priv (through the index MV_PGM_CASE_I) that we have in the client_access_role view. I drew the plan by hand and attached it below for a clear picture of what was going on. The Nested Loop (NL, marked with a red circle) was very unexpected, and the execution start point (marked with a red cross) was not optimal either.


The original SQL and its plan. The only problem was the literals instead of bind variables in the IN clause; for some value combinations this occasionally ended up with a terrible execution plan, and that was uncontrollable.

SQL > SELECT DISTINCT login_id
FROM client_access_role car
WHERE car.pgm_id = :pgm
AND car.role IN (21,22,112,......);


Elapsed: 00:00:00.11
------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                         | 11900 |   429K|   678   (1)| 00:00:01 |       |       |
|   1 |  HASH UNIQUE                             |                         | 11900 |   429K|   678   (1)| 00:00:01 |       |       |
|*  2 |   HASH JOIN                              |                         | 11900 |   429K|   676   (1)| 00:00:01 |       |       |
|   3 |    INLIST ITERATOR                       |                         |       |       |            |          |       |       |
|*  4 |     INDEX RANGE SCAN                     | AR_ROLE_ACL_I           | 16724 |   130K|    31   (0)| 00:00:01 |       |       |
|   5 |    NESTED LOOPS                          |                         |  9589 |   271K|   643   (1)| 00:00:01 |       |       |
|   6 |     MAT_VIEW ACCESS BY GLOBAL INDEX ROWID| MV_CASELOAD_PRIV        |   279 |  3348 |    83   (0)| 00:00:01 | ROWID | ROWID |
|*  7 |      INDEX RANGE SCAN                    | MV_CASE_PRIV_PGM_I      |   279 |       |     3   (0)| 00:00:01 |       |       |
|*  8 |     INDEX RANGE SCAN                     | CA_CASELOAD_LOGIN_ACL_I |    34 |   578 |     2   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("CA"."ACL_ID"="R"."ACL_ID")
   4 - access("R"."ROLE"=21 OR "R"."ROLE"=22 OR "R"."ROLE"=112)
   7 - access("CP"."PGM_ID"=TO_NUMBER(:PGM))
   8 - access("CP"."CASELOAD_ID"="CA"."CASELOAD_ID")
Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
         97  consistent gets


The IN clause replaced with a Global Temporary Table to get a stable plan - which backfired, causing a severe performance problem through an unexpectedly terrible plan.

SQL> SELECT DISTINCT login_id
FROM client_access_role car
WHERE car.pgm_id = :pgm
AND car.role IN (SELECT value FROM gtt_in_list_2);


Elapsed: 00:00:18.98
Execution Plan
----------------------------------------------------------
Plan hash value: 1525097291
-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name            | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |                 |     1 |    50 |    34   (3)| 00:00:01 |       |       |
|   1 |  HASH UNIQUE                    |                 |     1 |    50 |    34   (3)| 00:00:01 |       |       |
|   2 |   NESTED LOOPS                  |                 |     1 |    50 |    33   (0)| 00:00:01 |       |       |
|   3 |    NESTED LOOPS                 |                 |     1 |    38 |    32   (0)| 00:00:01 |       |       |
|   4 |     NESTED LOOPS                |                 |     1 |    21 |     1   (0)| 00:00:01 |       |       |
|   5 |      INDEX FULL SCAN            | GTT2_VAL_I      |     1 |    13 |     0   (0)| 00:00:01 |       |       |
|   6 |      TABLE ACCESS BY INDEX ROWID| ACL_ROLE        |     1 |     8 |     1   (0)| 00:00:01 |       |       |
|*  7 |       INDEX RANGE SCAN          | AR_ROLE         |  3787 |       |     1   (0)| 00:00:01 |       |       |
|   8 |     TABLE ACCESS BY INDEX ROWID | CASELOAD_ACCESS |   137 |  2329 |    31   (0)| 00:00:01 |       |       |
|*  9 |      INDEX RANGE SCAN           | CASE_ACC_ACL    |   137 |       |     2   (0)| 00:00:01 |       |       |
|  10 |    PARTITION HASH ITERATOR      |                 |     1 |    12 |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 11 |     INDEX RANGE SCAN            | MV_PGM_CASE_I   |     1 |    12 |     1   (0)| 00:00:01 |   KEY |   KEY |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   7 - access("R"."ROLE"="VALUE")
   9 - access("CA"."ACL_ID"="R"."ACL_ID")
  11 - access("CP"."PGM_ID"=TO_NUMBER(:PGM) AND "CP"."CASELOAD_ID"="CA"."CASELOAD_ID")
Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
    2817221  consistent gets


The SQL tuned with the ORDERED hint, guiding the optimizer to the join order we know is best.

SQL> SELECT /*+ ORDERED */ DISTINCT login_id
FROM client_access_role c
JOIN gtt_in_list_2 x ON (c.role = x.value)
WHERE c.pgm_id = :pgm;



Elapsed: 00:00:00.08
Execution Plan
----------------------------------------------------------
Plan hash value: 3611759149
-------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                         |     1 |    50 |   771   (7)| 00:00:01 |       |       |
|   1 |  HASH UNIQUE                              |                         |     1 |    50 |   771   (7)| 00:00:01 |       |       |
|*  2 |   HASH JOIN                               |                         |     1 |    50 |   770   (7)| 00:00:01 |       |       |
|   3 |    INDEX FULL SCAN                        | GTT2_VAL_I              |     1 |    13 |     0   (0)| 00:00:01 |       |       |
|*  4 |    HASH JOIN                              |                         |   311K|    10M|   762   (6)| 00:00:01 |       |       |
|   5 |     NESTED LOOPS                          |                         |  9589 |   271K|   643   (1)| 00:00:01 |       |       |
|   6 |      MAT_VIEW ACCESS BY GLOBAL INDEX ROWID| MV_CASELOAD_PRIV        |   279 |  3348 |    83   (0)| 00:00:01 | ROWID | ROWID |
|*  7 |       INDEX RANGE SCAN                    | MV_CASE_PRIV_PGM_I      |   279 |       |     3   (0)| 00:00:01 |       |       |
|*  8 |      INDEX RANGE SCAN                     | CA_CASELOAD_LOGIN_ACL_I |    34 |   578 |     2   (0)| 00:00:01 |       |       |
|   9 |     TABLE ACCESS FULL                     | ACL_ROLE                |   571K|  4467K|   105  (23)| 00:00:01 |       |       |
-------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("R"."ROLE"="X"."VALUE")
   4 - access("CA"."ACL_ID"="R"."ACL_ID")
   7 - access("CP"."PGM_ID"=TO_NUMBER(:PGM))
   8 - access("CP"."CASELOAD_ID"="CA"."CASELOAD_ID")
Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
        705  consistent gets






And we are back with the super performance!
***

Mar 31, 2014

What happened when we received ORA-04030: out of process memory - trying to allocate 824504 bytes (pga heap,kco buffer)


Here goes the story in short -

We had to bump up Oracle memory to accommodate buffer cache growth. So, on our Solaris database box, we increased the Oracle project memory in /etc/project from 32 GB to 64 GB - we had enough physical memory in the server (128 GB).

Then we increased SGA_MAX_SIZE from 28 GB to 36 GB and MEMORY_MAX_TARGET from 28 GB to 42 GB. We did not set the target parameters, as we had not been using Automatic Memory Management (AMM) or Automatic Shared Memory Management (ASMM).

Then from the following day, we started receiving errors while running the RMAN backup jobs -


ORA-00603: ORACLE server session terminated by fatal error
ORA-27103: internal error
Solaris-AMD64 Error: 9: Bad file number
Additional information: 108
Additional information: 4456448

We were also receiving ORA-04030 errors in the alert log.

ORA-04030: out of process memory when trying to allocate 824504 bytes (pga heap,kco buffer)
ORA-27103: internal error
Solaris-AMD64 Error: 9: Bad file number

It was clear to me that we had hit some memory allocation issue, as automatic allocation of PGA memory was failing. It was a production box, and these were internal errors, so the interpretation had to come from Oracle Support. We opened an SR and sent several trace and alert logs. At the beginning, they thought it was an OS resource allocation problem; then support thought it was an RMAN backup problem, possibly a bug, and suggested several alternatives. None of that helped. Finally they forwarded the SR to the Solaris OS memory engineers, who found something interesting for us.

After 3 weeks of back and forth with Oracle Support, we were told that by setting SGA_MAX_SIZE and MEMORY_MAX_TARGET we had enabled the Dynamic Intimate Shared Memory (DISM) feature of Oracle Solaris. DISM may not work on Solaris 10 x86-based systems and is not recommended there. So we needed to disable DISM by NOT setting SGA_MAX_SIZE, SGA_TARGET, MEMORY_TARGET and MEMORY_MAX_TARGET.

The steps we followed to disable DISM -

1. Set the individual SGA components (shared pool, buffer cache, large pool, etc.) explicitly

2. Create a PFILE from the SPFILE

3. Remove all four parameters (SGA_MAX_SIZE, SGA_TARGET, MEMORY_TARGET and MEMORY_MAX_TARGET) from the PFILE, and let Oracle compute SGA_MAX_SIZE by aggregating the individual SGA components

4. Start up using the edited PFILE

5. Create an SPFILE from the edited PFILE

6. Shut down and restart so that the SPFILE takes effect
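For reference, after step 3 the edited PFILE contained only explicit pools, roughly along these lines (the sizes are illustrative, not our actual values):

```
db_cache_size=24G
shared_pool_size=8G
large_pool_size=512M
java_pool_size=256M
pga_aggregate_target=8G
# no sga_max_size / sga_target / memory_target / memory_max_target
```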


Details of DISM can be found in the following doc, which Oracle Support referred us to.

http://www.oracle.com/technetwork/articles/systems-hardware-architecture/using-dynamic-intimate-memory-sparc-168402.pdf


***

Mar 22, 2014

Database query runs slow in second execution - why?

When a query returns results quickly on the first run but runs slow on the second or successive executions, the cause is probably the cardinality feedback the query picked up for those later executions. We hit this scenario recently.

The first execution returned in less than a second - 78 records. On successive executions, the same query took over 5 minutes to return the same result. This slow execution bogged down one of the application servers, causing a huge number of stuck threads and eventually grinding the server to a halt. I suspected this happened because of wrong cardinality feedback from the first execution of the query.

To become very sure about this, I did the following tests - 

SQL> alter session set statistics_level=all;
SQL> set autotrace traceonly;

SQL> SELECT l.id, l.login_name, l.first_name, l.last_name, l.title_id, l.title 
FROM table_x l, view_y pr 
WHERE l.id = pr.login_id
AND pr.pgm_id IN (68894) 
AND l.enroll = 0 
AND l.status = 1 
ORDER BY l.last_name, l.first_name, l.login_name, l.title;

Elapsed: 00:00:00.78

Statistics
----------------------------------------------------------
          8  recursive calls
          3  db block gets
      16074  consistent gets
        629  physical reads
          0  redo size
       5783  bytes sent via SQL*Net to client
        575  bytes received via SQL*Net from client
          7  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         79  rows processed

SQL> /

Elapsed: 00:05:49.29

Statistics
----------------------------------------------------------
       4069  recursive calls
          8  db block gets
      47259  consistent gets
     518490  physical reads
        272  redo size
       5783  bytes sent via SQL*Net to client
        575  bytes received via SQL*Net from client
          7  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         79  rows processed

Note
-----
   - cardinality feedback used for this statement


To get rid of the cardinality feedback for this test case, I added a hint to the query telling the optimizer not to use the feedback. FYI - cardinality feedback can also be turned off at the session level with ALTER SESSION SET "_OPTIMIZER_USE_FEEDBACK"=FALSE; (or instance-wide with ALTER SYSTEM).


SQL> SELECT /*+ OPT_PARAM('_OPTIMIZER_USE_FEEDBACK' 'FALSE') */ l.id, l.login_name, l.first_name, l.last_name, l.title_id, l.title 
FROM table_x l, view_y pr 
WHERE l.id = pr.login_id
AND pr.pgm_id IN (68894) 
AND l.enroll = 0 
AND l.status = 1 
ORDER BY l.last_name, l.first_name, l.login_name, l.title;


Elapsed: 00:00:00.65

Statistics
----------------------------------------------------------
          0  recursive calls
          3  db block gets
      16069  consistent gets
          0  physical reads
         64  redo size
       5783  bytes sent via SQL*Net to client
        575  bytes received via SQL*Net from client
          7  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         79  rows processed


A word about cardinality feedback

Cardinality feedback was introduced in Oracle 11g to help the optimizer do a better job. In short: when a query is executed repeatedly, Oracle feeds cardinality information back to the optimizer if the estimated and actual row counts differed in earlier executions. Anyway, we had a scenario related to it: complex SQLs that ran in 1 to 2 seconds on the first run suddenly started taking several minutes!

Why was cardinality feedback being used?

* The view has very complex underlying predicates - that could be one reason the optimizer cannot estimate the cardinality correctly at execution time. With such a complex view, when the intermediate execution steps return many rows, the cardinality estimates get even worse in our case.

* Stale statistics could be another reason for wrong cardinality estimates. But that was not our case - I checked, and there were no stale statistics on the tables used in the query.


So, for pr.pgm_id 68894, the first execution luckily ended up with the best execution plan. The Oracle optimizer is intelligent - it noticed a mismatch between the estimated and actual cardinality on the first run and fed that back into successive runs. The cardinality feedback drove the query to full table scans on several tables underlying the complex view "view_y" on the next run. For pr.pgm_id 68894, the query then had to process about 10 GB of data (details can be found in SQL Monitoring in OEM) once it went for full table scans. Since it had to process that much data, it could not use the regular buffer cache; instead it used available PGA memory and the temporary tablespace to hold and process it, which aggravated the slowdown with excessive physical reads.

Now we'll fix this in the application code - we don't want to turn the cardinality feedback feature off at the instance level for a single misbehaving query.
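Since the fix lives in the application code, the statement-level hint can simply be spliced into the query text before it is sent to the database. A rough Python sketch - the helper is mine, not our actual code, and only handles a query whose first keyword is an uppercase SELECT:

```python
def with_no_feedback(sql):
    """Prefix a SELECT with the hint that disables cardinality feedback
    for this statement only, leaving the feature on instance-wide."""
    hint = "/*+ OPT_PARAM('_OPTIMIZER_USE_FEEDBACK' 'FALSE') */"
    head, _, tail = sql.partition("SELECT")
    return f"{head}SELECT {hint}{tail}"

print(with_no_feedback("SELECT id FROM table_x"))
```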


***

Jan 31, 2014

Materialized view fast refresh slowness caused by unshared child cursors in flashback queries

I have not found any good writeup on the web that could lead me to a solution for this problem, so I found my own way to solve it. I thought I would write down some of it in case it helps others.

Let me try to put the long story in a very short way.

To summarize, this was the sequence of events. One of my friends hit this problem and asked me to take a look at the database. There was a materialized view that was supposed to refresh every few minutes.

- I found that the following query ran slow (30 seconds and more); it is a system-generated query related to the materialized view refresh.

update sys.sumpartlog$ s set s.timestamp = :1, s.scn = :2  
where  rowid in  (select rowid from sumpartlog$ AS OF SNAPSHOT(:3) s1  
where s1.bo# = :4   and s1.timestamp >= to_date('4000-01-01:00:00:00','YYYY-MM-DD:HH24:MI:SS'))

- From database reports such as AWR and ADDM, I found that most of the database time was being spent on "soft parse", and the only top query was the materialized view refresh, which was taking more than 50% of the database resources most of the time.

- To investigate the soft parse issue, I queried the V$SQL dictionary view and found more than 5000 child cursors for the parent cursor. I collected the sql_id of the query. Three other queries also had a huge number of child cursors.

- To investigate why the child cursors were not being shared, I queried the V$SQL_SHARED_CURSOR dictionary view. Out of the roughly 50 possible reasons for not sharing a child cursor, I found that the FLASHBACK_CURSOR attribute was set to 'Y', which means the cursor was not shared due to a flashback query.

- I knew that in flashback queries the cursors are not shared; that is how it works, and Oracle considers this the right behavior for flashback queries.

- Materialized view refresh uses flashback query techniques to look at old data during refresh operations.

- Then I looked at the Oracle Support documentation and found that some people report this as an Oracle problem; from some earlier release, Oracle started using this child cursor behavior, I believe. Oracle's workaround is to upgrade to 11.2.0.2.0, where child cursors have a maximum limit and are automatically discarded beyond it (Oracle Doc ID 1051346.1).

- I knew that in the latest Oracle release, the database maintains a list of child cursors and, once it crosses the threshold of 100, discards them.

- I knew that too many child cursors can slow down soft parsing, because the server must walk the shared pool hash chain to find the right cursor for the right plan.

- So I figured that if I could get rid of those excessive cursors, the query parse time would drop, and response time would improve.

- I then queried V$SQLAREA to find the address and hash value of the queries with 5000+ child cursors, and wrote a cron job to purge those parent cursors from the shared pool periodically - once an hour - using the Oracle dbms_shared_pool package.

DECLARE
  addhash VARCHAR2(50);
BEGIN
  -- address,hash_value uniquely identifies the parent cursor
  SELECT address || ',' || hash_value
    INTO addhash
    FROM v$sqlarea
   WHERE sql_id = 'chbgkk6myytz';
  dbms_shared_pool.purge(addhash, 'C');  -- 'C' purges a cursor
END;
/

- That solved this particular parsing problem for the Oracle internal queries used in refreshing the materialized view.
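The hourly cron entry driving the purge looked roughly like this (the script path and name are hypothetical; the script just runs the PL/SQL block above via sqlplus):

```
0 * * * * /u01/scripts/purge_parent_cursors.sh >> /var/tmp/purge_cursors.log 2>&1
```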