Jun 30, 2008
Bulk Delete: Object analysis and High water mark adjustment in Oracle
Today I did some housekeeping on a summary table - I deleted approximately 1.5 million records!
So, what else needs to be done after a bulk delete like that? There are several things worth considering for performance:
* Reset the high water mark of the table
* Analyze the table to collect stats
* Analyze/Rebuild index
High Water Mark
The high water mark is the maximum space ever claimed/touched by an object (i.e., a table or index). All the blocks above the high water mark are available for other use.
When a table is created, at least one extent is allocated to it (an extent is a collection of contiguous blocks, and a block is the smallest unit of data storage). As rows are inserted, the high water mark is bumped up because more blocks are needed to store the newly inserted data. For example, say I have 20 million records in a table that claimed 1000 extents to store that data; the high water mark is then set at those 1000 extents. Now suppose I later delete 15 million records, leaving 5 million in the table. Obviously the remaining 5 million records won't need the whole 1000 extents claimed/occupied earlier, but Oracle does not free up the space - the high water mark stays at the 1000 extents. Why Oracle keeps the high water mark as it is - that's another story.
So, the high water mark is an indicator within a data segment: it demarcates the used and unused space, i.e., the space that has been formatted for user data.
The space below this mark is not released even if there is no user data in it. But there are ways to tell Oracle to reset the high water mark if we want to release unused space.
Segment shrink resets the high water mark for an object. It is an online operation, meaning the table stays open to database users and DML statements are allowed at the same time.
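Before deciding whether a shrink is worthwhile, you can check how much allocated space actually sits above the high water mark. A minimal sketch using the DBMS_SPACE package (the table name MY_TABLE is a placeholder; run it as the table owner):

```
SET SERVEROUTPUT ON
DECLARE
  l_total_blocks  NUMBER;
  l_total_bytes   NUMBER;
  l_unused_blocks NUMBER;
  l_unused_bytes  NUMBER;
  l_file_id       NUMBER;
  l_block_id      NUMBER;
  l_last_block    NUMBER;
BEGIN
  -- Report allocated vs. unused (above-HWM) blocks for one segment.
  DBMS_SPACE.UNUSED_SPACE(
    segment_owner             => USER,
    segment_name              => 'MY_TABLE',
    segment_type              => 'TABLE',
    total_blocks              => l_total_blocks,
    total_bytes               => l_total_bytes,
    unused_blocks             => l_unused_blocks,
    unused_bytes              => l_unused_bytes,
    last_used_extent_file_id  => l_file_id,
    last_used_extent_block_id => l_block_id,
    last_used_block           => l_last_block);
  DBMS_OUTPUT.PUT_LINE('Blocks allocated : ' || l_total_blocks);
  DBMS_OUTPUT.PUT_LINE('Blocks above HWM : ' || l_unused_blocks);
  DBMS_OUTPUT.PUT_LINE('Blocks below HWM : ' || (l_total_blocks - l_unused_blocks));
END;
/
```

Note that "blocks below HWM" still includes blocks emptied by deletes; only a shrink (or move/truncate) pushes the mark back down.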
So, I went ahead and did the bulk deletes for the summary table. Before the deletes, I checked the blocks currently occupied by the table and the index clustering factor.
SQL>SELECT 'FPS_HISTORY', COUNT( DISTINCT( SUBSTR( dbms_rowid.rowid_to_restricted(ROWID,1),1,8))) AS BLOCKS FROM FPS_HISTORY;
FPS_HISTORY BLOCKS
----------- -------------------------------
FPS_HISTORY 207798
SQL> select index_name,clustering_factor from user_indexes where table_name like 'FPS_HISTORY';
INDEX_NAME CLUSTERING_FACTOR
------------------------------ ---------------------------------------------------
FPS_ARC_DT_I 829036
FPS_HISTORY_ALL_I 1465327
After the deletes -
SQL> SELECT 'FPS_HISTORY', COUNT( DISTINCT( SUBSTR( dbms_rowid.rowid_to_restricted(ROWID,1),1,8))) AS BLOCKS
FROM FPS_HISTORY;
FPS_HISTORY BLOCKS
----------- -----------------------------
FPS_HISTORY 20177
The table now occupies fewer blocks because of the deletes. But the index clustering factor did not change, so I analyzed the indexes.
SQL> analyze index FPS_ARC_DT_I compute statistics;
SQL> analyze index FPS_HISTORY_ALL_I compute statistics;
SQL> select index_name,clustering_factor from user_indexes where table_name like 'FPS_HISTORY';
INDEX_NAME CLUSTERING_FACTOR
------------------------------ --------------------------------------------------
FPS_ARC_DT_I 80800
FPS_HISTORY_ALL_I 88830
After analyzing/rebuilding the indexes post-delete, the clustering factor dropped significantly, which should help query performance a lot.
Next, I shrank the table to reset the high water mark and release space. I issued the following two commands to do that:
SQL> ALTER TABLE fps_history ENABLE ROW MOVEMENT;
SQL> ALTER TABLE fps_history SHRINK SPACE;
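One note: ANALYZE ... COMPUTE STATISTICS works, but DBMS_STATS is Oracle's recommended way to gather optimizer statistics after a big delete and shrink. A sketch for the table above (cascade => TRUE also covers the indexes):

```
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => USER,            -- current schema
    tabname          => 'FPS_HISTORY',
    cascade          => TRUE,            -- gather index statistics too
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE);
END;
/
```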
Jun 24, 2008
Tablespace Block size and SQL performance
-----BEFORE, TABLE AND INDEXES ARE ON 16K BLOCK
SQL> DELETE FROM lb_worklist WHERE form_id= 'TL-TRCNY-68J3B9EH53' AND login_id= 77865;
1 row deleted.
Elapsed: 00:00:00.01
Execution Plan
------------------------------
Plan hash value: 492469788
------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------
| 0 | DELETE STATEMENT | | 1 | 63 | 4 (0)| 00:00:01 |
| 1 | DELETE | LB_WORKLIST | | | | |
|* 2 | INDEX RANGE SCAN| LB_WORK_FRM_LOGIN_I | 1 | 63 | 3 (0)| 00:00:01 |
------------------------------
Predicate Information (identified by operation id):
------------------------------
2 - access("FORM_ID"='TL-TRCNY
Statistics
------------------------------
0 recursive calls
8 db block gets
3 consistent gets
0 physical reads
1120 redo size
842 bytes sent via SQL*Net to client
779 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
9 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> SELECT * FROM lb_worklist WHERE login_id = 77865 and test = 0;
5618 rows selected.
Elapsed: 00:00:00.29
Execution Plan
------------------------------
Plan hash value: 3804693804
------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------
| 0 | SELECT STATEMENT | | 149 | 16241 | 104 (0)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| LB_WORKLIST | 149 | 16241 | 104 (0)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | LB_WORKLIST_LOGIN_TEST | 149 | | 3 (0)| 00:00:01 |
------------------------------
Predicate Information (identified by operation id):
------------------------------
2 - access("LOGIN_ID"=77865 AND "TEST"=0)
Statistics
------------------------------
0 recursive calls
0 db block gets
2203 consistent gets
0 physical reads
0 redo size
516047 bytes sent via SQL*Net to client
4583 bytes received via SQL*Net from client
376 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
5619 rows processed
---- AFTER, TABLE AND INDEXES ARE ON 2K BLOCK
SQL> DELETE FROM lb_worklist WHERE form_id= 'TL-TRCNY-68J394SX3W' AND login_id= 77865;
1 row deleted.
Elapsed: 00:00:00.00
Execution Plan
------------------------------
Plan hash value: 492469788
------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------
| 0 | DELETE STATEMENT | | 1 | 63 | 5 (0)| 00:00:01 |
| 1 | DELETE | LB_WORKLIST | | | | |
|* 2 | INDEX RANGE SCAN| LB_WORK_FRM_LOGIN_I | 1 | 63 | 4 (0)| 00:00:01 |
------------------------------
Predicate Information (identified by operation id):
------------------------------
2 - access("FORM_ID"='TL-TRCNY
Statistics
------------------------------
0 recursive calls
10 db block gets
4 consistent gets
0 physical reads
1108 redo size
842 bytes sent via SQL*Net to client
779 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
9 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> SELECT * FROM lb_worklist WHERE login_id = 77865 and test = 0;
5619 rows selected.
Elapsed: 00:00:00.29
Execution Plan
------------------------------
Plan hash value: 3804693804
------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------
| 0 | SELECT STATEMENT | | 149 | 16241 | 115 (0)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| LB_WORKLIST | 149 | 16241 | 115 (0)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | LB_WORKLIST_LOGIN_TEST | 149 | | 5 (0)| 00:00:01 |
------------------------------
Predicate Information (identified by operation id):
------------------------------
2 - access("LOGIN_ID"=77865 AND "TEST"=0)
Statistics
------------------------------
0 recursive calls
0 db block gets
2802 consistent gets
0 physical reads
0 redo size
516145 bytes sent via SQL*Net to client
4583 bytes received via SQL*Net from client
376 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
5619 rows processed
However, tables and indexes on a larger block size can increase the possibility of buffer cache latch contention. Whether a non-default block size pays off depends entirely on the business logic and application design.
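For reference, this is roughly how a table can be moved onto a non-default block size. A buffer cache for that block size must exist first; the tablespace name, datafile path, and sizes below are illustrative, not what we actually used:

```
-- A buffer cache for the non-default block size must be configured first.
ALTER SYSTEM SET db_2k_cache_size = 64M;

-- Create a tablespace with a 2K block size (path and sizes are made up).
CREATE TABLESPACE ts_2k
  DATAFILE '/u01/oradata/ts_2k_01.dbf' SIZE 500M
  BLOCKSIZE 2K;

-- Move the table and rebuild its indexes in the new tablespace.
-- A move changes rowids, so all indexes must be rebuilt afterward.
ALTER TABLE lb_worklist MOVE TABLESPACE ts_2k;
ALTER INDEX lb_worklist_login_test REBUILD TABLESPACE ts_2k;
```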
Jun 19, 2008
Database Replication for Business Continuation
I want to start with a definition of database replication: database replication is the creation and maintenance of multiple copies of the same database, with all copies kept synchronized, possibly in real time. That means that when changes occur in one database, they are replicated to the other database(s) instantly, or within some acceptable maximum delay.
Now the question comes: why do we need such a thing? You guessed it - database replication is part of a Business Continuity plan, a way to recover from a malicious attack or a disaster. Anyway, we also have this type of setup for our applications.
We actively maintain two sites - you may call them data centers - addressed here as the PRI and SEC sites. Each site contains equipment such as:
* Database Server
* Database Backup Server
* Web Servers
* Monitoring Server
* EMC Storage
* Load Balancer, etc.
Each site has this setup and can run the application independently. But what happens if one site becomes unusable due to a disaster or an outage? That's why we have the other site, to continue the business. Usually the use of multiple sites is not limited to disaster recovery; it also allows sharing load and dividing data between data centers.
Hence the need for a database replication product that keeps users' data in sync among the database servers at multiple sites, so that we can switch to the other site at any time.
For the last few months we have been looking for an alternative database replication mechanism that would suit us best. We have been running SharePlex for data replication for the last three years. In that time our system usage has grown, more than doubling in just the last year, and now we need more than what SharePlex currently provides. SharePlex handled our basic requirements well, but nowadays we are not comfortable with its performance and with some configuration issues. So we are exploring the available technologies and will most likely switch over to a better one.
We found two alternatives to be considered for us -
1. Oracle Streams
2. GoldenGate Software
Earlier, we ran lots of experiments with Oracle Streams for about 1.5 months. As you can guess, there were lots of findings; I might write about Oracle Streams another day. Today I am writing to tell a bit about my GoldenGate experience!
GoldenGate Software (GGS) is a popular product used for data replication, keeping several databases in sync. It supports a number of databases including Oracle, Teradata, DB2, Sybase, and SQL Server. It seems that GGS has been in the market for a long time, probably since before Oracle introduced Streams; they are now at version 9.5.
How does GGS work? In summary: GGS has an EXTRACT process which captures changes in the source database. The changes are kept in files called trails. The changes can be sent directly to the REPLICAT process on the target/destination database, which applies them there. GGS EXTRACT reads data from the Oracle online redo log files, so it is independent of the Oracle processes. I forgot to mention, GGS has several processes; MANAGER is one of the most important, as it manages everything else.
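To make that concrete, here is a minimal sketch of what GGS parameter files look like; the process names, credentials, trail path, and table names are all made up for illustration:

```
-- EXTRACT parameter file: capture changes from the source, write a trail.
EXTRACT exta
USERID ggs_admin, PASSWORD ggs_pwd
EXTTRAIL ./dirdat/aa
TABLE scott.fps_history;

-- REPLICAT parameter file: read the trail, apply changes on the target.
REPLICAT repa
USERID ggs_admin, PASSWORD ggs_pwd
ASSUMETARGETDEFS
MAP scott.fps_history, TARGET scott.fps_history;
```

The MANAGER process mentioned above is what starts and monitors these EXTRACT and REPLICAT processes.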
So, this is how database replication plays an important role in a system.