Jun 30, 2008

Bulk Delete: Object analysis and High water mark adjustment in Oracle

We have some summary like tables where data are populated over time as users progress with using the system. The data needs to be cleaned up periodically.

Today I did that kind of stuffs to clean one summary table - I deleted approximately 1.5 million records!

So, what else need to do after that bulk deletes? Well, there are several useful things that need to considered for the performance.

* Reset the Hight water mark of the table
* Analyze the table to collect stats
* Analyze/Rebuild index

High Water Mark

The High water mark is the maximum space ever claimed/touched by an object (i, e. table or index ). Above the High water mark - all the blocks are available for others.

When a table is created, at least one extent (by the way, extent is a collection of contiguous blocks, and block is the smallest unit to store data) is allocated to the table. When rows are inserted into the table, the High water mark will be bumped up as more blocks will be needed to store newly inserted data. Now, for example, I have 20 million records in a table which claimed 1000 extents to store that data. So in that case, the high water mark is set up to that 1000 extents. Now assuming that I have deleted 15 million records from the table later on remaining 5 million in that table. Obviously the remaining 5 million records won't need that whole 1000 extents claimed/occupied earlier. Oracle does not fee up the space - so the high water mark will be up to the 1000 extents. Why Oracle keep the High water mark as it is - that's another story.

So, The High water is an indicator within a data segment. It demarcates the claimed used-unused space, or space that has been formatted for user data.

The space below this mark cannot be released even if there is no user data in it. But there are ways to tell Oracle to set the high water mark if we want to release unused space up to the High water mark.

Segment shrink reset High water mark for the object. It is an online operation, which means that the table will be open to the database user and DML statements are allowed at the same time.

So, I went ahead and did the bulk deletes for the summary tables. Before the deletes - I checked blocked occupied currently the index clustering factor

SQL>SELECT 'FPS_HISTORY', COUNT( DISTINCT( SUBSTR( dbms_rowid.rowid_to_restricted(ROWID,1),1,8))) AS BLOCKS FROM FPS_HISTORY;

FPS_HISTORY BLOCKS
----------- -------------------------------
FPS_HISTORY 207798


SQL> select index_name,clustering_factor from user_indexes where table_name like 'FPS_HISTORY';

INDEX_NAME CLUSTERING_FACTOR
------------------------------ ---------------------------------------------------
FPS_ARC_DT_I 829036
FPS_HISTORY_ALL_I 1465327


After the deletes -

SQL> SELECT 'FPS_HISTORY', COUNT( DISTINCT( SUBSTR( dbms_rowid.rowid_to_restricted(ROWID,1),1,8))) AS BLOCKS
FROM FPS_HISTORY;


FPS_HISTORY BLOCKS
----------- -----------------------------
FPS_HISTORY 20177

The block occupied by table data now is less as I deleted. But the index clustering factor did not change - so I analyzed indexes.

SQL> analyze index FPS_ARC_DT_I compute statistics;
SQL> analyze index FPS_HISTORY_ALL_I compute statistics;

SQL> select index_name,clustering_factor from user_indexes where table_name like 'FPS_HISTORY';

INDEX_NAME CLUSTERING_FACTOR
------------------------------ --------------------------------------------------
FPS_ARC_DT_I 80800
FPS_HISTORY_ALL_I 88830

After the deletes when I analyzed/rebuild indexes, the clustering factor reduced significantly! It would help a lot for query performance.

Now, I did shrink the table to set High water mark and release space. I issued the following 2 commands to do that -

SQL> ALTER TABLE fps_history ENABLE ROW MOVEMENT;
SQL> ALTER TABLE fps_history SHRINK SPACE;

No comments: