Thinking Database Performance: 2008

Sep 17, 2008

Bulk data manipulation and index rebulding

Sometimes we do need to manipulate large amount of data in database. This manipulation might include INSERT, UPDATE or DELETE operations. We need to analyze tables and indexes after such operation with large data. From my experience, I have seen, up-to-date statistics plays major role in SQL performance.

For example, we might be interested to analyze the table from where I just deleted 500K records and rebuild all it's corresponding indexes for current statistics.

The following scripts takes TABLE NAME as input and generates the necessary scripts for the said purposes.

INPUT:
LB_WORKLIST

OUTPUTS:
ANALYZE TABLE LB_WORKLIST COMPUTE STATISTICS;
ALTER INDEX LB_WORK_EXP_I REBUILD COMPUTE STATISTICS;
ALTER INDEX LB_WORKLIST_NOTIF REBUILD COMPUTE STATISTICS;
ALTER INDEX LB_WORK_FRM_LOGIN_I REBUILD COMPUTE STATISTICS;
ALTER INDEX LB_WORKLIST_LOGIN_TEST REBUILD COMPUTE STATISTICS;

/* scripts- table analyze and index rebuild */
SET SERVEROUT ON;

DECLARE
TYPE alalyze_cur IS REF CURSOR;
ind_cursor alalyze_cur;
TABLE_NAME VARCHAR2(30);
INDEX_NAME VARCHAR2(30);
stmt VARCHAR2(200);

BEGIN
TABLE_NAME:=UPPER('&TABLE_NAME');
stmt := 'SELECT index_name FROM USER_INDEXES WHERE INDEX_TYPE <> ''LOB'' AND TABLE_NAME = :j';
DBMS_OUTPUT.PUT_LINE('ANALYZE TABLE '||TABLE_NAME||' COMPUTE STATISTICS;');
OPEN ind_cursor FOR stmt USING table_name;

LOOP
FETCH ind_cursor INTO index_name;
EXIT WHEN ind_cursor%NOTFOUND;
DBMS_OUTPUT.PUT_LINE('ALTER INDEX '||index_name||' REBUILD COMPUTE STATISTICS;');
END LOOP;
END;
/

Sep 5, 2008

Google Chrome: Open source web browser by Google

It's good to have another browser from a company like Google which always promises to deliver simple but effective applications. Google Chrome is an open source - and off course free web browser developed by Google.

Google is saying that it is multi-threaded web browser and most importantly the browser will never get crashed! At a glance, I like the cool features "Recently Closed Tabs" and things like "Search in History" - others feel similar to Firefox 3.0.1. Let's see more about how it is different from other browsers!

Google knows the trick of presenting complex information in a simpler way and make it understandable just like a story. Few words about the Chrome from Google can be found from the link below. It a pleasure to read the Chrome book. Here it goes -

http://books.google.com/books?id=8UsqHohwwVYC&printsec=frontcover#PPP1,M1

The Chrome can be downloaded from this link -

http://www.google.com/chrome/?hl=en?hl=en

Aug 20, 2008

The 911 service and GPS location tracking

Let me describe the background in brief -

We have web applications. The users can fill up forms from anywhere in the world. Some service needs to be physically present to some physical location. How could we know whether the forms have been filled up from a desired physical location or how could we verify if users logged in to the system from office or home?

The solution we thought - is GPS!

I did some studies about some similar services those use GPS information. The 911 emergency service is one of them.

The 911 service in cell phone is an interesting one. There are basically two kinds of solutions available for the cell phone operators - Network assisted location tracking and GPS assisted location tracking.

For the first case, some operator like T-Mobile uses Triangulation method to locate subject when it is seen by 3 transmitting tower. At any given time, 3 towers calculates receiving signal based on time differences and/or angle differences and locates on relative readings.

Other operators, like Sprint supports E911 based on GPS information. A 911 call triggers the phone chip - the GPS coordinates are transmitted along with baseband signal to 911 operator and the nearest Public Safety Answering Point. This feature is built in with all phone made after 2004-05 - I think.

There are other services provided by operators. For example, Sprint offers the TeleNav service. In that case, tracking by private parties require the person whose phone is being tracked to explicitly give permission and monthly charge. And usually client software has to be installed on the phone to provide these kind of services.

For us, the desktop based solution - where a GPS device will be attached to users' computer, a program will be running inside web browser, having Java Runtime Environment installed in the machine, the web browser will collect GPS coordinates from pc's port and send these data along with application data to the servers.

The demo version worked successfully!

Some of the interesting articles I liked can be found here -

http://electronics.howstuffworks.com/gadgets/travel/gps.htm

Jul 21, 2008

Data pump export error ORA-39095 with FILESIZE and PARALLEL clauses

First of all, let me tell you one thing - if you are using "parallel" clause and "filesize" altogether and got the error, it's probably not your fault, rather it could be because of the bug I am going to tell about.

Oracle provides nice data pump features to move data around for large database. Taking the whole database to a single dump causes problem in handling big files and managing disk efficiently. With this good intention in mind, I tried to export our large data and objects in smaller chunks.

I issued the following command, which ended up with errors, later discovered that it triggered a bug -

expdp user/pass \
EXCLUDE=TABLE:"LIKE'ACTIVITY'" \
DUMPFILE=EXPDIR:prod-20080721_01%U.dmp,EXPDIR:prod-20080721_02%U.dmp \
FILESIZE=4G \
PARALLEL=2 \
JOB_NAME=PROD_20080721
LOGFILE=EXPDIR1: split-dump.log

This means - export the user schema without the table ACTIVITY, create several dump files with maximum size 4 GB (size would help me to move data around) and create as many dumpfiles as needed with the convention and perform the task in 2 threads.

It supposed to generate files like follows -

prod-20080721_0101.dmp
prod-20080721_0201.dmp
prod-20080721_0102.dmp
prod-20080721_0202.dmp
...

and so forth as needed, because the the substitution variable %U, which indicates that multiple files (01 to 99) would be generated using the specified filename as a template.

Well, I thought that I would go and grab a cup of coffee, and by that time the export job would be completed. But the reality was different, just after few minutes, I saw this error message on my console -

ERROR: ORA-39095: Dump file space has been exhausted: Unable to allocate 8192 bytes Job "USER"."PROD_20080721" stopped due to fatal error at 00:37:29

What I did?

Like others, I googled. Unfortunately, I was not satisfied with the documents I found at that time telling about the error. All the available documents were saying that this error only results when the specified dump file size is not enough to export all the data - which was not the case with me!

Again with a fresh mind, I googled to understand the unexpected problem – but did not find any satisfactory explanation. There are lots of documents saying the possible solutions for this error but did not find any case like my one. Interestingly noticed, in some blogs, people are interested to give solution without specifying the problem :-)

Anyways, this cause could be appropriate if I had not mentioned the substitution variable %U. My dump size was about 30 GB and as I mentioned 2 dump files with substitution variable %U, export process would have been able to generate 2*99 files with 4 GB file size each. So, the “ORA-39095: Dump file space has been exhausted” did not make sense to me in this case. FYI, there were around 80 GB free space left on the DUMPDIR location – more than enough for the task.

Now it's time for the master attempt. I searched in metalink.oracle.com to find any possible bug related to this. Bingo! It was there!!

Here are few lines from Metalink -

Symptoms

When trying to export using PARALLEL clause with one dumpfile, or a number of dumpfile less than the parallelism value.
The export may fail with the following error:
ORA-39095: Dump file space has been exhausted: Unable to allocate 8192 bytes

Cause

When using one dumpfile -or a number less than parallelism value-, several slave processes wait for the file locked by the other process to write. And so we are not benefiting from the parallelism anyway.Sometimes the slave process locking the file does not release the lock after finishing as it's supposed to release the lock when the dump process ends, and it'll not end cause the other processes are waiting to write to the file.

Possible Solutions

1- Use a number of dumpfiles equal to, or more than the parallelism value.
2- Avoid using PARALLEL clause.

So, this was an obvious bug I was facing. I concluded the workaround as follows-

* I have to mention many files – the number of files must be a bit more than the parallelism mentioned. For example, if I mention parallelism 2, then I should mention 4 files with the substitution variable %U

* When the problem encounters – I could add dump files and continue the job

* I should not use the parallelism clause

To accomplished the half done task, I went for the option 2. Data pump feature allows to restart a job and add additional stuffs in the middle of a job – a nice one! It involves few simple commands -

1. I queried database to find the job status -

SQL> select job_name, state from dba_datapump_jobs;
JOB_NAME STATE
------------------------------ ------------------------------
PROD_20080721 NOT RUNNING

2. I had to attach the job with the export process

# expdp user/pass attach=PROD_20080721

3. I added an additional file

Export> ADD_FILE=EXPDIR:prod-20080721_03%U.dmp
Export> CONTINUE_CLIENT

However, I would recommend to use the following simple version (without PARALLEL clause) of export to avoid troubles. This would work for database up to around 290 GB size -

expdp user/pass \
exclude=TABLE:\"like\'ACTIVITY\'\" \
DUMPFILE=EXPDIR1:prod-20080721_%U.dmp \
FILESIZE=3G \
JOB_NAME=EXPORT_0721 \
LOGFILE=EXPDIR1:splited-dump.log

That was the story of using oracle expdp with filesize and parallel clause! Please let me know if this helps.

Jun 30, 2008

Bulk Delete: Object analysis and High water mark adjustment in Oracle

We have some summary like tables where data are populated over time as users progress with using the system. The data needs to be cleaned up periodically.

Today I did that kind of stuffs to clean one summary table - I deleted approximately 1.5 million records!

So, what else need to do after that bulk deletes? Well, there are several useful things that need to considered for the performance.

* Reset the Hight water mark of the table
* Analyze the table to collect stats
* Analyze/Rebuild index

High Water Mark

The High water mark is the maximum space ever claimed/touched by an object (i, e. table or index ). Above the High water mark - all the blocks are available for others.

When a table is created, at least one extent (by the way, extent is a collection of contiguous blocks, and block is the smallest unit to store data) is allocated to the table. When rows are inserted into the table, the High water mark will be bumped up as more blocks will be needed to store newly inserted data. Now, for example, I have 20 million records in a table which claimed 1000 extents to store that data. So in that case, the high water mark is set up to that 1000 extents. Now assuming that I have deleted 15 million records from the table later on remaining 5 million in that table. Obviously the remaining 5 million records won't need that whole 1000 extents claimed/occupied earlier. Oracle does not fee up the space - so the high water mark will be up to the 1000 extents. Why Oracle keep the High water mark as it is - that's another story.

So, The High water is an indicator within a data segment. It demarcates the claimed used-unused space, or space that has been formatted for user data.

The space below this mark cannot be released even if there is no user data in it. But there are ways to tell Oracle to set the high water mark if we want to release unused space up to the High water mark.

Segment shrink reset High water mark for the object. It is an online operation, which means that the table will be open to the database user and DML statements are allowed at the same time.

So, I went ahead and did the bulk deletes for the summary tables. Before the deletes - I checked blocked occupied currently the index clustering factor

SQL>SELECT 'FPS_HISTORY', COUNT( DISTINCT( SUBSTR( dbms_rowid.rowid_to_restricted(ROWID,1),1,8))) AS BLOCKS FROM FPS_HISTORY;

FPS_HISTORY BLOCKS
----------- -------------------------------
FPS_HISTORY 207798

SQL> select index_name,clustering_factor from user_indexes where table_name like 'FPS_HISTORY';

INDEX_NAME CLUSTERING_FACTOR
------------------------------ ---------------------------------------------------
FPS_ARC_DT_I 829036
FPS_HISTORY_ALL_I 1465327

After the deletes -

SQL> SELECT 'FPS_HISTORY', COUNT( DISTINCT( SUBSTR( dbms_rowid.rowid_to_restricted(ROWID,1),1,8))) AS BLOCKS
FROM FPS_HISTORY;

FPS_HISTORY BLOCKS
----------- -----------------------------
FPS_HISTORY 20177

The block occupied by table data now is less as I deleted. But the index clustering factor did not change - so I analyzed indexes.

SQL> analyze index FPS_ARC_DT_I compute statistics;
SQL> analyze index FPS_HISTORY_ALL_I compute statistics;

SQL> select index_name,clustering_factor from user_indexes where table_name like 'FPS_HISTORY';

INDEX_NAME CLUSTERING_FACTOR
------------------------------ --------------------------------------------------
FPS_ARC_DT_I 80800
FPS_HISTORY_ALL_I 88830

After the deletes when I analyzed/rebuild indexes, the clustering factor reduced significantly! It would help a lot for query performance.

Now, I did shrink the table to set High water mark and release space. I issued the following 2 commands to do that -

SQL> ALTER TABLE fps_history ENABLE ROW MOVEMENT;
SQL> ALTER TABLE fps_history SHRINK SPACE;

Jun 24, 2008

Tablespace Block size and SQL performance

Tables and Indexes on large block tablespace requires less consistent get and block get operations and thus requires less CPU time. I did this experiments before 2 years. The below experiments is just for refreshing my memory :-)

-----BEFORE, TABLE AND INDEXES ARE ON 16K BLOCK

SQL> DELETE FROM lb_worklist WHERE form_id= 'TL-TRCNY-68J3B9EH53' AND login_id= 77865;

1 row deleted.

Elapsed: 00:00:00.01

Execution Plan
------------------------------

----------------------------
Plan hash value: 492469788

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 1 | 63 | 4 (0)| 00:00:01 |
| 1 | DELETE | LB_WORKLIST | | | | |
|* 2 | INDEX RANGE SCAN| LB_WORK_FRM_LOGIN_I | 1 | 63 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("FORM_ID"='TL-TRCNY-68J3B9EH53' AND "LOGIN_ID"=77865)

Statistics
----------------------------------------------------------
0 recursive calls
8 db block gets
3 consistent gets
0 physical reads
1120 redo size
842 bytes sent via SQL*Net to client
779 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
9 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> SELECT * FROM lb_worklist WHERE login_id = 77865 and test = 0;

5618 rows selected.

Elapsed: 00:00:00.29

Execution Plan
----------------------------------------------------------
Plan hash value: 3804693804

------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 149 | 16241 | 104 (0)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| LB_WORKLIST | 149 | 16241 | 104 (0)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | LB_WORKLIST_LOGIN_TEST | 149 | | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("LOGIN_ID"=77865 AND "TEST"=0)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2203 consistent gets
0 physical reads
0 redo size
516047 bytes sent via SQL*Net to client
4583 bytes received via SQL*Net from client
376 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
5619 rows processed

---- AFTER, TABLE AND INDEXES ARE ON 2K BLOCK

SQL> DELETE FROM lb_worklist WHERE form_id= 'TL-TRCNY-68J394SX3W' AND login_id= 77865;

1 row deleted.

Elapsed: 00:00:00.00

Execution Plan
------------------------------

----------------------------
Plan hash value: 492469788

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 1 | 63 | 5 (0)| 00:00:01 |
| 1 | DELETE | LB_WORKLIST | | | | |
|* 2 | INDEX RANGE SCAN| LB_WORK_FRM_LOGIN_I | 1 | 63 | 4 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("FORM_ID"='TL-TRCNY-68J394SX3W' AND "LOGIN_ID"=77865)

Statistics
----------------------------------------------------------
0 recursive calls
10 db block gets
4 consistent gets
0 physical reads
1108 redo size
842 bytes sent via SQL*Net to client
779 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
9 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> SELECT * FROM lb_worklist WHERE login_id = 77865 and test = 0;

5619 rows selected.

Elapsed: 00:00:00.29

Execution Plan
----------------------------------------------------------
Plan hash value: 3804693804

------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 149 | 16241 | 115 (0)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| LB_WORKLIST | 149 | 16241 | 115 (0)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | LB_WORKLIST_LOGIN_TEST | 149 | | 5 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("LOGIN_ID"=77865 AND "TEST"=0)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2802 consistent gets
0 physical reads
0 redo size
516145 bytes sent via SQL*Net to client
4583 bytes received via SQL*Net from client
376 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
5619 rows processed

However, Tables and Indexes on larger block size might increase the possibility of buffer cache latch contention. It completely depends on business logics and application design.

Jun 19, 2008

Database Replication for Business Continuation

I want to start by giving a definition of Database Replication - Database replication is the creation and maintenance of multiple copies of the same database - all database is synchronized possible in real time. That means, when changes occur in one database, it will be replicated to other database(s) instantly or some acceptable maximum time delay.

Now the question comes, why do we need such thing? You are thinking right, Database Replication is a part of Business Continuity plan - a way to recover from any malicious attack or recover from disaster. Anyways, we do also have this type of setup for our Applications.

We actively maintain two Sites - you may call data centers. For example, they are addressed as PRI and SEC Sites. Each Site contains some equipments like -

* Database Server

* Database Backup Server

* Web Servers

* Monitoring Server

* EMC Storage

* Load Balancer etc

Each Site has this type of setup and can independently runs application. But what happens if one Site become useless by any disaster or for any outage? That’s why we have another Site to continue the business. Usually the usages of multiple Sites is not only limited for recovering from disaster but also sharing loads and dividing data between data centers.

Now the necessity comes for a Database Replication product which will keep the users’ data in sync among multiple Sites’ Database servers so that we could switch to other site at any time.

From last few months, we are looking for an alternative database replication mechanism that would suit best for us. We have been running SharePlex for data replication purpose for the last 3 years. By this time, our system usages increased more than twice in just last year. Now we need more stuffs than what SharePlex is providing right now. SharePlex was good with our basic requirements. We are not comfortable with the performance and other configuration issues now a days. So, we are planning to explore new available technologies and most likely will switch over to a better one.

We found two alternatives to be considered for us -

1. Oracle Streams

2. GoldenGate Software

Earlier, we did lots of experiments with Oracle Streams for about 1.5 months. You can guess, there are lots of outcomes. I might tell about Oracle Streams another day. I am writing now to tell a bit about my GoldenGate experience!

GoldenGate Software (GGS) is a popular one used for data replication and keep several database in sync. They are supporting a number of databases including Oracle, Teradata, DB2, Sybase, SQL Server. It seems that GGS are in market for long time, probably before Streams introduced by Oracle, now they are running version 9.5.

How GGS works? The summary is - GGS has EXTRACT process which captures changes in source database. The changes are kept in files called TRAIL. The changes can be directly sent to the target/destination database’s REPLICAT process to apply the changes in destination database. GGS EXTRACT reads data from oracle online redo log file - so it’s independent of oracle process. I forgot to mention, GGS has several processes, MANAGER is one of the important process which managers all other stuffs.

So, this is how Database Replication plays important role in a system.

May 19, 2008

Understanding Database Character Set and Encoding Techniques

It looks like, there are thousands of pages documentation on oracle database character set, database encoding, NLS_LANG and Unicode stuffs. At least, in last two days - I realized it. For a long time, about 2 years probably, we have been thinking to change the character set of our databases. The need came in front when our users started writing long comments by copying from MS word.

What’s wrong with this?

Windows uses a different character set and encoding techniques than we do in our applications. We can not convert some special characters available in windows and other character sets. For example, if users write a note using single quote, double quote, currency symbol and things like these, we can not convert those things and store it properly as it is in the database for our current encoding techniques. And next time when users are provided this data - they see some strange symbol/garbage instead.

Stop! If you are thinking what this encoding and character sets are - I would highly recommend you to read this nice article http://www.joelonsoftware.com/articles/Unicode.html

Anyway, currently we are using West European character set in the database. We need to change it to Unicode in order to support special characters from users’ input.There are ways to change character set of database - it’s not a very hard job for a DBA. But there are catches. What will happen for those special characters/symbols those already inserted?

To do the changes in database character set, oracle has some procedures like scanning the current data (with the Oracle CSSCAN tool) to make sure that the data is stored properly in the current character set. When we did this step to check the compatibilities, we had been reported that the we would be loosing some characters during the character set conversion in database. Oracle calls this - Loosy Conversion!!

I was trying to understand the actual behavior of these character encoding/conversion. For this, I went through hundreds of pages to understand the encoding mechanisms and the explanations appropriate for our applications. The findings below -

* ASCII character set are defined in the decimal range from 0 - 127
* WE8ISO8859P1 character set which we are using as database character set, the special characters are defined in the decimal range from 160 to 255
* WE8MSWIN1252 character set are being used by windows users, defines the special characters in the decimal range 128 to 255
* As we do not set any NLS_LANG parameter in jdbc template, it is default UTF (UTF-8/UTF-16, depends on version)
* As we do not set NLS_LANG environment variable, it is default “AMERICAN_AMERICA.US7ASCII” (LANGUAGE_TERRITORY.CHARSET). However, it is only applicable if we interact with sqlplus or when other client’s NLS is not set.

** Case-1 **

Now, for example, user is inserting the copyright character (decimal value 169) from ms word. There will be a series of conversions like -

It will first converted from windows character set WE8MSWIN1252 to UTF8, and finally when it will be inserted to database using jdbc, will be converted to database character set WE8ISO8859P1.
I simulated this behavior setting NLS_LANG to “AMERICAN_AMERICA.UTF8″ (because this is the default settings for jdbc)

SQL> select dump(convert(convert(’©’,'UTF8′,’WE8MSWIN1252′),’WE8ISO8859P1′,’UTF8′)) from dual;

DUMP(CONVERT(CONVERT(’©’,'UTF8′,’WE8MSWIN1252′),
————————————————
Typ=1 Len=1: 169

The value 169 which is the decimal code for copyright symbol will be inserted into database as it is. When selecting the character from database, the original character is retrieved in the following fashion.

SQL> select convert(convert(convert(’©’,'UTF8′,’WE8MSWIN1252′),’WE8ISO8859P1′,’UTF8′),’WE8MSWIN1252′,’WE8ISO8859P1′) from dual;
CON
—
©

** Case-2 **

If we do the same simulation for Euro Symbol which is represented decimal value 128 in windows character set, the actual inserted value in database will be decimal value 191 instead of 128. This is because, the database character set does not have any representation for value 128 (WE8ISO8859P1 only understands special char value in the range 160-255) and for this it replace this value with a replacement character - inverted question mark, decimal value 191.

SQL> select dump(convert(convert(’€’,'UTF8′,’WE8MSWIN1252′),’WE8ISO8859P1′,’UTF8′)) from dual;

DUMP(CONVERT(CONVERT(’¿’,'UTF8′,’WE8MSWIN1252′),
————————————————
Typ=1 Len=1: 191

Eventually when we retrieve this value from database, we see the inverted question mark from UI.

SQL> select convert(convert(convert(’€’,'UTF8′,’WE8MSWIN1252′), ’WE8ISO8859P1′,’UTF8′),’WE8MSWIN1252′,’WE8ISO8859P1′) from dual;

CON
—
¿

So, my understanding is, the special characters within the range 128 to 159, inserted from windows clients are not recognizable by the database character set WE8ISO8859P1 and thus replaced by replacement character (inverted question mark) in database. And these characters are NOT recoverable also. That’s why, we would have loosy conversion from character set WE8ISO8859P1 to AL32UTF8.

May 11, 2008

Database Replication with GoldenGate software

For a number of reasons - people replicate data to remote destinations. We do have our own needs. We have been running SharePlex for data replication purpose for the last 3 years. By this time, our system usages increased more than twice in just last year. Initially SharePlex was good for us but now a days we need more flexible configurations and different replication architecture for the system.

We are looking for an alternative database replication mechanism that would suit best for us. We are exploring new technologies available to serve replication purpose.

Now, we found two alternatives to be considered as we need more control for an "Active-Active" mode -

1. Oracle Streams
2. GoldenGate Software

I did several extensive experiments with Oracle Streams for about 1.5 months. You can guess, there are lots of outcomes. This post is for telling a bit about my GoldenGate experience!

GoldenGate Software (GGS) is a popular one used for data replication and keep several database in sync. They are supporting a number of database including Oracle, Teradata, DB2, Sybase, SQL Server. It seems that GGS are in market for long time, probably before Streams introduced by Oracle, now they are running version 9.5.

The basic concepts of GGS are - GGS has EXTRACT process which captures changes in source database. The changes are kept in files called TRAIL. The changes can be directly sent to the target/destination database's REPLICAT process to apply the changes in destination database. GGS EXTRACT reads data from oracle online redo log file - so it's independent of oracle process. I forgot to mention, GGS has several processes, MANAGER is one of the important process which managers all other stuffs.

My first "one way" replication test configuration (obvious these are not all for production db) with GGS were as follows -

Source Neptune: Create EXTRACT which will capture changes and send the changes to remote host's TRAIL files

EDIT PARAMS extprod

EXTRACT extprod
USERID ggs, PASSWORD ggs
RMTHOST saturn, MGRPORT 7820
RMTTRAIL /export/home/oracle/ggs/dirdat/rt
TABLE DEV.LOGIN;

ADD EXTRACT extprod, TRANLOG, BEGIN NOW

**
ADD RMTTRAIL /export/home/oracle/ggs/dirdat/rt, EXTRACT extprod, MEGABYTES 50

Destination Saturn: Create REPLICAT which will read changes from TRAIL files and apply the changes in current database

EDIT PARAMS repprod

REPLICAT repprod
ASSUMETARGETDEFS
DISCARDFILE ./dirrpt/REPSOE_DSC.rpt, PURGE
USERID ggs, PASSWORD ggs
MAP DEV.*, TARGET DEV.*;

**
ADD REPLICAT repprod, EXTTRAIL /export/home/oracle/ggs/dirdat/rt, NODBCHECKPOINT

So, these were the basic stuffs that I started with GoldenGate Software (GGS). They have thousand page documents and there are a lot more to test. Earlier, we created spreadsheet putting about 30 evaluation criteria against which we compared SharePlex and Oracle Streams. After experimenting with GGS, I will put the outcomes for GGS and would take the ultimate product decision.

Expecting comments if you have any good/bad experience with the product!

May 4, 2008

Oracle: Index auto dropped while importing. Where my index gone?

One day, someone was telling me that he had been lost in blue, one index was vanished!

We have multiple production servers. So, taking a dump from one server and import it to another is quite common for initial synchronization. Like this, one day, one of my colleague found that schema on p01 database had one less index than s01 database. One index was not created in p01 while imported table "TITLE". I investigated and found that it happened for an interesting feature of oracle!

The "TITLE" table has primary key ID. You know, when primary key is created, one system generated index is created with the primary key column to enforce the constraint.

During the release 7.2, we created a composite index named "TITLE_ID_TITLE" on table "TITLE" in database server s01 using two columns including the primary key column "ID" and another column, "TITLE" (TITLE table has a column named TITLE).

Now, when oracle was importing indexes for table TITLE on p01, it found that the composite index "TITLE_ID_TITLE" is enough to enforce the primary key - so it automatically discarded the system generated index which was created with the primary key.

That's the ghost story behind this!

May 2, 2008

SpringSource launches new Application Platform

SpringSource, the company behind Spring framework, launches new Application Platform without Java EE standerds! The Application Platform has been built on Spring, OSGi, and Apache Tomcat. The new appserver departs from the Java2 EE standards, exposing the Spring programming model natively, along with a new deployment and packaging system.

I was reading the overview of it and was amazed by reading the following development and production benefits. It is our dream to modularize the big applications that we are developing and do the updates in production server in real time without shutting down the system for long time.

I am just pasting the benefits SpringSource are talking about -

Production Benefits

Real time application and server updates
Better resource utilization
Side by side resource versioning
Longer server uptime
Run new and old web application

Development Benefits

Faster iterative development
Small server footprint
More manageable applications
Greater code reuse
Resolve transitive dependencies

Nice blogs have been written by the development team about the architecture.

http://blog.springsource.com/main/2008/04/30/introducing-the-springsource-application-platform/

The overall introduction is given on the official site of SpringSource -

http://www.springsource.com/web/guest/products/suite/applicationplatform

I have the strong desire to explore more on this!

Apr 13, 2008

Handling BIG Database Table "UserActivity"

Let's start with the stats -

We have over 70 million records in user_activity table now. As the name implies, user_activity keeps record of users' actions in applications.

The growths for year 2007 were like (in million) -

1st quarter: 6
2nd quarter: 8.5
3rd quarter: 10.5
4th quarter: 12

These huge number of records were giving a lot more pain than expected. SharePlex tool can not sync this big table between databases. This table is occupying 1/3 size of the whole database. Handling dump requires extra time and space. We could go for oracle partitioning but it will not serve our purpose here.

I thought to handle it by ourselves. I decided to to split user_activity table into a number of tables according to date range. For example, all 2004 data would reside in one table, 2005 data would go to another table, the same for 2006 data as well. I am keeping these yearly data together as we had less amount of data during those years. Then I split data quarterly from 2007 and onwards. Again, this is not fixed, in 2009 when we would have 100K system users, we might need to split monthly :)

Now, user_activity table is divided into a number of tables like -

UA_200401_200412
UA_200501_200512
UA_200601_200612
UA_200701_200703
UA_200704_200706
UA_200707_200709
UA_200710_200712
USER_ACTIVITY

After each quarter of 2008, at some time, I will run the dynamic partitioning script which will split user_activity again and will create a new table. For example, the first quarter table will be UA_200801_200803.

We have programming challenges here. What if when user select a date range where we need to fetch data from multiple tables and how do we formulate a clean logic for that?

Okay, we need some generic mechanism so that the code remains clean and scalable. I created a table USER_ACTIVITY_SPLIT_INFO which provides the necessary table(s) information from where user activity data will be looked up after breaking the original table. This table contains information like follows -

TABLE_NAME	BEGIN_DATE	END_DATE
UA_200401_200412	25-OCT-03 02.23.22.000000 PM	31-DEC-04 03.50.56.000000 PM
UA_200501_200512	01-JAN-05 05.09.38.000000 AM	31-DEC-05 11.58.52.000000 PM
UA_200601_200612	01-JAN-06 12.01.23.000000 AM	31-DEC-06 11.59.55.956000 PM
UA_200701_200703	01-JAN-07 12.00.27.530000 AM	31-MAR-07 11.59.59.344000 PM
UA_200704_200706	01-APR-07 12.00.00.786000 AM	30-JUN-07 11.59.58.917000 PM
UA_200707_200709	01-JUL-07 12.00.00.055000 AM	30-SEP-07 11.59.57.929000 PM
UA_200710_200712	01-OCT-07 12.00.00.811000 AM	31-DEC-07 11.59.57.893000 PM
USER_ACTIVITY	01-JAN-08 12.00.02.140000 AM

Now, for example, user is seeking data for a date range 11-01-2007 to 04-01-2008. The following query will return the name of table(s) from where data need to be fetched.

SELECT x.table_name
FROM user_activity_split_info x
WHERE x.begin_date BETWEEN TO_DATE('11-01-2007','mm-dd-yyyy') AND TO_DATE('04-01-2008','mm-dd-yyyy')
OR x.end_date BETWEEN TO_DATE('11-01-2007','mm-dd-yyyy') AND TO_DATE('04-01-2008','mm-dd-yyyy');

Output:

TABLE_NAME

UA_200710_200712

USER_ACTIVITY

So it just returned two table names where the data would be found.

In our application code, a method will return the name of table(s) based on the user's date input and we would fetch data directly from those tables and add with combine results together. The programmer does not have to know how many tables to deal with!

One more thing, we don't have to worry about the sequence number which would cross limit after certain years. Now we can reset user_activity sequence at any point just with a single command.

I also came up with another approach which would provided more simpler programming - almost noting to change in code but that approach requires user_activity data duplication by overlapping each three months data! This is not feasible for a table like user_activity - but it was an option.

Let's see how things work!

Apr 9, 2008

Data transfers through SQL*Net

We need to check how much data is going back and forth from database servers to take some system engineering decisions. In order to know what is the flow rate for current sessions, I issued the following SQL -

/** Current Sessions **/

SELECT n.name,ss.username, ROUND((SUM(st.value))/1024/1024) MB
FROM v$sesstat st,v$session ss ,v$statname n
WHERE ss.SID=st.SID
AND n.statistic# = st.statistic#
AND n.name like '%SQL*Net%'
HAVING ROUND((SUM(st.value))/1024/1024) > 0
GROUP BY N.NAME,ss.username
ORDER BY ss.username, n.name;

NAME	USERNAME	MB
---------------------------	--------------	-----------
SQL*Net roundtrips to/from client	DBSNM	1
bytes received via SQL*Net from client	DBSNM	182
bytes sent via SQL*Net to client	DBSNM	154
bytes received via SQL*Net from client	PROD	68
bytes sent via SQL*Net to client	PROD	141

Well, how do I know the historical data for the instance? The answer could be like follows -

/** Since Instance Startup **/

SELECT NAME,ROUND(VALUE/1024/1024) MB
FROM v$sysstat
WHERE NAME like '%SQL*Net%'
AND ROUND(VALUE/1024/1024) > 0
ORDER BY name;

NAME	MB
-------------------------------------	-------------------
SQL*Net roundtrips to/from client	380
bytes received via SQL*Net from client	46331
bytes sent via SQL*Net to client	92414

There are other ways to do that from the trace files. If I enable trace level by adding the following lines in my listener.ora and reload it, then I would find my desired stuffs in the trace file.

-- listener.ora

TRACE_FILE_LISTENER = netstat-info-20080409.trc
TRACE_DIRECTORY_LISTENER = /export/home/oracle
TRACE_LEVEL_LISTENER =SUPPORT

From command line as oracle user -
# lsnrctl reload

Now it's time to check the logs after certain period -

# trcasst -s /export/home/oracle/netstat-info-20080409.trc

The trcasst just came up with the nice formatted output - cool !

*************************************************************************
* Trace Assistant *
*************************************************************************

----------------------
Trace File Statistics:
----------------------
Start Timestamp : 09-APR-2008 13:03:10:531
End Timestamp : 09-APR-2008 19:54:36:028
Total number of Sessions: 1172

DATABASE:
Operation Count: 0 OPENS, 0 PARSES, 0 EXECUTES, 0 FETCHES

ORACLE NET SERVICES:
Total Calls : 1593 sent, 1427 received, 0 oci
Total Bytes : 292573 sent, 255785 received
Average Bytes: 183 sent per packet, 179 received per packet
Maximum Bytes: 2011 sent, 2034 received

Grand Total Packets: 1593 sent, 1427 received

*************************************************************************
* Trace Assistant has completed *
*************************************************************************

Apr 3, 2008

Stop user from accidental database damage!

The guys who play with database can not help themselves doing crazy stuffs always. Sometimes we logged in multiple production servers and also test servers simultaneously. These sessions are opened in multiple console tabs. It might be happened, we want to run some operations on test servers but accidentally run those on production boxes. Just imagine, I want to run TRUNCATE TABLE operation on test box but I did it on prod boxes actually - opss! what a disaster!

To avoid this types of blundering actions, I have introduces the following trigger to prevent any kind of unconscious DDLs in production boxes. The trigger will be stored on proper schema always and if any DDL issueed in the presence of the trigger, it will simply slap us :P

CREATE OR REPLACE TRIGGER ddl_restrict_trigger
BEFORE ALTER OR CREATE OR DROP OR TRUNCATE
ON SCHEMA
BEGIN
RAISE_APPLICATION_ERROR(-20001,' You are not authorized to perform DDL. Please contact DBA team.');
END;
/

Read-only database user

Sometimes the application developers need to debug data from the production database. It demands to give access to the live schema. How can I trust a developer who accidentally won't modify stuffs in database and blander everything!

I need a read-only user on which developers only would be able to issue "select" statements. For now, it will serve my purpose.

What I did, I created a user, created role with select any table privileges and assigned that user with the role. Now the developers are restricted to issue other than select commands - cool!

There are two ways of doing this. Here are the scripts for that -

** option-1 **

> sqlplus system/*******
> create user devels identified by xyz;

> create role SELECT_ANY_TABLE;
> grant select any table to SELECT_ANY_TABLE;

> grant connect to devels;
> grant SELECT_ANY_TABLE to devels;

** option-2 **

> sqlplus system/******
> create user devels identified by xyz;
> sqlplus prod/***********

Now have to run the following piece on PL/SQL

set serveroutput on
DECLARE
sql_txt VARCHAR2(300);
CURSOR tables_cur IS
SELECT x.table_name FROM user_tables x;
BEGIN
dbms_output.enable(10000000);
FOR tables IN tables_cur LOOP
sql_txt:='GRANT SELECT ON '||tables.table_name||' TO devels';
execute immediate sql_txt;
END LOOP;
END;
/

The second option has a drawback - each time we add new tables in schema, we have to run it.

Jan 31, 2008

Latch Contention in Oracle

What is Latch?

Well, the hardware definition of latch is - a window or door lock. The electronic definition of latch is - an electronic circuit used to store information.

Then what is Latch in Oracle? Very uncommon as the problem is super sophisticated.

Latch is one kind of very quick (could be acquired and released in nanoseconds) lock or serialization mechanism (makes more sense) to protect Oracle’s shared memory in SGA. Basically latch protects the same area of SGA being updated by more than one process.

Now question comes, what are protected and why?

Each Oracle operation needs to read and update SGA. For example –

When a query reads a block from disk, it will modify a free block in buffer cache and adjust the buffer cache LRU chain
When a new SQL statement is parsed, it will be added to the library cache within SGA
When DML issued and modifications are made in blocks, changes are placed in redo buffer
Database writer periodically (after commit, after SCN change or after each 3 sec) writes buffers from memory to disk and updates their status from dirty to clean.
Redo log writer writes blocks from redo buffer to redo logs.

Latch prevents any of these operations from colliding and possibly corrupting the SGA.

If the specific latch is already in use by another process, oracle will retry very frequently with a cumulative delay up to certain times (controlled by hidden parameter) called spin count. First time one process fails to acquire the latch, it will attempt to awaken after 10 milliseconds up to its spin count. Subsequent waits will increase in duration - might be seconds in extreme cases. This affects response time and throughput.

Most common type latch is Cache Buffer Latch and Cache buffer LRU chain latch which are caused for highly accessing blocks, called hot blocks. Contention on these latches is typically caused by concurrent access to a very hot block. The most common type of such hot block is index root or block branch.

Unfortunately we are experiencing Cache Buffer Latch contention in database now a days.

Identifying HOT BLocks

Possible hot blocks in the buffer cache normally can be identified by a high or rapid increasing wait count on the CACHE BUFFERS CHAINS latch.
This latch is acquired when searching for data blocks cached in the buffer cache. Since the Buffer cache is implemented as a sum of chains of blocks, each of those chains is protected by a child of this latch when needs to be scanned. Contention in this latch can be caused by very heavy access to a single block. This can require the application to be reviewed.

By examining the waits on this latch, information about the segment and the specific block can be obtained using the following queries.

First determine which latch id (ADDR) are interesting by examining the number of sleeps for this latch. The higher the sleep count, the more interesting the latch id (ADDR) is:

select * from (
select CHILD#  "cCHILD", ADDR    "sADDR", 
    GETS    "sGETS" , MISSES  "sMISSES",
    SLEEPS  "sSLEEPS"
from v$latch_children

where name = 'cache buffers chains'

order by 5 desc, 4, 1, 2, 3
) where rownum <=20;

Have to run the above query a few times to to establish the id (ADDR) that has the most consistent amount of sleeps. Once the id (ADDR) with the highest sleep count is found then this latch address can be used to get more details about the blocks currently in the buffer cache protected by this latch. The query below should be run just after determining the ADDR with the highest sleep count.

column segment_name format a35
select /*+ RULE */
e.owner ||'.'|| e.segment_name segment_name,
e.extent_id extent#,
x.dbablk - e.block_id + 1 block#,
x.tch,
l.child#
from
sys.v$latch_children l,
sys.x$bh x,
sys.dba_extents e
where
x.hladdr = 'ADDR' and
e.file_id = x.file# and
x.hladdr = l.addr and
x.dbablk between e.block_id and e.block_id + e.blocks -1
order by x.tch desc ;

Depending on the TCH column (The number of times the block is hit by a SQL statement), you can identify a hotblock. The higher the value of the TCH column, the more frequent the block is accessed by SQL statements.

The solution of the problem will vary depending upon the problem and application architecture.
Some generic approach might be -

1. Creation of Reverse Index of those hot blocks
2. Distribute hot blocks table on different size cache segment
3. Ensure the usages of index to reduce full scan