These are the things I suggest developers to follow in developing applications where Oracle database is being used. Some of these I learned from my experience as a developer, my own and others mistakes, and the rest, I learned as a dba!
1. Proper and Consistent Naming:
* Keywords in UPPER case
* Provide prefix for table name to group tables (for example according to functionality/module)
* Provide constraint names, index names
* Honor the company specific guidelines and best practices
2. Normalization of Database:
* Normalize up to 4th Normal Form
* I would say, first design ideally then you may denormalize according to needs
If you don't remember what you learned in university, here is a good link to remember by example -
http://support.microsoft.com/kb/100139
3. Primary/Unique key:
* Every table should have a primary and/or unique key
* Helpful for replication product
* Key data should not be visible
* Do not make a string field as primary key
4. Bind Variable:
* Use bind variables instead of literals. This is specially true for all OLTP system or a mixed system (OLTP+heavy reporting) like ours.
For example –
SELECT * FROM users WHERE id = 1009;
Instead, use this in jdbc template –
SELECT * FROM users WHERE id = ?;
And then pass your bind parameters.
5. Nested Query: Avoid nesting queries.
For example, I would avoid the following types of queries -
SELECT login_name FROM user WHERE prov_id IN
(SELECT id FROM agency WHERE id IN
(SELECT prov_id FROM message WHERE ....)
6. Data Segmentation:
* Add column by which you can group/segment data for big tables
For example – add agency id in user forms although it may not be needed for the application.
7. Bulk Operation in Batch:
* Frequent insert/update/deletes in batch.
8. Data Sorting:
* Avoid unnecessary ORDER BY
* Avoid unnecessary DISTINCT
9. Database Call:
* Do not execute queries inside a LOOP
* Do not open connections inside a LOOP
10. IN and EXISTS:
* Use IN if the sub-query has the selective WHERE clause. For example -
SELECT * FROM employee_details ed
WHERE ed.emp_id IN (SELECT id FROM employee WHERE ssn = ‘2990-030-998’) ;
* If opposite or the parent query contains selective WHERE, use the EXISTS. For example -
SELECT * FROM employee_details ed
WHERE EXISTS (SELECT e.id FROM employee e WHERE ed.emp_id = e.id
AND e.city=’Dallas’) ;
11. Data Access:
* Avoid frequent visit to table data – Caching could be a good solution
* select/update values at a single step for multiple columns. For example -
--Non recommended
UPDATE user SET first_name =’John’ WHERE id = 11;
UPDATE user SET last_name =’Doe’ WHERE id = 11;
UPDATE user SET ssn=’9900-8987-009’ WHERE id = 11;
--Recommended
UPDATE user SET first_name=’John’, last_name=’Doe’, ssn=’ 9900-8987-009’ WHERE id = 11;
12. JOIN:
* Use equi JOIN.
* Use LEFT JOIN where you really need to use, otherwise it could be misleading.
* Join in the order that returns least number of rows to the parent step.
* Perform filtering option early in the inner sql that makes a join. Now a days, Oracle optimizer tries to make the join efficiently, but may not do so efficiently all the time
13. CASE:
* Use CASE for multiple aggregates. For example -
--Non Recommended
SELECT a.name "Agency", PENDING.cnt "Pending Approval", APPROVED.cnt "Approved"
FROM agency a
LEFT JOIN
(SELECT p.id, p.name, g.status, COUNT(distinct g.form_id) cnt
FROM agency p
LEFT JOIN user_forms g ON (g.prov_id=p.id)
WHERE g.event_date >= :param1 and g.event_date -1 < :param2
GROUP BY p.id,p.name, g.status
ORDER BY p.name, g.status) PENDING ON (prov.id = PENDING.id)
LEFT JOIN
(SELECT p.id, p.name, g.status, COUNT(distinct g.form_id) cnt
FROM agency p
LEFT JOIN user_forms g ON (g.prov_id=p.id)
WHERE g.event_date >= :param1 and g.event_date -1 < :param2
GROUP BY p.id,p.name, g.status
ORDER BY p.name, g.status) APPROVED ON (prov.id = APPROVED.id)
WHERE a.status=1
ORDER BY a.name;
--Recommended
SELECT p.name "Agency", B.PENDING "Pending Approval", B.APPROVED "Approved"
FROM agency A
LEFT JOIN
(SELECT a.id
, COUNT(CASE when g.status=2 then g.form_id end) "PENDING"
, COUNT(CASE when g.status=3 then g.form_id end) "APPROVED"
FROM user_forms g, agency a
WHERE g.prov_id=a.id
AND g.event_date >= TO_DATE('01-may-2009', 'dd-mon-yyyy')
AND g.event_date -1 < TO_DATE('01-aug-2009', 'dd-mon-yyyy')
group by p.id) B ON A.id=B.id
ORDER BY A.name;
14. Use of Function:
* Be careful of using functions (SUBSTR, TO_CHAR, UPPER etc ) in WHERE/ORDER BY clause –will fail to use an existing index on that column.
* Create function based index for these cases if really needed.
For example -
WHERE UPPER(first_name) like 'A%' - this will not use an index on column "first_name" unless it is function based.
15. Indexing: In general –
* Index all predicates in WHERE clause
* Join columns
* Important FKs.
* For large table (and no heavy updates), for column having low cardinality data, use bitmap index.
For example – “status” (having a range of values 1 to 5) on “form_summary” table is a good candidate for bitmap indexing.
There should have tradeoffs if there are too many insert/updates/deletes. A dba could help more on index tuning.
16. Integer representation for group values:
* Use number instead of string for a small domain of values. This will be helpful for Bitmap indexing and filter operation.
For example - for different status, use corresponding number representation instead of string values.
Working =1, Submitted=2, Approved=3, Deleted=4
Now queries could be used like –
SELECT * FROM user_forms WHERE status > 1;
17. Keeping summary of child table:
* In parent table, keeping information about detail table could be a good trick to improve performance.
Table: USER_FORMS (FORM_ID, CONTENT)
Table: ARCHIVE_DATA (FORM_ID, CONTENT)
--Non Recommended
SELECT uf.form_id, uf.content,
(SELECT COUNT(*) FROM archive_data WHERE form_id = ‘AXB-IU78-Y6TRQ’) AS cnt
FROM user_forms uf
WHERE uf.form_id = ‘AXB-IU78-Y6TRQ’;
--Recommended
Table: USER_FORMS (FORM_ID, CONTENT, ARCHIVE_COUNT) // add an extra column which will keep archive count.
SELECT uf.form_id, uf.content, uf.archive_count
FROM user_forms uf
WHERE uf.form_id = ‘AXB-IU78-Y6TRQ’;
18. Avoid OR and IN clauses:
* Too many OR clauses in queries could be an indication of bad data model! Try to find out alternatives – now!
These are what I remember at the moment.
***
1 comment:
Your blog keeps getting better and better! Your older articles are not as good as newer ones you have a lot more creativity and originality now keep it up!
Post a Comment