Database Health Checks in Oracle – Practical DBA Guide
Database health checks are a critical DBA activity, especially when application users report performance issues, slowness, or errors. This blog provides a step-by-step, real-world checklist that DBAs can follow to quickly assess the overall health of an Oracle database and identify potential problem areas.
When Should You Perform a Health Check?
Application slowness or hanging
User complaints about timeouts or errors
Sudden increase in database load
Post-maintenance or patching validation
Proactive daily/weekly monitoring
1. Check the Database Details
Understanding the basic database status is the first step.
SET pages 9999 lines 300
COL open_mode FOR a10
COL host_name FOR a30
SELECT name AS db_name,
host_name,
database_role,
open_mode,
version AS db_version,
logins,
TO_CHAR(startup_time,'DD-MON-YYYY HH24:MI:SS') AS db_up_time
FROM v$database, gv$instance;
For RAC Databases
SET pages 9999 lines 300
COL open_mode FOR a10
COL host_name FOR a30
SELECT inst_id,
instance_name,
name AS db_name,
host_name,
database_role,
open_mode,
version AS db_version,
logins,
TO_CHAR(startup_time,'DD-MON-YYYY HH24:MI:SS') AS db_up_time
FROM v$database, gv$instance;
2. Monitor Resource Consumption
Check whether database resources are nearing configured limits.
SELECT *
FROM v$resource_limit
WHERE resource_name IN ('processes','sessions');
Note:
V$SESSIONshows current sessions, whereasV$RESOURCE_LIMITshows current and maximum usage, which is more useful during incidents.
3. Check the Alert Log
The alert log records critical database events such as ORA errors, crashes, and space issues.
Locate Alert Log
locate alert_<ORACLE_SID>
OR
find / -name 'alert_*.log' 2> /dev/null
Review Alert Log
vi <alert_log_path>
Inside vi:
Shift + G→ go to end of file?ORA-→ search for Oracle errorsn / N→ navigate search results
Alert Log Location (11g and Above)
SELECT name, value
FROM v$diag_info
WHERE name = 'Diag Trace';
4. Check Listener Log
Listener issues can cause connectivity problems.
Locate Listener Log
locate listener.log
OR
find / -name 'listener.log' 2> /dev/null
Review Listener Log
vi listener.log
Search for:
TNS-errorserror
OR
lsnrctl status
From the output, note the Listener Log File location.
5. Check Filesystem Space Usage
Disk space issues are one of the most common causes of outages.
df -h # Linux / UNIX
df -g # AIX
Ensure adequate free space for:
Datafiles
Redo logs
Archive logs
Oracle Home
6. Generate AWR Report
AWR reports help identify performance bottlenecks.
@?/rdbms/admin/awrrpt.sql
For RAC
@?/rdbms/admin/awrrpti.sql
Compare Two Periods (Optional)
@?/rdbms/admin/awrddrpt.sql
7. Generate ADDM Report
ADDM analyzes AWR data and provides findings and recommendations.
@?/rdbms/admin/addmrpt.sql
For RAC
@?/rdbms/admin/addmrpti.sql
8. Find Locks, Blockers, and Waiting Sessions
Locking issues can severely impact application performance.
Basic Lock Queries
SELECT * FROM v$lock;
SELECT * FROM gv$lock; -- RAC
Blocking and Waiting Sessions
SELECT * FROM v$lock
WHERE block > 0 OR request > 0;
Detailed Lock Information
SELECT object_name, s.inst_id, s.sid, s.serial#, p.spid, s.osuser,
s.program, s.machine, s.status
FROM gv$locked_object l,
gv$session s,
gv$process p,
dba_objects o
WHERE l.object_id = o.object_id
AND l.session_id = s.sid
AND s.paddr = p.addr;
Identify Blockers and Waiters
SELECT blocking_session, sid, serial#, wait_class,
seconds_in_wait, username, osuser, program, logon_time
FROM v$session
WHERE blocking_session IS NOT NULL
ORDER BY 1;
If two sessions are both blockers and waiters, Oracle automatically resolves the deadlock.
9. Check Alerts in OEM
Oracle Enterprise Manager provides centralized alert monitoring.
Steps
Login to OEM
Navigate to Alerts
Review:
Targets Down
Critical Alerts
Warning Alerts
Errors
OEM alerts often provide early warning signs of database issues.
Conclusion
A structured database health check helps DBAs quickly diagnose issues and restore normal operations. Following this checklist ensures that database, OS, listener, and performance aspects are all reviewed in a systematic manner.
Proactive and consistent health checks significantly reduce downtime and improve application reliability.
No comments:
Post a Comment