–ORACLE
1、查询碎片程度高的表
条件为什么block>100,因为一些很小的表,只有几行数据实际大小很小,但是block一次性分配就是5个(11g开始默认一次性分配1M的block大小了,见create table storged的NEXT参数),5个block相对于几行小表数据来说就相差太大了。
算法中/0.9是因为块的pfree一般为10%,所以一个块最多只用了90%,而且一行数据大于8KB时容易产生行链接,把一行分片存储,一样的一个块连90%都用不满、
AVG_ROW_LEN还是比较准的,比如个人实验情况一表6个字段,一个number,其他5个都是char(100)但是实际数据都是’1111111’7位,AVG_ROW_LEN显示依然为513
SELECT TABLE_NAME,(BLOCKS*8192/1024/1024)”理论大小M”,
(NUM_ROWS*AVG_ROW_LEN/1024/1024/0.9)”实际大小M”,
round((NUM_ROWS*AVG_ROW_LEN/1024/1024/0.9)/(BLOCKS*8192/1024/1024),3)*100||’%’ “实际使用率%”
FROM USER_TABLES where blocks>100 and (NUM_ROWS*AVG_ROW_LEN/1024/1024/0.9)/(BLOCKS*8192/1024/1024)30;
3、集群因子clustering_factor高的表
集群因子越接近块数越好,接近行数则说明索引列的列值相等的行分布极度散列,可能不走索引扫描而走全表扫描
select tab.table_name,tab.blocks,tab.num_rows,ind.index_name,ind.clustering_factor,
round(nvl(ind.clustering_factor,1)/decode(tab.num_rows,0,1,tab.num_rows),3)*100||’%’ “集群因子接近行数”
from user_tables tab, user_indexes ind where tab.table_name=ind.table_name
and tab.blocks>100
and nvl(ind.clustering_factor,1)/decode(tab.num_rows,0,1,tab.num_rows) between 0.35 and 3
4、根据sid查spid或根据spid查sid
select s.sid,s.serial#,s.LOGON_TIME,s.machine,p.spid,p.terminal from v$session s,v$process p where s.paddr=p.addr and s.sid=XX or p.spid=YY
5、根据sid查看具体的sql语句
select username,sql_text,machine,osuser from v$session a,v$sqltext_with_newlines b
where DECODE(a.sql_hash_value, 0, prev_hash_value, sql_hash_value)=b.hash_value and a.sid=&sid order by piece;
6、根据spid查询具体的sql语句
select ss.SID, pr.SPID, ss.action, sa.SQL_FULLTEXT, ss.TERMINAL, ss.PROGRAM, ss.SERIAL#, ss.USERNAME, ss.STATUS, ss.OSUSER, ss.last_call_et
from v$process pr, v$session ss, v$sqlarea sa
where ss.status=’ACTIVE’ and ss.username is not null and pr.ADDR = ss.PADDR
and ss.SQL_ADDRESS = sa.ADDRESS and ss.SQL_HASH_VALUE = sa.HASH_VALUE
and pr.spid = XX
7、查看历史session_id的SQL来自哪个IP
(当然这是个误解,都是历史的了,怎么可能还查到spid,其实查看trace文件名就可以知道spid,trace文件里面有sid和具体sql,如果trace存在incident,那trace就看不到具体sql,但是可以在incident文件中看到具体的sql,如DW_ora_17751.trc中17751就是spid,里面有这样的内容Incident 115 created, dump file: /XX/incident/incdir_115/DW_ora_17751_i115.trc,那么在DW_ora_17751_i115.trc就可以看到具体的sql语句)
DB_ora_29349.trc中出现如下
*** SESSION ID:(5057.12807) 2016-10-26 14:45:52.726
通过表V$ACTIVE_SESSION_HISTORY来查,如下
select a.sql_id,a.machine,a.* from V$ACTIVE_SESSION_HISTORY a where a.session_id=5057 and a.SESSION_SERIAL#=12807
查询上面的machine的IP是多少
select s.sid,s.serial#,s.LOGON_TIME,s.machine,p.spid,p.terminal from v$session s,v$process p where s.paddr=p.addr and s.machine=’localhost’
通过上面的spid在oracle服务器上执行netstat -anp |grep spid即可
[oracle@dwdb trace]$ netstat -anp |grep 17630
tcp 210 0 192.168.64.228:11095 192.168.21.16:1521 ESTABLISHED 17630/oracleDB
tcp 0 0 ::ffff:192.168.64.228:1521 ::ffff:192.168.64.220:59848 ESTABLISHED 17630/oracleDB
出现两个,说明来自220,连接了228数据库服务器,但是又通过228服务器的dblink去连接了16服务器
8、查询DML死锁会话sid,及引起死锁的堵塞者会话blocking_session
select sid, blocking_session, LOGON_TIME,sql_id,status,event,seconds_in_wait,state, BLOCKING_SESSION_STATUS from v$session where event like ‘enq%’ and state=’WAITING’ and BLOCKING_SESSION_STATUS=’VALID’
BLOCKING_SESSION:Session identifier of the blocking session. This column is valid only if BLOCKING_SESSION_STATUS has the value VALID.
可以在v$session.LOGON_TIME上看到引起死锁的堵塞者会话比等待者要早
如果遇到RAC环境,一定要用gv$来查,并且执行alter system kill session ‘sid,serial#’要到RAC对应的实例上去执行
或如下也可以
select
(select username from v$session where sid=a.sid) blocker,
a.sid,
a.id1,
a.id2,
‘ is blocking ‘ “IS BLOCKING”,
(select username from v$session where sid=b.sid) blockee,
b.sid
from v$lock a, v$lock b
where a.block = 1
and b.request > 0
and a.id1 = b.id1
and a.id2 = b.id2;
9、查询DDL锁的sql
SELECT sid, event, p1raw, seconds_in_wait, wait_time
FROM sys.v_$session_wait
WHERE event like ‘library cache %’
p1raw结果为’0000000453992440′
SELECT s.sid, kglpnmod “Mode”, kglpnreq “Req”, s.LOGON_TIME
FROM x$kglpn p, v$session s
WHERE p.kglpnuse=s.saddr
AND kglpnhdl=’0000000453992440′;
结果为671 0 3 2011-11-1 12:00:00
525 2 0 2011-11-4 12:00:00
10、查询锁住的DDL对象
select d.session_id,s.SERIAL#,d.name from dba_ddl_locks d,v$session s where d.owner=’MKLMIGEM’ and d.SESSION_ID=s.sid
11、查询当前正在执行的sql
SELECT s.sid,s.serial#,s.username,spid,v$sql.sql_id,machine,s.terminal,s.program,sql_text
FROM v$process,v$session s,v$sql
WHERE addr=paddr and s.sql_id=v$sql.sql_id AND sql_hash_value=hash_value
12、查询正在执行的SCHEDULER_JOB
select owner,job_name,sid,b.SERIAL#,b.username,spid from ALL_SCHEDULER_RUNNING_JOBS,v$session b,v$process where session_id=sid and paddr=addr
13、查询正在执行的dbms_job
select job,b.sid,b.SERIAL#,b.username,spid from DBA_JOBS_RUNNING a ,v$session b,v$process where a.sid=b.sid and paddr=addr
14、查询一个会话session、process平均消耗多少内存,查看下面avg_used_M值
select round(sum(pga_used_mem)/1024/1024,0) total_used_M, round(sum(pga_used_mem)/count(1)/1024/1024,0) avg_used_M,
round(sum(pga_alloc_mem)/1024/1024,0) total_alloc_M, round(sum(pga_alloc_mem)/count(1)/1024/1024,0) avg_alloc_M from v$process;
15、TOP 10 执行次数排序
select *
from (select executions,username,PARSING_USER_ID,sql_id,sql_text
from v$sql,dba_users where user_id=PARSING_USER_ID order by executions desc)
where rownum 4
group by b.tablespace_name,b.file_name,b.file_id,b.bytes,b.AUTOEXTENSIBLE,b.MAXBYTES
order by b.tablespace_name;
24、查看表空间可用百分比
select b.tablespace_name,a.total,b.free,round((b.free/a.total)*100) “% Free” from
(select tablespace_name, sum(bytes/(1024*1024)) total from dba_data_files group by tablespace_name) a,
(select tablespace_name, round(sum(bytes/(1024*1024))) free from dba_free_space group by tablespace_name) b
WHERE a.tablespace_name = b.tablespace_name
order by “% Free”;
25、查看临时表空间使用率
SELECT temp_used.tablespace_name,total,used,
total – used as “Free”,
round(nvl(total-used, 0) * 100/total,3) “Free percent”
FROM (SELECT tablespace_name, SUM(bytes_used)/1024/1024 used
FROM GV_$TEMP_SPACE_HEADER
GROUP BY tablespace_name) temp_used,
(SELECT tablespace_name, SUM(bytes)/1024/1024 total
FROM dba_temp_files
GROUP BY tablespace_name) temp_total
WHERE temp_used.tablespace_name = temp_total.tablespace_name
26、查询undo表空间使用情况
select tablespace_name,status,sum(bytes)/1024/1024 M from dba_undo_extents group by tablespace_name,status
27、查看ASM磁盘组使用率
select name,round(total_mb/1024) “总容量”,round(free_mb/1024) “空闲空间”,round((free_mb/total_mb)*100) “可用空间比例” from gv$asm_diskgroup
28、统计每个用户使用表空间率
SELECT c.owner “用户”,
a.tablespace_name “表空间名”,
total/1024/1024 “表空间大小M”,
free/1024/1024 “表空间剩余大小M”,
( total – free )/1024/1024 “表空间使用大小M”,
Round(( total – free ) / total, 4) * 100 “表空间总计使用率%”,
c.schemas_use/1024/1024 “用户使用表空间大小M”,
round((schemas_use)/total,4)*100 “用户使用表空间率%”
FROM (SELECT tablespace_name,
Sum(bytes) free
FROM DBA_FREE_SPACE
GROUP BY tablespace_name) a,
(SELECT tablespace_name,
Sum(bytes) total
FROM DBA_DATA_FILES
GROUP BY tablespace_name) b,
(Select owner ,Tablespace_Name,
Sum(bytes) schemas_use
From Dba_Segments
Group By owner,Tablespace_Name) c
WHERE a.tablespace_name = b.tablespace_name
and a.tablespace_name =c.Tablespace_Name
order by c.owner,a.tablespace_name;
SELECT c.owner,
a.tablespace_name,
total/1024/1024,
free/1024/102,
( total – free )/1024/1024,
Round(( total – free )/total, 4)*100,
c.schemas_use/1024/1024,
round((schemas_use)/total,4)*100
FROM (SELECT tablespace_name,
Sum(bytes) free
FROM DBA_FREE_SPACE
GROUP BY tablespace_name) a,
(SELECT tablespace_name,
Sum(bytes) total
FROM DBA_DATA_FILES
GROUP BY tablespace_name) b,
(Select owner ,Tablespace_Name,
Sum(bytes) schemas_use
From Dba_Segments
Group By owner,Tablespace_Name) c
WHERE a.tablespace_name = b.tablespace_name
and a.tablespace_name =c.Tablespace_Name
order by c.owner,a.tablespace_name;
29、查看闪回区\快速恢复区空间使用率
select sum(percent_space_used)||’%’ “已使用空间比例” from V$RECOVERY_AREA_USAGE
30、查看僵死进程,分两种
alter system kill session一执行则session即标记为KILLED,但是如果会话产生的数据量大则这个kill可能会比较久,在这个过程中session标记为KILLED但是这个会话还在V$session中,则V$session.paddr还在,所以可以匹配到V$process.addr,所以process进程还在;当kill过程执行完毕,则这个会话即不在V$session中
会话不在的
select * from v$process where addr not in (select paddr from v$session) and pid not in (1,17,18)
会话还在的,但是会话标记为killed
select * from v$process where addr in (select paddr from v$session where status=’KILLED’)
再根据上述结果中的SPID通过如下命令可以查看到process的启动时间
ps auxw|head -1;ps auxw|grep SPID
31、查看行迁移或行链接的表
select * From dba_tables where nvl(chain_cnt,0)0
chain_cnt :Number of rows in the table that are chained from one data block to another or that have migrated to a new block, requiring a link to preserve the old rowid. This column is updated only after you analyze the table.
32、数据缓冲区命中率
LECT a.VALUE+b.VALUE logical_reads, c.VALUE phys_reads,
round(100*(1-c.value/(a.value+b.value)),2)||’%’ hit_ratio
FROM v$sysstat a,v$sysstat b,v$sysstat c
WHERE a.NAME=’db block gets’
AND b.NAME=’consistent gets’
AND c.NAME=’physical reads’;
或
SELECT DB_BLOCK_GETS+CONSISTENT_GETS Logical_reads,PHYSICAL_READS phys_reads,
round(100*(1-(PHYSICAL_READS/(DB_BLOCK_GETS+CONSISTENT_GETS))),2)||’%’ “Hit Ratio”
FROM V$BUFFER_POOL_STATISTICS WHERE NAME=’DEFAULT’;
33、共享池命中率
以下两者应该都可以,看个人怎么理解
select sum(pinhits)/sum(pins)*100 from v$librarycache;
select sum(pinhits-reloads)/sum(pins)*100 from v$librarycache;
34、查询归档日志切换频率
select sequence#,to_char(first_time,’yyyymmdd_hh24:mi:ss’)
firsttime,round((first_time-lag(first_time) over(order by first_time))*24*60,2) minutes from
v$log_history where first_time > sysdate – 3 order by first_time,minutes;
或
select sequence#,to_char(first_time,’yyyy-mm-dd hh24:mi:ss’) First_time,First_change#,switch_change# from
v$loghist where first_time>sysdate-3 order by 1;
35、查询lgwr进程写日志时每执行一次lgwr需要多少秒,在state是waiting的情况下,某个等待编号seq#下,seconds_in_wait达多少秒,就是lgwr进程写一次IO需要多少秒
select event,state,seq#,seconds_in_wait,program from v$session where program like ‘%LGWR%’ and state=’WAITING’
36、查询没有索引的表
Select table_name from user_tables where table_name not in (select table_name from user_indexes)
Select table_name from user_tables where table_name not in (select table_name from user_ind_columns)
37、查询7天的db time
TH sysstat AS
(select sn.begin_interval_time begin_interval_time,
sn.end_interval_time end_interval_time,
ss.stat_name stat_name,
ss.value e_value,
lag(ss.value, 1) over(order by ss.snap_id) b_value
from dba_hist_sysstat ss, dba_hist_snapshot sn
where trunc(sn.begin_interval_time) >= sysdate – 7
and ss.snap_id = sn.snap_id
and ss.dbid = sn.dbid
and ss.instance_number = sn.instance_number
and ss.dbid = (select dbid from v$database)
and ss.instance_number = (select instance_number from v$instance)
and ss.stat_name = ‘DB time’)
select to_char(BEGIN_INTERVAL_TIME, ‘mm-dd hh24:mi’) ||
to_char(END_INTERVAL_TIME, ‘ hh24:mi’) date_time,
stat_name,
round((e_value – nvl(b_value, 0)) /
(extract(day from(end_interval_time – begin_interval_time)) * 24 * 60 * 60 +
extract(hour from(end_interval_time – begin_interval_time)) * 60 * 60 +
extract(minute from(end_interval_time – begin_interval_time)) * 60 +
extract(second from(end_interval_time – begin_interval_time))),
0) per_sec
from sysstat
where (e_value – nvl(b_value, 0)) > 0
and nvl(b_value, 0) > 0
38、查询产生热块较多的对象
x$bh .tch(Touch)表示访问次数越高,热点快竞争问题就存在
SELECT e.owner, e.segment_name, e.segment_type
FROM dba_extents e,
(SELECT *
FROM (SELECT addr,ts#,file#,dbarfil,dbablk,tch
FROM x$bh
ORDER BY tch DESC)
WHERE ROWNUM < 11) b
WHERE e.relative_fno = b.dbarfil
AND e.block_id b.dbablk;
39、导出AWR报告的SQL语句
select * from dba_hist_snapshot
select * from table(dbms_workload_repository.awr_report_html(DBID, INSTANCE_NUMBER, startsnapid,endsnapid))
select * from TABLE(DBMS_WORKLOAD_REPOSITORY.awr_diff_report_html(DBID, INSTANCE_NUMBER, startsnapid,endsnapid, DBID, INSTANCE_NUMBER, startsnapid,endsnapid));
40、查询某个SQL的执行计划
select a.hash_value,a.* from v$sql a where sql_id=’0n4qfzbqfsjm3′
select * from table(dbms_xplan.display_cursor(v$sql.hash_value,0,’advanced’));
含顺序的
select * from table(xplan.display_cursor(‘v$sql.sql_id’,0,’advanced’));
不过要先创建xplan包,再执行
SQL> CREATE PUBLIC SYNONYM XPLAN FOR SYS.XPLAN;
SQL> grant execute on sys.xplan to public;
–DB2
–每天需要监控的内容
–监控CPU和内存
vmstat 1 5
–监控IO
iostat 1 5
sar -d 1 5
–查看表空间的存储空间状态
–查看数据库目录的可用空间 dbpath
–查看日志目录所在的文件系统的剩余空间
df -k /logdir
–查看归档的日志
db2adutl query db testdb
–检查诊断日志和管理通知日志
db2diag.log
db2instname.nfy
–检查数据库备份
db2 list history backup all for testdb
db2 list history archive log all for testdb
db2 list history all for testdb
–查看数据库表的状态
db2 “select tabname,colcount,status from syscat.tables
where tabschema not like ‘SYS%’ order by tabname
“
status = N 正常
status = C 需要检查完整性
db2 set integrity for tabname immediate checked
–查看缓存池命中率,数据逻辑读,数据物理读
db2 “select SNAPSHOT_TIMESTAMP,char(bp_name,15)
,TOTAL_LOGICAL_READS,TOTAL_PHYSICAL_READS,total_hit_ratio_percent
,data_logical_reads,data_physical_reads,data_hit_ratio_percent
from sysibmadm.bp_hitratio
“
–查看缓存池命中率,索引逻辑读,索引物理读
db2 “select SNAPSHOT_TIMESTAMP,char(bp_name,15)
,TOTAL_LOGICAL_READS,TOTAL_PHYSICAL_READS,total_hit_ratio_percent
,index_logical_reads,index_physical_reads,index_hit_ratio_percent
from sysibmadm.bp_hitratio
“
–监控执行成本最高的SQL
db2 “select agent_id,PERCENT_ROWS_SELECTED
from sysibmadm.appl_performance
order by percent_rows_selected
“
–监控运行最长的SQL
db2 “select agent_id,appl_status,elapsed_time_min
from sysibmadm.long_running_sql
order by elapsed_time_min desc fetch first 5 rows only
“
–监控执行次数最多的SQL
db2 “select SNAPSHOT_TIMESTAMP, NUM_EXECUTIONS, char(STMT_TEXT,150)
from sysibmadm.top_dynamic_sql
order by num_executions desc fetch first 10 rows only
“
–监控排序次数最多的SQL
db2 “select stmt_sorts,char(STMT_TEXT,150),SNAPSHOT_TIMESTAMP,sorts_per_execution
from sysibmadm.top_dynamic_sql
order by stmt_sorts desc fetch first 10 rows only
“
–监控引起锁等待的SQL
db2 “select agent_id, char(stmt_text,100) as statement, stmt_elapsed_time_ms
from table(snapshot_statement(‘TESTDB’,-1)) as B
where agent_id in
(select agent_id_holding_lk from table(snapshot_lockwait(‘TESTDB’,-1)) as A
order by lock_wait_start_time asc fetch first 20 rows only
)
order by stmt_elapsed_time_ms desc
”
–查看选择的行数与读取的行数的比例 应大于 20%
db2 “select float(rows_selected)/rows_read from sysibmadm.snapdb where rows_read!=0 “
–表扫描次数 scans
db2pd -d testdb -tcbstats
–查看扫描次数多的表相关的语句
db2 ” select stmt_text from table(mon_get_pkg_cache_stmt(NULL,NULL,NULL,-2)) AS T WHERE stmt_text like ‘%TESTTAB%’ “
–检查锁相关的等待,超时,升级,死锁
db2 get snapshot for all on testdb > log.txt
grep -n “Deadlocks detected” log.txt | grep -v “= 0”
grep -n “Lock waits” log.txt | grep -v “= 0”
grep -n “Lock escalation” log.txt | grep -v “= 0”
grep -n “Lock Timeouts” log.txt | grep -v “= 0”
–查看当前运行最频繁、最消耗资源的SQL (Costly SQL)
db2 “select substr(stmt_text,1,100) as stmt_text
from table(mon_get_pkg_cache_stmt(NULL,NULL,NULL,-2)) AS T
where rows_read!=0
order by rows_returned/rows_read asc fetch first 10 rows only
“
–执行最频繁的SQL
db2 “select substr(stmt_text,1,100) as stmt_text, num_executions
from table(mon_get_pkg_cache_stmt(NULL,NULL,NULL,-2)) AS T
order by num_executions desc fetch first 10 rows only
“
–排序最多的语句
db2 “select substr(stmt_text,1,100) as stmt_text , total_sorts
from table(mon_get_pkg_cache_stmt(NULL,NULL,NULL,-2)) AS T
order by total_sorts desc fetch first 10 rows only
“
———————————————————————————-
— 每周需要监控的内容
———————————————————————————-
–检查备份完整性
db2ckbkp -h TESTDB.0.db2inst1.NODE0000.CATN0000.20131023221025.001
–检查是否需要runstats
没有搜集过统计信息的表
db2 “select tabname from syscat.tables where stats_time is null “
没有收集过统计信息的索引
db2 “select indname from syscat.indexes where stats_time is null “
15天没有更新过统计信息的表
db2 “select tabname from syscat.tables where stats_time < current timestamp – 15 days"
–监控表是否需要重组
db2 reorgchk update statistics on table all
–监控新对象
db2 “select tabschema, tabname, create_time
from syscat.tables
where create_time > ‘2014-01-01.00.00.00.000000’
“
–包缓存中的SQL
db2 “select substr(stmt_text,1,200) as sql_stmt, current date
from table(snapshot_dyn_sql(‘TESTDB’,-1)) as snapshot_dyn_sql
“
监控系统资源占用情况
db2 “select application_handle, substr(application_name,1,30) as appname, total_cpu_time
from table(mon_get_connection(null,null)) as t
order by total_cpu_time desc
“
后台实用程序
db2 list utilities
–获得数据库总大小信息
db2 “call get_dbsize_info(?,?,?,0)”
–检查数据库用户db2inst1的权限
db2 ” select substr(authority,1,30) as authority , d_user, d_group, d_public,role_user, role_group,role_public,d_role
from table(sysproc.auth_list_authorities_for_authid(‘DB2INST1′,’U’)) as t
order by authority
“
–查找无效对象
db2 “select char(tabschema,20), char(tabname,40),type,status from syscat.tables order by 1”
db2 “select viewname from syscat.views where valid=’N'”
db2 “select trigname from syscat.triggers where VALID=’N’ “
—————————————————————
–内存监控
–数据库和实例每一个内存池的内存大小
db2mtrk -i -p -v -d
–实例总共占用的内存
db2pd -dbptnmem
Cached:含在MemUsed内,已经分配的内存,当前没有使用到,
但这部分内存不能给其他进程使用,DB2可以分配给其他内存池。
—-
db2top 每隔一段时间收集一次快照,然后通过计算其与最近一次快照之间的数值差别与经过的时间
–交互模式
db2top -d testdb
db2top -d testdb -f db2top_collect.txt -C -m 2 -i 15
其中-m参数指定运行多少分钟,-i指定每隔多少秒收集一次快照
–播放监控文件
db2top -d testdb -f db2top_collect.txt -b l -A
–直接跳转到某个时间点重新播放监控
db2top -d testdb -f db2top_collect.txt /02:00:00
–IO监控 硬盘使用 5秒钟监控一次,监控10次
–linux
iostat -d 5 10
–aix
iostat -D 5 10
–CPU和虚拟内存监控(物理内存和交换空间),5秒钟监控一次,监控10次
vmstat 5 10
ipcs -a 进程间通信的信息
db2grep -dump 查看DB2安装版本



