利用hanganalyz_systemstate dump诊断数据库

Linux操作系统时间：2012-03-22 16:36:390删除编辑诊断数据库是否hang

一般采用两个命令：hanganalyze/dump systemstate，其中systemstate级别大于256时会对每个进程进行dump，当数据库进程太多时会比较消耗资源，尤其在非solaris/linux平台

HANGANALYZE uses internal kernel calls to determine
if a session is waiting for a resource, and reports the relationships
between blockers and waiters. In addition, it determines which processes
are "interesting" to be dumped, and may perform. automatic 
dumps and on those processes, based on the level used
while executing the HANGANALYZE.

The "HANGANALYZE" command has been

available since Oracle Release 8.1.6. In Oracle9i it was enhanced to
provide "cluster wide" information in Real Application Cluster
(RAC) environments on a single shot. The meaning of this is that it will
generate information for all the sessions in the cluster regardless of the
instance that issued the command.

3 Syntax Examples:

ALTER SESSION SET EVENTS 'immediate trace name

  HANGANALYZE level ';

ORADEBUG hanganalyze

ORADEBUG -g def hanganalyze

The sets the amount of additional

  information that will be extracted from the processes found by
  HANGANALYZE (ERROSTACK dump) based on the "STATE" of the node.

The levels are defined as follows:

    10
     |  | 
    Dump all processes (IGN state) 
     |

—

    5
     |  | 
    Level 4 + Dump all processes involved in wait
    chains (NLEAF state)
     |

    4
     |  | 
    Level 3 + Dump leaf nodes (blockers) in wait
    chains (LEAF,LEAF_NW,IGN_DMP state) 
     |

    3
     |  | 
    Level 2 + Dump only processes thought to be in a
    hang (IN_HANG state) 
     |

    1-2
     |  | 
    Only HANGANALYZE output, no process dump at all
     |

It is advisable not to use levels higher than 3 due
to the potentially large number of trace files that may be produced (and
could overwhelm the I/O subsystem). Since HANGANALYZE will be mostly used
to diagnose "true" hangs, a level 3 will dump the processes that
are involved in the hang condition - this usually involves fewer than 4
processes.

In some cases hanganalyze could appear to be hung. Hanganalyze will only
return when all processes are dumped.  To verify that hanganalyze is
working on non-clustered environments, monitor the size of the trace files
in the user_dump_destination. If the trace files are growing in size,
hanganalyze is not hung.  To verify that hanganalyze is working on
clustered environments, monitor the size of the "diag" trace file
in the background_dump_destination.
 |

A system state is a set of process states for all
processes on the instance when the dump is taken.  A system state dump
is useful in determining the interaction between processes.  A
systemstate dump will report on resources that are being held by each process.

3 Syntax Examples:

ALTER SESSION SET EVENTS 'immediate trace name

  SYSTEMSTATE level ';

ORADEBUG dump systemstate

ORADEBUG -g all dump systemstate

The sets the amount of additional

  information that will be extracted from the processes found by
  SYSTEMSTATE dump based on the "STATE" of the node.

The levels are defined as follows:

    1
     |  | 
    Very basic process information only
     |

—

    2
     |  | 
    process + session state objects
     |

    10
     |  | 
    Most common level - includes state object trees
    for all processes
     |

    Level
    + 256
     |  | 
    Adding 256 to the level will try to dump short
    stack info for each process.  
     |

Dumping

    short stacks in the systemstate dump is an enhancement that has been
    included in patchsets 9.2.0.6 and 10.1.0.4.  It is also included
    in base release 10.2.0.1.

Short

    stacks are produced reasonably quickly on Solaris and Linux but on
    other platforms including short stacks in dumps can take a very long
    time. Hence it is advisable to test if the overhead is acceptable
    before using this option.

Typical levels are 266 (Solaris or Linux) and 10 (other platforms)

The ideal manner to collect the hanganalyze and systemstate dumps is to collect 2 sets. However, it is best to have the 2 hanganalyzes in one trace file and the 2 systemstate dumps is 2 separate trace files. In order to do this, you will need 3 separate sqlplus sessions. We will identify these as SQL1, SQL2, and SQL3.

export

 ORACLE_SID=PROD           
 ## Replace PROD with the SID you want to trace

Using SQL*Plus

 connect as SYSDBA using the following command:

sqlplus -prelim / as sysdba

Do this 3 times in 3 separate windows, creating 3 sqlplus sessions ()

In execute the following:

SQL1> oradebug setmypid

SQL1> oradebug unlimit

SQL1> oradebug hanganalyze 3

In execute the following:

SQL2> oradebug setmypid

SQL2> oradebug unlimit

SQL2> oradebug dump systemstate 266

Wait at least 2

 minutes to give time to identify process state changes.

In execute the following:

SQL1> oradebug hanganalyze 3

In execute the following:

SQL3> oradebug setmypid

SQL3> oradebug unlimit

SQL3> oradebug dump systemstate 266

Systemstate

 level 258 and 266 are only available in patchsets 9.2.0.6, 10.1.0.4 and
 base release 10.2.0.1 and higher. If you are not on one of these releases,
 use level 10 in place of the level suggested above.

If you are using

 systemstate level 266 and it is taking much longer than expected to
 generate the dump file, then end this systemstate dump and try level 258.

The trace files will be written to the "diag" background process trace file in the background_dump_destination on each node.

Identify the SID

 you want to trace

export ORACLE_SID=PROD

Replace PROD with the SID you want to trace

Using SQL*Plus

 connect as "/ AS SYSDBA" using the following command:

sqlplus -prelim / as sysdba

Execute the

 following:

SQL> oradebug setmypid

SQL> oradebug unlimit

SQL> oradebug -g all hanganalyze 3

SQL> oradebug -g all dump systemstate 266

Wait at least 2

 minutes to give time to identify process state changes.

Execute the

 following:

SQL> oradebug -g all hanganalyze 3

SQL> oradebug -g all dump systemstate 266

Verify that the the "diag" background process trace file has been updated in the background_dump_destination on each node. Examine each "diag" file to ensure that it contains 2 hanganalyze dumps and a systemstate dump.

所谓的hang是指oracle内部死锁。当DML发生内部死锁时，oracle可以自动处理；当内核级资源(latch/pin)死锁时，通常oracle就会挂在那里

当CYCLE出现时，意味着数据库真的遭遇死锁了

Cycle 1 : :

</472/0x80057dd8//library cache lock> <— sid 17, ospid 6674: blocker/waiter

— </7/0x800580f4//library cache lock> <— sid 13, ospid 6676: blocker/waiter

— </204/0x8005a644//library cache lock> <— sid 21, ospid 5813: blocker/waiter Dumping System_State and Fixed_SGA in process with ospid Dumping Process information for process with ospid Dumping Process information for process with ospid Dumping Process information for process with ospid f your hanganalyze output has a "CYCLE" section, the database is in a hang state.

Note down the

 SID and OSPID for the blocker and waiter sessions to use in  and 
 steps.

Go to the next

 step to collect additional data.

the "OPEN CHAIN" section reports sessions involved in a wait chain. A wait chain means that one session is blocking one or more other sessions. Open chains represents "stuck" or "locked" sessions

Hanganalyze

1

Open chains found:

Chain 1 : :

</44773/0x265f15c//SQL*Net message from client> <–sid 16, ospid 1948: blocker

— </5/0x265fad4//enqueue> <–sid 12, ospid 2112: waiter

— </14/0x265fdfc//enqueue> <–sid 13, ospid 2076: waiter

Chain 2 : :

<19/3/0x2660124/2392/No Wait>

Hanganalyze

2

Open chains found:

Chain 1 : :

</44773/0x265f15c//SQL*Net message from client> <–sid 16, ospid 1948: blocker

— </5/0x265fad4//enqueue> <–sid 12, ospid 2112: waiter

— </14/0x265fdfc//enqueue> <–sid 13, ospid 2076: waiter

Chain 2 : :

<19/3/0x2660124/2392/No Wait>

Hanganalyze

3

Open chains found:

Chain 1 : :

</44773/0x265f15c//SQL*Net message from client> <–sid 16, ospid 1948: blocker

— </5/0x265fad4//enqueue> <–sid 12, ospid 2112: waiter

— </14/0x265fdfc//enqueue> <–sid 13, ospid 2076: waiter

— </3/0x2660124//enqueue> <–sid 19, ospid 2392: waiter

通过对比三次报告，判断会话是否阻塞时间过长，在上面例子中，需要对blocker和第一个waiter会话做errorstack；

若一个进程阻塞了10个或更多sesson，通常Open Chain之前会有Found信息，若一个blocker出现多次，则需要收集errorstack

Found 34 objects waiting for ospid/wait_event>

<131/754/0x9fc1e8//No Wait>

Found 17 objects waiting for ospid/wait_event>

<245/2343/0xa19f48//latch free>

Found 24 objects waiting for ospid/wait_event>

<144/2183/0xa0c9b8//latch free>

SQL> oradebug setospid

SQL> oradebug unlimit

SQL> oradebug dump errorstack 3

若没有满足以上条件的输出结果，那么数据库应该是性能慢而非hang

包含call stack和process state两部分，其中process state又有一系列的SO(state object)组成，大致分为process/session/library object lock

1 process

 | 
shows that the state object is of type process
 |

— |

 | 
shows that the ospid is equal to 25765
 |

0x3661bd0c, type: 2, owner: (nil), flag: INIT/-/-/0x00

Oracle pid=16, calls cur/top: 0x3672bb94/0x3672bb94, flag: (0) -

int error: 0, call error: 0, sess error: 0, txn error 0

(post info) last post received: 0 0 0

last post received-location: No post

last process to post me: none

last post sent: 0 0 0

last post sent-location: No post

last process posted by me: none

(latch info) wait_event=0 bits=0

Process Group: DEFAULT, pseudo proc:
0x3664d528

O/S info: user: oracle, term: pts/0,

OSD pid info: Unix process pid: 25765,
image: [username@machine.name.com](http://blog.itpub.net/15480802/viewspace-719304/mailto:username@machine.name.com)
(TNS V1-V3)
 |

shows that the state object is of type process | |

shows that the ospid is equal to 25765 |

2 session

  0x366dd1b4,
type: 4, owner: 0x3661bd0c, flag: INIT/-/-/0x00

 trans: 0x354e1fb0, creator:
0x3661bd0c, flag: (41) USR/- BSY/-/-/-/-/-

DID: 0001-0010-000010E9, short-term DID: 0000-0000-00000000

txn branch: (nil)

oct: 26, prv: 0, 0x, psql: 0x32a2c520, user: 54/SCOTT

O/S info: user: oracle, term: pts/0, ospid: 25764,

machine: machine.name.com

program: [sqlplus@machine.name.com](http://blog.itpub.net/15480802/viewspace-719304/mailto:sqlplus@machine.name.com)
(TNS V1-V3)

application name: SQL*Plus, hash value=3669949024

blocking sess=0x0x366c7f14 seq=39 wait_time=0

seconds since wait started=11900

, object #=c7cc,
table/partition=0
 |

shows that the state object is of type process | |

shows that the sid for this session is 118. This is used to ensure that you are looking at the correct session.
0x | shows that the address of the SQL for this session is 3076f3f8. This will be used to find the Active SQL in the next step. | |

shows that the wait event for this session is enq: TM – contention. This will be used when determining the type of lock associated with the hang or lock situation that you are troubleshooting. | |

shows the value for P1 if the wait event is an enqueue. P1 is a hexadecimal value that represents the lock type and lock mode (lmode). See MetaLink Note 34566.1 for details on obtaining the lock type and lock mode from P1. MetaLink Note 29787.1 provides details on the lock types and lock modes. |

3 library object lock

通过session信息里的sql address搜查，可以找到相应的library object lock(搜索时去掉0x前缀)

0x2f041b48, type: 53, owner: 0x366dd1b4, flag: INIT/-/-/0x00

lock=2f041b48 handle=3076f3f8 mode=N

call pin=(nil) session pin=(nil) hpc=0000

hlc=0000

htl=0x2f041b94[0x30c3e358,0x324497a4]

htb=0x30c3e358 ssga=0x30c3de24

user=366dd1b4 session=366dd1b4 count=1

flags=[0000] savepoint=0x460964fc

mtx=0x3076f4ac(1) cdp=1

hash=01e8abf8cabca30a0249f91737e32c8f

timestamp=03-27-2007 14:39:28

namespace=CRSR

flags=RON/KGHP/TIM/PN0/SML/KST/DBN/MTX/[120100d0]

kkkk-dddd-llll=0000-0001-0001 lock=N pin=0

latch#=1 hpc=0002 hlc=0002

lwt=0x3076f454[0x3076f454,0x3076f454]

ltm=0x3076f45c[0x3076f45c,0x3076f45c]

pwt=0x3076f438[0x3076f438,0x3076f438]

ptm=0x3076f440[0x3076f440,0x3076f440]

ref=0x3076f474[0x3076f474,0x3076f474]

lnd=0x3076f480[0x3076f480,0x3076f480]
 |

shows that the state object is of type Library Object Lock | |

shows that the Library Object Handle address is 3076f3f8. This address is the same as the SQL address found in the "session" state object. This is use to verify that this is the state object containing the active SQL. | |

show that the active SQL is "lock table emp in exclusive mode". |

可以将显示的function在MOS作为关键字搜索，一般来说需要忽略kse/ksd打头的

As a general rule, ignore the functions beginning with 'kse' and 'ksd'. Start by using the first 5 functions following the 'kse' and 'ksd' functions. You may need to try several combinations of search criteria before finding any previously identified issues that match the issue you are attempting to troubleshoot.

对该案例可以尝试ntprd nsprecv nsrdr nsdo nsbrecv enq: TM conection关键字搜索

----- Call Stack Trace -----

calling

call    
entry               
argument values in hex

location

type    
point               
(? means dubious value)

——————– ——– ——————–

----------------------------

ksedst()+27

call    
ksedst1()           
1 ? 1 ?

ksedmp()+557

call    
ksedst()            
1 ? 8409404 ? AC007B ? 18 ?

C0246C4 ? 6 ?

ksdxfdmp()+1382

call    
0C969FDE            
3 ? 1 ? BBA5021 ? 6F64736E ?

           312B2928
? 373134 ?

ksdxcb()+1321

call    
00000000            
BFFFC3B0 ? 11 ? 3 ?

BFFFC310 ? BFFFC360 ?

sspuser()+102

call    
00000000            
1 ? 2000 ? 0 ? 0 ? 0 ? 0 ?

00A767A0

signal  
00000000            
C ? BFFFC838 ? BFFFC8B8 ?

ntprd()+150

call    
sntpread()          
CCAAE88 ? CCAB4A0 ? CCAA030 ?

CCADB9E ? CCAA09C ? 0 ?

nsprecv()+416

call    
00000000            
CCA9F08 ? CCADB9E ? CCAA09C ?

0 ? 0 ? A ?

nsrdr()+155

call    
nsprecv()           
CCA9BF8 ? BFFFD540 ?

CCA1AC4 ?

nsdo()+1417

call    
nsrdr()             
CCA9BF8 ? CCA1AC4 ?

nsbrecv()+51

call    
nsdo()              
CCA19A8 ? 55 ? CCA1AC4 ? 0 ?

BFFFD7FC ? 0 ? 3 ?

nioqrc()+373

call    
nsbrecv()           
CCA19A8 ? CCA1AC4 ?

BFFFD7FC ? 0 ?

__PGOSF86_opikndf2( call

00000000            
CC6B5C4 ? 0 ? BFFFED90 ? 1 ?

)+740

0 ? 0 ?

opitsk()+511

    call    
00000000            
CC6B5C4 ? 2 ? BFFFED90 ? 1 ?

0 ? 0 ?

opiino()+821

call    
opitsk()            
0 ? 0 ?

opiodr()+835

call    
00000000            
3C ? 4 ? BFFFF830 ?

opidrv()+466

call    
opiodr()            
3C ? 4 ? BFFFF830 ? 0 ?

sou2o()+91

call    
opidrv()            
3C ? 4 ? BFFFF830 ?

opimai_real()+117

call    
sou2o()             
BFFFF814 ? 3C ? 4 ?

        BFFFF830 ?

main()+111

call     opimai_real()       
2 ? BFFFF860 ?

__libc_start_main() call

00000000            
2 ? BFFFF924 ? BFFFF930 ?

+211

A81C66 ? BBAFF4 ? 0 ?
 |

通常情况下，以上步骤可以大致诊断出数据库是否hang以及造成hang的根本原因

                                                来自 “ ITPUB博客 ” ，链接：http://blog.itpub.net/15480802/viewspace-719304/，如需转载，请注明出处，否则将追究法律责任。