下面的测试来至于一位网友,他们生产数据库异常,在drop表空间,重建控制文件后,报下面的错误:
Sat Jul 19 00:45:47 2014SMON: enabling cache recoverySat Jul 19 00:45:47 2014Errors in file /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_12464.trc:ORA-01173: data dictionary indicates missing data file from system tablespaceSat Jul 19 00:45:47 2014Error 1173 happened during db open, shutting down databaseUSER: terminating instance due to error 1173Instance terminated by USER, pid = 12464ORA-1092 signalled during: alter database open resetlogs…
下面是简单的测试一下,提供2种方法来解决此故障。欢迎大家加入ORACLE超级群:17115662 免费解决各种ORACLE问题,以后BLOG将迁移到http://www.htz.pw1,数据库版本
www.htz.pw > select * from v$version where rownum<3; BANNER—————————————————————-Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 – 64biPL/SQL Release 10.2.0.4.0 – Productionwww.htz.pw > !lsb_release -aLSB Version: :core-3.0-amd64:core-3.0-ia32:core-3.0-noarch:graphics-3.0-amd64:graphics-3.0-ia32:graphics-3.0-noarchDistributor ID: RedHatEnterpriseASDescription: Red Hat Enterprise Linux AS release 4 (Nahant Update 8)Release: 4Codename: NahantUpdate8
2,查询undo段的名字因为在实验过程中,我们需要使用到undo段的名字,所以这里提前查询出来,如果在生产环境,我们可以直接使用bbed去查询undo$表,或者是使用odu,dul等工具去直接抽取undo$表,另外了可以使用strings system数据文件来过滤UNDO段。
www.htz.pw > @undo_segment.sql SEGMENT_HEADER TABLESPACE SEGMENT_NAME FILE#.BLOCK STATUS SEGMENT_SIZE(M)——————– —————————— ————– ———- —————SYSTEM.OLD PRI.SYSTEM 1.9 ONLINE 0UNDOTBS1.CURRENT PUB._SYSSMU1$ 2.9 ONLINE 1UNDOTBS1.CURRENT PUB._SYSSMU10$ 2.153 ONLINE 1UNDOTBS1.CURRENT PUB._SYSSMU9$ 2.137 ONLINE 13UNDOTBS1.CURRENT PUB._SYSSMU8$ 2.121 ONLINE 18UNDOTBS1.CURRENT PUB._SYSSMU7$ 2.105 ONLINE 0UNDOTBS1.CURRENT PUB._SYSSMU6$ 2.89 ONLINE 6UNDOTBS1.CURRENT PUB._SYSSMU5$ 2.73 ONLINE 2UNDOTBS1.CURRENT PUB._SYSSMU4$ 2.57 ONLINE 0UNDOTBS1.CURRENT PUB._SYSSMU3$ 2.41 ONLINE 1UNDOTBS1.CURRENT PUB._SYSSMU2$ 2.25 ONLINE 0
3,生成创建控制文件脚本
[oracle@www.htz.pw sql]$./create_controlfile_sql.sh please input direcotry default /tmp:please input file name default control.ctl:Database altered.
这里生成的默认文件位置在/tmp/control.ctl4,重建控制文件
www.htz.pw > shutdown abort;ORACLE instance shut down. STARTUP NOMOUNTCREATE CONTROLFILE REUSE DATABASE "ORCL1024" NORESETLOGS NOARCHIVELOGMAXLOGFILES 16MAXLOGMEMBERS 3MAXDATAFILES 100MAXINSTANCES 8MAXLOGHISTORY 292LOGFILEGROUP 1 ‘/oracle/app/oracle/oradata/orcl1024/redo01.log’ SIZE 50M,GROUP 2 ‘/oracle/app/oracle/oradata/orcl1024/redo02.log’ SIZE 50M,GROUP 3 ‘/oracle/app/oracle/oradata/orcl1024/redo03.log’ SIZE 50MDATAFILE‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’,‘/oracle/app/oracle/oradata/orcl1024/undotbs01.dbf’,(需要删除这行)‘/oracle/app/oracle/oradata/orcl1024/sysaux01.dbf’,‘/oracle/app/oracle/oradata/orcl1024/users01.dbf’CHARACTER SET ZHS16GBK;RECOVER DATABASEALTER DATABASE OPEN;ALTER TABLESPACE TEMP ADD TEMPFILE ‘/oracle/app/oracle/oradata/orcl1024/temp01.dbf’ SIZE 1482M REUSE AUTOEXTEND ON NEXT 655360 MAXSIZE 32767M;
5,故障现象出现
www.htz.pw > recover database using backup controlfile until cancel;ORA-00279: change 2170641 generated at 07/19/2014 00:35:54 needed for thread 1ORA-00289: suggestion :/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_72_%u_.arcORA-00280: change 2170641 for thread 1 is in sequence #72 Specify log: {<RET>=suggested | filename | AUTO | CANCEL}/oracle/app/oracle/oradata/orcl1024/redo02.logORA-00310: archived log contains sequence 71; sequence 72 requiredORA-00334: archived log: ‘/oracle/app/oracle/oradata/orcl1024/redo02.log’ ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error belowORA-01194: file 1 needs more recovery to be consistentORA-01110: data file 1: ‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’ www.htz.pw > recover database using backup controlfile until cancel;ORA-00279: change 2170641 generated at 07/19/2014 00:35:54 needed for thread 1ORA-00289: suggestion :/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_72_%u_.arcORA-00280: change 2170641 for thread 1 is in sequence #72 Specify log: {<RET>=suggested | filename | AUTO | CANCEL}/oracle/app/oracle/oradata/orcl1024/redo03.logLog applied.Media recovery complete. www.htz.pw > alter database open resetlogs;alter database open resetlogs*ERROR at line 1:ORA-01092: ORACLE instance terminated. Disconnection forced
alert中出现下面的报错
Sat Jul 19 00:45:47 2014SMON: enabling cache recoverySat Jul 19 00:45:47 2014Errors in file /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_12464.trc:ORA-01173: data dictionary indicates missing data file from system tablespaceSat Jul 19 00:45:47 2014Error 1173 happened during db open, shutting down databaseUSER: terminating instance due to error 1173Instance terminated by USER, pid = 12464ORA-1092 signalled during: alter database open resetlogs…
6,故障处理方法1在运气比较好的情况下使用此方案是可行的,朋友的数据库使用此方法,数据库能正常的OPEN。6.1 修改参数文件这里手动创建pfile文件,直接修改pfile文件比较简单,并且不影响原spfile文件,增加下面红色部分参数
www.htz.pw > !vi /tmp/123.oraorcl1024.__db_cache_size=54525952orcl1024.__java_pool_size=4194304orcl1024.__large_pool_size=8388608orcl1024.__shared_pool_size=88080384orcl1024.__streams_pool_size=0*._backup_ksfq_bufsz=1048576*._log_parallelism=2*._log_parallelism_max=4*._pga_max_size=5368709120*._smm_max_size=3145728*.audit_file_dest=’/oracle/app/oracle/admin/orcl1024/adump’*.background_dump_dest=’/oracle/app/oracle/admin/orcl1024/bdump’*.compatible=’10.2.0.3.0′*.control_files=’/oracle/app/oracle/oradata/orcl1024/control01.ctl’,’/oracle/app/oracle/oradata/orcl1024/control02.ctl’,’/oracle/app/oracle/oradata/orcl1024/control03.ctl’*.core_dump_dest=’/oracle/app/oracle/admin/orcl1024/cdump’*.cpu_count=3*.db_block_size=8192*.db_domain=”*.db_file_multiblock_read_count=16*.db_name=’orcl1024′*.db_recovery_file_dest=’/oracle/app/oracle/flash_recovery_area’*.db_recovery_file_dest_size=4294967296*.dbwr_io_slaves=4*.disk_asynch_io=FALSE*.dispatchers='(PROTOCOL=TCP) (SERVICE=orcl1024XDB)’*.event=”*.job_queue_processes=10*.open_cursors=300*.pga_aggregate_target=1073741824*.processes=150*.recyclebin=’OFF’*.remote_login_passwordfile=’EXCLUSIVE’*.sga_target=167772160#*.undo_management=’AUTO’*.undo_management=’manual’*.undo_tablespace=’UNDOTBS1′*.user_dump_dest=’/oracle/app/oracle/admin/orcl1024/udump’_corrupted_rollback_segments=(_SYSSMU1$,_SYSSMU2$,_SYSSMU3$,_SYSSMU4$,_SYSSMU5$,_SYSSMU6$,_SYSSMU7$,_SYSSMU8$,_SYSSMU9$,_SYSSMU10$)这里通常还需要增加下面的2个参数_allow_resetlogs_corruption=true_allow_error_simulation=true另外还可以会增加一个event,如果smon一些功能的event。
6.2 启动数据库
www.htz.pw > startup mount pfile=’/tmp/123.ora’;ORACLE instance started. Total System Global Area 167772160 bytesFixed Size 2082432 bytesVariable Size 100665728 bytesDatabase Buffers 54525952 bytesRedo Buffers 10498048 bytesDatabase mounted.www.htz.pw > recover database using backup controlfile until cancel;ORA-00279: change 2171386 generated at 07/19/2014 00:45:47 needed for thread 1ORA-00289: suggestion :/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_%u_.arcORA-00280: change 2171386 for thread 1 is in sequence #1 Specify log: {<RET>=suggested | filename | AUTO | CANCEL}/oracle/app/oracle/oradata/orcl1024/redo03.logORA-00339: archived log does not contain any redoORA-00334: archived log: ‘/oracle/app/oracle/oradata/orcl1024/redo03.log’ ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error belowORA-01194: file 1 needs more recovery to be consistentORA-01110: data file 1: ‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’ www.htz.pw > recover database using backup controlfile until cancel;ORA-00279: change 2171386 generated at 07/19/2014 00:45:47 needed for thread 1ORA-00289: suggestion :/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_%u_.arcORA-00280: change 2171386 for thread 1 is in sequence #1 Specify log: {<RET>=suggested | filename | AUTO | CANCEL}/oracle/app/oracle/oradata/orcl1024/redo01.logLog applied.Media recovery complete.www.htz.pw > alter database open resetlogs; Database altered.这里看到数据库已经正常打开,这里还需要注意观察,alert日志文件是否有异常报错。
6.3 重建undo表空间
www.htz.pw > !rm /oracle/app/oracle/oradata/orcl1024/undotbs01.dbf www.htz.pw > create undo tablespace undotbs1 datafile ‘/oracle/app/oracle/oradata/orcl1024/undotbs01.dbf’ size 10m; Tablespace created.
6.4 使用源参数文件启动数据库
www.htz.pw > shutdown immediate;Database closed.Database dismounted.ORACLE instance shut down.www.htz.pw > startupORACLE instance started. Total System Global Area 167772160 bytesFixed Size 2082432 bytesVariable Size 100665728 bytesDatabase Buffers 54525952 bytesRedo Buffers 10498048 bytesDatabase mounted.Database opened.www.htz.pw >
数据库启动正常,注意观察alert日志中是否有报错。7 故障处理方法2使用此方法,要求原来的UNDO数据文件存在,此方法就是将原来的undo数据文件再次增加到控制文件中去,此方法比较复制,因为在开启数据库的都会遇到其它很多的一些问题。7.1 故障现象重现
www.htz.pw > select open_mode from v$database; OPEN_MODE———-READ WRITE 数据库的状态是正常的www.htz.pw > select name from v$dbfile; NAME——————————————————————————–/oracle/app/oracle/oradata/orcl1024/users01.dbf/oracle/app/oracle/oradata/orcl1024/sysaux01.dbf/oracle/app/oracle/oradata/orcl1024/system01.dbf/oracle/app/oracle/oradata/orcl1024/undotbs01.dbf存在的数据文件www.htz.pw > shutdown abort;ORACLE instance shut down.重建控制文件,控制文件中不包括undo表空间www.htz.pw > @/tmp/control.ctlORACLE instance started. Total System Global Area 167772160 bytesFixed Size 2082432 bytesVariable Size 100665728 bytesDatabase Buffers 54525952 bytesRedo Buffers 10498048 bytes Control file created. ORA-00283: recovery session canceled due to errorsORA-01610: recovery using the BACKUP CONTROLFILE option must be done ALTER DATABASE OPEN*ERROR at line 1:ORA-01589: must use RESETLOGS or NORESETLOGS option for database open ALTER TABLESPACE TEMP ADD TEMPFILE ‘/oracle/app/oracle/oradata/orcl1024/temp01.dbf’*ERROR at line 1:ORA-01109: database not open www.htz.pw > www.htz.pw > recover database using backup controlfile until cancel;ORA-00279: change 2171941 generated at 07/19/2014 00:51:38 needed for thread 1ORA-00289: suggestion :/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_%u_.arcORA-00280: change 2171941 for thread 1 is in sequence #1 Specify log: {<RET>=suggested | filename | AUTO | CANCEL}cancelORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error belowORA-01194: file 1 needs more recovery to be consistentORA-01110: data file 1: ‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’ ORA-01112: media recovery not started *.remote_login_passwordfile=’EXCLUSIVE’*.sga_target=167772160#*.undo_management=’AUTO’*.undo_management=’manual’*.undo_tablespace=’UNDOTBS1′*.user_dump_dest=’/oracle/app/oracle/admin/orcl1024/udump’_allow_resetlogs_corruption=true_allow_error_simulation=true增加上面的参数文件 www.htz.pw > startup force mount pfile=’/tmp/123.ora’;ORACLE instance started. Total System Global Area 167772160 bytesFixed Size 2082432 bytesVariable Size 100665728 bytesDatabase Buffers 54525952 bytesRedo Buffers 10498048 bytesDatabase mounted.www.htz.pw > recover database using backup controlfile until cancel;ORA-00279: change 2171941 generated at 07/19/2014 00:51:38 needed for thread 1ORA-00289: suggestion :/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_%u_.arcORA-00280: change 2171941 for thread 1 is in sequence #1 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} /oracle/app/oracle/oradata/orcl1024/redo01.logLog applied.Media recovery complete.www.htz.pw > www.htz.pw > alter database open resetlogs;alter database open resetlogs*ERROR at line 1:ORA-01092: ORACLE instance terminated. Disconnection forced 重现故障现在Errors in file /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_14960.trc:ORA-00704: bootstrap process failureORA-00604: error occurred at recursive SQL level 2ORA-01173: data dictionary indicates missing data file from system tablespaceSat Jul 19 01:01:40 2014Error 704 happened during db open, shutting down databaseUSER: terminating instance due to error 704Instance terminated by USER, pid = 14960ORA-1092 signalled during: alter database open resetlogs…
7.2 重建控制文件重建控制文件,控制文件中包括undo表空间的数据文件
[oracle@www.htz.pw sql]$sqlplus / as sysdba SQL*Plus: Release 10.2.0.4.0 – Production on Sat Jul 19 01:09:05 2014 Copyright (c) 1982, 2007, Oracle. All Rights Reserved. Connected to an idle instance. www.htz.pw > startup nomount pfile=’/tmp/123.ora’;ORACLE instance started. Total System Global Area 167772160 bytesFixed Size 2082432 bytesVariable Size 100665728 bytesDatabase Buffers 54525952 bytesRedo Buffers 10498048 byteswww.htz.pw > @/tmp/control.ctlORA-01081: cannot start already-running ORACLE – shut it down firstCREATE CONTROLFILE REUSE DATABASE "ORCL1024" RESETLOGS NOARCHIVELOG*ERROR at line 1:ORA-01503: CREATE CONTROLFILE failedORA-01189: file is from a different RESETLOGS than previous filesORA-01110: data file 2: ‘/oracle/app/oracle/oradata/orcl1024/undotbs01.dbf’
这里提示ORA-01189的错误。1189的错误很简单,因为数据文件头的resetlogs信息不一致导致的。7.3 bbed修改数据文件头中RESETLOG与SCN信息
www.htz.pw > !cat /tmp/bbed.parlistfile=/tmp/bbed.datafile www.htz.pw > !cat /tmp/bbed.datafile1 /oracle/app/oracle/oradata/orcl1024/system01.dbf2 /oracle/app/oracle/oradata/orcl1024/undotbs01.dbf3 /oracle/app/oracle/oradata/orcl1024/sysaux01.dbf4 /oracle/app/oracle/oradata/orcl1024/users01.dbf[oracle@www.htz.pw ~]$bbed parfile=/tmp/bbed.parPassword: BBED: Release 2.0.0.0.0 – Limited Production on Sat Jul 19 01:12:13 2014 Copyright (c) 1982, 2007, Oracle. All rights reserved. *********** !!! For Oracle Internal Use only !!! ************* BBED> info File# Name Size(blks) —– —- ———- 1 /oracle/app/oracle/oradata/orcl1024/system01.dbf 0 2 /oracle/app/oracle/oradata/orcl1024/undotbs01.dbf 0 3 /oracle/app/oracle/oradata/orcl1024/sysaux01.dbf 0 4 /oracle/app/oracle/oradata/orcl1024/users01.dbf 0这里只需要修改上次resetlogs与SCN的值就可以了BBED> assign file 2 block 1 offset 112 = file 1 block 1 offset 112;Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) yub4 kcvfhrlc @112 0x32dc2c73 BBED> assign file 2 block 1 offset 116 = file 1 block 1 offset 116;ub4 kscnbas @116 0x002125e0 BBED> assign file 2 block 1 offset 484 = file 1 block 1 offset 484;ub1 pad @484 0xe1 BBED> assign file 2 block 1 offset 492 = file 1 block 1 offset 492;ub1 pad @492 0x74BBED> sum apply dba 2,1Check value for File 2, Block 1:current = 0x093b, required = 0x093b www.htz.pw > startup force nomount pfile=’/tmp/123.ora’;ORACLE instance started. Total System Global Area 167772160 bytesFixed Size 2082432 bytesVariable Size 100665728 bytesDatabase Buffers 54525952 bytesRedo Buffers 10498048 bytes
7.4 重建控制文件
www.htz.pw > @/tmp/control.ctlORA-01081: cannot start already-running ORACLE – shut it down first Control file created. ORA-00283: recovery session canceled due to errorsORA-01610: recovery using the BACKUP CONTROLFILE option must be done
控制文件重建成功7.5 遇到600错误
www.htz.pw > recover database using backup controlfile until cancel;ORA-00279: change 2172129 generated at 07/19/2014 00:53:08 needed for thread 1ORA-00289: suggestion :/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_%u_.arcORA-00280: change 2172129 for thread 1 is in sequence #1 Specify log: {<RET>=suggested | filename | AUTO | CANCEL}cancelORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error belowORA-01194: file 1 needs more recovery to be consistentORA-01110: data file 1: ‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’ ORA-01112: media recovery not started www.htz.pw > alter database open resetlogs;alter database open resetlogs*ERROR at line 1:ORA-01092: ORACLE instance terminated. Disconnection forcedORA-00704: bootstrap process failureORA-00704: bootstrap process failureORA-00600: internal error code, arguments: [kcbgtcr_13], [], [], [], [], [],[], []
这里触发了ORA-00600 kcbgtcr_13错误,只需要手动提交事务就可以了。7.6 手动提交事务信息
[09:50:33]www.htz.pw > startup mount pfile=’/tmp/123.ora’;[09:50:34]ORACLE instance started.[09:50:34][09:50:34]Total System Global Area 167772160 bytes[09:50:34]Fixed Size 2082432 bytes[09:50:34]Variable Size 100665728 bytes[09:50:34]Database Buffers 54525952 bytes[09:50:34]Redo Buffers 10498048 bytes[09:50:38]Database mounted.[09:50:51]www.htz.pw > recover database using backup controlfile until cancel;[09:50:51]ORA-00279: change 2172135 generated at 07/19/2014 01:41:17 needed for thread 1[09:50:51]ORA-00289: suggestion :[09:50:51]/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_[09:50:51]%u_.arc[09:50:51]ORA-00280: change 2172135 for thread 1 is in sequence #1[09:50:51][09:50:51][09:50:51]Specify log: {<RET>=suggested | filename | AUTO | CANCEL}[09:50:53]cancel[09:50:54]ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below[09:50:54]ORA-01194: file 1 needs more recovery to be consistent[09:50:54]ORA-01110: data file 1: ‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’[09:50:54][09:50:54][09:50:54]ORA-01112: media recovery not started[09:50:54][09:50:54][09:51:06]www.htz.pw > alter database open resetlogs;[09:51:09]alter database open resetlogs[09:51:09]*[09:51:09]ERROR at line 1:[09:51:09]ORA-01092: ORACLE instance terminated. Disconnection forced
后面alert报下面的错误
Errors in file /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_32544.trc:ORA-00600: internal error code, arguments: [kcbgtcr_13], [], [], [], [], [], [], []Sat Jul 19 01:43:01 2014Errors in file /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_32544.trc:ORA-00704: bootstrap process failureORA-00704: bootstrap process failureORA-00600: internal error code, arguments: [kcbgtcr_13], [], [], [], [], [], [], []Sat Jul 19 01:43:01 2014Error 704 happened during db open, shutting down database
在trace文件中查看有那些块没有提交。
[oracle@www.htz.pw ~]$grep -E ‘^Block header dump|^0x0’ /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_32544.trc0x01 0x1f64 0x02 0x1ef8 0x01 0x1f64 0x02 0x1ef8 0x01 0x1f64 0x02 0x1ef8 0x01 0x1f64 0x02 0x1ef8 0x01 0x1f64 0x02 0x1ef8 0x01 0x1f64 0x02 0x1ef8 Block header dump: 0x0040007a0x01 0x0003.001.00000191 0x0080002b.014c.03 —- 1 fsc 0x0000.00000000Block header dump: 0x0040017c0x01 0x0000.022.00000002 0x00400196.0004.37 –U- 12 fsc 0x0000.00000147Block header dump: 0x004000da0x01 0x0004.00c.0000011d 0x0080559d.00d3.02 C— 0 scn 0x0000.0008ab18Block header dump: 0x004000db0x01 0x0008.017.00000002 0x00800080.0000.01 CBU- 0 scn 0x0000.000024040×02 0x0004.01a.0000017a 0x0080003c.016b.32 –U- 1 fsc 0x000e.001e8de0Block header dump: 0x0040007a0x01 0x0003.001.00000191 0x0080002b.014c.03 —- 1 fsc 0x0000.00000000Block header dump: 0x0040006a0x01 0x0000.008.00000034 0x0040019e.003b.07 C— 0 scn 0x0000.0021245c
通过10046跟踪报错的SQL语句
ksedmp: internal or fatal errorORA-00600: internal error code, arguments: [kcbgtcr_13], [], [], [], [], [], [], []Current SQL statement for this session:select ctime, mtime, stime from obj$ where obj# = :1—– Call Stack Trace —–这里看到了报错的SQL语句,以SQL语句来搜索,直到搜索到如下的Cursor#5(0x2a97ca18b0) state=FETCH curiob=0x2a97cba468 curflg=f fl2=0 par=0x2a97ca1710 ses=0x69f82a30 sqltxt(0x69a944b0)=select ctime, mtime, stime from obj$ where obj# = :1 hash=fa0bd3f60d6ee4f2495f9af8199b75b9 parent=0x6677c4b8 maxchild=01 plk=0x66f56af0 ppn=ncursor instantiation=0x2a97cba468 used=1405705379 child#0(0x69a94288) pcs=0x6677c0c8 clk=0x66f56dd0 ci=0x6677b7b0 pn=0x69ad37f0 ctx=0x6616fe90 kgsccflg=0 llk[0x2a97cba470,0x2a97cba470] idx=0 xscflg=e0141476 fl2=45000401 fl3=4022210c fl4=100 Bind bytecodes Opcode = 1 Unoptimized Offsi = 48, Offsi = 0kkscoacd Bind#0 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0 kxsbbbfp=2a97cba020 bln=22 avl=02 flg=05 value=20
这里中可以看到绑定变量的值是20.在相同的版本其它的数据库中执行下面的操作
SQL> select rowid from obj$ where obj# =20; ROWID——————AAAAASAABAAAAB6AAA SQL> @rowid_to_info.sqlEnter value for rowid: AAAAASAABAAAAB6AAAROWID_TYPE: 1OBJECT_NUMBER: 18RELATIVE_FNO: 1BLOCK_NUMBER: 122ROW_NUMBER: 0 PL/SQL procedure successfully completed. 正在好trace文件中的Block header dump: 0x0040007a0x01 0x0003.001.00000191 0x0080002b.014c.03 —- 1 fsc 0x0000.00000000
其实我们还可以从10046trace文件中找到此信息如下:
===================== PARSING IN CURSOR #5 len=52 dep=1 uid=0 oct=3 lid=0 tim=1372757926978214 hv=429618617 ad=’69a944b0′select ctime, mtime, stime from obj$ where obj# = :1END OF STMTPARSE #5:c=0,e=234,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=1372757926978212BINDS #5:kkscoacd Bind#0 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0 kxsbbbfp=2a97cba020 bln=22 avl=02 flg=05 value=20EXEC #5:c=0,e=330,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=1372757926978586WAIT #5: nam=’db file sequential read’ ela= 23 file#=1 block#=218 blocks=1 obj#=-1 tim=1372757926978753WAIT #5: nam=’db file sequential read’ ela= 9 file#=1 block#=219 blocks=1 obj#=-1 tim=1372757926978804WAIT #5: nam=’db file sequential read’ ela= 7 file#=1 block#=122 blocks=1 obj#=-1 tim=1372757926978841
这里需要注意的是绑定变量的值。
在trace文件中可以发现下面的内容 tab 0, row 26, @0x18f1tl: 70 fb: –H-FL– lb: 0x1 cc: 17col 0: [ 2] c1 02col 1: [ 4] c3 06 17 08col 2: [ 1] 80col 3: [12] 5f 4e 45 58 54 5f 4f 42 4a 45 43 54col 4: [ 2] c1 02col 5: *NULL*col 6: [ 1] 80col 7: [ 7] 78 6c 03 0c 01 28 31col 8: [ 7] 78 72 07 13 01 3b 01col 9: [ 7] 78 6c 03 0c 01 28 31col 10: [ 1] 80col 11: *NULL*col 12: *NULL*col 13: [ 1] 80col 14: *NULL*col 15: [ 1] 80col 16: [ 4] c3 07 38 24
bbed手动提交事务,需要更改itl与行中的lck值
BBED> p ktbbhstruct ktbbh, 48 bytes @20 ub1 ktbbhtyp @20 0x01 (KDDBTDATA) union ktbbhsid, 4 bytes @24 ub4 ktbbhsg1 @24 0x00000012 ub4 ktbbhod1 @24 0x00000012 struct ktbbhcsc, 8 bytes @28 ub4 kscnbas @28 0x0021251b ub2 kscnwrp @32 0x0000 b2 ktbbhict @36 1 ub1 ktbbhflg @38 0x02 (NONE) ub1 ktbbhfsl @39 0x00 ub4 ktbbhfnx @40 0x00000000 struct ktbbhitl[0], 24 bytes @44 struct ktbitxid, 8 bytes @44 ub2 kxidusn @44 0x0003 ub2 kxidslt @46 0x0001 ub4 kxidsqn @48 0x00000191 struct ktbituba, 8 bytes @52 ub4 kubadba @52 0x0080002b ub2 kubaseq @56 0x014c ub1 kubarec @58 0x03 ub2 ktbitflg @60 0x0001 (NONE) union _ktbitun, 2 bytes @62 b2 _ktbitfsc @62 0 ub2 _ktbitwrp @62 0x0000 ub4 ktbitbas @64 0x00000000BBED> modify /x 80 offset 61Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y File: /oracle/app/oracle/oradata/orcl1024/system01.dbf (1) Block: 122 Offsets: 61 to 70 Dba:0x0040007a———————————————————————— 80000000 00000000 016c BBED> x /rn *kdbr[26]rowdata[5278] @6453 ————-flag@6453: 0x2c (KDRHFL, KDRHFF, KDRHFH)lock@6454: 0x01cols@6455: 17 col 0[2] @6456: 1 col 1[4] @6459: 52207 col 2[1] @6464: 0 col 3[12] @6466: -0 col 4[2] @6479: 1 col 5[0] @6482: *NULL*col 6[1] @6483: 0 col 7[7] @6485: -0 col 8[7] @6493: -0 col 9[7] @6501: -0 col 10[1] @6509: 0 col 11[0] @6511: *NULL*col 12[0] @6512: *NULL*col 13[1] @6513: 0 col 14[0] @6515: *NULL*col 15[1] @6516: 0 col 16[4] @6518: 65535 BBED> modify /x 00 offset 6454 File: /oracle/app/oracle/oradata/orcl1024/system01.dbf (1) Block: 122 Offsets: 6454 to 6463 Dba:0x0040007a———————————————————————— 001102c1 0204c306 1708 BBED> sum applyCheck value for File 1, Block 122:current = 0x3d20, required = 0x3d20 BBED> verifyDBVERIFY – Verification startingFILE = /oracle/app/oracle/oradata/orcl1024/system01.dbfBLOCK = 122 DBVERIFY – Verification complete Total Blocks Examined : 1Total Blocks Processed (Data) : 1Total Blocks Failing (Data) : 0Total Blocks Processed (Index): 0Total Blocks Failing (Index): 0Total Blocks Empty : 0Total Blocks Marked Corrupt : 0Total Blocks Influx : 0
7.7 报00600坏块的错误
www.htz.pw > alter database open resetlogs;alter database open resetlogs*ERROR at line 1:ORA-01092: ORACLE instance terminated. Disconnection forced Sat Jul 19 01:56:37 2014Errors in file /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_1894.trc:ORA-00604: error occurred at recursive SQL level 1ORA-00607: Internal error occurred while making a change to a data blockORA-00600: internal error code, arguments: [kddummy_blkchk], [1], [106], [6101], [], [], [], []Error 604 happened during db open, shutting down databaseUSER: terminating instance due to error 604Instance terminated by USER, pid = 1894ORA-1092 signalled during: alter database open resetlogs…这里可以看到数据文件1,块106,出现了6101的错误。此错误由于是ITL中的值与LOCK不一致导致的。
bbed修改行的lock值
BBED> set dba 1,106 DBA 0x0040006a (4194410 1,106) BBED> verifyDBVERIFY – Verification startingFILE = /oracle/app/oracle/oradata/orcl1024/system01.dbfBLOCK = 106 Block Checking: DBA = 4194410, Block Type = KTB-managed data blockdata header at 0x2a97696244kdbchk: row locked by non-existent transaction table=0 slot=10 lockid=1 ktbbhitc=1Block 106 failed with check code 6101 DBVERIFY – Verification complete Total Blocks Examined : 1Total Blocks Processed (Data) : 1Total Blocks Failing (Data) : 1Total Blocks Processed (Index): 0Total Blocks Failing (Index): 0Total Blocks Empty : 0Total Blocks Marked Corrupt : 0Total Blocks Influx : 0
此报错的修改见6101(row locked by non-existent transaction)§ion-id={ED5D037D-0C46-4246-BE5B-55253274C854}&page-id={1790402E-23BC-419C-ABCB-E1499735E049}&base-path=https://d.docs.live.net/e6b0cd9e2335d47f/huang/ORACLE/backup_migrate/%E5%BC%82%E5%B8%B8%E6%81%A2%E5%A4%8D)7.8 启动数据库通过上面几步操作,再次启动数据库
www.htz.pw > startup mount pfile=’/tmp/123.ora’;ORACLE instance started. Total System Global Area 167772160 bytesFixed Size 2082432 bytesVariable Size 100665728 bytesDatabase Buffers 54525952 bytesRedo Buffers 10498048 bytesDatabase mounted.www.htz.pw > recover database using backup controlfile unitl cancel;ORA-00905: missing keyword www.htz.pw > recover database using backup controlfile until cancel;ORA-00279: change 2172139 generated at 07/19/2014 01:56:34 needed for thread 1ORA-00289: suggestion :/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_%u_.arcORA-00280: change 2172139 for thread 1 is in sequence #1 Specify log: {<RET>=suggested | filename | AUTO | CANCEL}cancelORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error belowORA-01194: file 1 needs more recovery to be consistentORA-01110: data file 1: ‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’ ORA-01112: media recovery not started www.htz.pw > alter database open resetlogs; Database altered. 使用原参数能正常启动数据库。www.htz.pw > startup force;ORACLE instance started. Total System Global Area 167772160 bytesFixed Size 2082432 bytesVariable Size 100665728 bytesDatabase Buffers 54525952 bytesRedo Buffers 10498048 bytesDatabase mounted.Database opened.www.htz.pw >
整个实验测试结束