May 31, 2011
We hit an ASM issue two weeks ago, and here is a little background. This is Oracle 11.1.0.7.0 using ASMLIB. I have changed the original disk names and other details to avoid a policy violation.
We wanted more space on the LeftHand SCSI storage, so the SAs added one 50GB disk. While adding it, the SA accidentally overwrote DISK1, which was already in use. Thanks to ASM, DISK1 was invalidated and its data was rebalanced to the other disks. The issue didn't cause any outage, and we were happy until the next maintenance on this database.
This time we wanted to move these disks to SAN, and we use external redundancy for this disk group. The best way is to add the new disks and drop the old ones, which needs no downtime. Below is the output before starting any work. You can see that DISK1 is marked as a CANDIDATE disk; it effectively no longer exists because of the problem mentioned above.
| GROUP_NUMBER | DISK_NUMBER | MOUNT_S | HEADER_STATU | MODE_ST | STATE | NAME | LABEL | PATH |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | CACHED | CANDIDATE | ONLINE | NORMAL | DISK001 | DISK001 | ORCL:DISK001 |
| 1 | 1 | CACHED | MEMBER | ONLINE | NORMAL | DISK002 | DISK002 | ORCL:DISK002 |
| 1 | 2 | CACHED | MEMBER | ONLINE | NORMAL | DISK003 | DISK003 | ORCL:DISK003 |
| 1 | 3 | CACHED | MEMBER | ONLINE | NORMAL | DISK004 | DISK004 | ORCL:DISK004 |
| 1 | 4 | CACHED | MEMBER | ONLINE | NORMAL | DISK005 | DISK005 | ORCL:DISK005 |
| 1 | 5 | CACHED | MEMBER | ONLINE | NORMAL | DISK006 | DISK006 | ORCL:DISK006 |
| 2 | 0 | CACHED | MEMBER | ONLINE | NORMAL | DISK007 | DISK007 | ORCL:DISK007 |
| 2 | 1 | CACHED | MEMBER | ONLINE | NORMAL | DISK008 | DISK008 | ORCL:DISK008 |
| 0 | 0 | CACHED | PROVISIONED | ONLINE | NORMAL | | DISK009 | ORCL:DISK009 |
| 0 | 1 | CACHED | PROVISIONED | ONLINE | NORMAL | | DISK010 | ORCL:DISK010 |
There were about 25-30 disks provisioned, so I cut the rest of the output. As you can see, these two disks are ready to add. While adding them we got this error.

    SQL> ALTER DISKGROUP XXXXXXDATA ADD DISK 'ORCL:DISK009' SIZE 51200M ,'ORCL:DISK010' SIZE 51200M
    NOTE: Assigning number (1,6) to disk (ORCL:DISK009)
    NOTE: Assigning number (1,7) to disk (ORCL:DISK010)
    NOTE: requesting all-instance membership refresh for group=1
    NOTE: initializing header on grp 1 disk DISK009
    NOTE: initializing header on grp 1 disk DISK010
    NOTE: cache opening disk 6 of grp 1: DISK009 label:DISK009
    NOTE: cache opening disk 7 of grp 1: DISK010 label:DISK010
    NOTE: requesting all-instance disk validation for group=1
    Wed May 25 17:36:50 2011
    NOTE: disk validation pending for group 1/0xb4587064 (XXXXXXDATA)
    SUCCESS: validated disks for 1/0xb4587064 (XXXXXXDATA)
    ERROR: ORA-15075 signalled during reconfiguration of diskgroup XXXXXXDATA
    NOTE: membership refresh pending for group 1/0xb4587064 (XXXXXXDATA)
    kfdp_query(XXXXXXDATA): 7
    Wed May 25 17:36:56 2011
    kfdp_queryBg(): 7
    kfdp_query(XXXXXXDATA): 8
    kfdp_queryBg(): 8
    NOTE: cache closing disk 6 of grp 1: DISK009 label:DISK009
    NOTE: cache closing disk 6 of grp 1: DISK009 label:DISK009
    NOTE: De-assigning number (1,6) from disk (ORCL:DISK009)
    NOTE: cache closing disk 7 of grp 1: DISK010 label:DISK010
    NOTE: cache closing disk 7 of grp 1: DISK010 label:DISK010
    NOTE: De-assigning number (1,7) from disk (ORCL:DISK010)
    kfdp_query(XXXXXXDATA): 9
    kfdp_queryBg(): 9
    SUCCESS: refreshed membership for 1/0xb4587064 (XXXXXXDATA)
    Wed May 25 17:36:59 2011
    ORA-15032: not all alterations performed
    ORA-15075: disk(s) are not visible cluster-wide
    ERROR: ALTER DISKGROUP XXXXXXDATA ADD DISK 'ORCL:DISK009' SIZE 51200M ,'ORCL:DISK010' SIZE 51200M
    Wed May 25 17:37:07 2011
    SQL> ALTER DISKGROUP XXXXXXDATA ADD DISK 'ORCL:DISK009' SIZE 51200M ,'ORCL:DISK010' SIZE 51200M
    NOTE: Assigning number (1,8) to disk (ORCL:DISK009)
    NOTE: Assigning number (1,9) to disk (ORCL:DISK010)
    NOTE: requesting all-instance membership refresh for group=1
    NOTE: De-assigning number (1,8) from disk (ORCL:DISK009)
    NOTE: De-assigning number (1,9) from disk (ORCL:DISK010)
    ERROR: ORA-15033 signalled during reconfiguration of diskgroup XXXXXXDATA

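ORA-15075 says the disks are not visible cluster-wide, so one quick check is whether every node lists the same ASMLIB labels. Here is a minimal sketch, assuming you have saved the output of `oracleasm listdisks` from each node into a plain file with one label per line. The function name `labels_missing` and the file names in the usage note are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the labels that appear in the first list but
# not in the second. Each argument is a file with one ASMLIB label per
# line, for example the saved output of `oracleasm listdisks` on one node.
labels_missing() {
    ref_list=$1
    node_list=$2
    # -F fixed strings, -x whole-line match, -v invert the match:
    # keep the lines of ref_list that do not occur in node_list.
    # `|| true` hides grep's exit status 1 when nothing is missing.
    grep -Fxv -f "$node_list" "$ref_list" || true
}
```

For example, after running `oracleasm listdisks > /tmp/node1.lst` on one node and the same into `/tmp/node2.lst` on the other, `labels_missing /tmp/node1.lst /tmp/node2.lst` would print the labels the second node cannot see. This only compares labels; it says nothing about permissions or multipath.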
There could be many reasons for this, such as scandisks not having been run on all nodes, permissions, and so on. In our case, multipath was probably not set up. For us, the plan was to remove these two disks, let the SAs correct the issue, and then try to add them back.

    SQL> alter diskgroup XXXXXXDATA drop disk 'ORCL:DISK009'
    ORA-15032: not all alterations performed
    ORA-15054: disk "ORCL:DISK009" does not exist in diskgroup "XXXXXXDATA"
    ERROR: alter diskgroup XXXXXXDATA drop disk 'ORCL:DISK009'
    Wed May 25 18:32:48 2011
    SQL> alter diskgroup XXXXXXDATA drop disk 'ORCL:DISK010'
    ORA-15032: not all alterations performed
    ORA-15054: disk "ORCL:DISK010" does not exist in diskgroup "XXXXXXDATA"
    ERROR: alter diskgroup XXXXXXDATA drop disk 'ORCL:DISK010'
    Wed May 25 18:33:04 2011
    SQL> alter diskgroup XXXXXXDATA drop disk 'ORCL:DISK010' force
    ORA-15032: not all alterations performed
    ORA-15054: disk "ORCL:DISK010" does not exist in diskgroup "XXXXXXDATA"
    ERROR: alter diskgroup XXXXXXDATA drop disk 'ORCL:DISK010' force

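So you can't drop these disks. To remove a disk in this state you have to wipe its header using something like this (ORCL:DISK009 here stands for the block device behind that ASMLIB label, since dd needs a real device path):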
dd if=/dev/zero of=ORCL:DISK009 bs=4096 count=5000
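Since that dd is destructive, it may be worth copying the same region aside first. This is a minimal sketch rather than part of the original procedure; `save_region` and `restore_region` are hypothetical helper names, and the 4096-byte block size and 5000-block default simply mirror the command above.

```shell
#!/bin/sh
# Hypothetical helper: copy the first <count> 4096-byte blocks of <dev>
# into <out>, so the region can be written back if the wrong device was
# picked. <count> defaults to 5000, matching the dd command above.
save_region() {
    dev=$1
    out=$2
    count=${3:-5000}
    dd if="$dev" of="$out" bs=4096 count="$count" 2>/dev/null
}

# Hypothetical helper: write a saved region back to the start of <dev>.
# conv=notrunc matters when <dev> is an image file rather than a device.
restore_region() {
    saved=$1
    dev=$2
    dd if="$saved" of="$dev" bs=4096 conv=notrunc 2>/dev/null
}
```

Both helpers work on a regular file as well as a block device, so they can be tried on a scratch image before going anywhere near a real disk.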
But as we didn't have any data on them, and this was a QA database, we planned some downtime and let the SAs remove the SAN disks and reconfigure as needed. When they rebooted the box, the issue got worse: we could no longer see the DATA disk group, and ASM was not able to mount it either.

    SQL> ALTER DISKGROUP ALL MOUNT
    NOTE: cache registered group XXXXXXDATA number=1 incarn=0xb7f88ac9
    NOTE: cache began mount (first) of group XXXXXXDATA number=1 incarn=0xb7f88ac9
    NOTE: cache registered group XXXXXXFLASH number=2 incarn=0xb7f88aca
    NOTE: cache began mount (first) of group XXXXXXFLASH number=2 incarn=0xb7f88aca
    NOTE:Loaded lib: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
    NOTE: Assigning number (1,1) to disk (ORCL:DISK002)
    NOTE: Assigning number (1,2) to disk (ORCL:DISK003)
    NOTE: Assigning number (1,3) to disk (ORCL:DISK004)
    NOTE: Assigning number (1,4) to disk (ORCL:DISK005)
    NOTE: Assigning number (1,5) to disk (ORCL:DISK006)
    NOTE: Assigning number (1,8) to disk (ORCL:DISK009)
    NOTE: Assigning number (1,9) to disk (ORCL:DISK010)
    ERROR: no PST quorum in group 1: required 1, found 0
    NOTE: cache dismounting group 1/0xB7F88AC9 (XXXXXXDATA)
    NOTE: dbwr not being msg'd to dismount
    NOTE: lgwr not being msg'd to dismount
    NOTE: cache dismounted group 1/0xB7F88AC9 (XXXXXXDATA)
    NOTE: cache ending mount (fail) of group XXXXXXDATA number=1 incarn=0xb7f88ac9
    kfdp_dismount(): 2
    kfdp_dismountBg(): 2
    NOTE: De-assigning number (1,1) from disk (ORCL:DISK002)
    NOTE: De-assigning number (1,2) from disk (ORCL:DISK003)
    NOTE: De-assigning number (1,3) from disk (ORCL:DISK004)
    NOTE: De-assigning number (1,4) from disk (ORCL:DISK005)
    NOTE: De-assigning number (1,5) from disk (ORCL:DISK006)
    NOTE: De-assigning number (1,8) from disk (ORCL:DISK009)
    NOTE: De-assigning number (1,9) from disk (ORCL:DISK010)
    ERROR: diskgroup XXXXXXDATA was not mounted
    NOTE: Assigning number (2,0) to disk (ORCL:DISK007)
    NOTE: Assigning number (2,1) to disk (ORCL:DISK008)
    NOTE: start heartbeating (grp 2)
    kfdp_query(XXXXXXFLASH): 5
    kfdp_queryBg(): 5
    NOTE: cache opening disk 0 of grp 2: DISK007 label:DISK007
    NOTE: F1X0 found on disk 0 fcn 0.0
    NOTE: cache opening disk 1 of grp 2: DISK008 label:DISK008
    NOTE: cache mounting (first) group 2/0xB7F88ACA (XXXXXXFLASH)
    * allocate domain 2, invalid = TRUE
    kjbdomatt send to node 1
    NOTE: attached to recovery domain 2
    NOTE: cache recovered group 2 to fcn 0.144346
    NOTE: LGWR attempting to mount thread 1 for diskgroup 2
    NOTE: LGWR mounted thread 1 for disk group 2
    NOTE: opening chunk 1 at fcn 0.144298 ABA
    NOTE: seq=13 blk=8268
    NOTE: cache mounting group 2/0xB7F88ACA (XXXXXXFLASH) succeeded
    NOTE: cache ending mount (success) of group XXXXXXFLASH number=2 incarn=0xb7f88aca
    kfdp_query(XXXXXXFLASH): 6
    kfdp_queryBg(): 6
    NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 2
    SUCCESS: diskgroup XXXXXXFLASH was mounted
    ORA-15032: not all alterations performed
    ORA-15063: ASM discovered an insufficient number of disks for diskgroup "XXXXXXDATA"
    ERROR: ALTER DISKGROUP ALL MOUNT

So now the issue is DISK1, the disk that was wrongly labeled last time. Oracle is looking for it and can no longer find it. The disk that was wrongly overwritten had been given the DISK035 label and header, but it is actually disk 01.

    ----------------------------- DISK REPORT N0008 ------------------------------
    Disk Path: ORCL:DISK035
    Unique Disk ID:
    Disk Label: DISK035
    Physical Sector Size: 512 bytes
    Disk Size: 51200 megabytes
    NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb
    kfbh.endian: 0 ; 0x000: 0x00
    kfbh.hard: 0 ; 0x001: 0x00
    kfbh.type: 0 ; 0x002: KFBTYP_INVALID
    kfbh.datfmt: 0 ; 0x003: 0x00
    kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
    kfbh.block.obj: 0 ; 0x008: TYPE=0x0 NUMB=0x0
    kfbh.check: 0 ; 0x00c: 0x00000000
    kfbh.fcn.base: 0 ; 0x010: 0x00000000
    kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
    kfbh.spare1: 0 ; 0x018: 0x00000000
    kfbh.spare2: 0 ; 0x01c: 0x00000000

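Every kfbh field in the report above is zero and the type is KFBTYP_INVALID, which is what a zeroed header block would look like. Here is a rough way to check for that from the shell. It assumes the header occupies the first 4096-byte block of the device, which the post does not state. `is_zero_block` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the first block of <dev>
# contains only zero bytes. The block size defaults to 4096, an assumption.
is_zero_block() {
    dev=$1
    size=${2:-4096}
    # Read one block, delete every NUL byte, and count what is left.
    nonzero=$(dd if="$dev" bs="$size" count=1 2>/dev/null \
        | tr -d '\000' | wc -c | tr -d ' ')
    [ "$nonzero" -eq 0 ]
}
```

The check only reads from the device. Note that it also succeeds on an empty or unreadable file, so it is a hint, not proof, that a header was wiped.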
But now comes the cool part. We opened an SR1 and learned something that others may already know, but it was new to me. As we know, the disk header had been wiped out.
Oracle keeps a backup copy of the disk header structure at AUNUM 1, block number 254, and you can retrieve the information from there. So kfed came to the rescue, reading that copy and merging it back:
kfed read /dev/d1 aun=1 blkn=254 text=/tmp/d1.log
kfed merge /dev/d1 text=/tmp/d1.log
And then run scandisks (`oracleasm scandisks`). That's it, everything came up fine.
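The backup location that kfed reads (aun=1, blkn=254) can also be expressed as a byte offset. Here is a minimal sketch of the arithmetic, assuming 1 MB allocation units and 4096-byte metadata blocks. The post states neither size, so treat both as assumptions. `block_offset` and `read_block` are hypothetical helper names.

```shell
#!/bin/sh
# Assumed sizes, not stated in the post: 1 MB allocation units and
# 4096-byte metadata blocks. Override via the environment if yours differ.
AU_SIZE=${AU_SIZE:-1048576}
BLK_SIZE=${BLK_SIZE:-4096}

# Print the byte offset of block <blkn> inside allocation unit <aun>.
block_offset() {
    aun=$1
    blkn=$2
    echo $(( aun * AU_SIZE + blkn * BLK_SIZE ))
}

# Copy one metadata block from a device or image file to stdout.
# This only reads; it never writes to <dev>.
read_block() {
    dev=$1
    off=$(block_offset "$2" "$3")
    dd if="$dev" bs="$BLK_SIZE" skip=$(( off / BLK_SIZE )) count=1 2>/dev/null
}
```

Under these assumptions `block_offset 1 254` prints 2088960, that is 1048576 + 254 × 4096. `read_block /dev/d1 1 254 | od -c | head` would then show the raw bytes that `kfed read` decodes.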