Collected problems encountered when installing 11g R2 RAC

2026-01-10 | Linux/AIX / Oracle / RAC / RMAN

Posted 2014-11-26 15:05:21

11.2.0.3 RAC on Windows with more than 32 CPU cores: the installation blue-screens immediately.

Bug: 14276345

Workaround: reduce the number of cores, then apply the patches:

Patch:14613222

Patch:14613223

##################################################################################################

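1. HP-UX: no nodes are listed when installing 11g R2 ("Vendor Clusterware Detected. Cluster configuration cannot be modified")

When installing RAC 11.2.0.3, the installer shows no nodes and reports: Vendor Clusterware detected. Cluster configuration cannot be modified

Cause: the Oracle Grid installer has detected third-party cluster software, or traces of it even though the software itself is not actually present.

So how does Oracle recognize third-party clusterware?

When third-party clusterware is installed, it creates an Oracle RAC interface file in a specific directory. This file provides cluster membership (CM) information.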

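On HP-UX the file is /opt/nmapi/nmapi2/lib/pa20_64 (the steps below use the hpux64 subdirectory instead, so check which one exists on your system); on AIX/Solaris/Linux it is /opt/ORCLcluster/lib/libskgxn2.so.

On AIX you can delete /opt/ORCL*, /var/ha and /var/hacmp.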
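
Before starting the installer, it can help to check whether any of these interface paths exist on the node. The following is only a sketch: the function name `find_vendor_cm_files` and its optional root-prefix argument are hypothetical, and the path list contains only the locations mentioned in this post.

```shell
#!/bin/sh
# Hypothetical helper: print the vendor cluster-membership interface paths
# that exist on this machine. The path list comes only from this post.
# The optional first argument is a root prefix (default: empty, meaning the
# real filesystem), which makes a dry run against a scratch directory possible.
find_vendor_cm_files() {
    root="${1:-}"
    for p in \
        /opt/nmapi/nmapi2/lib/pa20_64 \
        /opt/nmapi/nmapi2/lib/hpux64 \
        /opt/ORCLcluster/lib/libskgxn2.so
    do
        # -e matches both files and directories
        if [ -e "${root}${p}" ]; then
            echo "$p"
        fi
    done
}

# Example (not executed here):
#   find_vendor_cm_files      # prints nothing when no interface path is present
```

If the function prints anything, the installer may treat the node as running vendor clusterware.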

Solution:

  1. Use "cmviewcl" to confirm that the vendor clusterware is down

cmviewcl

  2. Back up and remove the vendor clusterware interface file

su - root

cd /opt/nmapi/nmapi2/lib/

mv hpux64 hpux64_back

  3. Re-run the Oracle installer

After that, the installation completed.

##################################################################################################

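2. Common problems on AIX

A routine problem: forgetting to check the attributes of the storage disks.

Checking the disk attributes of the storage devices

For ESS, EMC, HDS, CLARiiON and MPIO-capable SAN storage devices, the disk attribute reserve_policy must be set to no_reserve, so that the disks are not locked when RAC is installed (some non-MPIO drivers expose this as reserve_lock=no instead; check the lsattr output):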

chdev -l hdisk[n] -a reserve_policy=no_reserve

Use these commands to inspect the disk attributes:

/usr/sbin/lsdev -Cc disk
/usr/sbin/lsattr -El hdisk[n]
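
The check above can be scripted so that every shared disk is verified before the installer runs. This is a minimal sketch: the function names are hypothetical, and it assumes `lsattr -El` prints one attribute per line with the attribute name in the first column and its value in the second.

```shell
#!/bin/sh
# Hypothetical helper: read `lsattr -El hdiskN` output on stdin and print the
# value column of the reserve_policy attribute (prints nothing if absent).
reserve_policy_of() {
    awk '$1 == "reserve_policy" { print $2 }'
}

# Succeeds (exit 0) when the policy read from stdin is anything other than
# no_reserve. A missing attribute also counts as "needs attention".
needs_no_reserve() {
    [ "$(reserve_policy_of)" != "no_reserve" ]
}

# Intended use on AIX (not executed here; hdisk2 is a placeholder):
#   if /usr/sbin/lsattr -El hdisk2 | needs_no_reserve; then
#       echo "hdisk2: reserve_policy is not no_reserve"
#   fi
```

Keeping the parsing separate from the `lsattr` call makes it easy to try the logic against saved output first.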

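(1) Network parameter problems

Many network parameters may be reported as wrong, for example:

Oracle requires sb_max to be 4194304, but the current value is 1310720, which is far too low. Reset sb_max with the following command: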

/usr/sbin/no -p -o sb_max=4194304
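
To confirm the setting afterwards, the current value can be compared with the required one. The sketch below makes two assumptions: `no -o sb_max` prints a line shaped like `sb_max = 1310720`, and the required value is treated as a minimum. The function name `sb_max_ok` is hypothetical.

```shell
#!/bin/sh
# Required value quoted in this post; treated here as a minimum (assumption).
REQUIRED_SB_MAX=4194304

# Hypothetical helper: read a line such as "sb_max = 1310720" on stdin and
# succeed (exit 0) only when the value is at least REQUIRED_SB_MAX.
sb_max_ok() {
    current=$(awk -F= '$1 ~ /sb_max/ { gsub(/[^0-9]/, "", $2); print $2 }')
    [ -n "$current" ] && [ "$current" -ge "$REQUIRED_SB_MAX" ]
}

# Intended use on AIX (not executed here):
#   /usr/sbin/no -o sb_max | sb_max_ok || echo "sb_max is below $REQUIRED_SB_MAX"
```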

(2) HACMP incompatibility

Running root.sh during the Grid installation failed with the following error:

root@mtdb1:/u01/app/11.2.0/grid# ./root.sh

Performing root user operation for Oracle 11g

The following environment variables are set as:

ORACLE_OWNER= grid
ORACLE_HOME=  /u01/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:

Copying dbhome to /usr/local/bin ...

Copying oraenv to /usr/local/bin ...

Copying coraenv to /usr/local/bin ...

Creating /etc/oratab file...

Entries will be added to the /etc/oratab file as needed by

Database Configuration Assistant when a database is created

Finished running generic part of root script.

Now product-specific root actions will be performed.

Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params

Creating trace directory

User ignored Prerequisites during installation

Failed to write the checkpoint:'' with status:FAIL.Error code is 256

Undefined subroutine &crsconfig_lib::dieformat called at /u01/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 6135.

/u01/app/11.2.0/grid/perl/bin/perl -I/u01/app/11.2.0/grid/perl/lib -I/u01/app/11.2.0/grid/crs/install /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed

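Solution:

After some searching, it turned out that a previous engineer had very "helpfully" installed IBM HACMP, and Oracle Database 11gR2 Grid Infrastructure is incompatible with HACMP, which caused this error. The steps to fix it: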

Step-1) cd /usr/sbin/cluster/utilities

         mv cldomain cldomain_orig

Step-2) Remove "hagsuser" group using smit security command

Step-3) cd /var/ha/soc

        rm -rf *clients*

Step-4) Modify rootpre.sh file by removing HACMP related part from this file and run rootpre.sh again.

Now we can re-install CRS/DB again.

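These steps must be completed on every RAC node, and the Grid installation directory should then be removed directly with rm -rf. IBM HACMP and Grid Infrastructure may have compatibility problems, so do not install HACMP unless you actually use it.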

(3) Forgetting to set the capabilities of the grid and oracle users

grid# ./root.sh

Performing root user operation for Oracle 11g

The following environment variables are set as:

ORACLE_OWNER= grid
ORACLE_HOME=  /u01/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:

The contents of "dbhome" have not changed. No need to overwrite.

The contents of "oraenv" have not changed. No need to overwrite.

The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by

Database Configuration Assistant when a database is created

Finished running generic part of root script.

Now product-specific root actions will be performed.

Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params

User ignored Prerequisites during installation

User grid is missing the following capabilities required to run CSSD in realtime: CAP_NUMA_ATTACH,CAP_BYPASS_RAC_VMM,CAP_PROPAGATE

To add the required capabilities, please run:

/usr/bin/chuser capabilities=CAP_NUMA_ATTACH,CAP_BYPASS_RAC_VMM,CAP_PROPAGATE grid

CSS cannot be run in realtime mode at /u01/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 11423.

/u01/app/11.2.0/grid/perl/bin/perl -I/u01/app/11.2.0/grid/perl/lib -I/u01/app/11.2.0/grid/crs/install /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed

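Cause: both the grid and oracle users need the three capabilities CAP_NUMA_ATTACH, CAP_BYPASS_RAC_VMM and CAP_PROPAGATE.

Solution:

Run the following command on every RAC node:

/usr/bin/chuser capabilities=CAP_NUMA_ATTACH,CAP_BYPASS_RAC_VMM,CAP_PROPAGATE grid

Because this was a fresh environment, I wiped it with the deinstall command, after which root.sh ran successfully.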
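
Since both users need the same three capabilities, it is worth checking them together before running root.sh. The helper below is a sketch only: `missing_caps` is a hypothetical name, and the commented usage assumes `lsuser -a capabilities <user>` prints a line of the form `grid capabilities=...`.

```shell
#!/bin/sh
# Capabilities named in the root.sh error message above.
REQUIRED_CAPS="CAP_NUMA_ATTACH CAP_BYPASS_RAC_VMM CAP_PROPAGATE"

# Hypothetical helper: given a comma-separated capability list, print each
# required capability that is missing from it, one per line.
missing_caps() {
    have=",$1,"
    for cap in $REQUIRED_CAPS; do
        case "$have" in
            *",$cap,"*) ;;        # already present
            *) echo "$cap" ;;     # not found in the list
        esac
    done
}

# Intended use on AIX (not executed here; the lsuser output shape is an assumption):
#   for u in grid oracle; do
#       caps=$(lsuser -a capabilities "$u" | sed 's/.*capabilities=//')
#       echo "$u missing: $(missing_caps "$caps" | tr '\n' ' ')"
#   done
```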

##################################################################################################

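3. The CRS clusterware installed successfully, but the database software installer reports that it cannot find the cluster:

Approach: check that the permissions are correct and that the entries in the following file are right:

/app/oraInventory/ContentsXML/inventory.xml

The inventory.xml in this environment looked like this: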

[grid@vrh1 ContentsXML]$ cat inventory.xml

<?xml version="1.0" standalone="yes" ?>

<INVENTORY>

<VERSION_INFO>

<SAVED_WITH>11.2.0.2.0</SAVED_WITH>

<MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>

</VERSION_INFO>

<HOME_LIST>

<HOME NAME="Ora11g_gridinfrahome1" LOC="/g01/11.2.0/grid" TYPE="O" IDX="1" >

<NODE_LIST>

  <NODE NAME="vrh1"/>
  <NODE NAME="vrh2"/>

</NODE_LIST>

</HOME>

</HOME_LIST>

</INVENTORY>

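Clearly the <HOME NAME ...> entry is missing the CRS="true" flag, so the OUI installer concludes during detection that GI is not installed on this node.

The fix is simple: add CRS="true" and restart runInstaller.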

[grid@vrh1 ContentsXML]$ cat /g01/oraInventory/ContentsXML/inventory.xml

<?xml version="1.0" standalone="yes" ?>

<INVENTORY>

<VERSION_INFO>

<SAVED_WITH>11.2.0.2.0</SAVED_WITH>

<MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>

</VERSION_INFO>

<HOME_LIST>

<HOME NAME="Ora11g_gridinfrahome1" LOC="/g01/11.2.0/grid" TYPE="O" IDX="1" CRS="true">

<NODE_LIST>

  <NODE NAME="vrh1"/>
  <NODE NAME="vrh2"/>

</NODE_LIST>

</HOME>

</HOME_LIST>

</INVENTORY>
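
The missing flag is easy to overlook by eye, so a small check can list the homes that lack it. This is a sketch under stated assumptions: the function name is hypothetical, and it assumes each <HOME ...> element sits on a single line, as in the listings above. Database homes legitimately have no CRS flag, so only the Grid home in the output indicates this problem.

```shell
#!/bin/sh
# Hypothetical helper: print the NAME of every <HOME ...> entry in the given
# inventory.xml that has no CRS="true" attribute on the same line.
homes_without_crs_flag() {
    grep '<HOME NAME=' "$1" \
        | grep -v 'CRS="true"' \
        | sed 's/.*<HOME NAME="\([^"]*\)".*/\1/'
}

# Intended use (not executed here; path as shown above):
#   homes_without_crs_flag /g01/oraInventory/ContentsXML/inventory.xml
```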

##################################################################################################

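4. Time synchronization problems

11g R2 Clusterware introduces a new process, CTSS (Cluster Time Synchronization Service), which manages cluster time and keeps it consistent across all nodes. If the system's NTP daemon is running, CTSS runs in observer mode; if NTP is not running, CTSS runs in active mode.

Analysis and resolution:

Running cluvfy to check clock synchronization across the nodes produced the following error:

su - grid

$ cluvfy comp clocksync -n all -verbose

The command failed with errors like the following: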

Version of exectask could not be retrieved from node "node1"

Version of exectask could not be retrieved from node "node1"

ERROR:

Framework setup check failed on all the nodes

Verification cannot proceed

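When cluvfy itself fails like this, work through the following three points:

1. Check the SSH trust relationship (user equivalence) between the two nodes. If it is missing, run the sshUserSetup.sh script, which can be obtained by unpacking the installation package.

sshUserSetup.sh -user grid -hosts "oadb1 oadb2" -advanced -PromptPassphrase

2. Remove the files and directories whose names start with CVU from the temporary filesystem.

rm -rf /tmp/CVU*

3. A database upgrade can change the execute permissions of the exectask* commands, so restore them:

su - grid

$ cd $ORACLE_HOME/cv/remenv

$ chmod 755 ./*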
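
To see whether the third point applies before changing anything, the files that have lost their execute bit can be listed first. The helper name below is hypothetical; it only inspects the owner-execute bit and searches the directory recursively.

```shell
#!/bin/sh
# Hypothetical helper: list regular files under a directory (recursively)
# whose owner-execute bit is not set.
files_missing_exec_bit() {
    find "$1" -type f ! -perm -100 | sort
}

# Intended use (not executed here; path taken from the step above):
#   files_missing_exec_bit "$ORACLE_HOME/cv/remenv"
```

An empty result suggests the permissions are not the cause, and the first two points are the ones to look at.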

Re-run cluvfy; this time the output is:

$ cluvfy comp clocksync -n all -verbose

Checking NTP daemon command line for slewing option "-x"

Check: NTP daemon command line

Node Name     Slewing Option Set?
------------  -------------------
node2         no
node1         no

Result:

NTP daemon slewing option check failed on some nodes

PRVF-5436 : The NTP daemon running on one or more nodes lacks the slewing option "-x"

Result: Clock synchronization check using Network Time Protocol(NTP) failed

PRVF-9652 : Cluster Time Synchronization Services check failed

Verification of Clock Synchronization across the cluster nodes was unsuccessful on all the specified nodes.

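The output shows that xntpd is running without the "-x" option. On every node, check the system startup configuration file and add the option, then restart time synchronization on each node:

1. Configuration file on AIX: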

vi /etc/rc.tcpip

   start /usr/sbin/xntpd "$src_running" "-x"

2. Configuration file on HP-UX:

vi /etc/rc.config.d/netdaemons

    XNTPD_ARGS="-x"

- Stop and start xntpd on AIX:

stopsrc -s xntpd

startsrc -s xntpd -a "-x"

- Stop and start xntpd on HP-UX:

/sbin/init.d/xntpd stop
/sbin/init.d/xntpd start
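
After the restart, it is worth confirming that the running daemon really has the slewing option before re-running cluvfy. The sketch below checks a single command-line string for a standalone "-x" argument. The function name is hypothetical, and the `ps` invocation in the comment is an assumption about the platform.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when a single-line daemon command
# line, passed as one string, contains a standalone "-x" argument.
has_slew_option() {
    case " $1 " in
        *" -x "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Intended use (not executed here):
#   has_slew_option "$(ps -eo args | grep '[x]ntpd')" || echo "xntpd lacks -x"
```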

##################################################################################################

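5. Third-party volume group problem: a concurrent volume group built with VERITAS

The CRS installation itself went smoothly, but running the script "/home/oracle/crs/root.sh" never managed to start CRS. The symptoms were: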

Adding daemons to inittab

Expecting the CRS daemons to be up within 600 seconds.

Failure at final check of Oracle CRS stack.

Look at the related errors:

cat css.log

Enter debug mode:

1. Back up the files to be debugged:

cp $ORA_CRS_HOME/install/rootinstall $ORA_CRS_HOME/install/rootinstall.bak

cp $ORA_CRS_HOME/install/rootconfig $ORA_CRS_HOME/install/rootconfig.bak

2. Modify the two scripts:

In the rootinstall and rootconfig scripts, add -x to the interpreter line at the top:

#!/bin/sh -x
#
# rootinstall.sbs for CRS installs

3. Run the script:

./root.sh 2>&1 | tee /tmp/rootsh.log

4. Check the system logs:

tail -f /var/adm/syslog/syslog.log

cd /u01/oracle/oraInventory/logs

tail -f installActionXX.log

Analyze the suspicious entries in the system log:

Oracle Cluster Ready Services waiting for HP-UX Service Guard to start.

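The environment used a concurrent volume group built with VERITAS and did not use HP Serviceguard (SG). After consulting an experienced host engineer, it turned out that a patch must be installed when VERITAS is used for clustering.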