Thursday, January 24, 2013

PCP in RAC

# Types of failure that affect PCP:
1. The database instance fails
2. The database node server fails
3. The Applications/Middle-Tier server fails

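# Concurrent Manager (CM) process types:
Internal Concurrent Manager (FNDLIBR) -
  Communicates with the SM
  Active on only one node at a time; responsible for start/stop/restart/failover of the Service Manager
Service Manager (FNDSM process) -
  Communicates with the ICM, the CMs, and non-Manager Service processes
  The SM environment is set by APPSORA.env and the gsmstart.sh script
  The TWO_TASK value must match instance_name in GV$INSTANCE
  The apps_<sid> listener must be running on every CP node to support communication between the SM and the local instance
  An SM must be started on every node
Internal Monitor (FNDIMON process) -
  Communicates with the ICM
  The IM monitors the ICM and restarts it if it fails
  An IM must be started on every node where the ICM might run
  If ICMs are started on several nodes, only the one started first stays active; the ICMs on the other nodes are terminated
Standard Manager (FNDLIBR process) -
  Communicates with the SM and with batch and OLTP processes
Transaction Manager -
  Communicates with the SM and with batch, OLTP, and Forms processes
  See Note:240818.1 regarding Transaction Manager communication and setup requirements for RAC.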

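# Setup steps:
1. Releases 11.5.8 and later are configured to use GSM.
2. In <SID>.xml, set the variable s_appldcp = ON.
3. Before regenerating the configuration, back up tnsnames.ora, listener.ora, and sqlnet.ora, as well as any modified files under COMMON_TOP/admin/scripts/<SID>.
4. Run adautocfg.sh on each cluster node.
5. After it finishes, make sure each node's tnsnames.ora contains the aliases of all the other nodes.
6. Make sure tnsnames.ora has an alias matching every instance name in GV$INSTANCE; the SM uses these aliases to connect while the CMs are starting.
   The entry for the local node is the one used for the TWO_TASK variable (in APPSORA.env).
7. Verify that the FNDSM_<SID> entry has been added to listener.ora.
8. AutoConfig updates and resets the database profiles; set them back if necessary.
9. Make sure the Applications Listener is active on each node in the cluster.
10. Make sure every node is registered: System Administrator > Install > Nodes.
11. Go to System Administrator > Concurrent > Manager > Define and, based on server load, set the primary and secondary node names for each CM.
12. Before starting the Manager processes, edit APPSORA.env on each node to specify a TWO_TASK entry
    that contains the INSTANCE_NAME parameter.
13. Go to System Administrator > Concurrent > Manager > Administer and verify that the Service Manager and Internal Monitor are activated on the secondary node. The Internal Monitor should not be active on the primary cluster node.
14. Stop and restart the Concurrent Manager processes on their primary node(s). Check that the placement matches the plan,
    and that the FNDSM process (the Service Manager) shows up together with the FNDIMON process (Internal Monitor).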
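
Step 2 is easy to get wrong on one node of a cluster, so it can help to check the context file programmatically. The following is a minimal sketch, assuming the variable appears as an `<APPLDCP oa_var="s_appldcp">` element as described in the referenced note; the surrounding element names in the sample (`oa_context`, `oa_system`) are placeholders, and a real <SID>.xml is far larger.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed context file used only for illustration.
SAMPLE_CONTEXT = """<oa_context>
  <oa_system>
    <APPLDCP oa_var="s_appldcp"> ON </APPLDCP>
  </oa_system>
</oa_context>"""


def pcp_enabled(context_xml: str) -> bool:
    """Return True if the s_appldcp context variable is set to ON."""
    root = ET.fromstring(context_xml)
    for element in root.iter():
        if element.get("oa_var") == "s_appldcp":
            # The value is padded with spaces in the note's example.
            return (element.text or "").strip().upper() == "ON"
    return False  # variable not present at all


print(pcp_enabled(SAMPLE_CONTEXT))
```

The check only reads the file; the note recommends making the actual change through the Context Editor.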

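# Failover limitations
When a node fails, the other nodes must take over its managers, so they must be able to read the same log and output files; the log and out directories therefore have to be on shared disk.
To view log and output files from a failed node, the FNDFS_<HOST> entry in tnsnames.ora under the 8.0.6 ORACLE_HOME location must also be configured, with the local node listed first.
Example: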

FNDFS_coe-svr5-pc=(DESCRIPTION=
                (ADDRESS_LIST=
                 (ADDRESS=(PROTOCOL=tcp)(HOST=coe-svr5-pc)(PORT=1231))
                 (ADDRESS=(PROTOCOL=tcp)(HOST=coe-svr7-pc)(PORT=1232)))
                  (CONNECT_DATA=(SID=FNDFS))
                )


---------------------------------------------------------------------------------------
Reference:
Concurrent Manager Setup and Configuration Requirements in an 11i RAC Environment [ID 241370.1]

PURPOSE
-----------------------------
Configuring parallel concurrent processing (PCP) allows you to distribute concurrent
managers and workload across multiple nodes in a cluster or networked environment.
PCP can also be implemented in a RAC environment to provide automated failover of
workload should the primary (source) or secondary (target) concurrent processing
nodes, or the RAC instances, fail.  Several different failure scenarios can occur,
depending on the type of Applications Technology Stack implementation that is
performed.

The basic failure scenarios are:

1. The database instance that supports the CP, Applications, and Middle-Tier
   processes (such as Forms or iAS) can fail.
2. The Database node server that supports the CP, Applications, and Middle-Tier
   processes (such as Forms or iAS) can fail.
3. The Applications/Middle-Tier server that supports the CP (and Applications)
   base can fail.

The concurrent processing tier can reside on the Applications, Middle-Tier, or
Database Tier nodes.  In a single-tier, non-PCP environment, a node failure of any
of these kinds will impact Concurrent Processing operations.  In a multi-node
configuration the impact of any of these types of failures depends on what type of
failure is experienced and how concurrent processing is distributed among the nodes
in the configuration.  Parallel Concurrent Processing provides seamless failover for
a Concurrent Processing environment in the event that any of these types of failures
takes place.

In an Applications environment where the database tier utilizes Listener
(server) load balancing, and also in a non-load-balanced environment, there
are changes that must be made to the default configuration generated by
AutoConfig so that CP initialization, processing, and PCP functionality are
initiated properly on their respective/assigned nodes.  These changes are
described in the next section - Concurrent Manager Setup and Configuration
Requirements in an 11i RAC Environment.

The current Concurrent Processing architecture with Global Service Management
consists of the following processes and communication model, where each process
is responsible for performing a specific set of routines and communicating with
parent and dependent processes.

Internal Concurrent Manager (FNDLIBR process) - Communicates with the Service
Manager.

The Internal Concurrent Manager (ICM) starts, sets the number of active processes
for, monitors, and terminates all other concurrent processes through requests made
to the Service Manager, including restarting any failed processes.  The ICM also
starts, stops, and restarts the Service Manager for each node, and performs process
migration during an instance or node failure.  The ICM is active on a single node.
This is also true in a PCP environment, where the ICM will be active on at least one
node at all times.

Service Manager (FNDSM process) - Communicates with the Internal Concurrent Manager,
Concurrent Manager, and non-Manager Service processes.

The Service Manager (SM) spawns and terminates manager and service processes (these
could be Forms or Apache Listeners, Metrics or Reports Server, and any other process
controlled through Generic Service Management).  When the ICM terminates, the SM that
resides on the same node as the ICM will also terminate; the SM is ‘chained’ to
the ICM.  After termination the SM will only reinitialize when there is a function it
needs to perform (starting or stopping a process), so there may be periods of time
when the SM is not active, and this is normal.  All processes initialized by the SM
inherit the same environment as the SM.  The SM’s environment is set by the
APPSORA.env file and the gsmstart.sh script.  The TWO_TASK used by the SM to connect
to a RAC instance must match the instance_name from GV$INSTANCE.  The apps_<sid>
listener must be active on each CP node to support the SM connection to the local
instance.  There should be a Service Manager active on each node where a Concurrent
or non-Manager service process will reside.

Internal Monitor (FNDIMON process) - Communicates with the Internal Concurrent
Manager.

The Internal Monitor (IM) monitors the Internal Concurrent Manager and restarts any
failed ICM on the local node.  During a node failure in a PCP environment the IM will
restart the ICM on a surviving node (multiple ICMs may be started on multiple nodes,
but only the first ICM started will eventually remain active; all others will
gracefully terminate).  There should be an Internal Monitor defined on each node
where the ICM may migrate.

Standard Manager (FNDLIBR process) - Communicates with the Service Manager and any
client application process.

The Standard Manager is a worker process that initiates and executes client requests
on behalf of Applications batch and OLTP clients.

Transaction Manager - Communicates with the Service Manager, and any user process
initiated on behalf of a Forms, or Standard Manager request.  See Note:240818.1
regarding Transaction Manager communication and setup requirements for RAC.

SCOPE & APPLICATION
-----------------------------
This article is provided for Applications development, product management, system
architects, and system administrators involved in deploying and configuring Oracle
Applications in a RAC environment. This document will also be useful to field
engineers and consulting organizations to facilitate installations and configuration
requirements of Applications 11i in a RAC environment.

Concurrent Manager Setup and Configuration Requirements in an 11i RAC Environment
-----------------------------
To set up Parallel Concurrent Processing using AutoConfig with GSM, follow the
instructions in the 11.5.8 Oracle Applications System Administrators Guide under
Implementing Parallel Concurrent Processing, using the following steps:

1. Applications 11.5.8 and higher is configured to use GSM. Verify the configuration
   on each node (see WebIV Note:165041.1).
2. On each cluster node edit the Applications Context file (<SID>.xml), that resides
   in APPL_TOP/admin, to set the variable <APPLDCP oa_var="s_appldcp"> ON </APPLDCP>.
   It is normally set to OFF.  This change should be performed using the Context
   Editor.
3. Prior to regenerating the configuration, copy the existing tnsnames.ora,
   listener.ora and sqlnet.ora files, where they exist, under the 8.0.6 and iAS
   ORACLE_HOME locations on each node to preserve the files (e.g.
   /<some_directory>/<SID>ora/$ORACLE_HOME/network/admin/<SID>/tnsnames.ora). If any
   of the Applications startup scripts that reside in COMMON_TOP/admin/scripts/<SID>
   have been modified, also copy these to preserve them.
4. Regenerate the configuration by running adautocfg.sh on each cluster node as
   outlined in Note:165195.1.
5. After regenerating the configuration, merge any changes back into the tnsnames.ora,
   listener.ora and sqlnet.ora files in the network directories, and the startup
   scripts in the COMMON_TOP/admin/scripts/<SID> directory.  Each node's tnsnames.ora
   file must contain the aliases that exist on all other nodes in the cluster. When
   merging tnsnames.ora files, ensure that each node contains all other nodes'
   tnsnames.ora entries.  This includes tns entries for any Applications tier nodes
   where a concurrent request could be initiated, or request output viewed.
6. In the tnsnames.ora file of each Concurrent Processing node, ensure that there is
   an alias that matches the instance name from GV$INSTANCE of each Oracle instance
   on each RAC node in the cluster.  This is required in order for the SM to establish
   connectivity to the local node during startup.  The entry for the local node will
   be the entry that is used for the TWO_TASK in APPSORA.env on each node in the
   cluster (and also in the APPS<SID>_<HOSTNAME>.env file referenced in the
   Applications Listener [APPS_<SID>] listener.ora file entry
   envs='MYAPPSORA=<some directory>/APPS<SID>_<HOSTNAME>.env').  This is modified in
   step 12.
7. Verify that the FNDSM_<SID> entry has been added to the listener.ora file under
   the 8.0.6 ORACLE_HOME/network/admin/<SID> directory. See WebIV Note:165041.1 for
   instructions regarding configuring this entry. NOTE: With the implementation of
   GSM, the 8.0.6 Applications and 9.2.0 Database listeners must be active on all PCP
   nodes in the cluster during normal operations.
8. AutoConfig will update the database profiles and reset them for the node from
   which it was last run. If necessary reset the database profiles back to their
   original settings.
9. Ensure that the Applications Listener is active on each node in the cluster where
   Concurrent, or Service processes will execute.  On each node start the database
   and Forms Server processes as required by the configuration that has been
   implemented.
10. Navigate to Install > Nodes and ensure that each node is registered. Use the node
    name as it appears when executing a ‘nodename’ from the Unix prompt on the server.
    GSM will add the appropriate services for each node at startup.
11. Navigate to Concurrent > Manager > Define, and set up the primary and secondary
    node names for all the concurrent managers according to the desired configuration
    for each node’s workload. The Internal Concurrent Manager should be defined on the
    primary PCP node only.  When defining the Internal Monitor for the secondary
    (target) node(s), make the primary node (local node) assignment, and assign a
    secondary node designation to the Internal Monitor, also assign a standard work
    shift with one process.
12. Prior to starting the Manager processes it is necessary to edit the APPSORA.env
    file on each node in order to specify a TWO_TASK entry that contains the
    INSTANCE_NAME parameter for the local node's Oracle instance, in order to bind
    each Manager to the local instance.  This should be done regardless of whether
    Listener load balancing is configured, as it will ensure the configuration
    conforms to the required standard of having the TWO_TASK set to the instance
    name of each node as specified in GV$INSTANCE.  This is the environment that the
    Service Manager passes on to each process that it initializes on behalf of the
    Internal Concurrent Manager.  Also make the same update to the file referenced by
    the Applications Listener APPS_<SID> in the listener.ora entry
    envs='MYAPPSORA=<some directory>/APPS<SID>_<HOSTNAME>.env' on each node.  Then
    start the Concurrent Processes on their primary node(s).
13. Navigate to Concurrent > Manager > Administer and verify that the Service Manager
    and Internal Monitor are activated on the secondary node, and on any
    additional nodes in the cluster. The Internal Monitor should not be active on
    the primary cluster node.
14. Stop and restart the Concurrent Manager processes on their primary node(s), and
    verify that the managers are starting on their appropriate nodes. On the target
    (secondary) node in addition to any defined managers you will see an FNDSM
    process (the Service Manager), along with the FNDIMON process (Internal Monitor).
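
The expectations in steps 13 and 14 can be written down as a small placement check. This is only a sketch under stated assumptions: the input mapping of node name to running process names is hypothetical (it might be collected from `ps` output on each node), and the function name `placement_problems` is made up for this example.

```python
def placement_problems(primary, processes_by_node):
    """List deviations from the layout described in steps 13 and 14.

    primary           -- name of the primary PCP node
    processes_by_node -- dict mapping node name to a set of process names
    """
    problems = []
    for node, procs in sorted(processes_by_node.items()):
        if node == primary:
            # The Internal Monitor should not be active on the primary node.
            if "FNDIMON" in procs:
                problems.append(f"{node}: Internal Monitor active on primary node")
        else:
            # Secondary nodes should show both FNDSM and FNDIMON.
            for required in ("FNDSM", "FNDIMON"):
                if required not in procs:
                    problems.append(f"{node}: {required} not running")
    return problems


observed = {
    "coe-svr5-pc": {"FNDLIBR", "FNDSM"},
    "coe-svr7-pc": {"FNDLIBR", "FNDSM"},
}
problems = placement_problems("coe-svr5-pc", observed)
print(problems)
```

Because the Service Manager only reinitializes when it has work to do, a missing FNDSM process is not necessarily a fault; treat that message as a prompt to look at Concurrent > Manager > Administer rather than as proof of a failure.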

Failover Considerations
-----------------------------
In order to have log and output files available to each node during an extended node
failure, each log and out directory needs to be made accessible to all other CP nodes
in the cluster (placed on shared disk).

In order to view log and output files from a failed node during an outage/node
failure, the FNDFS_<HOST> entry in tnsnames.ora under the 8.0.6 ORACLE_HOME location
on each node should be configured using an ADDRESS_LIST following the example below,
to provide connect time failover.  This entry should be placed in the tnsnames.ora
file on each node that supports Concurrent Processing, with the local node first in
the ADDRESS_LIST entry.

  FNDFS_coe-svr5-pc=(DESCRIPTION=
                (ADDRESS_LIST=
                 (ADDRESS=(PROTOCOL=tcp)(HOST=coe-svr5-pc)(PORT=1231))
                 (ADDRESS=(PROTOCOL=tcp)(HOST=coe-svr7-pc)(PORT=1232)))
                  (CONNECT_DATA=(SID=FNDFS))
                )

See bug 3259441 for issues related to the TNS alias length exceeding 255 characters.
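
Since every Concurrent Processing node needs its own FNDFS entry with itself listed first, generating the entries from one node list avoids ordering mistakes. A minimal sketch, assuming the two hosts from the GV$INSTANCE example and the ports from the entry above; the function name `fndfs_entry` is invented for this illustration.

```python
def fndfs_entry(local_host, nodes):
    """Build an FNDFS_<HOST> tnsnames.ora entry with the local node first.

    nodes -- list of (host, port) tuples for every CP node in the cluster
    """
    if local_host not in [host for host, _ in nodes]:
        raise ValueError(f"{local_host} is not in the node list")
    # sorted() is stable, so the remaining nodes keep their given order.
    ordered = sorted(nodes, key=lambda node: node[0] != local_host)
    addresses = "".join(
        f"    (ADDRESS=(PROTOCOL=tcp)(HOST={host})(PORT={port}))\n"
        for host, port in ordered
    )
    return (
        f"FNDFS_{local_host}=(DESCRIPTION=\n"
        f"  (ADDRESS_LIST=\n"
        f"{addresses}"
        f"  )\n"
        f"  (CONNECT_DATA=(SID=FNDFS))\n"
        f"  )"
    )


cluster = [("coe-svr5-pc", 1231), ("coe-svr7-pc", 1232)]
entry = fndfs_entry("coe-svr7-pc", cluster)
print(entry)
print(len(entry))  # rough size of the generated entry in characters
```

The printed length is only a rough aid when looking into the 255-character issue; how that limit is measured is not stated here.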

If connect time failover is not configured through an ADDRESS_LIST entry in the 8.0.6
ORACLE_HOME tnsnames.ora, the only alternative is to manually update the
fnd_concurrent_requests table for each request, changing outfile_node_name and
logfile_node_name from the failed node to the surviving node.

Determine the failed node name, and update the out and log entry to the
surviving node:

SQL> select outfile_node_name from fnd_concurrent_requests
  2  where request_id = 166273;

COE-SVR7-PC

SQL> select logfile_node_name from fnd_concurrent_requests
  2  where request_id = 166273;

COE-SVR7-PC

Update both outfile_node_name and logfile_node_name from the failed to the surviving
node, then commit:

SQL> update fnd_concurrent_requests set outfile_node_name = 'COE-SVR5-PC'
  2  where request_id = 166273;

SQL> update fnd_concurrent_requests set logfile_node_name = 'COE-SVR5-PC'
  2  where request_id = 166273;

SQL> commit;

Using the ADDRESS_LIST rather than updating fnd_concurrent_requests is the recommended
method to establish failover access for Concurrent Manager log and output files.
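
If the manual route cannot be avoided, both columns can be changed in one parameterized statement per request. The sketch below uses Python's built-in sqlite3 with an in-memory stand-in table that has only the three columns mentioned above, so it runs without an Oracle database; against the real table an Oracle driver and its own bind-variable syntax would be needed, and the helper name `repoint_request` is hypothetical.

```python
import sqlite3


def repoint_request(conn, request_id, surviving_node):
    """Point a request's log and output files at the surviving node."""
    cursor = conn.execute(
        "UPDATE fnd_concurrent_requests "
        "SET outfile_node_name = ?, logfile_node_name = ? "
        "WHERE request_id = ?",
        (surviving_node, surviving_node, request_id),
    )
    conn.commit()
    return cursor.rowcount  # number of rows changed


# Stand-in table with just the columns used in the example above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fnd_concurrent_requests ("
    "request_id INTEGER PRIMARY KEY, "
    "outfile_node_name TEXT, logfile_node_name TEXT)"
)
conn.execute(
    "INSERT INTO fnd_concurrent_requests VALUES (166273, 'COE-SVR7-PC', 'COE-SVR7-PC')"
)

updated = repoint_request(conn, 166273, "COE-SVR5-PC")
row = conn.execute(
    "SELECT outfile_node_name, logfile_node_name "
    "FROM fnd_concurrent_requests WHERE request_id = 166273"
).fetchone()
print(updated, row)
```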


Configuration Examples:
-----------------------------
Finding the instance name -

column host_name format a20;
select host_name, instance_name from gv$instance;

HOST_NAME   INSTANCE_NAME
----------- ----------------
coe-svr7-pc APRA7
coe-svr5-pc APRA5

Modifying the APPSORA.env & APPS<SID>_<HOSTNAME>.env file for node coe-svr5-pc -

> cd $APPL_TOP
> cat APPSORA.env
:
# $Header: APPSORA_ux.env 115.4 2003/03/01 01:02:35 wdgreene ship $
# =============================================================================
# NAME
# APPSORA.env
#
# DESCRIPTION
# Execute environment for Oracle and APPL_TOP
#
# NOTES
#
# HISTORY
#
# =============================================================================
#
# ###############################################################
#
# This file is automatically generated by AutoConfig. It will be read and over.
# If you were instructed to edit this file, or if you are not able to use the ss
# created by AutoConfig, refer to Document 165195.1 for assistance.
#
# ###############################################################
#
. /oralocal/apraora/8.0.6/APRA.env
. /oralocal/apraappl/APRA.env
TWO_TASK=APRA5
export TWO_TASK
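
The requirement that TWO_TASK equal the local instance name can be checked mechanically. This is a rough sketch: it looks only at the text it is given (it does not follow the files sourced by APPSORA.env, which could also set TWO_TASK), the sample text repeats the tail of the file shown above, and the helper names are invented for the example.

```python
import re

# Tail of the APPSORA.env example for node coe-svr5-pc.
SAMPLE_ENV = """. /oralocal/apraora/8.0.6/APRA.env
. /oralocal/apraappl/APRA.env
TWO_TASK=APRA5
export TWO_TASK
"""

# host_name -> instance_name, as returned by the gv$instance query.
INSTANCES = {"coe-svr7-pc": "APRA7", "coe-svr5-pc": "APRA5"}


def two_task_from_env(env_text):
    """Return the last TWO_TASK value assigned in the text, or None."""
    value = None
    for line in env_text.splitlines():
        match = re.match(r"\s*(?:export\s+)?TWO_TASK=(\S+)", line)
        if match:
            value = match.group(1).strip("'\"")
    return value


def two_task_matches_instance(env_text, host, instances):
    """True if TWO_TASK equals the instance name recorded for this host."""
    expected = instances.get(host)
    return expected is not None and two_task_from_env(env_text) == expected


print(two_task_from_env(SAMPLE_ENV))
print(two_task_matches_instance(SAMPLE_ENV, "coe-svr5-pc", INSTANCES))
```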

TNS alias definition in tnsnames.ora on each node -

cd $TNS_ADMIN
pwd
/oralocal/apraora/8.0.6/network/admin/APRA

APRA5 = (DESCRIPTION=
(ADDRESS=(PROTOCOL=tcp)(HOST= coe-svr5-pc)(PORT=1523))
(CONNECT_DATA=(INSTANCE_NAME=APRA5)(SERVICE_NAME=apradb))
)
APRA7 = (DESCRIPTION=
(ADDRESS=(PROTOCOL=tcp)(HOST= coe-svr7-pc)(PORT=1523))
(CONNECT_DATA=(INSTANCE_NAME=APRA7)(SERVICE_NAME=apradb))
)
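
To confirm that a tnsnames.ora file carries an alias for every instance, and that each alias really pins its INSTANCE_NAME, a small text scan is usually enough. The sketch below is not a full Oracle Net parser: it assumes alias names start a line and continuation lines start with a parenthesis, as in the entries above, and the function names are made up for the illustration.

```python
import re

SAMPLE_TNS = """APRA5 = (DESCRIPTION=
(ADDRESS=(PROTOCOL=tcp)(HOST= coe-svr5-pc)(PORT=1523))
(CONNECT_DATA=(INSTANCE_NAME=APRA5)(SERVICE_NAME=apradb))
)
APRA7 = (DESCRIPTION=
(ADDRESS=(PROTOCOL=tcp)(HOST= coe-svr7-pc)(PORT=1523))
(CONNECT_DATA=(INSTANCE_NAME=APRA7)(SERVICE_NAME=apradb))
)
"""

ALIAS_START = re.compile(r"^[ \t]*([A-Za-z][\w.-]*)[ \t]*=", re.MULTILINE)
INSTANCE = re.compile(r"\(\s*INSTANCE_NAME\s*=\s*([\w.-]+)\s*\)", re.IGNORECASE)


def alias_instances(tns_text):
    """Map each alias (upper-cased) to its INSTANCE_NAME, or None if absent."""
    starts = list(ALIAS_START.finditer(tns_text))
    result = {}
    for index, match in enumerate(starts):
        end = starts[index + 1].start() if index + 1 < len(starts) else len(tns_text)
        found = INSTANCE.search(tns_text[match.end():end])
        result[match.group(1).upper()] = found.group(1).upper() if found else None
    return result


def missing_aliases(tns_text, instance_names):
    """Instance names with no alias of the same name bound to that instance."""
    aliases = alias_instances(tns_text)
    return [name for name in instance_names if aliases.get(name.upper()) != name.upper()]


print(alias_instances(SAMPLE_TNS))
print(missing_aliases(SAMPLE_TNS, ["APRA5", "APRA7"]))
```

Names are compared case-insensitively here, on the assumption that alias lookups are not case-sensitive.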

FNDFS entry in listener.ora on each node -

cd $TNS_ADMIN
pwd
/oralocal/apraora/8.0.6/network/admin/APRA

( SID_DESC = ( SID_NAME = FNDFS )( ORACLE_HOME = /oralocal/apraora/8.0.6 )
( PROGRAM = /oralocal/apraappl/fnd/11.5.0/bin/FNDFS )
( envs='EPC_DISABLED=TRUE,NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P1,LD_LIBRARY_PATH=/usr/dt/lib:/usr/openwin/lib:/oralocal/apraora/8.0.6/lib,SHLIB_PATH=/usr/lib:/usr/dt/lib:/usr/openwin/lib:/oralocal/apraora/8.0.6/lib,LIBPATH=/usr/dt/lib:/usr/openwin/lib:/oralocal/apraora/8.0.6/lib' )
)
)

RELATED DOCUMENTS
-----------------
Oracle Applications Systems Administrators Guide
Oracle Applications 11i with Real Application Clusters Installation & C
Note 240818.1 Concurrent Processing: Transaction Manager Setup and Configuration Requirement in an 11i RAC Environment
