
Aug18
Deploying Exchange Server 2003 in a Cluster
Author: Susanta K Beura; Filed under: Cluster, Computer & Internet
After planning the cluster deployment strategy, correct deployment of that cluster ensures high availability of your servers that run Microsoft Exchange Server 2003. Although deploying Exchange in a cluster resembles deploying Exchange in a non-clustered organization, there are important differences you must consider. Therefore, to fully understand how to deploy Exchange Server 2003 in a cluster, read this topic together with the previous topics in this guide.
Specifically, this topic provides the following information:
- Cluster Requirements
This section discusses the necessary requirements for installing Exchange Server 2003, including Microsoft Windows and Exchange version requirements, software requirements, and network configuration requirements.
- Deployment Scenarios
This section includes the following configuration and procedural information about how to deploy Exchange Server 2003 clusters:
- Four-node cluster scenario
- Deploying a new Exchange Server 2003 cluster
- Upgrading an Exchange 2000 Server cluster to Exchange Server 2003
- Migrating an Exchange Server 5.5 cluster to Exchange Server 2003
- Upgrading mixed Exchange 2000 Server and Exchange Server 5.5 clusters
Before continuing with the deployment procedures listed in this topic, follow these steps:
- Read the section “Using Server Clusters” in the guide Planning an Exchange Server 2003 Messaging System ( http://go.microsoft.com/fwlink/?LinkId=47584).
- Create a Windows 2000 Server or Microsoft Windows Server™ 2003 cluster. To create a Windows 2000 or Windows Server 2003 cluster, see the following resources:
- Windows Server 2003: For information about how to create a Windows Server 2003 cluster, see Checklist: Preparation for installing a cluster ( http://go.microsoft.com/fwlink/?linkid=16302).
- Windows 2000: For information about how to create a Windows 2000 cluster, see Step-by-Step Guide to Installing Cluster Service ( http://go.microsoft.com/fwlink/?LinkId=83053).
Cluster Requirements
Before you deploy Exchange Server 2003 on a Windows 2000 Server or Windows Server 2003 cluster, make sure that your organization meets the requirements listed in this section.
System-Wide Cluster Requirements
Before you deploy the Exchange Server 2003 cluster, make sure that the following system-wide requirements are met:
- Make sure that you are running Domain Name System (DNS) and Windows Internet Name Service (WINS). Ideally, the DNS server should accept dynamic updates. If the DNS server does not accept dynamic updates, you must create a DNS Host (A) record for each Network Name resource in the cluster. Otherwise, Exchange does not function correctly. For more about how to configure DNS for Exchange, see Microsoft Knowledge Base article 322856, “HOW TO: Configure DNS for Use with Exchange Server” ( http://go.microsoft.com/fwlink/?linkid=3052&kbid=322856).
- If the cluster nodes belong to a directory naming service zone that has a different name than the Microsoft Active Directory directory service domain name that the computer joined, the DNSHostName, by default, does not include the subdomain name. In this situation, you may have to change the DNSHostName property to make sure that some services, such as the File Replication Service (FRS), work correctly. For more information, see Microsoft Knowledge Base article 240942, “Active Directory DNSHostName Property Does Not Include Subdomain” ( http://go.microsoft.com/fwlink/?linkid=3052&kbid=240942).
- All cluster nodes must be member servers in the same domain. Exchange Server 2003 is not supported on nodes that are also Active Directory directory servers, or nodes that are members of different Active Directory domains.
- You must have a sufficient number of static IP addresses available when you create the Exchange Virtual Servers. Specifically, an <n>-node cluster with <e> Exchange Virtual Servers requires 2*n + e + 1 IP addresses. The +1 in this equation represents the additional IP address for the default cluster group. Therefore, for a two-node cluster, the recommended number of static addresses is five plus the number of Exchange Virtual Servers. For a four-node cluster, the recommended number is nine plus the number of Exchange Virtual Servers. For more information about IP addresses, see the section “IP Addresses and Network Names” in the guide Planning an Exchange Server 2003 Messaging System ( http://go.microsoft.com/fwlink/?LinkId=47584).
Note: Throughout this topic, “Exchange Virtual Server” refers to the Exchange Virtual Servers in the cluster and not to protocol virtual servers, such as HTTP virtual servers.
- Make sure that the Cluster service is installed and running on all nodes before you install Exchange Server 2003. In Windows 2000, you must install and configure the Cluster service manually. In Windows Server 2003 Enterprise and Datacenter Editions, the Cluster service is installed by default. After the service is installed, you can use Cluster Administrator to configure the cluster. If the Cluster service is not installed and running on each node in a cluster before installation, Exchange Server 2003 Setup cannot install the cluster-aware version of Exchange Server 2003. When deployed in a multi-node cluster, Exchange Server 2003 requires one of the following quorum models:
- Single quorum device (shared disk quorum)
- Majority Node Set (MNS)
- Majority Node Set with File Share Witness (MNS+FSW)
Note: If you installed Exchange Server 2003 before building and configuring the cluster, you must uninstall Exchange Server 2003, build and configure the cluster, and then reinstall Exchange Server 2003.
- Do not install Exchange Server 2003 on multiple nodes simultaneously.
- An Exchange Server 2003 cluster server cannot be the first Exchange Server 2003 server to join an Exchange Server 5.5 site. This is because Site Replication Service (SRS) is not supported on an Exchange cluster. You must install a stand-alone (non-clustered) Exchange Server 2003 server into an Exchange Server 5.5 site before installing Exchange Server 2003 on the nodes of the cluster. (The first Exchange Server 2003 server installed in an Exchange Server 5.5 site runs SRS.) For more information about SRS, see Exchange Server 2003 Help.
- Before you install Exchange Server 2003, make sure that the folder to which you will install all the Exchange shared data on the physical disk resource is empty.
- You must install the same version of Exchange Server 2003 on all nodes in the cluster. In addition, the Exchange program files must be installed in the same location on all nodes in the cluster. In Exchange Server 2003, the Exchange binaries are installed on the local storage and not the cluster shared storage.
- At a minimum, you must install Microsoft Exchange Messaging and Collaboration and Microsoft Exchange System Management Tools on all nodes of the cluster.
- The Cluster service account must have local Administrator privileges on the cluster nodes and be a domain user account. You can establish those permissions by creating a domain user account and making this account a member of the local Administrators group on each node.
- By default in Windows 2000 and later versions, any user account has the permission to join a computer to the domain. If this permission has been restricted in accordance with your organization’s security policy, you must explicitly grant that permission. For information about how to verify that the Cluster Service account has the Add Workstations to a Domain User permission, see Microsoft Knowledge Base article 307532, “How to Troubleshoot the Cluster Service Account When It Modifies Computer Objects” ( http://go.microsoft.com/fwlink/?linkid=3052&kbid=307532).
- (Recommendation) Install Terminal Services so that administrators can use Remote Desktop to manage clusters. However, administrators can also use the Administrative Tools package (Adminpak.msi) from any Exchange Server 2003 server to remotely manage clusters.
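The static IP address requirement in the list above (2*n + e + 1) can be checked with a few lines of Python. This is only an illustrative sketch: the function name `required_static_ips` is hypothetical and is not part of any Exchange or Windows tool.

```python
def required_static_ips(nodes: int, exchange_virtual_servers: int) -> int:
    """Return the number of static IP addresses for an n-node cluster with
    e Exchange Virtual Servers, using the formula 2*n + e + 1.

    The trailing +1 is the address for the default cluster group.
    """
    if nodes < 1 or exchange_virtual_servers < 0:
        raise ValueError("nodes must be >= 1 and exchange_virtual_servers >= 0")
    return 2 * nodes + exchange_virtual_servers + 1


if __name__ == "__main__":
    # A two-node cluster needs five addresses plus one per Exchange Virtual Server;
    # a four-node cluster needs nine plus one per Exchange Virtual Server.
    for n, e in [(2, 1), (4, 3)]:
        print(f"{n} nodes, {e} EVS: {required_static_ips(n, e)} static IP addresses")
```

For the four-node scenario with three Exchange Virtual Servers described later in this topic, the formula gives 2*4 + 3 + 1 = 12 addresses.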
Note: By default, Terminal Services is installed on servers that run Windows Server 2003. Terminal Services is an optional component on servers that run Windows 2000.
Server-Specific Cluster Requirements
Before you deploy the Exchange Server 2003 cluster, make sure that your servers meet the requirements described in this section.
Hardware Requirements
The hardware requirements to deploy Exchange Server 2003 clusters depend on the operating system you are running.
- Windows Server 2003 hardware requirements
For Exchange Server 2003 cluster nodes running on Windows Server 2003, Enterprise or Datacenter Editions, you must select from hardware listed in the Windows Server Catalog ( http://go.microsoft.com/fwlink/?LinkId=17219) under the Cluster Solutions category. Additionally, for geographically dispersed clusters, both the hardware and software configuration must be certified and listed in the Windows Server Catalog under the Geographically Dispersed Cluster Solution category.
- Windows 2000 Server hardware requirements
Exchange Server 2003 cluster nodes running on Windows 2000 Server must be running the Advanced Server or Datacenter Server editions. For information about the hardware requirements for these editions, see the section “Checklists for Cluster Server Installation” in the technical article Step-by-Step Guide to Installing Cluster Service ( http://go.microsoft.com/fwlink/?LinkId=83053).
Note: To simplify configuration issues and possibly eliminate some compatibility problems, we recommend that the cluster configuration contain identical storage hardware on all cluster nodes.
Operating System Version and Exchange Edition Requirements
Specific operating system versions and Exchange editions are required to create Exchange clusters. Table 1 lists the required Windows 2000 and Windows Server 2003 versions and Exchange Server 2003 editions, and the number of cluster nodes available for each.
Important: Exchange Server 2003, Standard Edition does not support clustering. Similarly, Windows 2000 Server and Windows Server 2003, Standard Edition do not support clustering.
Table 1 Operating system versions and Exchange edition requirements
| Operating system version | Exchange Server 2003 edition | Cluster nodes available |
| --- | --- | --- |
| Any server in the Windows 2000 Server or Windows Server 2003 families | Exchange Server 2003, Standard Edition | None |
| Windows 2000 Server or Windows Server 2003, Standard Edition | Exchange Server 2003, Standard Edition or Exchange Server 2003, Enterprise Edition | None |
| Windows 2000 Advanced Server | Exchange Server 2003, Enterprise Edition | Up to two |
| Windows 2000 Datacenter Server | Exchange Server 2003, Enterprise Edition | Up to four |
| Windows Server 2003, Enterprise Edition | Exchange Server 2003, Enterprise Edition | Up to eight |
| Windows Server 2003, Datacenter Edition | Exchange Server 2003, Enterprise Edition | Up to eight |
Shared Disk Requirements
The following are the minimum shared disk requirements for installing Exchange Server 2003 on a Windows 2000 or Windows Server 2003 cluster:
- Shared disks must be physically attached to a shared bus.
- Disks must be accessible from all nodes in the cluster.
- Disks must be configured as basic disks, and not dynamic disks.
- All partitions on the shared disk must be formatted for NTFS file system.
- Only physical disks can be used as a cluster resource. All partitions on a physical disk will be treated as one resource.
- We recommend that you use Diskpart to align the shared storage disks at the storage level. Diskpart is part of the Windows Server 2003 Service Pack 1 tools. For more information, see “How to Align Exchange I/O with Storage Track Boundaries” in Optimizing Storage for Exchange Server 2003.
Network Configuration Requirements
Make sure that the networks used for client and cluster communications are configured correctly. This section provides links to the procedures necessary to verify that your private and public network settings are configured correctly. In addition, you must make sure that the network connection order is configured correctly for the cluster.
For detailed steps about how to configure the private network in an Exchange cluster, see How to Configure the Private Network in an Exchange Cluster.
For detailed steps about how to configure the public network in an Exchange cluster, see How to Configure the Public Network in an Exchange Cluster.
For detailed steps about how to configure the network connection order in an Exchange cluster, see How to Configure the Network Connection Order in an Exchange Cluster.
Figure 1 illustrates a network configuration for a 4-node cluster.
Figure 1 Network configuration for a four-node cluster

For more information about how to configure public and private networks on a cluster, see Microsoft Knowledge Base article 258750, “Recommended Private ‘Heartbeat’ Configuration on a Cluster Server” ( http://go.microsoft.com/fwlink/?linkid=3052&kbid=258750).
Clustering Permission Model Changes
The permissions that you need to create, delete, or modify an Exchange Virtual Server are modified in Exchange Server 2003. The best way to understand these modifications is to compare the Exchange 2000 Server permissions model with the new Exchange Server 2003 permissions model.
Note: In the following sections, the term “cluster administrator” refers to the person who manages Exchange clusters for your organization.
Exchange 2000 Server Permissions Model
For an Exchange 2000 Server cluster administrator to create, delete, or modify an Exchange Virtual Server, the cluster administrator’s account and the Cluster Service account require the following permissions:
- If the Exchange Virtual Server is the first Exchange Virtual Server in the Exchange organization, the cluster administrator’s account and the Cluster Service account must each be a member of a group that has the Exchange Full Administrator role applied at the organization level.
- If the Exchange Virtual Server is not the first Exchange Virtual Server in the organization, the cluster administrator’s account and the Cluster Service account must each be a member of a group that has the Exchange Full Administrator role applied at the administrative group level.
Exchange Server 2003 Permissions Model
In Exchange Server 2003, the permissions model has changed. The Windows Cluster Service account no longer requires Exchange-specific permissions. Specifically, the Windows Cluster Service account no longer requires that the Exchange Full Administrator role be applied to it, either at the Exchange organization level or at the administrative group level. Its default permissions in the forest are sufficient for it to function in Exchange Server 2003.
As with Exchange 2000 Server, the cluster administrator requires the following permissions:
- If the Exchange Virtual Server is the first Exchange Virtual Server in the organization, the cluster administrator must be a member of a group that has the Exchange Full Administrator role applied at the organization level.
- If the Exchange Virtual Server is not the first Exchange Virtual Server in the organization, you must use an account that is a member of a group that has the Exchange Full Administrator role applied at the administrative group level.
However, depending on the mode in which the Exchange organization is running (native mode or mixed mode), and depending on your topology configuration, the cluster administrators must have the following additional permissions:
- When the Exchange organization is in native mode, if the Exchange Virtual Server is in a routing group that spans multiple administrative groups, the cluster administrator must be a member of a group that has the Exchange Full Administrator role applied at all the administrative group levels that the routing group spans. For example, if the Exchange Virtual Server is in a routing group that spans the First Administrative Group and Second Administrative Group, the cluster administrator must use an account that is a member of a group that has the Exchange Full Administrator role applied at First Administrative Group and must also be a member of a group that has the Exchange Full Administrator role applied at Second Administrative Group.
Note: Routing groups in Exchange native-mode organizations can span multiple administrative groups. Routing groups in Exchange mixed-mode organizations cannot span multiple administrative groups.
- In topologies such as parent/child domains where the cluster server is the first Exchange server in the child domain, the cluster administrator must be a member of a group that has the Exchange Administrator role or greater applied at the organization level to be able to specify the server responsible for Recipient Update Service in the child domain.
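The cluster administrator rules above amount to a small decision procedure. The following Python sketch restates them as a checklist generator. It is illustrative only: the function name, parameters, and returned tuples are hypothetical and do not correspond to any Exchange API, and the sketch assumes that Exchange Full Administrator at the organization level already satisfies the "Exchange Administrator role or greater" condition.

```python
from typing import List, Sequence, Tuple

FULL_ADMIN = "Exchange Full Administrator"
ADMIN_OR_GREATER = "Exchange Administrator or greater"
ORG = "organization"


def cluster_admin_requirements(
    first_evs_in_org: bool,
    admin_group: str,
    native_mode: bool = False,
    routing_group_spans: Sequence[str] = (),
    first_exchange_server_in_child_domain: bool = False,
) -> List[Tuple[str, str]]:
    """Return (role, scope) pairs the cluster administrator's group needs."""
    needed: List[Tuple[str, str]] = []

    # Base rule: organization level for the first Exchange Virtual Server,
    # administrative group level otherwise.
    if first_evs_in_org:
        needed.append((FULL_ADMIN, ORG))
    else:
        needed.append((FULL_ADMIN, admin_group))

    # Native mode only: cover every administrative group that the
    # routing group spans.
    if native_mode:
        for group in routing_group_spans:
            if (FULL_ADMIN, group) not in needed:
                needed.append((FULL_ADMIN, group))

    # Parent/child topology: needed to specify the Recipient Update Service
    # server. Assumed to be covered already by Full Administrator at the
    # organization level.
    if first_exchange_server_in_child_domain and (FULL_ADMIN, ORG) not in needed:
        needed.append((ADMIN_OR_GREATER, ORG))

    return needed


if __name__ == "__main__":
    # The example from the text: a routing group spanning two administrative groups.
    for role, scope in cluster_admin_requirements(
        first_evs_in_org=False,
        admin_group="First Administrative Group",
        native_mode=True,
        routing_group_spans=["First Administrative Group", "Second Administrative Group"],
    ):
        print(f"{role} at {scope}")
```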
Deployment Scenarios
After you ensure that the Exchange organization meets the clustering requirements listed in this topic, you are ready to deploy an Exchange Server 2003 cluster. This section provides links to the procedures necessary to deploy active/passive or active/active Exchange Server 2003 clusters on Windows Server 2003. Any procedural differences with regard to deploying Exchange Server 2003 clusters on Windows 2000 are explained.
The following deployment scenarios are included in this section:
- Four-node cluster scenario
- Deploying a new Exchange Server 2003 cluster
- Upgrading an Exchange 2000 Server cluster to Exchange Server 2003
- Migrating an Exchange Server 5.5 cluster to Exchange Server 2003
- Upgrading mixed Exchange 2000 Server and Exchange Server 5.5 clusters
Four-Node Cluster Scenario
Although the deployment procedures listed in this section apply to any cluster configuration, it helps to understand one of the more typical four-node cluster deployments.
The recommended configuration for a four-node Exchange Server 2003 cluster is one that contains three active nodes and one passive node, where each of the active nodes contains one Exchange Virtual Server. This configuration is helpful because it gives you the capacity of running three active Exchange servers, while maintaining the failover security provided by one passive server.
Figure 2 illustrates the four-node, active/passive Exchange Server 2003 cluster.
Figure 2 Four-node active/passive Exchange Server 2003 cluster

The following sections provide the recommended software, hardware, and storage requirements for an Exchange Server 2003 active/passive four-node cluster.
Software Recommendations
In this scenario, all four nodes in the cluster are running Windows Server 2003, Enterprise Edition and Exchange Server 2003, Enterprise Edition. Additionally, each node is connected to a DNS server configured for dynamic updates.
Hardware Recommendations
In this scenario, the following hardware configurations are recommended.
Server hardware
- Four 1 gigahertz (GHz), 1 megabyte (MB) or 2 MB L2 cache processors
- 4 gigabytes (GB) of Error Correction Code (ECC) RAM
- Two 100 megabits per second (Mbps) or 1000 Mbps network interface cards
- RAID-1 array with two internal disks for the Windows Server 2003 and Exchange Server 2003 program files
- Two redundant 64-bit fiber Host Bus Adapters (HBAs) to connect to the Storage Area Network
Local area network hardware
- Two 100 Mbps or 1000 Mbps network switches (full duplex)
Storage Area Network hardware
- Redundant fiber switches
- 104 disk spindles (Ultra Wide SCSI) with spindle speeds of 10,000 RPM or greater
- 256 MB or more read/write cache memory
Storage Configuration Recommendations
In this scenario, the following storage configurations are recommended:
Storage groups and databases
- Three storage groups per Exchange Virtual Server
- Five databases per storage group
Disk drive configuration
Table 2 lists the recommended disk drive configuration. For more information about this and other disk drive configurations, see “Drive Letter Configurations” in the guide Planning an Exchange Server 2003 Messaging System ( http://go.microsoft.com/fwlink/?LinkId=47584).
Table 2 Disk drive configuration for a four-node active/passive cluster containing three Exchange Virtual Servers
| Node 1 (EVS1 active) | Node 2 (EVS2 active) | Node 3 (EVS3 active) | Node 4 (passive) |
| --- | --- | --- | --- |
| Disk 1: SMTP/MTA | Disk 8: SMTP | Disk 15: SMTP | Disk 22: Quorum |
| Disk 2: SG1 databases | Disk 9: SG1 databases | Disk 16: SG1 databases | |
| Disk 3: SG1 logs | Disk 10: SG1 logs | Disk 17: SG1 logs | |
| Disk 4: SG2 databases | Disk 11: SG2 databases | Disk 18: SG2 databases | |
| Disk 5: SG2 logs | Disk 12: SG2 logs | Disk 19: SG2 logs | |
| Disk 6: SG3 databases | Disk 13: SG3 databases | Disk 20: SG3 databases | |
| Disk 7: SG3 logs | Disk 14: SG3 logs | Disk 21: SG3 logs | |
Storage Area Network disk configuration
- SMTP/MTA drives: RAID-(0+1) array consisting of four spindles. (3 EVSs × 4 disks = 12 disks.)
- Storage group log drives: RAID-1 array consisting of two spindles. (3 EVSs × 3 storage groups × 2 disks = 18 disks.)
- Database (.edb and .stm files) drives: RAID-(0+1) array consisting of eight spindles. (3 EVSs × 3 storage groups × 8 disks = 72 disks.)
- Quorum disk resource drive: RAID-1 array consisting of two spindles (2 disks).
The total number of shared disk spindles is 104.
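The spindle total follows directly from the per-array counts listed above. The short Python sketch below reproduces the arithmetic so the figures can be rechecked if the number of Exchange Virtual Servers or storage groups changes. The function and constant names are hypothetical; the spindle counts per array are the ones given in this scenario.

```python
# Spindles per array, as listed for this scenario.
SMTP_MTA_SPINDLES = 4   # RAID-(0+1), one array per Exchange Virtual Server
LOG_SPINDLES = 2        # RAID-1, one array per storage group
DATABASE_SPINDLES = 8   # RAID-(0+1), one array per storage group
QUORUM_SPINDLES = 2     # RAID-1, one array for the whole cluster


def shared_spindles(evs_count: int = 3, storage_groups_per_evs: int = 3) -> dict:
    """Return the spindle count per drive type and the overall total."""
    counts = {
        "smtp_mta": evs_count * SMTP_MTA_SPINDLES,
        "logs": evs_count * storage_groups_per_evs * LOG_SPINDLES,
        "databases": evs_count * storage_groups_per_evs * DATABASE_SPINDLES,
        "quorum": QUORUM_SPINDLES,
    }
    counts["total"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    for name, count in shared_spindles().items():
        print(f"{name}: {count}")
```

With the defaults (three Exchange Virtual Servers, three storage groups each), the parts are 12, 18, 72, and 2 spindles.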
Deploying a New Exchange Server 2003 Cluster
This section provides information about how to deploy a new Exchange Server 2003 cluster in your organization. The procedures referenced in this section are applicable for any cluster configuration, from an active/passive cluster with two to eight nodes to a two-node active/active cluster.
Specifically, this section will guide you through the following steps:
- Preparing Active Directory for Exchange Server 2003.
- Installing Exchange Server 2003 on each node.
- Creating the Exchange Virtual Servers.
Step 1: Preparing Active Directory for Exchange Server 2003
Preparing Active Directory for a cluster installation resembles preparing Active Directory for non-clustered servers.
Step 1 includes the following tasks:
- Run ForestPrep.
- Run DomainPrep.
Running ForestPrep
Before you install Exchange Server 2003 anywhere in the forest, you must extend the Windows Active Directory schema. To accomplish this task, you must run ForestPrep.
Note: Running ForestPrep is required only if you are installing Exchange Server 2003 for the first time in your organization. If you already installed Exchange Server 2003 in your organization, you do not have to run ForestPrep.
For detailed steps about how to run ForestPrep, see How to Run Exchange Server 2003 ForestPrep.
Note: During the ForestPrep process, you will enter the name of the user or group who is responsible for installing Exchange Server 2003. This account must be a domain account that includes local administrator privileges on the cluster nodes. The account you specify will also have permission to use the Exchange Delegation Wizard to create all levels of Exchange Server 2003 administrator accounts.
Running DomainPrep
You must run DomainPrep for each Windows 2000 or Windows Server 2003 domain in which you want to install Exchange Server 2003. However, before you can run DomainPrep, ForestPrep must finish replicating the schema updates.
Note: Running DomainPrep is required only if you are installing Exchange Server 2003 for the first time in your domain. If you already installed Exchange Server 2003 in your domain, you do not have to run DomainPrep.
For detailed steps about how to run DomainPrep, see How to Run Exchange Server 2003 DomainPrep.
Step 2: Installing Exchange Server 2003 on Each Node
After you extend the schema with ForestPrep and prepare the domain with DomainPrep, you are ready to install Exchange Server 2003 on the first cluster node.
Step 2 includes the following tasks:
- Make sure that the Cluster service is running on each node.
- Install and enable the required Windows services.
- Install Microsoft Distributed Transaction Coordinator (MSDTC).
- Run Exchange Server 2003 Setup.
However, before you perform these tasks, familiarize yourself with the requirements necessary for installing Exchange Server 2003 on cluster servers (Table 3).
Table 3 Requirements for running Exchange Setup on a cluster server
Permissions:
- Account must be a member of a group that has the Exchange Full Administrator role applied at the organization level.
- An account that has the Exchange Full Administrator role applied at the administrative group level can run Exchange Setup on a cluster node if the cluster node is a member of the Exchange Domain Servers group on the domain to which the cluster node belongs.
Note: When you install Exchange Server 2003 into an existing Exchange Server 5.5 organization, additional permissions are required. For information about the specific permissions that are required to install Exchange Server 2003 into an existing Exchange Server 5.5 organization, see “Permissions for Migrating from Exchange Server 5.5 to Exchange Server 2003” in Migrating from Exchange Server 5.5 to Exchange Server 2003.
File system:
- Installation drive cannot be the cluster shared drive.
- Installation drive must be the same across all nodes.
Cluster resources:
- The MSDTC must be running on one of the nodes in the cluster. The clustered MSDTC resource should exist in the default cluster group.
Other:
- The fully qualified domain name (FQDN) of the node cannot match the Simple Mail Transfer Protocol (SMTP) proxy domain of any recipient policy.
- A cluster with three or more nodes is usually active/passive. In active/passive mode, there can be n – 1 or fewer Exchange Virtual Servers, where n is the number of nodes. For example, if, by installing Exchange on a node, the cluster becomes a three-node cluster, and the number of Exchange Virtual Servers is three or more, then Exchange Setup stops installation until you remove one of the Exchange Virtual Servers.
Note:
- The Cluster service must be initialized and running.
- If you have more than two nodes, the cluster must be active/passive. If you have two nodes or fewer, an active/active configuration is allowed.
- If you are running Windows 2000, Windows 2000 Service Pack 4 (SP4) is required. To obtain Windows 2000 SP4, see the Windows 2000 Service Packs Web site ( http://go.microsoft.com/fwlink/?LinkId=18353).
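The node and Exchange Virtual Server limits in Table 3 can be expressed as a quick pre-installation check. The Python sketch below is illustrative only. The function names are hypothetical, it does not query a real cluster, and it assumes that a two-node active/active cluster hosts at most one Exchange Virtual Server per node.

```python
def max_exchange_virtual_servers(nodes: int, active_active: bool = False) -> int:
    """Return the largest number of Exchange Virtual Servers for a cluster.

    Active/passive clusters allow n - 1 or fewer Exchange Virtual Servers.
    Active/active is modeled here only for a two-node cluster.
    """
    if nodes < 2:
        raise ValueError("this sketch models clusters with at least two nodes")
    if active_active:
        if nodes > 2:
            raise ValueError("clusters with more than two nodes must be active/passive")
        return nodes  # assumption: one Exchange Virtual Server per node
    return nodes - 1


def setup_would_stop(nodes_after_install: int, evs_count: int) -> bool:
    """Return True when adding a node leaves a cluster of three or more nodes
    with more than n - 1 Exchange Virtual Servers."""
    return nodes_after_install >= 3 and evs_count > nodes_after_install - 1


if __name__ == "__main__":
    # Example from the table: a three-node cluster with three Exchange Virtual Servers.
    print(setup_would_stop(nodes_after_install=3, evs_count=3))
    # The recommended four-node layout: three active nodes, one passive node.
    print(max_exchange_virtual_servers(4))
```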
Ensuring That the Cluster Service is Running on Each Node
To successfully install Exchange Server 2003 on a server in a cluster, the Cluster service must be installed and running on a cluster node. The Cluster service is installed by default with Windows Server 2003, Enterprise Edition and Windows Server 2003, Datacenter Edition. However, the Cluster service is not installed by default with Windows 2000 Server.
For detailed steps about how to confirm that the Cluster service is running, see How to Verify that the Cluster Service is Running on Each Node.
Installing and Enabling Required Windows Services
Exchange Server 2003 Setup requires that the following components and services be installed and enabled on the server:
- .NET Framework
- ASP.NET
- Internet Information Services (IIS)
- World Wide Web Publishing Service
- Simple Mail Transfer Protocol (SMTP) service
- Network News Transfer Protocol (NNTP) service
If you are installing Exchange Server 2003 on a server running Windows 2000, Exchange Setup installs the Microsoft .NET Framework and ASP.NET automatically. You must manually install and start the World Wide Web Publishing service, the SMTP service, and the NNTP service before running Exchange Server 2003 Setup.
Important: When you install Exchange on a new server, only the required services are enabled. For example, the Post Office Protocol version 3 (POP3) and Internet Message Access Protocol version 4 (IMAP4) services are disabled by default on all of your Exchange Server 2003 servers. You should only enable services that are essential for performing Exchange Server 2003 tasks. The NNTP service should always remain disabled. Although NNTP is required in order to install Exchange, Exchange NNTP features are not supported and cannot be used on clustered Exchange servers.
For detailed steps about how to install and enable the IIS prerequisites for an Exchange cluster running on Windows 2000, see How to Install IIS Prerequisites for Exchange Server 2003 on Windows 2000.
For detailed steps about how to install and enable the IIS prerequisites for an Exchange cluster running on Windows Server 2003, see How to Install IIS Prerequisites for Exchange Server 2003 on Windows Server 2003.
Installing Microsoft Distributed Transaction Coordinator
Before you install Exchange Server 2003 on servers running Windows Server 2003 or Windows 2000, you must first install the Microsoft Distributed Transaction Coordinator (MSDTC) resource into the cluster.
It is an Exchange best practice to install the MSDTC resource into the default cluster group. However, the MSDTC resource is the only resource supported in the default cluster group. Exchange resources should not be added to the default cluster group, as that configuration is not supported.
For detailed steps about how to install the MSDTC in a Windows 2000 server cluster, see How to Install the Microsoft Distributed Transaction Coordinator in a Windows 2000 Server Cluster.
For detailed steps about how to install the MSDTC in a Windows Server 2003 server cluster, see How to Install the Microsoft Distributed Transaction Coordinator in a Windows Server 2003 Server Cluster.
Note: For more information, see Microsoft Knowledge Base article 312316, “XADM: Setup Does Not Install Exchange 2000 Server on a Cluster if the MSDTC Resource Is Not Running” ( http://go.microsoft.com/fwlink/?linkid=3052&kbid=312316). For more information about adding the MSDTC resource in Windows Server 2003, see Microsoft Knowledge Base article 301600, “How to Configure Microsoft Distributed Transaction Coordinator on a Windows Server 2003 Cluster” ( http://go.microsoft.com/fwlink/?linkid=3052&kbid=301600).
Note: Knowledge Base article 301600 includes a reference to article 817064, “How to enable network DTC access in Windows Server 2003” ( http://go.microsoft.com/fwlink/?linkid=3052&kbid=817064). It is an Exchange Server security best practice to not enable network DTC access for an Exchange cluster. If you are configuring the Distributed Transaction Coordinator resource for an Exchange cluster, do not enable network DTC access.
Running Exchange Setup
Installing Exchange Server 2003 on a cluster is similar to installing Exchange Server 2003 on non-clustered servers. For detailed steps about how to run Exchange Setup in a Windows server cluster, see How to Run Exchange Setup in a Windows Server Cluster.
Note: Unattended Setup is not supported when installing Exchange Server 2003 on a Windows cluster. Before installing Exchange Server 2003 on a node, it is recommended that you move all cluster resources owned by the node to another node.
Important: Install Exchange Server 2003 completely on one node before you install it on another node. For important information about post-deployment steps, see Post-Installation Steps for Exchange Server 2003. That topic includes information about how to verify that your Exchange installation was successful. It also includes information about how to upgrade your cluster with the latest Exchange Server 2003 service packs and security patches.
Step 3: Creating the Exchange Virtual Servers
The final step in configuring Exchange Server 2003 on a cluster is to create the Exchange Virtual Servers.
Step 3 includes the following tasks:
- Create the resource group to host the Exchange Virtual Server. A separate cluster group is required for each Exchange Virtual Server. Exchange cluster resources should not be added to the default cluster group, and adding an Exchange Virtual Server to the default cluster group is not supported. For detailed steps, see How to Create a Resource Group for an Exchange Virtual Server in a Windows Server Cluster.
- Create an IP Address resource. For detailed steps, see How to Create an IP Address Resource for an Exchange Virtual Server in a Windows Server Cluster.
- Create a Network Name resource. For detailed steps, see How to Create a Network Name Resource for an Exchange Virtual Server in a Windows Server Cluster.
- Add a disk resource to the Exchange Virtual Server. For detailed steps, see How to Move an Existing Disk Resource into an Exchange Virtual Server in a Windows Server Cluster.
- Create an Exchange Server 2003 System Attendant resource. For detailed steps, see How to Create an Exchange System Attendant Resource for an Exchange Virtual Server in a Windows Server Cluster.
- Create any additional Exchange Virtual Servers. You need to repeat these tasks for each Exchange Virtual Server you want to add to your cluster. For example:
- If you are configuring a two-node active/passive Exchange Server 2003 cluster, you create only one Exchange Virtual Server. Therefore, you would only perform these tasks once.
- If you are configuring a four-node 3 active/1 passive Exchange Server 2003 cluster, you create three Exchange Virtual Servers. Therefore, you would perform these tasks three times.
Before performing these tasks, familiarize yourself with the requirements necessary for creating Exchange Virtual Servers (Table 4).
Table 4 Exchange Virtual Server requirements
Permissions:
- If you are creating either the first Exchange server in the organization or the first Exchange server in the domain, the account must be a member of a group that has the Exchange Full Administrator role applied at the organizational level.
- If the server is not the first Exchange server in the organization and is not the first server in the domain, the account must be a member of a group that has the Exchange Full Administrator role applied at the administrative group level.
File system:
- The MDBDATA folder must be empty.
Cluster resources:
- The Network Name resource must be online.
- The Physical Disk resources must be online.
Other:
- The FQDN of the Exchange Virtual Server must not match the SMTP proxy domain of any recipient policy.
- Enforce Active/Active restrictions.
- Each Exchange Virtual Server must be installed into its own cluster group.
Adding a Disk Resource to the Exchange Virtual Server
You must add a disk resource for each disk that you want to associate with the Exchange Virtual Server. This section includes links to the following procedures:
- If the disk resource you want to add already exists, follow the procedure to move an existing disk resource. For detailed steps, see How to Move an Existing Disk Resource into an Exchange Virtual Server in a Windows Server Cluster.
- If the disk resource you want to add does not yet exist, follow the procedure to create a new disk resource. For detailed steps, see How to Create a Physical Disk Resource for an Exchange Virtual Server in a Windows Server Cluster.
- If you are using mounted drives, follow the procedure to add mounted drives. This procedure applies only to server clusters running Windows Server 2003. Mounted drives are not supported in Windows 2000 server clusters. For detailed steps, see How to Add a Mounted Drive to an Exchange Virtual Server in a Windows Server Cluster.
Note: To prevent possible damage to your hard disk, see “Checklist: Creating a server cluster” in Windows 2000 Help or “Planning and preparing for cluster installation” in Windows Server 2003 Help before connecting a disk to a shared bus.
After you successfully create the Exchange System Attendant resource, Exchange System Attendant creates the following additional resources for the Exchange Virtual Server automatically (Figure 3):
- Exchange Information Store Instance
- Exchange Message Transfer Agent Instance
- Exchange Routing Service Instance
- SMTP Virtual Server Instance
- Exchange HTTP Virtual Server Instance
- Exchange MS Search Instance
For improved security, the Windows IMAP4 and POP3 protocol services are no longer enabled by default on servers that are running Windows Server 2003. Similarly, the IMAP4 and POP3 protocol resources are no longer created by default upon creation of an Exchange Server 2003 Virtual Server.
For information about adding IMAP4 and POP3 resources, see “Managing Exchange Clusters,” in the Exchange Server 2003 Administration Guide ( http://go.microsoft.com/fwlink/?LinkId=47617).
Note: The Message Transfer Agent Instance resource is created only in the first Exchange Virtual Server added to a cluster. All Exchange Virtual Servers in the cluster share the single Message Transfer Agent Instance resource.
Figure 3 Exchange Virtual Server resources

Repeating Step 3 for the Next Exchange Virtual Server
For each Exchange Virtual Server you want to create, repeat all the procedures in “Step 3: Creating the Exchange Virtual Servers.” For example, if you are creating a four-node active/passive cluster with three Exchange Virtual Servers, repeat this step two more times. If you are creating a two-node active/active cluster, you would repeat this step one more time.
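The arithmetic behind these examples can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and the rule of one Exchange Virtual Server per active node is inferred from the examples in this topic.

```python
def step3_repetitions(active_nodes: int) -> int:
    """Return how many more times Step 3 is repeated after the first pass.

    Assumes one Exchange Virtual Server per active node, as in the
    examples in this topic (an assumption, not a product rule).
    """
    if active_nodes < 1:
        raise ValueError("a cluster needs at least one active node")
    return active_nodes - 1


print(step3_repetitions(3))  # four-node cluster, 3 active/1 passive
print(step3_repetitions(2))  # two-node active/active cluster
```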
Supporting Multiple SMTP Domains in a Front-End and Back-End Topology
If you run Exchange Server 2003 in a front-end and back-end topology that includes multiple SMTP namespaces, you must create additional HTTP virtual servers in the Exchange Virtual Server for each domain namespace. For example, if contoso.com hosts Exchange Server 2003 for both tailspintoys.com and wingtiptoys.com, three virtual servers are necessary—the default virtual server, a virtual server for tailspintoys.com, and a virtual server for wingtiptoys.com. This configuration provides maximum flexibility in determining which resources are available to each hosted company.
For information about front-end and back-end server architecture, see “Upgrading Front-End and Back-End Servers” in Upgrading from Exchange 2000 Server to Exchange Server 2003. For information about planning a front-end server and for more conceptual information about configuring front-end and back-end servers running Exchange Server 2003, see the guide Planning an Exchange Server 2003 Messaging System ( http://go.microsoft.com/fwlink/?LinkId=47584).
To configure a clustered back-end server to support multiple SMTP domains, you must map each front-end server to the nodes of your cluster, so that any node can accept proxy requests from any front-end server in your organization.
For detailed steps, see How to Support Multiple SMTP Domains in a Front-End and Back-End Topology.
Figure 4 illustrates a front-end/back-end configuration that uses Exchange clustering.
Figure 4 Front-end and back-end configuration that uses Exchange clustering

Upgrading an Exchange 2000 Server Cluster to Exchange Server 2003
Upgrading an Exchange 2000 Server cluster to Exchange Server 2003 requires that you upgrade each of the cluster nodes and all Exchange Virtual Servers to Exchange Server 2003.
For detailed steps, see How to Upgrade an Exchange 2000 Cluster to Exchange Server 2003.
Note: Before upgrading your Exchange 2000 cluster to Exchange Server 2003, you should familiarize yourself with the requirements necessary for upgrading a cluster node (Table 5) and upgrading an Exchange Virtual Server (Table 6).
Table 5 Requirements for upgrading a cluster node
Permissions:
- Account must be a member of a group that has the Exchange Full Administrator role applied at the administrative group level.
Cluster resources:
- No cluster resources can be running on the node you are upgrading, because Exchange Setup will need to recycle the Cluster service. One-node clusters are exempt.
- The MSDTC resource must be running on one of the nodes in the cluster.
Other:
- Only servers running Exchange 2000 SP3 can be upgraded to Exchange Server 2003. If your servers are running previous versions of Exchange, you must first upgrade to Exchange 2000 SP3.
- You must upgrade your cluster nodes one at a time.
- The Cluster service must be initialized and running.
- If there are more than two nodes, the cluster must be active/passive. If there are two nodes or fewer, active/active is allowed.
If running Windows 2000:
- Windows 2000 SP4 is required.
To obtain Windows 2000 SP4, go to the Windows 2000 Service Packs Web site ( http://go.microsoft.com/fwlink/?LinkId=18353).
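The node requirements in Table 5 read naturally as a pre-upgrade checklist. The following Python sketch shows one way to encode them. The `NodeState` fields and the `upgrade_blockers` function are hypothetical names invented for this illustration; they do not correspond to any Exchange or Cluster service API, and the values would have to be collected by hand or by your own tooling.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical snapshot of the cluster node about to be upgraded."""
    cluster_node_count: int
    active_active: bool
    resources_running_on_node: int
    msdtc_running_in_cluster: bool
    exchange_2000_service_pack: int


def upgrade_blockers(state: NodeState) -> list[str]:
    """Return the Table 5 requirements that the node does not yet meet."""
    blockers = []
    if state.cluster_node_count > 2 and state.active_active:
        blockers.append("clusters with more than two nodes must be active/passive")
    # One-node clusters are exempt from the "no running resources" rule.
    if state.cluster_node_count > 1 and state.resources_running_on_node > 0:
        blockers.append("move all cluster resources off the node first")
    if not state.msdtc_running_in_cluster:
        blockers.append("the MSDTC resource must be running in the cluster")
    if state.exchange_2000_service_pack < 3:
        blockers.append("upgrade to Exchange 2000 SP3 first")
    return blockers


node = NodeState(4, True, 1, True, 3)
for problem in upgrade_blockers(node):
    print("Blocked:", problem)
```

The permissions and Windows 2000 SP4 rows are left out of the sketch because checking them depends on the environment.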
Table 6 Requirements for upgrading an Exchange Virtual Server
Permissions:
- If the Exchange Virtual Server is the first server to be upgraded in the organization or is the first server to be upgraded in the domain, the account must be a member of a group that has the Exchange Full Administrator role applied at the organization level.
- If the Exchange Virtual Server is not the first server to be upgraded in the organization or the first Exchange server to be upgraded in the domain, the account only needs to be a member of a group that has the Exchange Full Administrator role applied at the administrative group level.
Cluster resources:
- The Network Name resource must be online.
- The Physical Disk resources must be online.
- The System Attendant resource must be offline.
Other:
- The version of Exchange on the computer running Cluster Administrator must be the same version as the node that owns the Exchange Virtual Server.
- You must upgrade your Exchange Virtual Servers one at a time.
Migrating an Exchange Server 5.5 Cluster to Exchange Server 2003
The procedures for upgrading your cluster nodes from Exchange Server 5.5 to Exchange 2000 Server are outside the scope of this document. For information about how to upgrade Exchange Server 5.5 servers to Exchange 2000 Server, see Microsoft Knowledge Base article 316886, “HOW TO: Migrate from Exchange Server 5.5 to Exchange 2000 Server” ( http://go.microsoft.com/fwlink/?linkid=3052&kbid=316886).
Upgrading Mixed Exchange 2000 Server and Exchange Server 5.5 Clusters
To upgrade Exchange clusters that contain both Exchange 2000 Server and Exchange Server 5.5 nodes, use the procedures in “Upgrading an Exchange 2000 Server Cluster to Exchange Server 2003” earlier in this topic, in conjunction with the procedures listed in Migrating from Exchange Server 5.5 to Exchange Server 2003.
-
Aug17
Server cluster Concepts
Author: Administrator; Filed under: Cluster, Computer & Internet
Server cluster Concepts (Server Clusters: Frequently Asked Questions for Windows 2000 and Windows Server 2003)
Q. What hardware do you need to build a Server cluster?
A. The most important criterion for Server cluster hardware is that it be included in a validated Cluster configuration on the Microsoft Hardware Compatibility List (HCL), indicating it has passed the Microsoft Cluster Hardware Compatibility Test. All qualified solutions appear on the Microsoft HCL ( http://go.microsoft.com/fwlink/?linkid=67738). Only cluster solutions listed on the HCL are supported by Microsoft.
In general, the criteria for building a server cluster include the following:
- Servers: Two or more PCI-based machines running one of the operating system releases that support Server clusters (see below). Server clusters can run on all hardware architectures supported by the base Windows operating system; however, you cannot mix 32-bit and 64-bit architectures in the same cluster.
- Storage: Each server needs to be attached to one or more shared, external storage buses that are separate from the bus containing the system disk, the startup disk, or the pagefile disk. Applications and data are stored on one or more disks attached to this shared storage. There must be enough storage capacity on the shared cluster buses for all of the applications running in the cluster environment. This shared storage configuration allows applications to fail over between servers in the cluster.
Microsoft recommends hardware Redundant Array of Inexpensive Disks (RAID) for all cluster disks to eliminate disk drives as a potential single point of failure. This means using a RAID storage unit, a host-based RAID adapter that implements RAID across disks, etc.
SCSI is supported for 2-node cluster configurations only. Fibre channel arbitrated loop is supported for 2-node clusters only. Microsoft recommends using fibre channel switched fabrics for clusters of more than two nodes.
- Network: Each server needs at least two network cards. Typically, one connects to the public network and the other to a private network between the nodes. A static IP address is needed for each group of applications that move as a unit between nodes. Server clusters can project the identity of multiple servers from a single cluster by using multiple IP addresses and computer names; this is known as a virtual server.
Q. What is a cluster resource?
A. A cluster resource is the lowest-level unit of management in a Server cluster. A resource represents a physical object or an instance of running code. For example, a physical disk, an IP address, an MSMQ queue, and a COM object are all considered to be resources. From a management perspective, resources can be independently started and stopped, and each is monitored to ensure that it is healthy.
Server clusters can monitor any arbitrary resource type. This is possible because Server clusters define a resource plug-in model. Each resource type has an associated resource plug-in, or resource DLL, that is used to start and stop the resource and to provide health information that is specific to the resource type. For example, starting and stopping SQL Server is different from starting and stopping a physical disk. The resource DLL takes care of the differences. Application developers and system administrators can build new resource DLLs for their applications that can be registered with the cluster service.
Server clusters provide some generic plug-ins, known as Generic Service and Generic Application, that can be used to make existing applications cluster-aware very quickly. With Windows Server 2003, a Generic Script resource plug-in was added that allows the resource DLL to be written in any scripting language supported by the Windows operating system.
Q. What is a resource dependency?
A. A complete application actually consists of multiple pieces or multiple resources; some pieces are code and others are physical resources required by the application. The resources are related in different ways; for example, an application that writes to a disk cannot come online until the disk is online. If the disk fails, then, by definition, the application cannot continue to run since it writes to the disk. Resource dependencies can be defined by the application developer or system administrator to capture these relationships. Resource dependencies define the order in which resources are brought online and control how failures are propagated to the various pieces of the application.
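Both effects of a dependency, start order and failure propagation, can be illustrated with a short Python sketch. The resource names and the dependency map below are an illustrative assumption, not a dump of a real cluster, and the sketch uses the standard library's graphlib module (Python 3.9 or later) rather than anything the cluster service itself exposes.

```python
from graphlib import TopologicalSorter

# Illustrative dependency map: each resource lists the resources it needs.
dependencies = {
    "Physical Disk": [],
    "IP Address": [],
    "Network Name": ["IP Address"],
    "System Attendant": ["Network Name", "Physical Disk"],
    "Information Store": ["System Attendant"],
}


def online_order(deps):
    """Return an order in which every resource follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def affected_by_failure(deps, failed):
    """Return every resource that directly or indirectly depends on `failed`."""
    affected = set()
    changed = True
    while changed:
        changed = False
        for resource, needs in deps.items():
            if resource == failed or resource in affected:
                continue
            if failed in needs or affected.intersection(needs):
                affected.add(resource)
                changed = True
    return affected


print(online_order(dependencies))
print(affected_by_failure(dependencies, "Physical Disk"))
```

In this map, a disk failure takes down the System Attendant and the Information Store, while the IP Address and Network Name resources are unaffected.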
Q. What is a resource group?
A. A resource group is a collection of one or more resources that are managed and monitored as a single unit. A resource group can be started or stopped. If a resource group is started, each resource in the group is started (taking into account any start order defined by the dependencies between resources in the group). If a resource group is stopped, all of the resources in the group are stopped. Dependencies between resources cannot span a group. In other words, the set of resources within a group is an autonomous unit that can be started and stopped independently from any other group. A group is a single, indivisible unit that is hosted on one server in a Server cluster at any point in time and it is the unit of failover.
Q. Can I have dependencies between resources in different groups?
A. No, resource dependencies are confined to a single group.
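As a rough illustration of this rule, the following Python sketch flags any dependency whose two ends sit in different groups. The group and resource names are hypothetical, and the function is a planning aid written for this example, not part of any cluster tool.

```python
def cross_group_dependencies(groups, dependencies):
    """List (resource, needed) pairs whose ends are in different groups.

    groups: mapping of group name -> set of resource names
    dependencies: mapping of resource name -> list of resources it needs
    """
    owner = {res: grp for grp, members in groups.items() for res in members}
    violations = []
    for resource, needs in dependencies.items():
        for needed in needs:
            if owner.get(resource) != owner.get(needed):
                violations.append((resource, needed))
    return violations


groups = {
    "EVS1": {"EVS1 Name", "EVS1 IP", "Disk E"},
    "EVS2": {"EVS2 Name", "EVS2 IP", "Disk F"},
}
planned = {
    "EVS1 Name": ["EVS1 IP"],
    "EVS2 Name": ["EVS2 IP", "Disk E"],  # Disk E belongs to EVS1
}
print(cross_group_dependencies(groups, planned))
```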
Q. What is a virtual server?
A. A virtual server is a resource group that contains an IP address resource and a network name resource. When an application is hosted in a virtual server, clients can access the application using the IP address or network name in that resource group. As the resource group fails over across the cluster, the IP address and network name remain the same, so the client is unaware of the physical location of the application and continues to work if one of the servers in the cluster fails.
Q. How can I take advantage of extensibility features of ISA Server?
A. A number of third-party vendors offer solutions such as virus detection, content filtering, site categorization, reporting, and administration. Customers and developers also have the ability to create their own extensions to ISA Server. ISA Server includes a comprehensive software development kit for developing tools that build on ISA Server firewall, caching, and management features.
Q. What is failover?
A. Server clusters monitor the health of the nodes in the cluster and the resources in the cluster. In the event of a server failure, the cluster software re-starts the failed server’s workload on one or more of the remaining servers. If an individual resource or application fails (but the server does not), Server clusters will typically try to re-start the application on the same server; if that fails, it moves the application’s resources and re-starts it on the other server. The process of detecting failures and restarting the application on another server in the cluster is known as failover.
The cluster administrator can set various recovery policies such as whether or not to re-start an application on the same server, and whether or not to automatically “failback” (re-balance) workloads when a failed server comes back online.
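The decision described above (try a local restart first, then move the workload) can be sketched as follows. This is a simplified model under stated assumptions: the function name and its boolean inputs are hypothetical, and the real cluster service applies configurable thresholds and policies that are not modeled here.

```python
def choose_recovery(current_node, other_nodes, node_alive, local_restart_succeeds):
    """Return (action, node) for a failed application.

    action is "restart" when the application restarts in place and
    "failover" when it moves to another node in the cluster.
    """
    if node_alive and local_restart_succeeds:
        return ("restart", current_node)
    if not other_nodes:
        raise RuntimeError("no surviving node can host the application")
    # Simplification: take the first remaining node as the target.
    return ("failover", other_nodes[0])


print(choose_recovery("Node1", ["Node2"], node_alive=True, local_restart_succeeds=True))
print(choose_recovery("Node1", ["Node2"], node_alive=True, local_restart_succeeds=False))
print(choose_recovery("Node1", ["Node2"], node_alive=False, local_restart_succeeds=False))
```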
Q. Is failover transparent to users?
A. Server clusters do not require any special software on client computers, so the user experience during failover depends on the nature of the client side of their client-server application. Client reconnection can be made transparent, because the Server clusters software has restarted the applications, file shares, etc. at exactly the same IP address.
If a client is using “state-less” connections such as a standard browser connection, then the client would be unaware of a failover if it occurred between server requests. If a failure occurs while a client is connected to the failed resource, then the client will receive whatever standard notification is provided by the client side of their application when the server side becomes unavailable. This might be, for example, the standard “Abort, Retry, or Cancel?” prompt you get when using the Windows Explorer to download a file at the time a server or network goes down. In this case, client reconnection is not automatic (the user must choose “Retry”), but the user is fully informed of what is happening and has a simple, well-understood method of re-establishing contact with the server. Of course, in the meantime, the cluster service is busily re-starting the service or application so that, when the user chooses “Retry”, it re-appears as if it never went away.
Q. What is failback?
A. In the event of the failure of a server in a cluster, the applications and resources are failed over to another node in the cluster. When the failed node rejoins the cluster (after reboot for example), that node now is free to be used by applications. A cluster administrator can set policies on resources and resource groups that allow an application to automatically move back to a node if it becomes available, thus automatically taking advantage of a node rejoining the cluster. These policies are known as failback policies. You should take care when defining automatic failback policies since depending on the application, automatically moving the application (which was working just fine) may have undesirable consequences on the clients using the applications.
Q. When an application restarts after failover, does it restore the application state at the time of failure?
A. No, Server clusters provide a fast crash restart mechanism. When an application is failed over and restarted, the application is restarted from scratch. Any persistent data written out to a database or to files is available to the application, but any in-memory state that the application had before the failover is lost.
Q. At what level does failover exist?
A. At the resource group level.
Q. What is a Quorum Resource and how does it help Server clusters provide high availability?
A. Server clusters require a quorum resource to function. The quorum resource, like any other resource, is a resource which can only be owned by one server at a time, and for which servers can negotiate for ownership. Negotiating for the quorum resource allows Server clusters to avoid “split-brain” situations where the servers are active and think the other servers are down. This can happen when, for example, the cluster interconnect is lost and network response time is problematic. The quorum resource is used to store the definitive copy of the cluster configuration so that regardless of any sequence of failures, the cluster configuration will always remain consistent.
Q. What is active/active versus active/passive?
A. Active/Active and Active/Passive are terms used to describe how applications are deployed in a cluster. Unfortunately, they mean different things to different people and so the terms tend to cause confusion.
From the perspective of a single application or database:
- Active/Active means that the same application or pieces of the same service can be run concurrently on different nodes in the cluster. For example SQL Server 2000 can be configured such that the database is partitioned and each node can be running a single instance of the database. SQL Server provides the notion of views to provide a single image of the entire database.
- Active/Passive means that only one node in the cluster can be hosting the given application. For example, a single file share is active/passive. Any given file share can only be hosted on one node at a time.
From the perspective of a set of instances of an application or service:
- Active/Active means that different instances of the same application can be running concurrently on different cluster nodes. For example, each node in a cluster can be running SQL Server against a different database. A single cluster can support many file shares that are hosted on the nodes in a cluster concurrently.
- Active/Passive means that only one instance of a service can be running anywhere in the cluster. For example, there must only be a single instance of the DHCP service running in the cluster at any point in time.
From the perspective of the cluster:
- Active/Active means that all nodes in the cluster are running applications. These may be multiple instances of the same application or different applications (for example, in a 2-node cluster, WINS may be running on one node and DHCP may be running on the other node).
- Active/Passive means that one of the cluster nodes is spare and not being used to host applications.
Server clusters support all of these different combinations; the terms are really about how specific applications or sets of applications are deployed.
With the advent of more than two servers in a cluster, starting with Windows 2000 Datacenter, the term active/active is confusing because there may be four servers. When there are multiple servers, the set of options available for deployment becomes more flexible, allowing different configurations such as N+I.
Q. How do I benefit from more than two nodes in a cluster?
A. Failover is the mechanism that single-instance applications and the individual partitions of a partitioned application typically employ for high availability (the term Pack has been coined to describe a highly available, single-instance application or partition).
In a 2-node cluster, defining failover policies is trivial. If one node fails, the only option is to fail over to the remaining node. As the size of a cluster increases, different failover policies are possible and each one has different characteristics.
Failover Pairs
In a large cluster, failover policies can be defined such that each application is set to failover between two nodes. The simple example below shows two applications App1 and App2 in a 4-node cluster.

Figure 1: Failover pairs
This configuration has pros and cons:
- Pro: Good for clusters that are supporting heavy-weight [1] applications such as databases. This configuration ensures that in the event of failure, two applications will not be hosted on the same node.
- Pro: Very easy to plan capacity. Each node is sized based on the application that it will need to host (just like a 2-node cluster hosting one application).
- Pro: Effect of a node failure on availability and performance of the system is very easy to determine.
- Pro: Get the flexibility of a larger cluster. In the event that a node is taken out for maintenance, the buddy for a given application can be changed dynamically (may end up with standby policy below).
- Con: In simple configurations such as the one above, only 50% of the capacity of the cluster is in use.
- Con: Administrator intervention may be required in the event of multiple failures.
[1] A heavy-weight application is one that consumes a significant number of system resources such as CPU, memory or IO bandwidth.
Failover pairs are supported by server clusters on all versions of Windows by limiting the possible owner list for each resource to a given pair of nodes.
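A minimal Python sketch of this policy follows. The assignment of App1 to Node1/Node2 and App2 to Node3/Node4 is an assumption made for the example, and the function is a model of the possible owner list behavior, not a call into the cluster service.

```python
# Assumed pairing for a 4-node cluster: each application may run on two nodes only.
POSSIBLE_OWNERS = {
    "App1": ["Node1", "Node2"],
    "App2": ["Node3", "Node4"],
}


def failover_target(app, failed_node, up_nodes, possible_owners=POSSIBLE_OWNERS):
    """Return the node the application moves to, or None if its pair is exhausted."""
    for node in possible_owners[app]:
        if node != failed_node and node in up_nodes:
            return node
    return None


print(failover_target("App1", "Node1", {"Node2", "Node3", "Node4"}))
print(failover_target("App1", "Node1", {"Node3", "Node4"}))
```

The second call returns None even though two nodes are still running, which is the case where administrator intervention is required.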
Hot-Standby Server
To reduce the overhead of failover pairs, the spare node for each pair may be consolidated into a single node, providing a hot standby server that is capable of picking up the work in the event of a failure.

Figure 2: Standby Server
This configuration has pros and cons:
- Pro: Good for clusters that are supporting heavy-weight applications such as databases. This configuration ensures that in the event of a single failure, two applications will not be hosted on the same node.
- Pro: Very easy to plan capacity. Each node is sized based on the application that it will need to host; the spare is sized to be the maximum of the other nodes.
- Pro: Effect of a node failure on availability and performance of the system is very easy to determine.
- Con: Configuration is targeted towards a single point of failure.
- Con: Does not really handle multiple failures well. This may be an issue during scheduled maintenance where the spare may be in use.
Server clusters support standby servers today using a combination of the possible owners list and the preferred owners list. The preferred node should be set to the node that the application will run on by default and the possible owners for a given resource should be set to the preferred node and the spare node.
N+I
Standby server works well for 4-node clusters in some configurations; however, its ability to handle multiple failures is limited. N+I configurations are an extension of the standby server concept where there are N nodes hosting applications and I nodes spare.

Figure 3: N+I Spare node configuration
This configuration has pros and cons:
- Pro: Good for clusters that are supporting heavy-weight applications such as databases or Exchange. This configuration ensures that in the event of a failure, an application instance will fail over to a spare node, not one that is already in use.
- Pro: Very easy to plan capacity. Each node is sized based on the application that it will need to host.
- Pro: Effect of a node failure on availability and performance of the system is very easy to determine.
- Pro: Configuration works well for multiple failures.
- Con: Does not really handle multiple applications running in the same cluster well. This policy is best suited to applications running on a dedicated cluster.
Server clusters support N+I scenarios in the Windows Server 2003 release using a cluster group public property, AntiAffinityClassNames. This property can contain an arbitrary string of characters. In the event of a failover, if a group being failed over has a non-empty string in the AntiAffinityClassNames property, the failover manager will check all other nodes. If there are any nodes in the possible owners list for the resource that are NOT hosting a group with the same value in AntiAffinityClassNames, then those nodes are considered a good target for failover. If all nodes in the cluster are hosting groups that contain the same value in the AntiAffinityClassNames property, then the preferred node list is used to select a failover target.
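The selection rule described above can be modeled in Python as follows. This is a sketch of the rule as it is stated in this FAQ, not the failover manager's actual implementation: the function name, the data shapes, and the choice to take the first eligible node are all assumptions made for the example.

```python
def pick_target(anti_affinity, possible_owners, preferred_owners, hosted, failed_node):
    """Choose a failover target for a group.

    anti_affinity: the group's AntiAffinityClassNames value ("" if unset)
    hosted: mapping node -> list of anti-affinity values of groups on that node
    """
    survivors = [n for n in possible_owners if n != failed_node]
    if anti_affinity:
        # Prefer a node that hosts no group with the same anti-affinity value.
        free = [n for n in survivors if anti_affinity not in hosted.get(n, [])]
        if free:
            return free[0]
    # Otherwise fall back to the preferred owner list.
    for node in preferred_owners:
        if node in survivors:
            return node
    return survivors[0] if survivors else None


nodes = ["N1", "N2", "N3", "N4"]
hosted = {"N1": ["Exchange"], "N2": ["Exchange"], "N3": ["Exchange"], "N4": []}
print(pick_target("Exchange", nodes, ["N1", "N2"], hosted, failed_node="N1"))
```

In this 3+1 example the group from N1 lands on the spare node N4 rather than on a node that already hosts an Exchange group.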
Failover Ring
Failover rings allow each node in the cluster to run an application instance. In the event of a failure, the application on the failed node is moved to the next node in sequence.

Figure 4: Failover Ring
This configuration has pros and cons:
- Pro: Good for clusters that are supporting several small application instances where the capacity of any node is large enough to support several at the same time.
- Pro: Effect on performance of a node failure is easy to predict.
- Pro: Easy to plan capacity for a single failure.
- Con: Configuration does not work well for all cases of multiple failures. If Node 1 fails, Node 2 will host two application instances and Nodes 3 and 4 will host one application instance each. If Node 2 then fails, Node 3 will be hosting three application instances and Node 4 will be hosting one instance.
- Con: Not well suited to heavy-weight applications since multiple instances may end up being hosted on the same node even if there are lightly-loaded nodes.
Failover rings are supported by server clusters on the Windows Server 2003 release. This is done by defining the order of failover for a given group using the preferred owner list. A node order should be chosen and then the preferred node list should be set up with each group starting at a different node.
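The multiple-failure example above can be checked with a small simulation. The sketch below assumes one application instance per node at the start and moves every instance on a failed node to the next surviving node in the ring; the function name is hypothetical.

```python
def simulate_ring(ring, failures):
    """Return {node: instance count} after the listed nodes fail in order.

    Assumes every node starts with one application instance and that all
    instances on a failed node move to the next surviving node in the ring.
    """
    load = {node: 1 for node in ring}
    alive = list(ring)
    for failed in failures:
        if len(alive) < 2:
            raise RuntimeError("no surviving node left to take the load")
        successor = alive[(alive.index(failed) + 1) % len(alive)]
        load[successor] += load.pop(failed)
        alive.remove(failed)
    return load


nodes = ["Node1", "Node2", "Node3", "Node4"]
print(simulate_ring(nodes, ["Node1"]))           # {'Node2': 2, 'Node3': 1, 'Node4': 1}
print(simulate_ring(nodes, ["Node1", "Node2"]))  # {'Node3': 3, 'Node4': 1}
```

After the second failure, Node 3 carries three instances while Node 4 still carries one, which is the imbalance the first Con describes.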
Random
In large clusters or even 4-node clusters that are running several applications, defining specific failover targets or policies for each application instance can be extremely cumbersome and error prone. The best policy in some cases is to allow the target to be chosen at random, with a statistical probability that this will spread the load around the cluster in the event of a failure.
Configuration has pros and cons:
- Pro: Good for clusters that support several small application instances, where the capacity of any node is large enough to support several at the same time.
- Pro: Does not require an administrator to decide where any given application should fail over to.
- Pro: Provided that there are sufficient applications or the applications are partitioned finely enough, this provides a good mechanism to statistically load balance the applications across the cluster in the event of a failure.
- Pro: Configuration works well for multiple failures.
- Pro: Very well tuned to handling multiple applications or many instances of the same application running in the same cluster.
- Con: Can be difficult to plan capacity. There is no real guarantee that the load will be balanced across the cluster.
- Con: Effect on performance of a node failure is not easy to predict.
- Con: Not well suited to heavy-weight applications, since multiple instances may end up being hosted on the same node even if there are lightly loaded nodes.
The Windows Server 2003 release of server clusters randomizes the failover target in the event of node failure. Each resource group that has an empty preferred owners list will be failed over to a random node in the cluster in the event that the node currently hosting it fails.
Customized control
There are some cases where specific nodes may be preferred for a given application instance.
Configuration has pros and cons:
- Pro: Administrator has full control over what happens when a failure occurs.
- Pro: Capacity planning is easy, since failure scenarios are predictable.
- Con: With many applications running in a cluster, defining a good policy for failures can be extremely complex.
- Con: Very hard to plan for multiple cascaded failures.
Server clusters provide full control over the order of failover using the preferred node list feature. The full semantics of the preferred node list can be defined as:
Preferred node list contains all nodes in the cluster:
- Move group to best possible node (initiated by an administrator): Group is moved to the highest node in the preferred node list that is up and running in the cluster.
- Failover due to node or group failure: Group is moved to the next node on the preferred node list.
Preferred node list contains a subset of the nodes in the cluster:
- Move group to best possible node (initiated by an administrator): Group is moved to the highest node in the preferred node list that is up and running in the cluster. If no nodes in the preferred node list are up and running, the group is moved to a random node.
- Failover due to node or group failure: Group is moved to the next node on the preferred node list. If the node that was hosting the group is the last on the list or was not in the preferred node list, the group is moved to a random node.
Preferred node list is empty:
- Move group to best possible node (initiated by an administrator): Group is moved to a random node.
- Failover due to node or group failure: Group is moved to a random node.
Q. How many resources can be hosted in a cluster?
A. The theoretical limit for the number of resources in a cluster is 1,674; however, you should be aware that the cluster service periodically polls the resources to ensure they are alive. As the number of resources increases, the overhead of this polling also increases.
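Returning to the preferred node list semantics listed earlier for customized control, the two cases (administrator-initiated move and failover after a failure) can be sketched as below. The function names are hypothetical. Two behaviours are assumptions of the sketch rather than statements from the text: skipping list entries that are down, and falling back to a random node when the failed owner was last on a list that contains every node. At least one other node is assumed to be running.

```python
import random


def move_to_best(preferred, up_nodes, rng=random):
    """Administrator-initiated move: highest listed node that is up, else random."""
    for node in preferred:
        if node in up_nodes:
            return node
    return rng.choice(sorted(up_nodes))


def failover_target(preferred, failed_owner, up_nodes, rng=random):
    """Failover after a node or group failure: next listed node, else random."""
    others = sorted(n for n in up_nodes if n != failed_owner)
    if failed_owner in preferred:
        position = preferred.index(failed_owner)
        for node in preferred[position + 1:]:
            if node in others:
                return node
    # Owner was last on the list, was not on the list, or the list is empty.
    return rng.choice(others)
```

An empty preferred list reduces both functions to a random choice, which is the Random policy described earlier.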
-
Aug17
Windows Cluster Architecture
Author: Administrator; Filed under: Cluster, Computer & Internet; Tagged as: 2000 cluster, 2000 windows, additions, beowulf cluster, Cluster, cluster 2003, cluster applications, cluster architecture, cluster computer, cluster computers, cluster computing, cluster configuration, cluster failover, cluster installation, cluster node, cluster nodes, cluster one, cluster operations, cluster performance, cluster replication, cluster resources, cluster server, cluster server 2003, cluster service, cluster software, cluster storage, cluster sun, cluster technology, cluster training, clustered, clustered server, clustering, clustering exchange, clustering software, clustering technology, clusters, common resources, communication interfaces, computer clusters, data replication, database cluster, development cluster, disk cluster, exchange 2003 cluster, exchange cluster, exchange server cluster, external data storage, failover, grid cluster, grid computing cluster, group moves, hardware cluster, hardware devices, high availability cluster, high performance computing, individual resources, install cluster, install exchange 2003 cluster, load balancing, load balancing cluster, logical unit, microsoft cluster, microsoft cluster server, microsoft cluster service, microsoft windows nt server, network names, open cluster, physical hardware, quorum resource, resource dlls, samba cluster, server, server cluster, server cluster architecture, server clustering, server clusters, server high availability, servers, servers cluster, service cluster, setup cluster, sql cluster, sql server cluster, storage arrays, virtual server cluster, web cluster, windows 2003, windows 2003 architecture, windows 2003 clustering, windows cluster, windows cluster backup, windows cluster group, windows cluster heartbeat, windows cluster iis, windows cluster manager, windows cluster msdtc, windows cluster network, windows cluster quorum, windows cluster requirements, windows cluster resource, windows cluster 
resources, windows cluster san, windows cluster services, windows cluster software, windows cluster virtual, windows clustering, windows configuration, windows high availability, windows installation, windows installing, windows microsoft, windows server, windows server 2003, windows server 2003 architecture, windows server architecture, windows service architecture, windows setup
Microsoft Cluster Server (MSCS) in Microsoft Windows NT Server 4.0 Enterprise Edition was the first server cluster technology offered by Microsoft. Individual servers that compose a cluster are referred to as nodes. A Cluster service is a collection of components on each node that perform cluster-specific tasks. Hardware and software components in the cluster that are managed by the Cluster service are referred to as resources. Server clusters provide the instrumentation mechanism for managing resources through resource DLLs, which define resource abstractions (in other words, they abstract a clustered resource from a specific physical node, enabling the resource to move from one node to another), communication interfaces, and management operations.
Resources are elements in a cluster that are:
- Brought online (in service) and taken offline (out of service)
- Managed in a server cluster
- Owned by only one node at a time
A resource group is a collection of resources, managed by the Cluster service as a single, logical unit. This logical unit is often referred to as a failover unit, because the entire group moves as a single unit between nodes. Resources and cluster elements are grouped logically according to the resources added to a resource group. When a Cluster service operation is performed on a resource group, the operation affects all individual resources contained in the group. Typically, a resource group is created that contains the individual resources required by the clustered program.
Cluster resources may include physical hardware devices, such as disk drives and network cards, and logical items such as IP addresses, network names, and application components.
Clusters also include common resources, such as external data storage arrays and private cluster networks. Common resources are accessible by each node in the cluster. One common resource is the quorum resource, which plays a critical role in cluster operations. The quorum resource must be accessible for all node operations, including forming, joining or modifying a cluster.
Server Clusters
Windows Server 2003 Enterprise Edition provides two types of cluster technologies for use with Exchange Server 2003 Enterprise Edition. The first is Cluster services, which provide failover support for back-end mailbox servers that require a high level of availability. The second is Network Load Balancing (NLB), which complements server clusters by supporting highly available and scalable clusters of front-end Exchange protocol virtual servers (for example, HTTP, IMAP4, and POP3).
Server clusters use a shared-nothing model. Model types define how servers in a cluster manage and use local and common cluster devices and resources. In the shared-nothing cluster, each server owns and manages its local devices. Devices common to the cluster, such as common disk arrays and connection media, are selectively owned and managed by only one node at a time.
Server clusters use standard Windows drivers to connect to local storage devices and media. Server clusters support multiple connection media for the external common devices, which must be accessible by all servers in the cluster. External storage devices support standard PCI–based SCSI connections, SCSI over Fibre Channel, and SCSI bus with multiple initiators. Fibre connections are SCSI devices that are hosted on a Fibre Channel bus, instead of on a SCSI bus.
The following figure illustrates components of a two-node server cluster, which consists of servers running Windows Server 2003 Enterprise Edition, with shared storage device connections using SCSI or SCSI over Fibre Channel.
Sample two-node Windows cluster
Server Cluster Architecture
Server clusters are designed as separate, isolated sets of components, which work closely together with Windows Server 2003. Modifications to the operating system are enabled when the Cluster service is installed. These modifications include the following:
- Support for dynamic creation and deletion of network names and addresses
- Modifications to the file system, to enable closing open files during disk drive dismounts
- Modifications to the storage subsystem, to enable sharing disks and volumes among multiple nodes
Apart from these and other minor modifications, a server running the Windows Cluster service runs identically to a server that is not running the Windows Cluster service.
Cluster service is at the core of server clusters. Cluster service consists of multiple functional units, including Node Manager, Failover Manager, Database Manager, Global Update Manager, Checkpoint Manager, Log Manager, Event Log Replication Manager, and Backup/Restore Manager.
Cluster Service Components
The Cluster service runs on Windows Server 2003 Enterprise Edition, using network drivers, device drivers, and resource instrumentation processes specifically designed for server clusters and their component processes. The Cluster service includes the following components:
- Checkpoint Manager This component saves application registry keys in a cluster directory stored on the quorum resource. To make sure that the Cluster service can recover from a resource failure, Checkpoint Manager checks registry keys when a resource is brought online and writes checkpoint data to the quorum resource when a resource goes offline. Checkpoint Manager also supports resources with application-specific registry trees that are instantiated at the cluster node, where the resource comes online. A resource can have one or more registry trees associated with it. When the resource is online, Checkpoint Manager monitors changes to these registry trees. If Checkpoint Manager detects changes, it transfers the registry tree to the owner node of the resource. Checkpoint Manager then transfers the file to the owner node of the quorum resource. Checkpoint Manager performs batch transfers, so that frequent changes to registry trees do not place too heavy a load on the Cluster service.
- Database Manager Database Manager maintains cluster configuration information about all physical and logical entities in a cluster. These entities include the cluster itself, cluster node membership, resource groups, resource types, and descriptions of specific resources, such as disks and IP addresses.
Persistent and volatile information stored in the configuration database tracks the current and desired state of a cluster. Each instance of Database Manager running on each node in the cluster cooperates to maintain consistent configuration information across the cluster and to ensure consistency of the configuration database copies on all nodes.
Database Manager also provides an interface for use by other Cluster components, such as Failover Manager and Node Manager. This interface is similar to the registry interface of Microsoft Win32 APIs. However, the Database Manager interface writes changes made to cluster entities in both the registry and in the quorum resource.
Database Manager supports transactional updates of the cluster registry hive and only presents interfaces to internal Cluster service components. Failover Manager and Node Manager typically use this transactional support to get replicated transactions. The Cluster API presents all Database Manager functions to clients, with the exception of transactional support functions. For additional information on the Cluster API, see Cluster API on MSDN.
Note: The application registry key data and changes are recorded by Checkpoint Manager in quorum log files, in the quorum resource.
- Event Service Event Service serves as a switchboard, sending events to and from applications, and to the Cluster service components on each node. The Event Processor component of the Event Service helps Cluster service components to disseminate information about important events to all other components. The Event Processor component supports the Cluster API event mechanism. It also performs miscellaneous services, such as delivering signal events to cluster-aware applications and maintaining cluster objects.
- Event Log Replication Manager The Event Log Replication Manager replicates event log entries from one node to all other nodes in the cluster. By default, the Cluster service interacts with the Windows Event Log service in the cluster to replicate event log entries to all cluster nodes. When the Cluster service starts on the node, it invokes a private API in the local Event Log service and requests that the Event Log service bind to the Cluster service. The Event Log service then binds to the CLUSAPI interface by using local remote procedure calls (RPCs). When the Event Log service receives an event to be logged, it logs it locally, drops the event into a persistent batch queue, and schedules a timer thread to run within the next 20 seconds, if no timer thread is already active. When the timer thread fires, it processes the batch queue and sends the events, as one consolidated buffer, to the Cluster API interface, where the Event Log service was previously bound. The Cluster API interface then sends the events to the Cluster service.
After the Cluster service receives batched events from the Event Log service, it drops the events into a local outgoing queue and returns from the RPC. The event broadcaster thread, in the Cluster service, then processes this queue and sends the events, using the intra-cluster RPC, to all active cluster nodes. The server side API then drops the events into an incoming queue. An event log writer thread then processes this queue and requests, through a private RPC, that the local Event Log service write the events locally.
The Cluster service uses lightweight remote procedure call (LRPC) to invoke the Event Log service’s private RPC interfaces. The Event Log service also uses LRPCs to invoke the Cluster API interface and then request that the Cluster service replicate events.
- Failover Manager Failover Manager performs resource management and initiates appropriate actions, such as startup, restart, and failover. Failover Manager stops and starts resources, manages resource dependencies, and initiates failover of resource groups. To perform these actions, Failover Manager receives resource and system state information from Resource Monitors and cluster nodes.
Failover Manager also decides which nodes in the cluster should own which resource group. When resource group arbitration finishes, nodes that own an individual resource group return control of the resources in the resource group to Node Manager. If a node cannot handle a failure of one of its resource groups, Failover Managers on each node work together to reassign ownership of the resource group.
If a resource fails, Failover Manager restarts the resource or takes the resource offline together with its dependent resources. If Failover Manager takes the resource offline, it indicates that the ownership of the resource will be moved to another node. The resource is then restarted, under the ownership of the new node. This is referred to as failover, as explained in the section “Cluster Failover” later in this topic.
- Global Update Manager Global Update Manager provides the global update service that is used by cluster components. Global Update Manager is used by internal cluster components, such as Failover Manager, Node Manager, and Database Manager, to replicate changes to the cluster database across nodes. Global Update Manager updates are typically initiated as a result of a Cluster API call. When a Global Update Manager update is initiated at a client node, it first requests a locker node to obtain a global lock. If the lock is not available, the client waits for one to become available.
When the lock is available, the locker grants the lock to the client, and issues the update locally (on the locker node). The client then issues the update to all other healthy nodes, including itself. If an update succeeds on the locker, but fails on some other node, that node will be removed from the current cluster membership. If the update fails on the locker node itself, the locker merely returns the failure to the client.
- Log Manager Log Manager writes changes to recovery logs that are stored on the quorum resource. Log Manager, together with Checkpoint Manager, ensures that the recovery log on the quorum resource contains the most recent configuration data and change checkpoints. If one or more cluster nodes are down, configuration changes can still be made to the remaining nodes. While these nodes are down, Database Manager uses Log Manager to log configuration changes to the quorum resource.
When failed nodes return to service, they read the location of the quorum resource from their local cluster registry hives. Because the hive data could be stale, mechanisms are in place to detect invalid quorum resources read from a stale cluster configuration database. Database Manager then requests that Log Manager update the local copy of the cluster hive, using the checkpoint file in the quorum resource. The log file is then replayed in the quorum disk, starting from the checkpoint log sequence number. The result is a completely updated cluster hive. Cluster hive snapshots are taken whenever the quorum log is reset and once every four hours.
- Membership Manager Membership Manager monitors cluster membership and the health of all nodes in the cluster. Membership Manager (also referred to as the Regroup Engine) maintains a consistent view of which cluster nodes are currently up or down. The core of the Membership Manager component is a regroup algorithm that is invoked whenever there is evidence that one or more nodes failed. At the completion of the algorithm, all participating nodes reach identical conclusions on the new cluster membership.
- Node Manager Node Manager assigns resource group ownership to nodes, based on group preference lists and node availability. Node Manager runs on each node and maintains a local list of nodes that belong to the cluster. Periodically, Node Manager sends messages, named heartbeats, to its counterparts running on other nodes in the cluster to detect node failures. All nodes in the cluster must have exactly the same view of cluster membership.
If a cluster node detects a communication failure with another cluster node, it transmits a multicast message to the entire cluster. This regroup event causes all members to verify their view of the current cluster membership. During the regroup event, the Cluster service prevents write operations to any disk devices common to all nodes in the cluster, until the membership stabilizes. If an instance of Node Manager on an individual node does not respond, the node is removed from the cluster, and its active resource groups are moved to another active node. To make this change, Node Manager identifies possible owners (nodes) that may own individual resources and the node on which a resource group prefers to run. Node Manager then selects the node and moves the resource group. In a two-node cluster, Node Manager simply moves resource groups from a failed node to the remaining node. In a cluster comprised of three or more nodes, Node Manager selectively distributes resource groups among the remaining nodes.
Node Manager also acts as a gatekeeper, allowing joiner nodes into the cluster and processing requests to add or evict a node.
- Resource Monitor Resource Monitor verifies the health of each cluster resource by using callbacks to resource DLLs. Resource Monitors run in a separate process and communicate with the Cluster service through RPCs. This protects the Cluster service from failures of individual cluster resources.
Resource Monitors provide the communication interface between resource DLLs and the Cluster service. When the Cluster service must obtain data from a resource, Resource Monitor receives the request and forwards it to the appropriate resource DLL. Conversely, when a resource DLL must report its status or notify the Cluster service of an event, Resource Monitor forwards the information from the resource to the Cluster service.
The Resource Monitor process (RESRCMON.EXE) is a child process of the Cluster service process (CLUSSVC.EXE). Resource Monitor loads resource DLLs that monitor cluster resources in its process space. Loading the resource DLLs in a process separate from the Cluster service process helps to isolate faults. Multiple Resource Monitors can be instantiated at the same time.
Each Resource Monitor functions as an LRPC server for the Cluster service process. When the Cluster service receives a Cluster API call that requires talking to a resource DLL, it uses the LRPC interface to invoke the Resource Monitor RPC. To receive responses from Resource Monitor, the Cluster service creates one notification thread per Resource Monitor process. This notification thread invokes an RPC that is located permanently in Resource Monitor. The thread acquires notifications when they are generated. The thread is released only when Resource Monitor fails or when the thread is manually stopped by a shutdown command from the Cluster service.
Resource Monitor does not maintain a persistent state on its own. It retains a limited, in-memory state of the resources, but all of its initial state information is supplied by the Cluster service. Resource Monitor communicates with the resource DLLs through well-defined entry points that the DLLs must present. Resource Monitor completes the following operations on its own:
- It polls resource DLLs through the IsAlive and LooksAlive entry points, alternately checking failure events signaled by resource DLLs.
- To monitor pending timeouts, it spawns timer threads for resource DLLs that return ERROR_IO_PENDING from their Online or Offline entry points.
- It detects crashes of the Cluster service and shuts down the resources.
Its other actions occur as a result of operations requested by the Cluster service through the RPC interface. No hang detection is performed by the Cluster service. The Cluster service does, however, monitor crashes, and it restarts a monitor if it detects a process crash.
The Cluster service and Resource Monitor process share a memory-mapped section backed by the paging file. The handle to the section is passed to Resource Monitor at Resource Monitor startup. Resource Monitor then duplicates the handle and records the entry point number and resource name into this section immediately before calling a resource DLL entry point. If Resource Monitor crashes, the Cluster service reads the shared section to detect the resource and the entry point that caused the crash.
- Backup/Restore Manager Backup/Restore Manager works with Failover Manager and Database Manager to back up or restore the quorum log file and all checkpoint files. The Cluster service uses the BackupClusterDatabase API for database backup. First, the BackupClusterDatabase API contacts the Failover Manager layer. The Failover Manager layer forwards the request to the node that currently owns the quorum resource. That node then invokes Database Manager, which makes a backup of the quorum log file and all checkpoint files.
The Cluster service also registers itself at startup as a backup writer with Volume Shadow Copy service. When a backup client invokes the Volume Shadow Copy service to perform a system state backup, it also invokes the Cluster service, through a series of entry point calls, to perform the cluster database backup. The server code in the Cluster service invokes the Failover Manager to perform the backup, and the rest of the operation occurs via the BackupClusterDatabase API.
The Cluster service uses the RestoreClusterDatabase API to restore the cluster database from a backup path. This API can only be invoked locally from one of the cluster nodes. When the RestoreClusterDatabase API is invoked, it stops the Cluster service, restores the cluster database from the backup, sets a registry value that contains the backup path, and then re-starts the Cluster service. On startup, the Cluster service detects that a restore is requested and restores the cluster database from the backup path to the quorum resource.
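Of the components listed above, the Global Update Manager has the most protocol-like behaviour, so a small model may help. The sketch below is an illustration only. The Node and Cluster classes, the accepts_updates flag used to simulate a failed update, and the in-process lock standing in for the global lock are all assumptions.

```python
import threading


class Node:
    def __init__(self, name, accepts_updates=True):
        self.name = name
        self.accepts_updates = accepts_updates   # False simulates a failed update
        self.database = {}

    def apply(self, key, value):
        """Apply one update to the local copy; return False if it fails."""
        if not self.accepts_updates:
            return False
        self.database[key] = value
        return True


class Cluster:
    def __init__(self, nodes, locker):
        self.members = list(nodes)
        self.locker = locker
        self.global_lock = threading.Lock()      # stands in for the locker's global lock

    def global_update(self, key, value):
        """Return True if the update was committed, False if the locker failed it."""
        with self.global_lock:                   # the client waits here for the lock
            # The update is issued on the locker node first.
            if not self.locker.apply(key, value):
                return False                     # failure is simply returned to the client
            # Then it is issued to every other member.
            for node in list(self.members):
                if node is self.locker:
                    continue
                if not node.apply(key, value):
                    self.members.remove(node)    # node leaves the cluster membership
            return True
```

The asymmetry is the point of the design: a failure on the locker aborts the update before any other node sees it, while a failure elsewhere costs that node its membership so that the surviving copies stay identical.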
Cluster Failover
Failover can occur automatically because of an unplanned hardware or software failure, or it can occur as the result of manual initiation by an administrator. The algorithm and behavior in both situations are almost identical. However, in a manually initiated failover, resources are shut down in an orderly way, whereas in unplanned failovers, resources are shut down in a sudden and disruptive way (for example, the power goes out, or a crucial hardware component fails).
When an entire node in a cluster fails, its resource groups transfer to one or more available nodes in the cluster. Automatic failover is similar to planned administrative reassignment of resource ownership. However, it is more complicated, because the orderly steps of a planned shutdown might be interrupted or might not have occurred at all. Therefore, extra steps are required to evaluate the state of the cluster at the time of failure.
When your network experiences an automatic failover, it is important to determine what groups were running on the failed node and which nodes should take ownership of the various resource groups. All nodes in the cluster that are capable of hosting the resource groups negotiate for ownership. This negotiation is based on node capabilities, current load, application feedback, the node preference list, or the use of the AntiAffinityClassNames property, which is discussed in the Cluster-Specific Configurations. When negotiation of the resource group is completed, all nodes in the cluster update their databases and track which node owns the resource group.
In clusters with more than two nodes, the node preference list for each resource group can specify a preferred server, plus one or more prioritized alternatives. This enables cascading failover, in which a resource group can survive multiple server failures, each time cascading, or failing over to the next server on its node preference list.
An alternative to automatic failover is commonly called N+I failover. This method establishes the node preference lists for all cluster groups. The node preference list identifies the standby cluster nodes, to which resources are moved at the first failover. The standby nodes are servers in the cluster that are mostly idle or that have workloads that can be easily pre-empted if a failed server’s workload must be moved to the standby node.
Cascading failover assumes that every other server in the cluster has some excess capacity and can absorb a portion of any other failed server’s workload. N+I failover assumes that the +I standby servers are the primary recipients of excess capacity.
Cluster Failback
When a node comes back online, Failover Manager can decide to move one or more resource groups back to the recovered node. This is referred to as failback. A resource group must have a preferred owner defined in its properties in order to fail back to a recovered or restarted node. Resource groups for which the recovered or restarted node is the preferred owner are moved from the current owner to the recovered or restarted node.
Failback properties of a resource group can include the hours of the day during which failback is allowed and a limit on the number of times failback is attempted. This enables the Cluster service to prevent failback of resource groups during peak processing times or to nodes that have not been correctly recovered or restarted.
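The failback decision described above can be sketched as a single predicate. The FailbackPolicy fields and their defaults are hypothetical names chosen for this example, not the actual cluster group property names. Treating the window as inclusive and allowing it to wrap past midnight are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailbackPolicy:
    preferred_owner: Optional[str]   # None means no preferred owner is defined
    window_start: int = 0            # first hour (0-23) in which failback is allowed
    window_end: int = 23             # last hour (0-23) in which failback is allowed
    max_attempts: int = 3            # hypothetical cap on failback attempts


def in_window(hour, start, end):
    """True if hour falls inside the window; the window may wrap past midnight."""
    if start <= end:
        return start <= hour <= end
    return hour >= start or hour <= end


def should_fail_back(policy, recovered_node, current_owner, hour, attempts_so_far):
    """Decide whether a group moves back to a node that has just recovered."""
    if policy.preferred_owner is None:
        return False                 # no preferred owner, so nothing to fail back to
    if policy.preferred_owner != recovered_node:
        return False                 # the recovered node is not the preferred owner
    if current_owner == recovered_node:
        return False                 # the group is already on its preferred owner
    if attempts_so_far >= policy.max_attempts:
        return False                 # stop retrying a node that keeps failing
    return in_window(hour, policy.window_start, policy.window_end)
```

A policy such as FailbackPolicy("node1", 22, 4) would hold a group on its temporary owner through the working day and move it home only between 22:00 and 04:59.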
Cluster Quorum
Each cluster has a special resource referred to as the quorum resource. A quorum resource can be any resource that does the following:
- Provides a means for arbitration leading to membership and cluster state decisions
- Provides physical storage to hold configuration information
A quorum log is a configuration database for the entire server cluster. The quorum log contains cluster configuration information, such as the servers that are part of the cluster, the resources that are installed in the cluster, and the state of those resources (for example, online or offline).
The quorum is important in a cluster for the following two reasons:
- Consistency A cluster is made up of multiple physical servers acting as a single virtual server. It is critical that each of the physical servers has a consistent view of the cluster configuration. The quorum acts as the definitive repository for all configuration information relating to the cluster. If the Cluster service is unable to access and read the quorum, it cannot start.
- Tie-breaking The quorum is used as a tie-breaker to avoid split-cluster scenarios. A split-cluster scenario occurs when all network communication links between two or more cluster nodes fail. If this occurs, the cluster might be split into two or more partitions that cannot communicate with each other. The quorum ensures that cluster resources are brought online on one node only. It does this by allowing the partition that owns the quorum to continue, while the other partitions are evicted from the cluster.
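The tie-breaking rule can be illustrated with a very small model. The function below assumes arbitration has already finished and that exactly one node holds the quorum resource; how that node wins ownership is outside the sketch. The name resolve_split and the node labels are hypothetical.

```python
def resolve_split(partitions, quorum_owner):
    """Return (surviving nodes, evicted nodes) for a split cluster.

    partitions is a list of sets of node names that can no longer reach
    each other; quorum_owner is the node holding the quorum resource.
    """
    owning = [p for p in partitions if quorum_owner in p]
    if len(owning) != 1:
        raise ValueError("exactly one partition must own the quorum")
    survivor = owning[0]
    evicted = sorted(n for p in partitions if p is not survivor for n in p)
    return sorted(survivor), evicted


survivors, evicted = resolve_split([{"node1", "node2"}, {"node3", "node4"}], "node3")
print(survivors, evicted)
```

Because only the partition that contains the quorum owner continues, resources are never brought online on both sides of the split.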
Standard Quorum
As mentioned earlier in this section, the quorum is a configuration database for the Cluster service that is stored in the quorum log file. A standard quorum uses a quorum log file, located on a disk hosted in the shared storage array, which is accessible by all members of the cluster.
Each member connects to the shared storage using SCSI or Fibre Channel. Storage is made up of external hard disks (usually configured as RAID disks) or a SAN, in which logical slices of the SAN are presented as physical disks.
Note: It is important that the quorum uses a physical disk resource, rather than a disk partition, because the entire physical disk resource is moved during failover. Furthermore, it is possible to configure server clusters to use the local hard disk on a server to store the quorum. This type of implementation, referred to as a lone wolf cluster, is supported only for testing and development purposes. Lone wolf clusters should not be used to cluster Exchange 2003 in a production environment because, being singular, they are incapable of providing failover.
Majority Node Set Quorums
From a server cluster perspective, a majority node set (MNS) quorum is a single quorum resource. The data is stored by default on the local disk of each node in the cluster. The MNS resource makes sure that the cluster configuration data, stored on the MNS resource, is consistent across different disks. The MNS implementation provided in Windows Server 2003 uses a directory on each node’s local disk to store the quorum data. If the configuration of the cluster changes, that change is reflected across each node’s local disk. The change is considered committed, or made persistent, only if the change is made to (Number of nodes/2) + 1 nodes.
The MNS quorum makes sure that most nodes have an up-to-date copy of the data. The Cluster service starts up and brings resources online only if a majority of the nodes that are configured as part of the cluster are up and are running the Cluster service. If the MNS quorum determines that a majority does not exist, the cluster is considered not to have quorum, and the Cluster service waits in a restart loop until more nodes try to join. When a majority or quorum of nodes is available, the Cluster service starts and brings the resources online. Because the up-to-date configuration is written to a majority of the nodes, regardless of node failures, the cluster always guarantees that it has the most current configuration at startup.
If a cluster failure occurs, or if the cluster somehow enters a split-cluster scenario, all partitions that do not contain a majority of nodes are taken offline. This ensures that if there is a partition running that contains a majority of the nodes, it can safely start any resources that are not running on that partition, because it is the only partition in the cluster that is running resources.
Because shared disk quorum clusters and MNS quorum clusters behave differently, consider carefully which model to use. For example, if you have only two nodes in your cluster, the MNS model is not recommended. In this instance, failure of one node leads to failure of the entire cluster, because the single surviving node cannot form a majority.
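The majority rule can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the division in (number of nodes / 2) + 1 is rounded down; the function names are hypothetical and are not part of any cluster API.

```python
def majority(node_count: int) -> int:
    """Smallest number of nodes that forms a majority: floor(n / 2) + 1."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    return node_count // 2 + 1


def has_quorum(configured_nodes: int, running_nodes: int) -> bool:
    """True when the running nodes are a majority of the configured nodes."""
    return running_nodes >= majority(configured_nodes)


def tolerated_failures(configured_nodes: int) -> int:
    """Number of nodes that can fail while a majority still remains."""
    return configured_nodes - majority(configured_nodes)


if __name__ == "__main__":
    # Print the majority size and tolerated failures for small clusters.
    for nodes in range(2, 6):
        print(nodes, majority(nodes), tolerated_failures(nodes))
```

Under this assumption a two-node MNS cluster needs both nodes to keep quorum, so it tolerates no failures, which matches the recommendation above. A three-node cluster tolerates one failure.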
Majority node set (MNS) quorums are available in Windows Server 2003 Enterprise Edition and Windows Server 2003 Datacenter Edition clusters. The only benefit that MNS clusters provide for Exchange clusters is to eliminate the need for a dedicated disk in the shared storage array on which to store the quorum resource.
Cluster Resources
The Cluster service manages all resource objects using Resource Monitors and resource DLLs. The Resource Monitor interface provides a standard communication interface that enables the Cluster service to initiate resource management commands and obtain resource status data. The Resource Monitor obtains actual command functions and data through resource DLLs. The Cluster service uses resource DLLs to bring resources online, manage their interaction with other resources in the cluster, and monitor their health.
To enable resource management, a resource DLL uses a few simple resource interfaces and properties. Resource Monitor loads a particular resource DLL in its address space, as privileged code running under the SYSTEM account. The SYSTEM account (that is, LocalSystem) is a security principal account that represents the operating system. The Cluster service, which runs under a user security context, uses the SYSTEM account to perform security functions within the operating system.
When resources depend on the availability of other resources to function, these dependencies can be defined by the resource DLL. When a resource is dependent on other resources, the Cluster service brings the dependent resource online only after it brings the resources on which it depends online in the correct sequence.
Resources are taken offline in a similar manner. The Cluster service takes resources offline only after any dependent resources are taken offline. This prevents introducing circular dependencies when loading resources.
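The ordering rule for bringing resources online and taking them offline amounts to a topological sort of the dependency graph. The sketch below illustrates the idea with Python's standard `graphlib` module (Python 3.9 or later). The resource names and the dependency map are illustrative assumptions, not a dump of a real cluster configuration, and the helper functions are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each resource lists the resources it depends on.
DEPENDS_ON = {
    "IP Address": [],
    "Physical Disk": [],
    "Network Name": ["IP Address"],
    "System Attendant": ["Network Name", "Physical Disk"],
    "Information Store": ["System Attendant"],
}


def online_order(depends_on):
    """Order in which to bring resources online: dependencies come first.

    TopologicalSorter raises graphlib.CycleError for circular dependencies.
    """
    return list(TopologicalSorter(depends_on).static_order())


def offline_order(depends_on):
    """Order in which to take resources offline: dependents go first."""
    return list(reversed(online_order(depends_on)))


if __name__ == "__main__":
    print("online: ", online_order(DEPENDS_ON))
    print("offline:", offline_order(DEPENDS_ON))
```

Reversing the online order is enough for the offline case, because every resource then appears before the resources it depends on.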
Each resource DLL can also define the type of computer and device connection required by the resource. For example, a disk resource may require ownership only by a node that is physically connected to the disk device. Local restart policies and desired actions during failover events can also be defined in the resource DLL.
Cluster Administration
Clusters are managed using Cluster Administrator, a graphical administration tool, or the Cluster.exe command-line tool. Both can be used for maintenance, monitoring, and failover administration. Server clusters also provide an automation interface. This interface can be used to create custom scripting tools for administering cluster resources, nodes, and the cluster itself. Applications and administration tools, such as Cluster Administrator, can access this interface using RPCs, whether the tool is running on a node in the cluster or on an external computer.
Cluster Formation and Operation
When the Cluster service is installed and running on a server, the server can participate in a cluster. Cluster operations reduce single points of failure and enable high availability of clustered resources. The following sections briefly describe node behavior during cluster creation and operation.
Creating a Cluster
Server clusters include a cluster installation utility that is used to install the cluster software on a server and create a new cluster. To create a new cluster, the utility is run on the computer selected as the first member of the cluster. This first step defines the new cluster by establishing a cluster name, and creating the cluster database and initial cluster membership list.
The next step in creating a cluster is to add the common data storage devices that will be available to all members of the cluster. This establishes the new cluster with a single node and its own local data storage devices and cluster common resources (generally disk or data storage and connection media resources).
The final step in creating a cluster is to run the installation utility on each additional computer that will be a member of the cluster. As each new node is added to the cluster, it automatically receives a copy of the existing cluster database from the original member of the cluster.
When a node joins or forms a cluster, the Cluster service updates the node’s private copy of the configuration database.
Forming a Cluster
A server can form a cluster if it is running the Cluster service and cannot locate other nodes in the cluster. To form the cluster, a node must be able to acquire exclusive ownership of the quorum resource.
When a cluster is formed, the first node in the cluster contains the cluster configuration database. As each additional node joins the cluster, it receives and maintains its own local copy of the cluster configuration database. The quorum resource stores the most current version of the configuration database as recovery logs. The logs contain node-independent cluster configuration and state data.
During cluster operations, the Cluster service uses the quorum recovery logs to do the following:
- Guarantee that only one set of active nodes is allowed to form a cluster
- Enable a node to form a cluster only if it can gain control of the quorum resource
- Allow a node to join or remain in an existing cluster only if it can communicate with the node that controls the quorum resource
When a cluster is formed, each node in the cluster can be in one of three distinct states. These states are recorded by Event Processor (described below) and replicated by Event Log Manager to other nodes in the cluster. The three Cluster service states are as follows:
- Offline: The node is not an active member of the cluster. The node and its Cluster service might or might not be running.
- Online: The node is an active member of the cluster. It adheres to cluster database updates, contributes input into the quorum algorithm, maintains cluster network and storage heartbeats, and can own and run resource groups.
- Paused: The node is an active member of the cluster. The node adheres to cluster database updates, contributes input into the quorum algorithm, and maintains network and storage heartbeats, but it cannot accept resource groups. It can support only those resource groups that it currently owns. The paused state enables maintenance to be performed. Online and paused states are treated as equivalent states by the majority of the server cluster components.
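The three states differ mainly in what a node is allowed to do with resource groups. A minimal sketch of that distinction follows; the enum and the helper functions are hypothetical names chosen for illustration and do not correspond to an actual cluster API.

```python
from enum import Enum


class NodeState(Enum):
    OFFLINE = "offline"
    ONLINE = "online"
    PAUSED = "paused"


def is_active_member(state: NodeState) -> bool:
    """Online and paused nodes are both active members of the cluster."""
    return state in (NodeState.ONLINE, NodeState.PAUSED)


def can_accept_resource_groups(state: NodeState) -> bool:
    """Only an online node can take ownership of additional resource groups."""
    return state is NodeState.ONLINE


def can_run_owned_groups(state: NodeState) -> bool:
    """A paused node keeps running the groups it already owns."""
    return is_active_member(state)


if __name__ == "__main__":
    for state in NodeState:
        print(state.name, is_active_member(state),
              can_accept_resource_groups(state), can_run_owned_groups(state))
```

The only difference between online and paused in this model is the answer to `can_accept_resource_groups`, which is why most cluster components can treat the two states as equivalent.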
Joining a Cluster
To join an existing cluster, a server must be running the Cluster service, and it must successfully locate another node in the cluster. After finding another node in the cluster, the joining server must be authenticated for membership in the cluster and must receive a replicated copy of the cluster configuration database.
The process of joining an existing cluster begins when Windows Service Control Manager starts the Cluster service on the node. During the startup process, the Cluster service configures and mounts the node's local data devices. It does not attempt to bring the common cluster data devices online as nodes, because the existing cluster might be using the devices.
To locate other nodes, a discovery process is started. When the node discovers any member of the cluster, it performs an authentication sequence. The first cluster member authenticates the new node and returns a status of success if the new node is successfully authenticated. If authentication is not successful, as when a joining node is not recognized as a cluster member or has an invalid account password, the request to join the cluster is denied.
After successful authentication, the first node online in the cluster checks the copy of the configuration database of the joining node. If it is out-of-date, the cluster node sends the joining server an updated copy of the database. After receiving the replicated database, the node joining the cluster can use it to find shared resources and bring them online as needed.
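The join sequence described above (authenticate, compare configuration databases, replicate if the joiner is out of date) can be sketched as follows. This is a simplified model, not the Cluster service's actual protocol: the class names, the integer version number used to decide whether a database is out of date, and the plain password comparison are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigDatabase:
    version: int  # hypothetical counter; higher means more recent
    settings: dict = field(default_factory=dict)


@dataclass
class Node:
    name: str
    password: str
    database: ConfigDatabase


class Sponsor:
    """Existing cluster member that handles a join request."""

    def __init__(self, database: ConfigDatabase, members: dict):
        self.database = database
        self.members = members  # node name -> expected account password

    def authenticate(self, node: Node) -> bool:
        # Denied if the node is not a known member or the password is wrong.
        return self.members.get(node.name) == node.password

    def admit(self, node: Node) -> bool:
        if not self.authenticate(node):
            return False
        if node.database.version < self.database.version:
            # The joiner's copy is out of date: send it a fresh copy.
            node.database = ConfigDatabase(
                self.database.version, dict(self.database.settings)
            )
        return True


if __name__ == "__main__":
    sponsor = Sponsor(ConfigDatabase(7, {"cluster_name": "EXCLUSTER1"}),
                      {"NODE2": "secret"})
    joiner = Node("NODE2", "secret", ConfigDatabase(3))
    print(sponsor.admit(joiner), joiner.database.version)
```

A rejected node keeps whatever database it had, and an admitted node ends up with a copy at least as recent as the sponsor's, which it can then use to find shared resources.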
Leaving a Cluster
A node can leave a cluster when it shuts down or when the Cluster service is stopped. However, a node can also be evicted from a cluster when the node fails to perform cluster operations (such as failure to commit an update to the cluster configuration database).
When a node leaves a cluster, as in a planned shutdown, it sends a ClusterExit message to all other members of the cluster, notifying them that it is leaving. The node does not wait for any responses and immediately proceeds to shut down resources and close all cluster connections. Because the remaining nodes receive this exit message, they do not perform the regroup process to reestablish cluster membership that occurs when a node unexpectedly fails or network communications stop.
Failure Detection
Failure detection and prevention are key benefits provided by server clusters. When a node or application in a cluster fails, server clusters can respond by restarting the failed application or distributing the work from the failed system to remaining nodes in the cluster. Server cluster failure detection and prevention include bi-directional failover, application failover, parallel recovery, and automatic failback.
When the Cluster service detects failures of individual resources or an entire node, it dynamically moves and restarts application, data, and file resources on an available, healthy server in the cluster. This allows resources such as databases, file shares, and applications to remain highly available to users and to client applications.
Server clusters are designed with two different failure detection mechanisms:
- Heartbeats for detecting node failures: Periodically, each node exchanges User Datagram Protocol (UDP)-based messages with other nodes in the cluster over the private cluster network. These messages are referred to as the heartbeat. The heartbeat exchange enables each node to check the availability of other nodes and their resources. If a server fails to respond to a heartbeat exchange, the surviving servers initiate failover processes, including ownership arbitration for resources and applications owned by the failed server.
Arbitration is performed using a challenge and defense protocol. The node that appears to have failed is given a time window to demonstrate, in any one of several ways, that it is still running correctly and can communicate with the surviving nodes. If the node is unable to respond, it is removed from the cluster. Failure to respond to a heartbeat message can be caused by several events, such as computer failure, network interface failure, network failure, or even periods of unusually high activity.
Typically, when all nodes are communicating, the Configuration Database Manager sends global configuration database updates to each node. When a heartbeat exchange failure occurs, Log Manager saves configuration database changes to the quorum resource. This ensures that remaining nodes can access the most recent cluster configuration and local node registry data during the recovery processes.
The failure detection algorithm is very conservative. If the cause of the heartbeat response failure is temporary, it is best to avoid the potential disruption a failover might cause. However, there is no way to know whether the node will respond in another millisecond, or if it suffered a catastrophic failure. Therefore, a failover is initiated after a timeout period.
- Resource Monitor and resource DLLs for detecting resource failures: Failover Manager and Resource Monitor work together to detect and recover from resource failures. Resource Monitors keep track of resource status by using the resource DLLs to periodically poll resources. Polling involves two steps, a cursory LooksAlive query and a longer, more definitive, IsAlive query. When Resource Monitor detects a resource failure, it notifies Failover Manager and continues to monitor the resource.
Failover Manager maintains resources and resource group status. It also performs recovery when a resource fails and invokes Resource Monitors in response to user actions or failures.
After a resource failure is detected, Failover Manager performs recovery actions that include restarting a resource and its dependent resources, or moving the entire resource group to another node. The recovery action that is taken is determined by resource and resource group properties, in addition to node availability.
During failover, the resource group is treated as the unit of failover. This ensures that resource dependencies are correctly recovered. When a resource recovers from a failure, Resource Monitor notifies Failover Manager. Failover Manager then performs automatic failback of the resource group, based on the configuration of the resource group failback properties.
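The two-step polling and the choice of recovery action can be sketched roughly as follows. This is an illustrative model only. It assumes that the definitive IsAlive check is consulted when the cursory LooksAlive check fails, and that recovery is governed by a restart limit. The function names, the `restart_limit` parameter, and the returned action strings are hypothetical and do not reproduce actual resource DLL entry points or cluster properties.

```python
from typing import Callable


def poll_resource(looks_alive: Callable[[], bool],
                  is_alive: Callable[[], bool]) -> bool:
    """Return True if the resource is considered healthy.

    The cheap check runs first; the thorough check runs only when the
    cheap check reports a problem, to confirm the failure.
    """
    if looks_alive():
        return True
    return is_alive()


def choose_recovery(restarts_so_far: int, restart_limit: int,
                    other_node_available: bool) -> str:
    """Pick a recovery action for a resource that failed its poll."""
    if restarts_so_far < restart_limit:
        return "restart"   # restart the resource and its dependents locally
    if other_node_available:
        return "failover"  # move the entire resource group to another node
    return "failed"        # nothing left to try; leave the resource failed


if __name__ == "__main__":
    healthy = poll_resource(lambda: False, lambda: False)
    if not healthy:
        print(choose_recovery(restarts_so_far=3, restart_limit=3,
                              other_node_available=True))
```

In this model a resource fails over only after local restarts are exhausted, and the whole resource group moves together so that dependencies are recovered as a unit.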



