Managing High Availability and Site Resilience

 

Applies to: Exchange Server 2010 SP3, Exchange Server 2010 SP2

After you build, validate, and deploy a Microsoft Exchange Server 2010 high availability or site resilience solution, the solution transitions from the deployment phase to the operational phase of the overall solution lifecycle. The operational phase consists of several tasks, each of which relates to one of the following areas: managing database availability groups (DAGs), managing mailbox database copies, performing proactive monitoring, and managing switchovers and failovers.

Managing an Exchange 2010 high availability or site resilience solution differs from managing one in previous versions of Exchange. Several architectural and design changes in Exchange 2010 eliminate tasks that were required in previous versions and give you greater granularity and control over the solution. For example:

  • Exchange 2010 doesn't use the concept of a clustered mailbox server (referred to as an Exchange Virtual Server in Exchange Server 2003 and earlier). As a result, Exchange is no longer a clustered application, and Exchange server identities no longer move between clustered servers.

  • Exchange 2010 doesn't use the concept of storage groups. As a result, databases are uncoupled from servers and are now managed globally, databases no longer share log streams, and continuous replication (including switchovers and failovers) operates at the database level.

  • Exchange 2010 doesn't use the concepts of public and private networks. These concepts are replaced with the concepts of MAPI networks and replication networks. Each DAG should contain one MAPI network and one or more replication networks.

Contents

Database Availability Group Management

Mailbox Database Copy Management

Proactive Monitoring

Switchovers and Failovers

Database Availability Group Management

The operational management tasks associated with DAGs include:

  • Creating one or more DAGs   Creating a DAG is typically a one-time procedure performed during the deployment phase of the solution lifecycle. However, there may be reasons for creating DAGs that occur during the operational phase. For example:

    • The DAG is configured for third-party replication mode, and you want to revert to using continuous replication. You can't convert a DAG back to continuous replication; you need to create a new DAG.

    • You have servers in multiple domains. All members of the same DAG must also be members of the same domain.

  • Managing DAG membership   Managing DAG members is an infrequent task typically performed during the deployment phase of the solution lifecycle. However, because of the flexibility provided by incremental deployment, managing DAG membership may also be performed throughout the solution lifecycle.

  • Configuring DAG properties   Each DAG has various properties that can be configured as needed. These properties include:

    • Witness server and witness directory   The witness server is a server outside the DAG that acts as a quorum voter when the DAG contains an even number of members. The witness directory is a directory created and shared on the witness server for use by the system in maintaining a quorum.

    • IP addresses   Each DAG will have one or more IPv4 addresses, and optionally, one or more IPv6 addresses. The IP addresses assigned to the DAG are used by the DAG's underlying cluster. The number of IPv4 addresses assigned to the DAG equals the number of subnets that comprise the MAPI network used by the DAG. You can configure the DAG to use static IP addresses or to obtain addresses automatically by using Dynamic Host Configuration Protocol (DHCP).

    • Datacenter Activation Coordination (DAC) mode   DAC mode is a property setting on a DAG that's designed for DAGs with three or more members that have been deployed to multiple sites. DAC mode is used to handle conditions that would otherwise lead to split-brain syndrome within the DAG, such as a site failure. For more information about DAC mode, see Understanding Datacenter Activation Coordination Mode.

    • Alternate witness server and alternate witness directory   The alternate witness server and alternate witness directory are values that you can preconfigure as part of the planning process for DAGs configured for site resilience.

    • Replication port   By default, all DAGs use TCP port 64327 for continuous replication. You can modify the DAG to use a different TCP port for replication by using the ReplicationPort parameter of the Set-DatabaseAvailabilityGroup cmdlet.

    • Network discovery   You can force the DAG to rediscover networks and network interfaces. This operation is used when you add or remove networks or change DAG network subnets. Rediscovery of all DAG networks can be forced by using the DiscoverNetworks parameter of the Set-DatabaseAvailabilityGroup cmdlet.

    • Network compression   By default, DAGs use compression only between DAG networks on different subnets. You can enable compression for all DAG networks or for seeding operations only, or you can disable compression for all DAG networks.

    • Network encryption   By default, DAGs use encryption only between DAG networks on different subnets. You can enable encryption for all DAG networks or for seeding operations only, or you can disable encryption for all DAG networks.

  • Managing DAG networks   Although using a single network interface card (NIC) is supported, we recommend that each DAG member have at least two NICs. One NIC is used for the MAPI network, and one NIC is used for the replication network. Additional NICs can be added to create additional replication networks, to serve as dedicated backup networks, or to carry Internet SCSI (iSCSI) storage traffic. DAG network management involves designating a network as a MAPI network or as a replication network, and configuring network subnets.

  • Shutting down DAG members   The Exchange 2010 high availability solution is integrated with the Windows shutdown process. If an administrator or application initiates a shutdown of a Windows server in a DAG that hosts a mounted database replicated to one or more other DAG members, the system tries to activate another copy of each mounted database before allowing the shutdown process to complete. However, this behavior doesn't guarantee a lossless activation for every database on the server being shut down. As a result, it's a best practice to perform a server switchover before shutting down a server that's a member of a DAG.
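
As a minimal sketch of the creation and membership tasks listed above, the following Exchange Management Shell commands create a DAG and add two members. The names DAG1, FS1, MBX1, and MBX2, the witness directory path, and the IP address are placeholders assumed for illustration; substitute values from your own environment.

```powershell
# Hypothetical names: DAG1 (DAG), FS1 (witness server), MBX1 and MBX2 (Mailbox servers).
# Create the DAG with an explicit witness server, witness directory, and one static IPv4 address.
New-DatabaseAvailabilityGroup -Name DAG1 -WitnessServer FS1 -WitnessDirectory C:\DAG1 -DatabaseAvailabilityGroupIpAddresses 10.0.0.8

# Add Mailbox servers to the DAG one at a time (incremental deployment).
Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer MBX1
Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer MBX2

# Review the resulting membership and witness configuration.
Get-DatabaseAvailabilityGroup -Identity DAG1 -Status | Format-List Name, Servers, WitnessServer, WitnessDirectory
```

To remove a member later, Remove-DatabaseAvailabilityGroupServer takes the same Identity and MailboxServer parameters. All mailbox database copies must be removed from the server first.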

For detailed steps to create a DAG, see Create a Database Availability Group. For detailed steps to configure DAGs and DAG properties, see Configure Database Availability Group Properties. For more information about each of the preceding management tasks, and about managing DAGs in general, see Managing Database Availability Groups.
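
The DAG properties described above map to parameters of the Set-DatabaseAvailabilityGroup cmdlet. The following commands are a sketch only: DAG1, FS2, the directory path, the port number, and the DAG network name DAGNetwork02 are assumed for illustration and will differ in your organization.

```powershell
# Hypothetical names: DAG1 (DAG), FS2 (alternate witness server), DAGNetwork02 (a DAG network).
# Use a non-default TCP port for continuous replication (the default is 64327).
Set-DatabaseAvailabilityGroup -Identity DAG1 -ReplicationPort 63132

# Enable compression and encryption on all DAG networks
# (other accepted values: Disabled, InterSubnetOnly, SeedOnly).
Set-DatabaseAvailabilityGroup -Identity DAG1 -NetworkCompression Enabled -NetworkEncryption Enabled

# Preconfigure the alternate witness for site resilience and enable
# activation coordination mode for the DAG.
Set-DatabaseAvailabilityGroup -Identity DAG1 -AlternateWitnessServer FS2 -AlternateWitnessDirectory C:\DAG1 -DatacenterActivationMode DagOnly

# Force rediscovery of networks and interfaces after adding, removing, or re-subnetting networks.
Set-DatabaseAvailabilityGroup -Identity DAG1 -DiscoverNetworks

# List the DAG networks, then disable replication on one of them.
Get-DatabaseAvailabilityGroupNetwork -Identity DAG1
Set-DatabaseAvailabilityGroupNetwork -Identity DAG1\DAGNetwork02 -ReplicationEnabled:$false
```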

Mailbox Database Copy Management

The operational management tasks associated with mailbox database copies include:

  • Adding mailbox database copies   When you add a copy of a mailbox database, continuous replication is automatically enabled between the existing database and the database copy.

  • Configuring mailbox database copy properties   You can configure a variety of properties, such as the database activation policy, the amount of time, if any, for replay lag and truncation lag, and the activation preference for the database copy.

  • Suspending or resuming a mailbox database copy   You can suspend a mailbox database copy in preparation for seeding, or for other forms of maintenance. You can also suspend a mailbox database copy for activation only. This configuration prevents the system from automatically activating the copy as a result of a failure, but it still allows the system to keep the database copy up to date with log shipping and replay.

  • Updating a mailbox database copy   Updating, also known as seeding, is the process in which a copy of a mailbox database is added to another Mailbox server. This becomes the baseline database for the copy. After the initial seed of the baseline database copy, the database will need to be seeded again only in rare circumstances.

  • Activating a mailbox database copy   Activating is the process of designating a specific passive copy as the new active copy of a mailbox database. This process is referred to as a switchover. For more information, see "Switchovers and Failovers" later in this topic.

  • Removing a mailbox database copy   You can remove a mailbox database copy at any time, and in some situations removal is required. For example, you can't remove a Mailbox server from a DAG until all mailbox database copies are removed from the server. In addition, you must remove all copies of a mailbox database before you can change the path for a mailbox database.

For detailed steps to add a mailbox database copy, see Add a Mailbox Database Copy. For detailed steps to configure mailbox database copies, see Configure Mailbox Database Copy Properties. For detailed steps to remove a mailbox database copy, see Remove a Mailbox Database Copy. For more information about each of the preceding management tasks, and about managing mailbox database copies in general, see Managing Mailbox Database Copies.
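
The following Exchange Management Shell sequence sketches the lifecycle of a single database copy, from adding it through removing it. DB1 and MBX2 are hypothetical names, and the lag times and activation preference are arbitrary example values rather than recommendations.

```powershell
# Hypothetical names: DB1 (mailbox database), MBX2 (Mailbox server that hosts the passive copy).
# Add a copy of DB1 on MBX2 with an activation preference and lag settings.
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer MBX2 -ActivationPreference 2 -ReplayLagTime 00:10:00 -TruncationLagTime 00:15:00

# Change a property of the existing copy (here, a replay lag of one day).
Set-MailboxDatabaseCopy -Identity DB1\MBX2 -ReplayLagTime 1.00:00:00

# Suspend for activation only: log shipping and replay continue,
# but the system won't automatically activate this copy.
Suspend-MailboxDatabaseCopy -Identity DB1\MBX2 -ActivationOnly -Confirm:$false
Resume-MailboxDatabaseCopy -Identity DB1\MBX2

# Reseed: suspend the copy, then replace its files from the active copy.
# Replication resumes automatically after seeding unless ManualResume is specified.
Suspend-MailboxDatabaseCopy -Identity DB1\MBX2 -SuspendComment "Reseeding" -Confirm:$false
Update-MailboxDatabaseCopy -Identity DB1\MBX2 -DeleteExistingFiles

# Remove the copy, for example before removing MBX2 from the DAG.
Remove-MailboxDatabaseCopy -Identity DB1\MBX2 -Confirm:$false
```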

Proactive Monitoring

Ensuring that your servers are operating reliably and that your database copies are healthy is a key objective for daily messaging operations. Exchange 2010 includes a number of features that can be used to perform a variety of health monitoring tasks for DAGs and mailbox database copies.

In addition to monitoring the health and status, it is also critical to monitor for situations that can compromise availability. For example, we recommend that you monitor the redundancy of your replicated databases. It is critical to avoid situations where you are down to a single copy of a database. This scenario should be treated with the highest priority and resolved as soon as possible.
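
As an illustration, the Get-MailboxDatabaseCopyStatus and Test-ReplicationHealth cmdlets can be combined into a basic health and redundancy check. This is a sketch under stated assumptions, not a complete monitoring solution: MBX1 is a hypothetical server name, and the rule that a copy counts as usable only when its status is Mounted or Healthy is a simplification chosen for this example.

```powershell
# Hypothetical name: MBX1 (a DAG member).
# Status and queue lengths for every database copy hosted on MBX1.
Get-MailboxDatabaseCopyStatus -Server MBX1 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength, ContentIndexState -AutoSize

# Replication health checks for MBX1.
Test-ReplicationHealth -Identity MBX1

# Flag any database that is down to a single usable copy.
Get-MailboxDatabase | ForEach-Object {
    $db = $_.Name
    # Count copies that are either the mounted active copy or a healthy passive copy.
    $usable = @(Get-MailboxDatabaseCopyStatus -Identity $db |
        Where-Object { $_.Status -eq 'Mounted' -or $_.Status -eq 'Healthy' })
    if ($usable.Count -le 1) {
        Write-Warning "$db has only $($usable.Count) usable copy or copies"
    }
}
```

Note that this loop also flags databases that were intentionally deployed with a single copy, so you may want to filter those out before alerting on the result.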

For more detailed information about monitoring the health and status of DAGs and mailbox database copies, see Monitoring High Availability and Site Resilience.

Switchovers and Failovers

A switchover is a manual process in which an administrator activates one or more mailbox database copies. Switchovers, which can occur at the database or server level, are typically performed in preparation for maintenance activities. Switchover management involves performing database or server switchovers as needed. For example, if you need to perform maintenance on a Mailbox server in a DAG, you would first perform a server switchover so that the server doesn't host any active mailbox database copies. For detailed steps to perform a database switchover, see Move the Active Mailbox Database. For detailed steps to perform a server switchover, see Perform a Server Switchover. Switchovers can also be performed at the datacenter level. For more information about datacenter switchovers, see Datacenter Switchovers.
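
Database and server switchovers can both be sketched with the Move-ActiveMailboxDatabase cmdlet. In the following commands, DB1, MBX1, and MBX2 are hypothetical names, and blocking automatic activation during maintenance is shown as an optional precaution rather than a required step.

```powershell
# Hypothetical names: DB1 (mailbox database), MBX1 and MBX2 (DAG members).
# Database switchover: activate the copy of DB1 that is hosted on MBX2.
Move-ActiveMailboxDatabase DB1 -ActivateOnServer MBX2 -MountDialOverride:None

# Server switchover before maintenance: move every active database off MBX1
# and let the system choose the best available copy for each one.
Move-ActiveMailboxDatabase -Server MBX1

# Optionally prevent the system from activating copies on MBX1 during maintenance.
Set-MailboxServer -Identity MBX1 -DatabaseCopyAutoActivationPolicy Blocked

# After maintenance, allow automatic activation on MBX1 again.
Set-MailboxServer -Identity MBX1 -DatabaseCopyAutoActivationPolicy Unrestricted
```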

A failover is the automatic activation by the system of one or more database copies in reaction to a failure. For example, the loss of a disk drive will trigger a database failover. The loss of the MAPI network or a power failure will trigger a server failover.

For more information about switchovers and failovers, see Switchovers and Failovers.

 © 2010 Microsoft Corporation. All rights reserved.