Managing Mailbox Database Copies
Applies to: Exchange Server 2010
Database mobility is a new architecture in Microsoft Exchange Server 2010 that removes the concept of storage groups and uncouples an Exchange 2010 mailbox database from a Mailbox server. Because storage groups have been removed from Exchange 2010, continuous replication now operates at the database level. In Exchange 2010, transaction logs are replicated to one or more Mailbox servers, and replayed into one or more copies of a mailbox database stored on those servers. Several concepts used in Exchange Server 2007 continuous replication remain in Exchange 2010. These include the concepts of divergence, the use of the automatic database mount dial, and the use of public and private networks.
Managing Database Copies
After multiple copies of a database are created, you can use the Exchange Management Console (EMC) and the Exchange Management Shell to monitor the health and status of each copy and to perform other management tasks associated with database copies. Some of the management tasks you may need to perform include suspending or resuming a database copy, seeding a database copy, monitoring database copies, configuring database copy settings, and removing a database copy.
Suspending and Resuming Database Copies
For a variety of reasons, such as performing planned maintenance, it may be necessary to suspend and resume continuous replication activity for a database copy. In addition, some administrative tasks, such as seeding, require you to first suspend a database copy. We also recommend that all replication activity be suspended when the path for the database or its log files is being changed. You can suspend and resume database copy activity by using the EMC, or by running the Suspend-MailboxDatabaseCopy and Resume-MailboxDatabaseCopy cmdlets in the Shell. For detailed steps to suspend or resume continuous replication activity for a database copy, see Suspend or Resume a Mailbox Database Copy.
Log truncation doesn't occur on the active mailbox database copy when one or more passive copies are suspended. If your planned maintenance activities are going to take an extended period of time (for example, several days), you may have considerable log file buildup. To prevent the log drive from filling up with transaction logs, you can remove the affected passive database copy instead of suspending it. When the planned maintenance is completed, you can re-add the passive database copy.
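To judge whether a suspended copy is likely to cause a log drive to fill, a rough estimate can help. The following Python sketch is illustrative only: it assumes the 1 MB transaction log file size used by Exchange 2010, and the log generation rate and maintenance window are hypothetical values that you would replace with figures measured in your own environment.

```python
# Exchange 2010 transaction log files are 1 MB each.
LOG_FILE_SIZE_MB = 1


def estimate_log_buildup_gb(logs_per_hour: float, hours_suspended: float) -> float:
    """Rough size, in GB, of logs retained while truncation is blocked."""
    if logs_per_hour < 0 or hours_suspended < 0:
        raise ValueError("inputs must be non-negative")
    total_mb = logs_per_hour * hours_suspended * LOG_FILE_SIZE_MB
    return total_mb / 1024


# Hypothetical workload: 500 logs per hour, passive copy suspended for 3 days.
buildup = estimate_log_buildup_gb(logs_per_hour=500, hours_suspended=72)
print(f"Estimated log buildup: {buildup:.1f} GB")
```

If the estimate approaches the free space on the log drive, removing and later re-adding the passive copy may be the safer choice.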
Seeding a Database Copy
Seeding, also known as updating, is the process in which a database, either a blank database or a copy of the production database, is added to the target copy location on another Mailbox server in the same database availability group (DAG) as the production database. This becomes the baseline database for the copy maintained by that server.
Depending on the situation, seeding can be an automatic process or a manual process that you initiate. When a database copy is added, the copy will be automatically seeded, provided that the target server and its storage are properly configured. If you want to manually seed a database copy and don't want automatic seeding to occur when creating the copy, you can use the SeedingPostponed parameter when running the Add-MailboxDatabaseCopy cmdlet.
Database copies rarely need to be reseeded after the initial seeding has occurred. But if reseeding is necessary, or if you want to manually seed a database copy instead of having the system automatically seed the copy, these tasks can be performed by using the Update Database Copy wizard in the EMC or by using the Update-MailboxDatabaseCopy cmdlet in the Shell. Before seeding a database copy, you must first suspend the mailbox database copy. For detailed steps to seed a database copy, see Update a Mailbox Database Copy.
After a manual seed operation has completed, replication for the seeded mailbox database copy is automatically resumed. If you don't want replication to automatically resume, you can use the ManualResume parameter when running the Update-MailboxDatabaseCopy cmdlet.
Choosing What to Seed
When performing a seed operation, you can choose to seed the mailbox database copy, the content index catalog for the mailbox database copy, or both the database copy and the content index catalog copy. The default behavior of the Update Database Copy wizard and the Update-MailboxDatabaseCopy cmdlet is to seed both the mailbox database copy and the content index catalog copy. To seed just the mailbox database copy without seeding the content index catalog, use the DatabaseOnly parameter when running the Update-MailboxDatabaseCopy cmdlet. To seed just the content index catalog copy, use the CatalogOnly parameter when running the Update-MailboxDatabaseCopy cmdlet.
Selecting the Seeding Source
In Exchange 2007, continuous replication could only seed a database copy by copying the active copy of the database. In Exchange 2010, any healthy database copy can be used as the seeding source for an additional copy of that database. This is particularly useful when you have a DAG that has been extended across multiple physical locations. For example, consider a four-member DAG deployment, where two members (MBX1 and MBX2) are located in Portland, Oregon and two members (MBX3 and MBX4) are located in New York, New York. A mailbox database named DB1 is active on MBX1 and there are passive copies of DB1 on MBX2 and MBX3. When adding a copy of DB1 to MBX4, you have the option of using the copy on MBX3 as the source for seeding, and in doing so, you avoid seeding over the wide area network (WAN) link between Portland and New York.
To use a specific copy as a source for seeding when adding a new database copy, you would do the following:
- Use the SeedingPostponed parameter when running the Add-MailboxDatabaseCopy cmdlet to add the database copy. If the SeedingPostponed parameter isn't used, the database copy will be automatically seeded using the active copy of the database as the source.
- Use the SourceServer parameter when running the Update-MailboxDatabaseCopy cmdlet and specify the desired source server for seeding. In the preceding example, you would specify MBX3 as the source server. If the SourceServer parameter isn't used, the database copy will be explicitly seeded using the active copy of the database as the source.
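The two steps above can be sketched as a pair of command lines. The following Python helper only assembles the command text; it doesn't run anything against Exchange. The SeedingPostponed, SourceServer, DatabaseOnly, CatalogOnly, and ManualResume parameters come from this topic. The Identity and MailboxServer parameters, and the DB1\MBX4 identity format, aren't described in this topic and are included here as assumptions about typical usage. Treating DatabaseOnly and CatalogOnly as mutually exclusive is also an assumption of this sketch.

```python
from typing import Optional


def build_add_command(database: str, mailbox_server: str,
                      seeding_postponed: bool = False) -> str:
    """Assemble an Add-MailboxDatabaseCopy command line as text."""
    parts = ["Add-MailboxDatabaseCopy", "-Identity", database,
             "-MailboxServer", mailbox_server]
    if seeding_postponed:
        parts.append("-SeedingPostponed")
    return " ".join(parts)


def build_update_command(copy_identity: str,
                         source_server: Optional[str] = None,
                         database_only: bool = False,
                         catalog_only: bool = False,
                         manual_resume: bool = False) -> str:
    """Assemble an Update-MailboxDatabaseCopy command line as text."""
    if database_only and catalog_only:
        # Assumption: this sketch treats the two "only" switches as exclusive.
        raise ValueError("choose DatabaseOnly or CatalogOnly, not both")
    parts = ["Update-MailboxDatabaseCopy", "-Identity", copy_identity]
    if source_server:
        parts += ["-SourceServer", source_server]
    if database_only:
        parts.append("-DatabaseOnly")
    if catalog_only:
        parts.append("-CatalogOnly")
    if manual_resume:
        parts.append("-ManualResume")
    return " ".join(parts)


# Add the copy of DB1 on MBX4 without seeding, then seed it from MBX3.
print(build_add_command("DB1", "MBX4", seeding_postponed=True))
print(build_update_command(r"DB1\MBX4", source_server="MBX3"))
```

Keeping the add and the seed as separate commands is what lets the New York copy on MBX3, rather than the active copy in Portland, act as the seeding source.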
Seeding and Networks
In addition to selecting a specific source server for seeding a mailbox database copy, you can also specify which DAG networks to use, and optionally override the DAG network's compression and encryption settings during the seed operation.
To specify the networks you want to use for seeding, use the Network parameter when running the Update-MailboxDatabaseCopy cmdlet and specify the DAG networks that you want to use. If you don't use the Network parameter, the system uses the following default behavior for selecting a network to use for the seeding operation:
- If the source server and target server are on the same subnet and a replication network has been configured that includes the subnet, the replication network will be used.
- If the source server and target server are on different subnets, even if a replication network that contains those subnets has been configured, the client (MAPI) network will be used for seeding.
- If the source server and target server are in different datacenters, the client (MAPI) network will be used for seeding.
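The default selection rules above can be summarized as a small decision function. This Python sketch is a model of the three rules as written, not the product's implementation. The function and argument names are hypothetical, and the fallback for a case the rules don't cover (same subnet, but no replication network that includes it) is an assumption.

```python
def default_seeding_network(same_subnet: bool,
                            replication_network_includes_subnet: bool,
                            same_datacenter: bool = True) -> str:
    """Return "Replication" or "MAPI" according to the default rules."""
    if not same_datacenter:
        # Different datacenters: the client (MAPI) network is used.
        return "MAPI"
    if not same_subnet:
        # Different subnets: MAPI is used even if a replication
        # network that contains those subnets has been configured.
        return "MAPI"
    if replication_network_includes_subnet:
        # Same subnet, covered by a replication network.
        return "Replication"
    # Assumption: this case isn't covered by the rules above.
    return "MAPI"


print(default_seeding_network(same_subnet=True,
                              replication_network_includes_subnet=True))
```

Specifying the Network parameter bypasses this default selection entirely.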
At the DAG level, DAG networks are configured for encryption and compression. The default settings are to use encryption and compression only for communications on different subnets. If the source and target are on different subnets and the DAG is configured with the default values for NetworkCompression and NetworkEncryption, you can override these values by using the NetworkCompressionOverride and NetworkEncryptionOverride parameters, respectively, when running the Update-MailboxDatabaseCopy cmdlet.
Seeding Process
When you initiate a seeding process by using the Add-MailboxDatabaseCopy or Update-MailboxDatabaseCopy cmdlets, the following tasks are performed:
- Database properties from Active Directory are read to validate the specified database and servers, and to verify that the source and target servers are running Exchange 2010, they are both members of the same DAG, and that the specified database isn't a recovery database. The database file paths are also read.
- Preparations occur for reseed checks from the Microsoft Exchange Replication service on the target server.
- The Microsoft Exchange Replication service on the target server checks for the presence of database and transaction log files in the file directories read by the Active Directory checks in the first step.
- The Microsoft Exchange Replication service returns the status information from the target server to the administrative interface from where the cmdlet was run.
- If all preliminary checks have passed, you are prompted to confirm the operation before continuing. If you confirm the operation, the process continues. If an error is encountered during the preliminary checks, the error is reported and the operation fails.
- The seed operation is started from the Microsoft Exchange Replication service on the target server.
- The Microsoft Exchange Replication service suspends database replication for the active database copy.
- The state information for the database is updated by the Microsoft Exchange Replication service to reflect a status of Seeding.
- If the target server doesn't already have the directories for the target database and log files, they are created.
- A request to seed the database is passed from the Microsoft Exchange Replication service on the target server to the Microsoft Exchange Replication service on the source server using TCP. This request and the subsequent communications for seeding the database occur on a DAG network that has been configured as a replication network.
- The Microsoft Exchange Replication service on the source server initiates an Extensible Storage Engine (ESE) streaming backup via the Microsoft Exchange Information Store service interface.
- The Microsoft Exchange Information Store service streams the database data to the Microsoft Exchange Replication service.
- The database data is moved from the source server's Microsoft Exchange Replication service to the target server's Microsoft Exchange Replication service.
- The Microsoft Exchange Replication service on the target server writes the database copy to a temporary directory located in the main database directory called temp-seeding.
- The streaming backup operation on the source server ends when the end of the database is reached.
- The write operation on the target server completes and the database is moved from the temp-seeding directory to the final location. The temp-seeding directory is deleted.
- On the target server, the Microsoft Exchange Replication service proxies a request to the Microsoft Exchange Search service to mount the content index catalog for the database copy, if it exists. If there are out-of-date catalog files from a previous instance of the database copy, the mount operation fails, which triggers the need to replicate the catalog from the source server. Likewise, if the catalog doesn't exist, as is the case for a new instance of the database copy on the target server, a copy of the catalog is required. The Microsoft Exchange Replication service directs the Microsoft Exchange Search service to suspend indexing for the database copy while a new catalog is copied from the source.
- The Microsoft Exchange Replication service on the target server sends a seed catalog request to the Microsoft Exchange Replication service on the source server.
- On the source server, the Microsoft Exchange Replication service requests the directory information from the Microsoft Exchange Search service and requests that indexing be suspended.
- The Microsoft Exchange Search service on the source server returns the search catalog directory information to the Microsoft Exchange Replication service.
- The Microsoft Exchange Replication service on the source server reads the catalog files from the directory.
- The Microsoft Exchange Replication service on the source server moves the catalog data to the Microsoft Exchange Replication service on the target server using a connection across the replication network. After the read is complete, the Microsoft Exchange Replication service sends a request to the Microsoft Exchange Search service to resume indexing of the source database.
- If there are any existing catalog files on the target server in the directory, the Microsoft Exchange Replication service on the target server deletes them.
- The Microsoft Exchange Replication service on the target server writes the catalog data to a temporary directory called CiSeed.Temp until the data is completely transferred.
- The Microsoft Exchange Replication service moves the complete catalog data to the final location.
- The Microsoft Exchange Replication service on the target server resumes search indexing on the target database.
- The Microsoft Exchange Replication service on the target server returns a completion status.
- The final result of the operation is passed to the administrative interface from which the cmdlet was called.
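Both the database and the catalog are written to a temporary directory first and moved to their final location only after the transfer is complete. The following Python sketch illustrates that staging pattern for a single file on a local file system. It isn't how the Microsoft Exchange Replication service is implemented; the function name, chunk size, and file names are hypothetical, and only the temp-seeding directory name is taken from the steps above.

```python
import os
import shutil
import tempfile


def seed_file(source_path: str, target_dir: str,
              chunk_size: int = 1024 * 1024) -> str:
    """Stream a file into target_dir via a temp-seeding staging directory."""
    staging_dir = os.path.join(target_dir, "temp-seeding")
    os.makedirs(staging_dir, exist_ok=True)  # also creates target_dir

    file_name = os.path.basename(source_path)
    staged_path = os.path.join(staging_dir, file_name)
    final_path = os.path.join(target_dir, file_name)

    # Stream the data in chunks until the end of the source is reached.
    with open(source_path, "rb") as src, open(staged_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)

    # Move the complete file into place, then delete the staging directory.
    os.replace(staged_path, final_path)
    shutil.rmtree(staging_dir)
    return final_path


with tempfile.TemporaryDirectory() as root:
    source = os.path.join(root, "source.edb")
    with open(source, "wb") as f:
        f.write(b"example database pages")
    final = seed_file(source, os.path.join(root, "target"))
    print(os.path.basename(final))
```

Staging the write means an interrupted transfer never leaves a partial file in the final location.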
Configuring Database Copies
After a database copy is created, you can view and modify its configuration settings when needed. You can view some configuration information by examining the Properties page for a database copy in the EMC. You can also use the Get-MailboxDatabase and Set-MailboxDatabaseCopy cmdlets in the Shell to view and configure database copy settings, such as replay lag time, truncation lag time, and activation preference order. For detailed steps to view and configure database copy settings, see Configure Mailbox Database Copy Properties.
Using Replay Lag and Truncation Lag Options
Mailbox database copies support the use of a replay lag time and a truncation lag time, both of which are configured in minutes. Setting a replay lag time enables you to take a database copy back to a specific point in time. Setting a truncation lag time enables you to use the logs on a passive database copy to recover from the loss of log files on the active database copy. Because both of these features result in the temporary build-up of log files, using either of them will affect your storage design.
Replay Lag Time
Replay lag time is a property of a mailbox database copy that specifies the amount of time, in minutes, to delay log replay for the database copy. The replay lag timer starts when a log file has been replicated to the passive copy and has successfully passed inspection. By delaying the replay of logs to the database copy, you have the capability to recover the database to a specific point in time in the past. A mailbox database copy configured with a replay lag time greater than 0 is referred to as a lagged mailbox database copy, or simply, a lagged copy.
A strategy that uses database copies and the litigation hold features in Exchange 2010 can provide protection against a range of failures that would ordinarily cause data loss. However, these features can't provide protection against data loss in the event of logical corruption, which although rare, can cause data loss. Lagged copies are designed to prevent loss of data in the case of logical corruption. Generally, there are two types of logical corruption:
- Database logical corruption The database pages checksum matches, but the data on the pages is wrong logically. This can occur when ESE attempts to write a database page and even though the operating system returns a success message, the data is either never written to the disk or it's written to the wrong place. This is referred to as a lost flush. To prevent lost flushes from losing data, ESE includes a lost flush detection mechanism in the database along with a page patching feature (single page restore).
- Store logical corruption Data is added, deleted, or manipulated in a way that the user doesn't expect. These cases are generally caused by third-party applications. It is generally only corruption in the sense that the user views it as corruption. The Exchange store considers the transaction that produced the logical corruption to be a series of valid MAPI operations. The litigation hold feature in Exchange 2010 provides protection from store logical corruption (because it prevents content from being permanently deleted by a user or application). However, there may be scenarios where a user mailbox becomes so corrupted that it would be easier to restore the database to a point in time prior to the corruption, and then export the user mailbox to retrieve uncorrupted data.
The combination of database copies, hold policy, and ESE single page restore leaves only the rare but catastrophic store logical corruption case. Your decision on whether to use a database copy with a replay lag (a lagged copy) will depend on which third-party applications you use and your organization's history with store logical corruption.
If you choose to use lagged copies, be aware of the following implications for their use:
- Unlike standby continuous replication (SCR) in Exchange 2007, which had a hard-coded replay lag of 50 log files, there's no hard-coded number of lagged log files. Instead, the replay lag time is an administrator-configured value, and by default, it's disabled.
- The replay lag time setting has a default setting of 0 days, and a maximum setting of 14 days.
- Lagged copies aren't considered highly available copies. Instead, they are designed for disaster recovery purposes, to protect against store logical corruption.
- The greater the replay lag time, the longer the database recovery process. Depending on the number of log files that need to be replayed during recovery, and the speed at which your hardware can replay them, it may take several hours or more to recover a database.
- We recommend that you determine whether lagged copies are critical for your overall disaster recovery strategy. If using them is critical to your strategy, we recommend using multiple lagged copies or, if you don't have multiple lagged copies, using a redundant array of independent disks (RAID) to protect a single lagged copy. That way, if you lose a disk or if corruption occurs, you don't lose your lagged point in time.
- Lagged copies aren't patchable with the ESE single page restore feature. If a lagged copy encounters database page corruption (for example, a -1018 error), it will have to be reseeded (which will lose the lagged aspect of the copy).
Activating and recovering a lagged mailbox database copy is an easy process if you want the database to replay all log files and make the database copy current. If you want to replay log files up to a specific point in time, it's a more difficult operation because you manually manipulate log files and run the Eseutil tool.
For detailed steps to activate a lagged mailbox database copy, see Activate a Lagged Mailbox Database Copy.
Truncation Lag Time
Truncation lag time is a property of a mailbox database copy that specifies the amount of time, in minutes, to delay log deletion for the database copy after the log file has been replayed into the database copy. The truncation lag timer starts when a log file has been replicated to the passive copy, has successfully passed inspection, and has been successfully replayed into the copy of the database. By delaying the truncation of log files from the database copy, you have the capability to recover from failures that affect the log files for the active copy of the database.
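The two lag timers start at different events, which is easy to overlook when sizing storage. The following Python sketch models only the timer arithmetic described in this topic: the replay lag timer starts when a log file passes inspection, and the truncation lag timer starts when the log file has been replayed. The function names are hypothetical, and the 0 to 14 day range check reflects the replay lag limits stated earlier.

```python
from datetime import datetime, timedelta

MAX_REPLAY_LAG = timedelta(days=14)


def validate_replay_lag(replay_lag: timedelta) -> timedelta:
    """Reject replay lag values outside the 0 to 14 day range."""
    if not timedelta(0) <= replay_lag <= MAX_REPLAY_LAG:
        raise ValueError("replay lag must be between 0 and 14 days")
    return replay_lag


def replay_eligible_at(inspected_at: datetime,
                       replay_lag: timedelta) -> datetime:
    """Earliest time a log file may be replayed into the passive copy."""
    return inspected_at + validate_replay_lag(replay_lag)


def truncation_eligible_at(replayed_at: datetime,
                           truncation_lag: timedelta) -> datetime:
    """Earliest time a replayed log file may be deleted from the copy."""
    return replayed_at + truncation_lag


# Hypothetical log file inspected at noon with a 7-day replay lag.
inspected = datetime(2010, 1, 1, 12, 0)
replay_time = replay_eligible_at(inspected, timedelta(days=7))
print(replay_time)
print(truncation_eligible_at(replay_time, timedelta(hours=2)))
```

In this model, a log file stays on the passive copy's log volume for at least the sum of both lag times, which is the buildup your storage design has to absorb.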
Database Copies and Log Truncation
Log truncation works the same in Exchange 2010 as it did in Exchange 2007. Truncation behavior is determined by the replay lag time and truncation lag time settings for the copy.
The following criteria must be met for a database copy's log file to be truncated when lag settings are left at their default values of 0 (disabled):
- The log file must have been successfully backed up, or circular logging must be enabled.
- The log file must be below the checkpoint (the minimum log file required for recovery) for the database.
- All other lagged copies must have inspected the log file.
- All other copies (not lagged copies) must have replayed the log file.
The following criteria must be met for truncation to occur for a lagged database copy:
- The log file must be below the checkpoint for the database.
- The log file must be older than ReplayLagTime + TruncationLagTime.
- The log file must have been truncated on the active copy.
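The two sets of criteria can be read as two predicates, each of which must be fully satisfied before a log file is truncated. The following Python sketch restates the lists above; the class, field, and function names are hypothetical and don't correspond to any Exchange API.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class LogFileState:
    backed_up: bool
    circular_logging: bool
    below_checkpoint: bool
    inspected_by_all_lagged_copies: bool
    replayed_by_all_other_copies: bool


def can_truncate_default(log: LogFileState) -> bool:
    """Criteria when replay lag and truncation lag are both 0 (disabled)."""
    return ((log.backed_up or log.circular_logging)
            and log.below_checkpoint
            and log.inspected_by_all_lagged_copies
            and log.replayed_by_all_other_copies)


def can_truncate_lagged(below_checkpoint: bool,
                        log_age: timedelta,
                        replay_lag: timedelta,
                        truncation_lag: timedelta,
                        truncated_on_active: bool) -> bool:
    """Criteria for a log file on a lagged database copy."""
    return (below_checkpoint
            and log_age > replay_lag + truncation_lag
            and truncated_on_active)


log = LogFileState(backed_up=False, circular_logging=True,
                   below_checkpoint=True,
                   inspected_by_all_lagged_copies=True,
                   replayed_by_all_other_copies=False)
print(can_truncate_default(log))
```

In the example, a single copy that hasn't yet replayed the log file is enough to hold truncation back.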
Database Activation Policy
There are scenarios in which you may want to create a mailbox database copy and prevent the system from automatically activating that copy in the event of a failure. For example:
- If you deploy one or more mailbox database copies to a second or standby data center.
- If you configure a database copy as a lagged copy for recovery purposes.
- If you are performing maintenance or an upgrade of a server.
In each of the preceding scenarios, you have database copies that you don't want the system to activate automatically. To prevent the system from automatically activating a mailbox database copy, you can configure the copy to be blocked (suspended) for activation. The system then continues to keep the copy current through log shipping and replay, but it won't automatically activate and use the copy. Copies blocked for activation must be manually activated by an administrator. You can configure the database activation policy by using the Set-MailboxServer cmdlet to set the DatabaseCopyAutoActivationPolicy parameter to Blocked. Because this parameter is set on the Mailbox server, it applies to every database copy hosted on that server.
For more information about configuring database activation policy, see Configure Activation Policy for a Mailbox Database Copy.
Balancing Database Copies
Due to the inherent nature of DAGs, as the result of database switchovers and failovers, active mailbox database copies will change hosts several times throughout a DAG's lifetime. As a result, DAGs can become unbalanced in terms of active mailbox database copy distribution. The following table shows an example of a four-member DAG that has 16 databases with four copies of each database (for a total of 16 database copies on each server) with an uneven distribution of active database copies.
DAG with unbalanced active copy distribution
Server | Number of active databases | Number of passive databases | Number of mounted databases | Number of dismounted databases | Preference count list |
---|---|---|---|---|---|
EX1 | 5 | 11 | 5 | 0 | 4, 4, 3, 5 |
EX2 | 1 | 15 | 1 | 0 | 1, 8, 6, 1 |
EX3 | 12 | 4 | 12 | 0 | 13, 2, 1, 0 |
EX4 | 1 | 15 | 1 | 0 | 1, 1, 5, 9 |
In the preceding example, there are four copies of each database, and therefore, only four possible values for activation preference (1, 2, 3, or 4). The Preference count list column shows the count of the number of databases with each of these values. For example, on EX3, there are 13 database copies with an activation preference of 1, two copies with an activation preference of 2, one copy with an activation preference of 3, and no copies with an activation preference of 4.
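The preference count list is simply a tally of activation preference values across the copies a server hosts. As an illustration, the following Python sketch rebuilds the EX3 entry from a list of activation preferences. The function name is hypothetical, and the input list is constructed to match the EX3 row rather than read from a real DAG.

```python
from collections import Counter
from typing import Iterable, List


def preference_count_list(activation_preferences: Iterable[int],
                          copies_per_database: int = 4) -> List[int]:
    """Count how many hosted copies have each activation preference value."""
    counts = Counter(activation_preferences)
    return [counts.get(p, 0) for p in range(1, copies_per_database + 1)]


# EX3 hosts 13 copies with preference 1, two with 2, one with 3, none with 4.
ex3_preferences = [1] * 13 + [2] * 2 + [3]
print(preference_count_list(ex3_preferences))  # [13, 2, 1, 0]
```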
As you can see, this DAG is not balanced in terms of the number of active databases hosted by each DAG member, the number of passive databases hosted by each DAG member, or the activation preference count of the hosted databases.
You can use the RedistributeActiveDatabases.ps1 script to balance the active mailbox database copies across a DAG. This script moves databases between their copies in an attempt to have an equal number of mounted databases on each server in the DAG. If required, the script also attempts to balance active databases across sites.
The script provides two options for balancing active database copies within a DAG:
- BalanceDbsByActivationPreference When this option is specified, the script attempts to move databases to their most preferred copy (based on Activation Preference) without regard to Active Directory site.
- BalanceDbsBySiteAndActivationPreference When this option is specified, the script attempts to move active databases to their most preferred copy, while also trying to balance active databases within each Active Directory site.
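To make the idea of a "most preferred copy" concrete, the following Python sketch picks, for each database, the server that hosts the copy with the lowest activation preference number, and then counts the resulting active databases per server. This is a deliberately simplified model of the first option only. It isn't the script's actual algorithm, it ignores copy health and Active Directory sites, and the database names, server names, and preference values are hypothetical.

```python
from collections import Counter
from typing import Dict


def activate_most_preferred(copies: Dict[str, Dict[str, int]]) -> Dict[str, str]:
    """Map each database to the server holding its most preferred copy."""
    return {db: min(prefs, key=prefs.get) for db, prefs in copies.items()}


def active_counts(placement: Dict[str, str]) -> Counter:
    """Count active databases per server for a given placement."""
    return Counter(placement.values())


# Hypothetical layout: database -> {server: activation preference}.
copies = {
    "DB1": {"EX1": 1, "EX2": 2},
    "DB2": {"EX1": 2, "EX2": 1},
    "DB3": {"EX1": 1, "EX2": 2},
    "DB4": {"EX1": 2, "EX2": 1},
}
placement = activate_most_preferred(copies)
print(placement)
print(active_counts(placement))
```

The result is only as balanced as the activation preference values themselves, which is why an even preference count list on every server matters.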
After running the script with the first option, the preceding unbalanced DAG becomes balanced, as shown in the following table.
DAG with balanced active copy distribution
Server | Number of active databases | Number of passive databases | Number of mounted databases | Number of dismounted databases | Preference count list |
---|---|---|---|---|---|
EX1 | 4 | 12 | 4 | 0 | 4, 4, 4, 4 |
EX2 | 4 | 12 | 4 | 0 | 4, 4, 4, 4 |
EX3 | 4 | 12 | 4 | 0 | 4, 4, 4, 4 |
EX4 | 4 | 12 | 4 | 0 | 4, 4, 4, 4 |
As shown in the preceding table, this DAG is now balanced in terms of number of active and passive databases on each server and activation preference across the servers.
The following table lists the available parameters for the RedistributeActiveDatabases.ps1 script.
RedistributeActiveDatabases.ps1 script parameters
Parameter | Description |
---|---|
DagName | Specifies the name of the DAG you want to rebalance. If this parameter is omitted, the DAG of which the local server is a member is used. |
BalanceDbsByActivationPreference | Specifies that the script should move databases to their most preferred copy without regard to Active Directory site. |
BalanceDbsBySiteAndActivationPreference | Specifies that the script should attempt to move active databases to their most preferred copy, while also trying to balance active databases within each Active Directory site. |
ShowFinalDatabaseDistribution | Specifies that a report of current database distribution be displayed after redistribution is complete. |
AllowedDeviationFromMeanPercentage | Specifies the allowed variation of active databases across sites, expressed as a percentage. The default is 20%. For example, if there were 99 databases distributed between three sites, the ideal distribution would be 33 databases in each site. If the allowed deviation is 20%, the script attempts to balance the databases so that each site has no more than 10% more or less than this number. 10% of 33 is 3.3, which is rounded up to 4. Therefore, the script attempts to have between 29 and 37 databases in each site. |
ShowDatabaseCurrentActives | Specifies that the script produce a report for each database detailing how the database was moved and whether it is now active on its most-preferred copy. |
ShowDatabaseDistributionByServer | Specifies that the script produce a report for each server showing its database distribution. |
RunOnlyOnPAM | Specifies that the script run only on the DAG member that currently has the PAM role. The script verifies it is being run from the PAM. If it is not being run from the PAM, the script exits. |
LogEvents | Specifies that the script logs an event (MsExchangeRepl event 4115) containing a summary of the actions. |
IncludeNonReplicatedDatabases | Specifies that the script should include non-replicated databases (databases without copies) when determining how to redistribute the active databases. Although non-replicated databases can't be moved, they may affect the distribution of the replicated databases. |
Confirm | The Confirm switch can be used to suppress the confirmation prompt that appears by default when this script is run. To suppress the confirmation prompt, use the syntax -Confirm:$False. You must include a colon ( : ) in the syntax. |
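The arithmetic in the AllowedDeviationFromMeanPercentage description can be reproduced with a few lines of Python. This sketch follows the worked example in the table (half of the allowed percentage on either side of the mean, with the margin rounded up); the function name is hypothetical, and how the script handles a mean that isn't a whole number is not covered here.

```python
import math
from typing import Tuple


def site_balance_range(total_databases: int, site_count: int,
                       allowed_deviation_pct: float = 20) -> Tuple[float, float]:
    """Return the (low, high) number of active databases targeted per site."""
    ideal = total_databases / site_count
    # A 20% allowed deviation means up to 10% above or below the ideal.
    margin = math.ceil(ideal * (allowed_deviation_pct / 2) / 100)
    return ideal - margin, ideal + margin


low, high = site_balance_range(total_databases=99, site_count=3)
print(low, high)
```

With 99 databases in three sites, the ideal is 33 per site and the rounded-up margin is 4, which gives the range of 29 to 37 quoted in the table.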
RedistributeActiveDatabases.ps1 Examples
This example shows the current database distribution for a DAG, including preference count list.
RedistributeActiveDatabases.ps1 -DagName DAG1 -ShowDatabaseDistributionByServer | Format-Table
This example redistributes and balances the active mailbox database copies in a DAG using activation preference without prompting for input.
RedistributeActiveDatabases.ps1 -DagName DAG1 -BalanceDbsByActivationPreference -Confirm:$False
This example redistributes and balances the active mailbox database copies in a DAG using activation preference, and produces a summary of the distribution.
RedistributeActiveDatabases.ps1 -DagName DAG1 -BalanceDbsByActivationPreference -ShowFinalDatabaseDistribution
Monitoring Database Copies
A database copy is your first defense if a failure occurs that affects the active copy of a database. It is therefore critical to monitor the health and status of database copies to ensure that they will be available when needed. You can view some health and status information by examining the Properties page for a database copy in the EMC. You can also use the Get-MailboxDatabaseCopyStatus cmdlet in the Shell to view a variety of status information for a database copy.
For more information about monitoring database copies, see Monitoring High Availability and Site Resilience.
Removing a Database Copy
A database copy can be removed at any time by using the EMC or by using the Remove-MailboxDatabaseCopy cmdlet in the Shell. After removing a database copy, you must manually delete any database and transaction log files from the server from which the database copy is being removed. For detailed steps to remove a database copy, see Remove a Mailbox Database Copy.
Database Switchovers
The Mailbox server that hosts the active copy of a database is referred to as the mailbox database master. The process of activating a passive database copy changes the mailbox database master for the database and turns the passive copy into the new active copy. This process is called a database switchover. In a database switchover, the active copy of a database is dismounted on one Mailbox server and a passive copy of that database is mounted as the new active mailbox database on another Mailbox server. When performing a switchover, you can optionally override the database mount dial setting on the new mailbox database master.
You can quickly identify which Mailbox server is the current mailbox database master by reviewing the Copy Status column under the Database Copies tab in the EMC. Only the active copy will have a status of Mounted. All other database copies will display the current status of replication for the database copy. You can perform a switchover by using the Move Mailbox Database Master wizard in the EMC, or by using the Move-ActiveMailboxDatabase cmdlet in the Shell.
There are several internal checks that will be performed before activating a passive copy:
- The status of the database copy is checked. If the database copy is in a failed state, the switchover is blocked. You can override this behavior and bypass the health check by using the SkipHealthChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter allows you to move the active copy to a database copy in a failed state.
- The copy queue and replay queue lengths for the database copy are checked to ensure their values are within the configured criteria. Also, the database copy is verified to ensure that it isn't currently in use as a source for seeding. If the values for the queue lengths are outside the configured criteria, or if the database is currently used as a source for seeding, the switchover is blocked. You can override this behavior and bypass these checks by using the SkipLagChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter allows a copy to be activated that has replay and copy queues outside of the configured criteria.
- The state of the search catalog (content index) for the database copy is checked. If the search catalog isn't up to date, is in an unhealthy state, or is corrupt, the switchover is blocked. You can override this behavior and bypass the search catalog check by using the SkipClientExperienceChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter causes the switchover to skip the catalog health check. If the search catalog for the database copy you are activating is in an unhealthy or unusable state and you use this parameter to skip the catalog health check and activate the database copy, you will need to either crawl or seed the search catalog again.
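The relationship between the three checks and the three skip parameters can be modeled as follows. This Python sketch restates the list above and isn't the product's implementation. The class, field, and function names are hypothetical; only the parameter names SkipHealthChecks, SkipLagChecks, and SkipClientExperienceChecks come from this topic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CopyState:
    failed: bool
    queues_within_limits: bool
    seeding_source: bool
    catalog_healthy: bool


def switchover_blockers(copy: CopyState,
                        skip_health_checks: bool = False,
                        skip_lag_checks: bool = False,
                        skip_client_experience_checks: bool = False) -> List[str]:
    """Return the reasons a switchover to this copy would be blocked."""
    blockers = []
    if copy.failed and not skip_health_checks:
        blockers.append("copy is in a failed state")
    if not skip_lag_checks:
        if not copy.queues_within_limits:
            blockers.append("copy or replay queue length is outside the configured criteria")
        if copy.seeding_source:
            blockers.append("copy is in use as a seeding source")
    if not copy.catalog_healthy and not skip_client_experience_checks:
        blockers.append("search catalog is not healthy and up to date")
    return blockers


copy = CopyState(failed=False, queues_within_limits=False,
                 seeding_source=False, catalog_healthy=False)
print(switchover_blockers(copy))
print(switchover_blockers(copy, skip_lag_checks=True,
                          skip_client_experience_checks=True))
```

Each skip parameter removes only its own category of blocker, so bypassing one check doesn't hide problems reported by the others.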
When performing a database switchover, you also have the option of overriding the mount dial settings configured for the server that hosts the passive database copy being activated. Using the MountDialOverride parameter of the Move-ActiveMailboxDatabase cmdlet instructs the target server to override its own mount dial settings and use those specified by the MountDialOverride parameter.
For detailed steps to perform a switchover of a database copy, see Activate a Mailbox Database Copy. For more information about database switchovers, see Switchovers and Failovers.