Mailbox Server Counters

[This topic is in progress.]

This topic describes common performance and scalability counters for the Mailbox server. The following table shows the I/O latency counters for active database copies. When these values are exceeded, the client experience degrades; for example, users see slow response times and message delivery is delayed.

Counter Description Threshold Troubleshooting

MSExchange Database\I/O Database Reads (Attached) Average Latency

Indicates the average time, in milliseconds (ms), to read from the database file.

The average value should be below 20ms.

Spikes (maximum values) should not be higher than 100ms.

MSExchange Database\I/O Database Writes (Attached) Average Latency

Indicates the average time, in milliseconds (ms), to write to the database file.

This counter is not a good indicator of client latency because database writes are asynchronous.

In general, however, this latency should be less than the MSExchange Database\I/O Database Reads (Attached) Average Latency when battery-backed write caching is utilized.

Database\Database Page Fault Stalls/sec

Indicates the rate of page faults that cannot be serviced because there are no pages available for allocation from the database cache.

This counter should be 0 on production servers. 

If this counter is above 0, it is an indication that the MSExchange Database\I/O Database Writes (Attached) Average Latency is too high.
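The average and spike thresholds in the table above can be checked mechanically once counter samples have been collected. The following is a minimal sketch, not an Exchange tool: the function name `check_latency` and the sample values are hypothetical, and the samples are assumed to be in milliseconds and already exported from a performance log.

```python
from statistics import mean


def check_latency(samples_ms, avg_limit_ms=20.0, spike_limit_ms=100.0):
    """Return a list of threshold violations for one latency counter.

    Defaults follow the active database read thresholds in the table:
    average below 20 ms, spikes no higher than 100 ms.
    """
    if not samples_ms:
        raise ValueError("need at least one sample")
    problems = []
    avg = mean(samples_ms)
    peak = max(samples_ms)
    # The average value should be below the average limit.
    if avg >= avg_limit_ms:
        problems.append(f"average {avg:.1f} ms is not below {avg_limit_ms:g} ms")
    # Spikes (maximum values) should not be higher than the spike limit.
    if peak > spike_limit_ms:
        problems.append(f"spike {peak:.1f} ms exceeds {spike_limit_ms:g} ms")
    return problems


if __name__ == "__main__":
    reads = [8.0, 12.5, 15.0, 140.0, 9.5]  # made-up samples for illustration
    for line in check_latency(reads):
        print(line)
```

The two limits are parameters so that the same check can be reused for counters with different thresholds.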

The following table shows the I/O latency counters for the active log. When these values are exceeded, the client experience degrades; for example, users see slow response times and message delivery is delayed.

Counter Description Threshold Troubleshooting

MSExchange Database\IO Log Writes Average Latency

Indicates the average time in milliseconds (ms) to write a log buffer to the active log file.

This counter should be 0 on production servers.

If this counter is greater than 0, it is an indication that the MSExchange Database\I/O Database Writes (Attached) Average Latency is too high.

Database\Log Record Stalls/sec

Indicates the number of log records that cannot be added to the log buffers per second because the log buffers are full.

The average value should be below 10 per second.

Spikes (maximum values) should not be higher than 100 per second.

Database\Log Threads Waiting

Indicates the number of threads waiting to complete an update of the database by writing their data to the log.

The average value should be less than 10 threads waiting.

The following table shows passive database copy IO latency requirements counters. When these values are exceeded, the database copy may fall behind by not replaying logs to the passive database copy fast enough. Log replication performance may also be impacted.

Counter Description Threshold Troubleshooting

MSExchange Database\I/O Database Reads (Recovery) Average Latency

Indicates the average time in milliseconds (ms) to read from the database file.

The average value should be below 200 ms.

Spikes (maximum values) should not be higher than 1000 ms.

MSExchange Database\I/O Database Writes (Recovery) Average Latency

Indicates the average time, in milliseconds (ms), to write to the database file.

In general, this latency should be less than the MSExchange Database\I/O Database Reads (Attached) Average Latency when battery-backed write caching is utilized.

Database\Database Page Fault Stalls/sec

Indicates the rate of page faults that cannot be serviced because there are no pages available for allocation from the database cache.

This counter should be 0 on production servers. 

If this counter is greater than 0, it is an indication that the MSExchange Database\I/O Database Writes (Attached) Average Latency is too high.

The following table shows the I/O latency counter for the replay log. When these values are exceeded, the database copy may fall behind because logs are not replayed into the passive database copy fast enough. Log replication performance may also be impacted.

Counter Description Threshold Troubleshooting

MSExchange Database\IO Log Read Average Latency

Indicates the average time, in milliseconds (ms), to read data from a log file. Specific to log replay and database recovery operations.

The average value should be below 200 ms.

Spikes (maximum values) should not be higher than 1000 ms.

The following table shows Information Store RPC processing counters.

Counter Description Threshold Troubleshooting

MSExchangeIS\RPC Requests

Indicates the overall RPC requests that are currently executing within the information store process.

Should be below 70 at all times.

MSExchangeIS\RPC Averaged Latency

Indicates the RPC latency, in milliseconds, averaged for all operations in the last 1024 packets.

For information about how clients are affected when overall server RPC averaged latencies increase, see Understanding Client Throttling Policies.

Should not be higher than 10ms on average.

To determine if certain protocols are causing overall RPC latencies, monitor MSExchangeIS Client (*)\RPC Average Latency to separate latencies based on client protocol.

MSExchangeIS Mailbox\RPC Averaged Latency

Indicates the RPC latency, in milliseconds, averaged for all operations in the last 1024 packets.

Should not be higher than 10ms on average.

MSExchangeIS Client (*)\RPC Average Latency

Shows the server RPC latency, in milliseconds (ms), averaged for the past 1024 packets for a particular client protocol.

Should be less than 10 ms on average.

Wide disparities between different client types, such as IMAP4, Outlook Anywhere, or Other Clients (MAPI), can help direct troubleshooting to appropriate subcomponents.
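To illustrate how per-protocol latencies can direct troubleshooting, the sketch below ranks client types that are at or above the 10 ms average and reports how far apart the slowest and fastest client types are. The function names, the instance names used as dictionary keys, and the latency values are all assumptions made for the example.

```python
def protocols_over_limit(latency_by_client_ms, limit_ms=10.0):
    """Return client types whose average RPC latency is not below the limit, worst first."""
    over = {name: ms for name, ms in latency_by_client_ms.items() if ms >= limit_ms}
    return sorted(over, key=over.get, reverse=True)


def disparity_ratio(latency_by_client_ms):
    """Return slowest latency divided by fastest latency across client types."""
    values = [ms for ms in latency_by_client_ms.values() if ms > 0]
    if len(values) < 2:
        return 1.0
    return max(values) / min(values)


if __name__ == "__main__":
    # Hypothetical readings of MSExchangeIS Client (*)\RPC Average Latency
    sample = {"IMAP4": 42.0, "Outlook Anywhere": 12.0, "Other Clients": 4.0}
    print(protocols_over_limit(sample))
    print(disparity_ratio(sample))
```

A large ratio with only one client type over the limit suggests looking at the subcomponent that serves that protocol before looking at the store itself.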

The following table shows RPC client throttling counters.

Counter Description Threshold Troubleshooting

MSExchangeIS Client (*)\RPC Average Latency

Should be less than 10 ms on average.

MSExchangeIS\Client: RPCs Failed:Server Too Busy/sec

Shows the client-reported rate of failed RPCs (since the store was started) due to the Server Too Busy RPC error.

Should be 0 at all times.

Higher values may indicate RPC threads are exhausted or client throttling is occurring for clients running versions of Outlook earlier than Outlook 2007.

MSExchangeIS\Client: RPCs Failed:Server Too Busy

Shows the client-reported number of failed RPCs (since the store was started) due to the Server Too Busy RPC error.

Should be 0 at all times.

The following table shows message queuing counters.

Counter Description Threshold Troubleshooting

MSExchangeIS Mailbox(_Total)\Messages Queued for Submission

Shows the current number of submitted messages that are not yet processed by the transport layer.

Should be below 50 at all times. Higher values should not be sustained for more than 15 minutes.

This may indicate that there are connectivity issues to the transport server.

MSExchangeIS Public(_Total)\Messages Queued for Submission

Shows the current number of submitted messages that are not yet processed by the transport layer.

Should be less than 20 at all times.
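The mailbox queue guidance above combines a level (50 messages) with a duration (15 minutes). One way to evaluate that is to find the longest unbroken stretch of samples at or above the level. This is a minimal sketch: the function names are hypothetical, and the samples are assumed to be `(minute, queue_length)` pairs sorted by time.

```python
def longest_run_minutes(samples, limit=50):
    """Longest stretch, in minutes, of consecutive samples at or above limit.

    samples is a list of (minute, queue_length) pairs sorted by minute.
    """
    longest = 0
    run_start = None
    for minute, queued in samples:
        if queued >= limit:
            if run_start is None:
                run_start = minute  # a new run starts at this sample
            longest = max(longest, minute - run_start)
        else:
            run_start = None  # the queue dropped below the limit; the run ends
    return longest


def queue_is_backed_up(samples, limit=50, max_minutes=15):
    """True when the queue stayed at or above limit for more than max_minutes."""
    return longest_run_minutes(samples, limit) > max_minutes


if __name__ == "__main__":
    readings = [(0, 60), (5, 70), (10, 80), (15, 90), (20, 95)]  # made-up data
    print(queue_is_backed_up(readings))
```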

The following table shows database counters.

Counter Description Threshold Troubleshooting

MSExchange Database ==> Instances(*)\Log Generation Checkpoint Depth

Represents the amount of work, expressed as a count of log files, that will need to be redone or undone in the database files if the process fails.

Should be below 500 at all times for the Mailbox server role. A healthy server should indicate between 20 and 30 for each database instance.

If checkpoint depth increases continually for a sustained period, this is an indicator of either a long-running transaction (which will impact the version store) or of a bottleneck involving the database disks.

MSExchange Database(Information Store)\Database Page Fault Stalls/sec

Shows the rate at which database file page requests require the database cache manager to allocate a new page from the database cache.

If this value is non-zero, this indicates that the database is not able to flush dirty pages to the database file fast enough to make pages free for new page allocations.

MSExchange Database(Information Store)\Log Record Stalls/sec

Shows the number of log records that cannot be added to the log buffers per second because the log buffers are full. If this counter is non-zero for a long period of time, the log buffer size may be a bottleneck.

The average value should be below 10 per second. Spikes (maximum values) should not be higher than 100 per second.

If I/O log write latencies are high, check for RAID5 or sync replication on log devices.

MSExchange Database(Information Store)\Log Threads Waiting

Shows the number of threads waiting for their data to be written to the log to complete an update of the database. If this number is too high, the log may be a bottleneck.

Should be less than 10 on average.

Regular spikes concurrent with log record stall spikes indicate that the transaction log disks are a bottleneck. If the value for log threads waiting is more than the spindles available for the logs, there is a bottleneck on the log disks.

MSExchange Database(Information Store)\Version buckets allocated

Shows the total number of version buckets allocated.

Should be less than 12,000 at all times.

The default maximum number of version buckets is 16,384. If the allocated version buckets reach 70 percent of that maximum, the server is at risk of running out of version store.

MSExchange Database Instances(*)\I/O Database Reads Average Latency

Shows the average length of time, in milliseconds (ms), per database read operation.

Should be 20 ms or less on average, with spikes no higher than 50 ms.

MSExchange Database Instances(*)\I/O Database Writes Average Latency

Shows the average length of time, in milliseconds, per database write operation.

Should be 50 ms or less on average.

Spikes of up to 100 ms are acceptable if not accompanied by database page fault stalls.

MSExchange Database(Information Store)\Database Cache Size (MB)

Shows the amount of system memory, in megabytes, used by the database cache manager to hold commonly used information from the database files to prevent file operations.

Maximum value is RAM minus 2 GB (RAM minus 3 GB for servers with sync replication enabled). This and Database Cache Hit % are extremely useful counters for gauging whether a server's performance problems might be resolved by adding more physical memory.

Use this counter along with store private bytes to determine if there are store memory leaks. If the database cache size seems too small for optimal performance and there is little available memory on the system (check the value of Memory\Available Bytes), adding more memory to the system may increase performance. If there is ample memory on the system and the database cache size is not growing beyond a certain point, the database cache size may be capped at an artificially low limit. Increasing this limit may increase performance.

MSExchange Database(Information Store)\Database Cache % Hit

Shows the percentage of database file page requests that were fulfilled by the database cache without causing a file operation. If this percentage is too low, the database cache size may be too small.

Should be over 90% for companies with majority online mode clients. Should be over 99% for companies with majority cached mode clients.

If the hit ratio is less than these numbers, the database cache may be insufficient.

MSExchange Database\Log Bytes Write/sec

Shows the rate of bytes that are written to the log.

Should be less than 10,000,000 at all times.

With each log file being 1,000,000 bytes in size, 10,000,000 bytes/sec would yield 10 logs per second. This may indicate a large message being sent or a looping message.
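Two figures in the database table above rest on simple arithmetic: the point at which version buckets reach 70 percent of the default maximum, and the number of log files per second implied by a given log write rate. The sketch below works both out from the numbers stated in the table. The constant and function names are hypothetical, chosen only for this example.

```python
DEFAULT_MAX_VERSION_BUCKETS = 16_384  # default maximum stated in the table
LOG_FILE_BYTES = 1_000_000            # log file size used in the table's example


def version_bucket_risk_point(max_buckets=DEFAULT_MAX_VERSION_BUCKETS, percent=70):
    """Bucket count at which the given percentage of the maximum is reached."""
    # Integer arithmetic keeps the result exact.
    return max_buckets * percent // 100


def logs_per_second(log_bytes_per_sec, log_file_bytes=LOG_FILE_BYTES):
    """Number of log files generated per second at a given write rate."""
    return log_bytes_per_sec / log_file_bytes


if __name__ == "__main__":
    print(version_bucket_risk_point())   # 11468 buckets
    print(logs_per_second(10_000_000))   # 10.0 logs per second
```

With the defaults, the 70 percent point is 11,468 buckets, which sits just under the 12,000 threshold listed for Version buckets allocated.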

The following table shows client related search counters.

Counter Description Threshold Troubleshooting

MSExchangeIS Mailbox(*)\Slow Findrow Rate

Shows the rate at which the slower FindRow needs to be used in the mailbox store.

Should be no more than 10 for any specific mailbox store.

Higher values indicate applications are crawling or searching mailboxes, which is affecting server performance. These include desktop search engines, customer relationship management (CRM), or other third-party applications.

MSExchangeIS Mailbox(*)\Search Task Rate

Shows the number of search tasks created per second.

Should be less than 10 at all times.

MSExchangeIS\Slow QP Threads

Shows the number of query processor threads currently running queries that are not optimized.

Should be less than 10 at all times.

MSExchangeIS\Slow Search Threads

Shows the number of search threads currently running queries that are not optimized.

Should be less than 10 at all times.

The following table shows content indexing counters.

Counter Description Threshold Troubleshooting

Process(Microsoft.Exchange.Search.ExSearch)\% Processor time

Shows the amount of processor time that is currently being consumed by the Exchange Search service.

Should typically be less than 1% of overall CPU and should not be sustained above 5%. Should be less than 10% of what the store process consumes during steady state.

Process(msftefd*)\%Processor Time

Shows the amount of processor time that is being consumed to update content indexing within the store process.

Full crawls will increase overall processing time, but should never exceed overall store CPU capacity. Check throttling counters to determine if throttling is occurring due to server performance bottlenecks.

MSExchange Search Indices(*)\Recent Average Latency of RPCs Used to Obtain Content

Shows the average latency, in milliseconds, of the most recent RPCs to the Information Store service. These RPCs are used to get content for the filter daemon for the specified database.

Should coincide with the latencies that Outlook clients are experiencing.

MSExchange Search Indices(*)\Average Document Indexing Time

Shows the average, in milliseconds, of how long it takes to index documents.

Should be less than 30 seconds at all times.

MSExchange Search Indices(*)\Full Crawl Mode Status

This counter is used to determine if a full crawl is occurring for any specified database.

Indicates whether this .mdb file is going through a full crawl (value=1) or not (value=0).

If CPU resources are high, it is possible content indexing is occurring for a database or set of databases.
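The Exchange Search CPU guidance in the table above combines three conditions. The sketch below evaluates them together under stated assumptions: both sample lists are already normalized to a percentage of overall CPU, "typical" is read as the mean, and "sustained above 5%" is read as every sample in the window being above 5%. Those interpretations, and the function name, are assumptions rather than definitions from the table.

```python
from statistics import mean


def exsearch_cpu_findings(exsearch_samples, store_samples):
    """Return findings for Exchange Search CPU use relative to the guidance.

    Both arguments are lists of percentages of overall CPU capacity.
    """
    findings = []
    search_avg = mean(exsearch_samples)
    store_avg = mean(store_samples)
    # Typical usage should be less than 1% of overall CPU.
    if search_avg >= 1.0:
        findings.append("typical usage is not below 1% of overall CPU")
    # Usage should not be sustained above 5% (assumed: the whole window).
    if min(exsearch_samples) > 5.0:
        findings.append("usage stayed above 5% for the whole window")
    # Usage should be less than 10% of what the store process consumes.
    if search_avg >= 0.10 * store_avg:
        findings.append("usage is not below 10% of store CPU")
    return findings


if __name__ == "__main__":
    for finding in exsearch_cpu_findings([6.0, 7.0, 8.0], [30.0, 30.0, 30.0]):
        print(finding)
```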

The following table shows mailbox assistant counters.

Counter Description Threshold Troubleshooting

Process(MSExchangeMailboxAssistants)\%Processor Time

Shows the amount of processor time that is being consumed by mailbox assistants.

Should be less than 5% of overall CPU capacity.

MSExchange Assistants(*)\Events in queue

Shows the number of events in the in-memory queue waiting to be processed by the assistants.

Should be a low value at all times. High values may indicate a performance bottleneck.

MSExchange Assistants(*)\Average Event Processing Time in Seconds

Shows the average processing time of the events chosen.

Should be less than 2 seconds at all times.

The following table shows resource booking counters.

Counter Description Threshold Troubleshooting

MSExchange Resource Booking\Average ResourceBooking Processing Time

Shows the average time to process an event in the Resource Booking Attendant.

Should be a low value at all times. High values may indicate a performance bottleneck.

MSExchange Resource Booking\Requests Failed

Shows the total number of failures that occurred while the Resource Booking Attendant was processing events.

Should be 0 at all times.

The following table shows calendar attendant counters.

Counter Description Threshold Troubleshooting

MSExchange Calendar Attendant\Average Calendar Attendant Processing time

Shows the average time to process an event in the Calendar Attendant.

Should be a low value at all times. High values may indicate a performance bottleneck.

MSExchange Calendar Attendant\Requests Failed

Shows the total number of failures that occurred while the Calendar Attendant was processing events.

Should be 0 at all times.

The following table shows store client request counters.

Counter Description Threshold Troubleshooting

MSExchange Store Interface(_Total)\RPC Latency average (msec)

Shows the average latency, in milliseconds (ms), of RPC requests. The average is calculated over all RPCs since exrpc32 was loaded.

Should be less than 100 ms at all times.

MSExchange Store Interface(_Total)\RPC Requests outstanding

Shows the current number of outstanding RPC requests.

Should be 0 at all times.

MSExchange Store Interface(*)\RPC Requests failed (%)

Shows the percentage of failed requests in the total number of RPC requests. Here, failed means the sum of failed with error code plus failed with exception.

Should be 0 at all times.

MSExchange Store Interface(*)\RPC Slow Requests (%)

Shows the percentage of slow RPC requests among all RPC requests. A slow RPC request is one that has taken more than 500 ms.

Should be less than 1 at all times.

MSExchangeMailSubmission(*)\Hub Servers In Retry

Shows the number of Hub Transport servers in retry mode.

Should be 0 at all times.

MSExchangeMailSubmission(*)\Failed Submissions Per Second

Shows the number of failed submissions per second.

Should be 0 at all times.

MSExchangeMailSubmission(*)\Temporary Submission Failures/sec

Shows the number of temporary submission failures per second.

Should be 0 at all times.

MSExchange Replication(*)\CopyQueueLength

Shows the number of transaction log files waiting to be copied to the passive copy log file folder. A copy is not considered complete until it has been checked for corruption.

Should be less than 1 at all times for local continuous replication (LCR).

MSExchange Replication(*)\ReplayQueueLength

Shows the number of transaction log files waiting to be replayed into the passive copy.

Indicates the current replay queue length. Higher values cause longer store mount times when a handoff, failover, or activation is performed.
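The RPC Slow Requests (%) counter in the table above is a ratio: requests that took more than 500 ms, divided by all requests. The sketch below reproduces that calculation from a list of request latencies so the "less than 1" threshold is easier to reason about. The function names and sample latencies are hypothetical.

```python
SLOW_RPC_MS = 500  # a slow RPC request is one that has taken more than 500 ms


def slow_request_percent(latencies_ms, slow_ms=SLOW_RPC_MS):
    """Percentage of requests whose latency exceeded slow_ms."""
    if not latencies_ms:
        return 0.0
    slow = sum(1 for ms in latencies_ms if ms > slow_ms)
    return 100.0 * slow / len(latencies_ms)


def slow_requests_ok(latencies_ms, limit_percent=1.0):
    """True when the slow-request percentage is below the limit."""
    return slow_request_percent(latencies_ms) < limit_percent


if __name__ == "__main__":
    latencies = [10] * 198 + [600] * 2  # made-up data: 2 slow requests in 200
    print(slow_request_percent(latencies))
    print(slow_requests_ok(latencies))
```

In this made-up sample, two slow requests out of 200 already put the percentage at 1.0, which does not meet the threshold.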

The following table shows client activity counters.

Counter Description Threshold

MSExchangeIS\RPC Client Backoff/sec

Indicates the rate at which client backoffs are occurring. Higher values may indicate that the server may be incurring a higher load resulting in an increase in overall averaged RPC latencies, causing client throttling to occur. This can also occur when certain client user actions are being performed. Depending on what the client is doing and the rate at which RPC operations are occurring, it may be normal to see backoffs occurring.

Not applicable

MSExchangeIS\Client: RPCs Failed:Server Too Busy/sec

Shows the client-reported rate of failed RPCs (since the store was started) due to the Server Too Busy RPC error.

Should be 0 at all times. Higher values may indicate RPC threads are exhausted or client throttling is occurring for clients running versions of Outlook earlier than Outlook 2007.

MSExchangeIS\Client: RPCs Failed:Server Too Busy

Shows the client-reported number of failed RPCs (since the store was started) due to the Server Too Busy RPC error.

Should be 0 at all times.

The following table shows Information Store counters for determining user load.

Counter Description Threshold

MSExchangeIS Client(*)\RPC Operations/sec

Shows which client protocol is performing an excessive number of RPC operations per second. High IMAP4, POP3, or Outlook Anywhere latency can indicate problems with Client Access servers rather than Mailbox servers. This is especially true when Other Clients (which includes MAPI) latency is lower in comparison. In some instances, high IMAP latencies could indicate a bottleneck on the Mailbox server in addition to the latencies that the Client Access server is experiencing.

Not applicable

MSExchangeIS Client (*)\RPC Average Latency

Should be less than 50 ms on average. Wide disparities between different client types, such as IMAP4, Outlook Anywhere, or other MAPI clients can help direct troubleshooting to appropriate subcomponents.

MSExchangeIS Client(*)\JET Log Records/sec

Shows the rate at which database log records are generated while processing requests for the client. Used to determine current load.

Not applicable

MSExchangeIS Client(*)\JET Pages Read/sec

Shows the rate that database pages are read from disk while processing requests for the client. Used to determine current load.

Not applicable

MSExchangeIS Client(*)\Directory Access: LDAP Reads/sec

Shows the current rate at which LDAP reads occur while processing requests for the client. Used to determine the current LDAP read rate per protocol.

Not applicable

MSExchangeIS Client(*)\Directory Access: LDAP Searches/sec

Shows the current rate at which LDAP searches occur while processing requests for the client. Used to determine the current LDAP search rate per protocol.

Not applicable

MSExchangeIS Mailbox(_Total)\Messages Delivered/sec

Shows the rate that messages are delivered to all recipients. Indicates current message delivery rate to the store.

Not applicable

MSExchangeIS Mailbox(_Total)\Messages Sent/sec

Shows the rate that messages are sent to transport. Used to determine current messages sent to transport.

Not applicable

MSExchangeIS Mailbox(_Total)\Messages Submitted/sec

Shows the rate that messages are submitted by clients. Used to determine current rate that messages are being submitted by clients.

Not applicable

MSExchangeIS\User Count

Shows the number of users connected to the information store. Used to determine current user load.

Not applicable

MSExchangeIS Public(_Total)\Replication Receive Queue Size

Shows the number of replication messages waiting to be processed.

Should be less than 100 at all times. This value should return to a minimum value between replication intervals.

The following table shows mailbox assistant counters.

Counter Description Threshold

MSExchange Assistants(*)\Mailboxes Processed/sec

Shows the rate of mailboxes processed by time-based assistants per second. Determines current load statistics for this counter.

Not applicable

MSExchange Assistants(*)\Events Polled/sec

Shows the number of events polled per second. Determines current load statistics for this counter.

Not applicable

The following table shows store client request counters.

Counter Description Threshold

MSExchange Store Interface(*)\ROP Requests outstanding

Shows the total number of outstanding remote operations (ROP) requests. Used for determining current load.

MSExchange Store Interface(*)\RPC Requests Outstanding

Shows the total number of outstanding RPC requests. Used for determining current load.

MSExchange Store Interface(*)\RPC Requests Sent/sec

Shows the current rate of initiated RPC requests per second. Used for determining current load.

Not applicable

MSExchange Store Interface(*)\RPC Slow Requests latency average (msec)

Shows the average latency, in milliseconds, of slow requests. Used for determining the average latencies of RPC slow requests.

MSExchangeMailSubmission(*)\Successful Submissions Per Second

Determines current mail submission rate.

Not applicable

MSExchangeMailSubmission(*)\Failed Submissions Per Second

Should be 0 at all times.

MSExchangeMailSubmission(*)\Temporary Submission Failures/sec

Shows the number of temporary submission failures per second.

Should be 0 at all times.

MSExchange Replica Seeder(*)\Seeding Finished %

Shows the finished percentage of seeding. Its value is from 0 to 100 percent. Used to determine if seeding is occurring for a particular database, which is possibly affecting overall server performance or current network bandwidth.

Not applicable

MSExchangeIS\RPC Operations/sec

Indicates the current number of RPC operations that are occurring per second.

Should closely correspond to historical baselines. Values much higher than expected indicate that the workload has changed, while values much lower than expected indicate a bottleneck preventing client requests from reaching the server.
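The baseline comparison described for MSExchangeIS\RPC Operations/sec can be sketched as a ratio test. The table does not say how far from the baseline counts as "much higher" or "much lower", so the 50 percent tolerance below is purely an assumption for illustration, as is the function name.

```python
def compare_to_baseline(current_ops, baseline_ops, tolerance=0.5):
    """Classify current RPC Operations/sec against a historical baseline.

    tolerance is the allowed fractional deviation (an assumed value,
    not one taken from the counter guidance).
    """
    if baseline_ops <= 0:
        raise ValueError("baseline must be positive")
    ratio = current_ops / baseline_ops
    if ratio > 1 + tolerance:
        return "workload may have changed"
    if ratio < 1 - tolerance:
        return "possible bottleneck before the server"
    return "within baseline"


if __name__ == "__main__":
    print(compare_to_baseline(2000, 1000))
    print(compare_to_baseline(300, 1000))
    print(compare_to_baseline(1100, 1000))
```

A tolerance suited to a given deployment is best derived from the spread seen in its own historical data.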