Best Practices for Configuring BAM Data Maintenance and Cube Update SSIS Packages in BizTalk Solutions
Background
It is known that the world of ISV applications is different. A typical ISV application tends not to come bundled with a support team responsible for making sure that the application is healthy and operational. Ultimately, the burden is on the ISV developers to ensure that their applications are capable of sustaining in the many customer environments, including those where health monitoring is not something that the end customers can be good at.
The ISV applications which use the Business Activity Monitoring (BAM) capability provided by BizTalk Server need to appreciate the many additional components which play part in the BAM infrastructure. The additional databases, tables, views, stored procedures, SSIS packages, cubes and other BAM artifacts add extra complexity to the end-to-end architecture and become important from a monitoring perspective. Should they not be monitored, a simple case such as an overgrown database may result in hours of downtime.
This article will highlight some specific best practices as it relates to configuring the BAM data maintenance and cube update SSIS packages which are used by the BAM infrastructure. If not optimally configured, these packages (in conjunction with other factors) could affect the stability of the applications which consume the BAM data.
The intention of this article is two-fold. First, we will share the key lessons learned captured from a recent BizTalk ISV application deployment to help ISV developers avoid similar roadblocks. Secondly, we will provide suggestions as to how best to configure the BAM SSIS packages in the BizTalk ISV applications.
Scenario
Consider the following BAM usage scenario:
- An ISV application uses BAM to provide nearly real-time monitoring and analysis of business process activities managed by the application;
- As part of the ISV application installation, multiple BAM activity models are deployed and configured;
- The high throughput nature of the ISV application results in a large number of events and activities being logged in the BAM database infrastructure;
- The ISV application installer creates multiple SQL Server Agent jobs to perform regular maintenance of the BAM activity data;
- The SQL Agent jobs invoke the standard SSIS packages responsible for partitioning and archiving of the BAM data;
- The SSIS packages are scheduled to run at frequent intervals to cope with the large amount of BAM activity data being logged by the ISV application.
So far, everything sounds reasonably straightforward. Let’s now add some real-world challenges:
- The ISV application is set up to use a SQL Server instance deployed on a machine with limited storage space, most specifically, on the disk drive used by the system databases such as master and msdb (both configured for unrestricted growth by default);
- The SQL Server machine used by the ISV application is not part of the customer’s managed services policy and is not being monitored by the in-house support team;
- The ISV application is running well for some time until the SQL Server machine goes down due to disk space issue on the drive hosting the msdb database;
- The business process managed by the ISV application experiences a downtime while customer is contacting the application’s support team to resolve the underlying problem.
One may argue that should the SQL Server be proactively monitored, the issue in question will never surface. Arguably, given the nature of ISV applications and variety of customer environments into which they can be deployed, it would be fair to expect that the ISV applications should provide a degree of resilience to be able to sustain in different environments.
Let’s now examine the default configuration of the SSIS packages provided by the BAM infrastructure to be able to understand the root cause of the above issue and learn what will be required in order to build a defense mechanism.
Analysis of BAM SSIS Package Configuration
The SSIS packages for BAM are auto-generated by the bm.exe utility when the BAM observation model is deployed. These packages come in two flavors:
- Packages labeled with BAM_DM_<Activity Name> are responsible for the BAM data maintenance, essentially, partitioning and archiving of the BAM activity data;
- Packages named as BAM_AN_<Activity Name> are dedicated to updating the OLAP cubes containing the aggregated BAM data.
Virtually every activity in both types of packages will generate one or more events which will be written using the log provider specified in the SSIS package configuration. By default, the SSIS packages used by BAM are configured to utilize the SQL Server Log Provider which writes log entries to the sysssislog table in the msdb database.
The example below is an extract from a BAM_DM_* package showing how SSIS logging option is configured “out of the box”. In summary, the default LoggingMode is 1 (means “Enabled”), the default log provider is SQL Server and the log provider connection string points to the msdb database on the SQL Server instance also hosting BAMPrimaryImport.
With the SSIS logging enabled, the sysssislog table will be populated with hundreds of records every time the BAM SSIS package is executed. Whilst it’s indeed useful to capture real-time information about a package for auditing and troubleshooting purposes, the volume of events accumulated over the time can be substantial.
In the example below, the sysssislog table has grown over 28GB just under a month with more than 53 million records generated by 5 BAM-owned SSIS packages.
As the result of the increased storage requirements, the corresponding database files can consume all available disk space which in conjunction with lack of monitoring procedures in place may lead to server instability and undesired downtime.
Now that we learned about the default logging configuration in the BAM SSIS packages and how it may impact the application health, let’s apply these lessons learned and extract some specific best practices related to the SSIS package configuration.
Configuration Best Practices for BAM-owned SSIS Packages
Depending of the end requirements, there may be several different ways to ensure that SSIS packages are optimally configured.
Option 1: Disable SSIS Package-Level Logging
In the event when it’s acceptable to have the SSIS package logging switched off, you have the option to turn the logging off on a package-level. Please note that even though no events will be written to the chosen log provider, a failure during SSIS package execution will still be registered in the SQL Server Agent Job History as a failed job step.
To turn the SSIS logging off per package, open the SQL Agent Job step which executes the package, navigate to the Set Values tab and configure the LoggingMode property with the value of 2 (means “Disabled”) as per example below:
Option 2: Implement Circular Logging in BAM SSIS Packages
Sometimes, turning the SSIS logging off may not be desirable. Perhaps, there is a requirement to capture a full audit trail and use this data for troubleshooting or reporting purposes. In most cases, there is no need to retain the historical data for longer than it’s really necessary unless some specific regulatory requirements enforce an extensively long retention period. Therefore, you can configure the BAM SSIS packages to log events in a circular mode using one of the following approaches.
Consider using the Windows Event Log Provider which writes entries to the Application log in the Windows Event log on the local computer.
You can then configure the Application event log to overwrite older events as needed.
It is recommended to script the event log configuration settings as opposite to relying on the server administrator to implement these. Perhaps, these settings can be automatically applied from within the ISV application installation wizard. The best way would be to use PowerShell and the following code snippet:
#---------------------------------------------------------------------------
# THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY
# KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
# PARTICULAR PURPOSE.
#
# SetAppEventLogMaxSize.ps1
#
# This PS script can be used to configure the Application event log and
# set the maximum size. The script can be executed from inside the ISV
# application installer to automatically configure the event log settings.
#---------------------------------------------------------------------------
# Check the arguments first, at least 1 parameter is expected.
if($args.length -eq 0)
{
write-error "Usage: SetAppEventLogMaxSize.ps1 ServerName LogMaxSizeKB"
exit
}
$server = $args[0]
# Check if the LogMaxSizeKB was provided, use default of 64MB if not supplied.
if($args.length -eq 2)
{
$logMaxSizeKB = $args[1]
}
else
{
$logMaxSizeKB = 64 * 1024
}
# Get the collection of event logs from a given server
$eventLogs = get-wmiobject -class "Win32_NTEventlogFile" -namespace "root\CIMv2" -computername $server
foreach($log in $eventLogs)
{
# Allow change the Application event log properties only
if($log.LogfileName -eq "Application")
{
Write-Host -foregroundcolor White “Configuring the Application event log max size of" $logMaxSizeKB "KB on (” $server “)...”
# Maximum size (in bytes) permitted for the file that contains log events.
# See https://msdn.microsoft.com/en-us/library/aa394225(VS.85).aspx for details.
$log.MaxFileSize = $logMaxSizeKB * 1024
# Number of days after which an event can be overwritten.
# See https://msdn.microsoft.com/en-us/library/aa394225(VS.85).aspx for details.
$log.OverwriteOutDated = 0
# Persist changes.
$null = $log.Put("&h20000")
}
}
This approach is simple to implement, however it doesn’t provide the granular control over the log data retention period.
Should there be a requirement for keeping the logged events for a specific period of time (for instance, 30 days max), the following recommendations would apply:
- Leave the original BAM SSIS package configuration unchanged so that the events will be written into the sysssislog table;
- Add a new step in the corresponding SQL Agent Job responsible for executing the SSIS packages to have the older records in the sysssislog table cleaned up.
- Place the clean step to the very end of the job and make sure that each previous step is set up with “Quit the job reporting failure” action in the event of a failure.
The last step is important as it helps ensure that the SSIS log will not be truncated in the event of a failure in any of the SSIS packages being invoked by a SQL Agent job.
Again, it’s recommended to have the SQL jobs configuration scripted and deployed as part of the ISV application installation.
Summary
The behavior of an ISV application is indeed dependent on the many elements and components which play part in the end-to-end architecture. The understanding of how each component can affect the others is powerful knowledge which helps build flawless applications.
We learned that just a simple oversight such as using the default logging settings in the BAM SSIS packages may result in a significant volume of SSIS event log data, rapidly growing database files, undesirably large disk space consumption and potentially a server downtime.
By analyzing these lessons learned we were able to extract some interesting best practices which are easy to apply and implement in the real world. These best practices can help many BizTalk developers and administrators improve the reliability of their applications.
It is known that the knowledge is power. A typical ISV application developer will now have at least one headache less.