次の方法で共有


Troubleshooting SQL Server Management Studio with SQL Profiler and Debugging Tools – Part 1

This is the first in a series of 3 posts which discuss troubleshooting SQL Server Management Studio when it encounters errors or unexpected behaviour.

In this post we look at using profiler, in the 2nd and 3rd we debug SSMS using public Microsoft debugging tools.

Here are the symptoms of the problem we are looking at:

1. Unable to view Job Activity in the SQL Server Agent Job Activity Monitor

2. Progress bar shows “Cannot display activity data”

3. Click on more details throws a message box with the error

Failed to retrieve data for this request. (Microsoft.SqlServer.Management.Sdk.Sfc)

with the additional details

Hour, Minute, and Second parameters describe an un-representable DateTime. (mscorlib)

It actually looks like this

image

4. The error repros with all versions of SSMS (2005 and 2008)

5. The error repros for all users whatever workstation they connect from.

These last 2 points are particularly relevant. Whenever you are looking at an SSMS problem, you always need to be asking these 2 questions. Straight at the off, you have potentially then eliminated whole areas of research (such as corruptions of .NET framework on specific workstations, incompatibilities in processor architecture specific versions of the framework, and localising whether the problem is with the SSMS client or the server instance being managed, or a combination of both)

The initial approach – using SQL Profiler and some common sense

When I initially came to this problem, the customer was experiencing the errors above. Normally when I get any errors or problems in SSMS, rather than reaching for debugger, I reach for SQL Profiler first (having asked the above 2 questions about where and when the issue repros), and this is a good rule of thumb. At the end of the day SSMS is just a GUI which issues TSQL commands to achieve functions in SQL Server. Whenever I get unexpected results in SSMS, the first thing I therefore do is to run a profiler trace against the server which SSMS is connected to. This means that you can see the TSQL which is issued, and you can sometimes find the problem straight off, if for example it issues some piece of bad code which obviously causes your error.

In this case if you run the SQL profiler GUI tool (you can simply point it at the appropriate server instance and use the default options), you’ll note that the SSMS issues the following SQL batches.

 SET LOCK_TIMEOUT 10000
 select SERVERPROPERTY(N'servername')
 declare @RegPathParams sysname
                 declare @Arg sysname
                 declare @Param sysname
                 declare @MasterPath nvarchar(512)
  
                 declare @LogPath nvarchar(512)
                 declare @ErrorLogPath nvarchar(512)
                 declare @n int
  
                 select @n=0
                 select @RegPathParams=N'Software\Microsoft\MSSQLServer\MSSQLServer'+'\Parameters'
                 select @Param='dummy'
                 while(not @Param is null)
                 begin
                     select @Param=null
                     select @Arg='SqlArg'+convert(nvarchar,@n)
  
                     exec master.dbo.xp_instance_regread N'HKEY_LOCAL_MACHINE', @RegPathParams, @Arg, @Param OUTPUT
                     if(@Param like '-d%')
                     begin
                         select @Param=substring(@Param, 3, 255)
                         select @MasterPath=substring(@Param, 1, len(@Param) - charindex('\', reverse(@Param)))
                     end
                     else if(@Param like '-l%')
                     begin
                         select @Param=substring(@Param, 3, 255)
                         select @LogPath=substring(@Param, 1, len(@Param) - charindex('\', reverse(@Param)))
                     end
                     else if(@Param like '-e%')
                     begin
                         select @Param=substring(@Param, 3, 255)
                         select @ErrorLogPath=substring(@Param, 1, len(@Param) - charindex('\', reverse(@Param)))
                     end
                     
                     select @n=@n+1
                 end
             
  
  
                 declare @SmoRoot nvarchar(512)
                 exec master.dbo.xp_instance_regread N'HKEY_LOCAL_MACHINE', N'SOFTWARE\Microsoft\MSSQLServer\Setup', N'SQLPath', @SmoRoot OUTPUT
             
  
  
 SELECT
 CAST(case when 'a' <> 'A' then 1 else 0 end AS bit) AS [IsCaseSensitive],
 CAST(FULLTEXTSERVICEPROPERTY('IsFullTextInstalled') AS bit) AS [IsFullTextInstalled],
 @LogPath AS [MasterDBLogPath],
 @MasterPath AS [MasterDBPath],
 @ErrorLogPath AS [ErrorLogPath],
 @@MAX_PRECISION AS [MaxPrecision],
 @SmoRoot AS [RootDirectory],
 SERVERPROPERTY(N'ProductVersion') AS [VersionString],
 CAST(SERVERPROPERTY(N'MachineName') AS sysname) AS [NetName],
 CAST(SERVERPROPERTY(N'Edition') AS sysname) AS [Edition],
 CAST(SERVERPROPERTY(N'ProductLevel') AS sysname) AS [ProductLevel],
 CAST(SERVERPROPERTY('IsClustered') AS bit) AS [IsClustered],
 CAST(SERVERPROPERTY('IsSingleUser') AS bit) AS [IsSingleUser],
 CAST(SERVERPROPERTY('EngineEdition') AS int) AS [EngineEdition],
 SERVERPROPERTY(N'ResourceVersion') AS [ResourceVersionString],
 SERVERPROPERTY(N'ResourceLastUpdateDateTime') AS [ResourceLastUpdateDateTime],
 SERVERPROPERTY(N'BuildClrVersion') AS [BuildClrVersionString],
 SERVERPROPERTY(N'ComputerNamePhysicalNetBIOS') AS [ComputerNamePhysicalNetBIOS],
 SERVERPROPERTY(N'CollationID') AS [CollationID],
 SERVERPROPERTY(N'ComparisonStyle') AS [ComparisonStyle],
 SERVERPROPERTY(N'SqlCharSet') AS [SqlCharSet],
 SERVERPROPERTY(N'SqlCharSetName') AS [SqlCharSetName],
 SERVERPROPERTY(N'SqlSortOrder') AS [SqlSortOrder],
 SERVERPROPERTY(N'SqlSortOrderName') AS [SqlSortOrderName],
 convert(sysname, serverproperty(N'collation')) AS [Collation]
 exec sp_executesql N'SELECT
 dtb.collation_name AS [Collation],
 dtb.name AS [DatabaseName2]
 FROM
 master.sys.databases AS dtb
 WHERE
 (dtb.name=@_msparam_0)',N'@_msparam_0 nvarchar(4000)',@_msparam_0=N'msdb'    Microsoft SQL Server Management Studio    grahamk    EUROPE\grahamk    0    2    0    0    2912    52    2009-10-19 14:28:13.143    2009-10-19 14:28:13.143    0X00000000040000001A00730070005F006500780065006300750074006500730071006C002E01000082001A00E7206E0076006100720063006800610072002800310033003100290006010000530045004C004500430054000A006400740062002E006300    
 create table #tmp_sp_help_category
 (category_id int null, category_type tinyint null, name nvarchar(128) null)
 insert into #tmp_sp_help_category exec msdb.dbo.sp_help_category
 SELECT
 sv.name AS [Name],
 CAST(sv.enabled AS bit) AS [IsEnabled],
 tshc.name AS [Category],
 sv.category_id AS [CategoryID],
 tshc.category_type AS [CategoryType],
 null AS [CurrentRunStatus],
 null AS [CurrentRunStep],
 null AS [HasSchedule],
 null AS [HasStep],
 null AS [HasServer],
 null AS [LastRunDate],
 null AS [NextRunDate],
 null AS [LastRunOutcome],
 CAST(sv.job_id AS nvarchar(100)) AS [job_id]
 FROM
 msdb.dbo.sysjobs_view AS sv
 INNER JOIN #tmp_sp_help_category AS tshc ON sv.category_id = tshc.category_id
  
 drop table #tmp_sp_help_category
         

and finally:

 exec msdb.dbo.sp_help_job

 

These batches don’t appear to return any errors or warnings to profiler. The logical next step was to take a look at what this code does. Once you read through the various batches, you notice 2 types of batch. The first 5 batches are just required to get certain properties about the server instance and to populate properties of the tree view in SSMS Object Explorer. Whenever you profile SSMS you see lots of batches like this, which tend to precede the “real” work.

The you see the final single batch, which executes msdb.dbo.sp_help_job. This is the real work that is required to populate the dialogue box that SSMS is opening. Since these batches are all just queries, and nothing is being updated, we can easily re-run them in SSMS query windows to see if anything else is thrown up, or see if weird results are returned that might explain our strange date time error.

We tried this and it returned a wide record set of several hundred rows, which on initial glance was all well and good. At this stage, the pragmatists amongst you may feel it’s appropriate to just scan through this data and see if there are any glaring inconsistencies which can be noticed which might cause a DateTime related error. The truth is that there was such a piece of data, if you could be bothered to read all the data manually, but it was only a hypothesis. The stored procedure in question returns some columns that are related to dates and times in a true DateTime representation. However it also returns some columns as text representations of date parts. In this record set, these columns should always be populated with values, but in this case there were a number of rows which showed the value “0” in these columns. We therefore made the guess that this data was the problem. We removed this data and tried the repro again, and everything worked ok.

We were lucky in this situation that the record set to review was only a few hundreds rows, and that the data corruption was obvious. On the basis that neither of these situations is always true, part 2 will show how to continue to troubleshoot this with debugging tools, to categorically prove what the problem was.