

Tree Utilities in Analysis Services Stored Procedures

This past week I was helping out a customer who wanted to reduce the length of a questionnaire by using data mining to determine which of the 300+ questions were actually necessary to get the understanding they required.  By using a tree model and adjusting the COMPLEXITY_PENALTY parameter, I was able to build a model that was reasonably accurate and required only 10-17 questions.  (I was able to do the whole project using the Data Mining Add-ins as well!)

In the process, I created some stored procedures that helped me easily extract the information I needed from the tree: which paths are the shortest or the longest, which contain the minimum or maximum values, and so on.

Attached is the source code for those utilities.  You can copy and paste it into a Visual Studio project, add a reference to ADOMDServer.NET, build it, and add the result as an assembly to your Analysis Services server.

The utilities give you the following functions:

For all trees:

List out the shortest paths from root to leaf
CALL TreeUtils.ShortestPaths("Model Name", "Tree Name")

List out the longest paths from root to leaf

CALL TreeUtils.LongestPaths("Model Name", "Tree Name")

Provide information about the shortest paths (e.g. value, probability, etc)

CALL TreeUtils.ShortestPathStatistics("Model Name", "Tree Name")

Provide information about the longest paths (e.g. value, probability, etc)
CALL TreeUtils.LongestPathStatistics("Model Name", "Tree Name")
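
To illustrate what "shortest" and "longest" paths mean here, the following is a minimal sketch in Python. It is not the code from TreeUtils.cs: the `Node` class, the function names, and the toy question names are all invented for illustration, and a path's length is simply taken to be the number of nodes from the root to a leaf.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node: a caption (split condition) plus child nodes."""
    caption: str
    children: list = field(default_factory=list)


def leaf_paths(node, prefix=()):
    """Yield every root-to-leaf path as a tuple of node captions."""
    path = prefix + (node.caption,)
    if not node.children:
        yield path
        return
    for child in node.children:
        yield from leaf_paths(child, path)


def shortest_paths(root):
    """Return all root-to-leaf paths that have the minimal length."""
    paths = list(leaf_paths(root))
    best = min(len(p) for p in paths)
    return [p for p in paths if len(p) == best]


def longest_paths(root):
    """Return all root-to-leaf paths that have the maximal length."""
    paths = list(leaf_paths(root))
    best = max(len(p) for p in paths)
    return [p for p in paths if len(p) == best]


# Toy tree with invented question names (an assumption, not real model content).
tree = Node("All", [
    Node("Q1 = Yes"),
    Node("Q1 = No", [
        Node("Q7 < 3"),
        Node("Q7 >= 3", [Node("Q2 = A"), Node("Q2 = B")]),
    ]),
])

for path in shortest_paths(tree):
    print(" -> ".join(path))
```

In the questionnaire scenario, a short path corresponds to a respondent who can be classified after very few questions, while the longest paths bound the number of questions anyone would need to answer.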

For Regression Trees: 

Return the path to the leaf node containing the minimum value
CALL TreeUtils.MinimumPath("Model Name", "Tree Name")

Return information about the path containing the minimum value (e.g. depth, value, etc)

CALL TreeUtils.MinimumPathStatistics("Model Name", "Tree Name")

Return the path to the leaf node containing the maximum value
CALL TreeUtils.MaximumPath("Model Name", "Tree Name")

Return information about the path containing the maximum value (e.g. depth, value, etc)

CALL TreeUtils.MaximumPathStatistics("Model Name", "Tree Name")
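
For a regression tree, the idea can be sketched as follows in Python. This is an illustration under assumptions, not the attached implementation: the dictionary layout (`caption`, `children`, `value`), the function names, and the numbers are all hypothetical, and each leaf is assumed to carry a single predicted value. When several leaves tie, this sketch returns the first one encountered.

```python
def leaves(node, prefix=()):
    """Yield (path, value) for every leaf at or below node."""
    path = prefix + (node["caption"],)
    children = node.get("children", [])
    if not children:
        yield path, node["value"]
    for child in children:
        yield from leaves(child, path)


def minimum_path(root):
    """Path to the leaf with the smallest predicted value."""
    return min(leaves(root), key=lambda leaf: leaf[1])[0]


def maximum_path(root):
    """Path to the leaf with the largest predicted value."""
    return max(leaves(root), key=lambda leaf: leaf[1])[0]


def path_statistics(root, pick):
    """Depth and value for the leaf chosen by pick (the builtin min or max)."""
    path, value = pick(leaves(root), key=lambda leaf: leaf[1])
    return {"path": path, "depth": len(path) - 1, "value": value}


# Toy regression tree with invented splits and values.
tree = {"caption": "All", "children": [
    {"caption": "Age < 30", "value": 12.5},
    {"caption": "Age >= 30", "children": [
        {"caption": "Income < 50000", "value": 8.0},
        {"caption": "Income >= 50000", "value": 21.0},
    ]},
]}

print(minimum_path(tree))
print(path_statistics(tree, max))
```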

For Classification Trees:

Return the path leading to the leaf with the lowest likelihood of the specified state
CALL TreeUtils.MinimumPath("Model Name", "Tree Name", "State")

Return the path leading to the leaf with the highest likelihood of the specified state
CALL TreeUtils.MaximumPath("Model Name", "Tree Name", "State")

Return information about the path with the lowest likelihood of the state (e.g. depth, probability, etc)
CALL TreeUtils.MinimumPathStatistics("Model Name", "Tree Name", "State")

Return information about the path with the highest likelihood of the state (e.g. depth, probability, etc)
CALL TreeUtils.MaximumPathStatistics("Model Name", "Tree Name", "State")
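
The classification case differs only in what is compared at each leaf: the probability of one chosen state rather than a single value. A minimal Python sketch, again with an invented data layout (`caption`, `children`, `distribution`), invented function names, and made-up probabilities, might look like this. Treating a state that is missing from a leaf as probability 0.0 is an assumption of the sketch, not something the original utilities are known to do.

```python
def leaves(node, prefix=()):
    """Yield (path, distribution) for every leaf at or below node."""
    path = prefix + (node["caption"],)
    children = node.get("children", [])
    if not children:
        yield path, node["distribution"]
    for child in children:
        yield from leaves(child, path)


def extreme_path(root, state, pick):
    """Return (path, probability) for the leaf chosen by pick (min or max).

    A state absent from a leaf's distribution counts as probability 0.0.
    """
    path, dist = pick(leaves(root), key=lambda leaf: leaf[1].get(state, 0.0))
    return path, dist.get(state, 0.0)


def minimum_path(root, state):
    return extreme_path(root, state, min)


def maximum_path(root, state):
    return extreme_path(root, state, max)


# Toy classification tree with invented questions and probabilities.
tree = {"caption": "All", "children": [
    {"caption": "Q1 = Yes",
     "distribution": {"Satisfied": 0.9, "Unsatisfied": 0.1}},
    {"caption": "Q1 = No", "children": [
        {"caption": "Q7 < 3",
         "distribution": {"Satisfied": 0.2, "Unsatisfied": 0.8}},
        {"caption": "Q7 >= 3",
         "distribution": {"Satisfied": 0.6, "Unsatisfied": 0.4}},
    ]},
]}

print(maximum_path(tree, "Satisfied"))
print(minimum_path(tree, "Satisfied"))
```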


TreeUtils.cs
