MimicWrapper Class
A wrapper explainer which reduces the number of function calls necessary to use the explain model package.
Initialize the MimicWrapper.
Inheritance
azureml._logging.chained_identity.ChainedIdentity
MimicWrapper
Constructor
MimicWrapper(workspace, model, explainable_model, explainer_kwargs=None, init_dataset=None, run=None, features=None, classes=None, model_task=ModelTask.Unknown, explain_subset=None, transformations=None, feature_maps=None, allow_all_transformations=None)
Parameters
Name | Description |
---|---|
workspace
Required
|
The workspace object where the Models and Datasets are defined. |
model
Required
|
str or model that implements sklearn.predict() or sklearn.predict_proba(), or pipeline function that accepts a 2d ndarray
The model ID of a model registered to MMS or a regular machine learning model or pipeline to explain. If a model is specified, it must implement sklearn.predict() or sklearn.predict_proba(). If a pipeline is specified, it must include a function that accepts a 2d ndarray. |
explainable_model
Required
|
The uninitialized surrogate model used to explain the black box model. Also known as the student model. |
explainer_kwargs
|
Any keyword arguments that go with the chosen explainer not otherwise covered here. They will be passed in as kwargs when the underlying explainer is initialized. Default value: None
|
init_dataset
|
The dataset ID or regular dataset used for initializing the explainer (e.g., x_train). Default value: None
|
run
|
The run this explanation should be associated with. Default value: None
|
features
|
A list of feature names. Default value: None
|
classes
|
Class names as a list of strings. The order of the class names should match that of the model output. Only required when explaining a classifier. Default value: None
|
model_task
|
Optional parameter to specify whether the model is a classification or regression model. In most cases, the type of the model can be inferred based on the shape of the output, where a classifier has a predict_proba method and outputs a 2 dimensional array, while a regressor has a predict method and outputs a 1 dimensional array. Default value: ModelTask.Unknown
|
explain_subset
|
A list of feature indices. If specified, Azure selects only a subset of the features in the evaluation dataset for explanation, which speeds up the explanation process when the number of features is large and you already know the set of interesting features. The subset can be the top-k features from the model summary. This parameter is not supported when transformations are set. Default value: None
|
transformations
|
A sklearn.compose.ColumnTransformer or a list of tuples describing the column name and transformer. When transformations are provided, explanations are of the features before the transformation. The format for a list of transformations is the same as the one described here: https://github.com/scikit-learn-contrib/sklearn-pandas. If you are using a transformation that is not in the list of sklearn.preprocessing transformations supported by the interpret-community package, then this parameter cannot take a list of more than one column as input for that transformation. You can use the following sklearn.preprocessing transformations with a list of columns, since these are already one to many or one to one: Binarizer, KBinsDiscretizer, KernelCenterer, LabelEncoder, MaxAbsScaler, MinMaxScaler, Normalizer, OneHotEncoder, OrdinalEncoder, PowerTransformer, QuantileTransformer, RobustScaler, StandardScaler.
A transformation that cannot be interpreted as one to many raises an error. For example, applying a custom transformer such as my_own_transformer to a sequence of columns would not work, since the interpret-community package can't determine whether it gives a many to many or one to many mapping.
Only one of 'transformations' or 'feature_maps' should be specified to generate raw explanations. Specifying both results in a configuration exception. Default value: None
|
feature_maps
|
A list of feature maps from raw to generated features. This parameter can be a list of numpy arrays or sparse matrices where each array entry (raw_index, generated_index) is the weight for that raw, generated feature pair. The other entries are set to zero. For a sequence of transformations [t1, t2, ..., tn] generating generated features from raw features, the list of feature maps corresponds to the raw to generated maps in the same order as t1, t2, etc. If the overall raw to generated feature map from t1 to tn is available, then just that feature map can be passed in a single-element list. Only one of 'transformations' or 'feature_maps' should be specified to generate raw explanations. Specifying both results in a configuration exception. Default value: None
|
allow_all_transformations
|
Whether to allow many to many and many to one transformations. Default value: None
|
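The two ways of describing how raw features map to engineered features, transformations and feature_maps, can be sketched as follows. This is a minimal illustration rather than the package's own example: the column names ("color", "age") and the generated feature names are hypothetical, and the feature map assumes a one-hot encoding with three categories.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical raw columns: one categorical ("color") and one numeric ("age").
raw_features = ["color", "age"]

# Option 1: `transformations` as a list of (columns, transformer) tuples in the
# sklearn-pandas format. Both transformers appear in the supported list above.
transformations = [
    (["color"], OneHotEncoder()),   # one to many
    (["age"], StandardScaler()),    # one to one
]

# A custom transformer over several columns is the case the text rules out:
#   [(["color", "age"], my_own_transformer)]

# Option 2: `feature_maps` as a list holding one raw-to-generated map.
# Assume "color" was one-hot encoded into three columns and "age" kept as one.
generated_features = ["color_red", "color_green", "color_blue", "age"]
feature_map = np.zeros((len(raw_features), len(generated_features)))
feature_map[0, 0:3] = 1.0  # raw "color" -> three generated columns
feature_map[1, 3] = 1.0    # raw "age"   -> one generated column
feature_maps = [feature_map]

# Pass either transformations=... or feature_maps=... to MimicWrapper, not both.
```

Each entry (raw_index, generated_index) holds the weight linking a raw feature to a generated one, so every generated column in this sketch has exactly one non-zero entry.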
Remarks
The MimicWrapper can be used for explaining machine learning models, and is particularly effective in conjunction with AutoML. For example, using the automl_setup_model_explanations function in the azureml.train.automl.runtime.automl_explain_utilities module, you can use the MimicWrapper to compute and visualize feature importance. For more information, see Interpretability: model explanations in automated machine learning.
In the following example, the MimicWrapper is used in a classification problem.
from azureml.interpret.mimic_wrapper import MimicWrapper

# ws is the workspace, automl_explainer_setup_obj is the object returned by
# automl_setup_model_explanations, and automl_run is the run the explanation
# is associated with.
explainer = MimicWrapper(ws, automl_explainer_setup_obj.automl_estimator,
                         explainable_model=automl_explainer_setup_obj.surrogate_model,
                         init_dataset=automl_explainer_setup_obj.X_transform, run=automl_run,
                         features=automl_explainer_setup_obj.engineered_feature_names,
                         feature_maps=[automl_explainer_setup_obj.feature_map],
                         classes=automl_explainer_setup_obj.classes,
                         explainer_kwargs=automl_explainer_setup_obj.surrogate_model_params)
For more information about this example, see this notebook.
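Once the wrapper is constructed, explanations are requested through its explain method. The following is a minimal sketch, not a definitive recipe: the helper name explain_engineered_and_raw is hypothetical, and it assumes the setup object exposes X_test_transform, X_test_raw, and raw_feature_names attributes, which may differ in your SDK version.

```python
def explain_engineered_and_raw(explainer, setup_obj):
    """Request engineered-feature and raw-feature explanations.

    `explainer` is expected to be a MimicWrapper; `setup_obj` is assumed to
    carry X_test_transform, X_test_raw and raw_feature_names attributes.
    """
    # Explanation over the engineered (transformed) features.
    # Note: upload defaults to True, so the result is stored in Run History.
    engineered = explainer.explain(
        ["local", "global"],
        eval_dataset=setup_obj.X_test_transform,
    )

    # Explanation mapped back to the raw features. This relies on the
    # feature_maps that were passed to the MimicWrapper constructor.
    raw = explainer.explain(
        ["local", "global"],
        get_raw=True,
        raw_feature_names=setup_obj.raw_feature_names,
        eval_dataset=setup_obj.X_test_transform,
        raw_eval_dataset=setup_obj.X_test_raw,
    )
    return engineered, raw
```

With the explainer from the example above, this could be called as explain_engineered_and_raw(explainer, automl_explainer_setup_obj).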
Methods
explain |
Explain a model's behavior and optionally upload that explanation for storage and visualization. |
explain
Explain a model's behavior and optionally upload that explanation for storage and visualization.
explain(explanation_types, eval_dataset=None, top_k=None, upload=True, upload_datasets=False, tag='', get_raw=False, raw_feature_names=None, experiment_name='explain_model', raw_eval_dataset=None, true_ys=None)
Parameters
Name | Description |
---|---|
explanation_types
Required
|
A list of strings representing types of explanations desired. Currently, 'global' and 'local' are supported. Both may be passed in at once; only one explanation will be returned. |
eval_dataset
|
The dataset ID or regular dataset used to generate the explanation. Default value: None
|
top_k
|
Limits the amount of data returned and stored in Run History to the top k features, when possible. Default value: None
|
upload
|
If True, the explanation is automatically uploaded to Run History for storage and visualization. If a run was not passed in at initialization, one is created. Default value: True
|
upload_datasets
|
If set to True and no dataset IDs are passed in, the evaluation dataset will be uploaded to Azure storage. This will improve the visualization available in the web view. Default value: False
|
tag
|
A string to attach to the explanation to distinguish it from others after upload. Default value: '' (empty string)
|
get_raw
|
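If True and raw-feature information was passed in at initialization (see 'transformations' and 'feature_maps'), the explanation is returned for the raw features. Default value: False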
|
raw_feature_names
|
The list of raw feature names, replacing engineered feature names from the constructor. Default value: None
|
experiment_name
|
The experiment name to use for the explanation if upload is True and no run was passed in at initialization. Default value: explain_model
|
raw_eval_dataset
|
Raw eval data to be uploaded for raw explanations. Default value: None
|
true_ys
|
The true labels for the evaluation examples. Default value: None
|
Returns
Type | Description |
---|---|
An explanation object. |
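To make the effect of top_k concrete, the sketch below shows what limiting a result to the top k features means. It illustrates the idea only and is not the wrapper's internal implementation: the helper top_k_features and the importance values are hypothetical, and ranking by absolute magnitude is an assumption.

```python
def top_k_features(importances, k=None):
    """Return (name, value) pairs sorted by descending magnitude, cut to k."""
    ranked = sorted(importances.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked if k is None else ranked[:k]


# Hypothetical global importances keyed by feature name.
importances = {"age": 0.42, "income": 0.13, "color_red": 0.05, "color_blue": 0.31}

print(top_k_features(importances, k=2))
# [('age', 0.42), ('color_blue', 0.31)]
```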
Attributes
explainer
Get the explainer that is being used internally by the wrapper.
Returns
Type | Description |
---|---|
The explainer that is being used internally by the wrapper. |