MimicWrapper Class

A wrapper explainer which reduces the number of function calls necessary to use the explain model package.

Initialize the MimicWrapper.

Inheritance
azureml._logging.chained_identity.ChainedIdentity
MimicWrapper

Constructor

MimicWrapper(workspace, model, explainable_model, explainer_kwargs=None, init_dataset=None, run=None, features=None, classes=None, model_task=ModelTask.Unknown, explain_subset=None, transformations=None, feature_maps=None, allow_all_transformations=None)

Parameters

Name Description
workspace
Required

The workspace object where the Models and Datasets are defined.

model
Required
str or model that implements sklearn.predict() or sklearn.predict_proba() or pipeline function that accepts a 2d ndarray

The model ID of a model registered to MMS or a regular machine learning model or pipeline to explain. If a model is specified, it must implement sklearn.predict() or sklearn.predict_proba(). If a pipeline is specified, it must include a function that accepts a 2d ndarray.

explainable_model
Required

The uninitialized surrogate model used to explain the black box model. Also known as the student model.

explainer_kwargs

Any keyword arguments that go with the chosen explainer not otherwise covered here. They will be passed in as kwargs when the underlying explainer is initialized.

Default value: None
init_dataset

The dataset ID or regular dataset used for initializing the explainer (e.g., x_train).

Default value: None
run
Run

The run this explanation should be associated with.

Default value: None
features

A list of feature names.

Default value: None
classes

Class names as a list of strings. The order of the class names should match that of the model output. Only required when explaining a classifier.

Default value: None
model_task
str

Optional parameter to specify whether the model is a classification or regression model. In most cases, the type of the model can be inferred based on the shape of the output, where a classifier has a predict_proba method and outputs a 2 dimensional array, while a regressor has a predict method and outputs a 1 dimensional array.
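
The shape-based inference described above can be illustrated with a minimal sketch. The helper infer_model_task and the toy model classes are hypothetical and are not part of the azureml-interpret API; the package's actual inference logic may differ.

```python
def infer_model_task(model, sample_rows):
    """Guess the task from the model's interface and output shape (illustrative only)."""
    if hasattr(model, "predict_proba"):
        output = model.predict_proba(sample_rows)
        # A classifier returns one row of class probabilities per sample (2-D).
        if len(output) > 0 and isinstance(output[0], (list, tuple)):
            return "classification"
    output = model.predict(sample_rows)
    # A regressor returns one scalar per sample (1-D).
    if len(output) > 0 and not isinstance(output[0], (list, tuple)):
        return "regression"
    return "unknown"


class ToyClassifier:  # hypothetical stand-in for a real classifier
    def predict(self, rows):
        return [0 for _ in rows]

    def predict_proba(self, rows):
        return [[0.7, 0.3] for _ in rows]


class ToyRegressor:  # hypothetical stand-in for a real regressor
    def predict(self, rows):
        return [1.5 for _ in rows]


rows = [[0.0, 1.0], [2.0, 3.0]]
print(infer_model_task(ToyClassifier(), rows))
print(infer_model_task(ToyRegressor(), rows))
```

When a model does not fit either pattern, setting model_task explicitly avoids relying on this kind of inference.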

Default value: ModelTask.Unknown
explain_subset

A list of feature indices. If specified, Azure selects only a subset of the features in the evaluation dataset for explanation, which speeds up the explanation process when the number of features is large and you already know which features are of interest. The subset can be the top-k features from the model summary. This parameter is not supported when transformations are set.
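
As a rough illustration of building such a subset, the sketch below picks the indices of the k most important features from a list of importance values. The helper top_k_feature_indices and the sample values are hypothetical; in practice the importances would come from an earlier model summary.

```python
def top_k_feature_indices(importances, k):
    """Return indices of the k largest absolute importance values, most important first."""
    ranked = sorted(range(len(importances)), key=lambda i: abs(importances[i]), reverse=True)
    return ranked[:k]


# Hypothetical importance values for four features.
importances = [0.05, 0.40, -0.30, 0.01]
explain_subset = top_k_feature_indices(importances, k=2)
print(explain_subset)  # [1, 2]
```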

Default value: None
transformations

A sklearn.compose.ColumnTransformer or a list of tuples describing the column name and transformer. When transformations are provided, explanations are of the features before the transformation. The format for a list of transformations is the same as the one described here: https://github.com/scikit-learn-contrib/sklearn-pandas.

If you are using a transformation that is not in the list of sklearn.preprocessing transformations that are supported by the interpret-community package, then this parameter cannot take a list of more than one column as input for the transformation. You can use the following sklearn.preprocessing transformations with a list of columns since these are already one to many or one to one: Binarizer, KBinsDiscretizer, KernelCenterer, LabelEncoder, MaxAbsScaler, MinMaxScaler, Normalizer, OneHotEncoder, OrdinalEncoder, PowerTransformer, QuantileTransformer, RobustScaler, StandardScaler.

Examples for transformations that work:


   [
       (["col1", "col2"], sklearn_one_hot_encoder),
       (["col3"], None) #col3 passes as is
   ]
   [
       (["col1"], my_own_transformer),
       (["col2"], my_own_transformer),
   ]

An example of a transformation that would raise an error since it cannot be interpreted as one to many:


   [
       (["col1", "col2"], my_own_transformer)
   ]

The last example would not work since the interpret-community package can't determine whether my_own_transformer gives a many to many or one to many mapping when taking a sequence of columns.
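
The rule above can be checked before constructing the wrapper. The following is a minimal sketch, not the interpret-community implementation: find_ambiguous_entries is a hypothetical helper, the stand-in classes replace real transformers to keep the example dependency-free, and matching by class name is a simplification.

```python
# Names taken from the list of supported sklearn.preprocessing transformers above.
SUPPORTED_MULTI_COLUMN = {
    "Binarizer", "KBinsDiscretizer", "KernelCenterer", "LabelEncoder",
    "MaxAbsScaler", "MinMaxScaler", "Normalizer", "OneHotEncoder",
    "OrdinalEncoder", "PowerTransformer", "QuantileTransformer",
    "RobustScaler", "StandardScaler",
}


def find_ambiguous_entries(transformations):
    """Return entries that pair several columns with an unrecognized transformer."""
    ambiguous = []
    for columns, transformer in transformations:
        if transformer is None or len(columns) <= 1:
            continue  # pass-through and single-column entries are interpretable
        if type(transformer).__name__ not in SUPPORTED_MULTI_COLUMN:
            ambiguous.append((columns, transformer))
    return ambiguous


class MyOwnTransformer:  # hypothetical custom transformer
    pass


class OneHotEncoder:  # stand-in named like the sklearn class
    pass


ok = [(["col1", "col2"], OneHotEncoder()), (["col3"], None)]
bad = [(["col1", "col2"], MyOwnTransformer())]
print(len(find_ambiguous_entries(ok)))
print(len(find_ambiguous_entries(bad)))
```

Splitting a flagged entry into one tuple per column, as in the second working example above, is one way to resolve it.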

Specify only one of 'transformations' or 'feature_maps' to generate raw explanations. Specifying both results in a configuration exception.

Default value: None
feature_maps
list[array] or list[csr_matrix]

A list of feature maps from raw to generated features. This parameter can be a list of numpy arrays or sparse matrices where each array entry (raw_index, generated_index) is the weight for each raw, generated feature pair. The other entries are set to zero. For a sequence of transformations [t1, t2, ..., tn] generating generated features from raw features, the list of feature maps corresponds to the raw to generated maps in the same order as t1, t2, etc. If the overall raw to generated feature map from t1 to tn is available, then just that feature map can be passed in a single-element list.
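
A small numeric sketch may make the layout of a feature map concrete. The data below is hypothetical, and the two operations shown (aggregating importances with a matrix-vector product, and composing per-step maps with a matrix product) are assumptions about how such maps are typically used, not a description of the package internals.

```python
import numpy as np

# Hypothetical setup: 2 raw features. The first is one-hot encoded into 3 columns,
# the second passes through unchanged, giving 4 generated features.
# Rows index raw features, columns index generated features.
feature_map = np.array([
    [1.0, 1.0, 1.0, 0.0],  # raw feature 0 -> generated features 0, 1, 2
    [0.0, 0.0, 0.0, 1.0],  # raw feature 1 -> generated feature 3
])

# Hypothetical importances computed on the 4 generated (engineered) features.
engineered_importance = np.array([0.10, 0.20, 0.05, 0.30])

# Assumed aggregation: weight each generated importance by the map, sum per raw feature.
raw_importance = feature_map @ engineered_importance
print(raw_importance)

# For chained steps t1 (2 raw -> 4) and t2 (4 -> 4, identity here), the overall
# raw-to-generated map is assumed to be the product of the per-step maps.
t2_map = np.eye(4)
overall_map = feature_map @ t2_map
feature_maps = [overall_map]  # single-element list holding the overall map
```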

Specify only one of 'transformations' or 'feature_maps' to generate raw explanations. Specifying both results in a configuration exception.

Default value: None
allow_all_transformations

Whether to allow many to many and many to one transformations.

Default value: None

Remarks

The MimicWrapper can be used for explaining machine learning models, and is particularly effective in conjunction with AutoML. For example, using the automl_setup_model_explanations function in the <xref:azureml.train.automl.runtime.automl_explain_utilities> module, you can use the MimicWrapper to compute and visualize feature importance. For more information, see Interpretability: model explanations in automated machine learning.

In the following example, the MimicWrapper is used in a classification problem.


   from azureml.interpret.mimic_wrapper import MimicWrapper

   # Assumes ws (the workspace), automl_run (the run), and automl_explainer_setup_obj
   # (the result of automl_setup_model_explanations) are already defined.
   explainer = MimicWrapper(
       ws,
       automl_explainer_setup_obj.automl_estimator,
       explainable_model=automl_explainer_setup_obj.surrogate_model,
       init_dataset=automl_explainer_setup_obj.X_transform,
       run=automl_run,
       features=automl_explainer_setup_obj.engineered_feature_names,
       feature_maps=[automl_explainer_setup_obj.feature_map],
       classes=automl_explainer_setup_obj.classes,
       explainer_kwargs=automl_explainer_setup_obj.surrogate_model_params,
   )

For more information about this example, see this notebook.

Methods

explain

Explain a model's behavior and optionally upload that explanation for storage and visualization.

explain

Explain a model's behavior and optionally upload that explanation for storage and visualization.

explain(explanation_types, eval_dataset=None, top_k=None, upload=True, upload_datasets=False, tag='', get_raw=False, raw_feature_names=None, experiment_name='explain_model', raw_eval_dataset=None, true_ys=None)

Parameters

Name Description
explanation_types
Required

A list of strings representing types of explanations desired. Currently, 'global' and 'local' are supported. Both may be passed in at once; only one explanation will be returned.

eval_dataset

The dataset ID or regular dataset used to generate the explanation.

Default value: None
top_k
int

Limits the amount of data returned and stored in Run History to the top k features, when possible.

Default value: None
upload

If True, the explanation is automatically uploaded to Run History for storage and visualization. If a run was not passed in at initialization, one is created.

Default value: True
upload_datasets

If set to True and no dataset IDs are passed in, the evaluation dataset will be uploaded to Azure storage. This will improve the visualization available in the web view.

Default value: False
tag
str

A string to attach to the explanation to distinguish it from others after upload.

Default value: ''
get_raw

If True and the parameter feature_maps was passed in during initialization, the explanation returned will be for the raw features. If False or not specified, the explanation will be for the data exactly as it is passed in.

Default value: False
raw_feature_names

The list of raw feature names, replacing engineered feature names from the constructor.

Default value: None
experiment_name
str

The desired name to give an explanation if upload is True but no run was passed in during initialization.

Default value: explain_model
raw_eval_dataset

Raw eval data to be uploaded for raw explanations.

Default value: None
true_ys
list or pandas.DataFrame or ndarray

The true labels for the evaluation examples.

Default value: None

Returns

Type Description

An explanation object.
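
A call to explain might look like the sketch below, which uses only parameters from the signature above. The wrapper function explain_without_upload and the StubExplainer class are hypothetical; the stub stands in for a real MimicWrapper so the example runs without a workspace, and its return value is not what the real method returns.

```python
def explain_without_upload(explainer, eval_dataset, true_ys=None):
    """Request global and local explanation types and skip the Run History upload."""
    return explainer.explain(
        ["global", "local"],
        eval_dataset=eval_dataset,
        upload=False,  # do not upload the explanation to Run History
        true_ys=true_ys,
    )


class StubExplainer:
    """Hypothetical stand-in for a MimicWrapper instance."""

    def explain(self, explanation_types, eval_dataset=None, upload=True, **kwargs):
        # Echo the arguments so the call can be inspected.
        return {"types": explanation_types, "rows": len(eval_dataset), "upload": upload}


result = explain_without_upload(StubExplainer(), eval_dataset=[[0.1, 0.2], [0.3, 0.4]])
print(result)
```

With a real MimicWrapper, leaving upload at its default of True stores the explanation in Run History instead.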

Attributes

explainer

Get the explainer that is being used internally by the wrapper.

Returns

Type Description

The explainer that is being used internally by the wrapper.