OnlineGradientDescentRegressor Class
Train a stochastic gradient descent model.
- Inheritance
- nimbusml.internal.core.linear_model._onlinegradientdescentregressor.OnlineGradientDescentRegressor
- nimbusml.base_predictor.BasePredictor
- sklearn.base.RegressorMixin
Constructor
OnlineGradientDescentRegressor(normalize='Auto', caching='Auto', loss='squared', learning_rate=0.1, decrease_learning_rate=True, l2_regularization=0.0, number_of_iterations=1, initial_weights_diameter=0.0, reset_weights_after_x_examples=None, lazy_update=True, recency_gain=0.0, recency_gain_multiplicative=False, averaged=True, averaged_tolerance=0.01, initial_weights=None, shuffle=True, feature=None, label=None, **params)
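For example, a learner that takes smaller gradient steps but makes more passes over the data can be constructed as follows (a minimal sketch; the values shown are illustrative, not tuned recommendations):
from nimbusml.linear_model import OnlineGradientDescentRegressor
# smaller steps, more passes, mild L2 penalty (illustrative values)
ogd = OnlineGradientDescentRegressor(
    learning_rate=0.05,
    number_of_iterations=10,
    l2_regularization=0.001)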
Parameters
Name | Description |
---|---|
feature | see Columns. |
label | see Columns. |
normalize | Specifies the type of automatic normalization used. Normalization rescales disparate data ranges to a standard scale. Feature scaling ensures the distances between data points are proportional and enables various optimization methods such as gradient descent to converge much faster. If normalization is performed, a MaxMin normalizer is used, which preserves sparsity by mapping zero to zero. |
caching | Whether the trainer should cache the input training data. |
loss | The loss function to optimize. The default is 'squared'. For more information, please see nimbusml. |
learning_rate | Determines the size of the step taken in the direction of the gradient in each step of the learning process. This determines how fast or slow the learner converges on the optimal solution. If the step size is too big, you might overshoot the optimal solution. If the step size is too small, training takes longer to converge to the best solution. |
decrease_learning_rate | Whether to decrease the learning rate as training progresses. |
l2_regularization | L2 regularization weight. |
number_of_iterations | Number of training iterations (passes over the data). |
initial_weights_diameter | Sets the initial weights diameter that specifies the range from which values are drawn for the initial weights. These weights are initialized randomly from within this range. For example, if the diameter is specified to be d, the weights are uniformly distributed between -d/2 and d/2. The default value is 0, which specifies that all the weights are set to zero. |
reset_weights_after_x_examples | Number of examples after which the weights will be reset to the current average. |
lazy_update | Instead of updating averaged weights on every example, only update when the loss is nonzero. |
recency_gain | Extra weight given to more recent updates (lazy_update must be False). |
recency_gain_multiplicative | Whether the recency gain is multiplicative (vs. additive). |
averaged | Whether to average the weight vectors seen over the course of training. |
averaged_tolerance | The inexactness tolerance for averaging. |
initial_weights | Initial weights and bias, comma-separated. |
shuffle | Whether to shuffle for each training iteration. |
params | Additional arguments sent to compute engine. |
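Beyond the pipeline usage shown in the Examples section below, the feature and label roles can also be supplied sklearn-style by passing X and y to fit directly. A minimal sketch, assuming pandas inputs with invented toy data:
import pandas as pd
from nimbusml.linear_model import OnlineGradientDescentRegressor
# toy data, invented for illustration
X = pd.DataFrame({'x1': [0.0, 1.0, 2.0, 3.0],
                  'x2': [1.0, 0.0, 1.0, 0.0]})
y = pd.Series([0.6, 1.4, 2.5, 3.6], name='y')
ogd = OnlineGradientDescentRegressor(number_of_iterations=5)
ogd.fit(X, y)          # sklearn-style fit
print(ogd.predict(X))  # one score per input row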
Examples
###############################################################################
# OnlineGradientDescentRegressor
from nimbusml import Pipeline, FileDataStream
from nimbusml.datasets import get_dataset
from nimbusml.feature_extraction.categorical import OneHotVectorizer
from nimbusml.linear_model import OnlineGradientDescentRegressor
# data input (as a FileDataStream)
path = get_dataset('infert').as_filepath()
data = FileDataStream.read_csv(path)
print(data.head())
# age case education induced parity ... row_num spontaneous ...
# 0 26 1 0-5yrs 1 6 ... 1 2 ...
# 1 42 1 0-5yrs 1 1 ... 2 0 ...
# 2 39 1 0-5yrs 2 6 ... 3 0 ...
# 3 34 1 0-5yrs 2 4 ... 4 0 ...
# 4 35 1 6-11yrs 1 3 ... 5 1 ...
# define the training pipeline
pipeline = Pipeline([
OneHotVectorizer(columns={'edu': 'education'}),
OnlineGradientDescentRegressor(feature=['parity', 'edu'], label='age')
])
# train, predict, and evaluate
metrics, predictions = pipeline.fit(data).test(data, output_scores=True)
# print predictions
print(predictions.head())
# Score
# 0 28.103731
# 1 21.805904
# 2 28.103731
# 3 25.584600
# 4 33.743286
# print evaluation metrics
print(metrics)
# L1(avg) L2(avg) RMS(avg) Loss-fn(avg) R Squared
# 0 4.452286 31.15933 5.582054 31.15933 -0.134398
Remarks
Stochastic gradient descent uses a simple yet efficient iterative technique to fit model coefficients using error gradients for convex loss functions (see Stochastic_gradient_descent).
The OnlineGradientDescentRegressor implements the standard (non-batch) SGD, with a choice of loss functions, and an option to update the weight vector using the average of the vectors seen over time (the averaged argument is set to True by default).
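To make the averaging idea concrete, here is a small pure-Python sketch of averaged SGD with squared loss. It illustrates the update rule only and is not the library's implementation:
import random

def averaged_sgd(examples, learning_rate=0.1, iterations=1):
    # Fit y ~ w*x + b by SGD on squared loss, returning averaged weights.
    data = list(examples)
    w, b = 0.0, 0.0                        # current weight and bias
    w_avg, b_avg, n = 0.0, 0.0, 0          # running averages of the iterates
    for _ in range(iterations):
        random.shuffle(data)               # cf. shuffle=True
        for x, y in data:
            err = (w * x + b) - y          # gradient of 0.5*(pred - y)**2 w.r.t. pred
            w -= learning_rate * err * x   # step against the gradient
            b -= learning_rate * err
            n += 1
            w_avg += (w - w_avg) / n       # online mean of the weight iterates
            b_avg += (b - b_avg) / n
    return w_avg, b_avg                    # cf. averaged=True

print(averaged_sgd([(1.0, 2.1), (2.0, 3.9), (3.0, 6.0)], iterations=50))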
Reference
Wikipedia: Stochastic gradient descent, https://en.wikipedia.org/wiki/Stochastic_gradient_descent
Methods
get_params | Get the parameters for this operator. |
get_params
Get the parameters for this operator.
get_params(deep=False)
Parameters
Name | Description |
---|---|
deep | Default value: False |
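For example (a minimal sketch):
from nimbusml.linear_model import OnlineGradientDescentRegressor
ogd = OnlineGradientDescentRegressor(learning_rate=0.05)
# returns the constructor arguments as a dict
print(ogd.get_params())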