graphlab.neuralnet_classifier.create

graphlab.neuralnet_classifier.create(dataset, target, features=None, max_iterations=10, network=None, validation_set='auto', verbose=True, class_weights=None, **kwargs)

Create a NeuralNetClassifier to predict the class of data with numerical features or image data features.

The optional network parameter accepts a NeuralNet object, which defines the neural network architecture and learning parameters. It is the most important parameter in the model learning process; we recommend starting with the default architecture returned by deeplearning.create(), then tuning it to best fit your task.
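
A minimal sketch of that workflow (here data stands in for your own training SFrame, and the target column name 'label' is illustrative):

>>> net = graphlab.deeplearning.create(data, target='label')
>>> # Inspect and tune net.layers and net.params as needed, then:
>>> m = graphlab.neuralnet_classifier.create(data, target='label', network=net)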

Multiple GPU Support

Creating a neural net classifier can take advantage of multiple GPU devices. By default, all detected GPUs are used. You can change the default behavior by setting the runtime config “GRAPHLAB_NEURALNET_DEFAULT_GPU_DEVICE_ID”. For instance, the following code sets the default to use only devices 0 and 2.

graphlab.set_runtime_config("GRAPHLAB_NEURALNET_DEFAULT_GPU_DEVICE_ID", "0,2")

Note

If there is an imbalance among the GPUs, where one GPU is slower than another, the faster GPU will end up waiting on the slower one.

Parameters:

dataset : SFrame

A training dataset, containing feature columns and a target column. If the feature column is of type graphlab.Image, all images must be of the same size.

target : str

The name of the column in dataset that is the prediction target. The values in this column represent classes, and must be of integer or string type.

features : list[str], optional

Column names of the features used to train the model. Each column must contain vectors of floats, or there can be a single column of Image type. The default argument is None, which means all columns are used except the target.

max_iterations : int, optional

The maximum number of iterations for training the neural network.

network : NeuralNet, optional

The NeuralNet object contains model learning parameters and definitions for the network architecture. The default is None, but we recommend using deeplearning.create() to find a default structure for the input data. Because this default structure may be suboptimal, tuning the NeuralNet is highly recommended.

validation_set : SFrame, optional

A dataset for monitoring the model’s generalization performance. For each row of the progress table, the chosen metrics are computed for both the provided training dataset and the validation_set. The format of this SFrame must be the same as the training set. By default this argument is set to ‘auto’, and a validation set is automatically sampled and used for progress printing. If validation_set is set to None, then no additional metrics are computed. Metrics are computed once per full iteration.
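
For example, to skip the automatic sampling and compute no validation metrics (data and 'label' stand in for your own SFrame and target column):

>>> m = graphlab.neuralnet_classifier.create(data, target='label',
...                                          validation_set=None)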

class_weights : {dict, ‘auto’}, optional

Weights the examples in the training data according to the given class weights. If set to None, all classes are weighted equally. The ‘auto’ mode sets the class weight to be inversely proportional to the number of examples in the training data with the given class. If providing custom class weights, all classes must be present in the dictionary.
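
For example, a sketch that upweights one class relative to another (the labels 0 and 1 are assumed to be values in the target column):

>>> m = graphlab.neuralnet_classifier.create(data, target='label',
...                                          class_weights={0: 1.0, 1: 5.0})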

kwargs : dict, optional

Additional arguments for training the neural network. All of the parameters listed below can be stored in the params attribute of a NeuralNet object. If the same parameter is set in both places, the one passed to the create function overrides the one in the NeuralNet object. A usage sketch follows the list below.

  • batch_size : int, default 100

    The SGD mini-batch size. A larger batch_size improves per-iteration speed, but costs more GPU (or CPU) memory.

  • model_checkpoint_path : str, default “”

    If specified, save the model to the given path every n iterations, where n is specified by model_checkpoint_interval.

  • model_checkpoint_interval : int, default 5

    If model_checkpoint_path is specified, the model is saved to the given path every model_checkpoint_interval iterations.

  • mean_image : graphlab.image.Image, default None

    If set and subtract_mean is True, use the provided mean image to save computation time.

  • metric : {‘accuracy’, ‘error’, ‘recall@5’, ...}, default ‘auto’

    The metric(s) used for evaluating training and validation data. To evaluate on multiple metrics, supply a list of metric strings, e.g. [‘accuracy’, ‘recall@5’], or use a comma-separated string, e.g. ‘accuracy,recall@5’.

  • subtract_mean : bool, default True

    If true, subtract the mean from each image. The mean image is calculated from the training data, or the provided mean_image is used. Subtracting the mean centers the input data, which helps accelerate neural net training.

  • random_crop : bool, default False

    If true, apply random crop to the input images. The cropped image size is defined by the input_shape parameter below. Random cropping helps prevent the model from overfitting by augmenting the dataset.

  • input_shape : str, default None

    A formatted string of the form channels,width,height, e.g. “1,28,28” or “3,256,256”, indicating the shape of the image after random cropping. The input_shape cannot exceed the shape of the original image.

  • random_mirror : bool, default False

    If true, apply random mirroring to the input images. Random mirroring helps prevent the model from overfitting by augmenting the dataset.

  • learning_rate : float, default 0.001

    The learning_rate for bias and weights.

  • momentum : float in [0, 1], default 0.9

    The momentum for bias and weights.

  • l2_regularization : float, default 0.0005

    L2 regularization for weights.

  • bias_learning_rate : float, default unused

    Specify the learning rate for bias, overriding learning_rate.

  • bias_momentum : float, default unused

    Specify the momentum for bias, overriding momentum.

  • bias_l2_regularization : float, default 0.0

    The L2 regularization for bias.

  • learning_rate_schedule : {‘constant’, ‘exponential_decay’, ‘polynomial_decay’}

    Learning rate scheduling algorithm.

    • constant: Use the same learning rate for all iterations.
    • exponential_decay: Exponentially decreases the learning rate over iterations. See the notes section for more details.
    • polynomial_decay: Polynomially decreases the learning rate over iterations. See the notes section for more details.
  • learning_rate_start_epoch : int, default 0

    Start learning rate scheduling after this epoch.

  • min_learning_rate : float, default 0.00001

    The minimum learning rate.

  • learning_rate_step : int, default 1

    Update the learning rate every learning_rate_step epochs.

  • learning_rate_gamma : float, default 0.1

    Learning rate decay parameter used in ‘exponential_decay’.

  • learning_rate_alpha : float, default 0.5

    Learning rate decay parameter used in ‘polynomial_decay’.

  • init_random : {‘gaussian’, ‘xavier’}, default ‘gaussian’

    The type of initialization for the weights: either random Gaussian initialization or Xavier initialization. See the FullConnectionLayer parameters for more information.

  • init_sigma : float, default 0.01

    The standard deviation of the Gaussian distribution from which the weight initializations are drawn.

  • init_bias : float, default 0.0

    The initial value of the biases.

  • divideby : float, default 1.0

    The value by which to scale the input data before it is inserted into the network.
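
As referenced above, a sketch of setting several of these keyword arguments directly in the create call (the values and the checkpoint path are illustrative, not recommendations):

>>> m = graphlab.neuralnet_classifier.create(training_data, target='label',
...                                          batch_size=100,
...                                          learning_rate=0.001,
...                                          metric='accuracy,recall@5',
...                                          model_checkpoint_path='/tmp/mnist_model',
...                                          model_checkpoint_interval=5)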

Returns:

out : NeuralNetClassifier

Notes

For exponential_decay, the learning rate decreases exponentially according to the following:

\[new\_lr = lr \cdot lr\_gamma^{epoch / lr\_step}\]

For polynomial_decay, the learning rate decreases polynomially according to the following:

\[new\_lr = lr \cdot (1 + (epoch / lr\_step) \cdot lr\_gamma)^{-lr\_alpha}\]
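
As an illustration, the two schedules can be written directly in Python using the kwargs names above (a sketch of the formulas, not the library’s internal implementation; min_learning_rate presumably bounds the decayed rate from below):

>>> def exponential_decay(lr, epoch, lr_step, lr_gamma):
...     # new_lr = lr * lr_gamma ** (epoch / lr_step)
...     return lr * lr_gamma ** (float(epoch) / lr_step)
>>> def polynomial_decay(lr, epoch, lr_step, lr_gamma, lr_alpha):
...     # new_lr = lr * (1 + (epoch / lr_step) * lr_gamma) ** -lr_alpha
...     return lr * (1 + (float(epoch) / lr_step) * lr_gamma) ** (-lr_alpha)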

Examples

We train a convolutional neural network for digit recognition using the MNIST data. The data has already been downloaded from the MNIST database and saved as SFrames in Turi’s public S3 bucket.

>>> data = graphlab.SFrame('https://static.turi.com/datasets/mnist/sframe/train')
>>> test_data = graphlab.SFrame('https://static.turi.com/datasets/mnist/sframe/test')
>>> training_data, validation_data = data.random_split(0.8)

Resize all the images to the same size, since neural nets have a fixed input size.

>>> training_data['image'] = graphlab.image_analysis.resize(training_data['image'], 28, 28, 1, decode=True)
>>> validation_data['image'] = graphlab.image_analysis.resize(validation_data['image'], 28, 28, 1, decode=True)
>>> test_data['image'] = graphlab.image_analysis.resize(test_data['image'], 28, 28, 1, decode=True)

Use the built-in NeuralNet architecture for MNIST (a one-layer convolutional neural network):

>>> net = graphlab.deeplearning.get_builtin_neuralnet('mnist')

Layers of the neural network:

>>> net.layers
layer[0]: ConvolutionLayer
  padding = 1
  stride = 2
  random_type = xavier
  num_channels = 32
  kernel_size = 3
layer[1]: MaxPoolingLayer
  stride = 2
  kernel_size = 3
layer[2]: FlattenLayer
layer[3]: DropoutLayer
  threshold = 0.5
layer[4]: FullConnectionLayer
  init_sigma = 0.01
  num_hidden_units = 100
layer[5]: SigmoidLayer
layer[6]: FullConnectionLayer
  init_sigma = 0.01
  num_hidden_units = 10
layer[7]: SoftmaxLayer

Parameters of the neural network:

>>> net.params
{'batch_size': 100,
 'data_shape': '1,28,28',
 'divideby': 255,
 'init_random': 'gaussian',
 'l2_regularization': 0.0,
 'learning_rate': 0.1,
 'momentum': 0.9}

Train a NeuralNetClassifier using the specified network:

>>> m = graphlab.neuralnet_classifier.create(training_data, target='label',
...                                          network = net,
...                                          validation_set=validation_data,
...                                          metric=['accuracy', 'recall@2'],
...                                          max_iterations=3)

Classify the test data, and output the most likely class label. ‘probability’ corresponds to the probability that the input belongs to that class:

>>> pred = m.classify(test_data)
>>> pred
+--------+-------+----------------+
| row_id | class |  probability   |
+--------+-------+----------------+
|   0    |   0   | 0.998417854309 |
|   1    |   0   | 0.999230742455 |
|   2    |   0   | 0.999326109886 |
|   3    |   0   | 0.997855246067 |
|   4    |   0   | 0.997171103954 |
|   5    |   0   | 0.996235311031 |
|   6    |   0   | 0.999143242836 |
|   7    |   0   | 0.999519705772 |
|   8    |   0   | 0.999182283878 |
|   9    |   0   | 0.999905228615 |
|  ...   |  ...  |      ...       |
+--------+-------+----------------+
[10000 rows x 3 columns]

Predict the top 2 most likely digits:

>>> pred_top2 = m.predict_topk(test_data, k=2)
>>> pred_top2
+--------+-------+-------------------+
| row_id | class |    probability    |
+--------+-------+-------------------+
|   0    |   0   |   0.998417854309  |
|   0    |   6   | 0.000686840794515 |
|   1    |   0   |   0.999230742455  |
|   1    |   2   | 0.000284609268419 |
|   2    |   0   |   0.999326109886  |
|   2    |   8   | 0.000261707202299 |
|   3    |   0   |   0.997855246067  |
|   3    |   8   |  0.00118813838344 |
|   4    |   0   |   0.997171103954  |
|   4    |   6   |  0.00115600414574 |
|  ...   |  ...  |        ...        |
+--------+-------+-------------------+
[20000 rows x 3 columns]

Evaluate the classifier on the test data. Default metrics are accuracy and confusion_matrix.

>>> eval_ = m.evaluate(test_data)
>>> eval_
{'accuracy': 0.979200005531311, 'confusion_matrix':
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |      0       |        0        |  969  |
 |      2       |        0        |   2   |
 |      5       |        0        |   2   |
 |      6       |        0        |   9   |
 |      7       |        0        |   1   |
 |      9       |        0        |   2   |
 |      1       |        1        |  1126 |
 |      2       |        1        |   2   |
 |      6       |        1        |   2   |
 |      7       |        1        |   3   |
 |     ...      |       ...       |  ...  |
 +--------------+-----------------+-------+
 [64 rows x 3 columns]}