classifier

The GraphLab Create classifier toolkit contains models for classification problems. It currently provides support vector machines (SVM), logistic regression, boosted trees, random forests, decision trees, neural networks, and nearest neighbors. SVM handles binary targets only; the logistic regression and tree-based models handle both binary and multi-class targets. In addition to these models, the toolkit provides a create() function that automatically selects a suitable model based on the training data. If you are unsure which model to use, simply call create().

Training datasets should contain a column for the ‘target’ variable and one or more columns representing feature variables.

# Set up the data
>>> import graphlab as gl
>>> data = gl.SFrame('https://static.turi.com/datasets/regression/houses.csv')

# Add a binary target column, then create the model
>>> data['is_expensive'] = data['price'] > 30000
>>> model = gl.classifier.create(data, target='is_expensive',
...                              features=['bath', 'bedroom', 'size'])

# Make predictions and evaluate results.
>>> classification = model.classify(data)
>>> results = model.evaluate(data)

creating a classifier

classifier.create Automatically create a suitable classifier model based on the provided training data.

random forest

random_forest_classifier.create Create a (binary or multi-class) classifier model of type RandomForestClassifier using an ensemble of decision trees trained on subsets of the data.
random_forest_classifier.get_default_options Get the default options for the toolkit RandomForestClassifier.
random_forest_classifier.RandomForestClassifier The random forest model can be used as a classifier for predictive tasks.

decision tree

decision_tree_classifier.create Create a (binary or multi-class) classifier model of type DecisionTreeClassifier.
decision_tree_classifier.get_default_options Get the default options for the toolkit DecisionTreeClassifier.
decision_tree_classifier.DecisionTreeClassifier Special case of gradient boosted trees with the number of trees set to 1.

boosted trees

boosted_trees_classifier.create Create a (binary or multi-class) classifier model of type BoostedTreesClassifier using gradient boosted trees (sometimes known as GBMs).
boosted_trees_classifier.get_default_options Get the default options for the toolkit BoostedTreesClassifier.
boosted_trees_classifier.BoostedTreesClassifier The gradient boosted trees model can be used as a classifier for predictive tasks.

logistic regression

logistic_classifier.create Create a LogisticClassifier (using logistic regression as a classifier) to predict the class of a discrete target variable (binary or multiclass) based on a model of class probability as a logistic function of a linear combination of the features.
logistic_classifier.get_default_options Get the default options for the toolkit LogisticClassifier.
logistic_classifier.LogisticClassifier Logistic regression models a discrete target variable as a function of several feature variables.
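
To make "class probability as a logistic function of a linear combination of the features" concrete, here is a minimal pure-Python sketch. It does not use GraphLab Create and is not the toolkit's implementation; the function names, the coefficients, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import math


def logistic_probability(features, weights, intercept):
    """Return P(class = 1) as the logistic function of a linear score."""
    score = intercept + sum(w * x for w, x in zip(weights, features))
    # Branch on the sign of the score so math.exp never overflows.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    exp_score = math.exp(score)
    return exp_score / (1.0 + exp_score)


def logistic_classify(features, weights, intercept, threshold=0.5):
    """Predict class 1 when the probability reaches the threshold."""
    probability = logistic_probability(features, weights, intercept)
    return 1 if probability >= threshold else 0


# Hypothetical coefficients for the features [bath, bedroom, size];
# a real model would learn these from training data.
weights = [0.8, 0.3, 0.002]
intercept = -5.0

example = [2, 3, 1500]
probability = logistic_probability(example, weights, intercept)
label = logistic_classify(example, weights, intercept)
print(probability, label)
```

The linear score for this example is 0.5, so the predicted probability is a little above one half and the example is assigned to class 1.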

support vector machines

svm_classifier.create Create an SVMClassifier to predict the class of a binary target variable based on which side of a hyperplane the example falls on.
svm_classifier.get_default_options Get the default options for the toolkit SVMClassifier.
svm_classifier.SVMClassifier Support vector machines can be used to predict a binary target variable using several feature variables.
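
The "which side of a hyperplane" rule can be sketched in a few lines of plain Python. This covers only the prediction step of a linear SVM, not training, and it is not GraphLab Create code; the function names, the weights, and the bias below are hypothetical values picked for illustration.

```python
def svm_decision_value(features, weights, bias):
    """Signed score: positive on one side of the hyperplane, negative on the other."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def svm_classify(features, weights, bias):
    """Assign class 1 to points on the positive side, class 0 otherwise."""
    return 1 if svm_decision_value(features, weights, bias) > 0 else 0


# Hypothetical hyperplane x1 - x2 - 0.5 = 0 in a two-feature space.
weights = [1.0, -1.0]
bias = -0.5

positive_side = svm_classify([2.0, 0.5], weights, bias)
negative_side = svm_classify([0.0, 1.0], weights, bias)
print(positive_side, negative_side)
```

Because the prediction depends only on the sign of a single score, this formulation yields exactly two classes, which is consistent with the toolkit's SVM being limited to binary targets.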

neural networks

neuralnet_classifier.create Create a NeuralNetClassifier to predict the class of data with numerical features or image data features.
neuralnet_classifier.get_default_options Get the default options for the toolkit NeuralNetClassifier.
neuralnet_classifier.NeuralNetClassifier Neural networks are one of the classical models in artificial intelligence and machine learning, and have recently achieved great success in computer vision tasks such as object recognition.

nearest neighbor

nearest_neighbor_classifier.create Create a NearestNeighborClassifier model.
nearest_neighbor_classifier.get_default_options Return information about options for the nearest neighbor classifier.
nearest_neighbor_classifier.NearestNeighborClassifier Nearest neighbor classifier model.
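
The entries above do not describe how a nearest neighbor classifier makes a prediction, so the following pure-Python sketch shows the general idea: find the k closest training examples and take a majority vote over their labels. It is not the NearestNeighborClassifier implementation; the Euclidean distance, k=3, the function names, and the toy data are all assumptions.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, examples, labels, k=3):
    """Return the most common label among the k training examples nearest to query."""
    ranked = sorted(
        zip(examples, labels),
        key=lambda pair: euclidean_distance(query, pair[0]),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training data: two well-separated clusters (hypothetical values).
examples = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]]
labels = ['cheap', 'cheap', 'cheap', 'expensive', 'expensive', 'expensive']

near_first_cluster = knn_classify([2, 2], examples, labels)
near_second_cluster = knn_classify([8.5, 8.5], examples, labels)
print(near_first_cluster, near_second_cluster)
```

An odd k avoids tied votes when there are only two classes; with more classes or an even k, a tie-breaking rule would be needed.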