Creating regression models is easy with GraphLab Create! The regression toolkit implements the following models:
These algorithms differ in how they make predictions, but conform to the same API. With any of them, call create() to train a model, call predict() on the returned model to make predictions, and call evaluate() to measure the quality of those predictions. All models can incorporate:
- Numeric features
- Categorical variables
- Sparse features (i.e., feature sets with a large number of features, of which only a small subset have non-zero values)
- Dense features (i.e., feature sets with a large number of numeric features)
- Text data
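To make the difference between sparse and dense features concrete, here is a minimal sketch in plain Python. It does not use GraphLab Create, and the feature names, the `vocabulary` list, and the `to_dense` helper are hypothetical, chosen only for illustration. It shows the same kind of information stored either as a dense list of values or as a sparse dictionary that keeps only the non-zero entries.

```python
# Hypothetical toy rows; names and values are made up for illustration.

# Dense row: every position holds a numeric value.
dense_row = [3.5, 4.0, 12.0, 250.0]

# Sparse row: only the non-zero entries are stored, keyed by feature name.
sparse_row = {"word_pizza": 2.0, "word_great": 1.0}


def to_dense(sparse, vocabulary):
    """Expand a sparse dict into a dense list ordered by `vocabulary`."""
    return [sparse.get(name, 0.0) for name in vocabulary]


# The full feature set; most entries are zero for any single row.
vocabulary = ["word_bad", "word_great", "word_pizza", "word_slow"]
expanded = to_dense(sparse_row, vocabulary)
print(expanded)
```

With a realistic vocabulary of many thousands of words, the dictionary form stays small while the expanded list grows with the vocabulary, which is why sparse data is usually kept in the compact form.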
It isn't always clear which model is best suited to a given task. GraphLab Create's model selector automatically picks a suitable model for you based on statistics collected from the data set.
```python
import graphlab as gl

# Load the data
data = gl.SFrame('https://static.turi.com/datasets/regression/yelp-data.csv')

# Make a train-test split
train_data, test_data = data.random_split(0.8)

# Automatically picks the right model based on your data.
model = gl.regression.create(train_data, target='stars',
                             features=['user_avg_stars',
                                       'business_avg_stars',
                                       'user_review_count',
                                       'business_review_count'])

# Save predictions to an SArray
predictions = model.predict(test_data)

# Evaluate the model and save the results into a dictionary
results = model.evaluate(test_data)
```
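The evaluation step returns a dictionary of results. As a rough sketch of what such regression metrics measure, the plain-Python snippet below computes root-mean-squared error and maximum absolute error by hand. That the dictionary contains these two metrics is an assumption, not something stated here, and the key names, helper functions, and toy values are hypothetical.

```python
import math


def rmse(targets, predictions):
    """Root-mean-squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))


def max_error(targets, predictions):
    """Largest absolute difference between a target and its prediction."""
    return max(abs(t - p) for t, p in zip(targets, predictions))


# Toy star ratings and predictions, made up for illustration.
actual = [4.0, 3.0, 5.0, 1.0]
predicted = [3.0, 3.0, 4.0, 3.0]

# Assumed shape of an evaluation result: metric name -> value.
results = {
    "rmse": rmse(actual, predicted),
    "max_error": max_error(actual, predicted),
}
print(results)
```

RMSE summarizes the typical size of the errors across the whole test set, while the maximum error highlights the single worst prediction, so the two are worth reading together.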
GraphLab Create implementations are built to scale to billions of examples and millions of features.