A Random Forest classifier is one of the most effective machine learning models for predictive analytics. Refer to the chapter on random forest regression for background on random forests.
import graphlab as gl # Load the data # The data can be downloaded using data = gl.SFrame.read_csv('https://static.turi.com/datasets/xgboost/mushroom.csv') # Label 'c' is edible data['label'] = data['label'] == 'c' # Make a train-test split train_data, test_data = data.random_split(0.8) # Create a model. model = gl.random_forest_classifier.create(train_data, target='label', max_iterations=2, max_depth = 3) # Save predictions to an SArray. predictions = model.predict(test_data) # Evaluate the model and save the results into a dictionary results = model.evaluate(test_data)
We can visualize the models using
model.show(view="Tree", tree_id=0) model.show(view="Tree", tree_id=1)
See the chapter on random forest regression for additional tips and tricks of using the random forest classifier model.
Refer to the earlier chapters for the following features: