`nearest_neighbors`

¶

The GraphLab Create nearest neighbors toolkit finds the rows in a tabular reference dataset that are most similar to a set of queries with the same schema.

A `NearestNeighborsModel`

is created with
a reference dataset contained in an `SFrame`

, a distance
function, and an indexing method (the latter two options can be done
automatically by the model). An instantiated model has two key methods:
**query**, for finding the closest points in the reference dataset to *new* data
points; and **similarity_graph**, for finding the nearest neighbors of each
point in the original reference set.

```
>>> references = graphlab.SFrame({'x1': [0.98, 0.62, 0.11],
... 'x2': [0.69, 0.58, 0.36]})
>>> references.print_rows()
+------+------+
| x1 | x2 |
+------+------+
| 0.98 | 0.69 |
| 0.62 | 0.58 |
| 0.11 | 0.36 |
+------+------+
[3 rows x 2 columns]
...
>>> model = graphlab.nearest_neighbors.create(references)
...
>>> sim_graph = model.similarity_graph(k=1)
>>> sim_graph.show(vlabel='__id')
>>> sim_graph.edges
+----------+----------+----------------+------+
| __src_id | __dst_id | distance | rank |
+----------+----------+----------------+------+
| 0 | 1 | 0.376430604494 | 1 |
| 2 | 1 | 0.55542776308 | 1 |
| 1 | 0 | 0.376430604494 | 1 |
+----------+----------+----------------+------+
...
>>> queries = graphlab.SFrame({'x1': [0.05, 0.61, 0.99],
... 'x2': [0.06, 0.97, 0.86]})
>>> queries.print_rows()
+------+------+
| x1 | x2 |
+------+------+
| 0.05 | 0.06 |
| 0.61 | 0.97 |
| 0.99 | 0.86 |
+------+------+
[3 rows x 2 columns]
...
>>> model.query(queries, k=2)
+-------------+-----------------+----------------+------+
| query_label | reference_label | distance | rank |
+-------------+-----------------+----------------+------+
| 0 | 2 | 0.305941170816 | 1 |
| 0 | 1 | 0.771556867638 | 2 |
| 1 | 1 | 0.390128184063 | 1 |
| 1 | 0 | 0.464004310325 | 2 |
| 2 | 0 | 0.170293863659 | 1 |
| 2 | 1 | 0.464004310325 | 2 |
+-------------+-----------------+----------------+------+
```

In addition to the API documentation, please see the data science Gallery, How-tos, and the nearest neighbors chapter of the User Guide for more details and extended examples.

## nearest neighbors¶

`nearest_neighbors.create` |
Create a nearest neighbor model, which can be searched efficiently and quickly for the nearest neighbors of a query observation. |

`nearest_neighbors.NearestNeighborsModel` |
The NearestNeighborsModel represents rows of an SFrame in a structure that is used to quickly and efficiently find the nearest neighbors of a query point. |