`graph_analytics`

¶

The graph analytics toolkit contains methods for analyzing a
`SGraph`

. Each method takes an input graph and returns a model
object, which contains the training time, an `SFrame`

with the
desired output for each vertex, and a new graph whose vertices contain the
output as an attribute. Note that all of the methods in the graph analytics
toolkit are available from the top level graphlab import.

For this example we download a dataset of James Bond characters to an SFrame, then build the SGraph.

```
>>> from graphlab import SFrame, SGraph
>>> url = 'https://static.turi.com/datasets/bond/bond_edges.csv'
>>> data = SFrame.read_csv(url)
>>> g = SGraph().add_edges(data, src_field='src', dst_field='dst')
```

The `degree_counting.create()`

method computes the inbound, outbound,
and total degree for each vertex.

```
>>> from graphlab import degree_counting
>>> deg = degree_counting.create(g)
>>> deg_graph = deg['graph'] # a new SGraph with degree data attached to each vertex
>>> in_degree = deg_graph.vertices[['__id', 'in_degree']]
>>> out_degree = deg_graph.vertices[['__id', 'out_degree']]
```

The `connected_components.create()`

method finds all weakly connected
components in the graph and returns a
`ConnectedComponentsModel`

.
A connected component is a group of vertices such that
there is a path between each vertex in the component and all other vertices in
the group. If two vertices are in different connected components there is no
path between them.

```
>>> from graphlab import connected_components
>>> cc = connected_components.create(g)
>>> print cc.summary()
>>> print cc.list_fields()
>>> cc_ids = cc.get('component_id') # an SFrame
>>> cc_ids = cc['component_id'] # equivalent to the above line
>>> cc_graph = cc['graph']
```

The `shortest_path.create()`

method finds the shortest directed path distance from a single source vertex to
all other vertices and returns a
`ShortestPathModel`

.
The output model contains this information and a method to
retrieve the actual path to a particular vertex.

```
>>> from graphlab import shortest_path
>>> sp = shortest_path.create(g, source_vid=123)
>>> sp_sframe = sp['distance']
>>> sp_graph = sp['graph']
>>> path = sp.get_path('98')
```

The `triangle_counting.create()`

counts the number of triangles in the graph and for each vertex and returns
a `TriangleCountingModel`

.
A graph triangle is a complete subgraph of three vertices. The number of
triangles to which a vertex belongs is an indication of the connectivity of the
graph around that vertex.

```
>>> from graphlab import triangle_counting
>>> tc = triangle_counting.create(g)
>>> tc_out = tc['triangle_count']
```

The `pagerank.create()`

method computes
the pagerank for each vertex and returns a `PagerankModel`

.
The pagerank value indicates the centrality of each node in the graph.

```
>>> from graphlab import pagerank
>>> pr = pagerank.create(g)
>>> pr_out = pr['pagerank']
```

The `label_propagation.create()`

method computes the label probability for the vertices with unobserved labels
by propagating information from the vertices with observed labels along the edges.
The method returns a `LabelPropagationModel`

,
which contains the probability that a vertex belongs to each of the label classes.

```
>>> from graphlab import label_propagation
>>> import random
>>> def init_label(vid):
... x = random.random()
... if x > 0.9:
... return 0
... elif x < 0.1:
... return 1
... else:
... return None
>>> g.vertices['labels'] = g.vertices['__id'].apply(init_label, int)
>>> m = label_propagation.create(g)
>>> labels = m['labels']
```

The K-core decomposition recursively removes vertices from the graph with degree
less than k. The value of k where a vertex is removed is its core ID; the
`kcore.create()`

method returns
a `KcoreModel`

which contains the core ID for
every vertex in the graph.

```
>>> from graphlab import kcore
>>> kc = kcore.create(g)
>>> kcore_id = kc['core_id']
```

Graph coloring assigns each vertex in the graph to a group in such a way that no
two adjacent vertices share the same group.
`graph_coloring.create()`

method returns
a `GraphColoringModel`

which contains
the color group ID for every vertex in the graph.

```
>>> from graphlab import graph_coloring
>>> color = graph_coloring.create(g)
>>> color_id = color['color_id']
>>> num_colors = color['num_colors']
```

For more information on the models in the graph analytics toolkit, plus extended examples, please see the model definitions and create methods in the API documentation, as well as the data science Gallery, the How-Tos, and the graph analytics chapter of the User Guide.

## connected components¶

`connected_components.create` |
Compute the number of weakly connected components in the graph. |

`connected_components.ConnectedComponentsModel` |
A ConnectedComponentsModel object contains the component ID for each vertex and the total number of weakly connected components in the graph. |

`connected_components.get_default_options` |
Get the default options for `graphlab.connected_components.create()` . |

## degree counting¶

`degree_counting.create` |
Compute the in degree, out degree and total degree of each vertex. |

`degree_counting.DegreeCountingModel` |
Model object containing the in degree, out degree and total degree for each vertex, |

`degree_counting.get_default_options` |
Get the default options for `graphlab.degree_counting.create()` . |

## graph coloring¶

`graph_coloring.create` |
Compute the graph coloring. |

`graph_coloring.GraphColoringModel` |
A GraphColoringModel object contains color ID assignments for each vertex and the total number of colors used in coloring the entire graph. |

`graph_coloring.get_default_options` |
Get the default options for `graphlab.graph_coloring.create()` . |

## k-core¶

`kcore.create` |
Compute the K-core decomposition of the graph. |

`kcore.KcoreModel` |
A KcoreModel object contains a core ID for each vertex, and the total number of cores in the graph. |

`kcore.get_default_options` |
Get the default options for `graphlab.kcore.create()` . |

## label propagation¶

`label_propagation.create` |
Given a weighted graph with observed class labels of a subset of vertices, infer the label probability for the unobserved vertices using the “label propagation” algorithm. |

`label_propagation.LabelPropagationModel` |
A LabelPropagationModel computes the probability of each class label for each unlabeled vertex. |

`label_propagation.get_default_options` |
Get the default options for `graphlab.label_propagation.create()` . |

## pagerank¶

`pagerank.create` |
Compute the PageRank for each vertex in the graph. |

`pagerank.PagerankModel` |
A PageRankModel object contains the pagerank value for each vertex. |

`pagerank.get_default_options` |
Get the default options for `graphlab.pagerank.create()` . |

## shortest path¶

`shortest_path.create` |
Compute the single source shortest path distance from the source vertex to all vertices in the graph. |

`shortest_path.ShortestPathModel` |
Model object containing the distance for each vertex in the graph to a single source vertex, which is specified during `graphlab.shortest_path.create()` . |

`shortest_path.get_default_options` |
Get the default options for `graphlab.shortest_path.create()` . |

## triangle counting¶

`triangle_counting.create` |
Compute the number of triangles each vertex belongs to, ignoring edge directions. |

`triangle_counting.TriangleCountingModel` |
Model object containing the traingle count for each vertex, and the total number of triangles. |

`triangle_counting.get_default_options` |
Get the default options for `graphlab.triangle_counting.create()` . |