graphlab.text_analytics.stopwords

graphlab.text_analytics.stopwords(lang='en')

Get the set of common words (“stopwords”) that are often removed when preprocessing text data. Currently only English stopwords are provided.

Parameters:

lang : str, optional

The desired language. Default: 'en' (English).

Returns:

out : set

A set of strings, one per stopword in the requested language.

Examples

You may remove stopwords from an SArray of word-count dictionaries by passing the stopword set to dict_trim_by_keys and excluding the matching keys:

>>> docs = graphlab.SArray([{'are': 1, 'you': 1, 'not': 1, 'entertained': 1}])
>>> docs.dict_trim_by_keys(graphlab.text_analytics.stopwords(), exclude=True)
dtype: dict
Rows: 1
[{'entertained': 1}]
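
For intuition, the same key filtering can be sketched in plain Python without GraphLab. This is only an illustration of the idea, not the library's implementation: the helper name remove_stopwords and the tiny EXAMPLE_STOPWORDS set are assumptions made up for this sketch, and the set returned by graphlab.text_analytics.stopwords() is much larger.

```python
# Hypothetical, hand-picked stopword set for illustration only; it is NOT
# the set returned by graphlab.text_analytics.stopwords().
EXAMPLE_STOPWORDS = {'are', 'you', 'not', 'the', 'a', 'of'}


def remove_stopwords(word_counts, stopwords=EXAMPLE_STOPWORDS):
    """Return a copy of a word-count dict with the stopword keys dropped."""
    return {word: count
            for word, count in word_counts.items()
            if word not in stopwords}


# A plain list of dicts stands in for an SArray of dict type.
docs = [{'are': 1, 'you': 1, 'not': 1, 'entertained': 1}]
filtered = [remove_stopwords(doc) for doc in docs]
print(filtered)  # [{'entertained': 1}]
```

Because membership tests on a set take constant time on average, the cost of filtering grows with the number of words in each document rather than with the size of the stopword list.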