Sentiment analysis

Data scientists often face data sets that contain text and must apply natural language processing (NLP) techniques to make that text useful. Sentiment analysis refers to the use of NLP techniques to extract subjective information such as the polarity of the text, e.g., whether the author is speaking positively or negatively about some topic.
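
To make the idea of polarity concrete, here is a minimal sketch that scores text by counting words from a tiny hand-written lexicon. The word lists and the `polarity` function are hypothetical and exist only for illustration; this is not how GraphLab Create's pre-trained models work, which learn from labeled data rather than from fixed word lists.

```python
import re

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "amazing"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "poor"}


def polarity(text):
    """Return a score in [-1, 1]; 0.0 means neutral or no known words."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    # Avoid dividing by zero when no lexicon word appears in the text.
    return 0.0 if total == 0 else (pos - neg) / total


for review in [
    "I love this phone, the screen is great",
    "Terrible battery and awful support",
    "It arrived on Tuesday",
]:
    print(review, "->", polarity(review))
```

A word-counting approach like this ignores negation ("not good") and context, which is one reason trained models are generally preferred in practice.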

In many cases, sentiment analysis can help you keep a pulse on users' needs and adapt products and services accordingly. Many applications exist for this type of analysis:

  • Forum data: Find out how people feel about various products and features.
  • Restaurant and movie reviews: What are people raving about? What do people hate?
  • Social media: What is the sentiment about a hashtag, e.g., for a company, politician, etc.?
  • Call center transcripts: Are callers praising or complaining about particular topics?

In the next section, you will learn to use GraphLab Create's sentiment_analysis toolkit to apply pre-trained models to predict sentiment for text data in these situations.

More advanced forms of sentiment analysis exist. Aspect mining attempts to identify features (or aspects) of entities that are mentioned, and then estimate the sentiment for each aspect. For example, when studying reviews about mobile phones you may be interested in how people feel about aspects such as battery life, screen resolution, size, etc.
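
The following is a rough sketch of the aspect-level idea, under simplifying assumptions: aspects are a fixed, hypothetical set of keywords, reviews are split into sentences on punctuation, and each sentence is scored with a toy word lexicon. None of these names come from GraphLab Create; real aspect mining typically discovers aspects from the data and uses a trained sentiment model.

```python
import re
from collections import defaultdict

# Hypothetical lexicon and aspect keywords, for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "amazing"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "poor"}
ASPECTS = {"battery", "screen", "size"}


def sentence_score(sentence):
    """Count positive words minus negative words in one sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def aspect_sentiment(reviews):
    """Map each mentioned aspect to the mean score of its sentences."""
    scores = defaultdict(list)
    for review in reviews:
        # Naive sentence split on terminal punctuation.
        for sentence in re.split(r"[.!?]+", review):
            words = set(re.findall(r"[a-z']+", sentence.lower()))
            for aspect in ASPECTS & words:
                scores[aspect].append(sentence_score(sentence))
    return {aspect: sum(s) / len(s) for aspect, s in scores.items()}


reviews = [
    "The battery is great. The screen is terrible.",
    "I love the screen! The battery is good.",
]
summary = aspect_sentiment(reviews)
print(summary)
```

Scoring at the sentence level rather than the review level keeps a complaint about one aspect from lowering the estimate for another aspect mentioned in the same review.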

For these situations we provide a product_sentiment toolkit that makes it easy to explore and summarize sentiment about products within text data. The toolkit lets you search for aspects of interest and obtain summaries of the reviews or sentences with the most positive (or negative) predicted sentiment.