Comparison of Methods for Calculating Sentiment

There are several ways to perform sentiment analysis (SA). For this experiment I compare the two most common methods: dictionary-based SA and natural language understanding (NLU) SA. A set of Tweets is fetched from the Twitter API, and a script calculates the sentiment of each Tweet with both methods and displays the results in the table and charts below.



Dictionary-based SA

Dictionary-based sentiment analysis uses a predefined dataset of words, each annotated with a sentiment score. The algorithm looks up the score for each word in a sentence and then combines those scores into an overall sentiment for the sentence. Words that are not in the dictionary contribute nothing to the result.
The word list used in this example is AFINN-111, which rates each word on an integer scale from -5 (very negative) to +5 (very positive). You can view the source code and documentation for the dictionary-based SA module here.
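The lookup-and-sum approach described above can be sketched in a few lines of Python. This is only a minimal illustration, not the module used on this page: the tiny LEXICON is a hand-written stand-in in the AFINN style (its scores are illustrative and not copied from AFINN-111), and the names tokenize and score_sentence are hypothetical.

```python
import re

# Hypothetical mini-lexicon in the AFINN style (integer scores, -5 to +5).
# The values are illustrative only; a real run would load the full word list.
LEXICON = {
    "good": 3, "great": 3, "love": 3,
    "bad": -3, "hate": -3, "terrible": -3,
}

def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def score_sentence(text, lexicon=LEXICON):
    """Return (total, comparative) sentiment for a piece of text.

    total       -- sum of the scores of every word found in the lexicon
    comparative -- total divided by the number of tokens, so that long
                   and short texts can be compared
    """
    tokens = tokenize(text)
    total = sum(lexicon.get(token, 0) for token in tokens)
    comparative = total / len(tokens) if tokens else 0.0
    return total, comparative

print(score_sentence("I love this great phone"))
print(score_sentence("What a terrible, bad idea"))
```

Dividing by the token count is one common way to normalise the score; it is an assumption here rather than something the page specifies. Note that a pure lookup like this ignores negation and sarcasm ("not good" still scores as positive), which is one reason the dictionary-based method can disagree with human judgement.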

Natural Language Understanding SA

Natural language understanding-based sentiment analysis is more complex and takes longer for each calculation. It also tends to be more costly. However, the results are usually considerably more accurate. It is also possible to extract more than just a positivity value for each string: you can drill down into much more detail about the attitudes conveyed.
For this example, the HP Haven OnDemand API has been used.



Data Sauce

The following examples use Twitter data relating to the Edward Snowden news story in 2013. A third column shows the sentiment assigned by a number of human respondents in a questionnaire, which acts as a benchmark for the two automated methods.
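One way to use the human benchmark is to measure how far each method's scores fall from the human scores. The sketch below is an assumption about how such a comparison could be done, not the calculation behind the charts on this page: it assumes all scores have already been put on the same scale, the function names are hypothetical, and the numbers in the usage lines are made-up placeholders rather than results from this experiment.

```python
def mean_absolute_error(predicted, benchmark):
    """Average absolute difference between a method's scores and the benchmark."""
    if len(predicted) != len(benchmark):
        raise ValueError("score lists must have the same length")
    if not predicted:
        raise ValueError("score lists must not be empty")
    return sum(abs(p - b) for p, b in zip(predicted, benchmark)) / len(predicted)

def sign_agreement(predicted, benchmark):
    """Fraction of items where the method and the benchmark agree on polarity
    (positive, negative, or neutral)."""
    if len(predicted) != len(benchmark):
        raise ValueError("score lists must have the same length")
    if not predicted:
        raise ValueError("score lists must not be empty")

    def sign(x):
        return (x > 0) - (x < 0)

    matches = sum(sign(p) == sign(b) for p, b in zip(predicted, benchmark))
    return matches / len(predicted)

# Placeholder values for illustration only (NOT data from this experiment).
human_scores = [0.5, -0.5, 0.0, 1.0]
method_scores = [0.5, 0.5, 0.0, 0.5]

print(mean_absolute_error(method_scores, human_scores))
print(sign_agreement(method_scores, human_scores))
```

A lower mean absolute error and a higher sign agreement both indicate that a method tracks the human judgement more closely.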
You can also enter your own search query below. Twitter data will be fetched for that query, and the sentiment of each Tweet will then be calculated.

Custom Data Sauce

Calculate