TWITTER SENTIMENT ANALYSIS


How professionals turn millions of seemingly ordinary tweets into meaningful information and predict the mood of the general public

Sentiment analysis is widely used in data science to analyse customer feedback and product reviews. It helps businesses understand user ratings for many kinds of products and for hospitality services such as travel and hotel bookings. Instead of having people read through every user-submitted review, it is easier to have code do it for you. It has also become popular to classify tweets as positive, negative or neutral by collecting them from Twitter through its APIs. The reasons are easy to see: people who care about their public image want to know what users are saying about them and, most importantly, politicians in recent times use sentiment analysis to gauge how their campaigns are performing.
For those unfamiliar with the jargon, API stands for Application Programming Interface: a software interface that links two applications and lets them talk to each other. Each time you use an app like Facebook, send a message or check the weather on your phone, you're using an API.

We can broadly divide the analytics and machine learning into three sections:
1.   Crawling and cleaning the data, then labelling the unstructured text by mapping it against known English words from various sources.
In this section the code collects the publicly available tweets and labels each one as positive, negative or neutral. The labelling is done using known English words.
2.   Applying natural-language classifiers for text processing, trained on the labelled tweets, to predict moods.
Here a classifier returns the best-matching predefined class for a short text input such as a sentence or phrase. It sorts phrases expressed in natural language into categories.
3.   Applying standard machine learning algorithms and deep learning to do multi-class mood classification.
This is usually the final step, where the collected data is analysed by the chosen algorithm.
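To make the first two sections concrete, here is a minimal sketch in plain Python. It is not the code behind this post: the word lists, the sample tweets and the function names are all made up for illustration, and a small Naive Bayes classifier stands in for whichever natural-language classifier a real project would use.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, tiny word lists; a real project would load a full sentiment lexicon.
POSITIVE_WORDS = {"good", "great", "love", "win", "happy"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "lose", "angry"}


def clean_tweet(text):
    """Lowercase a tweet, drop URLs and @mentions, and return its words."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def label_tweet(text):
    """Section 1: label a tweet by counting known positive and negative words."""
    words = clean_tweet(text)
    positives = sum(w in POSITIVE_WORDS for w in words)
    negatives = sum(w in NEGATIVE_WORDS for w in words)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"


def train_naive_bayes(labelled_tweets):
    """Section 2: collect per-label word counts for a multinomial Naive Bayes model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labelled_tweets:
        label_counts[label] += 1
        word_counts[label].update(clean_tweet(text))
    return word_counts, label_counts


def predict(text, word_counts, label_counts):
    """Return the label with the highest log-probability, using add-one smoothing."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    vocab_size = len(vocabulary) + 1  # +1 keeps some probability for unseen words
    total_tweets = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_tweets in label_counts.items():
        score = math.log(n_tweets / total_tweets)
        total_words = sum(word_counts[label].values())
        for word in clean_tweet(text):
            score += math.log((word_counts[label][word] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up sample tweets, labelled with the word lists and then used for training.
tweets = [
    "I love this campaign, great rally!",
    "Terrible speech, I hate it",
    "Rally starts at 5pm",
]
labelled = [(t, label_tweet(t)) for t in tweets]
model = train_naive_bayes(labelled)
print(predict("great campaign", *model))
```

The point of training a classifier on top of the word-list labels is that it can pick up words that are not in the lists but tend to appear alongside them. With only three tweets this is a toy; a real pipeline would train on thousands of labelled tweets.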
Tweet Analytics
In this part, the blog is organised into different areas of analytics, with a visual representation for each comparison:
1.   Frequency of different Moods
2.   Sentiment representation by Word Cloud
3.   N-gram model
4.   Location-wise tweet distribution
5.   Retweet frequency distribution
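As a rough illustration of the N-gram item in the list above, the sketch below counts which word sequences occur most often across a set of tweets. The helper names and the sample tweets are hypothetical, and splitting on whitespace is a simplification of real tokenisation.

```python
from collections import Counter


def ngrams(words, n):
    """Return every run of n consecutive words as a tuple."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def top_ngrams(tweets, n, k=3):
    """Count n-grams across all tweets and return the k most frequent."""
    counts = Counter()
    for tweet in tweets:
        counts.update(ngrams(tweet.lower().split(), n))
    return counts.most_common(k)


# Made-up sample tweets.
sample = ["Vote for change", "vote for progress", "time for change"]
print(top_ngrams(sample, 2))
```

With n set to 2 the function counts pairs of neighbouring words (bigrams), and with n set to 3 it counts triples (trigrams). The most frequent ones give a quick picture of the phrases people repeat.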
LET'S TAKE A LOOK AT A FEW METHODS MENTIONED ABOVE TO UNDERSTAND HOW THE DATA IS VISUALIZED
Frequency of different Moods
The mood frequencies show public reactions towards both parties before the elections. The "Dominance" mood leads for both parties, followed by "Joy". The countplot function from seaborn (a Python statistical plotting library, conventionally imported as sns) plots the total frequency of each individual mood, which makes it possible to compare different moods within a party as well as a specific mood across both parties. For instance, in the graphs for party X and party Y, the total number of tweets received for X is higher than for Y, and consequently each corresponding mood receives more tweets for X than for Y.
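The numbers behind such a plot are simple counts. The sketch below computes them in plain Python from made-up (party, mood) pairs; the "Anger" mood and all the figures are invented for illustration. The optional plot_mood_counts function assumes pandas, seaborn and matplotlib are installed and draws the same counts with sns.countplot.

```python
from collections import Counter

# Hypothetical (party, mood) pairs; real rows would come from the labelled tweets.
rows = [
    ("X", "Dominance"), ("X", "Dominance"), ("X", "Joy"), ("X", "Joy"), ("X", "Anger"),
    ("Y", "Dominance"), ("Y", "Joy"),
]


def mood_counts(rows):
    """Count tweets per (party, mood) pair, the values a count plot displays."""
    return Counter(rows)


def mood_shares(rows, party):
    """Return each mood's share of one party's tweets, for within-party comparison."""
    moods = Counter(mood for p, mood in rows if p == party)
    total = sum(moods.values())
    return {mood: count / total for mood, count in moods.items()}


def plot_mood_counts(rows):
    """Optional: draw the counts as grouped bars (needs pandas, seaborn, matplotlib)."""
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns

    df = pd.DataFrame(rows, columns=["party", "mood"])
    sns.countplot(data=df, x="mood", hue="party")
    plt.show()


print(mood_counts(rows))
print(mood_shares(rows, "Y"))
```

Raw counts favour whichever party has more tweets overall, so it can help to look at the shares as well: they show how a party's tweets are split between moods regardless of volume.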

Sentiment Representation by WordCloud
A tag cloud or word cloud is a visual representation of text data, typically used to depict keyword tags on websites or to visualize free-form text. Tags are usually single words, and the importance of each tag is shown by font size or colour.
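The idea of mapping importance to font size can be sketched in a few lines. The function below is only an illustration, with hypothetical names and default sizes: it counts how often each word appears and scales the counts linearly between a smallest and a largest font size. Drawing the cloud itself is left to a plotting library.

```python
from collections import Counter


def font_sizes(text, min_size=10, max_size=48, top=5):
    """Map the most frequent words to font sizes, scaled linearly by word count."""
    words = [w for w in text.lower().split() if w.isalpha()]
    counts = Counter(words).most_common(top)
    if not counts:
        return {}
    highest = counts[0][1]
    lowest = counts[-1][1]
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts match
    return {
        word: round(min_size + (count - lowest) / span * (max_size - min_size))
        for word, count in counts
    }


print(font_sizes("vote vote vote change change hope"))
```

In practice, common filler words such as "the" and "and" would be removed first, otherwise they would dominate the cloud.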























