TWITTER SENTIMENT ANALYSIS
How professionals turn millions of seemingly normal tweets into meaningful information and predict the mood of the general public
Sentiment analysis has been used in data science predominantly to analyse customer feedback, such as product reviews. It helps make sense of user ratings for different kinds of products and for hospitality services such as travel and hotel bookings. Instead of having humans sit and read through every user-submitted review, it is easier to have code do that for you. It has also become popular to classify tweets as positive, negative or neutral by crawling Twitter through its APIs. The reasons are obvious: people who care about their public image want to know what users are saying about them and, most importantly, politicians now use sentiment analysis to gauge how their campaigns are performing.
For those of you who are new to this jargon, API stands for Application Programming Interface: a software interface that links two applications and lets them talk to each other. Each time you use an app like Facebook, send a message or check the weather on your phone, you're using an API.
We can broadly divide the analytics and machine learning into three stages:
1. Crawling and cleaning the data, then labelling the unstructured text by mapping it to known English words from various sources.
In this stage the code runs through the tweets available in the public domain and labels each one as positive, negative or neutral. The labelling is done using known English words.
2. Applying natural-language classifiers for text processing, trained on the tweets, to predict moods.
Here computing techniques return the best-matching predefined class for a short text input such as a sentence or phrase, so phrases expressed in natural language can be sorted into categories.
3. Applying standard machine learning algorithms and deep learning for multi-class mood classification.
This is usually the final step, where all the collected and labelled data is analysed by the chosen algorithm.
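To make the first stage more concrete, here is a minimal sketch of word-list labelling. The two word sets and the function names are hypothetical and chosen only for illustration; a real pipeline would load a much larger published lexicon and would likely use a more careful tokenizer.

```python
import re

# Hypothetical toy word lists; stand-ins for a real English sentiment lexicon.
POSITIVE_WORDS = {"good", "great", "love", "happy", "win"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "sad", "lose"}


def clean_tweet(text):
    """Lowercase a tweet, drop links and @mentions, and return its word tokens."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    text = text.replace("#", " ")              # keep the hashtag word, drop '#'
    return re.findall(r"[a-z']+", text)


def label_tweet(text):
    """Label a tweet by comparing positive and negative word counts."""
    tokens = clean_tweet(text)
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["I love this great rally! https://t.co/x",
                  "@user what a terrible, sad day",
                  "Voting opens at nine"]:
        print(label_tweet(tweet), "|", tweet)
```

Labels produced this way could then serve as training data for the classifiers in stages 2 and 3.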
Tweet Analytics
This part of the blog is organised into different areas of analytics, with a visual representation made for each comparison:
1. Frequency of different Moods
2. Sentiment representation by Word Cloud
3. N-gram model
4. Location-wise tweet distribution
5. Retweet frequency distribution
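As a rough illustration of the N-gram item in the list above, the sketch below counts the most common word sequences across a handful of tweets. The function names and the whitespace tokenization are assumptions made for this example, not the method used in the original analysis.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(tweets, n=2, k=3):
    """Count n-grams over all tweets and return the k most frequent."""
    counts = Counter()
    for tweet in tweets:
        counts.update(ngrams(tweet.lower().split(), n))
    return counts.most_common(k)


if __name__ == "__main__":
    sample = ["vote for change", "vote for us"]  # made-up tweets
    print(top_ngrams(sample, n=2, k=1))
```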
LET'S TAKE A LOOK AT A FEW OF THE METHODS MENTIONED ABOVE TO UNDERSTAND HOW THE DATA IS VISUALIZED
Frequency of different Moods
The mood frequencies show public reactions towards the two parties before the elections. The "Dominance" mood is the most frequent for both parties, followed by the "Joy" mood. Seaborn's countplot function (seaborn is a Python statistical plotting library, conventionally imported as sns) plots the total frequency of each individual mood, which helps to compare the different moods within a party as well as a specific mood across both parties. For instance, the graphs for party X and party Y show that party X received more tweets in total than party Y, and each mood correspondingly receives more tweets for X than for Y.
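The comparison described above can be sketched as follows. The sample rows, the "Sadness" mood and the function names are invented for illustration. The counting uses only the standard library, while the optional plotting helper assumes pandas, seaborn and matplotlib are installed.

```python
from collections import Counter

# Made-up (party, mood) labels; real input would come from the classifier.
labelled_tweets = [
    ("X", "Dominance"), ("X", "Dominance"), ("X", "Joy"), ("X", "Sadness"),
    ("Y", "Dominance"), ("Y", "Joy"),
]


def mood_counts(rows):
    """Return a dict mapping each party to a Counter of its mood labels."""
    counts = {}
    for party, mood in rows:
        counts.setdefault(party, Counter())[mood] += 1
    return counts


def plot_mood_counts(rows):
    """Draw one bar per mood, grouped by party, using seaborn's countplot."""
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns

    frame = pd.DataFrame(rows, columns=["party", "mood"])
    sns.countplot(data=frame, x="mood", hue="party")
    plt.show()


if __name__ == "__main__":
    for party, moods in mood_counts(labelled_tweets).items():
        print(party, dict(moods))
```

Calling `plot_mood_counts(labelled_tweets)` would draw the grouped bars, so each mood can be compared both within a party and between the two parties.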
Sentiment Representation by Word Cloud
A tag cloud, or word cloud, is a visual representation of text data, typically used to depict keywords on websites or to visualize free-form text. Tags are usually single words, and the importance of each tag is shown by its font size or colour.
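One simple way to turn importance into font size is to scale each word's frequency linearly between a minimum and a maximum size. The sketch below assumes that scheme. The function name and the default sizes are hypothetical, and dedicated tools such as the third-party `wordcloud` package handle layout and rendering as well.

```python
from collections import Counter


def font_sizes(text, min_size=10, max_size=48):
    """Map each word to a font size proportional to how often it appears."""
    counts = Counter(text.lower().split())
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest
    sizes = {}
    for word, count in counts.items():
        # With a single distinct frequency, every word gets the maximum size.
        scale = (count - lowest) / span if span else 1.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes


if __name__ == "__main__":
    print(font_sizes("vote vote vote change change hope"))
```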