Sentiment Analysis of Anthem Game Launch in Python

The detect function from Mimino666's langdetect package is all we need to identify the language of our tweets.

We can load in the necessary function with from langdetect import detect. We’ll make a new column in our dataframe by mapping the detect() function over our text data and then keep only the tweets that are in English.

    df['lang'] = df['text'].map(lambda x: detect(x))
    df = df[df['lang']=='en']

When this step was complete I was left with just 77,740 tweets.
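One caveat with mapping detect() directly: langdetect raises an exception when it finds no usable features in a text (an empty string, or a tweet that is only a link or emoji), and that would stop the whole map. Here is a minimal sketch of a more forgiving filter. The helper names detect_language and keep_english are hypothetical, and the detector is passed in as an argument so the sketch runs even without langdetect installed.

```python
import pandas as pd


def detect_language(text, detector):
    """Return the detector's language code, or None if detection fails."""
    try:
        return detector(text)
    except Exception:
        # langdetect raises LangDetectException when it finds no usable features
        return None


def keep_english(df, detector, text_col="text"):
    """Add a 'lang' column and keep only the rows detected as English."""
    out = df.copy()
    out["lang"] = out[text_col].map(lambda t: detect_language(t, detector))
    return out[out["lang"] == "en"]
```

With langdetect installed this would be called as keep_english(df, detect). Detection is not deterministic by default; the langdetect README suggests setting DetectorFactory.seed = 0 if you want repeatable results.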

Now with that out of the way we can begin to run some text analysis on our tweets.

VADER Sentiment Analysis is a popular Python package for getting the sentiment of a piece of text. It’s particularly good for social media data and is ready to go out of the box! We need to import its SentimentIntensityAnalyzer and initialize it.

    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
    analyzer = SentimentIntensityAnalyzer()

VADER will return a dictionary of 4 scores for any text you pass it: positive, neutral, negative, and compound. The positive, neutral, and negative scores are proportions between 0 and 1, while the compound score ranges from -1 to 1.

We’ll be mostly interested in the compound scores for tracking the overall sentiment of a tweet.
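If you want a coarse label instead of a raw number, the VADER documentation suggests cutoffs of 0.05 and -0.05 on the compound score. I don't use labels in this analysis, so treat the following as an optional sketch; label_compound is a hypothetical helper and the score dictionary is made up to show the shape VADER returns.

```python
def label_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical score dictionary in the shape VADER returns
scores = {"neg": 0.0, "neu": 0.58, "pos": 0.42, "compound": 0.63}
print(label_compound(scores["compound"]))  # prints "positive"
```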

From here we make a new Series of data that contains the sentiment of our tweet’s text and concatenate it to our original dataframe.

    sentiment = df['text'].apply(lambda x: analyzer.polarity_scores(x))
    df = pd.concat([df, sentiment.apply(pd.Series)], axis=1)

Here is what our final dataframe looks like.

We can see our tweets are in English, and each has a set of sentiment scores associated with it.
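On a dataset this size, expanding a Series of dictionaries with apply(pd.Series) can be slow. One alternative, sketched below under the assumption that every row holds a dictionary with the same keys, is to build the score columns in a single DataFrame call. The helper name expand_scores and the sample rows are made up for illustration.

```python
import pandas as pd


def expand_scores(df, scores):
    """Join a Series of score dicts onto df as one column per key."""
    score_df = pd.DataFrame(scores.tolist(), index=scores.index)
    return pd.concat([df, score_df], axis=1)


# Made-up tweets and scores, only to show the resulting shape
tweets = pd.DataFrame({"text": ["love it", "servers down"]}, index=[10, 11])
scores = pd.Series(
    [
        {"neg": 0.0, "neu": 0.2, "pos": 0.8, "compound": 0.64},
        {"neg": 0.6, "neu": 0.4, "pos": 0.0, "compound": -0.3},
    ],
    index=tweets.index,
)
wide = expand_scores(tweets, scores)
print(list(wide.columns))
```

Passing the original index through keeps the scores aligned with their tweets even after the language filter has left gaps in the row labels.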

Now on to the analysis!

Analyzing Sentiment

First let’s just call df.describe() and get some basic information on our dataset.

We have 77,740 tweets that average 10 likes, 35 replies, and 2 retweets.

Looking at the compound score we can see on average tweets are positive, with a mean sentiment of .21.

Plotting this data will give us a better idea of what it looks like.

Before we plot I make a few changes to my dataframe for ease of use, sorting all the values by timestamp so they’re in order, copying the timestamp to the index to make graphing easier, and calculating an expanding and rolling mean for compound sentiment scores.
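As a quick illustration of how the two means differ before applying them to the tweets, here is a toy example with four made-up scores. The expanding mean averages everything seen so far, while a time-based rolling mean only looks back over a fixed window (6 hours here), which requires a sorted datetime index.

```python
import pandas as pd

# Four made-up compound scores on an irregular timeline
stamps = pd.to_datetime(
    ["2019-02-01 00:00", "2019-02-01 03:00", "2019-02-01 06:00", "2019-02-01 12:00"]
)
toy = pd.Series([1.0, 0.0, -1.0, 1.0], index=stamps)

# Expanding mean: average of every score up to and including the current one
expanding_mean = toy.expanding().mean()

# Rolling mean: average of the scores in the window (t - 6h, t]
rolling_mean = toy.rolling("6h").mean()

print(expanding_mean.tolist())  # [1.0, 0.5, 0.0, 0.25]
print(rolling_mean.tolist())    # [1.0, 0.5, -0.5, 1.0]
```

Notice how the rolling mean recovers by the last point while the expanding mean still carries the earlier scores. That is why the rolling line reacts faster to short events.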

    df.sort_values(by='timestamp', inplace=True)
    df.index = pd.to_datetime(df['timestamp'])
    df['mean'] = df['compound'].expanding().mean()
    df['rolling'] = df['compound'].rolling('6h').mean()

Now using matplotlib, with import matplotlib.pyplot as plt, we can create a quick chart of our tweets and their sentiment over time.

    # dt is the standard library datetime module (import datetime as dt)
    fig = plt.figure(figsize=(20,5))
    ax = fig.add_subplot(111)
    ax.scatter(df['timestamp'], df['compound'], label='Tweet Sentiment')
    ax.plot(df['timestamp'], df['rolling'], color='r', label='Rolling Mean')
    ax.plot(df['timestamp'], df['mean'], color='y', label='Expanding Mean')
    ax.set_xlim([dt.date(2019,1,15), dt.date(2019,2,21)])
    ax.set(title='Anthem Tweets over Time', xlabel='Date', ylabel='Sentiment')
    ax.legend(loc='best')
    fig.tight_layout()
    plt.show()

We can notice some interesting things right off the bat here.

There are a lot of tweets with a sentiment score of 0.

We have A LOT of data.

The mean seems somewhat stable across our data with the exception of the 25th where there was such an increase in negative tweets the expanding mean was heavily impacted.

There seem to be areas of higher density where more tweets are occurring.

We’ll see if we can line these up with events surrounding the game’s launch.

Let’s try to tackle things one at a time here.

First let’s look at those tweets with a sentiment of 0.

Seaborn’s distplot is a quick way to see the distribution of sentiment scores across our tweets.

    fig = plt.figure(figsize=(10,5))
    ax = fig.add_subplot(111)
    sns.distplot(df['compound'], bins=15, ax=ax)
    plt.show()

Just over 30% of our tweets have a sentiment of 0.

I chose to leave these tweets in my dataset for the time being.

But it is worth noting that if these were not included the average sentiment would be much higher.
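To put a rough number on that: dropping the zeros divides the mean by the share of tweets that remain, so with a mean of .21 and about 30% zeros the non-zero average would be around .21 / .70, or roughly .30. A small sketch of how you could check this on your own data follows; zero_score_summary is a hypothetical helper and the scores are made up.

```python
import pandas as pd


def zero_score_summary(compound):
    """Return the share of exactly-zero scores and the mean with and without them."""
    is_zero = compound == 0
    return {
        "share_zero": is_zero.mean(),
        "mean_all": compound.mean(),
        "mean_nonzero": compound[~is_zero].mean(),
    }


# Made-up scores: two of the five are exactly 0
toy = pd.Series([0.0, 0.0, 0.5, 0.25, -0.25])
print(zero_score_summary(toy))
```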

Let’s see if we can get a little bit clearer picture of our sentiment over time.

Overall our data is noisy; there is just too much of it.

Taking a sample of our data might make it easier to see the trends happening.

We’ll use the pandas sample() function to retain just a tenth of our 77,740 tweets.

    ot = df.sample(frac=.1, random_state=1111)
    ot.sort_index(inplace=True)
    ot['mean'] = ot['compound'].expanding().mean()
    ot['rolling'] = ot['compound'].rolling('6h').mean()

I re-sort by date and calculate a new expanding and rolling mean for our data and chart the new dataset.

By sampling the dataset we can get a much better idea of how sentiment is changing over time.

This graph is much better, allowing us to actually see some dips and trends in sentiment over time.

Now all that is left to do is figure out what is causing the changes in sentiment.

I mentioned at the start of this article some important notes about Anthem’s launch.

Let’s add some important dates to our chart and see if they line up with the trends in our data.

Anthem had a ‘free demo weekend’ February 1st to February 3rd.

Anthem went live for Origin Access Members on February 15th.

Anthem had server issues shortly after the February 15th launch, which were posted to its Twitter account at 7:30 AM. The issues were resolved and EA posted about the fix on Twitter at 11:06 AM.

EA announced a day-one patch on the 19th, released full patch notes on the 20th at 8:13 AM, and the patch went live that same day at 4:27 PM.

Adding some lines with .axvline() and .text() I ended up with this graph here.

These lines might not line up perfectly, as I’m not sure of the ‘official’ time of each release.

We can see that two large clusters of tweets coincide with the game’s launches, both the ‘demo weekend’ and the Origin Access launch.

Furthermore we can see that sentiment dropped during the demo weekend.

The average sentiment during the demo weekend was .138, compared to the same time period prior to the demo weekend where the average sentiment was .239.
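A comparison like this can be computed by averaging the compound score inside two date windows. The sketch below assumes the dataframe has a datetime index, as set up earlier; window_mean is a hypothetical helper, the scores are made up, and the exact window boundaries are assumptions chosen only to illustrate the idea.

```python
import pandas as pd


def window_mean(df, start, end, col="compound"):
    """Mean of `col` for rows whose DatetimeIndex falls in [start, end)."""
    idx = df.index
    mask = (idx >= pd.Timestamp(start)) & (idx < pd.Timestamp(end))
    return df.loc[mask, col].mean()


# Made-up scores indexed by timestamp, standing in for the tweet dataframe
toy = pd.DataFrame(
    {"compound": [0.4, 0.2, 0.1, 0.3]},
    index=pd.to_datetime(
        ["2019-01-30 12:00", "2019-01-31 18:00", "2019-02-01 09:00", "2019-02-02 21:00"]
    ),
)

before = window_mean(toy, "2019-01-29", "2019-02-01")  # assumed comparison window
during = window_mean(toy, "2019-02-01", "2019-02-04")  # demo weekend, Feb 1 through Feb 3
print(round(before, 3), round(during, 3))
```

Using a half-open window means a tweet stamped exactly at midnight on the boundary is counted in only one of the two periods.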

You can also quickly notice there is another cluster of tweets in late January that was unaccounted for.

A quick scroll through Twitter and I found out that this was actually a VIP Demo Weekend which also encountered server issues, long load times, and required multiple patch fixes.

This coincides with that significant dip in sentiment.

We’ll add that line to our chart as well and create some subplots allowing us to see in detail some of the individual events.

Here is the final code for the plots followed by the plots themselves.

    fig = plt.figure(figsize=(20,5))
    ax = fig.add_subplot(111)
    ax.scatter(ot['timestamp'], ot['compound'], label='Tweet Sentiment')
    ax.plot(ot['timestamp'], ot['rolling'], color='r', label='Rolling Mean')
    ax.plot(ot['timestamp'], ot['mean'], color='y', label='Expanding Mean')
    ax.set_xlim([dt.date(2019,1,15), dt.date(2019,2,21)])
    ax.set(title='Anthem Tweets over Time', xlabel='Date', ylabel='Sentiment')
    ax.legend(loc='best')

    # free demo weekend
    ax.axvline(x=dt.datetime(2019,2,1), linewidth=3, color='r')
    ax.text(x=dt.datetime(2019,2,1), y=0, s='Demo Weekend Starts', rotation=-90, size=10)
    ax.axvline(x=dt.datetime(2019,2,4), linewidth=3, color='r')
    ax.text(x=dt.datetime(2019,2,4), y=0, s='Demo Weekend Ends', rotation=-90, size=10)

    # origin access launch
    ax.axvline(x=dt.datetime(2019,2,15), linewidth=3, color='r', linestyle='dashed')
    ax.text(x=dt.datetime(2019,2,15), y=0, s='Origin Access Launch', rotation=-90, size=10)

    # server fix
    ax.axvline(x=dt.datetime(2019,2,15,11,6), linewidth=3, color='r')
    ax.text(x=dt.datetime(2019,2,15,11,6), y=0, s='Server Up', rotation=-90, size=10)

    # patch notes announced
    ax.axvline(x=dt.datetime(2019,2,19,12), linewidth=3, color='r')
    ax.text(x=dt.datetime(2019,2,19,12), y=0, s='Patch Notes Announced', rotation=-90, size=10)

    # patch notes released
    ax.axvline(x=dt.datetime(2019,2,20,8,13), linewidth=3, color='r')
    ax.text(x=dt.datetime(2019,2,20,8,13), y=0, s='Patch Notes Released', rotation=-90, size=10)

    # patch released
    ax.axvline(x=dt.datetime(2019,2,20,16,27), linewidth=3, color='r')
    ax.text(x=dt.datetime(2019,2,20,16,27), y=0, s='Patch Released', rotation=-90, size=10)

    # vip weekend
    ax.axvline(x=dt.datetime(2019,1,25,9,0), linewidth=3, color='r')
    ax.text(x=dt.datetime(2019,1,25,9,0), y=0, s='VIP Demo', rotation=-90, size=10)

    fig.tight_layout()
    plt.show()

Again, the lines might not be perfect as I’m unsure of the ‘official’ time of each launch.
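The repeated axvline/text pairs could also be driven from a list of events, which makes it easier to add or move a marker later. This is only a sketch of that idea: the EVENTS list and annotate_events helper are hypothetical names, and the timestamps are copied from the calls above.

```python
import datetime as dt

# (timestamp, label, linestyle) for each marker, copied from the plotting code
EVENTS = [
    (dt.datetime(2019, 1, 25, 9, 0), "VIP Demo", "solid"),
    (dt.datetime(2019, 2, 1), "Demo Weekend Starts", "solid"),
    (dt.datetime(2019, 2, 4), "Demo Weekend Ends", "solid"),
    (dt.datetime(2019, 2, 15), "Origin Access Launch", "dashed"),
    (dt.datetime(2019, 2, 15, 11, 6), "Server Up", "solid"),
    (dt.datetime(2019, 2, 19, 12), "Patch Notes Announced", "solid"),
    (dt.datetime(2019, 2, 20, 8, 13), "Patch Notes Released", "solid"),
    (dt.datetime(2019, 2, 20, 16, 27), "Patch Released", "solid"),
]


def annotate_events(ax, events=EVENTS):
    """Draw one vertical line and one rotated label per event on the given Axes."""
    for when, label, style in events:
        ax.axvline(x=when, linewidth=3, color="r", linestyle=style)
        ax.text(x=when, y=0, s=label, rotation=-90, size=10)
```

Calling annotate_events(ax) after the scatter and line plots would replace the block of individual calls.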

And here is the larger graph with the VIP Demo added.

Our final graphs show us some interesting things.

First the VIP demo had the most significant impact on sentiment.

Clearly individuals were upset with all of the issues surrounding the VIP demo.

The open demo weekend also showed a significant drop in sentiment.

Both of these are interesting cases in which a developer decides to allow the public to play a game before it is fully tested.

On one end the developers and publishers are getting valuable feedback on the game, server capacity, and bugs that need to be fixed.

The question is, does this come at the cost of sentiment surrounding the game?

Perhaps not! We can see that when the EA Access launch begins sentiment has returned to its original levels (although never as high as its pre-VIP demo sentiment).

Game developers and publishers need to weigh the value of having the public act as beta testers for early game launches vs. the perception of the game in the public’s eye.

Perhaps individual sentiment would have remained higher if the demo weekends were marketed as ‘beta’ weekends.

All in all this was a fun project, and the same analysis could be applied to all sorts of things: politics, movies, etc.

And now I think I’ll take a break from the statistics and go for a flight in my Javelin.
