Decoding ‘Game of Thrones’ by way of data science

Given the author’s tendency to kill off key characters at a steady rate, betweenness centrality looks particularly interesting (see chart 3 above).

Betweenness centrality gives us a measure of how difficult it is to replace an individual node, i.e. to kill a character, without significant impact on the rest of the network’s connectivity.

Formally, it counts how often a node lies on the shortest paths between other pairs of nodes.
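To make the metric concrete, here is a minimal sketch of betweenness centrality for an unweighted, undirected graph, using Brandes' algorithm in plain Python. The edge list is invented purely for illustration and is not the co-occurrence data behind the charts; the function names `build_graph` and `betweenness_centrality` are likewise assumptions of this sketch, not the tooling used for the article.

```python
from collections import deque


def build_graph(edges):
    """Turn a list of (a, b) pairs into an undirected adjacency mapping."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def betweenness_centrality(graph):
    """Raw (unnormalised) betweenness scores via Brandes' algorithm."""
    scores = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        order = []
        predecessors = {node: [] for node in graph}
        path_count = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            current = queue.popleft()
            order.append(current)
            for neighbour in graph[current]:
                if distance[neighbour] < 0:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[current] + 1:
                    path_count[neighbour] += path_count[current]
                    predecessors[neighbour].append(current)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {node: 0.0 for node in graph}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1.0 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    # Every undirected pair was counted once from each end.
    return {node: value / 2.0 for node, value in scores.items()}


# Invented toy edges, for illustration only (not the article's data).
edges = [
    ("Sam", "Jon"),
    ("Jon", "Tyrion"),
    ("Tyrion", "Jaime"),
    ("Jon", "Arya"),
    ("Arya", "Sansa"),
]
scores = betweenness_centrality(build_graph(edges))
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score}")
```

In this toy network Jon sits on eight shortest paths between other pairs, Tyrion and Arya on four each, and the characters at the ends of a chain on none, which is the kind of ranking the metric is meant to surface.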

With these scores in mind we will take a close look at the main characters.

The main characters are defined as those with more than five ‘point-of-view’ chapters (red nodes).

In addition, we add characters that have significant betweenness centrality but are not POV characters (grey nodes).

Betweenness centrality is also reflected in the size of each character’s circle in the diagram.

As before, the width of the lines is proportional to the number of co-occurrences between the characters:

[Figure: Connecting the main characters of ‘A Song of Ice and Fire’ using betweenness centrality (circular visualization)]

[Figure: Same data using an alternative graph layout. Who will survive and control the Iron Throne?]

The graph helps us get an intuitive understanding of the relative importance of our main characters.

All in all, it is very clear that, from a network theory perspective, Jon is the most important character and would indeed be hard to kill off while keeping the story connected and coherent.

Jon and Sam have the strongest pairwise connection, with no fewer than 262 co-occurrences.
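As a rough sketch of how such pairwise counts can be produced, the snippet below tallies how often two names appear in the same passage. This section does not spell out how a co-occurrence is defined, so "same passage" is an assumption here, and the passages and the helper name `count_cooccurrences` are made up for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def count_cooccurrences(passages, names):
    """Count, for each pair of names, the passages that mention both."""
    pair_counts = Counter()
    for passage in passages:
        words = set(re.findall(r"[A-Za-z']+", passage))
        # Sort so that each pair always has the same key order.
        present = sorted(name for name in names if name in words)
        for pair in combinations(present, 2):
            pair_counts[pair] += 1
    return pair_counts


# Invented passages, for illustration only (not text from the books).
passages = [
    "Jon and Sam kept watch on the Wall.",
    "Sam read while Jon trained.",
    "Tyrion visited the Wall and spoke with Jon.",
    "Arya practised alone.",
]
names = {"Jon", "Sam", "Tyrion", "Arya"}
counts = count_cooccurrences(passages, names)
for pair, count in counts.most_common():
    print(pair, count)
```

The resulting counts can serve directly as edge weights, which is what the line widths in the diagrams represent.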

But there is a vast difference between them in terms of centrality, with Sam being more of an edge node compared to his super-connected friend.

The other Stark siblings tend to hang out a lot, but primarily brother with brother (Bran and Robb) and sister with sister (Sansa and Arya).

It is also clear that Tyrion is a very important part of the story, connecting House Stark with the Lannisters (and later, as we shall see, also with House Targaryen).

The five books leave Daenerys on the edge, at some distance from the most central connections, relying on additional ‘bridge’ characters to connect her to Westeros.

I guess if you have dragons you don’t need so many human friends.

Conclusions

Thanks to the large number of invented words and names, Martin displays a lexical richness similar to Shakespeare at face value.

However, comparative studies need to take in the larger context and ideally compare texts that are similar in terms of volume, genre and language morphology.
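The point about volume can be illustrated with a small sketch. It assumes the type-token ratio (distinct words divided by total words) as the measure of lexical richness, which may differ from the measure used in the analysis, and the two sample strings are invented. Because the ratio tends to fall as a text grows, the sketch truncates both texts to the same number of tokens before comparing them.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens):
    """Distinct words divided by total words (0.0 for empty input)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def ttr_at_equal_length(text_a, text_b):
    """Compare two texts on their first n tokens, n being the shorter length."""
    tokens_a, tokens_b = tokenize(text_a), tokenize(text_b)
    n = min(len(tokens_a), len(tokens_b))
    return type_token_ratio(tokens_a[:n]), type_token_ratio(tokens_b[:n])


# Invented samples: the long text simply repeats the short one and adds a word.
short_text = "the wolf and the lion"
long_text = "the wolf and the lion and the wolf and the lion and the stag"

naive = (type_token_ratio(tokenize(short_text)),
         type_token_ratio(tokenize(long_text)))
fair = ttr_at_equal_length(short_text, long_text)
print("naive comparison:", naive)
print("equal-length comparison:", fair)
```

Compared naively, the longer sample looks poorer simply because it is longer; compared at equal length, the two come out the same.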

Natural language processing tools such as part-of-speech tagging and lemmatization enable more valid and accurate computations.

Visualisation and analysis of word frequencies across the text provide us with insights into how the narrative is structured.

We have seen how death, blood and love develop through the story and concluded that much of it rests on who is related to whom.

Perhaps not surprisingly, given that Mr Martin is said to be greatly inspired by the Wars of the Roses, a dynastic struggle for control of the throne of England in the 15th century.

Network theory clearly identifies Jon Snow as the most important character, followed by Tyrion and Jaime.

From a network theory perspective, Daenerys lives on the edge.

The key concept is to calculate the centrality of the characters in terms of their role in the entire network of characters.

The metric of ‘betweenness centrality’ guides us to who would be most difficult to kill without disturbing too many interrelations.
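A cruder but related check is to remove one character at a time and count how many disconnected pieces the network falls into. The sketch below does this on an invented toy graph (not the data from the books); `count_components` is a hypothetical helper name. It only detects outright disconnection, whereas betweenness also registers detours that merely get longer.

```python
def count_components(graph, removed=None):
    """Count connected components, optionally ignoring one removed node."""
    seen = set() if removed is None else {removed}
    components = 0
    for start in graph:
        if start in seen:
            continue
        components += 1
        # Depth-first search over everything reachable from start.
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
    return components


# Invented toy network, for illustration only.
toy_graph = {
    "Sam": {"Jon"},
    "Jon": {"Sam", "Tyrion", "Arya"},
    "Tyrion": {"Jon", "Jaime"},
    "Jaime": {"Tyrion"},
    "Arya": {"Jon", "Sansa"},
    "Sansa": {"Arya"},
}
fragments = {name: count_components(toy_graph, removed=name) for name in toy_graph}
for name, pieces in sorted(fragments.items(), key=lambda item: -item[1]):
    print(f"without {name}: {pieces} piece(s)")
```

In this toy example, removing the most central character splits the network into three pieces, while removing a character on the edge leaves it whole.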

We have been able to quantify many of the different aspects of the books.

This might make us feel a little more certain of the future development of the story.

But it might be precisely this certainty that the author wants to lure us into, just so that he can surprise us even more.

This is probably especially true as we draw near the end of the saga.

Long live Tyrion!

Better odds than meets the eye?

Which part did you find most interesting in this numerical exploration?

Do you think it is possible to calculate the probability of certain events in fiction given a large enough dataset?

Leave a comment below with your reflections!

For the upcoming article we will turn to artificial intelligence and self-learning systems to see how we can achieve a machine understanding of texts written by humans.

Will it be possible in the future to have machine learning models generating text indistinguishable from that written by humans?

Stay tuned for more!
