Linear regression and lines of succession

How Computers Think: Part Two

Simon Carryer, Feb 8

Since William the Conqueror was crowned on Christmas Day in 1066, England and later Great Britain has had close to 50 claimants to the throne, who, with varying degrees of success, attempted to stamp their name on the history of the country and her people.

Of these, just 36 Kings and 5 Queens are considered to have ruled as monarchs with any legitimacy.

But these reigns have been of immensely varied lengths.

Boy-King Edward V achieved just 86 days of rule until his sudden and convenient disappearance cleared the way for his uncle Richard to ascend to the throne.

Our current Queen’s 65-year reign (and counting) is, as of September 2015, the longest of any British or English monarch.

[Image caption: Those are rookie numbers, Edward]

Imagine if, as each monarch ascended the throne, we could gaze into a crystal ball, and learn how many years of rule they would be granted.

Imagine if we could, at the point of their accession, make a guess as to when their reign would end.

As it happens, with the right data, and within certain broad limits of accuracy, we can.

The way we can do this is with an algorithm that underpins much of modern-day artificial intelligence, and one that may be familiar to anyone who (unlike me) paid attention in high-school statistics.

The algorithm is called linear regression.

The world is full of numbers, and those numbers often have relationships to each other.

Taller people often weigh more.

Larger houses often sell for more money.

Cyclists with a lower resting heart rate have faster lap times.

Sometimes these relationships are simple, such as the link between how much money you have and how many cans of beer you can buy.

Sometimes they are complex and mysterious, like the relationship between how much beer you drink and how terrible you feel the next day.

Linear regression is a tool for understanding those relationships.

From a set of prior examples, it derives a numerical ratio that best fits the observed numbers.

If you felt fine after one beer, a little iffy the morning after three, and had a raging headache after six, then you can extrapolate from that to conclude that last night’s 12-can bender is going to have grim consequences tomorrow.

Linear regression is used in artificial intelligence any time a prediction is made about a numeric value: The likely value of a piece of real estate, how many points one baseball player is likely to score compared to another, the outlook for a company’s share price — all of these can be predicted with a linear regression.

By looking at prior examples, say, a set of historic house sales, or scores from dozens of games, we can derive rules that, if they hold true, help us predict how things will go in the future.

Regression is also used in research that seeks to, for example, understand the relationship between smoking and lung cancer, or to estimate the effect of a drug in treating a disease.

The impact of multiple factors can be measured and teased apart, to derive complex understandings about the world.

We can also use linear regression to make a (rough) prediction on how long a British monarch will remain on the throne.

But first we’ll need to choose some variables (some known facts about past monarchs) which might help us make a prediction.

British monarchs, with a few exceptions, hold their position for life.

And barring the headsman’s axe or the exigencies of war (or the fate of Henry I, who reportedly died from consuming “a surfeit of lampreys”), a chief predictor of a monarch’s remaining years in power must be their age at the time they ascended the throne.

Therefore, let’s choose “age at ascension” as the first variable to look at with our linear regression algorithm.

Let’s try to understand the relationship between the age at which a monarch takes the throne, and the length of time they stay on it.

[Image caption: The face of a man who is hungry for lampreys]

To understand and visualise this relationship, we can create a scatter-plot diagram.

We make a chart which has “Age at ascension” along its horizontal axis, and “Length of reign” on its vertical axis.

For each monarch, from William I to Elizabeth II, we put a mark on the chart.

Henry VI, who acceded to the throne at nine months of age after his father shat himself to death from dysentery, goes at the far left of the chart.

With 39 years in power, his mark is made about two-thirds of the way up the y-axis.

By plotting each of the 40-odd monarchs in this way, we can produce the following chart:

Fig. 1: Relationship between age at ascension and length of reign

In this chart we see the grinning face of the reaper of souls.

Though they rule wisely or foolishly, no monarch may live far beyond their threescore years and ten.

A reign may be sadly cut short, as with the aforementioned Edward V, whose 86-day reign ended when, at the age of 12, he was confined to the Tower of London and then (historians mostly agree) secretly murdered by his perfidious uncle Richard.

But a reign may never last beyond the natural lifespan of the ruler.

[Image caption: Never trust a man with a pinkie ring]

We can see this in the chart.

Though we see monarchs (such as poor Edward V) who fall into the lower left quadrant of the chart — their reign was short despite their youth — we see none in the top right — rulers who enjoyed a long reign despite taking the throne at an advanced age.

It is this fact — this relationship — which linear regression is perfectly suited to discover.

This relationship between age at ascension and the length of a monarch’s reign can be given a numeric value, and this is shown by the red line on the chart.

This relationship is derived by the linear regression algorithm.

The algorithm takes a set of examples (in this case, each of our monarchs) and finds the value to place on each year of age at ascension that best fits the lengths of reigns in the examples we’ve shown it.

It draws a line through the chart that has as many dots as possible as close as possible to the line.

The line on the chart starts right by Henry VI, who was just nine months old when he took the throne, and stayed there 39 years, not counting several years spent variously hiding out in Scotland, imprisoned in the Tower of London, and in a state of catatonic insanity.

The line ends very close to William IV, who waited until the age of 64 to take the throne.

William lived through most of the 59-year reign of his mad father, and then his older brother’s 10 years in power, before he acceded to the throne, a position he would hold for only 7 years.

In between, the line passes many other monarchs with varied ages and tenures.

It doesn’t pass so close to all of them, but its course is calculated to be the best possible fit to the observed data.

Its slope and height are fitted to the data so that the sum of the squared vertical distances between the line and the data points is as small as possible.
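To make the "best possible fit" concrete, here is a minimal sketch of a one-variable least-squares fit in plain Python. It is not the code behind the charts in this essay. The function name `fit_line` is made up for illustration, and the five monarchs are a hand-picked subset (the accession ages for Victoria and George III come from general knowledge, not from the text above), so the numbers it produces will differ from the ones quoted in this article.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature.

    Returns (intercept, slope) of the line that minimises the sum of
    squared vertical distances to the points.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# (age at ascension, years reigned): an illustrative subset only,
# NOT the full dataset used for the charts.
sample = [
    (0.75, 39),   # Henry VI
    (12, 0.24),   # Edward V (86 days)
    (18, 63),     # Victoria
    (22, 59),     # George III
    (64, 7),      # William IV
]
ages = [age for age, _ in sample]
reigns = [reign for _, reign in sample]

intercept, slope = fit_line(ages, reigns)
print(f"reign = {intercept:.1f} years + ({slope:.2f} x age)")
```

Even on this tiny sample the slope comes out negative: the older the monarch at accession, the shorter the expected reign.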

[Image caption: I hope at 64 I still look this good in tights]

That slope gives us the following rule: A monarch can expect to reign for just over 40 years, less a little more than six months for every year of age at the time they ascend the throne.

Here we have a very simple model for predicting the length of a monarch’s reign: “Y = β₀ + β₁X”, or “Length of Reign = 40 years − (Age × six months)”. Here β₀ is the starting value of 40 years, and β₁ is the (negative) change in reign for each year of age. Simple!

But such mathematical precision can conceal some fundamental misconceptions in our model.
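As a quick check of that rule before looking at its problems, here is a small sketch that applies it to the two monarchs at either end of the line. The coefficients are the rounded ones quoted above (40 years, minus half a year per year of age), and the function name `predict_reign` is my own label.

```python
def predict_reign(age_at_ascension):
    """Apply the one-variable rule: 40 years, less six months per year of age."""
    intercept_years = 40.0             # rounded value quoted in the text
    years_lost_per_year_of_age = 0.5   # "six months" per year of age
    return intercept_years - years_lost_per_year_of_age * age_at_ascension


print(predict_reign(0.75))  # Henry VI, nine months old at accession
print(predict_reign(64))    # William IV, 64 at accession
```

This prints 39.625 for Henry VI (he managed 39 years) and 8.0 for William IV (he managed 7), which is why the line appears to start and end so close to those two marks.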

For a start, it seems odd that each year of age accounts for only six months taken from a monarch’s reign.

All other things being equal, one year before taking the throne should mean one less year after taking the throne.

Second, when we look at the chart, we see a large number of monarchs who fall into the top left of the chart, well above the red line, and another group who fall well below the line, in the bottom middle of our chart.

Our line is drawn too shallow!

What has happened is that our few young rulers who died before their time have pulled the whole line downwards on the left.

Is it possible that, while in general taking the throne younger leads to a longer reign, some other factor means that those who take the throne too young are destined to die before their time?

As it happens, this is perfectly plausible! Of the six monarchs who acceded to the throne at or below the age of fourteen, two of them, Henries III and VI, succeeded fathers killed in war, and three of them, Edwards III, V, and VI, were somewhat dubious claimants to the throne.

Furthermore, young rulers are subject to rebellious regents and scheming relatives who may exploit the new ruler’s youth.

What if we could adjust for this effect in our analysis?

Linear regression handles exactly this scenario.

For each of the monarchs, we give the algorithm two values: the age at ascension, and another value that is set to one if the monarch was under 14 at the time they took the throne, and zero otherwise.

The algorithm again tries to find the best-fitting line, but it has two values it can adjust in order to find that line — the value of a year of age at ascension, and the value of being younger than fourteen on the day of ascension.

We produce a new rule: A monarch will rule for 50 years, less 13 years if they are under 14 when they take the throne, and then less nine months for each year of age.
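The two-value rule can be sketched the same way. This is only the rounded rule stated above turned into a function, not the fitted model itself, and `predict_reign_with_youth_flag` is a hypothetical name.

```python
def predict_reign_with_youth_flag(age_at_ascension):
    """50 years, less 13 if under 14 at accession, less nine months per year of age."""
    under_14 = 1 if age_at_ascension < 14 else 0  # the extra yes/no input
    return 50.0 - 13.0 * under_14 - 0.75 * age_at_ascension


print(predict_reign_with_youth_flag(12))  # Edward V, aged 12 at accession
print(predict_reign_with_youth_flag(64))  # William IV
```

For Edward V this gives 28.0 years, against an actual reign of 86 days, so the youth adjustment helps the line overall without rescuing every individual prediction.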

We can visualise this rule with the following chart. We see the same values as before, except that the lengths of the reigns of the five young rulers have been adjusted to account for their youth (we have given them an extra 13 years), and the red line, the indicator of our prediction for the length of reign, now takes a steeper slope.

Fig. 2: Age and length of reign, adjusted for accession under age fourteen

The line now fits the data points more closely.

Our model is better at predicting the length of a monarch’s reign, but it’s still pretty terrible.

There remain a large number of monarchs whose marks on the chart are a long way away from the red line — our predictions for them are very poor.

For example, even with the extra 13 years the adjustment affords him, the line passes over twenty years above poor Edward V’s score.

His successor (and suspected killer), Richard III, also receives an over-generous estimate from the model.

The line gives him 25 years, whereas in fact he only managed three years, before he paid for his crimes with his life at the battle of Bosworth Field.

Meanwhile, at the other end of the spectrum, the marks for George III, Victoria, and Elizabeth II are all over thirty years above where the line would place them.

We need to look for more variables, and fortunately there are many likely candidates.

Our model is missing any data about one of the major causes of a monarch’s reign ending — the vicissitudes of war, and the deadly politics of the British Middle-Ages.

For much of British history, a monarch was more likely to meet their end through the machinations of their enemies than from the implacable hand of Father Time.

But how to predict the outbreak of war? How can we guess, at the start of a monarch’s reign, whether their rule will be contested?

One way is to look at the method by which they took the throne: by legitimate succession, or by usurpation.

Cheats, it seems, do not prosper.

Those rulers who sidestepped the usual rules of succession reigned for, on average, less than half as long as those who inherited the throne uncontested.

“Bloody” Mary I, who raised an army to depose her cousin Lady Jane Grey, enjoyed only three years of burning Protestants at the stake before her death.

But Richard I, called “The Lionheart”, is perhaps the best example of this “live by the sword, die by the sword” effect.

Usurping his own brothers, and making war against his father, Richard won the throne in a revolt, but died ten years later, defending his place on the throne from a revolt against his own rule.

We should adjust the scores for those rulers who took the throne in disputed circumstances.

[Image caption: Richard: The line art]

But even legitimate rulers can fall to warfare, or its accompanying diseases.

The medieval world was dangerous, and prone to outbreaks of war, famine, and disease.

And especially in the primarily agrarian economy of medieval Britain, there’s a factor that can help predict those outbreaks: The weather.

In this age the economy was almost entirely dependent on agriculture, and agriculture was almost entirely dependent on good weather.

So if the weather is colder than average, we might expect to see more diseases, more famines, more wars, and consequently, shorter-than-expected reigns.

Fig. 3: Correlation between error in age/reign model and average temperature

The above chart shows a point (in red) for each monarch, plotted by the year they took the throne, and how many more or fewer years they ruled than we would expect, given their age at the time they took the throne.

The red shaded area shows a moving average of this value.

The blue line represents an estimate of the average temperature for each 50-year period.

The temperature is high through the late 1100s (the “medieval warm period”), but falls from the mid 1300s to a low around the late 1500s to the early 1700s. This was the “Little Ice Age”, which coincided with a period of instability and revolution in Britain, including the “Glorious Revolution” which unseated James II.

The chart shows that there’s some amount of correlation between the temperature and the error in our model — they move in the same direction.
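One common way to put a number on "they move in the same direction" is the Pearson correlation coefficient, which runs from -1 (perfectly opposed) to +1 (perfectly in step). The sketch below is an assumption about how one might measure it, not a description of how the chart was made, and the two short lists are made-up placeholder values, not the temperature or error data behind Fig. 3.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# HYPOTHETICAL values for illustration only (not the article's data):
# temperature relative to the long-run mean, and model error in years.
temperature_anomaly = [-0.4, -0.1, 0.0, 0.2, 0.5]
model_error_years = [-12, -3, 1, 4, 15]

print(pearson(temperature_anomaly, model_error_years))
```

A value close to +1 would mean warm periods line up with longer-than-predicted reigns. It would not, on its own, show that one causes the other.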

[Image caption: James II low-key repping Westside]

Our model now has five input features: Age at ascension, an adjustment for whether the monarch was under 14, the number of years since 1066, the average temperature, and finally, an adjustment for whether the monarch was the legitimate heir to the throne.

It is tempting now to keep searching for more features, to find the monarchs for which the model still produces poor predictions, and to find some input feature which explains why their reign was shorter or longer than expected.

But we should be careful.

Take, for example, Edward VIII.

Edward VIII took the throne at the relatively young age of 42, and given the political stability of the age, our model forecasts his reign lasting into his sixties.

But Edward’s reign lasted less than a year.

Edward was something of a ladies’ man (as well as being a likely Nazi-sympathiser), and shortly into his reign announced his intention to marry Wallis Simpson, an American multiple-divorcee with whom he had been conducting a secret affair for several years.

This was considered constitutionally unacceptable, and Edward abdicated the throne.

[Image caption: What a dick]

How could we predict this? We might include some feature in our dataset, like “Engaged to an American” or “Number of Previous Affairs”.

This might allow us to better predict the true length of Edward’s reign, but it would also expose us to a risk.

The next monarch that takes the throne after a string of affairs with married women, or who intends to marry a multiple-divorcee, might do so in a more enlightened age when these indiscretions don’t stand between them and a long tenure on the throne.

We would predict far too short a reign for this hypothetical philandering monarch.

Features that apply to only a small set of our monarchs risk failing to generalise to the wider population.

The associations we find for them hold true in the narrow set of data we have observed, but those observations don’t hold true outside of that set.

For that reason, we should be content with the set of five features we have.

The final rules the model came up with were as follows:

38 years
-16 years if under 14
-1 year per year of age
+7 days per year since 1066
+/-15 years for every degree of temperature above or below the mean
+10 years if legitimate ruler

The model we end up with is still fairly poor, as we can see from this chart plotting predicted reigns vs actual.

For this chart, the model was given data for all of the monarchs except for a random selection of nine.

Then predictions were generated for the nine holdouts.

This way we can see how well our model generalises.

Do rules we’ve learned from one set of monarchs apply well to a different set?

I plotted the predicted length of the monarchs’ reigns on the y-axis, and how long they actually spent on the throne on the x-axis.

Perfectly accurate predictions on this chart would look like a straight diagonal line of dots — the prediction perfectly matching the true value.

The actual accuracy is substantially worse.
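The holdout check described above can be sketched in a few lines. This is an illustration of the general technique rather than the code used for the chart: the function names, the fixed random seed, and the use of mean absolute error as the score are all my own choices here, and the "monarchs" are just placeholder row numbers.

```python
import random


def holdout_split(rows, n_holdout, seed=0):
    """Randomly set aside n_holdout rows for testing; train on the rest."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    held_out = set(rng.sample(range(len(rows)), n_holdout))
    train = [row for i, row in enumerate(rows) if i not in held_out]
    test = [row for i, row in enumerate(rows) if i in held_out]
    return train, test


def mean_absolute_error(predicted, actual):
    """Average size of the miss, in the same units as the inputs (years)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Placeholder rows standing in for the 41 recognised monarchs.
rows = list(range(41))
train, test = holdout_split(rows, n_holdout=9)
print(len(train), len(test))  # 32 used to fit the model, 9 held back
```

The model would be fitted on `train` only, and then scored by comparing its predictions for `test` against the real reigns, for example with `mean_absolute_error`.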

Fig. 4: Predicted length of reign vs actual

The model is good in the sense that it very rarely predicts a very short reign for a monarch who in fact reigned for a long time, but it often predicts long reigns for monarchs who in fact achieved only a short time on the throne.

For example, it still gives poor Edward V 25 years on the throne.

In addition, it’s underestimating the length of many reigns: Elizabeth I is predicted to have 27 years, while she in fact achieved 45, and George IV managed 11 years, not the three the model gives him.

It’s a strong model for Henries though.

It gets Henry VII spot on, at 24 years, and Henry II only ruled for two years beyond the 33 the model gives him.

[Image caption: What is he planning?]

We can see this in practice by entering the numbers for our most likely next data point, Prince Charles.

Assuming Charles took the throne tomorrow (RIP Elizabeth), how long does our model predict he will stay on the throne?

Charles is, at time of writing, 68 years old. At one year deducted per year of age, that takes 68 years off the model’s 38-year starting point.

The temperature is a full half a degree warmer than the average since 1066, giving Charles another seven years on the throne.

Taking the throne in 2017 gives Charles 951 years since 1066, which our model translates into eighteen extra years of rule.

Charles will (presumably) ascend the throne legitimately, rather than usurping poor Elizabeth, adding another 10 years.

Adding all these factors together, we get a reign of only five years for Charles, which is perhaps a little conservative.
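The arithmetic for Charles can be written out as a short sketch using the final rules listed earlier. The function name and the 365.25-days-per-year conversion are my own assumptions, and the coefficients are the rounded ones quoted in this article, so treat the result as approximate.

```python
DAYS_PER_YEAR = 365.25  # assumed conversion for the "7 days per year" term


def predict_final(age, years_since_1066, temp_above_mean_deg, legitimate):
    """Apply the article's final (rounded) rules; returns years on the throne."""
    years = 38.0                                   # starting point
    if age < 14:
        years -= 16.0                              # under-14 adjustment
    years -= 1.0 * age                             # 1 year per year of age
    years += 7.0 * years_since_1066 / DAYS_PER_YEAR  # 7 days per year since 1066
    years += 15.0 * temp_above_mean_deg            # 15 years per degree vs the mean
    if legitimate:
        years += 10.0                              # legitimate-heir bonus
    return years


charles = predict_final(
    age=68,
    years_since_1066=2017 - 1066,  # 951
    temp_above_mean_deg=0.5,
    legitimate=True,
)
print(round(charles, 1))
```

This prints 5.7: the 38-year starting point less 68 for age, plus about 18 for the years since 1066, 7.5 for the temperature, and 10 for legitimacy.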

Our model is not terribly accurate — its predictions are usually within ten years of the true value.

That may seem fairly good in the context of the huge variation in the lengths of monarchs’ reigns, but I suspect an error of ten years would be of some substantial concern to the monarch involved.

Linear regression algorithms assume a linear relationship between the features and outcomes.

For example, our model assumes that the relationship between temperature and length of reign will always be positive: the warmer it is, the more stable Britain will be, and the longer monarchs will stay on the throne.

All the information the model has seen confirms this relationship.

But the assumption that warmer temperatures will always lead to longer reigns is a short-sighted one.

In fact, climate change is likely to cause significant political instability in the near future, and a tragic consequence of this global upheaval will be a significant drop in the accuracy of our model.

[Image caption: We’re all rooting for you, Charlie]

It’s also possible that the correlation between Britain’s climate and the length of time its rulers stay on the throne is entirely spurious.

The link is plausible, and the data, on the surface, seems to fit.

But it’s likely there are dozens of other metrics that show a similar pattern over that time period, and that could also be argued to have some plausible causal link.

If it turns out that the correlation we observed is accidental, then our model will go badly wrong in the future.

This exposes one of the weaknesses of linear regression, and in fact of many such algorithms: Our model works on the basis of a set of strong assumptions about Britain and its rulers — that life expectancies will continue to increase, that warmer temperatures will always mean greater political stability, that no new threats to a monarch’s reign will emerge, and so on.

Should any of these assumptions prove to be false, the accuracy of our model is affected.

We can somewhat mitigate these risks by frequently checking and re-training our model as new data emerges, but it remains a problem in even the most sophisticated algorithms.

For example, financial forecast models might embed an assumption that property prices will always increase, or fail to account for radical disruptions of a business model.

Though our own model was a little iffy, regression models can be extremely accurate, and extremely useful.

To the extent that the assumptions that underpin them hold true, these models can provide very useful guidance.

But when we rely on these models, we are hoping that the conditions that existed when the model was created will continue into the future.

What we’ve learnt is that though these models might create the best fit on average, any individual example might be a substantial outlier.

When a monarch takes the throne, and uses our model to predict their future, they might legitimately take comfort from a prediction of a long reign being their most likely outcome.

But they should also pay attention to circumstances which have changed since the model was created.

And they should keep a sharp eye out for scheming uncles.

Code and data used for this essay can be found on my GitHub page.

The previous article in this series can be found here.

The next article in this series will be published in March.
