Assessing NHL award winners using K-means

Ashley Jones, Jan 28

After the end of each NHL season, various awards are handed out to players who were considered outstanding in various categories.

Whilst some awards are decided on raw stats, such as the total number of goals in a regular season (Maurice Richard trophy) and the total number of points (goals plus assists) in a regular season (Art Ross trophy), others are somewhat more subjective and a voting system is used to decide the winner.

Ballots for most awards are cast by the Professional Hockey Writers’ Association, including the Frank J. Selke trophy (“to the forward who best excels in the defensive aspects of the game”) and the James Norris Memorial trophy (“to the defense player who demonstrates throughout the season the greatest all-round ability in the position”).

Both trophies only apply to regular season performances.

I have never been a big fan of a voting system, although in some cases the resulting winners can be very obvious.

I thought it might be interesting to see whether, using standard and advanced hockey statistics alone, I could classify players into “Tiers” and by doing so identify the winners of these two voting-style trophies, with the hypothesis that a winner in a given year should be in the top Tier.

Data sets

The final data-set used is a combination of traditional and advanced player metrics.

Traditional statistics concern metrics like goals and assists (total being known as points), plus-minus, penalty minutes and time on ice, whilst advanced player metrics deal more with player behavior and puck possession.

Using Python’s beautifulsoup library, I scraped more traditional statistics (such as goals, assists, points, etc.

) from www.


com, whilst the advanced metrics were provided by www.



Advanced hockey statistics from Corsica Hockey start in 2008; hence, I have data for nearly 2000 players, spanning 2008–2018.

It should be noted that I only considered advanced statistical data for even-strength game situations.

Data were cleaned for missing values and loaded into pandas for analysis.

Whilst I plan on getting to it eventually, this data set only applies to skaters and not goalies.

Goalie stats are very different from skater stats.

Responsibility

In my last post, I devised a player rating system using the data set described above, where various statistical parameters were weighted, summed and passed through a standard sigmoid function, yielding rating values between 50 and 99.9 (recurring) for a given regular season; the higher the score, the better the player’s performance.

The rating system is built from four areas, one of which is the “responsibility” parameter, known as DEFN, which accounts for ~35% of the final rating value (the others being productivity at ~50%, stamina at ~8%, and other miscellaneous game features making up ~7%).

Responsibility is simply the (on average) defensive responsibility assigned to a player combined with how well he performed those duties over the regular season.

It is a mathematical combination of a few advanced statistics, defined below:

CF%: CF stands for “Corsi For”, the number of shot attempts a player’s team generated while that player was on the ice, as opposed to “Corsi Against” (CA), the number of shot attempts generated by the opposing team while that player was on the ice. CF% is simply CF / (CF + CA).

CFQoC: The average CF% of the Quality of Competition. The higher the number, the higher the level of competition. Its interpretation is somewhat debated, but I like it when it’s used in context with OZS (see below).

CFQoT: The CF% of a player’s line-mates. It indicates how a player contributed to the overall play compared to his team-mates, and is usually a good indicator of whether a given player made the players around him better.

OZS: Offensive Zone Starts, the percentage of times that a given player started their shift in the offensive zone.

xGF%: The ratio of expected goals for versus expected goals against. An expected goal (xG) value is simply the likelihood of a goal being scored from a given shot, so it provides a judgement on the quality of shots. Hence, an xG value of 0.2 means the shot should be a goal 20% of the time. Essentially, players with a high score here are involved in high-quality chances.
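As a rough illustration of the share-style metrics defined above, the sketch below computes CF% and xGF% from made-up on-ice totals. The numbers and the helper name `share` are hypothetical, and treating xGF% as xGF / (xGF + xGA), by analogy with CF%, is an assumption rather than something stated in the data source.

```python
def share(for_count, against_count):
    """Return for / (for + against), or None when there were no events."""
    total = for_count + against_count
    return for_count / total if total else None


# Hypothetical even-strength totals for one player (not real data).
cf, ca = 620, 540              # shot attempts for / against while on the ice
xg_for = [0.2, 0.05, 0.4]      # per-shot goal probabilities for his team
xg_against = [0.1, 0.15]       # per-shot goal probabilities for the opposition

cf_pct = share(cf, ca)                             # CF / (CF + CA)
xgf_pct = share(sum(xg_for), sum(xg_against))      # assumed xGF / (xGF + xGA)

print(f"CF% = {cf_pct:.3f}, xGF% = {xgf_pct:.3f}")
```

A value above 0.5 in either metric means the player’s team out-attempted (or out-chanced) the opposition while he was on the ice.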

The algorithm takes into account the zone in which a player is most likely to start his shift, with more weight given to a low OZS (and thus more defensive duties), multiplied by the quality of the competing players on the ice at the same time.

So a player with a large overall score here has a higher degree of “responsibility”.

How well they deal with their task is decided by multiplying responsibility by the combined factors of CF%, CFQoT, and xGF%.

More on this algorithm including the actual formula can be found here.

Fig 1. Histograms of DEFN values for rated forwards (F) and defensemen (D) between 2008 and 2018

Fig 1 shows the distribution of DEFN values for all rated forwards and defensemen for seasons from 2008 to 2018.

It can be seen that both classes have a similar Gaussian-shaped distribution, with a mean/median of ~0.43 and a standard deviation of about 0.



Fig 2 shows a scatter of these values by position (LW, C, RW = forwards, D = defensemen), with DEFN plotted as a function of average Time On Ice (TOI) per game.

TOI dictates the stamina part of the player rating, described above.

Not only are players subjected to responsibility, they will have to endure it over a period of time, some more than others.

The size of the circles represents an individual player’s rating for a given season, as calculated in my previous post.

Fig 2. Scatter plot of all rated players from 2008–2018 classed into position. The size of the dot is a representation of the overall player rating

We can learn a lot from this plot.

Defensemen play more minutes than forwards and, as we saw from Fig 1, the distributions of all player positions are fairly even.

We can also see that player ratings tend to get larger per class, per increased playing time, which makes sense given a team will field its best players as much as possible.

However, what about DEFN? As stated earlier, the higher the DEFN value, the larger the responsibility and the better that player has performed those duties.

Hence, we should expect to find the best overall “defensive” players with the best ratings towards the top right corner of both forward/defensemen classes.

In theory, those players should be the ones that are either Selke or Norris trophy winners.

Over the next few sections, we will find out if this is the case.

However, we first need a method to classify the data in Fig 2.

K-means

K-means clustering is an unsupervised clustering technique used to label data and put them into categories which share similar characteristics/behaviour.

A set number of “K” clusters is declared and each data point is compared to randomly generated centroids representative of each cluster.

A given data point is assigned to a given cluster based on the shortest Euclidean distance to the nearest cluster-centroid.

The algorithm then works in an iterative manner, recomputing the centroids and repeating this process until the centroid positions have stabilized in their final positions.

An important aspect of this kind of modeling is that the features on which the model is trained have to be of similar magnitudes; hence, it is advisable to scale (normalize) those features first, otherwise the clustering algorithm will cluster with a bias towards the larger feature.
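The steps described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used for the article: the function names, the min-max choice of scaling, and the sample (TOI, DEFN) pairs are all assumptions, and the initial centroids are drawn from the data points themselves rather than generated at random positions.

```python
import math
import random


def min_max_scale(points):
    """Rescale every feature (column) to the range [0, 1]."""
    columns = list(zip(*points))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return [tuple((v - lo) / s for v, lo, s in zip(p, lows, spans)) for p in points]


def kmeans(points, k, iterations=100, seed=0):
    """Plain K-means (Lloyd's algorithm); returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from k random data points
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change, so the centroids have stabilized
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical (average TOI in minutes, DEFN) pairs, not taken from the data set.
players = [(12.0, 0.30), (13.0, 0.32), (12.5, 0.31),
           (21.0, 0.55), (22.0, 0.57), (21.5, 0.56)]
scaled = min_max_scale(players)
labels, centroids = kmeans(scaled, k=2)
print(labels)
```

Without the scaling step, TOI (tens of minutes) would dominate the distance calculation over DEFN (fractions of one), which is the bias the paragraph above warns about.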

Clustering Forwards and Defensemen into “Tiers”

Perhaps the major drawback of K-means is deciding how many clusters to use. Unless you have some inside information about the data and what the best number of clusters would be, it is not always immediately obvious how many to implement.

However, the elbow method can provide some insight into this problem, although it is still only suggestive.

The idea is to run the model for multiple values of K, taking note of the sum of squared errors (SSE) each time.

Increasing K will reduce the SSE, and if K = N, the total number of data points, the SSE = 0 as each data point would be its own cluster.

Instead, we examine the SSE as a function of K number of clusters as is shown for the forwards class in Fig 3.

Fig 3. Sum of Square Error (SSE) for different values of K in a K-means clustering model. This is for forwards, but a very similar plot is found for defensemen.
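To make the SSE concrete, here is a small sketch on four made-up points. The function name `sse` and the hand-picked cluster assignments are illustrative assumptions; in practice the labels and centroids would come from a fitted K-means model for each K.

```python
import math


def sse(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Four hypothetical points at the corners of a 4 x 2 rectangle.
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]

# K = 1: a single cluster centred on the overall mean.
sse_k1 = sse(points, [0, 0, 0, 0], [(2.0, 1.0)])

# K = 2: the left pair and the right pair each get their own centroid.
sse_k2 = sse(points, [0, 0, 1, 1], [(0.0, 1.0), (4.0, 1.0)])

# K = N: every point is its own centroid, so the error vanishes.
sse_kn = sse(points, [0, 1, 2, 3], points)

print(sse_k1, sse_k2, sse_kn)
```

The error shrinks as K grows and reaches zero at K = N, which is why the SSE on its own cannot pick K and the shape of the curve has to be examined instead.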

Ideally, the number of K used is the one where there is the strongest change in gradient (or kink in the curve), which resembles an elbow.

However, because the curve is so smooth, it is not immediately obvious which K to use; it could realistically be anywhere between 4 and 12.

To get a better idea, I normalized both axes, calculated the Euclidean distance from each point on the curve to the origin, and considered the point with the shortest distance to be the most “pointy” part of the elbow.

This gives K=5 for forwards and K=4 for defensemen.
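The distance-to-origin heuristic described above might look like the following sketch. The function name `elbow_k`, the use of min-max normalization for both axes, and the SSE values are assumptions for illustration; they are not the numbers behind Fig 3.

```python
import math


def elbow_k(ks, sse_values):
    """Pick the K whose normalized (K, SSE) point lies closest to the origin."""

    def normalize(values):
        # Assumed min-max normalization so both axes span [0, 1].
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0
        return [(v - lo) / span for v in values]

    xs, ys = normalize(ks), normalize(sse_values)
    distances = [math.hypot(x, y) for x, y in zip(xs, ys)]
    return ks[distances.index(min(distances))]


# Hypothetical SSE curve that flattens out as K increases.
ks = [1, 2, 3, 4, 5, 6, 7, 8]
sse_values = [100.0, 60.0, 35.0, 20.0, 15.0, 12.0, 10.0, 9.0]

best_k = elbow_k(ks, sse_values)
print(best_k)
```

For this made-up curve the heuristic selects K = 3, where the normalized K and SSE are both small. Note that the result depends on the range of K values tried, since that range sets the normalization.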

Fig 4 shows the resulting clustering for both forwards and defensemen using the selected K values.

In both plots, the clusters are distinguished by their different colours, and we can identify Tiers based on where they lie in comparison to the whole population.

As mentioned, the higher the TOI and DEFN, the more impressive the result.

Hence Tier-1 is in the top right corner, Tier-2 just below etc.
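One way to turn that visual ordering into labels is to rank the cluster centroids by how far they sit towards the top right corner. The sketch below does this by summing each centroid’s scaled TOI and DEFN coordinates; this ranking rule, the function name `assign_tiers`, and the centroid values are all assumptions, since the Tiers in the article are identified from the plots.

```python
def assign_tiers(centroids):
    """Map cluster index -> tier number, with Tier 1 nearest the top right corner."""
    # Assumed ranking rule: larger (scaled TOI + scaled DEFN) means a higher tier.
    order = sorted(range(len(centroids)), key=lambda j: sum(centroids[j]), reverse=True)
    return {cluster: tier for tier, cluster in enumerate(order, start=1)}


# Hypothetical scaled (TOI, DEFN) centroids for K = 4 clusters.
centroids = [(0.30, 0.35), (0.80, 0.75), (0.55, 0.50), (0.15, 0.20)]
tiers = assign_tiers(centroids)
print(tiers)
```

Here cluster 1, with the largest TOI and DEFN, becomes Tier 1 and cluster 3 becomes Tier 4.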

Selke Trophy

Now that we have a reliable set of clusters, we can start to look at who is in the Tier-1 cluster and see if those names match up to the winners of the Selke trophy.

Table 1. A list of the most appearances (count) in Tier-1 forwards for players with a player rating of greater than 90 from 2008–2018. Also shown are the number of Selke wins and top-3 nominations. NB: there are more players with one appearance

Table 1 shows a list of the top-30 supposedly best 2-way forwards in the NHL, as clustered into Tier-1 over the past 11 seasons and with a player rating greater than 90.

Also given are the number of times a player has been awarded the Selke trophy, or at least whether they were ever a final-3 nominee, for those seasons in the top Tier.

At the top of the list are two familiar names: Pavel Datsyuk and Patrice Bergeron, who are renowned for their 2-way style of game and have won multiple Selke trophies.

There are other Selke winners there, including Anze Kopitar, Ryan Kesler, and Jonathan Toews; in fact, all Selke trophy winners between 2008–2018 have been categorized in this top Tier, which is excellent.

Furthermore, there are various other players with final-3 nominations.

In total, 137 different players have been put in this category, and those with a player rating of 90 or more for a given season have been plotted in Fig 5, where DEFN is a function of average TOI.

The circle colours represent the player position and the size represents the player rating for that season.

It’s fair to say that centres (“C”) dominate this Tier which is to be expected as the centre’s role is considered more of an all-round style of game.

Players such as Brad Marchand and Henrik Zetterberg play/played alongside Bergeron and Datsyuk, respectively, and it is arguable in the latter case that Zetterberg should have won the Selke in 2008 instead of Datsyuk, given the two giant blobs in the top right corner (with Zetterberg’s contribution slightly better than Datsyuk’s).

This raises the question of voter bias due to “reputation”, which I am not a fan of.

However, Bergeron is in a league of his own and has enormous DEFN ratings compared to his peers, although in many seasons he played fewer average minutes.

Overall, the K-means clustering has done a good job of filtering the best Selke candidates into the correct Tier.

Norris Trophy

So what about the Norris? Table 2 shows the top-37 names, as well as any final-3 nominations or Norris wins from 2008–2018, associated with Tier-1 clustering for players with a season rating of at least 88.

Of the 11 possible Norris winners, 8 are in Tier-1 and 3 from Tier-2.

Moreover, of the 33 Top-3 nominees over 11 years, 23 are from Tier-1, 9 from Tier-2, and 1 from Tier-3.

Players such as Zdeno Chara, Nick Lidström, Duncan Keith, Drew Doughty, and P.K. Subban have had a Norris trophy awarded and have been popular members of the top Tier club during this period.

The top player is Shea Weber with 7 appearances while Kris Letang, Chara, and Doughty have 6.

Letang has had some fine seasons, but many were also shortened due to injury and voters don’t like absences.

Table 2. A list of the most appearances (count) in Tier-1 defensemen for players with a player rating of greater than 88 from 2008–2018. Also shown are the number of Norris wins and top-3 nominations. The number of Tier-2 Norris winners is also shown.

Fig 6 shows the time series of DEFN values of Norris winners, as well as which Tiers they were in.

The average DEFN cutoff for Tier-1 (the red line) is ~0.475, and the figure shows the three winners that are not in Tier-1 (i.e. those below the red line).

Fig 6. Time series of Norris trophy winners presented as DEFN over time. Winners above the red line are from Tier-1 and below from Tier-2

One thing that has to be considered is that even though these DEFN values are lower, the result is relative to all other players in the same year.

So I decided to investigate other possible candidates in those years to see if those Tier-2 winners were justified.

2011: Nick Lidström won even though his DEFN and season player rating only came 5th among top D-men.

Lubomir Visnovsky’s overall game (TOI=23,DEFN=0.


9,Tier=1) was superior to Lidström’s (TOI=22,DEFN=0.


3,Tier=2) that year, but it only earned a 4th place in voting.

Lidström turned 40 that season, the oldest player to win the award.

Maybe a nice little “thank you” from voters for all those years of service?

2015: Erik Karlsson won in 2015.

Two other Tier-1 defensemen were ahead of Karlsson that year, but both had injury-plagued seasons.

Whilst Karlsson’s DEFN numbers (TOI=27,DEFN=0.


1,Tier=2) were fairly average, he and Doughty (TOI=29,DEFN=0.


5,Tier=1) played the most minutes in hockey and Karlsson beat out Doughty in a close race.

2018: In 2018, Victor Hedman (TOI=26,DEFN=0.


4,Tier=2) won with a DEFN value just above the population average and beating out Doughty (TOI=26,DEFN=0.


1,Tier=1), who arguably had the better all-round performance.

2017: Interestingly, Brent Burns (TOI=25,DEFN=0.


4,Tier=1) won in 2017 and is classified as Tier-1, but is on the limit.

In-fact, his DEFN results were much lower than other potential candidates in Tier-1, such as, Doughty (TOI=27,DEFN=0.


7,Tier=1) and Alex Pietrangelo (TOI=25,DEFN=0.



So by reconsidering Fig 6, we see that in three of the last four years, voters have preferred players with lower DEFN ratings.

As the dominant weight in the player rating is productivity (~50%), and as top players have similar average TOI (whilst the other miscellaneous player rating parameters are also similar among top players), it means production (i.e. points) is being given more emphasis than the “greatest all-round ability”, which the Norris is traditionally all about.

Fig 7 presents all Tier-1 defensemen from 2008–2018 with a player rating greater than 88.

Conclusion

Under the assumption that the DEFN parameter is a good indicator of defensive attributes, we have assessed the last 11 winners of the Selke and Norris trophies.

For the most part, we can say that voters have been selecting the Selke trophy winners correctly.

With all winners and many nominees being clustered in the top Tier, it’s fair to say that players are consistently recognized by the standards declared in order to win this trophy.

However, there appears to be an inconsistent attitude towards what is valued in Norris trophy winners where recently productivity seems to be the driving factor rather than an all-round game.

It’s hard to explain why that is.

Is it because the style of the game has changed dramatically over the past decade, with more emphasis now on speed and skill? Is it because the NHL game is converting to a more European style, where offense is driven from the defense using quarterback-styled players? Are all-round defensemen not as sexy as they used to be?

2019 should be an interesting year for award winners, so watch this space.

The code for this work can be found on my Github.

Thanks for reading!
