Insights From Raw NBA Shot Log Data and an Exploration of the Hot Hand Phenomenon

Creating this graph required a lot of data preprocessing in Jupyter Notebook to construct a JSON file formatted so that it is easy to manipulate from JavaScript. First, we created a lookup matrix from the shot log data to record the number of points each player scored on each other player. This matrix, attack_defense_matrix.csv, was then iterated over in conjunction with players_kmeans.csv to construct a list of nodes and edges. Each node contains information about a player, including a unique identifier, their total number of points made, the class label assigned by k-means, and the player’s Defensive Box Plus/Minus (DBPM). Each edge reflects the number of points that a given player scored on the other player. This data was then stored as an adjacency list in a JSON file that a JavaScript file parsed to populate the graph.

We created two interactive force-directed graphs from this processed data: one that visualizes players’ offensive capabilities and one that highlights their defensive limitations. On mouseover, the offensive graph displays the 3D heatmap generated for each player by our scoring efficiency tool. The defensive graph displays each player’s adjusted DBPM.

Force-Directed Graph of NBA Players by Offensive Cluster

Interpretation of the Graph

Offensive Graph

The size of a node in the offensive graph indicates a player’s volume of scoring. We chose to display a player’s shot distribution on mouseover to explore how the offensive capabilities of different types of players differ. The average size of a node differs greatly by class. The colors of the nodes represent the different groups generated by the k-means clustering. Red and blue nodes both tend to be high-volume scorers, but they generally take their shots from different areas on the court. Purple nodes are secondary scorers with a more even distribution of shots, and green and orange nodes tend to be lower-output players.
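To make the preprocessing step more concrete, here is a minimal Python sketch of how a points-scored lookup matrix and a player table could be combined into nodes and an adjacency list and then serialized to JSON. It is an illustration under stated assumptions, not the notebook code behind the graphs: the function name `build_graph`, the field names (`total_points`, `cluster`, `dbpm`, `target`, `points`), the nested-dictionary layout, and the toy values are all hypothetical stand-ins for the contents of attack_defense_matrix.csv and players_kmeans.csv.

```python
import json


def build_graph(points_matrix, players):
    """Combine a points-scored matrix and player info into a graph dict.

    points_matrix[a][b] is assumed to hold the points player a scored
    on player b. players[id] is assumed to hold that player's total
    points, k-means cluster label, and DBPM (hypothetical field names).
    """
    # One node per player, carrying the attributes described in the text.
    nodes = [
        {
            "id": player_id,
            "total_points": info["total_points"],
            "cluster": info["cluster"],
            "dbpm": info["dbpm"],
        }
        for player_id, info in players.items()
    ]

    # Adjacency list: for each scorer, the defenders they scored on.
    # Zero-point pairs and self-pairs are skipped to keep the graph sparse.
    adjacency = {}
    for scorer, row in points_matrix.items():
        adjacency[scorer] = [
            {"target": defender, "points": pts}
            for defender, pts in row.items()
            if pts > 0 and defender != scorer
        ]

    return {"nodes": nodes, "adjacency": adjacency}


# Toy data with made-up values, standing in for the two CSV files.
players = {
    "A": {"total_points": 12, "cluster": 0, "dbpm": 1.5},
    "B": {"total_points": 10, "cluster": 1, "dbpm": -0.5},
    "C": {"total_points": 0, "cluster": 2, "dbpm": 0.25},
}
points_matrix = {
    "A": {"A": 0, "B": 12, "C": 0},
    "B": {"A": 7, "B": 0, "C": 3},
    "C": {"A": 0, "B": 0, "C": 0},
}

graph = build_graph(points_matrix, players)
graph_json = json.dumps(graph, indent=2)  # text a JavaScript file could parse
```

Dropping zero-point pairs is a design choice made for this sketch. It keeps the JSON small and avoids drawing edges between players who never scored on each other, but whether the original pipeline filtered edges this way is not stated.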
