gregfeliu / birthplace_analysis_of_nhl_players

Analyzing the birthplaces of (North American) NHL players in relation to the location of NHL teams.

NHL Player Birthplaces

Description

Hockey is, and necessarily will be, a regional sport. This has less to do with culture (as with the American/Canadian football divide) and more to do with the weather: if it isn't cold where you live, how can you play a sport on ice? To take an extreme example, Hawaii has only one ice rink in the entire state and, for obvious reasons, does not have any frozen lakes. So it makes sense that there has never been a Hawaiian-born-and-raised hockey player.

Map of all NHL players' birthplaces and closest NHL team

Besides weather, there are clearly some other factors at play in determining which areas produce NHL players. New York City has a population of about 8.5 million people and 2 current NHL players born in the city. Toronto, on the other hand, has a little less than 3 million people and 27 players born within the city! Weather may have something to do with this (you would definitely know the difference between the two winters, that's for sure...), but it surely isn't the whole story. Therefore, to learn a little more about where competitive hockey is played and which places produce NHL players, I examined the birthplaces of all current NHL players.

This project will help the casual NHL fan understand more about the players who play professionally. Having a local player on the team won't be surprising for a Toronto Maple Leafs fan but is like winning the lottery for a Nashville Predators fan (lucky for them there is one right now!). Moreover, it will help any rising youth hockey player and their parents understand more about what it takes to rise to the NHL. It is easy to see which areas of North America are hockey powerhouses, and which ones are on the periphery.

Data

I gathered data by using the NHL API to get the names of each of the current NHL players (2021 season) and the names of their birth city and state/province. In addition, I got the names of each of the teams and the location of their arenas.

From there, I used the geopy library to get the coordinates of each player's birth city, as well as the coordinates of each arena. I then used the Shapely and GeoPandas libraries to calculate which team's arena was closest to each player and found the haversine distance (i.e., the "as the crow flies" distance) to that arena.
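The closest-arena step described above can be sketched in plain Python. This is a minimal illustration, not the project's code: it uses only the standard library instead of geopy, Shapely, GeoPandas, and the haversine package. The function names `haversine_miles` and `closest_arena` are hypothetical, and the coordinates are approximate values chosen for the example rather than taken from the project's data.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def closest_arena(birthplace, arenas):
    """Return (team, distance_miles) for the arena nearest to a birthplace."""
    team = min(arenas, key=lambda name: haversine_miles(birthplace, arenas[name]))
    return team, haversine_miles(birthplace, arenas[team])


# Approximate, illustrative coordinates (not the project's data).
arenas = {
    "Toronto": (43.64, -79.38),
    "Montreal": (45.50, -73.57),
    "Vancouver": (49.28, -123.11),
}
ottawa = (45.42, -75.70)  # approximate city coordinates

team, miles = closest_arena(ottawa, arenas)
print(team, round(miles))
```

A linear scan over every arena is fine at this scale (a few dozen arenas and several hundred players); a spatial index would only matter for much larger datasets.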

With this information available, I was able to plot the data using Folium, and add lines to the closest team for each player using Shapely (see the first image in this README). From there, I looked at the number of players who were born close to an NHL team and the number of players per province/state.

Findings

There were many surprising findings in this project. While it is well known that there are a lot of Canadian NHL players (8 of the top states/provinces by NHL players per person are Canadian), certain provinces are much more likely to produce an NHL player than others. Quebec, a province that once had 2 NHL teams, is 4 times less likely to produce an NHL player than Saskatchewan (a province with zero NHL teams). Similarly, one U.S. state stands out far above the rest: Minnesota. It has more NHL players per person than some Canadian provinces! In the image below, you can see the number of players per capita for each state/province in the U.S. and Canada.

Cartogram of States/Provinces and Players per Capita
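The per-capita comparison behind this figure is a simple ratio. Below is a minimal sketch, assuming player counts and populations are available as dictionaries keyed by state/province. The function name `players_per_100k`, the per-100,000 scaling, and all numbers are placeholders invented for illustration, not the project's data or results.

```python
def players_per_100k(player_counts, populations):
    """Map each region to its NHL players per 100,000 residents, highest first."""
    rates = {
        region: player_counts.get(region, 0) * 100_000 / populations[region]
        for region in populations
    }
    return dict(sorted(rates.items(), key=lambda item: item[1], reverse=True))


# Made-up placeholder numbers, purely to show the calculation.
player_counts = {"Region A": 40, "Region B": 30}
populations = {"Region A": 8_000_000, "Region B": 1_000_000, "Region C": 500_000}

rates = players_per_100k(player_counts, populations)
for region, rate in rates.items():
    print(f"{region}: {rate:.2f} players per 100k")
```

Regions with no players are kept at a rate of zero so that they still appear in the ranking.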

If tomorrow the NHL forced all players to play for their "home" team, only 11 teams would still exist. Even accounting for the ~30% of non-North American players, this finding holds. Those teams can be found in the following screenshot:

Number of players born close to NHL team

If we limited the players to playing on a team that is within 60 miles of their hometown, the number of viable NHL teams then drops to 8:

Number of players born reasonably close to an NHL team

Most Canadian teams would still survive in this scenario: only Ottawa and Winnipeg would be excluded.
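One way to reproduce this kind of count is to group players by their closest team, optionally drop players born beyond a distance cutoff, and keep the teams that still have enough players. The sketch below assumes each player is already reduced to a `(closest_team, distance_miles)` pair. The function name `viable_teams` and the `min_players` threshold are hypothetical, since the README does not state how many local players a team needs to "still exist", and the records are made up.

```python
from collections import Counter


def viable_teams(players, min_players, max_miles=None):
    """Teams with at least `min_players` local players.

    `players` is an iterable of (closest_team, distance_miles) pairs.
    If `max_miles` is given, players born farther away are ignored.
    """
    counts = Counter(
        team for team, miles in players
        if max_miles is None or miles <= max_miles
    )
    return sorted(team for team, n in counts.items() if n >= min_players)


# Made-up records for illustration; min_players=2 is a hypothetical roster floor.
players = [
    ("Team A", 5.0), ("Team A", 40.0), ("Team A", 300.0),
    ("Team B", 20.0), ("Team B", 150.0),
    ("Team C", 10.0),
]

print(viable_teams(players, min_players=2))                # ['Team A', 'Team B']
print(viable_teams(players, min_players=2, max_miles=60))  # ['Team A']
```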

Moreover, it's no surprise that northern U.S. teams are more likely to have local NHL players. With a new Seattle team joining next season, this raises the question: how many local NHL players could the Seattle team have? The answer? Only 2. All other local players are much closer to Vancouver. It appears that even a northern team could have fewer local players than a southern team like the Tampa Bay Lightning or even the Arizona Coyotes!

A final task for this project was finding the optimal number of clusters for players' birthplaces. In other words, what's the most geographically salient way of describing where a player was born? If we only had X categories for describing where a player was born, what birthplaces make up each category? Using the Cartogram library and two clustering metrics (silhouette_score and calinski_harabasz_score), I found that 3 is the optimal number of clusters for players' birthplaces.

I then applied k-means clustering to this data and grouped the birthplaces into 3 clusters based on their coordinates. I found that "vertical" splits make the most sense: vertical lines east of Chicago and through central Manitoba divide players' birthplaces into 3. Of these, the eastern portion has nearly twice as many players as the other two portions. What this means is that the optimal way of dividing players based on the geography of where they were born is whether they were born in western, central, or eastern North America.

To learn more about the project please visit the blog post about the project here.

Image of the 3 clusters of players' birthplaces
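The cluster-count selection can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the project's notebook code: the points are synthetic (lat, lon) pairs drawn around three made-up centers, the helper `score_cluster_counts` is hypothetical, and the Cartogram library is not used here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score, silhouette_score


def score_cluster_counts(coords, candidate_ks, seed=0):
    """Fit k-means for each k and return {k: (silhouette, calinski_harabasz)}."""
    scores = {}
    for k in candidate_ks:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(coords)
        scores[k] = (
            silhouette_score(coords, labels),
            calinski_harabasz_score(coords, labels),
        )
    return scores


# Synthetic (lat, lon) points around three made-up centers, not the project's data.
rng = np.random.default_rng(0)
centers = np.array([[49.0, -120.0], [45.0, -95.0], [44.0, -75.0]])
coords = np.vstack([c + rng.normal(scale=1.0, size=(50, 2)) for c in centers])

scores = score_cluster_counts(coords, range(2, 7))
best_k = max(scores, key=lambda k: scores[k][0])  # highest silhouette score
print(best_k)
```

Higher is better for both metrics. Note that running k-means on raw latitude/longitude treats degrees as flat Euclidean units, which is a simplification; projecting the coordinates first would be more accurate over continental distances.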

Conclusion

Taken together, these findings show the importance of geography in producing NHL talent. However, there is an important caveat: many players did not spend most of their time playing where they were born. The Mississippi-born Mathieu Olivier grew up playing in Montreal, QC, and the only Dutch player to ever play in the NHL moved to Quebec to play hockey at age 8. While someone's birthplace tells us a lot, it still leaves open the thorny question of where and how one became an NHL player. This question is beyond the scope of this analysis, but I hope the analysis helps shed more light on the issue.

Technologies

  • Python
    • haversine
    • Shapely
    • GeoPandas
    • Folium
    • Pandas
    • Seaborn
    • GeoPy
    • Requests
    • scikit-learn
    • Cartogram
  • Jupyter Notebook
