How to Create NBA Shot Charts in Python
This article was translated by Sam Lin of 编橙之家 and proofread by 艾凌风. Do not reproduce without permission. Original English article by Savvas Tjortjoglou.
In this post I explore how to extract a basketball player's shot chart data and then plot it using matplotlib and seaborn.
%matplotlib inline
import requests
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
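Getting the data
Getting the data from stats.nba.com is quite simple. Although the NBA does not provide a public API, we can use the requests library to access the API that the NBA itself uses on stats.nba.com. This blog post by Greg Reda does a great job of explaining how to access that API (or how to find the API behind any web app, for that matter).
We will use this URL to get James Harden's shot chart data.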
shot_chart_url = 'http://stats.nba.com/stats/shotchartdetail?CFID=33&CFPAR'\
                 'AMS=2014-15&ContextFilter=&ContextMeasure=FGA&DateFrom=&D'\
                 'ateTo=&GameID=&GameSegment=&LastNGames=0&LeagueID=00&Loca'\
                 'tion=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&'\
                 'PaceAdjust=N&PerMode=PerGame&Period=0&PlayerID=201935&Plu'\
                 'sMinus=N&Position=&Rank=N&RookieYear=&Season=2014-15&Seas'\
                 'onSegment=&SeasonType=Regular+Season&TeamID=0&VsConferenc'\
                 'e=&VsDivision=&mode=Advanced&showDetails=0&showShots=1&sh'\
                 'owZones=0'
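The URL above returns a JSON file containing the data we want. Note also that the URL carries the various API parameters used to select the data. The PlayerID parameter, set to 201935 in the URL, is James Harden's PlayerID.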
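To make the role of the query parameters more visible, here is a minimal sketch that rebuilds part of the query string from a dictionary. It uses only a handful of the parameters from the URL above; the helper name build_shot_chart_url is hypothetical, and the real endpoint may well require the full parameter set, so treat this as an illustration of the URL structure rather than a working request.

```python
from urllib.parse import urlencode

BASE_URL = 'http://stats.nba.com/stats/shotchartdetail'


def build_shot_chart_url(player_id, season):
    """Hypothetical helper: assemble a shot chart URL from a few parameters.

    Only a subset of the parameters in the article's URL is included here.
    """
    params = {
        'ContextMeasure': 'FGA',          # field goal attempts
        'PlayerID': player_id,            # 201935 is James Harden
        'Season': season,
        'SeasonType': 'Regular Season',   # urlencode turns the space into '+'
    }
    return BASE_URL + '?' + urlencode(params)


url = build_shot_chart_url(201935, '2014-15')
print(url)
```

Changing the player or season then only means changing the arguments, instead of editing a long string by hand.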
Now let's use the requests library to get the data we want.
# Get the webpage containing the data
response = requests.get(shot_chart_url)
# Grab the headers to be used as column headers for our DataFrame
headers = response.json()['resultSets'][0]['headers']
# Grab the shot chart data
shots = response.json()['resultSets'][0]['rowSet']
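The shape of that response is easier to see on a small example. The sketch below uses a hypothetical, trimmed-down payload (the rows and values are made up for illustration, and only three columns are shown) to show how each entry in rowSet lines up with the names in headers.

```python
import json

# Hypothetical payload: same nesting as the real response,
# but the rows and values are made up for illustration.
raw = '''
{"resultSets": [{"headers": ["LOC_X", "LOC_Y", "SHOT_ZONE_AREA"],
                 "rowSet": [[226, 32, "Right Side(R)"],
                            [-15, 10, "Center(C)"]]}]}
'''

payload = json.loads(raw)
result = payload['resultSets'][0]

# Pair every row with the column names, one dict per shot
records = [dict(zip(result['headers'], row)) for row in result['rowSet']]

for record in records:
    print(record)
```

Passing the rows and the headers to a DataFrame constructor performs the same pairing, with headers becoming the column names.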
Create a pandas DataFrame from the scraped shot chart data.
shot_df = pd.DataFrame(shots, columns=headers)
# View the head of the DataFrame and all its columns
from IPython.display import display
with pd.option_context('display.max_columns', None):
    display(shot_df.head())
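The shot chart data above contains all of James Harden's field goal attempts during the 2014-15 regular season. The data we want is in LOC_X and LOC_Y. These are the coordinates of each shot attempt, which can be plotted on a set of axes representing the basketball court.
Plotting the shot chart data
Let's quickly plot the data to see how it looks.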
sns.set_style("white")
sns.set_color_codes()
plt.figure(figsize=(12,11))
plt.scatter(shot_df.LOC_X, shot_df.LOC_Y)
plt.show()
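Notice that the plot above does not represent the data well: the values along the x-axis are the inverse of what they should be. Let's plot only the shots from the right side to see where the problem lies.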
right = shot_df[shot_df.SHOT_ZONE_AREA == "Right Side(R)"]
plt.figure(figsize=(12,11))
plt.scatter(right.LOC_X, right.LOC_Y)
plt.xlim(-300,300)
plt.ylim(-100,500)
plt.show()
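As we can see, the shots categorized as "Right Side" shots appear on the viewer's right, but they are actually on the left side of the hoop. This is something we need to fix when creating our final shot chart.
Drawing the court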
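First we need to figure out how to draw the court lines onto our plot. By looking at the first plot and at the data, we can roughly estimate that the center of the hoop is at the origin. We can also estimate that every 10 units on either the x or y axis represents one foot. We can verify this by looking at the first observation in our DataFrame. That shot was taken from the right corner 3 at a recorded distance of 22 feet, with a LOC_X value of 226, so the shot was taken about 22.6 feet to the right of the hoop. Now that we know this, we can draw the court onto our plot.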
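The unit conversion described above can be sketched as a couple of small helpers. The function names are hypothetical, and the scale of 10 units per foot is the estimate from the text rather than a documented constant.

```python
import math

UNITS_PER_FOOT = 10  # estimated scale: 10 coordinate units per foot


def to_feet(loc):
    """Convert a LOC_X or LOC_Y value to feet."""
    return loc / UNITS_PER_FOOT


def shot_distance_feet(loc_x, loc_y):
    """Straight-line distance from the hoop (the origin) in feet."""
    return math.hypot(loc_x, loc_y) / UNITS_PER_FOOT


# The corner three from the text: LOC_X of 226 is about 22.6 feet
corner_x_feet = to_feet(226)
# A made-up shot at (30, 40) would be 5 feet from the hoop
example_distance = shot_distance_feet(30, 40)
print(corner_x_feet, example_distance)
```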
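The dimensions of a basketball court can be found here and here.
Using those dimensions, we can convert them to fit the scale of our plot and draw them using Matplotlib Patches. We'll use Circle, Rectangle, and Arc objects to draw the court. Now let's create the function that draws the basketball court.
NOTE: While you can draw lines onto the plot using Line2D, I found it more convenient to use Rectangles (without a height or a width) instead.
EDIT (Aug 4, 2015): I made a mistake in drawing the outer lines and the half court arcs. The height of the outer court lines was changed from the incorrect value of 442.5 to 470. The y-value of the centers of the center court arcs was changed from 395 to 422.5. The y-axis limits of the plots were changed from (395, -47.5) to (422.5, -47.5).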
from matplotlib.patches import Circle, Rectangle, Arc

def draw_court(ax=None, color='black', lw=2, outer_lines=False):
    # If an axes object isn't provided to plot onto, just get current one
    if ax is None:
        ax = plt.gca()

    # Create the various parts of an NBA basketball court

    # Create the basketball hoop
    # Diameter of a hoop is 18" so it has a radius of 9", which is a value
    # 7.5 in our coordinate system
    hoop = Circle((0, 0), radius=7.5, linewidth=lw, color=color, fill=False)

    # Create backboard
    backboard = Rectangle((-30, -7.5), 60, -1, linewidth=lw, color=color)

    # The paint
    # Create the outer box of the paint, width=16ft, height=19ft
    outer_box = Rectangle((-80, -47.5), 160, 190, linewidth=lw, color=color,
                          fill=False)
    # Create the inner box of the paint, width=12ft, height=19ft
    inner_box = Rectangle((-60, -47.5), 120, 190, linewidth=lw, color=color,
                          fill=False)

    # Create free throw top arc
    top_free_throw = Arc((0, 142.5), 120, 120, theta1=0, theta2=180,
                         linewidth=lw, color=color, fill=False)
    # Create free throw bottom arc
    bottom_free_throw = Arc((0, 142.5), 120, 120, theta1=180, theta2=0,
                            linewidth=lw, color=color, linestyle='dashed')
    # Restricted Zone, it is an arc with 4ft radius from center of the hoop
    restricted = Arc((0, 0), 80, 80, theta1=0, theta2=180, linewidth=lw,
                     color=color)

    # Three point line
    # Create the side 3pt lines, they are 14ft long before they begin to arc
    corner_three_a = Rectangle((-220, -47.5), 0, 140, linewidth=lw,
                               color=color)
    corner_three_b = Rectangle((220, -47.5), 0, 140, linewidth=lw, color=color)
    # 3pt arc - center of arc will be the hoop, arc is 23'9" away from hoop
    # I just played around with the theta values until they lined up with the
    # threes
    three_arc = Arc((0, 0), 475, 475, theta1=22, theta2=158, linewidth=lw,
                    color=color)

    # Center Court
    center_outer_arc = Arc((0, 422.5), 120, 120, theta1=180, theta2=0,
                           linewidth=lw, color=color)
    center_inner_arc = Arc((0, 422.5), 40, 40, theta1=180, theta2=0,
                           linewidth=lw, color=color)

    # List of the court elements to be plotted onto the axes
    court_elements = [hoop, backboard, outer_box, inner_box, top_free_throw,
                      bottom_free_throw, restricted, corner_three_a,
                      corner_three_b, three_arc, center_outer_arc,
                      center_inner_arc]

    if outer_lines:
        # Draw the half court line, baseline and side out bound lines
        outer_lines = Rectangle((-250, -47.5), 500, 470, linewidth=lw,
                                color=color, fill=False)
        court_elements.append(outer_lines)

    # Add the court elements onto the axes
    for element in court_elements:
        ax.add_patch(element)

    return ax
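The theta values for the three point arc in draw_court were found by trial and error, but they can also be checked with a little geometry. The sketch below is not part of the original function; it assumes the corner lines sit 22 feet from the hoop (x = ±220), the arc has a radius of 23 feet 9 inches (237.5 units), and the baseline is at y = -47.5, all taken from the values used in the code above. It then solves for the point where the straight corner line meets the arc.

```python
import math

UNITS_PER_FOOT = 10
arc_radius = 23.75 * UNITS_PER_FOOT  # 23'9" arc -> 237.5 units
corner_x = 22 * UNITS_PER_FOOT       # corner three line at x = 220
baseline_y = -47.5                   # baseline position used in draw_court

# Height above the hoop where the corner line meets the arc (Pythagoras)
join_y = math.sqrt(arc_radius ** 2 - corner_x ** 2)

# Angle of that meeting point, measured from the positive x-axis
theta1 = math.degrees(math.atan2(join_y, corner_x))
theta2 = 180 - theta1  # mirrored on the other side of the court

# Length of the straight corner line, from the baseline to the arc
corner_line_length = join_y - baseline_y

print(round(theta1, 1), round(theta2, 1), round(corner_line_length, 1))
```

Under these assumptions the angles come out very close to the theta1=22 and theta2=158 used above, and the corner line length comes out a little under the 140 units (14 feet) used for the corner rectangles.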
Let's draw our court.
plt.figure(figsize=(12,11))
draw_court(outer_lines=True)
plt.xlim(-300,300)
plt.ylim(-100,500)
plt.show()
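Creating some shot charts
Now let's plot our properly adjusted shot chart data along with the court. We can adjust the x-values in two ways: we can either pass the negative of LOC_X to plt.scatter, or pass descending values to plt.xlim. We'll do the latter to plot our shot chart.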
plt.figure(figsize=(12,11))
plt.scatter(shot_df.LOC_X, shot_df.LOC_Y)
draw_court(outer_lines=True)
# Descending values along the axis from left to right
plt.xlim(300,-300)
plt.show()
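The two ways of adjusting the x-values put every shot at the same horizontal position on screen. The small sketch below checks that without any plotting, assuming a simple linear axis; axis_fraction is a hypothetical helper that returns how far along the axis (0 is the left edge, 1 is the right edge) a value is drawn.

```python
import math


def axis_fraction(value, left, right):
    """Horizontal position of value on a linear axis running left -> right."""
    return (value - left) / (right - left)


loc_x = 226  # the corner three from earlier in the article

# Option 1: negate LOC_X and keep ascending limits
negated = axis_fraction(-loc_x, -300, 300)
# Option 2: keep LOC_X and use descending limits, as in the plot above
reversed_limits = axis_fraction(loc_x, 300, -300)

print(negated, reversed_limits)
```

Both options draw the shot at the same spot; reversing the limits has the side benefit of leaving the data itself untouched.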
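Let's orient our shot chart with the hoop at the top of the chart, which is the same orientation as the shot charts on stats.nba.com. We do this by setting descending y-values from the bottom to the top of the y-axis. When we do this, we no longer need to adjust the x-values of our plot.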
plt.figure(figsize=(12,11))
plt.scatter(shot_df.LOC_X, shot_df.LOC_Y)
draw_court()
# Adjust plot limits to just fit in half court
plt.xlim(-250,250)
# Descending values along the y axis from bottom to top
# in order to place the hoop by the top of plot
plt.ylim(422.5, -47.5)
# get rid of axis tick labels
# plt.tick_params(labelbottom=False, labelleft=False)
plt.show()
Let's create a few shot charts using jointplot from seaborn.
# create our jointplot
# Note: this call matches the seaborn version listed at the end of the post;
# newer seaborn releases expect x= and y= keywords and no longer accept stat_func
joint_shot_chart = sns.jointplot(shot_df.LOC_X, shot_df.LOC_Y, stat_func=None,
                                 kind='scatter', space=0, alpha=0.5)

joint_shot_chart.fig.set_size_inches(12,11)

# A joint plot has 3 Axes, the first one called ax_joint
# is the one we want to draw our court onto and adjust some other settings
ax = joint_shot_chart.ax_joint
draw_court(ax)

# Adjust the axis limits and orientation of the plot in order
# to plot half court, with the hoop by the top of the plot
ax.set_xlim(-250,250)
ax.set_ylim(422.5, -47.5)

# Get rid of axis labels and tick marks
ax.set_xlabel('')
ax.set_ylabel('')
ax.tick_params(labelbottom=False, labelleft=False)

# Add a title
ax.set_title('James Harden FGA \n2014-15 Reg. Season',
             y=1.2, fontsize=18)

# Add Data Source and Author
ax.text(-250,445,'Data Source: stats.nba.com'
        '\nAuthor: Savvas Tjortjoglou (savvastjortjoglou.com)',
        fontsize=12)

plt.show()
Getting a player's picture
We can also scrape James Harden's picture from stats.nba.com and place it on our plot. We can find his image at this URL.
To retrieve the image for our plot, we can use urlretrieve from urllib.request as follows:
import urllib.request
# we pass in the link to the image as the 1st argument
# the 2nd argument is the local filename the image is saved to
pic = urllib.request.urlretrieve("http://stats.nba.com/media/players/230x185/201935.png",
                                 "201935.png")

# urlretrieve returns a tuple with the local filename as the first
# element and imread reads in that image file as a
# multidimensional numpy array so matplotlib can plot it
harden_pic = plt.imread(pic[0])

# plot the image
plt.imshow(harden_pic)
plt.show()
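To see what urlretrieve hands back without hitting the network, here is a small offline sketch. It copies a temporary local file through a file:// URL instead of downloading the real headshot; the download helper and the file names are hypothetical, and the file contents are placeholder bytes rather than a real image.

```python
import os
import tempfile
import urllib.request
from pathlib import Path


def download(url, filename):
    """Save url to filename and return the local path that was written."""
    # urlretrieve returns a (local_filename, headers) tuple
    local_path, headers = urllib.request.urlretrieve(url, filename)
    return local_path


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the remote image: placeholder bytes in a local file
    source = Path(tmp) / 'source.png'
    source.write_bytes(b'not a real image')

    destination = os.path.join(tmp, 'copy.png')
    saved = download(source.as_uri(), destination)

    same_path = saved == destination
    copied = Path(saved).read_bytes()

print(same_path, copied)
```

The first element of the returned tuple is the path that was written, which is why the code above passes pic[0] to plt.imread.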
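Now we can plot Harden's face on a jointplot. We will import OffsetImage from matplotlib.offsetbox, which allows us to place the image at the top right corner of the plot. So let's create our shot chart like we did above, but this time we will create a KDE (kernel density estimate) jointplot and add our image at the end.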
from matplotlib.offsetbox import OffsetImage

# create our jointplot

# get our colormap for the main kde plot
# Note we can extract a color from cmap to use for
# the plots that lie on the side and top axes
cmap=plt.cm.YlOrRd_r

# n_levels sets the number of contour lines for the main kde plot
joint_shot_chart = sns.jointplot(shot_df.LOC_X, shot_df.LOC_Y, stat_func=None,
                                 kind='kde', space=0, color=cmap(0.1),
                                 cmap=cmap, n_levels=50)

joint_shot_chart.fig.set_size_inches(12,11)

# A joint plot has 3 Axes, the first one called ax_joint
# is the one we want to draw our court onto and adjust some other settings
ax = joint_shot_chart.ax_joint
draw_court(ax)

# Adjust the axis limits and orientation of the plot in order
# to plot half court, with the hoop by the top of the plot
ax.set_xlim(-250,250)
ax.set_ylim(422.5, -47.5)

# Get rid of axis labels and tick marks
ax.set_xlabel('')
ax.set_ylabel('')
ax.tick_params(labelbottom=False, labelleft=False)

# Add a title
ax.set_title('James Harden FGA \n2014-15 Reg. Season',
             y=1.2, fontsize=18)

# Add Data Source and Author
ax.text(-250,445,'Data Source: stats.nba.com'
        '\nAuthor: Savvas Tjortjoglou (savvastjortjoglou.com)',
        fontsize=12)

# Add Harden's image to the top right
# First create our OffsetImage by passing in our image
# and set the zoom level to make the image small enough
# to fit on our plot
img = OffsetImage(harden_pic, zoom=0.6)
# Pass in a tuple of x,y coordinates to set_offset
# to place the image where you want, I just played around
# with the values until I found a spot where I wanted
# the image to be
img.set_offset((625,621))
# add the image
ax.add_artist(img)

plt.show()
And another jointplot, this time using hexbins.
# create our jointplot

cmap=plt.cm.gist_heat_r
joint_shot_chart = sns.jointplot(shot_df.LOC_X, shot_df.LOC_Y, stat_func=None,
                                 kind='hex', space=0, color=cmap(.2), cmap=cmap)

joint_shot_chart.fig.set_size_inches(12,11)

# A joint plot has 3 Axes, the first one called ax_joint
# is the one we want to draw our court onto
ax = joint_shot_chart.ax_joint
draw_court(ax)

# Adjust the axis limits and orientation of the plot in order
# to plot half court, with the hoop by the top of the plot
ax.set_xlim(-250,250)
ax.set_ylim(422.5, -47.5)

# Get rid of axis labels and tick marks
ax.set_xlabel('')
ax.set_ylabel('')
ax.tick_params(labelbottom=False, labelleft=False)

# Add a title
ax.set_title('FGA 2014-15 Reg. Season', y=1.2, fontsize=14)

# Add Data Source and Author
ax.text(-250,445,'Data Source: stats.nba.com'
        '\nAuthor: Savvas Tjortjoglou', fontsize=12)

# Add James Harden's image to the top right
img = OffsetImage(harden_pic, zoom=0.6)
img.set_offset((625,621))
ax.add_artist(img)

plt.show()
EDIT: Following a suggestion from Ogi010, I recreated the KDE plot using the new Viridis matplotlib colormap (you can find it here).
# import the object that contains the viridis colormap
from option_d import test_cm as viridis

# Register and set Viridis as the colormap for the plot
plt.register_cmap(cmap=viridis)
cmap = plt.get_cmap(viridis.name)

# n_levels sets the number of contour lines for the main kde plot
joint_shot_chart = sns.jointplot(shot_df.LOC_X, shot_df.LOC_Y, stat_func=None,
                                 kind='kde', space=0, color=cmap(0.1),
                                 cmap=cmap, n_levels=50)

joint_shot_chart.fig.set_size_inches(12,11)

# A joint plot has 3 Axes, the first one called ax_joint,
# It's the one we want to draw our court onto and adjust some other settings
ax = joint_shot_chart.ax_joint
draw_court(ax, color="white", lw=1)

# Adjust the axis limits and orientation of the plot in order
# to plot half court, with the hoop by the top of the plot
ax.set_xlim(-250,250)
ax.set_ylim(422.5, -47.5)

# Get rid of axis labels and tick marks
ax.set_xlabel('')
ax.set_ylabel('')
ax.tick_params(labelbottom=False, labelleft=False)

# Add a title
ax.set_title('James Harden FGA \n2014-15 Reg. Season',
             y=1.2, fontsize=18)

# Add Data Source and Author
ax.text(-250,445,'Data Source: stats.nba.com'
        '\nAuthor: Savvas Tjortjoglou', fontsize=12)

# Add Harden's image to the top right
# First create our OffsetImage by passing in our image
# and set the zoom level to make the image small enough
# to fit on our plot
img = OffsetImage(harden_pic, zoom=0.6)
# Pass in a tuple of x,y coordinates to set_offset
# to place the image where you want, I just played around
# with the values until I found a spot where I wanted
# the image to be
img.set_offset((625,621))
# add the image
ax.add_artist(img)

plt.show()
import sys
print('Python version:', sys.version_info)
import IPython
print('IPython version:', IPython.__version__)
print('Requests version', requests.__version__)
print('Urllib.requests version', urllib.request.__version__)
import matplotlib as mpl
print('Matplotlib version:', mpl.__version__)
print('Seaborn version:', sns.__version__)
print('Pandas version:', pd.__version__)

Python version: sys.version_info(major=3, minor=4, micro=3, releaselevel='final', serial=0)
IPython version: 3.2.0
Requests version 2.7.0
Urllib.requests version 3.4
Matplotlib version: 1.4.3
Seaborn version: 0.6.0
Pandas version: 0.16.2
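Further reading and resources
I've created a module containing all of the functionality above; you can find it here.
Bradley Fey has written a very cool package that lets you access a large amount of stats.nba.com data through an easy-to-use Python wrapper.
If you'd like to read more about getting data with Python, I recommend reading some of Greg Reda's posts. Here is another great post that explains how to get data from stats.nba.com.
If you find any issues or have any questions, please leave a comment below.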