How to Create NBA Shot Charts in Python


Translated by Sam Lin (编橙之家) and proofread by 艾凌风. Do not reproduce without permission.
Original English article by Savvas Tjortjoglou.

In this post I look at how to extract a basketball player's shot chart data and then plot it using matplotlib and seaborn.

Python
%matplotlib inline
import requests
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

Getting the data

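Getting the data from stats.nba.com is pretty straightforward. Although the NBA does not provide a public API, we can use the requests library to access the API that the NBA uses for stats.nba.com. A blog post by Greg Reda does a great job of explaining how to access this API (or how to find an API for any web app, for that matter).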

We will use the following URL to get James Harden's shot chart data.

Python
shot_chart_url = 'http://stats.nba.com/stats/shotchartdetail?CFID=33&CFPAR'\
                'AMS=2014-15&ContextFilter=&ContextMeasure=FGA&DateFrom=&D'\
                'ateTo=&GameID=&GameSegment=&LastNGames=0&LeagueID=00&Loca'\
                'tion=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&'\
                'PaceAdjust=N&PerMode=PerGame&Period=0&PlayerID=201935&Plu'\
                'sMinus=N&Position=&Rank=N&RookieYear=&Season=2014-15&Seas'\
                'onSegment=&SeasonType=Regular+Season&TeamID=0&VsConferenc'\
                'e=&VsDivision=&mode=Advanced&showDetails=0&showShots=1&sh'\
                'owZones=0'

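The URL above returns a JSON file containing the data we want. Notice also that the URL contains the various API parameters used to access the data. The PlayerID parameter, set to 201935 in the URL, is James Harden's PlayerID.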
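To make the role of the query parameters concrete, here is a minimal sketch that uses the standard library's urllib.parse to read PlayerID and Season back out of a URL. The shortened example_url is only for illustration: it keeps three of the parameters from the full URL above, and I have not verified that the API accepts a request with so few parameters.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

# Shortened, illustrative version of the shot chart URL; the real
# request above carries many more parameters.
example_url = ('http://stats.nba.com/stats/shotchartdetail?'
               'PlayerID=201935&Season=2014-15&SeasonType=Regular+Season')

# parse_qs maps each parameter name to a list of its values
params = parse_qs(urlsplit(example_url).query)
player_id = params['PlayerID'][0]
season = params['Season'][0]
season_type = params['SeasonType'][0]  # the '+' is decoded to a space
print(player_id, season, season_type)

# Going the other way: build a query string from a dict
query = urlencode({'PlayerID': player_id, 'Season': season})
print(query)
```

Keeping the parameters in a dict like this makes it easier to swap in a different PlayerID or Season than editing one long string by hand.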

Now let's use requests to get the data we want.

Python
# Get the webpage containing the data
response = requests.get(shot_chart_url)
# Parse the JSON once and take the first result set
result = response.json()['resultSets'][0]
# Grab the headers to be used as column headers for our DataFrame
headers = result['headers']
# Grab the shot chart data
shots = result['rowSet']

Create a pandas DataFrame using the scraped shot chart data.

Python
shot_df = pd.DataFrame(shots, columns=headers)
# View the head of the DataFrame and all its columns
from IPython.display import display
with pd.option_context('display.max_columns', None):
    display(shot_df.head())

 

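The shot chart data above contains all of the field goal attempts James Harden took during the 2014-15 regular season. The data we want is in LOC_X and LOC_Y. These are the coordinates of each shot attempt, which can be plotted on a set of axes that represent the basketball court.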
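As a rough illustration of how the headers and rows fit together, the sketch below pairs them up by hand, without pandas. The payload is invented: it only mimics the resultSets / headers / rowSet layout used in the code above, with three of the columns and two made-up shots.

```python
# Made-up payload that mimics the layout of the JSON response: a list
# of column names plus one list of values per shot. The rows are invented.
payload = {
    'resultSets': [{
        'headers': ['SHOT_ZONE_AREA', 'LOC_X', 'LOC_Y'],
        'rowSet': [
            ['Right Side(R)', 226, 32],
            ['Right Side(R)', 150, 95],
        ],
    }]
}

result_set = payload['resultSets'][0]
# Pair every row with the headers to get one dict per shot
shot_records = [dict(zip(result_set['headers'], row))
                for row in result_set['rowSet']]

# Pull out the coordinates we want to plot
coords = [(shot['LOC_X'], shot['LOC_Y']) for shot in shot_records]
print(coords)
```

pd.DataFrame(shots, columns=headers) does the same pairing in one step, which is why the post uses it.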

Plotting the shot chart data

Let's quickly plot the data to see how it looks.

Python
sns.set_style("white")
sns.set_color_codes()
plt.figure(figsize=(12,11))
plt.scatter(shot_df.LOC_X, shot_df.LOC_Y)
plt.show()

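Notice that the plot above does not represent the data well: the x-axis values are the mirror image of what they should be. Let's plot only the shots from the right side to see the problem.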

Python
right = shot_df[shot_df.SHOT_ZONE_AREA == "Right Side(R)"]
plt.figure(figsize=(12,11))
plt.scatter(right.LOC_X, right.LOC_Y)
plt.xlim(-300,300)
plt.ylim(-100,500)
plt.show()

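As we can see, the shots categorized as coming from the "Right Side", while on the viewer's right, are actually to the left of the hoop. This is something we will need to fix when creating our final shot chart.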

Drawing the court

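First we need to figure out how to draw the court lines onto our plot. By looking at the first plot and at the data, we can roughly estimate that the center of the hoop is at the origin. We can also estimate that every 10 units on either the x or y axis represents one foot. We can verify this by looking at the first observation in our DataFrame. That shot was taken from the right corner 3 spot, from a distance of 22 feet, with a LOC_X value of 226. So the shot was taken from about 22.6 feet to the right of the hoop. Now that we know this, we can draw the court onto our plot.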
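The unit conversion can be written down as a small sketch, assuming the 10-units-per-foot estimate holds and the hoop sits at the origin. The helper names to_feet and distance_from_hoop are made up for this example.

```python
import math

UNITS_PER_FOOT = 10  # estimate from the text: 10 plot units per foot


def to_feet(units):
    """Convert a LOC_X or LOC_Y value to feet."""
    return units / UNITS_PER_FOOT


def distance_from_hoop(loc_x, loc_y):
    """Straight-line distance in feet from the hoop at the origin."""
    return math.hypot(loc_x, loc_y) / UNITS_PER_FOOT


# The corner three from the text has LOC_X = 226
print(to_feet(226))
# LOC_Y = 0 here is a made-up value, used only to show the call
print(distance_from_hoop(226, 0))
```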

The original post links to two references for the dimensions of a basketball court.

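Using those dimensions, we can convert them to fit the scale of our plot and draw them using Matplotlib Patches. We'll use Circle, Rectangle, and Arc objects to draw our court. Now let's create the function that draws the basketball court.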

NOTE: While you can draw lines onto the plot using Line2D, I found it more convenient to use Rectangles (with a height or width of zero) instead.

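EDIT (Aug 4, 2015): I made a mistake in drawing the outer lines and the half court arcs. The height of the outer court lines was changed from the incorrect value of 442.5 to 470. The y value for the centers of the center court arcs was changed from 395 to 422.5. The ylim values for the plots were changed from (395, -47.5) to (422.5, -47.5).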

Python
from matplotlib.patches import Circle, Rectangle, Arc

def draw_court(ax=None, color='black', lw=2, outer_lines=False):
    # If an axes object isn't provided to plot onto, just get current one
    if ax is None:
        ax = plt.gca()

    # Create the various parts of an NBA basketball court

    # Create the basketball hoop
    # Diameter of a hoop is 18" so it has a radius of 9", which is a value
    # 7.5 in our coordinate system
    hoop = Circle((0, 0), radius=7.5, linewidth=lw, color=color, fill=False)

    # Create backboard
    backboard = Rectangle((-30, -7.5), 60, -1, linewidth=lw, color=color)

    # The paint
    # Create the outer box of the paint, width=16ft, height=19ft
    outer_box = Rectangle((-80, -47.5), 160, 190, linewidth=lw, color=color,
                          fill=False)
    # Create the inner box of the paint, width=12ft, height=19ft
    inner_box = Rectangle((-60, -47.5), 120, 190, linewidth=lw, color=color,
                          fill=False)

    # Create free throw top arc
    top_free_throw = Arc((0, 142.5), 120, 120, theta1=0, theta2=180,
                         linewidth=lw, color=color, fill=False)
    # Create free throw bottom arc
    bottom_free_throw = Arc((0, 142.5), 120, 120, theta1=180, theta2=0,
                            linewidth=lw, color=color, linestyle='dashed')
    # Restricted Zone, it is an arc with 4ft radius from center of the hoop
    restricted = Arc((0, 0), 80, 80, theta1=0, theta2=180, linewidth=lw,
                     color=color)

    # Three point line
    # Create the side 3pt lines, they are 14ft long before they begin to arc
    corner_three_a = Rectangle((-220, -47.5), 0, 140, linewidth=lw,
                               color=color)
    corner_three_b = Rectangle((220, -47.5), 0, 140, linewidth=lw, color=color)
    # 3pt arc - center of arc will be the hoop, arc is 23'9" away from hoop
    # I just played around with the theta values until they lined up with the 
    # threes
    three_arc = Arc((0, 0), 475, 475, theta1=22, theta2=158, linewidth=lw,
                    color=color)

    # Center Court
    center_outer_arc = Arc((0, 422.5), 120, 120, theta1=180, theta2=0,
                           linewidth=lw, color=color)
    center_inner_arc = Arc((0, 422.5), 40, 40, theta1=180, theta2=0,
                           linewidth=lw, color=color)

    # List of the court elements to be plotted onto the axes
    court_elements = [hoop, backboard, outer_box, inner_box, top_free_throw,
                      bottom_free_throw, restricted, corner_three_a,
                      corner_three_b, three_arc, center_outer_arc,
                      center_inner_arc]

    if outer_lines:
        # Draw the half court line, baseline and side out bound lines
        outer_lines = Rectangle((-250, -47.5), 500, 470, linewidth=lw,
                                color=color, fill=False)
        court_elements.append(outer_lines)

    # Add the court elements onto the axes
    for element in court_elements:
        ax.add_patch(element)

    return ax

Let's draw our court

Python
plt.figure(figsize=(12,11))
draw_court(outer_lines=True)
plt.xlim(-300,300)
plt.ylim(-100,500)
plt.show()
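The theta values for three_arc in draw_court were found by trial and error. As a cross-check, here is a minimal sketch that derives them with basic trigonometry, assuming the arc is a circle of diameter 475 centered on the hoop and that it should meet the corner lines at x = ±220 (both values are taken from draw_court).

```python
import math

ARC_RADIUS = 475 / 2  # three_arc is drawn with a diameter of 475
CORNER_X = 220        # x position of corner_three_b

# Angle, measured from the positive x axis, at which a circle of this
# radius centered on the hoop reaches x = CORNER_X
theta1 = math.degrees(math.acos(CORNER_X / ARC_RADIUS))
# The arc is symmetric about the y axis, so the other end mirrors it
theta2 = 180 - theta1

# Height at which the arc meets the corner line
y_meet = math.sqrt(ARC_RADIUS ** 2 - CORNER_X ** 2)

print(round(theta1, 1), round(theta2, 1), round(y_meet, 1))
```

The rounded angles agree with the theta1=22 and theta2=158 used in draw_court. The computed meeting height is a little below the top of the corner lines (which end at -47.5 + 140 = 92.5), so the two overlap slightly in the drawing.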

Creating some shot charts

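Now we plot the properly adjusted shot chart data together with the court. We can adjust the x values in two ways: either pass the negated LOC_X values to plt.scatter, or pass descending values to plt.xlim. We'll use the latter to plot our shot chart.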
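To see why the two options are equivalent, consider a simplified model of a linear axis. The axis_position helper below is made up for this sketch and is not how matplotlib is implemented; it only computes how far along the axis a value lands for a given pair of limits.

```python
def axis_position(value, left, right):
    """Fraction of the way from the left edge to the right edge of a
    linear axis whose limits are (left, right)."""
    return (value - left) / (right - left)


loc_x = 226  # LOC_X of the corner three mentioned earlier

# Option 1: negate the data and keep ascending limits
negated = axis_position(-loc_x, -300, 300)
# Option 2: keep the data and pass descending limits
reversed_limits = axis_position(loc_x, 300, -300)

print(negated, reversed_limits)
```

Both options put the shot at the same spot in the left half of the plot. The practical difference is that with descending limits the tick labels still show the original LOC_X values.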

Python
plt.figure(figsize=(12,11))
plt.scatter(shot_df.LOC_X, shot_df.LOC_Y)
draw_court(outer_lines=True)
# Descending values along the axis from left to right
plt.xlim(300,-300)
plt.show()

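Let's reorient the shot chart so that the hoop is at the top of the plot, matching the orientation of the shot charts on stats.nba.com. We do this by setting descending y values from the bottom to the top of the y axis. Once we do this, we no longer need to adjust the x values of the plot.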

Python
plt.figure(figsize=(12,11))
plt.scatter(shot_df.LOC_X, shot_df.LOC_Y)
draw_court()
# Adjust plot limits to just fit in half court
plt.xlim(-250,250)
# Descending values along the y axis from bottom to top
# in order to place the hoop by the top of plot
plt.ylim(422.5, -47.5)
# get rid of axis tick labels
# plt.tick_params(labelbottom=False, labelleft=False)
plt.show()

Let's create some shot charts using jointplot from seaborn.

Python
# create our jointplot
joint_shot_chart = sns.jointplot(shot_df.LOC_X, shot_df.LOC_Y, stat_func=None,
                                 kind='scatter', space=0, alpha=0.5)

joint_shot_chart.fig.set_size_inches(12,11)

# A joint plot has 3 Axes, the first one called ax_joint 
# is the one we want to draw our court onto and adjust some other settings
ax = joint_shot_chart.ax_joint
draw_court(ax)

# Adjust the axis limits and orientation of the plot in order
# to plot half court, with the hoop by the top of the plot
ax.set_xlim(-250,250)
ax.set_ylim(422.5, -47.5)

# Get rid of axis labels and tick marks
ax.set_xlabel('')
ax.set_ylabel('')
ax.tick_params(labelbottom=False, labelleft=False)

# Add a title
ax.set_title('James Harden FGA \n2014-15 Reg. Season', 
             y=1.2, fontsize=18)

# Add Data Source and Author
ax.text(-250,445,'Data Source: stats.nba.com'
        '\nAuthor: Savvas Tjortjoglou (savvastjortjoglou.com)',
        fontsize=12)

plt.show()

Getting a player's picture

We can also scrape James Harden's picture from stats.nba.com and place it on our plot. His image is available at the URL used in the code below.

To retrieve the image for our plot, we can use urlretrieve from urllib.request as follows:

Python
import urllib.request
# we pass in the link to the image as the 1st argument
# the 2nd argument is the local filename to save the image under
pic = urllib.request.urlretrieve("http://stats.nba.com/media/players/230x185/201935.png",
                                "201935.png")

# urlretrieve returns a tuple with the local filename as the first
# element and imread reads in the image as a
# multidimensional numpy array so matplotlib can plot it
harden_pic = plt.imread(pic[0])

# plot the image
plt.imshow(harden_pic)
plt.show()
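To show what urlretrieve hands back without needing network access, the sketch below points it at a local file through a file:// URL. The file names and contents are made up; the point is only that the first element of the returned tuple is the path the data was saved to, which is what gets passed to plt.imread above.

```python
import tempfile
import urllib.request
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the remote image: a small local file
    source = Path(tmp) / 'source.bin'
    source.write_bytes(b'not really a png')
    target = Path(tmp) / 'copy.bin'

    # 1st argument: where to read from; 2nd argument: where to save
    saved_path, response_headers = urllib.request.urlretrieve(
        source.as_uri(), str(target))

    # The first element of the tuple is the filename we asked for
    same_file = (saved_path == str(target))
    copied = target.read_bytes()

print(same_file, copied)
```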

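Now we can plot Harden's face on a jointplot. We will import OffsetImage from matplotlib.offsetbox, which allows us to place the image at the top right corner of the plot. So let's create our shot chart like we did above, but this time we will create a KDE (kernel density estimate) jointplot and add our image at the end.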

Python
from matplotlib.offsetbox import OffsetImage

# create our jointplot

# get our colormap for the main kde plot
# Note we can extract a color from cmap to use for 
# the plots that lie on the side and top axes
cmap=plt.cm.YlOrRd_r 

# n_levels sets the number of contour lines for the main kde plot
joint_shot_chart = sns.jointplot(shot_df.LOC_X, shot_df.LOC_Y, stat_func=None,
                                 kind='kde', space=0, color=cmap(0.1),
                                 cmap=cmap, n_levels=50)

joint_shot_chart.fig.set_size_inches(12,11)

# A joint plot has 3 Axes, the first one called ax_joint 
# is the one we want to draw our court onto and adjust some other settings
ax = joint_shot_chart.ax_joint
draw_court(ax)

# Adjust the axis limits and orientation of the plot in order
# to plot half court, with the hoop by the top of the plot
ax.set_xlim(-250,250)
ax.set_ylim(422.5, -47.5)

# Get rid of axis labels and tick marks
ax.set_xlabel('')
ax.set_ylabel('')
ax.tick_params(labelbottom=False, labelleft=False)

# Add a title
ax.set_title('James Harden FGA \n2014-15 Reg. Season', 
             y=1.2, fontsize=18)

# Add Data Source and Author
ax.text(-250,445,'Data Source: stats.nba.com'
        '\nAuthor: Savvas Tjortjoglou (savvastjortjoglou.com)',
        fontsize=12)

# Add Harden's image to the top right
# First create our OffSetImage by passing in our image
# and set the zoom level to make the image small enough 
# to fit on our plot
img = OffsetImage(harden_pic, zoom=0.6)
# Pass in a tuple of x,y coordinates to set_offset
# to place the plot where you want, I just played around
# with the values until I found a spot where I wanted
# the image to be
img.set_offset((625,621))
# add the image
ax.add_artist(img)

plt.show()

Another jointplot, this time using hexbins

Python
# create our jointplot

cmap=plt.cm.gist_heat_r
joint_shot_chart = sns.jointplot(shot_df.LOC_X, shot_df.LOC_Y, stat_func=None,
                                 kind='hex', space=0, color=cmap(.2), cmap=cmap)

joint_shot_chart.fig.set_size_inches(12,11)

# A joint plot has 3 Axes, the first one called ax_joint 
# is the one we want to draw our court onto 
ax = joint_shot_chart.ax_joint
draw_court(ax)

# Adjust the axis limits and orientation of the plot in order
# to plot half court, with the hoop by the top of the plot
ax.set_xlim(-250,250)
ax.set_ylim(422.5, -47.5)

# Get rid of axis labels and tick marks
ax.set_xlabel('')
ax.set_ylabel('')
ax.tick_params(labelbottom=False, labelleft=False)

# Add a title
ax.set_title('FGA 2014-15 Reg. Season', y=1.2, fontsize=14)

# Add Data Source and Author
ax.text(-250,445,'Data Source: stats.nba.com'
        '\nAuthor: Savvas Tjortjoglou', fontsize=12)

# Add James Harden's image to the top right
img = OffsetImage(harden_pic, zoom=0.6)
img.set_offset((625,621))
ax.add_artist(img)

plt.show()

EDIT: Following a suggestion from Ogi010, I recreated the KDE plot using the new Viridis matplotlib colormap (the original post links to where you can find it).

Python
# import the object that contains the viridis colormap
from option_d import test_cm as viridis

# Register and set Viridis as the colormap for the plot
plt.register_cmap(cmap=viridis)
cmap = plt.get_cmap(viridis.name)

# n_levels sets the number of contour lines for the main kde plot
joint_shot_chart = sns.jointplot(shot_df.LOC_X, shot_df.LOC_Y, stat_func=None,
                                 kind='kde', space=0, color=cmap(0.1),
                                 cmap=cmap, n_levels=50)

joint_shot_chart.fig.set_size_inches(12,11)

# A joint plot has 3 Axes, the first one called ax_joint, 
# It's the one we want to draw our court onto and adjust some other settings
ax = joint_shot_chart.ax_joint
draw_court(ax, color="white", lw=1)

# Adjust the axis limits and orientation of the plot in order
# to plot half court, with the hoop by the top of the plot
ax.set_xlim(-250,250)
ax.set_ylim(422.5, -47.5)

# Get rid of axis labels and tick marks
ax.set_xlabel('')
ax.set_ylabel('')
ax.tick_params(labelbottom=False, labelleft=False)

# Add a title
ax.set_title('James Harden FGA \n2014-15 Reg. Season', 
             y=1.2, fontsize=18)

# Add Data Source and Author
ax.text(-250,445,'Data Source: stats.nba.com'
        '\nAuthor: Savvas Tjortjoglou', fontsize=12)

# Add Harden's image to the top right
# First create our OffSetImage by passing in our image
# and set the zoom level to make the image small enough 
# to fit on our plot
img = OffsetImage(harden_pic, zoom=0.6)
# Pass in a tuple of x,y coordinates to set_offset
# to place the plot where you want, I just played around
# with the values until I found a spot where I wanted
# the image to be
img.set_offset((625,621))
# add the image
ax.add_artist(img)

plt.show()

Python
import sys
print('Python version:', sys.version_info)
import IPython
print('IPython version:', IPython.__version__)
print('Requests version', requests.__version__)
print('Urllib.request version', urllib.request.__version__)
import matplotlib as mpl
print('Matplotlib version:', mpl.__version__)
print('Seaborn version:', sns.__version__)
print('Pandas version:', pd.__version__)

Python
Python version: sys.version_info(major=3, minor=4, micro=3, releaselevel='final', serial=0)
IPython version: 3.2.0
Requests version 2.7.0
Urllib.request version 3.4
Matplotlib version: 1.4.3
Seaborn version: 0.6.0
Pandas version: 0.16.2

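Further reading and resources

I created a module containing all of the functionality above; the original post links to it.

Bradley Fey has written a very cool package that gives access to a large amount of data from stats.nba.com through an easy-to-use Python wrapper.

If you want to read more about getting data with Python, I suggest reading some of Greg Reda's posts. There is also another great post (linked in the original) on how to get data from stats.nba.com.

If you find any problems or have any questions, please leave a comment below.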
