Did you know that Artificial Intelligence (AI) is revolutionizing the sports industry? AI helps coaches make strategic decisions before and during a game. Using high-speed cameras and wearable sensors, AI can now measure the motion and position of each player on the court. But how does AI track the distance between players? Can a machine tell from the figure below whether player 1 keeps enough space to guard player 2?
Fig 1. Player 1 guarding player 2 in Basketball
Relying only on the figure above can give inaccurate results, because the image is taken from a single point of view and does not accurately represent the real-life distance between the two players. The machine first needs to transform the broadcast-view image into a top-view image. This blog discusses how to use a Homography matrix to transform the image above into the court’s top view.
Homography is defined as the mapping between two planar projections of an image. It is represented by a 3x3 transformation matrix. To transform our image, we need to compute the Homography matrix. To illustrate, different 2D transformation matrices are shown in the figure below.
Fig 2. Hierarchy of 2D transformation matrices
As shown in the figure above, we can transform our source image using one of the following matrix types:
Projective (8 DOF): maps the image to a different perspective; straight lines stay straight.
Affine (6 DOF): transforms the image while preserving the parallel lines of the source image.
Euclidean (3 DOF): rotates and translates the source image.
Similarity (4 DOF): rotates, translates, and uniformly scales the source image.
If you want to know more about the different transformation types, you can check the scikit-image documentation.
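To make the hierarchy concrete, here is a minimal NumPy sketch that applies one hand-written matrix of each type to the corners of a unit square. The matrices, the helper names `apply_homography` and `side_lengths`, and the 30 degree angle are illustrative assumptions, not values from the court example.

```python
import numpy as np

def apply_homography(H, points):
    """Map (x, y) points through a 3x3 matrix using homogeneous coordinates."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.column_stack([pts, np.ones(len(pts))])
    mapped = homogeneous @ H.T
    # Divide by the third coordinate to return to ordinary (x, y)
    return mapped[:, :2] / mapped[:, 2:3]

def side_lengths(corners):
    """Lengths of the four edges of a quadrilateral whose corners are given in order."""
    return np.linalg.norm(np.roll(corners, -1, axis=0) - corners, axis=1)

theta = np.deg2rad(30)  # arbitrary illustrative angle
c, s = np.cos(theta), np.sin(theta)

# Rotation plus translation (3 DOF)
euclidean = np.array([[c, -s, 5.0], [s, c, 2.0], [0.0, 0.0, 1.0]])
# Same as above with a uniform scale of 2 (4 DOF)
similarity = euclidean.copy()
similarity[:2, :2] *= 2.0
# A shear: parallel lines stay parallel, lengths and angles change (6 DOF)
affine = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
# A non-trivial bottom row introduces perspective (8 DOF)
projective = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.2, 0.1, 1.0]])

square = np.array([(0, 0), (1, 0), (1, 1), (0, 1)], dtype=float)
for name, H in [("euclidean", euclidean), ("similarity", similarity),
                ("affine", affine), ("projective", projective)]:
    print(name, side_lengths(apply_homography(H, square)).round(3))
```

The Euclidean matrix leaves every side at length 1 and the similarity matrix doubles them; the affine shear keeps opposite sides equal, while the projective matrix no longer does.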
Now that we know how a Homography matrix can transform an image based on its hierarchy, we need to determine our image’s source and destination points, which we will denote as p and p’, respectively. A projective transformation has 8 degrees of freedom and each point pair supplies two equations, so we need at least four point pairs. In identifying the points, no three points may be collinear; otherwise, the transformation cannot be uniquely determined. When point pairs are found automatically and may contain mismatches, a robust estimation method such as RANSAC can fit the transformation while discarding the outliers, although it will not be discussed in this blog. For this scenario, we can set the corners of the shaded lane of the court as our source points.
Fig 3. Shaded lane as source points (broadcast view)
Fig 4. Shaded lane as destination points (top view)
import numpy as np
import matplotlib.pyplot as plt
from skimage.io import imread, imshow
from skimage import transform
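# Load the broadcast-view frame
still1 = imread('still1.png')
# Corners of the shaded lane in the broadcast view, as (x, y) pixels
src = np.array([(608, 641),
                (683, 553),
                (1841, 678),
                (1750, 579)])
# Where the same corners should land in the top view, in the same order
dst = np.array([(800, 339),
                (1273, 339),
                (800, 1000),
                (1310, 1000)])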
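The requirement that no three points be collinear can be checked before estimating anything. The sketch below is one possible way to do it, assuming a hypothetical helper named `has_collinear_triple` and an arbitrary tolerance; it tests every triple of points by computing twice the signed area of the triangle they form.

```python
from itertools import combinations

import numpy as np

def has_collinear_triple(points, tol=1e-9):
    """Return True if any three of the given (x, y) points lie on one line."""
    pts = np.asarray(points, dtype=float)
    for a, b, c in combinations(pts, 3):
        # Twice the signed area of triangle abc; zero means the points are collinear
        area2 = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
        if abs(area2) <= tol:
            return True
    return False

# The lane corners used in this post
src = np.array([(608, 641), (683, 553), (1841, 678), (1750, 579)])
dst = np.array([(800, 339), (1273, 339), (800, 1000), (1310, 1000)])

print(has_collinear_triple(src), has_collinear_triple(dst))
```

Both calls return False for the lane corners, so the four pairs are usable for a projective estimate.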
We will use the equation below to compute the Homography matrix from the pre-defined corners of the shaded lane.
Fig 5. Homography Matrix Equation
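For readers without the figure, the standard form of the projective mapping is written out here. The notation is an assumption and may differ from the original figure: p = (x, y) is a source point, p’ = (x’, y’) is its destination, and w is the homogeneous scale factor that gets divided out.

```latex
\[
w \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
=
\begin{bmatrix}
h_{11} & h_{12} & h_{13} \\
h_{21} & h_{22} & h_{23} \\
h_{31} & h_{32} & h_{33}
\end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix},
\qquad
x' = \frac{h_{11}x + h_{12}y + h_{13}}{h_{31}x + h_{32}y + h_{33}},
\quad
y' = \frac{h_{21}x + h_{22}y + h_{23}}{h_{31}x + h_{32}y + h_{33}}
\]
```

The matrix is only defined up to an overall scale, which is why it has 8 degrees of freedom rather than 9.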
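# Estimate the 3x3 projective matrix that maps src onto dst
tform = transform.estimate_transform('projective', src, dst)
# warp() expects the map from output coordinates back to input coordinates,
# so we pass the inverse of the estimated transform
tf_still1 = transform.warp(still1, tform.inverse)
fig, ax = plt.subplots(figsize=(20, 6))
ax.imshow(tf_still1)
ax.set_title('projective transformation')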
Fig 6. Transformed top view basketball court
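Based on the figure above, the projective Homography matrix successfully transformed the broadcast view into a top view of the court. However, the transformation also stretched the players’ bodies: a Homography is only exact for points that lie on the court plane, so anything rising above the floor is smeared away from the camera.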
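Returning to the opening question, the same matrix can map individual points instead of a whole image, which is enough to compare player spacing. The sketch below is a NumPy-only stand-in for the scikit-image estimate: it solves the four-point linear system directly with h33 fixed to 1. The two foot positions are hypothetical pixel coordinates chosen for illustration, not measurements from the figure.

```python
import numpy as np

def estimate_homography(src, dst):
    """Solve for H (with h33 fixed to 1) from exactly four point pairs."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each pair gives two linear equations in the eight unknowns
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows), np.array(rhs))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, points):
    """Map (x, y) points through H using homogeneous coordinates."""
    pts = np.asarray(points, dtype=float)
    mapped = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Lane corners from this post
src = np.array([(608, 641), (683, 553), (1841, 678), (1750, 579)])
dst = np.array([(800, 339), (1273, 339), (800, 1000), (1310, 1000)])
H = estimate_homography(src, dst)

# Hypothetical foot positions of two players in the broadcast view
feet_broadcast = np.array([(900, 620), (1500, 630)])
feet_top = apply_homography(H, feet_broadcast)

# Distance in top-view pixels; converting to metres would need the lane's real size
pixel_distance = np.linalg.norm(feet_top[0] - feet_top[1])
print(feet_top.round(1), round(float(pixel_distance), 1))
```

Using the feet rather than the torso or head matters here, since only points on the floor are mapped correctly.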
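Here is another example of transforming an image using a Homography matrix. In this example, we straighten the Leaning Tower of Pisa using a Euclidean transformation, which only rotates and translates the image. Because four point pairs are more than a Euclidean transformation needs, the estimate is a least-squares fit rather than an exact match.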
tower = imread('tower_pisa.jpeg')
# Four points on the leaning tower, as (x, y) pixels
src = np.array([(291, 329),
                (537, 344),
                (230, 868),
                (507, 891)])
# The same points arranged as an upright rectangle
dst = np.array([(220, 320),
                (462, 320),
                (220, 870),
                (462, 870)])
# Fit a rotation plus translation that best maps src onto dst
tform = transform.estimate_transform('euclidean', src, dst)
tf_tower = transform.warp(tower, tform.inverse)
# Show the original and the rotated image side by side
fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(12, 8))
ax0.imshow(tower)
ax0.set_title('Leaning tower of Pisa')
ax0.set_axis_off()
ax1.imshow(tf_tower)
ax1.set_title('Euclidean transformation')
ax1.set_axis_off()
plt.tight_layout()
Fig 7. Rotated leaning tower of Pisa using Euclidean Transformation
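To see how much rotation such a fit implies, the angle can be computed by hand. The sketch below uses the closed-form least-squares solution for a 2D rotation plus translation; it is a NumPy illustration under that assumption, not necessarily the exact routine scikit-image runs internally, and `fit_rotation_translation` is a hypothetical helper name.

```python
import numpy as np

def fit_rotation_translation(src, dst):
    """Least-squares 2D rotation angle, rotation matrix and translation mapping src onto dst."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    a, b = src - src_mean, dst - dst_mean
    # Closed-form optimal angle from the summed cross and dot products
    cross = np.sum(a[:, 0] * b[:, 1] - a[:, 1] * b[:, 0])
    dot = np.sum(a * b)
    theta = np.arctan2(cross, dot)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta), np.cos(theta)]])
    t = dst_mean - R @ src_mean
    return theta, R, t

# Tower points from this post
src = np.array([(291, 329), (537, 344), (230, 868), (507, 891)])
dst = np.array([(220, 320), (462, 320), (220, 870), (462, 870)])

theta, R, t = fit_rotation_translation(src, dst)
angle_deg = np.degrees(theta)
# Remaining mismatch after the best rotation and translation
residual = np.linalg.norm(src @ R.T + t - dst, axis=1)
print(round(float(angle_deg), 2), residual.round(1))
```

The fitted angle is only a few degrees, and the residuals are not zero: a rotation and translation cannot turn the slightly tapered source quadrilateral into a perfect rectangle, so the points are matched only approximately.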
Homography is a very powerful tool and can be used to transform images to meet your objectives. Its applications include remote measurement, satellite image correction, perspective correction, image stitching, depth calculation, and camera pose estimation.