2. Technology and Sports Events
Technology and sporting events have long been closely intertwined, especially at the Olympic Games, which offer host countries and companies an excellent opportunity to showcase their technological prowess. The 2012 London Olympics introduced data analysis technology into athletic competitions [1], while the 2016 Rio Olympics became the first to employ virtual reality (VR) and 360-degree technology in event broadcasts [2]. The emergence of uncrewed aerial drones freed coverage from the constraints of traditional cameras fixed within the venue. The 2018 PyeongChang Winter Olympics witnessed the deployment of 5G mobile broadband and high-speed broadcasting, significantly enhancing the viewing experience [3]. Additionally, uncrewed vehicles were introduced to transport athletes and spectators. The 2020 Tokyo Olympics, affected by the COVID-19 pandemic, used innovative technologies to overcome restrictions on in-person attendance. Thanks to the high-speed transmission of 5G and specially positioned sports cameras, audiences were able to view the competitions from the athletes’ perspectives. VR equipment allowed viewers to experience matches from a first-person viewpoint, cheering alongside virtual neighboring spectators. The use of 8K/12K high-resolution images enabled audiences to clearly observe athletes’ subtle facial expressions and muscle movements on their screens [4]. Broadcast presentation was no longer confined to the single perspective of television; these technologies created an immersive and participatory viewing experience.
Beyond the Olympics, many other sports events have adopted technology, with the Hawk-Eye system a prominent example. Originally developed for cricket, Hawk-Eye was first used in televised cricket matches in 2001, providing commentators with data on ball speed, trajectory, and more. In 2002, the British Broadcasting Corporation (BBC) used Hawk-Eye during tennis broadcasts, enabling audiences to view the action in real time. The technology was then formally introduced into matches, allowing players to challenge line judges’ decisions. Hawk-Eye has since been adopted in multiple sports, including badminton and volleyball, where it was used for line-call challenges in 2014. In the 2013–2014 season, the English Premier League began using Hawk-Eye goal-line technology to determine whether the ball had crossed the goal line for a valid goal.
Many groundbreaking technologies have also emerged at the quadrennial FIFA World Cup. The Semi-Automated Offside Technology (SAOT), introduced at the 2022 Qatar World Cup, used 12 tracking cameras installed in the stadium to track the ball and the positions of all players. These data were processed through big data algorithms to help referees confirm player positions when making offside decisions.
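The positional check that such tracking data supports can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual SAOT algorithm: positions are reduced to a single coordinate along the length of the pitch (in metres, increasing toward the defending team's goal line), body-part tracking and the own-half exception are ignored, and the function name `is_offside_position` is hypothetical.

```python
def is_offside_position(attacker_x, ball_x, defender_xs):
    """Return True if the attacker is in an offside position when the ball is played.

    Simplified 1-D model (an assumption for illustration): x grows toward the
    defending team's goal line, and defender_xs includes the goalkeeper.
    """
    if len(defender_xs) < 2:
        raise ValueError("need at least two defending players")
    # The second-last opponent has the second-largest x among the defenders.
    second_last_defender_x = sorted(defender_xs, reverse=True)[1]
    # Offside position: nearer the goal line than both the ball and the
    # second-last opponent. Being level with either one is not offside.
    return attacker_x > ball_x and attacker_x > second_last_defender_x


# Hypothetical positions; the goalkeeper stands at x = 99.0.
defenders = [99.0, 80.0, 78.5, 70.0]
print(is_offside_position(81.0, 75.0, defenders))  # True: beyond the second-last defender
print(is_offside_position(80.0, 75.0, defenders))  # False: level with the second-last defender
```

In practice the difficult part is measuring the positions accurately at the exact moment the ball is played, which is what the tracking cameras contribute; the comparison itself is simple.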
3. Big Data Cases in Sport: National Basketball Association (NBA) and Major League Baseball (MLB)
With the widespread availability of cloud storage devices and the proliferation of various sensors and smart devices, the wave of big data has swept through the sports domain. Data analysis in sports is no longer just about statistical analysis; big data analysis has become mainstream in professional sports. It impacts the operations and decision-making of all sports teams, transforming the operational landscape of professional sports leagues, including game formats, player trade management, and the overall game experience [5]. Among the major North American professional sports leagues, baseball and basketball exhibit the highest proportion of big data applications, with Major League Baseball in the US reaching up to 97% and the NBA at 80%.
Since the 1980s, the NBA has used data management technology to keep detailed records of game statistics such as points, rebounds, assists, blocks, steals, turnovers, and fouls, meticulously documented by sideline personnel. When analyzed, these statistics not only reveal individual performances but can also illustrate players’ offensive and defensive performances in specific matchups. During the 2010–2011 season, four NBA teams adopted the dynamic camera technology provided by SportVU, known as the Player Tracking System. Six high-resolution tracking cameras mounted along the catwalk above the basketball court recorded at 25 frames per second, generating thousands of images per minute and comprehensively capturing athletes’ subtle movements for analysis. From 2013 onwards, SportVU was an official partner of the NBA, and the NBA published SportVU-derived data in real time on its official website and NBA TV. This gave audiences access to more professional and informative basketball data, allowing each viewer to enjoy games according to their own understanding of the data.
In 2018, the NBA began collaborating with Sportradar and Second Spectrum, equipping all NBA arenas with multiple dynamic cameras that record players’ in-game movements. In addition to applying image processing and machine learning models to vast databases, Second Spectrum uses machine learning to automatically detect and record various “meaningful” behaviors on the court, such as touches, shots, passes, dribbles, and screens. It also evaluates players’ on-court choices through statistical methods, predicting the shooting accuracy of players at particular positions. SportVU and Second Spectrum have transformed NBA teams’ decision-making, player training, tactics, and player assessment criteria, as well as fans’ viewing experiences, propelling NBA basketball into a new stage of development.
Compared with basketball, baseball began developing advanced data earlier. Owing to the game’s pace and on-field interactions, baseball places greater emphasis on various data sets than basketball does. Baseball is well suited to data analysis because it is a turn-based sport: each game comprises a series of individual pitcher–batter matchups, all of which serve as independent samples for analysis. Baseball is inherently a game of probabilities, using metrics such as batting average, on-base percentage, slugging percentage, earned run average, and fielding percentage, among others, as references for pitching, offensive and defensive strategies, and tactical instructions from the coaching staff during games.
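The metrics named above are simple ratios of counting statistics. The sketch below uses their standard definitions; the function names and the sample season line are hypothetical and chosen only for illustration. Innings pitched must be passed as a true fraction (for example, 6 and 1/3 innings as 6.3333), not in the box-score shorthand "6.1".

```python
def batting_average(hits, at_bats):
    """AVG = H / AB."""
    return hits / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """SLG = total bases / AB."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def earned_run_average(earned_runs, innings_pitched):
    """ERA = 9 * ER / IP (earned runs allowed per nine innings)."""
    return 9 * earned_runs / innings_pitched


def fielding_percentage(putouts, assists, errors):
    """FPCT = (PO + A) / (PO + A + E)."""
    return (putouts + assists) / (putouts + assists + errors)


# Hypothetical batter: 500 AB, 150 H (100 singles, 30 doubles, 5 triples, 15 HR),
# 50 BB, 5 HBP, 5 SF.
print(f"AVG  {batting_average(150, 500):.3f}")                  # AVG  0.300
print(f"OBP  {on_base_percentage(150, 50, 5, 500, 5):.3f}")     # OBP  0.366
print(f"SLG  {slugging_percentage(100, 30, 5, 15, 500):.3f}")   # SLG  0.470
# Hypothetical pitcher: 60 earned runs in 180 innings.
print(f"ERA  {earned_run_average(60, 180):.2f}")                # ERA  3.00
# Hypothetical fielder: 300 putouts, 200 assists, 10 errors.
print(f"FPCT {fielding_percentage(300, 200, 10):.3f}")          # FPCT 0.980
```

Because each plate appearance contributes one discrete outcome to these ratios, they accumulate naturally from the independent pitcher-batter matchups described above.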
In the early stages of baseball’s development, only basic statistics such as game scores and hit counts were recorded. In 1858, sportswriter Henry Chadwick invented the box score to provide various game statistics. In 1964, Earnshaw Cook demonstrated the potential of using data to analyze baseball through his work “Percentage Baseball”. In the 1970s, Bill James conducted a series of analyses of baseball statistical data for the Society for American Baseball Research (SABR) and in 1977 began writing “The Bill James Baseball Abstract”, which was published annually until 1984. This series introduced numerous new data-based arguments, launched the wave of sabermetrics, and advanced baseball data research.
Since 2007, MLB has introduced advanced analysis systems such as PITCHf/x and HITf/x. Approximately 20 data collectors are installed in each ballpark. The PITCHf/x system records data such as velocity, location, pitch type, release point, spin angle, and the outcome of the pitcher–batter matchup for every pitch thrown. This resulted in a substantial surge in baseball data: a single game generates around 300,000 high-resolution photos, amounting to over 1 TB of data and potentially as much as 7 TB [6,7]. Subsequently, MLB integrated systems capable of tracking players (ChyronHego) and tracking the ball (Trackman high-speed camera system), allowing for comprehensive spatial data collection. The combined system was introduced as Statcast and adopted by all MLB teams. The Statcast database records intricate on-field events as data, presenting new metrics such as exit velocity, launch angle, spin rate, tunneling, and pop time, as well as derived metrics that later became widely used to evaluate player performance, such as hard-hit rate and barrel rate, as shown in Figure 1, Figure 2, and Figure 3. These next-generation scientific data have started to be featured in in-game broadcasts, providing anchors and commentators with a basis for analyzing player athletic abilities and enhancing the professionalism and entertainment value of game reporting. These data have gradually become familiar to audiences, allowing fans to discuss star players’ performances based on scientific data rather than speculation.
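As an illustration of how a derived metric is built from raw tracking data, the sketch below computes a hard-hit rate from a list of exit velocities. It assumes the commonly cited definition of a hard-hit ball as one with an exit velocity of 95 mph or more; the threshold constant, function name, and sample values are assumptions for illustration rather than data from the figures. Barrel rate is omitted because its definition depends jointly on exit velocity and launch angle.

```python
HARD_HIT_MPH = 95.0  # assumed threshold, following the commonly cited definition


def hard_hit_rate(exit_velocities_mph):
    """Fraction of batted balls at or above the hard-hit threshold."""
    if not exit_velocities_mph:
        return 0.0  # no batted balls recorded
    hard = sum(1 for ev in exit_velocities_mph if ev >= HARD_HIT_MPH)
    return hard / len(exit_velocities_mph)


# Hypothetical exit velocities (mph) for eight batted balls.
sample = [101.2, 88.4, 95.0, 72.9, 97.6, 90.1, 104.3, 84.0]
print(f"{hard_hit_rate(sample):.3f}")  # 0.500 (4 of 8 at 95 mph or above)
```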
In 2020, MLB replaced Trackman with Hawk-Eye, which requires the installation of 12 cameras in the ballpark. While Trackman could track only one fast-moving object, typically the ball, Hawk-Eye can track all moving objects within the frame, improving precision in detecting the ball and players and capturing swing times and trajectories. The development of sports is closely interconnected with that of television broadcasting; this linkage not only enhances entertainment but also helps promote the sport. Throughout game broadcasts, technological advances make it possible to display various aspects of the game and related information on screen, including pitch trajectories, K-Zone, Hawk-Eye, and other elements. Technologies developed from an entertainment perspective allow audiences to immerse themselves in the tension and excitement of the game.
4. Big Data in the Chinese Professional Baseball League (CPBL)
In the era of big data, the traditional statistics recorded in the box score, such as a player’s runs, hits, errors, and other basic data, no longer suffice for the modern needs of data analysis in baseball. While the CPBL and its teams have started exploring advanced statistics, current CPBL broadcasts still display primarily basic, easily calculated statistics on screen. These typically include batting average, earned run average, wins, runs batted in, and saves, as shown in Figure 4 and Figure 5.
In 2018, the Fubon Guardians and the CTBC Brothers installed Trackman systems at their home stadiums in Xinzhuang and Taichung, respectively. In 2020, the Wei Chuan Dragons, who had returned to the CPBL, also equipped their stadiums in Tianmu and Douliu with Trackman. Unlike these three teams, which purchased the Trackman data analysis system used by MLB teams, the Uni-President 7-Eleven Lions opted for the portable Rapsodo system. The Taoyuan Monkeys collaborated with a local data analytics company, using eight high-speed cameras to measure the trajectories of pitches and hits as well as players’ movements, and to record and edit video automatically.
A research team in Taiwan developed the Karma Zone system in 2018. The system, consisting of six high-speed cameras, introduced two applications: an electronic strike zone and a 3D motion analysis system. Karma Zone uses images captured by two high-speed cameras, positioned over the first-base side and behind home plate, to calculate the ball’s coordinates in space. The results are displayed on the broadcast screen as a grid pattern of the strike zone, as shown in Figure 6 and Figure 7. The grid appears on the broadcast screen less than a second after the pitcher throws the ball. In the future, the system is expected to provide data on hitting positions, ball speed, hitting velocity, and various other information. The teams plan to release an app through which fans can access these data.
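The text does not describe how Karma Zone computes spatial coordinates, but the general idea of locating a ball from two camera views can be sketched with a simple ray-midpoint triangulation. Everything below is an assumption for illustration: the coordinate frame (origin at the center of home plate, x toward first base, y toward the pitcher, z up, in metres), the camera positions, the strike-zone heights, and the function names. Real systems also need camera calibration to turn pixel positions into rays, which is skipped here.

```python
def closest_point_between_rays(p1, d1, p2, d2):
    """Midpoint of the shortest segment joining two 3-D rays.

    Each ray starts at a camera centre p and points along direction d
    toward the ball as seen by that camera.
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    w0 = [u - v for u, v in zip(p1, p2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; cannot triangulate")
    t = (b * e - c * d) / denom  # parameter along ray 1
    s = (a * e - b * d) / denom  # parameter along ray 2
    q1 = [p + t * k for p, k in zip(p1, d1)]
    q2 = [p + s * k for p, k in zip(p2, d2)]
    return [(u + v) / 2 for u, v in zip(q1, q2)]


PLATE_HALF_WIDTH_M = 0.2159  # half of the 17-inch (0.4318 m) home plate


def in_strike_zone(x, z, zone_bottom, zone_top):
    """Check a plate-crossing point against a rectangular zone (ball radius ignored)."""
    return abs(x) <= PLATE_HALF_WIDTH_M and zone_bottom <= z <= zone_top


# Hypothetical setup: one camera on the first-base side, one behind home plate.
cam1, cam2 = (20.0, 0.0, 5.0), (0.0, -15.0, 6.0)
ball = (0.1, 0.0, 0.8)  # simulated true position as the ball crosses the plate
ray1 = [bc - cc for bc, cc in zip(ball, cam1)]
ray2 = [bc - cc for bc, cc in zip(ball, cam2)]

x, y, z = closest_point_between_rays(cam1, ray1, cam2, ray2)
print(round(x, 3), round(y, 3), round(z, 3))
print(in_strike_zone(x, z, zone_bottom=0.5, zone_top=1.1))
```

With noise-free rays the midpoint coincides with the simulated ball position; with real measurements the two rays rarely intersect exactly, which is why the midpoint of the shortest connecting segment is used as the estimate.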
In May 2020, the CTBC Brothers integrated the Trackman data mirroring system into their broadcasts, displaying data such as the strike-zone grid, pitch speed, spin rate, exit velocity, and home run distance. This development brings Taiwanese baseball broadcasts closer to the trend of incorporating big data.
5. Discussion
MLB has developed data-driven baseball for many years. Taiwan, however, has only recently begun to focus on data analysis, using technology to collect and record player statistics while continuing to build a big data backend for analysis. The aim is to provide coaches and players with data-backed strategies on the field. This supports in-game decision-making and makes it possible to track the development of young players, verifying whether they are following the growth curve planned by the team. Nonetheless, the CPBL still lags considerably behind MLB in the staffing of its teams’ data departments and in investment.
There are several reasons why data analysis is not widespread in Taiwan’s baseball industry or among CPBL teams [5]. These include unfamiliarity with data analysis, lack of confidence in its accuracy, concerns over the increased operational costs it entails, and the fact that data-driven decision-making is not widely accepted in Taiwan. In practice, data analysis requires capturing a vast amount of data with hardware equipment and then using software to draw conclusions. Hardware setup, database construction, and the hiring of technical talent all require substantial financial support. At the same time, the CPBL’s small number of teams limits both the volume of data collected and the commercial interest involved, making it difficult to attract local and foreign scholars for research and related investment. Moreover, the CPBL struggles to attract investment or generate profits comparable to those of sports big data technology and service companies such as MLB Advanced Media (MLBAM). In contrast, MLB teams pay close attention to baseball statistics because of the multimillion-dollar player market. However, apart from improving team performance and players’ athletic skills, there are no other compelling reasons to attract investment in the league.
Undeniably, the collection and analysis of sports data already occupy a significant position in professional sports, with most discussions focusing on enhancing player performance. However, there has been little research on the impact of these data on audiences watching the game on screen. General viewers tend to focus on teams’ wins and losses rather than on the statistics generated in professional sports. Only a few individuals with an interest in statistics seek to understand the meaning and implications of the data shown on screen. Zheng and Chen pointed out that complex information poses a challenge to the audience’s information-processing ability [8]. Sports data can create barriers among audiences, making it difficult for fans with varying levels of experience to enjoy watching sports games. Their research revealed that statistical information does not help less experienced, low-engagement fans understand baseball events. However, visualizing player information allows low-engagement fans to derive more enjoyment and meaning from the game while making it easier to comprehend. High-engagement fans, on the other hand, owing to their extensive baseball knowledge and experience, do not benefit significantly from any type of information provided during broadcasts.