Quantifying and Predicting Momentum in Tennis Match via Machine Learning Approach

Article
DOI: 10.30827/ijrss.34569

Cuantificación y predicción del momentum en partidos de tenis mediante un enfoque de aprendizaje automático

International Journal of Racket Sports Science, vol. 7(1) (January - June, 2025), Pag. 46-58. eISSN: 2695-4508

Received: 05-01-2025
Accepted: 15-07-2025

AUTHORS

Chang Liu *

Jiangyan Yang

Yixiong Cui

1 Beijing Sport University, Beijing, China.

Corresponding Author: Yixiong Cui, cuiyixiong@bsu.edu.cn

Cite this article as: Liu, C., Yang, J., & Cui, Y. (2025). Quantifying and Predicting Momentum in Tennis Match via Machine Learning Approach. International Journal of Racket Sports Science, 7(1), 46-58. https://doi.org/10.30827/ijrss.34569

ABSTRACT

This study aims to identify and analyze momentum shifts in tennis, developing a data-driven model to quantify and predict these shifts and assess their influence on match outcomes. Using data from 6 tournaments, including 564 matches and over 135,000 points, this study constructed a momentum calculation model integrating 14 weighted match factors such as point progression, server advantage, and player ranking differences. The model incorporates adjustments for set discontinuities and initial momentum based on player rankings to enhance predictive accuracy. Following data processing and validation, a Kappa consistency test was performed on the 2023 Wimbledon Championship data, yielding a high alignment with actual outcomes (Kappa = 0.96). Using a Gradient Boosting Decision Tree (GBDT) regression model, the study achieved a high accuracy in predicting momentum shifts, identifying key variables such as serve advantage and score gaps as primary indicators of performance dynamics. This model further revealed that players’ momentum tends to stabilize at critical points, such as 40:30, while fluctuating more at disadvantageous scores. These findings highlight the model’s utility for pre-match analysis, enabling detailed insights into opponents’ tactical patterns and psychological responses under varying score conditions. Overall, this momentum model provides valuable applications for enhancing player preparation and in-game strategic adjustments, offering coaches and players a quantifiable tool to interpret and influence match outcomes.

Keywords: Result prediction, gradient boosting decision tree, sports performance analysis.

Resumen

El objetivo de este estudio es identificar y analizar los cambios de momentum en el tenis, desarrollando un modelo basado en datos para cuantificar y predecir estos cambios y evaluar su influencia en los resultados de los partidos. A partir de datos de 6 torneos, que incluyen 564 partidos y más de 135 000 puntos, este estudio construyó un modelo de cálculo del momentum que integra 14 factores ponderados del partido, como la progresión de los puntos, la ventaja del servidor y las diferencias en la clasificación de los jugadores. El modelo incorpora ajustes para las discontinuidades de los sets y el momentum inicial basado en la clasificación de los jugadores para mejorar la precisión de la predicción. Tras el procesamiento y la validación de los datos, se realizó una prueba de concordancia Kappa con los datos del Campeonato de Wimbledon de 2023, la cual arrojó una alta coincidencia con los resultados reales (Kappa = 0,96). Utilizando un modelo de regresión con árboles de decisión potenciados por gradiente (GBDT), el estudio logró una alta precisión en la predicción de los cambios de momentum e identificó variables clave como la ventaja en el servicio y las diferencias en el marcador como indicadores principales de la dinámica del rendimiento. Este modelo reveló además que el momentum de los jugadores tiende a estabilizarse en puntos críticos, como el 40:30, mientras que fluctúa más en puntuaciones desfavorables. Estos hallazgos resaltan la utilidad del modelo para el análisis previo al partido, ya que permite obtener información detallada sobre los patrones tácticos y las respuestas psicológicas de los oponentes en condiciones de puntuación variables. En general, este modelo de momentum ofrece aplicaciones valiosas para mejorar la preparación de los jugadores y los ajustes estratégicos durante el partido, proporcionando a los entrenadores y a los jugadores una herramienta cuantificable para interpretar e influir en los resultados de los partidos.

Palabras clave: predicción de resultados, árboles de decisión potenciados por gradiente, análisis del rendimiento deportivo.

Introduction

With the integration of big data analytics into the sports industry, sports performance data collection and analysis have achieved unprecedented depth and precision, gaining widespread attention across professional, media, and fan communities. In recent years, metrics like win probability have become increasingly prominent in sports broadcasting (Duen & Peker, 2024), providing audiences with real-time insights into each player’s likelihood of success and enhancing the viewing experience. Tennis stands out due to its dynamic tactical shifts and intense point-to-point momentum changes, making it a rich subject for performance analysis. A prime example is the 2023 Wimbledon Men’s Singles Final, where 20-year-old Spanish star Carlos Alcaraz defeated 36-year-old legend Novak Djokovic. The match saw dramatic swings in momentum, from Djokovic’s dominance in the first set to Alcaraz’s pivotal tiebreak win in the second, followed by a series of momentum shifts that led to Alcaraz’s victory. Such dynamic shifts sparked widespread discussion about the concept of “momentum” in sports. However, while momentum is often described as “strength or force gained by motion or by a series of events” (Merriam-Webster, 2024), accurately measuring and analyzing it remains a challenge for both academic and sports communities. Therefore, defining momentum, identifying the factors influencing match flow, and predicting its impact on match outcomes are essential areas of performance analysis research in tennis (Sampaio et al., 2024). Relevant findings would be of significant value in helping coaches refine tactical strategies and enabling players to adapt better to real-time changes during matches.

Research on tennis covers a wide range of areas, including player skill and tactics analysis, match outcome factors, match prediction, and sports betting. The focus on player performance has garnered significant attention, with extensive studies exploring pre-match predictions and post-match analyses (Bayram et al., 2021). However, the necessity and significance of real-time momentum analysis during matches remain underexplored, likely due to the complexities and dynamic nature of capturing momentum shifts as they unfold. This gap highlights the need for a deeper understanding of how momentum influences the flow and outcome of matches, which could offer valuable insights for both players and coaches (Tognini & Perciavalle, 2022).

Defining momentum is challenging. Some scholars view it as a psychological state, reflecting a player’s confidence and drive during a match, while others consider it a technical and tactical advantage (Ahola & Dotson, 2014; Zheng Cao, 2011). A tennis data provider, on the other hand, defines it as “an exponentially weighted moving average of the leverage gained by a player” (Manuel, 2022). Dietl and Nesseler (2017) argue that players benefit from momentum as long as they control the match; once they lose control, their chance of winning subsequent sets diminishes significantly. Despite these differing perspectives, momentum clearly influences match outcomes. Real-time momentum analysis, however, demands large-scale, detailed match data alongside sophisticated data acquisition and processing tools to manage and analyze this information as the match unfolds.

Quantifying and measuring momentum pose further challenges. Traditionally, momentum has been assessed based on score and performance sequences. For instance, Moss and O’Donoghue (2015) investigated whether the outcomes of service games were significantly associated with the outcomes of the receiving and serving games that followed. However, with advancements in data analytics, more studies are employing big data and statistical models to quantify momentum shifts, capturing the dynamics and trends within matches for a more precise analysis of game flow. Zhong et al. (2024) introduced a composite model integrating a Logistic Regression Model and a LASSO-based Sparse ARMAX Model to predict momentum shifts and guide strategic decisions during games. Ahmed (2014) analyzed the relationship between a binomial probability distribution and the process of match play in tennis and constructed a probability model for two players of predictable abilities. Momentum analysis has also been explored across various sports, particularly in football and basketball. Noel et al. (2024) developed a data pipeline and indicated that future work should study momentum more from a feature/performance indicator point of view and less from the view of the dependence of sequential outcomes. Mingjia Qiu (2024) designed a quantitative framework to accurately identify momentum in basketball games and explored the role of momentum in games. Notably, Ötting et al. (2023) investigated the potential occurrence of momentum shifts in the dynamics of football matches.

Despite these advancements, the existing literature lacks comprehensive approaches for real-time momentum analysis, particularly in tennis. Studies are often limited to retrospective analyses or are constrained by simplified models that overlook the complexity of in-game factors. Additionally, they tend to neglect situational variables, such as player condition, court type, and environmental factors, which significantly influence momentum shifts in live matches (Wang & Lin, 2005; Martínez-Gallego et al., 2013). These gaps highlight the necessity for more advanced, adaptable models capable of analyzing and predicting momentum in real time, accommodating the multifaceted influences inherent in tennis matches.

Therefore, this study aims to develop a comprehensive model to quantify and predict momentum shifts in professional tennis matches using a machine learning approach. We hypothesize that the proposed momentum model, integrating player performance indicators and point-by-point dynamics, can effectively identify momentum shifts and predict match outcomes with high accuracy. The model is designed to offer both theoretical and practical value by enhancing our understanding of momentum as a quantifiable variable and providing actionable insights for tactical decision-making in competitive tennis.

Methods

Sample and Data

The data for this study comprised detailed point-by-point records of 564 men's singles main-draw matches from the 2021-2023 U.S. Open and Wimbledon tournaments (six tournaments in total), provided by Jeff Sackmann (2024) via Tennis Abstract (www.tennisabstract.com). In total, 135,110 points were played by 211 individual players.

Procedures and Statistical Analysis

Initially, missing values and outliers were detected to ensure data quality. For instance, missing values in the serve speed (mph) were treated as null speeds when calculating each match's average serve speed, thereby maintaining the completeness and accuracy of the dataset. Subsequently, the cleaned point-by-point data were used to extract relevant indicators for analyzing player performance. The data processing workflow is illustrated in Figure 1.

Figure 1 Performance indicator screening flow chart
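The handling of missing serve speeds described above can be illustrated with a minimal sketch. It assumes that "treated as null" means the missing readings are skipped rather than counted as zero; the function name and the sample values are hypothetical and not taken from the study's code.

```python
from statistics import mean
from typing import Optional, Sequence


def average_serve_speed(speeds_mph: Sequence[Optional[float]]) -> Optional[float]:
    """Mean serve speed for one match, skipping points with no recorded speed."""
    # Assumption: a missing reading (None) is excluded, not treated as 0 mph.
    recorded = [s for s in speeds_mph if s is not None]
    # A match with no recorded speeds yields None rather than a misleading 0.
    return mean(recorded) if recorded else None


# Hypothetical serve speeds for one match; None marks a missing reading.
match_speeds = [118.0, None, 95.0, 124.0, None]
print(average_serve_speed(match_speeds))
```

Skipping the missing readings keeps the match in the dataset while preventing absent measurements from pulling the average toward zero.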

Based on the existing literature and the information available in the raw data, summary statistics of 12 indicators were extracted to represent each player’s match performance. Additionally, the ranking gap was included as an opposition-effect factor on performance (see Table 1).

Table 1 Indicators Explanation Table

Index	Definition
Results	The match outcome, indicating whether Player 1 or Player 2 won
Aces	The total number of serves that landed in the service box and were untouched by the opponent
DF	The total number of double faults (two consecutive faults on one service point), each resulting in the loss of the point
1st_serve_success_rate	The proportion of successful first serves relative to the number of serves
1st_serve_won_rate	The proportion of points won on successful first serves
2nd_serve_won_rate	The proportion of points won on second serves
net_pt_won_rate	The proportion of points won at the net relative to net points played
break_pt_won_rate	The proportion of break points successfully won by the player
winner_won_rate	The proportion of points won by hitting winners (shots not touched by the opponent)
unf_err_rate	The proportion of unforced errors committed relative to the total number of errors
average_distance_run	The average distance covered per point during the match
average_won_rally_count	The average number of strokes in rallies won by the player
average_speed_mph	The average speed of all serves during the match
ranking_gap	The difference between the player’s ranking and the opponent’s ranking

Research Framework

The study first developed a comprehensive research framework to capture, analyze, and predict momentum in tennis matches. Figure 2 provides an overview of the framework, detailing the key stages of the research process, from data preparation to model application. The process began with the extraction of key performance indicators, followed by the establishment of a mathematical model to quantify momentum. To validate the model, a consistency check was performed by comparing the player with the higher momentum at the end of each set to the actual match outcome. This validation step assessed the model’s alignment with actual results. Subsequently, a predictive model was developed using a Gradient Boosting Decision Tree (GBDT) regression approach. This model analyzed turning points within tennis matches, identifying areas where momentum shifts significantly impact player performance. The predictive model's accuracy and interpretability were further evaluated to generate actionable tactical insights. These insights can be applied across various contexts, including pre-match analysis, in-match tactical adjustments, and post-match reviews to optimize training and performance.

Figure 2 The Structure of the Research Framework for Tennis Match Momentum Analysis

Tennis Momentum Model

In the modeling stage, the indicator weights were converted into additive momentum increments, so that indicators with different impact weights contribute increments of different sizes.

Player rankings are used to initialize momentum. The initialization of momentum scores in equations (1) to (3) is grounded in ATP rankings, which serve as a proxy for pre-match strength and reputation, an approach used in prior research to set player baseline states (Pham & Bufi, 2023). Let $M_i(t)$ denote the momentum of player $i$ at time point $t$, so that $M_1(t)$ and $M_2(t)$ are the momentum of Player 1 and Player 2 at time $t$, and let $p_i$ denote the ranking of player $i$, so that $p_1$ and $p_2$ are the rankings of Player 1 and Player 2. The initialized momentum scores are given in equations (1), (2), and (3):

Equation (4) defines momentum as a recursive function over time, drawing on basic principles of temporal state propagation found in time-series models such as AR and ARMA. The additive momentum increments in equation (5) are inspired by feature-based performance scoring systems common in sports analytics (Ma, 2024), where different actions (e.g., ace, break_point, point_won) contribute unequally to performance trends. As the model traverses each time point $t$ in the match, the momentum variable is updated according to equation (4):

Whenever a player wins a point, his performance status is expected to improve, and his momentum should therefore rise. This rise is represented by an increment, as shown in equation (5), where $d_j$ denotes the momentum increment for the $j$-th scoring item. For instance, when a player wins a point, his momentum increases by a fixed value $d_1$. Similarly, breaking serve or hitting an ace carries extra momentum, so these events receive their own increments, $d_2$ and $d_3$ respectively.

Furthermore, it was assumed that the outcome of the previous game affects the current game (Klaassen & Magnus, 2001). To reflect this, an initial momentum value is recalculated at the beginning of each game from the player’s ranking and the weighted momentum value at the last point of the previous game, as shown in equation (6). This inter-game adjustment is based on a weighted memory of previous performance, reflecting concepts from state transition models and psychological momentum theories (Ma, 2024), with a tunable weight $\beta$ modeling state carry-over. To balance the relationship between games and reflect players' state adjustments, experiments determined the optimal weighting coefficient to be $\beta = 0.2$. The equation for the initial momentum of each game is:

At this point, the study established a mathematical model that captures the flow of the situation as the game progresses. The model is represented by equation (7):

After obtaining the mathematical model of momentum, the study implemented it in Python and visualized the data, taking $d_1 = 1$, $d_2 = 1$, $d_3 = 1$, and $s = 1$. To show the trend of the players' momentum more intuitively as a match progresses, this study introduced the concept of momentum difference, obtained by subtracting the momentum of Player 2 from that of Player 1.
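The recursive update, the event increments, and the inter-game carry-over described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The ranking-based initial values are passed in as `m0` because their formula is not reproduced here. The carry-over rule (initial value plus $\beta$ times the momentum at the end of the previous game) is an assumed form, and the parameter $s$ is left out because the text does not define it. The `Point` class and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Point:
    winner: int              # 1 or 2: the player who won the point
    ace: bool = False        # the point was won with an ace
    break_won: bool = False  # the point converted a break of serve
    new_game: bool = False   # this point opens a new game


def momentum_series(
    points: Sequence[Point],
    m0: Tuple[float, float] = (0.0, 0.0),  # ranking-based initial momentum (supplied by caller)
    d1: float = 1.0,   # increment for winning a point
    d2: float = 1.0,   # extra increment for breaking serve
    d3: float = 1.0,   # extra increment for an ace
    beta: float = 0.2, # carry-over weight between games
) -> List[Tuple[float, float]]:
    """Return (M_1, M_2) after every point."""
    m = [m0[0], m0[1]]
    history: List[Tuple[float, float]] = []
    for k, pt in enumerate(points):
        if pt.new_game and k > 0:
            # Assumed carry-over: restart from the initial value plus a
            # beta-weighted share of the momentum at the previous game's end.
            m = [m0[i] + beta * m[i] for i in range(2)]
        w = pt.winner - 1
        m[w] += d1
        if pt.break_won:
            m[w] += d2
        if pt.ace:
            m[w] += d3
        history.append((m[0], m[1]))
    return history


def momentum_difference(history: Sequence[Tuple[float, float]]) -> List[float]:
    """Player 1 momentum minus Player 2 momentum at each point."""
    return [m1 - m2 for m1, m2 in history]


# Hypothetical four-point sequence spanning two games.
points = [
    Point(winner=1, new_game=True),
    Point(winner=1, ace=True),
    Point(winner=2),
    Point(winner=2, new_game=True),
]
print(momentum_difference(momentum_series(points)))
```

A positive difference means Player 1 holds the advantage under these assumptions; the sign of this series is what the validation and turning-point analyses work with.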

To validate the model, a Kappa consistency test was conducted (Cohen, 1960). Momentum data from the 31 matches of the 2023 Wimbledon tournament were extracted and converted into a binary classification: values greater than 0 indicate that Player 1 held the momentum advantage, while values less than 0 indicate that Player 2 did. The predicted outcome at each point was then derived from the momentum, and the results were filtered to obtain set-level outcomes. These were compared with the actual set outcomes (game_winner) using the Kappa coefficient to assess alignment.

The Kappa coefficient was used to assess the agreement between the outcomes predicted by the model and the actual outcomes in the raw data. As a widely recognized measure of agreement between categorical ratings, it is appropriate for this binary classification task. According to the widely used interpretation scale for Cohen's (1960) Kappa coefficient, a value below 0 indicates no agreement, 0-0.20 slight agreement, 0.21-0.40 fair agreement, 0.41-0.60 moderate agreement, 0.61-0.80 substantial agreement, and 0.81-1.00 almost perfect agreement. The Kappa value of 0.96 achieved in this study thus indicates almost perfect alignment between the predicted and actual outcomes, supporting the reliability of the proposed model.
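For reference, Cohen's Kappa compares the observed agreement p_o with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it for two label sequences. The function name and the sample labels are hypothetical and do not reproduce the study's data.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(predicted: Sequence[Hashable], actual: Sequence[Hashable]) -> float:
    """Cohen's Kappa between two equally long sequences of category labels."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(predicted)
    # Observed agreement: share of items with identical labels.
    p_o = sum(p == a for p, a in zip(predicted, actual)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    pred_counts, act_counts = Counter(predicted), Counter(actual)
    labels = set(pred_counts) | set(act_counts)
    p_e = sum(pred_counts[c] * act_counts[c] for c in labels) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both sequences use a single label")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical set winners: model prediction (sign of momentum difference) vs. actual.
predicted = [1, 1, 2, 2, 1, 2, 1, 2, 1, 1]
actual = [1, 1, 2, 2, 1, 2, 1, 1, 1, 1]
print(round(cohen_kappa(predicted, actual), 3))
```

In this toy example nine of ten labels agree, yet Kappa is noticeably lower than 0.9 because part of that agreement would be expected by chance alone.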

Momentum Prediction Model

The study also aimed to predict critical factors influencing match dynamics. Building on the concept of momentum differential, a threshold of 0 was set to identify "swings" or turning points within matches. The GBDT regression model was trained on the data to predict these momentum swings, and feature importance was analyzed to determine the key predictors of momentum shifts. Several configurations of the GBDT model were tested to optimize performance, and the final model parameters are presented in Table 2. Feature importance was determined by calculating the contribution of each feature to the splits across all decision trees.

Table 2 Main Parameters

Parameter Name	Parameter Value	Parameter Name	Parameter Value
Data Split	0.7	Min Samples Leaf	1
Number of Base Learners	100	Min Weight Samples Leaf	0
Learning Rate	0.1	Max Tree Depth	4
Min Samples Split	5	Max Leaf Nodes	30
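The settings in Table 2 could be expressed, for instance, with scikit-learn's `GradientBoostingRegressor`. The text does not name the software used, so the library choice is an assumption, as are the reading of "Data Split" as a 70% training fraction, the mapping of "Min Weight Samples Leaf" to `min_weight_fraction_leaf`, the random seeds, and the synthetic stand-in data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not the study's data): 500 rows, 4 made-up predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=500)

# "Data Split 0.7" is read here as 70% of rows used for training (assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, random_state=0
)

model = GradientBoostingRegressor(
    n_estimators=100,              # Number of Base Learners
    learning_rate=0.1,             # Learning Rate
    max_depth=4,                   # Max Tree Depth
    min_samples_split=5,           # Min Samples Split
    min_samples_leaf=1,            # Min Samples Leaf
    min_weight_fraction_leaf=0.0,  # assumed meaning of "Min Weight Samples Leaf"
    max_leaf_nodes=30,             # Max Leaf Nodes
    random_state=0,                # fixed seed for reproducibility (assumption)
)
model.fit(X_train, y_train)

r2 = model.score(X_test, y_test)           # R² on the held-out rows
importances = model.feature_importances_   # impurity-based, normalized to sum to 1
print(r2, importances)
```

The `feature_importances_` attribute aggregates each feature's contribution to the splits over all trees, which corresponds to the kind of importance ranking the study reports.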

To evaluate the effectiveness and generalization ability of the proposed momentum prediction model, several alternative machine learning algorithms were tested alongside GBDT. These included LightGBM, XGBoost, and K-Nearest Neighbors (KNN). All models were trained on the same dataset using identical features and hyperparameter tuning strategies. Their performance was assessed using three key metrics: mean absolute error (MAE), root mean square error (RMSE), and the coefficient of determination (R²). The results, presented in Table 3, show that the GBDT model achieved the lowest MAE (0.659) and RMSE (0.979) and the highest R² (0.828), indicating strong predictive accuracy and goodness of fit. LightGBM and XGBoost yielded comparable results, with only marginal differences from GBDT. These findings suggest that all three gradient-boosting frameworks are suitable for momentum prediction tasks, with GBDT performing slightly better overall. In contrast, the KNN model demonstrated weaker predictive performance across all metrics. This gap indicates that KNN, which lacks the ability to model complex feature interactions and sequential dependencies, is less suited to the nuanced task of modeling momentum dynamics in tennis.

Overall, these results reinforce the effectiveness of the proposed GBDT-based model while confirming the robustness of gradient-boosted ensemble methods for this specific prediction problem.

Table 3 Model Comparison

Model	MAE	RMSE	R²
GBDT	0.659	0.979	0.828
LightGBM	0.664	0.984	0.827
XGBoost	0.661	0.991	0.824
KNN	0.855	1.46	0.738
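The three metrics reported in Table 3 follow their standard definitions, which the sketch below spells out. The function name and the sample values are hypothetical; they illustrate the formulas only.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """MAE, RMSE, and R² for paired true and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R² is undefined when all true values are identical")
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "R2": r2}


# Hypothetical momentum-difference values and predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(regression_metrics(y_true, y_pred))
```

MAE and RMSE are expressed in the units of the target (here, momentum difference), while R² is unitless, which is why the three are usually read together.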

Results

Factor Analysis

To assess the selected indicators, the study first evaluated their orientation. It was determined that indicators such as ranking gap and double faults (DF) were negatively oriented, where lower values corresponded to better performance. Conversely, all other indicators were positively oriented. To ensure consistency in the weight analysis, negatively oriented indicators were inverted, aligning all indicators to follow a consistent trend.

Before proceeding with factor analysis, an independent-samples t-test was performed (see Table 4). The test for homogeneity of variances yielded a significant P-value of 0.024** for average serve speed, indicating a violation of the homogeneity assumption. Consequently, this indicator was excluded from subsequent analyses, leaving a total of 12 indicators for consideration.

To identify relevant features for the model, Pearson correlation coefficients were calculated between each indicator and the player score. The magnitude of these coefficients indicated which indicators were most strongly correlated with player performance. Based on this analysis, five key indicators were selected for the momentum calculation model: initial ranking, right to serve, point scored, break point won, and ace occurrence.

During the model development phase, the latest ATP rolling rankings (from one week prior to each tournament) were used as initial input values. The processed dataset included quantitative variables such as point_no, server (referring to the player currently serving), point_winner, game_winner, p1_ace, p2_ace, p1_winner, p2_winner, p1_double_fault, p2_double_fault, p1_unf_err, p2_unf_err, p1_break_pt_won, and the result_gap from the previous point. In the modeling framework, result_gap served as the dependent variable, while the remaining indicators were used as predictors. To better interpret the magnitude of the statistical differences between winning and losing players, effect sizes (Cohen’s d) were calculated alongside traditional significance tests.

Table 4 summarizes the statistical results for each performance factor, including F and P values, as well as corresponding effect sizes. Indicators such as Aces (d = 0.14), Distance Run (d = 1.50), and Winners Won Rate (d = 0.32) demonstrated varying degrees of impact. In particular, Distance Run exhibited a large effect size, highlighting its relevance in distinguishing dominant performance patterns.

Table 4 Factor Analysis Index

Indicator	SD (Results = 1.0)	SD (Results = 0.0)	F	P	Cohen's d
Aces	5.803	8.203	0.567	0.455	0.14
Ranking gap(converted to positive)	57.673	57.673	0.000	1.000	-
DF	2.966	3.425	0.537	0.466	0.31
Double faults (DF)	0.063	0.098	3.300	0.074 *	12.14 ★
First serve success rate (%)	0.063	0.114	4.119	0.047**	10.86 ★
First serve points won rate (%)	0.084	0.085	0.094	0.760	11.83 ★
Second serve points won rate (%)	3.290	5.438	5.325	0.024**	0.22
Average serve speed	0.114	0.115	0.002	0.968	8.73 ★
Net points won rate (%)	0.179	0.191	0.157	0.693	5.40 ★
Break point conversion rate (%)	0.066	0.090	2.058	0.157	12.67 ★
Winners won rate (%)	0.090	0.078	0.836	0.364	0.32
Unforced error rate (%)	3.041	2.815	0.395	0.532	0.34
Average distance run	0.722	0.605	0.168	0.684	1.50

Note: ***, **, * represent the significance levels of 1%, 5%, and 10%, respectively.

★ Values of d > 8 are likely inflated due to very small standard deviations (SD < 0.12).
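The inflation noted above follows directly from how Cohen's d is built: the mean difference is divided by a pooled standard deviation, so a very small SD magnifies even a modest gap. The sketch below uses the common pooled-SD form, which is an assumption because the text does not state which variant was applied. All numbers are hypothetical.

```python
import math


def cohens_d(mean_a: float, sd_a: float, n_a: int,
             mean_b: float, sd_b: float, n_b: int) -> float:
    """Cohen's d using the pooled standard deviation of two groups."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Hypothetical count indicator: gap of 1 with SD 5 gives a small effect.
d_count = cohens_d(9.0, 5.0, 50, 8.0, 5.0, 50)

# Hypothetical rate indicator: gap of 0.10 with SD 0.01 gives a very large d.
d_rate = cohens_d(0.75, 0.01, 50, 0.65, 0.01, 50)

print(d_count, d_rate)
```

Under these made-up inputs the rate indicator's d is roughly fifty times the count indicator's, even though neither gap is remarkable on its own scale, which is the pattern the table note warns about.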

Kappa Coefficient Test

The Kappa coefficient of 0.96 demonstrates a nearly perfect alignment between the model's calculated momentum advantage and the actual set outcomes, validating its reliability. The test's significance levels (z = 33.081, p < 0.01) further confirm the statistical robustness of this alignment.

Model Testing Results and Evaluation

Figure 3 presents a partial prediction plot for the test dataset, comparing predicted values with actual values. In the plot, the blue line represents the true values, while the green line shows the predicted values. The model demonstrates a high degree of accuracy, as the predicted values closely follow the trend of the actual data, particularly around the 935th point of the season starting from the Round of 16. Some minor discrepancies are observed at smaller peaks and valleys, suggesting that while the model captures the general patterns effectively, there is room for refinement in specific intervals.

Figure 3 The Predicted Value Compared with The True Value

To systematically evaluate and address potential overfitting observed in Figure 3, the model was assessed using a 5-fold cross-validation approach. As shown in Table 5, the mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and R² scores of the cross-validation set closely mirrored those of both the training and test sets. Specifically, the training set achieved an MAE of 0.576 and an R² of 0.851, while the 5-fold cross-validation yielded an MAE of 0.609 and R² of 0.829. The test set recorded a comparable MAE of 0.659 and R² of 0.828. These minimal performance gaps indicate that the model generalizes well and does not exhibit signs of overfitting. Additionally, the RMSE difference between training (0.808) and test set (0.979) was small relative to the scale of the target variable, and the standardized effect size (Cohen’s d = 0.15) between the MAEs of training and test sets further confirmed negligible overfitting. Together, these results validate the stability and robustness of the proposed momentum prediction model.

Table 5 Model Comparison

Metric	Training Set	Cross-Validation	Test Set
MSE	0.653	0.749	0.959
RMSE	0.808	0.864	0.979
MAE	0.576	0.609	0.659
R²	0.851	0.829	0.828

For both the training and test sets, the corresponding values are illustrated in Figure 4.

Figure 4 Comparison of Error Metrics between Train and Test Sets

The close alignment between these metrics suggests that the model does not overfit the training data and maintains strong generalization capability. While this level of performance is satisfactory, further parameter optimization or exploration of alternative algorithms could enhance predictive accuracy on unseen data.

Feature Importance

The analysis of feature importance provided insights into the relative impact of various factors on momentum shifts. Figure 5 illustrates the key predictors identified by the model. The server variable ranked as the most significant factor influencing momentum swings, followed by point_no, which represents the progression of the match, and point_winner, which indicates the winner of a specific point. These features were determined to have the greatest impact on predicting momentum dynamics during a match.

Figure 5 Feature Importance Ratio

Model Application

Real-Time Momentum Visualization

To illustrate the model's effectiveness, the final match of the 2023 Wimbledon tournament between Carlos Alcaraz and Novak Djokovic was selected for visualization (Figure 6).

Figure 6 Momentum Visualization of the 2023 Wimbledon Men’s Final

The visualization shows that Djokovic initially held the momentum advantage, while Alcaraz gained control midway through the match, resulting in a significant momentum shift. The model accurately tracks this transition, with both players experiencing sharp momentum changes at critical points, indicating a close contest. Alcaraz's momentum eventually stabilized at a higher level, aligning with the actual match outcome and demonstrating the model’s capability to depict shifts in competitive advantage over time. The momentum differential graph offers a clear view of each player’s relative advantage during the match.

The model was further applied to other matches, such as a round-of-16 match between Carlos Alcaraz and Nicolas Jarry (Figure 7), to illustrate different momentum patterns. By adjusting the match_id in the model code, momentum trajectories for various matches can be generated, showcasing the model’s adaptability to different match dynamics.

Figure 7 Momentum Visualization of a Round-of-16 Match

This visualization depicts momentum patterns from another match in the tournament. Unlike the final, this match exhibited more consistent momentum patterns, with fewer abrupt shifts, highlighting the variability in competitive dynamics across matches.

Pre-Match Analysis and Opponent Profiling

The findings of this study demonstrate the model’s suitability for pre-match analysis and opponent profiling. The model was used to calculate momentum turning points for Carlos Alcaraz and Novak Djokovic under various score scenarios during the U.S. Open and Wimbledon tournaments from 2022 to 2023, providing valuable insights into their strategic tendencies and performance shifts.

Figure 8 presents the momentum turning points for Carlos Alcaraz across 22 matches and Novak Djokovic across 26 matches. Positive_Turning_Points indicates moments where momentum shifts from negative to positive, while Negative_Turning_Points represents shifts from positive to negative. For Carlos Alcaraz, at a score of 0:30, negative turning points (75) outnumber positive ones (62). Conversely, at 30:40, positive turning points (49) exceed negative ones (33). The lower totals at scores such as 40:0, 40:15, and 40:30 reflect fewer momentum fluctuations, indicating greater stability at these moments. For Novak Djokovic, the total number of momentum turning points is highest at a score of 15:0 (142), followed by 15:40 (132) and 15:15 (130). Negative turning points dominate at disadvantageous scores like 0:40 and 0:30, while critical moments such as 15:40 and 15:15 exhibit many positive turning points.

Figure 8 Momentum Turning Points for Carlos Alcaraz and Novak Djokovic
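A tally of this kind can be sketched as a count of sign changes in the momentum differential around the threshold of 0, grouped by game score. The sketch assumes that the differential is expressed from the profiled player's perspective, that each turning point is attributed to the score at which the point was played, and that values of exactly 0 do not count as a sign. The function name and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence


def turning_points_by_score(
    diffs: Sequence[float], scores: Sequence[str]
) -> Dict[str, Dict[str, int]]:
    """Count positive and negative turning points per game score."""
    if len(diffs) != len(scores):
        raise ValueError("diffs and scores must have the same length")
    tally = defaultdict(lambda: {"positive": 0, "negative": 0})
    last_sign = 0  # sign of the most recent non-zero differential
    for diff, score in zip(diffs, scores):
        sign = (diff > 0) - (diff < 0)
        if sign == 0:
            continue  # exactly zero: neither player leads, keep the previous sign
        if last_sign != 0 and sign != last_sign:
            # Negative-to-positive is a positive turning point, and vice versa.
            tally[score]["positive" if sign > 0 else "negative"] += 1
        last_sign = sign
    return {score: dict(counts) for score, counts in tally.items()}


# Hypothetical differential after each point, with the score at which it was played.
diffs = [1.0, -0.5, -1.5, 0.0, 1.0, 2.0, -1.0]
scores = ["0:0", "15:0", "15:15", "15:30", "30:30", "40:30", "0:0"]
print(turning_points_by_score(diffs, scores))
```

Summing the two counts per score gives totals comparable to the "sum" values discussed above; low totals at a score would suggest fewer momentum reversals there.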

DISCUSSION

The primary objective of this study was to develop a comprehensive model for quantifying and analyzing momentum in tennis matches, addressing the challenges of capturing real-time trends and transforming the abstract concept of momentum into concrete, data-driven visual representations. Using data from six major tennis tournaments, this study successfully defined momentum and identified key influencing factors through factor analysis. This foundation enabled us to build a model capable of representing and calculating momentum shifts, which we visualized using real-time data. By incorporating the GBDT algorithm, the model demonstrated practical applications in identifying and analyzing momentum shifts throughout matches. Our analysis confirmed the critical role of momentum in tennis performance, offering practical recommendations for pre-match analysis, in-match tactical adjustments, and post-match reviews. Furthermore, the model showed potential for use in opponent research and preparation, providing new insights and tools for coaches and players.

A significant challenge addressed in this study was the quantification of momentum, an inherently dynamic and elusive factor in sports. By integrating player performance metrics, such as serve success rates, ranking disparities, and point-by-point results, the model effectively captured momentum changes in real time, thus overcoming a key limitation in previous research. Application of the model to the 2023 Wimbledon Championship demonstrated a strong alignment between momentum indicators and actual match outcomes, as evidenced by a Kappa coefficient of 0.96. This high consistency validates the model’s reliability, confirming that momentum can effectively represent match performance. These findings provide theoretical support for future momentum-based analysis and prediction in tennis.
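For readers unfamiliar with the Kappa statistic, the following sketch computes Cohen's kappa (Cohen, 1960) as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The example labels are invented, and the choice to compare the momentum-based leader with the actual winner for each unit of play is an assumption made for illustration, not a description of the exact validation procedure.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of nominal labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of positions where both labels match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of marginal label frequencies, summed.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences consist of one identical label
    return (observed - expected) / (1.0 - expected)


# Invented labels: leader according to momentum vs. actual winner.
momentum_leader = ["P1", "P1", "P2", "P2", "P1", "P2", "P1", "P2", "P1", "P2"]
actual_winner = ["P1", "P1", "P2", "P2", "P1", "P2", "P1", "P2", "P1", "P1"]
kappa = cohen_kappa(momentum_leader, actual_winner)
```

In this toy case, nine of ten labels agree (p_o = 0.9) and chance agreement is p_e = 0.5, so kappa is approximately 0.8. Values close to 1, such as the 0.96 reported here, indicate agreement well beyond chance.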

Our findings align with previous research that has recognized momentum’s predictive ability in tennis (Moss & O'Donoghue, 2015). The model incorporated a wider range of dynamic factors, such as point-by-point results, serve advantages, and match sequences, making momentum capture more comprehensive and accurate than in existing models (Lin et al., 2024). While Moss and O'Donoghue (2015) focused on serve game patterns, our approach expanded the scope to capture real-time momentum changes in every rally, providing a more detailed analysis of match flow. Furthermore, although Ahmed’s (2014) probabilistic model emphasized match outcome prediction, it lacked continuous momentum analysis. In contrast, our model integrates both predictive and descriptive capabilities. This advancement enhances our understanding of the temporal dynamics in tennis and provides greater accuracy in identifying and interpreting momentum shifts and their effects on match outcomes.

From a practical perspective, the model offers actionable value for both coaches and players. For coaches, the visualization and quantification of momentum shifts can assist in real-time tactical decisions, for instance, determining the optimal timing for coaching interventions, adjusting player positioning strategies, or recognizing when an opponent is gaining momentum. Players can also use the model outputs to heighten their situational awareness during matches, allowing them to respond more effectively to turning points. Knowing when momentum is likely to shift enables players to manage psychological pressure and adapt their tactics accordingly. Post-match momentum analysis can further provide valuable feedback on performance patterns, helping players and coaching staff refine training and strategic planning.

The analysis of feature importance revealed that the server variable had the greatest impact on momentum shifts. This is consistent with conventional match analysis, where the server is often seen as having more control over the pace and dynamics of play. Players with a stronger serve are better able to influence momentum, particularly during critical moments when momentum shifts are most pronounced. The point_no variable also proved to be significant, with momentum shifts becoming more pronounced as the match progresses. This suggests that factors such as physical endurance and psychological resilience become more critical in the later stages of the match. Specifically, turning points often occur in pivotal moments like the seventh game of a set or during tiebreaks, where the psychological stakes are particularly high. The point_winner variable further underscores the importance of consistently winning points to build momentum, which influences the trajectory of the match. Players who accumulate consecutive points can generate significant momentum, which impacts overall match results.
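The idea of ranking variables by their influence on predicted momentum can be illustrated with permutation importance, a model-agnostic technique that measures how much the prediction error grows when one feature column is shuffled. This is a stand-in chosen because it needs only the standard library; it is not the importance measure produced by the GBDT model in this study. The predictor, the feature order (server, point_no, an unused column), and the data are all hypothetical.

```python
import random


def permutation_importance(predict, rows, targets, n_features, seed=0):
    """Increase in mean squared error after shuffling each feature column."""
    rng = random.Random(seed)

    def mse(data):
        return sum((predict(r) - t) ** 2 for r, t in zip(data, targets)) / len(data)

    baseline = mse(rows)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(mse(shuffled) - baseline)
    return importances


def toy_predict(row):
    """Hypothetical fitted model: strong server effect, weak point_no effect."""
    server, point_no, _unused = row
    return 2.0 * server + 0.01 * point_no


# Synthetic rows: (server flag, point number, a column the model ignores).
toy_rows = [(i % 2, i + 1, (i * 7) % 5) for i in range(40)]
toy_targets = [toy_predict(r) for r in toy_rows]
importances = permutation_importance(toy_predict, toy_rows, toy_targets, 3)
```

In this toy setting the server column receives the largest importance, point_no a much smaller one, and the ignored column exactly zero. With a fitted model, the same procedure would be run on held-out points rather than on the training data.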

Despite its contributions, this study has certain limitations. The accuracy of the model relies on data quality and completeness, so missing or incomplete data could reduce its reliability. Additionally, there is potential to improve the predictive accuracy by exploring alternative algorithms. Furthermore, because the model primarily focuses on technical indicators, it does not fully account for tactical, physical, and psychological factors, which limits its comprehensiveness. Future research could integrate these elements to increase the model’s practical applicability.

The results of this study are significant both theoretically and practically. Theoretically, our model offers a robust framework for sports performance analysis by quantifying momentum, opening new avenues for exploring how momentum interacts with psychological, tactical, and physical performance, not only in tennis but also across other sports. Practically, coaches and players can use the model to better understand key momentum shifts and make informed strategic adjustments. By pinpointing when and how momentum changes occur, players can anticipate critical moments and adjust their tactics in real time. Additionally, post-match momentum reviews provide insights for refining training and improving overall performance.

In conclusion, this study developed a data-driven model for analyzing momentum, which not only visualizes momentum changes but also predicts their impact on match outcomes. By integrating multiple dynamic factors, the model enhances predictive capabilities, making it a valuable tool for pre-match analysis and in-match tactical decision-making. These findings suggest that momentum is not solely a psychological concept but a quantifiable factor with a substantial impact on sports performance. Future research could expand on this model by incorporating additional variables, such as player fatigue and court conditions, to further refine momentum prediction. Ultimately, this study lays a solid foundation for applying momentum analysis in sports, with potential for broader applications across various competitive environments.

CONCLUSION

This study quantified momentum in tennis and modeled its impact on match outcomes by constructing a machine-learning-based model that captures real-time momentum shifts. Overall, momentum changes during matches played a substantial role in shaping player performance, especially at pivotal scoring moments. The turning points of momentum were closely associated with serve success, return effectiveness, and break-point-related performance indicators. Such findings provide evidence that momentum is not merely an abstract concept but a measurable factor that significantly influences match results. The results reveal the significance of understanding momentum when interpreting the performance of professional tennis players, informing adapted tactical decisions and training methodologies by coaches and players. The developed momentum model could serve as an efficient analytical tool for performance analysts during pre-match briefings and post-match reviews. By integrating this model into long-term player development, coaches and performance analysts can systematically monitor players’ technical, tactical, and physical performance while considering individual variations across match environments.

REFERENCES

Ahmed, T. (2014). Modelling Probabilities in Games of Tennis. In IB HL Maths Portfolio Type II. Oaktree International School. https://hoursandseconds.wordpress.com/wp-content/uploads/2013/11/modelling-tennis.pdf

Bayram, F., Garbarino, D., & Barla, A. (2021). Predicting tennis match outcomes with network analysis and machine learning. In International Conference on Current Trends in Theory and Practice of Informatics (pp. 505-518). Springer International Publishing. https://doi.org/10.1007/978-3-030-67731-2_37

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37-46. https://doi.org/10.1177/001316446002000104

Den Hartigh, R. J., Van Geert, P. L., Van Yperen, N. W., Cox, R. F., & Gernigon, C. (2016). Psychological momentum during and across sports matches: Evidence for interconnected time scales. Journal of Sport and Exercise Psychology, 38(1), 82-92. https://doi.org/10.1123/jsep.2015-0162

Dietl, H., & Nesseler, C. (2017). Momentum in tennis: Controlling the match. UZH Business Working Paper Series, (365). https://doi.org/10.5167/uzh-174238

Duen, S., & Peker, S. (2024). Predicting the Duration of Professional Tennis Matches Using MLR, CART, SVR and ANN Techniques. In C. Kahraman, S. Cevik Onar, S. Cebi, B. Oztaysi, A. C. Tolga, & I. Ucal Sari (Eds.), Intelligent and Fuzzy Systems. INFUS 2024. Lecture Notes in Networks and Systems (Vol. 1088). Springer, Cham. https://doi.org/10.1007/978-3-031-70018-7_37

Iso-Ahola, S. E., & Dotson, C. O. (2014). Psychological Momentum: Why Success Breeds Success. Review of General Psychology, 18(1), 19-33. https://doi.org/10.1037/a0036406

Klaassen, F. J. G. M., & Magnus, J. R. (2001). Are Points in Tennis Independent and Identically Distributed? Evidence From a Dynamic Binary Panel Data Model. Journal of the American Statistical Association, 96(454), 500-509. https://doi.org/10.1198/016214501753168217

Lin, J., Shao, P., & Zhang, Q. (2024). Advancing tennis analytics: Comprehensive modeling for momentum identification and strategic insights. International Journal of Computer Science and Information Technology, 2(1), 104-117. https://doi.org/10.62051/ijcsit.v2n1.12

Ma, M. (2024). Momentum Dynamics in Competitive Sports: A Multi-Model Analysis Using TOPSIS and Logistic Regression. arXiv preprint arXiv:2409.02872. https://doi.org/10.48550/arXiv.2409.02872

Manuel, J. (2022). Capturing Momentum in Tennis. Opta Analyst. Retrieved October 10, 2024 from https://theanalyst.com/2022/03/capturing-momentum-in-tennis

Martínez-Gallego, R., Guzmán, J., James, N., Pers, J., Ramón-Llin, J., & Vučković, G. (2013). Movement characteristics of elite tennis players on hard courts with respect to the direction of ground strokes. Journal of Sports Science & Medicine, 12(2), 275-281. https://pmc.ncbi.nlm.nih.gov/articles/PMC3761832

Merriam-Webster. (2024). Momentum. Merriam-Webster.com dictionary. Retrieved October 17, 2024 from

Moss, B., & O’Donoghue, P. (2015). Momentum in US Open men’s singles tennis. International Journal of Performance Analysis in Sport, 15(3), 884-896. https://doi.org/10.1080/24748668.2015.11868838

Noel, J. T. P., Prado da Fonseca, V., & Soares, A. (2024). A Comprehensive Data Pipeline for Comparing the Effects of Momentum on Sports Leagues. Data, 9(2), 29. https://doi.org/10.3390/data9020029

Ötting, M., Langrock, R., & Maruotti, A. (2023). A copula-based multivariate hidden Markov model for modelling momentum in football. AStA Advances in Statistical Analysis, 107(1), 9-27. https://doi.org/10.1007/s10182-021-00395-8

Pham, C. L., & Bufi, K. (2023). Predicting Tennis Match Results Using Classification Methods. LUP Student Papers, 9121180. http://lup.lub.lu.se/student-papers/record/9121180

Qiu, M., Zhang, S., Yi, Q., Zhou, C., & Zhang, M. (2024). The influence of "momentum" on the game outcome while controlling for game types in basketball. Frontiers in Psychology, 15, 1412840. https://doi.org/10.3389/fpsyg.2024.1412840

Sackmann, J. (2024). Tennis slam point-by-point data. GitHub. Retrieved May 10, 2024 from

Sampaio, T., Oliveira, J. P., Marinho, D. A., Neiva, H. P., & Morais, J. E. (2024). Applications of Machine Learning to Optimize Tennis Performance: A Systematic Review. Applied Sciences, 14(13), 5517. https://doi.org/10.3390/app14135517

Tognini, M., & Perciavalle, V. (2022). Real-Time Data and Machine Learning in Sports Performance Analysis: A Case Study on Tennis. Journal of Sports Science & Technology, 21(4), 98-111.

Wang, L. H., & Lin, H. T. (2005). Momentum transfer of upper extremity in tennis one-handed backhand drive. Journal of Mechanics in Medicine and Biology, 5(2), 231-241. https://doi.org/10.1142/S0219519405001436

Cao, Z., Price, J., & Stone, D. F. (2011). Performance Under Pressure in the NBA. Journal of Sports Economics, 12(3), 231-252. https://doi.org/10.1177/1527002511404785

Zhong, M., Liu, Z., Liu, P., & Zhai, M. (2024). Searching for the Effects of Momentum in Tennis and its Applications. Procedia Computer Science, 242, 192-199. https://doi.org/10.1016/j.procs.2024.08.262