Original Investigation
DOI: 10.30827/ijrss.33243


Markov-chain Modelling and Simulative Assessment of the Impact of Selected Tactical Behaviours in Modern Tennis


Modelado mediante cadenas de Markov y evaluación simulada del impacto de determinados comportamientos tácticos en el tenis moderno


International Journal of Racket Sports Science, vol. 5(1) (January-June, 2023), pp. 1-13. eISSN: 2695-4508


Received: 10-04-2022
Accepted: 19-06-2023

AUTHORS

Frederic Rothe

1 Technical University of Munich, Munich, Germany

Martin Lames

1 Technical University of Munich, Munich, Germany


Corresponding Author: Frederic Rothe freddirothe@web.de

Cite this article as: Rothe, F., & Lames, M. (2023). Markov-chain Modelling and Simulative Assessment of the Impact of Selected Tactical Behaviours in Modern Tennis. International Journal of Racket Sports Science, 5(1), 1-13. DOI: 10.30827/ijrss.33243




Abstract

Game behaviour in net games and other sports is often captured in the form of discrete performance indicators, which represent frequencies or relative frequencies of key behavioural variables. However, discrete performance indicators are often of low practical relevance because they lack information on the sequence of actions and the underlying interaction of players in a match. Establishing a connection between performance indicators and sporting success therefore remains an open challenge. In tennis, finite Markov chain modelling based on a transition matrix has shown promise in circumventing these issues. The transition matrix captures equivalence classes of strokes as a sequence of states with possible transitions between them, essentially representing a rally. Furthermore, finite Markov chain modelling enables the determination of the relevance of state transitions regarding performance. Since existing state transition models may be outdated, a major aim of the current study was to establish a newly designed transition matrix which is representative of the game structure of tennis. The sufficiency of the transition matrix as a descriptive tool was demonstrated using actual match data. Furthermore, the relevance of selected state transitions was determined using finite Markov chain modelling. Match data and the resulting values for performance relevance were analysed with regard to the influencing factors of sex and court surface. This revealed only minor differences regarding both factors, specifically indicating a convergence of game structure in men and women.

Keywords: finite Markov chain modelling, state transitions modelling, tennis performance indicators, theoretical performance analysis, tactical behaviour.

Resumen

Los comportamientos durante los juegos de red u otros deportes suelen capturarse en forma de indicadores del rendimiento discretos que representan frecuencias o frecuencias relativas de variables conductuales clave. Sin embargo, en este aspecto, los indicadores del rendimiento discretos suelen tener poca relevancia práctica ya que carecen de información sobre la secuencia de acciones y la interacción subyacente de los jugadores durante un partido. Por lo tanto, establecer una conexión entre los indicadores del rendimiento y el éxito deportivo sigue siendo un reto. En tenis, el modelado mediante cadenas de Markov finitas basado en una matriz de transición se muestra prometedor para sortear estos problemas. La matriz de transición permite capturar clases de equivalencia de golpes como una secuencia de estados con la posibilidad de transiciones entre ellos, representando básicamente un intercambio de golpes. Adicionalmente, el modelado mediante cadenas de Markov finitas permite determinar la relevancia de transiciones de estado con relación al rendimiento. Dado que los modelos de transición de estado actuales pueden estar obsoletos, uno de los objetivos principales de este estudio fue establecer una matriz de transición con un diseño nuevo que fuera representativa de la estructura de un juego de tenis. Se pretendía demostrar la suficiencia de la matriz de transición como herramienta descriptiva utilizando datos reales de partidos. Adicionalmente, la relevancia de las transiciones de estado seleccionadas se determinó a través de modelado mediante cadenas de Markov finitas. Los datos de los partidos y los valores emergentes para la relevancia del rendimiento se analizaron en relación con dos factores influyentes: sexo y superficie del campo. Esto reveló solo pequeñas diferencias con respecto a ambos factores, indicando específicamente una convergencia de la estructura de juego en hombres y mujeres.

Palabras clave: modelado mediante cadenas Markov finitas, modelado de transiciones de estado, indicadores del rendimiento en tenis, análisis teórico del rendimiento, comportamiento táctico.




1. INTRODUCTION


Performance analysis or the scientific analysis of sports performances generally has the objective of identifying and understanding factors which are crucial for performance in sports in order to provide information on how to maximize success (McGarry, 2009). In the context of game sports, i.e. invasion games, net games and striking/fielding games (Read & Edwards, 1992), performance is mainly determined by target-oriented behaviors or tactical actions and less dependent on biomechanical and physiological components (Hughes & Bartlett, 2002). In theoretical performance analysis, these tactical behaviors or game behaviors are generally assessed as performance indicators applying the research methods of game observation and notational analysis. Performance indicators reflect aspects which are deemed relevant for performance in a specific sport. Performance indicators mainly represent the frequency or relative frequency of a corresponding behavior (Hughes & Bartlett, 2002). In tennis, performance indicators are mostly presented as success rates of certain key behaviors or game variables. These may include among others, percentages of first serves or returns played in, point winning rates after respective serves and returns, as well as absolute variables like the number of aces and double faults (Ma et al., 2013; Reid, Morgan, & Whiteside, 2016).

An alternative method to scrutinize the structure of sports may be found in Markov chain modelling (Lames & McGarry, 2007). Markov chain modelling is a form of probabilistic modelling that is applicable in a variety of fields, ranging from the social sciences to epidemiology. A recent example of an application in medicine is the testing of the efficacy of Covid-19 vaccines (Vygen-Bonnet et al., 2021). Markov chain modelling in game sports may circumvent some evident shortcomings of discrete performance indicators (Lames & McGarry, 2007).

With regard to their conceptualization, discrete performance indicators lack context since they constitute summative statistics in the form of mean values derived from an individual performance or a set of performances. This results in the neglect of underlying interactions between teams/players and of the sequence of events. Thus, by merely assessing discrete performance indicators it is hardly possible to examine the relation between actions and performance outcomes. Means derived from multiple performances, especially in performance profiles which accumulate different performances of an individual athlete (O’Donoghue, 2013), also suffer from the instability of game behavior (Lames & McGarry, 2007). This was demonstrated, for example, by McGarry and Franks (1996), who found squash players to exhibit varying shot responses to different opponents.

Finally, discrete performance indicators are commonly correlated with outcome variables like ranking position or winning/losing a match with the aim of explaining success or the lack of it. However, the success of certain tactical behaviors is highly context dependent. Establishing a definite causality between discrete performance indicators and outcomes might therefore be misleading without knowing the sequential context of game behavior (Sampaio & Leite, 2013).

Markov chain modelling, on the other hand, can serve descriptive purposes and may also establish a link between game actions and outcomes. Lames (1991) introduced finite Markov chain modelling based on a transition matrix for tennis. Within the transition matrix (see examples in Tables 1-3), a rally is represented as a succession of discrete states which are equivalence classes of strokes. Transitions between different states are possible and are quantified by the observed relative frequency of the corresponding game action. For instance, the first service error rate is given by the transition probability between the states “First service” and “Second service”. In general, a state transition models a player’s stroke and the associated outcome, which may be a following stroke of the opponent as well as a point or an error of the player. This allows a clear differentiation between states, explaining why state transition modelling is especially convenient for net games. Moreover, individual states or state transitions, like the service error rate mentioned above, can be treated similarly to performance indicators. Aggregating states and transitions in the form of a transition matrix, containing each stroke of a match, results in a super rally. Most importantly, this preserves the sequential context of the single rallies. State transition modelling thus provides a more comprehensive descriptive alternative to the generic display of performance indicators (Lames, 2020).
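As a minimal sketch of how such a matrix could be derived from coded rallies, the following Python snippet counts consecutive state pairs and normalises each row to relative frequencies. The rally sequences, the function name `transition_matrix` and the pooling of both players into a single matrix are illustrative assumptions and do not reproduce the observational system used in this study.

```python
from collections import Counter, defaultdict


def transition_matrix(rallies):
    """Estimate transition probabilities from rallies coded as state sequences."""
    counts = defaultdict(Counter)
    for rally in rallies:
        # Each pair of consecutive states is one observed transition.
        for src, dst in zip(rally, rally[1:]):
            counts[src][dst] += 1
    # Row-normalise the counts to relative frequencies.
    return {
        src: {dst: n / sum(row.values()) for dst, n in row.items()}
        for src, row in counts.items()
    }


# Hypothetical rallies, invented purely for illustration.
rallies = [
    ["S1", "R1", "GS 3/5", "Error"],
    ["S1", "S2", "R2", "GS 3/5", "GS 4/6", "Point"],
    ["S1", "Point"],
    ["S1", "S2", "Error"],
]
matrix = transition_matrix(rallies)
print(matrix["S1"])
```

In this toy data set two of the four first serves are followed by a second serve, so `matrix["S1"]["S2"]` is 0.5, which plays the role of the first service error rate mentioned above.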

Net games have already been successfully modelled as finite Markov chains in the literature, demonstrating their additional potential (Lames, 1991; Pfeiffer, Zhang, & Hohmann, 2010; Wang et al., 2020; Wenninger & Lames, 2016). Treating the transition matrix as a finite Markov chain allows for further computations (Lames, 1991), which will be introduced below. These computations assume the Markov property. The Markov property denotes the so-called “memorylessness” of the process, which means that the transition to a subsequent state depends only on the present state of the process (Lames, 2020). Foremost, one may calculate the expected rally length and the point winning probability starting from each state. As these variables can be directly observed in the real game, they can be compared to their computed equivalents and thus used for validation purposes. Further, by manipulating selected transition probabilities in the transition matrix it is possible to examine the impact of changes in associated game behaviors through the resulting change in the calculated point winning probability (Lames, 1991). As a consequence, this may be seen as a viable method to establish a relation between game behavior and success, including dependencies on sex, surface and playing level.

Besides net games, Markov chains have also been utilized in team sports, though the application in this context is scarce as of today. This is arguably due to difficulties in the implementation of the model, for example with regard to different ball possession times of players or teams, as well as objections against the Markov property in this context (Lames, 2020). Liu and Hohmann (2013), who produced one of few applications in football, used a transition matrix based on a grid of field positions of different players. Further, Lames et al. (1997) demonstrated the possibility of differentiating the impact of individual players on success in volleyball.

Generally, applications of finite Markov chain analyses are scarce, and in the case of tennis only rather outdated studies exist, e.g. Lames (1991). It can be expected that game structure has changed considerably since then, for instance due to developments in material (Miller, 2006). Furthermore, the previous design of states was in part unsatisfactory. Concerning returns, there was no differentiation between those following a first or a second serve. This might be regarded as a violation of the Markov property, as it is known that rates of successful returns after both differ substantially (Lames, 2020). Furthermore, the serve can arguably be deemed the most important aspect of modern tennis and has considerable influence on the first strokes after serve and return (O’Donoghue & Brown, 2008), which was not reflected in the previous model. Finally, expert opinion suggests that net attacks at present do not have the importance they had in the 1990s. Thus, the previous model with its intensive treatment of the net game seems to be outdated in this respect as well.

Taken together, the aim of the current study was to establish and validate a newly designed state transition model for tennis which represents the current game structure. The sufficiency of the resulting transition matrix for performance analyses was demonstrated with actual and recent match data from top-level tennis. Moreover, the study utilized the transition matrix and finite Markov chain modelling to empirically assess state transitions and their tactical relevance with regard to the influencing factors playing surface and sex in modern tennis.



2. METHODS AND MATERIALS


2.1 Data acquisition and sample

Given the context of the present examination, systematic game observation with a specifically designed observational system was chosen as the method of data acquisition. Individual strokes constituted the unit of observation, with the stroke types integrated in the transition matrix as attributes of observation (Lames, 1994, p. 48). The stroke types included are given below. Observer agreement was analyzed by comparing observation data with official data from the Australian and French Open. However, in the present context this is rather trivial as the features which were examined only included the initiation of the rally, rally length, as well as which player scored a point. The resulting agreement is 98.1% (Cohen’s 𝜅 = .979).

The analyzed matches were obtained from the online streaming platform “Eurosport player”. Examination included matches on hard and clay court, specifically the Australian Open 2020 (AO) and French Open 2021 (FO) men’s and women’s singles competitions, starting from the quarterfinal. This results in the inclusion of 28 matches, 14 each for men and women, with data recorded for each player. Consequently, the sample size of recorded match performances was n = 56. In the sample of male players two matches were three set matches, three were four set matches and two were five set matches. In the sample of female players six matches were two set matches and one was a three set match. The analysis included nearly 30,000 shots in total.

The sample was chosen to be representative of elite level competition. Samples of men and women both included at least four players who were placed in the Top 30 of either the ATP or WTA world ranking at the respective point in time. In male players, the average ranking position was 16, average height was 1.89 m and average age was 28. In female players, the average ranking position was 28, average height was 1.75 m and average age was 24. The quarterfinal of G. Muguruza vs. A. Pavlyuchenkova in the women’s singles of the Australian Open had to be excluded due to unavailability on the streaming platform. The match was therefore replaced by the round of 32 match of G. Muguruza vs. K. Bertens to retain equal sample size. Procedures performed in the study were in strict accordance with the Declaration of Helsinki as well as with the ethical standards of the Technical University of Munich, Germany. Approval of an ethics committee was not required.

2.2 State transition modelling in tennis

The state transition model for tennis used in this study is given in Figure 1. It contains the states and the transitions between them, representing the possible match flow in a tennis rally. A transition matrix is derived from this model and shows the empirical transition probabilities between the states for a specific match (see the transition matrices in Table 1). Thus, a single state transition can literally be seen as equivalent to the ball travelling between both players, and the whole matrix as a representation of the course of all rallies in the match in the sense of the “super-rally” described above.

The modelling process starts with the first serve (S1). In the case of a service error at the first serve, a second serve (S2) takes place. To avoid a possible violation of the Markov property regarding the state return in the previous model, which was pointed out by Lames (2020), discrete states for first (R1) and second serve return (R2) are introduced.

To account for different success rates in the first few strokes of a rally (O’Donoghue & Brown, 2008), groundstrokes were differentiated according to stroke numbers associated with the presumed period of advantage/disadvantage for the serving and returning player in the early phase of the rally. Thereby, the state groundstrokes 3/5 (GS 3/5) represents the first two strokes of the serving player after the return and the state groundstrokes 4/6 (GS 4/6) represents the first two strokes of the returning player after the return. The state groundstrokes >6 (GS >6) represents all subsequent strokes where an advantage/disadvantage resulting from the rally opening is no longer assumed.


Legend: S1: 1st Serve. S2: 2nd Serve. R1: 1st Serve Return. R2: 2nd Serve Return. GS35: Groundstrokes #3 & #5. GS46: Groundstrokes #4 & #6. GS>6: Groundstrokes > #6


Figure 1 State transition model with possible state transitions


Presuming a lower frequency of net play, the newly defined state net includes all strokes where at least one of the two players is positioned between the service line and the net. Likewise, the state includes attacking groundstrokes where the player clearly aims to approach the net, as well as groundstrokes played in response to one of the former situations. The model also contains the possibility of a transition from net play to further groundstrokes, e.g. by neutralizing a net attack with a lob and then continuing the rally from the baseline.

Besides equivalence classes for specific strokes, the model also allows for the transitions to point and error which evidently represent the end points or absorbing states of the state transition model. The state point includes all winners, whereas the state error includes all strokes played out or in the net.

2.3 Finite Markov chain modelling

Besides the descriptive features of the associated transition matrix, finite Markov chain modelling constitutes a form of probabilistic modelling which allows for the calculation of interesting variables with regards to performance analysis. In the case of finite Markov chains, the transition matrix is distinguishable by a limited number of states. Moreover, it features the possibility of transitioning to absorbing states (point and error), which on entry imply the termination of the process in the current modelling step. All assertions made on definitions regarding Markov chains and related calculations in this paragraph are based on the textbook “Finite Markov Chains” by Kemeny and Snell (1976), referring to Lames (2020).

Calculations are based on the empirical transition matrices of both players in a match. Firstly, one may obtain the average frequency with which any state is visited in one run of the process, in this case a rally. From there it is likewise possible to calculate the expected number of steps until absorption from any state. Taking this value for the state first serve gives the expected rally length. In addition, the absorption probability from any state into either of the absorbing states can be calculated. The absorption probability in state point is equivalent to the respective point winning probability. Here, the point winning probability starting with the first serve is of paramount relevance since this is the overall point winning probability and will subsequently be used to determine the impact of individual state transitions.
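These quantities follow from the standard results for absorbing chains in Kemeny and Snell (1976): with Q holding the transitions among transient states and R the transitions into absorbing states, the fundamental matrix is N = (I - Q)^-1, the expected number of steps until absorption is the row sum of N, and the absorption probabilities are given by N R. The sketch below applies this to a deliberately small toy chain. The two transient states, all probabilities and the helper names are assumptions made for illustration, and the combination of both players' matrices used in the study is not reproduced.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(ra) + list(rb) for ra, rb in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def absorbing_chain(q, r):
    """Return fundamental matrix, expected steps and absorption probabilities."""
    n = len(q)
    identity = [[float(i == j) for j in range(n)] for i in range(n)]
    i_minus_q = [[identity[i][j] - q[i][j] for j in range(n)] for i in range(n)]
    fundamental = solve(i_minus_q, identity)   # N = (I - Q)^-1
    steps = [sum(row) for row in fundamental]  # expected steps until absorption
    absorb = [
        [sum(fundamental[i][k] * r[k][j] for k in range(n)) for j in range(len(r[0]))]
        for i in range(n)
    ]                                          # B = N R
    return fundamental, steps, absorb


# Toy chain (assumed numbers): transient states "serve" and "rally",
# absorbing states "point" and "error".
Q = [[0.0, 0.6],
     [0.0, 0.8]]
R = [[0.10, 0.30],
     [0.05, 0.15]]
N, steps, B = absorbing_chain(Q, R)
print(steps[0])  # expected number of strokes starting from "serve"
print(B[0][0])   # probability of ending in "point" starting from "serve"
```

For this toy chain the expected length starting from "serve" is 4 strokes and the probability of absorption in "point" is 0.25, up to floating-point rounding.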

For the latter purpose, Lames (1991) introduced a method that allows the determination of the relevance of tactical behaviors or actions through manipulating the associated state transitions, i.e. by simulation. An alteration of the transition probabilities subsequently results in a positive or negative change of the overall point winning probability. The size of the change in winning probability reflects the impact of the simulated behavior and is termed performance relevance.

More concretely, the original transition matrix is used to calculate the overall point winning probability. After this, the performance relevance of a certain tactical behavior is determined by simulating a change in the frequency of the corresponding transition. Subsequently, the point winning probability is calculated again, now for the manipulated matrix. The difference between the point winning probability before and after the simulation denotes the performance relevance (PerfRel) (Lames, 1991). In the present study, the relevance of winners, errors and selected transitions in different stroke classes was simulated by manipulating the corresponding transition probabilities. Determining the PerfRel using the theory of Markov chains and the above-mentioned method presents several conceptual challenges, e.g. choosing the size of the change in a transition so that it represents a comparable difficulty. The details are described in Lames (2020).
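A minimal sketch of this before/after comparison is given below. The toy chain, the size of the change (one percentage point) and the proportional rescaling of the remaining entries of the manipulated row are assumptions made for illustration only; the compensation scheme actually used is described in Lames (2020) and is not reproduced here. The winning probability is obtained by simple fixed-point iteration rather than by matrix inversion.

```python
def win_probability(p, start, tol=1e-12):
    """Probability of absorption in 'point', via fixed-point iteration."""
    w = {state: 0.0 for state in p}
    w["point"], w["error"] = 1.0, 0.0
    while True:
        new = {s: sum(prob * w[t] for t, prob in row.items()) for s, row in p.items()}
        change = max(abs(new[s] - w[s]) for s in p)
        w.update(new)
        if change < tol:
            return w[start]


def shift_transition(p, src, dst, delta):
    """Copy of p with p[src][dst] changed by delta.

    The other entries of the row are rescaled proportionally so that the
    row still sums to 1 (an assumed compensation rule).
    """
    row = p[src]
    new_value = row[dst] + delta
    scale = (1.0 - new_value) / (1.0 - row[dst])
    new_row = {t: (new_value if t == dst else prob * scale) for t, prob in row.items()}
    return {**p, src: new_row}


# Toy chain with assumed probabilities, seen from one player's perspective.
P = {
    "serve": {"rally": 0.60, "point": 0.10, "error": 0.30},
    "rally": {"rally": 0.80, "point": 0.05, "error": 0.15},
}

base = win_probability(P, "serve")
changed = shift_transition(P, "rally", "error", -0.01)  # one point fewer errors
perf_rel = win_probability(changed, "serve") - base
print(f"PerfRel of 'rally to error': {100 * perf_rel:.2f} percentage points")
```

Reducing the error rate raises the winning probability, so the computed difference is positive in this example. With a different compensation rule the magnitude would differ, which is one reason the choice of manipulation needs careful justification.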

Adherence to the Markov property is the prerequisite for all computations involving Markov chain modelling (Lames, 2020). It may be tested by comparing the calculated model values for rally length and point winning probability to the corresponding real-world values, which can be obtained by game observation. The underlying premise is that if both values agree satisfactorily, violations of the Markov property can be assumed to be negligible. Further, this also provides evidence of general model validity. We obtained a correlation of r = 0.946 between the observed winning probabilities (n = 56) and those predicted by our state-transition model, assuming the Markov property. This can be deemed sufficient even when applying strict standards. The same holds true for the agreement of predicted and observed overall rally length, with a correlation coefficient of r = 0.962.
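The agreement check can be sketched as a Pearson correlation between observed and model-predicted values. The numbers below are invented placeholders and not data from this study, and `pearson_r` is an illustrative helper name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Placeholder values for five match performances (not real data).
observed = [0.48, 0.52, 0.55, 0.45, 0.50]
predicted = [0.47, 0.53, 0.54, 0.46, 0.51]
r = pearson_r(observed, predicted)
print(round(r, 3))
```

A value of r close to 1 indicates that the model reproduces the ranking and spread of the observed values, which is the sense in which the correlations above are read as support for the Markov assumption.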

2.4 Statistical testing

To examine discrepancies in tactical behavior and game structure regarding the factors sex and court surface, transition probabilities and their performance relevance were compared using a multifactorial ANOVA. The impact of court surface was assessed by comparing two tournaments, French Open (clay) and Australian Open (hard court).

Violations of the assumption of normally distributed data, which occurred in a few instances, were neglected, as the respective sample sizes of n = 56 for sex and court surface were deemed large enough to allow for the application of ANOVA. Besides, ANOVA is generally robust to non-normally distributed data that resemble a Gaussian distribution (Blanca Mena et al., 2017; Bortz & Schuster, 2011, p. 214; Herzog, Francis, & Clarke, 2019, p. 56). Caution has to be exercised when drawing inferences from variables which showed differences in variance, as this might inflate the Type 1 error rate. In our case, though, the impact of unequal variances is not severe due to equally sized sample groups (Bortz & Schuster, 2011, p. 214; Herzog et al., 2019, p. 57). The significance level was set to α = 0.05.

Post-hoc two-sample t-tests were used to investigate differences regarding sex within both tournaments and court surface within both sex groups. To account for the problem of multiple comparisons and the associated inflation of the Type 1 error probability, Bonferroni corrections were applied, resulting in an adjusted significance level of α = 0.05/4 = 0.0125 for the four comparisons.

State transitions that occurred fewer than n = 10 times per match on average were excluded from the analysis.



3. RESULTS


The results section first presents a transition matrix for one match to demonstrate its capability to give an overall description of the match, with strengths and weaknesses of a player relative to his opponent. Then, aggregated transition matrices are shown to provide a general structure of the rally in the sense of theoretical performance analysis (TPA; Lames & McGarry, 2007), for example for men on clay court. Furthermore, differences between the game structures of male and female players as well as between both tournaments are described. Finally, results on PerfRel are given, including tests of the impact of tournament and sex.


3.1 Match Transition Matrix

A transition matrix of a single match gives the possibility of deconstructing a match as can be seen in Table 1. On the abstraction level of seeing the whole match as a super-rally one may compare corresponding transition probabilities and identify advantages and disadvantages of the players relative to the opponent.

Table 1 shows the transition matrices of the final of the Australian Open 2020 between Novak Djokovic and Dominic Thiem. Though this was a closely contested five-set match (157 to 147 points), some evident differences can be read from the transition sub-matrices which arguably explain the outcome in favor of Djokovic.


Table 1 Transition matrices of the men’s Australian Open final 2020

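Djokovic | S 2  | R 1  | R 2  | GS 3/5 | GS 4/6 | GS >6 | Net  | Point | Error
S 1      | 35.1 | 58.2 |      |        |        |       |      | 6.7   |
S 2      |      |      | 89.4 |        |        |       |      | 0.0   | 10.6
R 1      |      |      |      | 82.1   |        |       | 3.2  | 0.0   | 14.7
R 2      |      |      |      | 82.5   |        |       | 0.0  | 3.5   | 14.0
GS 3/5   |      |      |      |        | 79.1   |       | 0.7  | 8.8   | 11.5
GS 4/6   |      |      |      | 46.7   |        | 34.9  | 7.9  | 2.0   | 8.6
GS >6    |      |      |      |        |        | 80.9  | 2.7  | 4.3   | 12.1
Net      |      |      |      | 0.0    | 0.0    | 4.4   | 60.0 | 17.8  | 17.8

Thiem    | S 2  | R 1  | R 2  | GS 3/5 | GS 4/6 | GS >6 | Net  | Point | Error
S 1      | 36.7 | 56.2 |      |        |        |       |      | 7.1   |
S 2      |      |      | 91.9 |        |        |       |      | 0.0   | 8.1
R 1      |      |      |      | 76.9   |        |       | 3.8  | 0.0   | 19.2
R 2      |      |      |      | 78.6   |        |       | 2.4  | 0.0   | 19.0
GS 3/5   |      |      |      |        | 77.6   |       | 1.5  | 6.1   | 14.8
GS 4/6   |      |      |      | 45.1   |        | 34.4  | 7.4  | 2.5   | 10.7
GS >6    |      |      |      |        |        | 81.4  | 2.3  | 3.5   | 12.8
Net      |      |      |      | 0.0    | 0.0    | 0.0   | 52.1 | 31.3  | 16.7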


Djokovic performed better at the return, exhibiting for example a higher rate of successful first serve returns with 82.1% compared to 76.9%. This corresponds to a considerably lower R1 error rate (14.7% vs. 19.2%). Second serve returns showed comparable results. Moreover, Djokovic showed lower error rates in GS 3/5 (11.5% vs. 14.8%), GS 4/6 (8.6% vs. 10.7%) and, very slightly, also in GS >6 (12.1% vs. 12.8%), indicating superiority in the baseline game. On the other hand, Thiem exhibited an advantage at the net, evidenced by a higher point rate (31.3% vs. 17.8%) and a lower error rate (16.7% vs. 17.8%), with both players approaching the net with roughly the same frequency. However, this advantage was evidently not sufficient to outweigh Djokovic’s superior return and baseline game.

3.2 Aggregated transition matrices


Tables 2 and 3 display the average transition probabilities of men and women at both tournaments. The aggregated transition matrices allow the examination of differences in the game structure of various sample groups on a descriptive basis.


Table 2 Average transition probabilities at the Australian Open

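Men      | S 2  | R 1  | R 2  | GS 3/5 | GS 4/6 | GS >6 | Net  | Point | Error
S 1      | 33.0 | 57.7 |      |        |        |       |      | 9.4   |
S 2      |      |      | 91.5 |        |        |       |      | 0.7   | 7.9
R 1      |      |      |      | 62.0   |        |       | 8.2  | 1.1   | 28.7
R 2      |      |      |      | 82.5   |        |       | 2.7  | 1.1   | 13.8
GS 3/5   |      |      |      |        | 76.1   |       | 1.5  | 7.4   | 15.0
GS 4/6   |      |      |      | 45.5   |        | 28.2  | 9.5  | 2.9   | 14.0
GS >6    |      |      |      |        |        | 77.8  | 3.4  | 4.8   | 14.0
Net      |      |      |      | 0.1    | 0.4    | 0.6   | 46.0 | 32.9  | 20.0

Women    | S 2  | R 1  | R 2  | GS 3/5 | GS 4/6 | GS >6 | Net  | Point | Error
S 1      | 33.8 | 59.7 |      |        |        |       |      | 6.5   |
S 2      |      |      | 90.1 |        |        |       |      | 0.0   | 9.9
R 1      |      |      |      | 74.3   |        |       | 3.0  | 0.5   | 22.2
R 2      |      |      |      | 79.9   |        |       | 0.5  | 3.5   | 16.1
GS 3/5   |      |      |      |        | 74.9   |       | 2.6  | 7.7   | 14.7
GS 4/6   |      |      |      | 48.0   |        | 28.6  | 4.1  | 5.5   | 13.9
GS >6    |      |      |      |        |        | 75.1  | 4.0  | 4.3   | 16.6
Net      |      |      |      | 0.0    | 0.3    | 3.6   | 46.9 | 30.7  | 18.4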
♣: significant effect of sex within the tournament
♦: significant effect of court surface between tournaments


Table 3 Average transition probabilities at the French Open

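Men      | S 2  | R 1  | R 2  | GS 3/5 | GS 4/6 | GS >6 | Net  | Point | Error
S 1      | 35.4 | 59.4 |      |        |        |       |      | 5.3   |
S 2      |      |      | 92.0 |        |        |       |      | 0.4   | 7.6
R 1      |      |      |      | 69.1   |        |       | 9.2  | 0.8   | 20.9
R 2      |      |      |      | 82.9   |        |       | 1.6  | 2.4   | 13.1
GS 3/5   |      |      |      |        | 75.2   |       | 2.9  | 7.8   | 14.0
GS 4/6   |      |      |      | 47.2   |        | 29.7  | 6.6  | 2.8   | 13.7
GS >6    |      |      |      |        |        | 76.5  | 4.8  | 4.1   | 14.6
Net      |      |      |      | 0.1    | 0.0    | 1.3   | 54.0 | 26.4  | 18.2

Women    | S 2  | R 1  | R 2  | GS 3/5 | GS 4/6 | GS >6 | Net  | Point | Error
S 1      | 35.7 | 61.1 |      |        |        |       |      | 3.2   |
S 2      |      |      | 85.7 |        |        |       |      | 0.7   | 13.6
R 1      |      |      |      | 76.1   |        |       | 2.7  | 1.1   | 20.1
R 2      |      |      |      | 78.5   |        |       | 0.3  | 6.2   | 15.1
GS 3/5   |      |      |      |        | 71.1   |       | 2.8  | 9.6   | 16.6
GS 4/6   |      |      |      | 47.7   |        | 26.8  | 4.6  | 5.2   | 15.8
GS >6    |      |      |      |        |        | 71.6  | 2.3  | 7.7   | 18.3
Net      |      |      |      | 0.0    | 0.5    | 0.8   | 40.6 | 40.5  | 17.6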
♣: significant effect of sex within the tournament
♦: significant effect of court surface between tournaments


Transitions from R1 to GS 3/5 show a significant effect of sex, with men showing a lower transition probability to GS 3/5. Likewise, the associated error rate (R1 to error) displayed an effect of court surface, which can be attributed mainly to higher error rates by men at the Australian Open. Regarding the state transitions from GS >6, transitions to subsequent GS >6 and to error both exhibited a significant effect of sex in the ANOVA. However, this effect was not present in the post-hoc tests.


Table 4 Descriptive statistics of the state transitions from the Australian Open (AO) included in the testing

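State trans. AO  | Mean M | Mean W | Median M | Median W | Maximum M | Maximum W | Minimum M | Minimum W | Std Dev M | Std Dev W | CV M | CV W
S1 to S2         | 33.0   | 33.8   | 34.7     | 32.3     | 43.8      | 46.3      | 18.9      | 24.4      | 6.7       | 7.0       | 20.3 | 20.7
S1 to R1         | 57.7   | 59.7   | 57.9     | 57.8     | 69.3      | 74.4      | 46.3      | 43.8      | 6.7       | 9.8       | 11.6 | 16.4
S2 to R2         | 91.5   | 90.1   | 92.0     | 92.2     | 96.7      | 100       | 83.3      | 73.3      | 4.0       | 8.3       | 4.4  | 9.2
R1 to GS 3/5     | 62.0   | 74.3   | 60.5     | 74.3     | 82.1      | 93.5      | 49.1      | 57.7      | 8.9       | 10.1      | 14.4 | 13.6
R1 to error      | 28.7   | 22.2   | 29.3     | 21.4     | 39.1      | 42.3      | 14.7      | 6.5       | 7.4       | 11.0      | 25.8 | 49.5
R2 to GS 3/5     | 82.5   | 79.9   | 81.9     | 80.9     | 94.3      | 93.3      | 71.4      | 55.6      | 6.2       | 10.0      | 7.5  | 12.5
GS 3/5 to GS 4/6 | 76.1   | 74.9   | 77.3     | 77.1     | 86.4      | 84.8      | 68.5      | 49.3      | 5.1       | 8.9       | 6.7  | 11.9
GS 3/5 to error  | 15.0   | 14.7   | 15.5     | 13.6     | 20.4      | 25.4      | 10.0      | 5.6       | 3.2       | 5.4       | 21.3 | 36.7
GS 4/6 to GS 3/5 | 45.5   | 48.0   | 45.9     | 47.0     | 50.0      | 66.7      | 38.6      | 40.4      | 3.3       | 6.0       | 7.3  | 12.5
GS 4/6 to GS >6  | 28.2   | 28.6   | 27.4     | 30.4     | 34.9      | 35.7      | 22.9      | 17.0      | 4.4       | 5.5       | 15.6 | 19.2
GS 4/6 to error  | 14.0   | 13.9   | 14.5     | 13.2     | 22.5      | 23.4      | 7.9       | 4.8       | 3.9       | 5.8       | 27.9 | 41.7
GS >6 to GS >6   | 77.8   | 75.1   | 79.5     | 76.2     | 89.0      | 86.0      | 66.3      | 61.4      | 5.9       | 6.0       | 7.6  | 8.0
GS >6 to error   | 14.0   | 16.6   | 13.1     | 15.6     | 23.3      | 28.9      | 5.5       | 9.6       | 5.2       | 5.1       | 37.1 | 30.7
net to net       | 46.0   | 46.9   | 46.7     | 42.7     | 60.0      | 100       | 33.3      | 16.7      | 8.3       | 20.2      | 18.0 | 43.1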


Table 5 Descriptive statistics of the state transitions from the French Open included in the testing

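State trans. FO  | Mean M | Mean W | Median M | Median W | Maximum M | Maximum W | Minimum M | Minimum W | Std Dev M | Std Dev W | CV M  | CV W
S1 to S2         | 35.4   | 35.7   | 35.1     | 33.5     | 45.2      | 45.2      | 30.3      | 25.4      | 3.9       | 5.9       | 11.02 | 16.53
S1 to R1         | 59.4   | 61.1   | 59.9     | 62.7     | 64.2      | 73.0      | 50.0      | 48.4      | 3.9       | 7.1       | 6.57  | 11.62
S2 to R2         | 92.0   | 85.7   | 93.1     | 89.4     | 100.0     | 96.3      | 83.3      | 72.0      | 4.4       | 8.2       | 4.78  | 9.57
R1 to GS 3/5     | 69.1   | 76.1   | 66.7     | 78.8     | 83.9      | 88.9      | 52.4      | 53.3      | 9.3       | 9.3       | 13.46 | 12.22
R1 to error      | 20.9   | 20.1   | 19.9     | 18.1     | 31.7      | 46.7      | 8.9       | 8.7       | 7.4       | 9.1       | 35.41 | 45.27
R2 to GS 3/5     | 82.9   | 78.5   | 86.0     | 76.9     | 90.0      | 95.5      | 64.0      | 62.5      | 7.3       | 9.4       | 8.81  | 11.97
GS 3/5 to GS 4/6 | 75.2   | 71.1   | 77.9     | 71.2     | 80.2      | 79.6      | 59.8      | 61.8      | 6.3       | 5.9       | 8.38  | 8.30
GS 3/5 to error  | 14.0   | 16.6   | 13.3     | 16.6     | 28.0      | 24.0      | 6.4       | 10.6      | 5.4       | 3.5       | 38.57 | 21.08
GS 4/6 to GS 3/5 | 47.2   | 47.7   | 47.5     | 47.9     | 53.1      | 52.9      | 39.2      | 40.9      | 3.8       | 3.5       | 8.05  | 7.34
GS 4/6 to GS >6  | 29.7   | 26.8   | 30.5     | 27.3     | 35.4      | 32.2      | 21.7      | 17.1      | 4.0       | 3.3       | 13.47 | 12.31
GS 4/6 to error  | 13.7   | 15.8   | 14.1     | 14.7     | 23.2      | 24.3      | 8.1       | 9.1       | 4.5       | 4.7       | 32.85 | 29.75
GS >6 to GS >6   | 76.5   | 71.6   | 77.9     | 69.8     | 81.8      | 86.7      | 69.6      | 58.6      | 4.0       | 7.6       | 5.23  | 10.61
GS >6 to error   | 14.6   | 18.3   | 15.4     | 17.9     | 20.8      | 34.5      | 8.1       | 11.1      | 3.3       | 5.9       | 22.60 | 32.24
net to net       | 54.0   | 40.6   | 51.6     | 42.9     | 71.2      | 78.9      | 41.2      | 0.0       | 8.0       | 20.6      | 14.81 | 50.74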


Tables 4 and 5 show the descriptive statistics of the state transitions included in the statistical testing. We find only small differences between means and medians, meaning that the distributions of the transition probabilities are quite symmetric. A characteristic feature is the large player-to-player variability of the transitions, expressed in the span between minimum and maximum values as well as in the standard deviations. The coefficients of variation (CV%) also speak in favor of high match-to-match variability.
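The coefficient of variation is the standard deviation expressed as a percentage of the mean. As a short check under that standard definition, the snippet below recomputes the value for the S1 to S2 transition of men at the Australian Open from the mean and standard deviation listed in Table 4; the function name is an illustrative choice.

```python
def coefficient_of_variation(mean, std_dev):
    """Standard deviation as a percentage of the mean."""
    return 100.0 * std_dev / mean


# Mean and standard deviation of "S1 to S2" for men, taken from Table 4.
cv_men = coefficient_of_variation(mean=33.0, std_dev=6.7)
print(round(cv_men, 1))
```

The result of 20.3 matches the CV reported for this transition in Table 4.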

3.3 Performance relevance

Figure 2 displays the state transitions with the highest PerfRel values. State transitions from groundstrokes to error in particular exhibit high PerfRel, with values of over 1.5% displacement of the winning probability.



Figure 2 State Transitions with Highest Performance Relevance


Table 6 Average performance relevance of state transitions with associated group differences

State Transitions | Australian Open M | Australian Open W | French Open M | French Open W | Group Dif.
S1 to R1          | 0.03              | 0.16              | 0.13          | 0.42          |
S2 to R2          | 0.43              | 0.50              | 0.47          | 0.63          |
R1 to GS 3/5      | 1.18              | 1.19              | 0.99          | 1.08          |
R2 to GS 3/5      | 0.43              | 0.43              | 0.38          | 0.33          |
GS 3/5 to GS 4/6  | 0.94              | 0.88              | 0.88          | 1.03          |
GS 4/6 to GS 3/5  | 0.80              | 0.74              | 0.89          | 0.93          |
GS >6 to GS >6    | 1.08              | 1.43              | 1.29          | 0.87          | FO; SW
net to net        | -0.28             | -0.19             | -0.33         | -0.17         | FO
S1 to point       | 1.07              | 0.95              | 0.83          | 0.71          | -
GS 3/5 to point   | 1.17              | 1.42              | 1.34          | 1.60          | -
GS 4/6 to point   | 0.64              | 0.89              | 0.70          | 0.86          | -
GS >6 to point    | 0.94              | 1.00              | 1.00          | 0.97          | -
net to point      | 1.07              | 0.71              | 1.27          | 0.54          | -
S1 to S2          | 0.91              | 0.79              | 0.66          | 0.75          |
S2 to error       | 0.46              | 0.50              | 0.47          | 0.62          | -
R1 to error       | 1.43              | 1.27              | 1.23          | 1.24          |
R2 to error       | 0.50              | 0.59              | 0.51          | 0.58          | -
GS 3/5 to error   | 1.98              | 2.13              | 2.11          | 2.44          |
GS 4/6 to error   | 1.25              | 1.47              | 1.39          | 1.54          |
GS >6 to error    | 1.82              | 2.11              | 2.07          | 1.75          |
net to error      | 0.89              | 0.60              | 1.07          | 0.49          | -
AO/FO: Significant effect of sex at Australian or French Open
SM/SW: Significant effect of court surface in men or women
-: excluded from testing

Table 6 shows the PerfRel of all simulated state transitions at the individual tournaments, together with significant differences between the sexes within each tournament and between court surfaces within each sex. Among the transitions to subsequent strokes, significant differences occurred in the transition from GS >6 to further GS >6. These can be attributed to the women at the French Open, who showed lower PerfRel than their male counterparts as well as than the women at the Australian Open. Furthermore, the PerfRel of the transition from net to net was significantly lower in male players, even more so at the French Open.



4. DISCUSSION


4.1 Transition probabilities

One primary aim of this study was to assess game behavior and the game structure of tennis with regard to the factors sex and court surface, using the transition matrix and the transition probabilities it contains. In this analysis, differences in tactical behavior related to court surface refer specifically to the difference between hard court and clay court. For this purpose, transition probabilities are compared to related performance indicators from the literature.

The interplay of serve and return seemed to be the main source of differences in tactical behavior between the sexes and between court surfaces. The associated transition probabilities from R1 to GS 3/5 and from R1 to error were significantly influenced by sex and court surface, respectively. As reflected in their lower transition probability from R1 to GS 3/5, male players arguably showed a more dominant and aggressive first serve. This may be attributed to the higher serve speeds of men, which result from their greater physical capabilities. Higher serve speed gives returning players less time to react, impeding their ability to stay in the rally and leading to a greater proportion of first serve points (O’Donoghue & Ballantyne, 2004; O’Donoghue & Brown, 2008; Reid et al., 2016). The impact of the first serve seemed to be even more pronounced on hard court. On this surface, men in particular exhibited higher rates of first serve return errors than on clay, which suggests a greater impact of first serves on hard court in men. The lower coefficients of restitution and friction of hard court lead to a lower ball bounce and higher ball speeds, which make playing a successful return more difficult (Gillet et al., 2009; O’Donoghue & Ingram, 2001).

The present findings align with previous examinations, which found similar properties of the game structure and a similar influence of sex on the first serve and subsequent return (O’Donoghue & Ballantyne, 2004; Reid et al., 2016). However, compared to the even earlier findings of Lames (1991), the differences in game structure regarding these factors may be less pronounced. In that study, significant differences were present in nearly all state transitions associated with first serve and return. Regarding the effect of court surface on the impact of first serves, the current examination also reflects previous results. These likewise indicated a higher impact of first serves on hard court, evidenced for example by a higher percentage of aces and first serve points at the US Open (O’Donoghue & Ballantyne, 2004; O’Donoghue & Ingram, 2001). When comparing results to previous examinations, it is also important to note that since 2008 the Australian Open and US Open have been played on the same hard-court surface (O’Donoghue & Brown, 2008). Therefore, it is presumably more appropriate to draw comparisons with data from the US Open if the examination predates this year.

For GS 3/5 and GS 4/6 following the return, the examined state transitions showed no differences with regard to sex or court surface. It can thus be assumed that tactical behavior at those strokes is relatively similar in the current examination, which contradicts earlier findings: O’Donoghue and Brown (2008) found different point winning probabilities in these stages of the rally with regard to the examined factors. This might indicate that in recent years game behavior in male and female players as well as on hard and clay court has become more homogeneous. One possible explanation lies in the ongoing development of materials and equipment, which are known to influence stroke parameters (Haake et al., 2007; Miller, 2006). These developments may favor the convergence of game structures across sex as well as court surface.

Regarding the later stages of the rally, the results also contradict earlier findings on sex-specific tactical behavior. Rallies were generally assumed to be longer in female players, indicating a preference for keeping the ball in play instead of going for the point (Lames, 1991; O’Donoghue & Ingram, 2001). In contrast, the current examination yielded transition probabilities from GS >6 to subsequent GS >6 and to errors that are quite the opposite of what would have been expected. Female players, on average, showed a higher transition probability from GS >6 to errors and a lower transition probability to further GS >6 than their male counterparts. This could even indicate slightly more offensive tactical behavior in female players in this phase of the rally. The actual rally length, which was calculated for validation purposes (see section 2.3), further supports this presumption, with women exhibiting a lower average rally length than men. Consequently, this may be seen as further evidence for a general change of tactical behavior in women toward more proactive play, especially in the form of more offensive groundstrokes.

Yet the conclusion that women show more offensive tactical behavior than men, deduced from the significant differences in transition probabilities, must be treated with caution. Looking at both tournaments individually reveals that the aforementioned discrepancies in average transition probability arguably result from the French Open sample. However, these differences were not significant for the factor sex when the French Open was considered on its own. This is arguably due to the more stringent significance level in the post-hoc test and the overall effect being rather weak. Moreover, at the Australian Open the transition probabilities showed almost no difference between men and women for these state transitions, which rather points toward converging game structures. Additionally, actual rally length at the Australian Open was almost identical in men and women. Therefore, whether the tactical behaviors of both sexes have merely converged or whether women have even developed more offensive tactical behavior in this phase of the rally cannot be stated with certainty and may depend on court surface. Nevertheless, the results likely affirm the hypothesis that game structure and tactical behavior have changed considerably over time. Though using a different state transition system, Lames (1991) found more offensive tactical behavior in male players during the rally, indicated most notably by their significantly higher prevalence of attacking the net. In contrast, current results generally show a low prevalence and importance of net play as well as more similar tactical behavior at groundstrokes in both sexes.

Overall, serve and return seem to remain the factors that exhibit the greatest discrepancies in tactical behavior between men and women. In subsequent groundstrokes, the game structure of male and female players seems to have converged, as indicated by the lack of substantial differences in the associated transition probabilities. Compared to earlier times, female players seem to exhibit degrees of proactive and offensive tactical behavior in groundstrokes similar to those of male players. Therefore, it can be argued that overall game structures in men and women have become more homogeneous in recent years. Significant differences only emerged with regard to serve and return, where the physical advantage of men carries the most weight. Furthermore, differences regarding court surface only emerged for serve and return in the examined state transitions. This may likewise indicate a convergence in game structure on clay and hard court.

Finally, the transition matrix and the transition probabilities it contains might also help practitioners and coaches to derive implications for training and to analyze match performances. Referring to the match example given in 3.1, the transition matrix allows the identification of strengths and weaknesses regarding the frequency of strokes. In particular, transitions to points and errors may help to identify strong and weak phases in the rally. The transitions between transient states also allow certain playing characteristics to be broadly identified. For example, the transition to the state net indicates whether a player shows more defensive tactical behavior by relying on groundstrokes or more offensive tactical behavior by approaching the net frequently. It is also of practical interest to identify where such transitions occur over the course of a rally. A frequent transition from first return to net, for example, indicates that a player has a proclivity to play serve and volley, while the same transition occurring later in the rally could imply that a player shows variable tactical behavior and is good at judging the right situation to approach the net.
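For readers who want to derive such transition probabilities from their own match observations, the following minimal sketch counts state-to-state transitions in coded rallies and converts them to relative frequencies per state. The state labels follow the tables of this article; the rallies, the function name and the data layout (one list of states per rally) are invented for illustration and are not the coding scheme of the study.

```python
from collections import Counter, defaultdict

# Invented example rallies, each coded as a sequence of states.
rallies = [
    ["S1", "R1", "GS 3/5", "GS 4/6", "error"],
    ["S1", "S2", "R2", "GS 3/5", "point"],
    ["S1", "R1", "GS 3/5", "net", "point"],
    ["S1", "R1", "error"],
]

def transition_probabilities(rallies):
    """Estimate transition probabilities as relative frequencies per state."""
    counts = defaultdict(Counter)
    for rally in rallies:
        # Pair every state with the state that follows it in the rally.
        for current, following in zip(rally, rally[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }

probs = transition_probabilities(rallies)
for state, row in probs.items():
    print(state, row)
```

With real data, rows based on only a handful of observations should be read with care, which mirrors the minimum-occurrence requirement applied in the statistical testing.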

4.2 Performance Relevance

The most notable feature of finite Markov chain modelling in the context of PerfRel is the possibility of linking state transitions to performance outcomes. Statements about PerfRel in this section refer to the percentage displacement of the overall point winning probability that results from simulating a change in the respective state transition. Moreover, an aim of the present examination was to assess the characteristics of PerfRel with regard to the factors sex and court surface.
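The idea of a displaced point winning probability can be sketched on a toy absorbing Markov chain. The example below is a minimal illustration under stated assumptions, not the simulation procedure of this study: the four transient states, all probabilities, the perturbation size `delta` and the rule of rescaling the remaining row entries proportionally are hypothetical choices made for demonstration.

```python
import numpy as np

# Toy point model with hypothetical numbers (not taken from the study).
# Transient states: 0 serve_A, 1 return_B, 2 stroke_A, 3 stroke_B
# Absorbing states (columns 4 and 5): point won by A, point won by B
P = np.array([
    [0.00, 0.70, 0.00, 0.00, 0.10, 0.20],  # serve_A
    [0.00, 0.00, 0.75, 0.00, 0.20, 0.05],  # return_B
    [0.00, 0.00, 0.00, 0.75, 0.10, 0.15],  # stroke_A
    [0.00, 0.00, 0.75, 0.00, 0.15, 0.10],  # stroke_B
])

def win_probability(P, start=0, win_col=0):
    """Probability of absorption in the winning state when starting in `start`."""
    n = P.shape[0]                           # number of transient states
    Q, R = P[:, :n], P[:, n:]                # transient and absorbing blocks
    B = np.linalg.solve(np.eye(n) - Q, R)    # equals (I - Q)^-1 R
    return B[start, win_col]

def perturb(P, row, col, delta):
    """Shift P[row, col] by delta and rescale the rest of the row (assumed rule)."""
    P2 = P.copy()
    others = np.arange(P.shape[1]) != col
    P2[row, others] *= (1.0 - P[row, col] - delta) / (1.0 - P[row, col])
    P2[row, col] += delta
    return P2

baseline = win_probability(P)
P_more_errors = perturb(P, row=2, col=5, delta=0.01)  # stroke_A to error
perf_rel = 100.0 * (win_probability(P_more_errors) - baseline)
print(f"baseline {baseline:.3f}, shift {perf_rel:+.2f} percentage points")
```

Raising the error rate of player A lowers A's point winning probability, so the shift is negative in this toy model. Repeating the perturbation for every transition of interest yields a ranking comparable in spirit to Table 6.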

Generally, state transitions to error, especially those from groundstrokes, exhibited the greatest PerfRel. This is plausible from a conceptual perspective, as a change in the error rate is accompanied by an opposite change in the rates of strokes played in and of points. This is especially relevant for the error rates in states R1 and GS 4/6, since for the returning player remaining in the rally for longer is associated with an increasing point winning probability (O’Donoghue & Brown, 2008). In GS 3/5 and GS >6, by contrast, changing the error rate is arguably tantamount to changing the frequency of unforced errors. In the state GS 3/5 this can be assumed because the player is arguably in an advantageous position and therefore not forced to commit errors. In GS >6 the rally is likely more evenly matched and therefore presumably often decided by one of the players committing an unforced error. Both states also exhibit the highest absolute frequency of errors besides the first serve, which presumably contributes to their equally high level of PerfRel. The significantly higher PerfRel of error rates in the states GS 3/5 and GS 4/6 in female players could reflect a greater reliance on groundstrokes in winning the rally. However, this interpretation must be treated with caution, as the effect was limited to the ANOVA and was additionally very small.

S1 and GS 3/5 likewise showed comparatively high PerfRel for the transition probabilities to point, in both men and women. Additionally, the transition from net to point exhibited a high PerfRel in men. The magnitude of the PerfRel of the transition from S1 to point seems plausible, since an increase in this transition reduces the number of errors as well as the number of points played with a second serve and thus with a lower point winning probability. The relevance of the transition from GS 3/5 to point might be explained by the frequency of points being highest in this state, together with the advantage of the serving player at these strokes. Regarding the PerfRel of the point rate in the state net, it seems that male players rely more heavily on net play than their female counterparts. However, this state transition was omitted from statistical testing as it did not meet the requirement of an average of 10 occurrences per match.

Manipulations of the state transitions to subsequent transient states resulted in particularly high PerfRel in the transitions from R1 to GS 3/5 as well as from GS >6 to further GS >6. The PerfRel of the former is again most likely a result of the increasing chance of winning the point for the returning player by keeping the ball in play. The PerfRel of the transition from GS >6 to GS >6 may be explained similarly to the PerfRel of the error rate in that state. The PerfRel of this transition ran in opposite directions with regard to sex at the two tournaments. At the Australian Open, PerfRel was higher in female players. At the French Open, in contrast, it was higher in male players, and significantly so. A significant difference in this PerfRel with regard to court surface was only present in women, with female players showing a higher PerfRel on hard court. This could imply that more offensive behavior in later stages of the rally is advantageous for female players on clay court. A possible explanation might lie in the slower playing properties of clay court, which possibly require female players to take greater risks to win a rally.

Moreover, significant effects of sex were present in both the transition from S1 to R1 and the transition from net to net. The former is arguably more beneficial for women, as their slower second serve might put them at a disadvantage (O’Donoghue & Brown, 2008). However, the PerfRel of this transition is not paramount in either sex. Furthermore, the transition from net to net exhibits negative PerfRel in both sexes, since it is evidently disadvantageous not to finish the point immediately when approaching the net. This seems to be even more pronounced in male players, potentially again due to their assumed greater reliance on net points. Overall, the results regarding sex further substantiate the findings on transition probabilities. Again, the emergence of only minor differences in PerfRel between men and women could indicate that the game structures of both are becoming more homogeneous.




5. CONCLUSION


Establishing a connection between game behavior and sport success is a crucial aspect of identifying key factors for performance in game sports (McGarry, 2009). Here, Markov chains constitute a promising tool, especially in the analysis of net games like tennis. Nevertheless, several conceptual challenges remain in the implementation of Markov chain modelling for performance analysis, as pointed out by Lames (2020).

First, it would be interesting to explore future applications in other game sports. This has been done recently, for example in table tennis by Wang et al. (2020). In invasion games like football, this largely remains an open challenge, with only a few applications so far, for example by Liu and Hohmann (2013). The characteristics of such game sports make it hard to establish a meaningful state transition system, for example due to varying durations of ball possession. One possible solution in this context could be the implementation of continuous-time Markov chains, as demonstrated for example by Meyer, Forbes, and Clarke (2006) for Australian football. This type of Markov chain uses a continuous time function, which could be better suited to the assessment of invasion games.

Further, objections against the Markov property could be addressed by the implementation of higher-order Markov chains. A transition to a subsequent state would then depend not only on the present state but also on several preceding states. This could be used, for example, to account for different winning probabilities in the first few strokes after a first serve as opposed to those after a second serve in tennis, thereby further increasing model quality. Second-order Markov chains were used by Wang et al. (2020) in table tennis. Problems with this method are the inflation of the number of states and transitions and the loss of validity of the transitions as representations of tactical behavior, e.g. because single transitions occur with low frequency or because one tactical behavior is spread across several transitions.
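The inflation of states in a second-order chain can be made concrete with a small sketch. It uses the eight transient state labels that appear in Table 6; the rallies and the function name are invented for illustration, and the count of state pairs is an upper bound because many pairs cannot occur in a real rally.

```python
from collections import Counter, defaultdict

# Transient state labels as they appear in Table 6.
STATES = ["S1", "S2", "R1", "R2", "GS 3/5", "GS 4/6", "GS >6", "net"]

first_order_rows = len(STATES)         # one matrix row per current state
second_order_rows = len(STATES) ** 2   # upper bound: one row per state pair

# Invented rallies for illustration only.
rallies = [
    ["S1", "R1", "GS 3/5", "point"],
    ["S1", "S2", "R2", "GS 3/5", "error"],
]

def second_order_counts(rallies):
    """Count transitions conditioned on the previous and the current state."""
    counts = defaultdict(Counter)
    for rally in rallies:
        for previous, current, following in zip(rally, rally[1:], rally[2:]):
            counts[(previous, current)][following] += 1
    return counts

counts = second_order_counts(rallies)
print(first_order_rows, second_order_rows)
print(dict(counts[("R1", "GS 3/5")]), dict(counts[("R2", "GS 3/5")]))
```

The two GS 3/5 outcomes are now stored under different keys depending on whether a first or second serve return preceded them, which is the gain in context. The cost is that the same number of observations is spread over many more rows.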

Another problem is the static representation of ratios in the transition matrix. While the transition matrix allows for a general reconstruction of the sequence of events, it still neglects fluctuations of performance over a match. A possible remedy may be found in the implementation of drifting Markov chains. These incorporate a polynomial drift to deal with non-stationary processes like game actions. This has, for example, been implemented in the stochastic analysis of DNA sequences by Vergne (2008).
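The simplest case of such a drift is linear, where the transition matrix moves gradually from a starting matrix to an ending matrix. The following minimal sketch illustrates that idea with two hypothetical two-state matrices. The function name and all numbers are assumptions, and the sketch is not claimed to reproduce the exact formulation or estimation procedure of Vergne (2008).

```python
import numpy as np

# Hypothetical two-state transition matrices at the start and end of a match.
P_start = np.array([[0.80, 0.20],
                    [0.30, 0.70]])
P_end = np.array([[0.70, 0.30],
                  [0.40, 0.60]])

def drifting_matrix(P_start, P_end, t, n):
    """Linear drift: interpolate between the two matrices at step t of n."""
    w = t / n
    return (1.0 - w) * P_start + w * P_end

P_mid = drifting_matrix(P_start, P_end, t=50, n=100)
print(P_mid)
```

Because each interpolated row is a weighted average of two probability rows, it still sums to one, so every intermediate matrix remains a valid transition matrix.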

The present examination arguably also demonstrates the descriptive features of the state transition model as well as the possibility of determining the impact of game behavior through Markov chain modelling. It was thereby able to identify distinct characteristics of the game structure of tennis in elite-level competitions with regard to the factors sex and court surface. The main discrepancies occurred with regard to first serve and first serve return. However, the central finding was that the game structure was relatively similar in relation to both factors. Possible explanations include changes in equipment that enable women to play faster and that accentuate the importance of groundstrokes. Furthermore, women have become more athletic and technically more consistent. Thus, passively waiting for an error by the opponent is no longer a successful strategy for women, as may be seen in the development of the German player Angelique Kerber.

Finally, the present examination has several limitations. First, the states in the transition matrix do not, or only partially, take into account information about shot type, playing direction, technique and anthropometric characteristics of the players included in the sample. However, these factors can be expected to influence states included in the transition matrix, especially the serve. Therefore, the implications drawn from the present examination are limited to the factors represented in the transition matrix, which are mainly the shot number and, to some extent, the general shot type. Furthermore, while the transition matrix preserves some of the sequential context of a rally, it still misses the specific context of the underlying individual rallies concerning the dynamic interaction process between players. This also applies to the other factors mentioned, especially playing direction and technique. Therefore, caution must be exercised when drawing conclusions about individual playing characteristics and performances from individual transitions as well as from the PerfRel. Furthermore, the sample included in the present examination only allows conclusions to be drawn about the absolute elite level of tennis. Therefore, differences regarding the analyzed factors cannot be generalized to a broader population of tennis players.



REFERENCES


Blanca Mena, M. J., Alarcón Postigo, R., Arnau Gras, J., Bono Cabré, R., & Bendayan, R. (2017). Non-normal data: Is ANOVA still a valid option? Psicothema, 29(4), 552-557.

Bortz, J., & Schuster, C. (2011). Statistik für Human-und Sozialwissenschaftler: Limitierte Sonderausgabe: Springer-Verlag.

Gillet, E., Leroy, D., Thouvarecq, R., & Stein, J.-F. (2009). A Notational Analysis of Elite Tennis Serve and Serve-Return Strategies on Slow Surface. The Journal of Strength & Conditioning Research, 23(2), 532-539. doi: 10.1519/JSC.0b013e31818efe29

Haake, S. J., Allen, T. B., Choppin, S., & Goodwill, S. R. (2007). The Evolution of the Tennis Racket and its Effect on Serve Speed. Paper presented at the Tennis Science and Technology 3, London.

Herzog, M. H., Francis, G., & Clarke, A. (2019). Understanding Statistics and Experimental Design: How to Not Lie with Statistics: Springer Nature.

Hughes, M., & Bartlett, R. (2002). The use of performance indicators in performance analysis. Journal of Sports Sciences, 20, 739-754. doi: 10.1080/026404102320675602

Kemeny, J. G., & Snell, J. L. (1976). Markov chains: Springer-Verlag, New York.

Lames, M. (1991). Leistungsdiagnostik durch Computersimulation: Ein Beitrag zur Theorie der Sportspiele am Beispiel Tennis: Deutsch.

Lames, M. (1994). Systematische Spielbeobachtung: Philippka.

Lames, M. (2020). Markov Chain Modelling And Simulations In Net Games. In C. Ley & Y. Dominicy (Eds.), Science Meets Sports: When Statistics Are More Than Numbers (pp. 147-170): Cambridge Scholars Publisher.

Lames, M., Hohmann, A., Daum, M., Dierks, B., Fröhner, B., Seidel, I., & Wichmann, E. (1997). Top oder Flop: Die Erfassung der Spielleistung in den Mannschaftssportspielen. Sport-Spiel-Forschung Zwischen Trainerbank und Lehrstuhl, 101-117.

Lames, M., & McGarry, T. (2007). On the search for reliable performance indicators in game sports. International Journal of Performance Analysis in Sport, 7(1), 62-79.

Liu, T., & Hohmann, A. (2013). Applying the Markov Chain Theory to Analyze the Attacking Actions between FC Barcelona and Manchester United in the European Champions League Finale. International Journal of Sports Science and Engineering, 7(2), 79-86.

Ma, S. M., Liu, C. C., Tan, Y., & Ma, S. C. (2013). Winning matches in Grand Slam men's singles: an analysis of player performance-related variables from 1991 to 2008. Journal of Sports Sciences, 31(11), 1147-1155. doi: 10.1080/02640414.2013.775472

McGarry, T. (2009). Applied and theoretical perspectives of performance analysis in sport: Scientific issues and challenges. International Journal of Performance Analysis in Sport, 9(1), 128-140.

McGarry, T., & Franks, I. M. (1996). In search of invariant athletic behaviour in sport: an example from championship squash match-play. Journal of Sports Sciences, 14(5), 445-456. doi: 10.1080/02640419608727730

Meyer, D., Forbes, D., & Clarke, S. R. (2006). Statistical analysis of notational AFL data using continuous time Markov Chains. Journal of Sports Science & Medicine, 5(4), 525.

Miller, S. (2006). Modern tennis rackets, balls, and surfaces. British Journal of Sports Medicine, 40(5), 401-405. doi: 10.1136/bjsm.2005.023283

O’Donoghue, P. (2013). Sports Performance Profiling. In Routledge handbook of sports performance analysis: Routledge.

O’Donoghue, P., & Ballantyne, A. (2004). The impact of speed of service in Grand Slam singles tennis. In Science and racket sports III (pp. 223-229): Routledge.

O’Donoghue, P., & Brown, E. (2008). The Importance of Service in Grand Slam Singles Tennis. International Journal of Performance Analysis in Sport, 8(3), 70-78. doi: 10.1080/24748668.2008.11868449

O’Donoghue, P., & Ingram, B. (2001). A notational analysis of elite tennis strategy. Journal of Sports Sciences, 19(2), 107-115. doi: 10.1080/026404101300036299

Pfeiffer, M., Zhang, H., & Hohmann, A. (2010). A Markov chain model of elite table tennis competition. International Journal of Sports Science & Coaching, 5(2), 205-222.

Read, B., & Edwards, P. (1992). Teaching Children to Play Games. Leeds: White Line Publishing.

Reid, M., Morgan, S., & Whiteside, D. (2016). Matchplay characteristics of Grand Slam tennis: implications for training and conditioning. Journal of Sports Sciences, 34(19), 1791-1798. doi: 10.1080/02640414.2016.1139161

Sampaio, J., & Leite, N. (2013). Performance indicators in game sports. In T. McGarry, P. O’Donoghue, & J. Sampaio (Eds.), Routledge handbook of sports performance analysis (pp. 115-126): Routledge.

Vergne, N. (2008). Drifting Markov models with polynomial drift and applications to DNA sequences. Statistical Applications in Genetics and Molecular Biology, 7(1).

Vygen-Bonnet, S., Koch, J., Bogdan, C., Harder, T., Heininger, U., Kling, K., ... Mertens, T. (2021). Beschluss der STIKO zur 1. Aktualisierung der COVID-19-Impfempfehlung und die dazugehörige wissenschaftliche Begründung.

Wang, J., Zhao, K., Deng, D., Cao, A., Xie, X., Zhou, Z., ... Wu, Y. (2020). Tac-Simur: Tactic-based Simulative Visual Analytics of Table Tennis. IEEE Transactions on Visualization and Computer Graphics, 26(1), 407-417. doi: 10.1109/TVCG.2019.2934630

Wenninger, S., & Lames, M. (2016). Performance analysis in table tennis-stochastic simulation by numerical derivation. International Journal of Computer Science in Sport, 15(1), 22-36.