Performance analysis, or the scientific analysis of sports performances, generally aims to identify and understand the factors that are crucial for performance in sports in order to provide information on how to maximize success (McGarry, 2009). In the context of game sports, i.e. invasion games, net games and striking/fielding games (Read & Edwards, 1992), performance is mainly determined by target-oriented behaviors or tactical actions and depends less on biomechanical and physiological components (Hughes & Bartlett, 2002). In theoretical performance analysis, these tactical behaviors or game behaviors are generally assessed as performance indicators using the research methods of game observation and notational analysis. Performance indicators reflect aspects which are deemed relevant for performance in a specific sport and mainly represent the frequency or relative frequency of a corresponding behavior (Hughes & Bartlett, 2002). In tennis, performance indicators are mostly presented as success rates of certain key behaviors or game variables. These may include, among others, percentages of first serves or returns played in, point winning rates after respective serves and returns, as well as absolute variables like the number of aces and double faults (Ma et al., 2013; Reid, Morgan, & Whiteside, 2016).
An alternative method to scrutinize the structure of sports may be found in Markov chain modelling (Lames & McGarry, 2007). Markov chain modelling is a form of probabilistic modelling that is applicable in a variety of fields, ranging from social sciences to epidemiology. A recent example of an application in medicine may be found in testing the efficacy of Covid-19 vaccines (Vygen-Bonnet et al., 2021). In game sports, Markov chain modelling may circumvent some evident shortcomings of discrete performance indicators (Lames & McGarry, 2007).
With regard to their conceptualization, discrete performance indicators lack context since they constitute summative statistics in the form of mean values derived from an individual performance or a set of performances. As a result, the underlying interactions between teams/players and the sequence of events are neglected. Thus, by assessing discrete performance indicators alone, it is hardly possible to examine the relation between actions and performance outcomes. Means derived from multiple performances, especially in performance profiles which accumulate different performances of an individual athlete (O’Donoghue, 2013), also suffer from the instability of game behavior (Lames & McGarry, 2007). This was, for example, demonstrated by McGarry and Franks (1996), who found squash players to exhibit varying shot responses to different opponents.
Finally, discrete performance indicators are commonly correlated with outcome variables like ranking position or winning/losing a match with the aim of explaining success or the lack of it. However, the success of certain tactical behaviors is highly context dependent; therefore, inferring a definite causal relation between discrete performance indicators and outcomes might be misleading without knowing the sequential context of game behavior (Sampaio & Leite, 2013).
Markov chain modelling, on the other hand, can serve descriptive purposes and may also establish a link between game actions and outcomes. Lames (1991) introduced finite Markov chain modelling based on a transition matrix for tennis. Within the transition matrix (see examples in Tables 1-3), a rally is represented as a succession of discrete states which are equivalence classes of strokes. Transitions are possible between different states and are quantified by the observed frequency of the corresponding game action. For instance, the first service error rate is given by the transition probability between the states “First service” and “Second service”. In general, a state transition models a player’s stroke and the associated outcome, which may be a following stroke of the opponent as well as a point or an error of the player. This allows a clear differentiation between states, which explains why state transition modelling is especially convenient for net games. Moreover, individual states or state transitions, like the service error rate mentioned above, can be treated similarly to performance indicators. Aggregating states and transitions in the form of a transition matrix, containing each stroke of a match, results in a super rally. Most importantly, this preserves the sequential context of the single rallies. State transition modelling thus provides a more comprehensive descriptive alternative to the generic display of performance indicators (Lames, 2020).
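How a transition matrix is aggregated from observed rallies can be illustrated with a minimal Python sketch. The state names, the toy rallies and the helper function transition_matrix are hypothetical and chosen for illustration only; they do not reproduce the state design or the data of this study.

```python
from collections import defaultdict


def transition_matrix(rallies):
    """Estimate transition probabilities from observed state sequences.

    Each rally is a list of state names ending in an outcome state.
    Returns a nested dict: matrix[state][next_state] = relative frequency.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for rally in rallies:
        # Count every observed pair (current state, following state).
        for current, following in zip(rally, rally[1:]):
            counts[current][following] += 1
    matrix = {}
    for state, row in counts.items():
        total = sum(row.values())
        matrix[state] = {nxt: n / total for nxt, n in row.items()}
    return matrix


# Hypothetical toy data: four rallies coded as state sequences.
rallies = [
    ["First service", "Return", "Point server"],
    ["First service", "Second service", "Return", "Point returner"],
    ["First service", "Second service", "Point returner"],  # double fault
    ["First service", "Point server"],                      # ace
]

P = transition_matrix(rallies)

# The first service error rate is the transition probability
# from "First service" to "Second service" (2 of 4 first serves here).
first_service_error_rate = P["First service"]["Second service"]
```

In this toy example each row of P sums to one, so individual entries can be read in the same way as conventional performance indicators while the ordering of strokes within each rally is retained.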
Net games have already been successfully modelled as finite Markov chains in the literature, demonstrating their additional potential (Lames, 1991; Pfeiffer, Zhang, & Hohmann, 2010; Wang et al., 2020; Wenninger & Lames, 2016). Treating the transition matrix as a finite Markov chain allows for further computations (Lames, 1991), which will be introduced below. These computations assume the Markov property, the so-called “memorylessness” of the process, which means that the transition to a subsequent state depends only on the present state of the process (Lames, 2020). First and foremost, one may calculate the expected rally length and the point winning probability starting from each state. As these variables can be directly observed in the real game, they can be compared to their computed equivalents and thus used for validation purposes. Further, by manipulating selected transition probabilities in the transition matrix, it is possible to examine the impact of changes in the associated game behaviors via the resulting change in the calculated point winning probability (Lames, 1991). Consequently, this may be seen as a viable method to establish a relation between game behavior and success, including dependencies on sex, surface and playing level.
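The computations described above can be sketched with standard absorbing Markov chain theory: with Q denoting the transitions among stroke states, the expected number of transitions until the point ends solves (I - Q) t = 1, and the point winning probability solves (I - Q) w = r, where r holds the direct transitions into the winning outcome. The following Python sketch is an illustration under stated assumptions, not the model of this study: the five stroke states, all probabilities and the helper names solve and analyse are hypothetical, and rally length is counted here as the number of transitions, including a faulted first serve.

```python
import copy


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def analyse(P, transient, target):
    """Return expected transitions to absorption and P(absorb in target)."""
    n = len(transient)
    i_minus_q = [[(1.0 if i == j else 0.0) - P[s].get(t, 0.0)
                  for j, t in enumerate(transient)]
                 for i, s in enumerate(transient)]
    length = solve(i_minus_q, [1.0] * n)
    win = solve(i_minus_q, [P[s].get(target, 0.0) for s in transient])
    return dict(zip(transient, length)), dict(zip(transient, win))


# Hypothetical transition probabilities (illustrative values only).
TRANSIENT = ["First service", "Second service", "Return",
             "Server stroke", "Returner stroke"]
P = {
    "First service": {"Second service": 0.40, "Return": 0.50,
                      "Point server": 0.10},
    "Second service": {"Return": 0.85, "Point returner": 0.10,
                       "Point server": 0.05},
    "Return": {"Server stroke": 0.70, "Point server": 0.20,
               "Point returner": 0.10},
    "Server stroke": {"Returner stroke": 0.60, "Point server": 0.20,
                      "Point returner": 0.20},
    "Returner stroke": {"Server stroke": 0.60, "Point server": 0.20,
                        "Point returner": 0.20},
}

length, win = analyse(P, TRANSIENT, "Point server")

# Manipulation: lower the first service error rate by 0.05 and move that
# probability mass to successful first serves, keeping the row sum at one.
P_mod = copy.deepcopy(P)
P_mod["First service"]["Second service"] -= 0.05
P_mod["First service"]["Return"] += 0.05
_, win_mod = analyse(P_mod, TRANSIENT, "Point server")

gain = win_mod["First service"] - win["First service"]
```

With these assumed values, win["First service"] is the server's computed point winning probability and gain is the change attributable to the manipulated behavior. How the removed probability mass is redistributed within the row is itself a modelling decision; the simple shift to a single target state used here is only one possible choice.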
Besides net games, Markov chains have also been used in team sports, though applications in this context remain scarce. This is arguably due to difficulties in the implementation of the model, for example with regard to different ball possession times of players or teams, as well as objections against the Markov property in this context (Lames, 2020). Liu and Hohmann (2013), who produced one of the few applications in football, used a transition matrix based on a grid of field positions of different players. Further, Lames et al. (1997) demonstrated the possibility of differentiating the impact of individual players on success in volleyball.
Generally, applications of finite Markov chain analyses are scarce, and in the case of tennis only rather outdated studies exist, e.g. Lames (1991). It can be expected that game structure has changed considerably since then, for instance due to developments in material (Miller, 2006). Furthermore, the previous design of states was in part unsatisfactory. Concerning returns, there was no differentiation between those following a first serve and those following a second serve. This might be regarded as a violation of the Markov property, as it is known that rates of successful returns differ substantially between the two (Lames, 2020). In addition, the serve can arguably be deemed the most important aspect of modern tennis and has considerable influence on the first strokes after serve and return (O’Donoghue & Brown, 2008), which was not reflected in the previous model. Finally, expert opinion suggests that net attacks at present do not have the importance they had in the 1990s. Thus, the previous model with its intensive treatment of the net game seems to be outdated in this respect as well.
Taken together, the aim of the current study was to establish and validate a newly designed state transition model for tennis which represents the current game structure. The adequacy of the resulting transition matrix for performance analyses was demonstrated with real, recent match data from top-level tennis. Moreover, the study used the transition matrix and finite Markov chain modelling to empirically assess state transitions and their tactical relevance with regard to the influencing factors playing surface and sex in modern tennis.