Like every year at the start of winter, the Champions League offers a nice exercise in calculation: what are the probabilities for the round-of-16 draw, which takes place on Monday, December 13? Why do they vary so much from one match to another? How are they calculated? And, of course, who are the most and least likely opponents for PSG and Lille?
Recall the constraints that UEFA imposes on the draw: group winners must be paired with group runners-up. In addition, two teams from the same country or from the same group cannot be drawn against each other.
Take the example of Chelsea and Lille. Since Lille have six possible opponents (all the group runners-up except Paris, who are from the same country, and Salzburg, whom Lille faced in the group stage), one might think that Lille have a one-in-six chance of drawing Chelsea. But since Chelsea have only four possible opponents, the same logic says that the probability of Chelsea-Lille should be one in four. Something is wrong!
It is also tempting to think that, to calculate the probability that two teams A and B meet, it suffices to list all the valid outcomes of the draw (there are 4,781 of them this year) and then calculate the proportion of those outcomes in which A meets B. For example, among the 4,781 possible outcomes, there are exactly 1,192 in which Lille meet Chelsea (i.e. 24.93%).
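The count above can be checked by brute force. The following is a minimal sketch, not the author's own program: the team list, country codes and group letters are hard-coded here from the 2021-22 group stage, and the names `compatible` and `valid_draws` are illustrative.

```python
from itertools import permutations

# (name, country, group); hard-coded 2021-22 data, an assumption of this sketch.
WINNERS = [
    ("Manchester City", "ENG", "A"), ("Liverpool", "ENG", "B"),
    ("Ajax", "NED", "C"), ("Real Madrid", "ESP", "D"),
    ("Bayern", "GER", "E"), ("Manchester United", "ENG", "F"),
    ("Lille", "FRA", "G"), ("Juventus", "ITA", "H"),
]
RUNNERS_UP = [
    ("PSG", "FRA", "A"), ("Atlético", "ESP", "B"),
    ("Sporting", "POR", "C"), ("Inter", "ITA", "D"),
    ("Benfica", "POR", "E"), ("Villarreal", "ESP", "F"),
    ("Salzburg", "AUT", "G"), ("Chelsea", "ENG", "H"),
]


def compatible(runner_up, winner):
    """Two teams may meet only if they differ in both country and group."""
    return runner_up[1] != winner[1] and runner_up[2] != winner[2]


def valid_draws():
    """Yield every complete valid draw as a list of (runner-up, winner) names."""
    for perm in permutations(WINNERS):
        if all(compatible(r, w) for r, w in zip(RUNNERS_UP, perm)):
            yield [(r[0], w[0]) for r, w in zip(RUNNERS_UP, perm)]


draws = list(valid_draws())
with_pair = sum(("Chelsea", "Lille") in draw for draw in draws)
print(len(draws), with_pair, f"{with_pair / len(draws):.2%}")
```

With this data the script reproduces the 4,781 valid draws and the 1,192 draws containing Chelsea-Lille. It only counts outcomes; it says nothing yet about how likely each outcome is under the real procedure.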
A sequential procedure that impacts the calculation of probabilities
One might then conclude that the probability of Chelsea-Lille is 24.93%. That would be the case if the draw consisted of picking one ball at random from among 4,781, each ball representing a complete valid draw. But the draw is of course not done like that! It follows a sequential procedure, which affects the calculation of the probabilities.
Eight balls, one for each of the eight group runners-up, are placed in an urn. Each time a ball is drawn, a backtracking algorithm provides the list of possible opponents for that team. This can be more complicated than it seems: future dead ends have to be anticipated.
Imagine, for example, that the first four matches drawn are PSG-Manchester United, Sporting Lisbon-Liverpool, Benfica-Juventus and Salzburg-Manchester City, and that Chelsea are the fifth ball drawn from the urn. Even though Chelsea would a priori seem able to play Ajax, Bayern or Lille as well, only Real Madrid would be listed as a possible opponent for Chelsea.
Indeed, any other pairing would lead to a dead end: Real Madrid would be left with only Spanish clubs or Inter Milan, their group-stage opponents, all of which are prohibited. This explains why Chelsea-Real Madrid is so likely. Once the list of permitted opponents has been provided, one of them is drawn at random.
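The look-ahead step in this scenario can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not UEFA's actual software: the team data are hard-coded from the 2021-22 group stage, and `can_complete` and `possible_opponents` are hypothetical helper names.

```python
# name -> (country, group); hard-coded 2021-22 data, an assumption of this sketch.
WINNERS = {
    "Manchester City": ("ENG", "A"), "Liverpool": ("ENG", "B"),
    "Ajax": ("NED", "C"), "Real Madrid": ("ESP", "D"),
    "Bayern": ("GER", "E"), "Manchester United": ("ENG", "F"),
    "Lille": ("FRA", "G"), "Juventus": ("ITA", "H"),
}
RUNNERS_UP = {
    "PSG": ("FRA", "A"), "Atlético": ("ESP", "B"),
    "Sporting": ("POR", "C"), "Inter": ("ITA", "D"),
    "Benfica": ("POR", "E"), "Villarreal": ("ESP", "F"),
    "Salzburg": ("AUT", "G"), "Chelsea": ("ENG", "H"),
}


def compatible(runner_up, winner):
    """Two teams may meet only if they differ in both country and group."""
    r_country, r_group = RUNNERS_UP[runner_up]
    w_country, w_group = WINNERS[winner]
    return r_country != w_country and r_group != w_group


def can_complete(runners_up, winners):
    """Backtracking search: can every remaining runner-up still be paired?"""
    if not runners_up:
        return True
    first, rest = runners_up[0], runners_up[1:]
    return any(
        compatible(first, w) and can_complete(rest, [x for x in winners if x != w])
        for w in winners
    )


def possible_opponents(runner_up, runners_up_left, winners_left):
    """Winners that `runner_up` may draw without creating a dead end later."""
    others = [r for r in runners_up_left if r != runner_up]
    return [
        w for w in winners_left
        if compatible(runner_up, w)
        and can_complete(others, [x for x in winners_left if x != w])
    ]


# The scenario from the text: four matches already drawn, Chelsea drawn fifth.
already_drawn = [("PSG", "Manchester United"), ("Sporting", "Liverpool"),
                 ("Benfica", "Juventus"), ("Salzburg", "Manchester City")]
drawn_runners_up = {r for r, _ in already_drawn}
drawn_winners = {w for _, w in already_drawn}
runners_up_left = [r for r in RUNNERS_UP if r not in drawn_runners_up]
winners_left = [w for w in WINNERS if w not in drawn_winners]

print(possible_opponents("Chelsea", runners_up_left, winners_left))
# prints ['Real Madrid']
```

Ajax, Bayern and Lille all pass the simple country-and-group test, but each is rejected because Atlético, Inter and Villarreal could then not all be paired with the three winners left over.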
The influence of a single goal from an already eliminated team
By implementing the backtracking algorithm, we can compute the exact probabilities. The draw procedure does affect the odds: Chelsea-Lille is slightly more likely than it should be (25.19% versus 24.93%), while Chelsea-Real, the most likely match of the round of 16, is slightly less likely than it should be (31.27% versus 32.32%). By “should be” I mean: if the 4,781 valid draws were equally likely.
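One way to carry out such a computation is sketched below. It assumes that each remaining runner-up is equally likely to be drawn next and that its opponent is then picked uniformly among the permitted winners, which is how the procedure is described above. The team data are hard-coded from the 2021-22 group stage, the function names are illustrative, and this is not the author's own code.

```python
from collections import defaultdict
from fractions import Fraction
from functools import lru_cache

# (name, country, group); hard-coded 2021-22 data, an assumption of this sketch.
WINNERS = [
    ("Manchester City", "ENG", "A"), ("Liverpool", "ENG", "B"),
    ("Ajax", "NED", "C"), ("Real Madrid", "ESP", "D"),
    ("Bayern", "GER", "E"), ("Manchester United", "ENG", "F"),
    ("Lille", "FRA", "G"), ("Juventus", "ITA", "H"),
]
RUNNERS_UP = [
    ("PSG", "FRA", "A"), ("Atlético", "ESP", "B"),
    ("Sporting", "POR", "C"), ("Inter", "ITA", "D"),
    ("Benfica", "POR", "E"), ("Villarreal", "ESP", "F"),
    ("Salzburg", "AUT", "G"), ("Chelsea", "ENG", "H"),
]
# OK[r][w] is True when runner-up r and winner w differ in country and group.
OK = [[r[1] != w[1] and r[2] != w[2] for w in WINNERS] for r in RUNNERS_UP]


@lru_cache(maxsize=None)
def can_complete(rs, ws):
    """True if the remaining teams (frozensets of indices) can all be paired."""
    if not rs:
        return True
    r = min(rs)
    return any(OK[r][w] and can_complete(rs - {r}, ws - {w}) for w in ws)


def sequential_probabilities():
    """Exact probability of each pairing under the ball-by-ball procedure."""
    probs = defaultdict(Fraction)
    everyone = frozenset(range(len(WINNERS)))
    states = {(everyone, everyone): Fraction(1)}  # state -> probability of reaching it
    while states:
        next_states = defaultdict(Fraction)
        for (rs, ws), p in states.items():
            for r in rs:  # each remaining runner-up is equally likely to be drawn
                options = [w for w in ws
                           if OK[r][w] and can_complete(rs - {r}, ws - {w})]
                share = p / (len(rs) * len(options))
                for w in options:  # uniform choice among permitted winners
                    probs[r, w] += share
                    next_states[rs - {r}, ws - {w}] += share
        states = next_states
    return {(RUNNERS_UP[r][0], WINNERS[w][0]): p for (r, w), p in probs.items()}


probs = sequential_probabilities()
for pair in [("Chelsea", "Lille"), ("Chelsea", "Real Madrid")]:
    print(pair, f"{float(probs[pair]):.2%}")
```

Using `Fraction` keeps the arithmetic exact, and grouping partial draws by the set of teams still in the urns keeps the number of states small enough to enumerate. The printed values can be compared with the percentages quoted in the text.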
Lille’s most likely opponent is Chelsea, well ahead of Atlético de Madrid, Inter Milan and Villarreal (16.18% each) and the two Portuguese clubs (13.13% each). As for PSG, their most likely opponent is Real Madrid (19.36%), ahead of Liverpool, Manchester United and Juventus (17.86% each), then Ajax and Bayern (13.53% each).
While Real are Paris’s most likely opponent, PSG are not Real’s most likely opponent. One extreme case is amusing: Chelsea are Ajax’s most likely opponent, but Ajax are Chelsea’s least likely opponent! Note also that although Chelsea are by far Lille’s most likely opponent, there is still almost a 75% chance that Lille will not draw Chelsea!
Finally, I calculated what the odds would have been if Chelsea had not conceded a goal in stoppage time in their last game against Zenit St. Petersburg. Chelsea would then have won Group H ahead of Juventus, and the odds would have been completely different, not only for Chelsea and Juventus but for all the other teams as well. It is amazing how much a single goal (scored by an already eliminated team) can influence the rest of the competition!
Julien Guyon is a mathematician and football fan. A quantitative analyst, he is also an associate professor in the mathematics department at Columbia University and at the Courant Institute of Mathematical Sciences at New York University. His work is available on his web page: http://cermics.enpc.fr/~guyon/ and on his Twitter account: @julienguyon1977.