About World Football Rankings

The Rating System

Process

The InOnIt.com rankings use international match data from friendlies ("A" internationals), minor and major tournaments, and World Cup qualifying and finals. The algorithm attempts to model scoring: it treats the number of goals a team scores against a particular opponent in a given match as a Poisson process. Given the match data, the ratings perform a best-fit analysis, assigning arbitrary initial ratings to individual sides and adjusting them until they reflect actual results.

In essence, the process assumes that the ability to score (or prevent scoring) is transitive. If A will score twice as often as B against a given opponent C, the model assumes that A will score twice as often as B against any other opponent D (although the scoring rates of A and B will change with the quality of D's defense). Because of the best-fit analysis, a result between A and B can affect C's rating: it shifts the relative value of previous results against A and B for every side that has played either, which in turn shifts the ratings of their opponents, their opponents' opponents, and so forth. This differs from the FIFA ratings or the proposed alternative Elo ratings (based on the chess rating system).

A previous version of the algorithm used goal differential as the main variable and did not attempt to model each side's goals separately. During the development of that algorithm, several variables were tuned to minimize the squared error of the algorithm's predictions against actual results. Essentially, all matches were divided into ten groups. The algorithm used nine groups to generate ratings and then used those ratings to predict the results of the matches in the tenth group. This analysis was repeated with each group in turn as the target group (and the others as the rating groups).
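The ten-group hold-out procedure described above can be sketched roughly as follows. This is a minimal illustration, not the site's actual tuning code: the match record layout, the `fit` and `predict` callables, and the toy rating model (average goal differential with a flat home edge) are all assumptions introduced here.

```python
import random
from typing import Callable, Dict, List, Tuple

# Assumed record layout: (home side, away side, home goals, away goals).
Match = Tuple[str, str, int, int]
Ratings = Dict[str, float]


def cross_validated_error(
    matches: List[Match],
    fit: Callable[[List[Match], float], Ratings],
    predict: Callable[[Ratings, str, str, float], float],
    param: float,
    k: int = 10,
    seed: int = 0,
) -> float:
    """Mean squared error of predicted goal differential over k held-out groups."""
    if not matches:
        raise ValueError("need at least one match")
    shuffled = matches[:]
    random.Random(seed).shuffle(shuffled)
    groups = [shuffled[i::k] for i in range(k)]
    total, count = 0.0, 0
    for i, target_group in enumerate(groups):
        # Rate on the other groups, then predict the held-out group.
        rating_matches = [m for j, g in enumerate(groups) if j != i for m in g]
        ratings = fit(rating_matches, param)
        for home, away, home_goals, away_goals in target_group:
            error = predict(ratings, home, away, param) - (home_goals - away_goals)
            total += error * error
            count += 1
    return total / count


def fit_mean_differential(training: List[Match], home_edge: float) -> Ratings:
    """Toy stand-in for the real fit: average goal differential per side,
    after subtracting the assumed home edge."""
    totals: Dict[str, float] = {}
    games: Dict[str, int] = {}
    for home, away, home_goals, away_goals in training:
        diff = home_goals - away_goals - home_edge
        for side, signed in ((home, diff), (away, -diff)):
            totals[side] = totals.get(side, 0.0) + signed
            games[side] = games.get(side, 0) + 1
    return {side: totals[side] / games[side] for side in totals}


def predict_differential(ratings: Ratings, home: str, away: str, home_edge: float) -> float:
    """Toy prediction: rating gap plus home edge; unseen sides default to 0."""
    return ratings.get(home, 0.0) - ratings.get(away, 0.0) + home_edge


if __name__ == "__main__":
    # Made-up results, purely to exercise the code.
    sample = [("A", "B", 2, 0), ("B", "C", 1, 1), ("C", "A", 0, 3), ("A", "C", 1, 0)]
    print(cross_validated_error(sample, fit_mean_differential, predict_differential, 0.3))
```

Calling `cross_validated_error` with different values of `param` gives the quantity a tuning program would try to minimize for one constant at a time.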
This hold-out analysis produced an error function that depended on a single variable (the one being optimized: for example, the value of playing at home, the amount of regression toward the mean that best predicts results, or how much to weight newer results over older ones). Using straightforward numerical methods, a tuning program found local minima of this function in order to find the best possible values for the various constants.

One interesting finding from this analysis is that more "important" games (e.g., World Cup finals and qualifiers) have more predictive value than other results (even for predicting "unimportant" results). This was not shocking (national teams deploy their best players in more important matches), but was not wholly expected. It's a nice verification that the model is measuring something real.

These constants have been adapted for use with the new algorithm. They could be recomputed for the new algorithm at some point, but computing them the first time was extremely computationally expensive, and there is no apparent reason for their values to change appreciably because of the change in model from goal differential to goals.

FIFA World Rankings

The FIFA World Rankings measure ... well, something. But certainly not the quality of a side. Their algorithm is quite complex, and may serve some overarching purpose of "fairness" or of creating incentives for national federations to behave in a particular way. But its composition has the flavor of the outcome of a series of "design meetings" that created arbitrary cutoffs (the double-the-top-seven system, for example) and/or of an attempt to make the ratings match preconceived notions by iteratively tweaking them with special rules and exceptions (the continental multipliers make little sense) until the inaugural set looked a certain way.
Much as any American loves any rankings that (7-May-2006) have the United States fourth in the world, InOnIt.com must bow to reality and argue that the FIFA ratings, while perhaps a masterful bit of lawmaking, are a horrible mess at modeling anything (if in fact they seek to model at all; FIFA's page suggests that they do -- "a reliable measure for comparing national A-teams"). This is not an article about the FIFA world rankings. But developing these rankings provided some insights into the FIFA rankings:
Self-critique

The most important way in which scoring in football may differ from a true Poisson process is that a team that is leading will change tactics (and possibly personnel) in order to protect its lead, while a trailing team will make corresponding changes. It is possible to argue that these changes in fact cancel one another out -- while one side attacks more (which will tend to make both it and its opponent score more goals), the other side defends more (which will tend to make both it and its opponent score fewer). But it's not clear whether the effect is symmetrical (there are good theoretical reasons it should be).

One way in which this ought to make the two sides' scores non-independent is FIFA's 3-1-0 scoring system: because both teams are penalized for a draw, it ought to induce more attacking when the score is level than when it is not. Whether this occurs in practice is not clear. In competitions where the 3-1-0 scoring system is not relevant (e.g., friendlies, knockout stages of competitions), it of course would not apply (and in friendlies, there is no clear payoff system at all).

Whether the differences between sides in their ability to score or prevent scoring form the sort of transitive linear relationship modeled here is not clear.

Other commentary

Feel free to give feedback (positive or negative) on these ratings (and say whether you are happy to be identified).

One commenter notes (5-May-2006):

"I calculated each team's % diff from the high rating (for France, this is 100*(2503-2419)/2503). The resulting curve is very near to being a perfect log function. It's almost eerie."

Perhaps someone mathematically inclined can help think this through?
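As a starting point for checking the commenter's observation, here is a small sketch that computes the percentage difference from the top rating and fits a logarithmic curve by least squares. The function names are hypothetical, and the choice of rank as the x-axis is an assumption, since the comment does not say what the curve was plotted against. Only the France figures come from the comment itself.

```python
import math
from typing import List, Tuple


def percent_below_top(rating: float, top_rating: float) -> float:
    """Percentage by which a rating falls short of the highest rating."""
    return 100.0 * (top_rating - rating) / top_rating


def fit_log_curve(percent_diffs: List[float]) -> Tuple[float, float]:
    """Least-squares fit of percent_diff ~ a + b * ln(rank), rank starting at 1.

    Assumes the list is ordered from the top-rated side downward and has
    at least two entries.
    """
    xs = [math.log(rank) for rank in range(1, len(percent_diffs) + 1)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(percent_diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, percent_diffs))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


if __name__ == "__main__":
    # France in the commenter's example: 100*(2503-2419)/2503
    print(round(percent_below_top(2419, 2503), 2))  # 3.36
```

Feeding the full list of percentage differences into `fit_log_curve` and inspecting the residuals would show how close to logarithmic the curve really is.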
World Cup Odds

With scoring in a match modeled as a Poisson process whose rate can be calculated from each side's offensive and defensive ratings, it's straightforward to run a Monte Carlo simulation of the World Cup and then use it to estimate the probabilities of selected outcomes, and thus the odds of their occurrence. A few details:
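The Monte Carlo idea in this section can be sketched for a single match as follows. Everything specific here is an assumption made for illustration: the multiplicative form of the scoring rate (chosen because it matches the "twice as often against any opponent" property), the `base_rate` of 1.3 goals, the (offense, defense) rating pairs, and all function names. The site's actual rate formula and rating scale are not given in the text.

```python
import math
import random
from typing import Dict, Tuple

# Hypothetical (offense, defense) multipliers; 1.0 is an average side,
# and a smaller defense factor means fewer goals conceded.
Rating = Tuple[float, float]


def expected_goals(attacker: Rating, defender: Rating, base_rate: float = 1.3) -> float:
    """Assumed multiplicative scoring rate for `attacker` against `defender`."""
    return base_rate * attacker[0] * defender[1]


def sample_poisson(rate: float, rng: random.Random) -> int:
    """Draw a Poisson-distributed goal count (Knuth's multiplication method)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_match(a: Rating, b: Rating, rng: random.Random) -> Tuple[int, int]:
    """Sample one scoreline, treating the two sides' goals as independent."""
    return (
        sample_poisson(expected_goals(a, b), rng),
        sample_poisson(expected_goals(b, a), rng),
    )


def outcome_probabilities(a: Rating, b: Rating, trials: int = 100_000, seed: int = 0) -> Dict[str, float]:
    """Estimate win/draw/loss probabilities for side `a` by repeated simulation."""
    rng = random.Random(seed)
    counts = {"win": 0, "draw": 0, "loss": 0}
    for _ in range(trials):
        goals_a, goals_b = simulate_match(a, b, rng)
        if goals_a > goals_b:
            counts["win"] += 1
        elif goals_a < goals_b:
            counts["loss"] += 1
        else:
            counts["draw"] += 1
    return {outcome: n / trials for outcome, n in counts.items()}


def fair_odds_against(probability: float) -> float:
    """Convert a probability into fair odds against, expressed as 'x to 1'."""
    if probability <= 0.0:
        return math.inf
    return (1.0 - probability) / probability


if __name__ == "__main__":
    stronger, average = (1.5, 0.8), (1.0, 1.0)  # made-up ratings
    probs = outcome_probabilities(stronger, average)
    print(probs, fair_odds_against(probs["win"]))
```

A tournament simulation would repeat this match-level sampling through the group tables and knockout bracket, counting how often each side reaches a given stage.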