However, I have a few brief thoughts about the debate in general, more conceptual than factual, given that we have two weeks of similar discussions to look forward to. And the more it is discussed, the more I perceive the need to try to explain (probably in vain) the concept of performance analysis.
Dogmatic answers help nobody
First, I have realized that most people (present company excluded, if you’ve read this far) expect a simple answer to a complex problem. In fact, their desired answer should ideally fit into 140 characters so that it can be tweeted, and should not leave any room for interpretation. Where such room is left, people will choose to interpret it in a way that either seems naive (Froome and Sky must be clean), or pseudoscientific (change the name of your website to “the speculation of sport”, for example).
Earlier today, I read this article which was written by Antoine Vayer, who is now becoming known for interpreting Tour performances in largely one way – proof of “beyond mutant” physiology (those words belong to Google Translate, which did such a job on that article that it gave me flashbacks to assembling a table tennis table with Chinese-English instructions). He puts Froome’s calculated power output at 446 W (this is normalized to a 70kg rider, by the way, so it’s 6.4 W/kg, not 446/66kg, please note) and describes it as “miraculous”, leaving no doubt as to his conclusion that Froome is doping (apologies if something has been lost in translation).
Vayer is certain of his conclusions, that is for sure. Too certain, in my opinion. Because here’s the thing: While he may very well be correct, he is equally likely to be wrong, for a couple of reasons. First, as I’ve really laboured at every opportunity (though the message still hasn’t been received, based on the pseudoscience criticisms), the estimation of power output is likely to carry with it a large enough error that you simply cannot say things with 100% conviction. Some confidence? Yes. 100% confidence? No, not yet. The error in estimation exists because the assumptions of environmental conditions and mass influence the calculated value. A 3% over-estimation turns a calculated power output of 446W, or 6.4 W/kg, into 433W and 6.2 W/kg, which has implications for its plausibility. On this note, I’d also point out that the number calculated is just as likely to be an under-estimate as an over-estimate, so instead of dismissing 6.4 W/kg as high in error, bear in mind it might actually be 6.7 W/kg, and low in error.
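To make the arithmetic concrete, here is a minimal sketch of how a relative error moves the estimate. It assumes the error is a simple symmetric percentage of the calculated power and uses the 70 kg normalization mentioned above; the function name `power_band` is mine, purely for illustration.

```python
def power_band(watts, mass_kg, error_fraction):
    """Return [(watts, W/kg)] for the low, central and high ends of a
    symmetric relative error on a calculated power output."""
    band = []
    for factor in (1 - error_fraction, 1.0, 1 + error_fraction):
        w = watts * factor
        # Round to whole watts and one decimal of W/kg, as quoted in the text
        band.append((round(w), round(w / mass_kg, 1)))
    return band


# Assumed inputs: 446 W, normalized to a 70 kg rider, with a 3% error
for watts, w_per_kg in power_band(446, 70, 0.03):
    print(f"{watts} W -> {w_per_kg} W/kg")
```

With these inputs the low end of the band comes out at roughly 433 W, or 6.2 W/kg. The same function can be rerun with a smaller error fraction to see how quickly the band tightens.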
I must also add that I think people are over-playing this error, and using it to distract from the debate – 3% is large, and most of what we’ve seen is smaller, barring a few exceptional situations. If the error is acknowledged for obviously extreme situations, and performances are viewed collectively over time, rather than in isolation, I believe it gets small enough to not impact the conclusions in a large way. Beware the “pixelation” that happens when you stand too close to the picture, and you still get value. So I really do believe that the methods we’ve used are more accurate than some wish to acknowledge, but they’re not perfect. In this world of extremist views, however, “not perfect” is a synonym for “worthless”, which the method most certainly is not.
The physiological implications – probabilities, not certainties
The second reason I feel Vayer should adopt more caution with his conclusions is that there is room for interpretation in terms of what is “mutant” as opposed to plausible, even if the power output is accurately calculated (or better yet, measured). Vayer has interpreted his calculated power output of 446 W as almost mutant, strongly associating Froome with known dopers in the past.
Again, maybe Vayer will be shown to be correct in time, but that Ax-3-Domaines performance in isolation is not necessarily proof of doping. Nor do the 411W to 420W values of the other finishers fall in the “suspicious” range in the way he suggests. Of course, all the performances are suspicious – as the top finishers in the Tour, they are, sadly, under suspicion by their presence and the sport’s history!
However, Vayer’s definitive conclusions seem, at least from the translations I see, to be made without full recognition that the performance is the result of very complex physiology which is influenced by many factors, one of the most important being the duration of the climb. I’m told that in his magazine, “Not Normal”, he does adjust the expected power output based on the length, which is vital, because intensity should be higher on shorter climbs (apologies again if I have misconstrued this previously). That correction, or explanation, is absent from the latest article (as is the 70 kg method explanation, which also causes major confusion), but I would also disagree with the model he has used to get there, and that’s what makes his certainty misplaced, in my opinion.
On the matter of length, when a climb is short (and 24 minutes is relatively short), then a number of 6-6.1 W/kg is less suspicious than it would be over 40 minutes. This should be obvious, I hope. Physiologically, 6.4 W/kg for 24 minutes does not ring any alarm bells in and of itself. Remember, the origins of this approach are basically that performance implies physiology. Therefore, you can work backwards from power to estimate the physiology driving it.
The challenge is that there are three factors to account for – the VO2max, the efficiency and the sustainable workrate. If only two existed, a model could be created where one is played off against another, until a power output created an impossible balance, and a given performance could be flagged as unreasonable. However, when three factors are in play, it’s impossible to do this with certainty – too many ‘degrees of freedom’.
For example, that 6.4 W/kg calculated for Froome can be used to calculate an implied VO2max, by assuming his efficiency and his sustainable workrate, two of the three variables. Typical assumptions for both (24% and 85% of maximum workrate) produce an estimated VO2max of 88.6 ml/kg/min. Very high, and I’d be doing a double-take if I saw that combination of relatively high efficiency and super high VO2max in a lab, but it’s not ‘mutant’ or impossible, just very unusual.
However, what happens if he is able to ride at 90% of maximum, given that it’s only a 23 minute effort off the back of a relatively easy stage? Well, now, the prediction for VO2max drops to 83.7 ml/kg/min. And if his gross efficiency (a measure of how thrifty he is in using oxygen to produce energy) were to increase to 25%, then it’s only 80.4 ml/kg/min. I have summarized a range of these possibilities in the table below, with blue for lower efficiency (23%), green for moderate efficiency (24%) and red for high efficiency (25%).
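For anyone who wants to play with these assumptions, here is a rough sketch of the back-calculation. It is not necessarily the exact formula behind the numbers above: I am assuming an energy equivalent of about 21.1 joules per ml of oxygen (carbohydrate oxidation), and treating “percentage of maximum” as the fraction of VO2max being sustained. With that constant the results land within about 1 ml/kg/min of the figures quoted. The function names are hypothetical.

```python
# Assumption: ~21.1 J released per ml of O2 consumed (RER close to 1.0).
JOULES_PER_ML_O2 = 21.1


def implied_vo2max(w_per_kg, efficiency, fraction_of_max):
    """Estimate the VO2max (ml/kg/min) implied by a sustained power output."""
    metabolic_w_per_kg = w_per_kg / efficiency        # total energy turnover
    vo2 = metabolic_w_per_kg * 60 / JOULES_PER_ML_O2  # ml O2 per kg per minute
    return vo2 / fraction_of_max                      # scale up to the maximum


def vo2max_grid(w_per_kg, efficiencies, fractions):
    """Implied VO2max for every combination of efficiency and sustained fraction."""
    return {
        (eff, frac): round(implied_vo2max(w_per_kg, eff, frac), 1)
        for eff in efficiencies
        for frac in fractions
    }


grid = vo2max_grid(6.4, (0.23, 0.24, 0.25), (0.80, 0.85, 0.90))
for (eff, frac), vo2max in sorted(grid.items()):
    print(f"efficiency {eff:.0%}, riding at {frac:.0%} of max: {vo2max} ml/kg/min")
```

The useful thing to notice is how wide the range is for a single power output: the same 6.4 W/kg implies anything from very high to implausible physiology depending on which two assumptions you plug in.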
On the longer climbs, lasting 40 minutes or more (think Alpe d’Huez or Mont Ventoux), the sustainable power has to be lower – 80 to 85% would be, in my opinion, quite aggressive assumptions for the end of a 5 hour day in a three week stage race. Based on the pVAM method we’ve been discussing, it appears that the typical intensity of those longer HC climbs at the end of stages is around 75 – 80% of maximum, especially considering that many of the climbs are at altitude.
So, in the table above, if a rider is at 6.4 W/kg for the 40 minutes of Alpe d’Huez, with an efficiency of 24%, their VO2max would need to be around 94.2 ml/kg/min and they’d need to be riding at 80% of maximum, and I do believe that is unrealistically high. That’s why a realistic performance for those longer climbs is 6.0 – 6.1 W/kg, because that brings the physiology back down to very high, but plausible levels.
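The same logic can be run in the other direction: fix the physiology and ask what power it can sustain. This is only a sketch under the same assumption as before (about 21.1 joules per ml of oxygen, which is my choice of constant, not a measured value), with a hypothetical function name.

```python
# Assumption: ~21.1 J released per ml of O2 consumed (RER close to 1.0).
JOULES_PER_ML_O2 = 21.1


def sustainable_w_per_kg(vo2max, efficiency, fraction_of_max):
    """Power (W/kg) a rider can hold, given VO2max in ml/kg/min, gross
    efficiency, and the fraction of VO2max sustained for the climb."""
    vo2 = vo2max * fraction_of_max                # ml O2 per kg per minute
    metabolic_w_per_kg = vo2 * JOULES_PER_ML_O2 / 60
    return metabolic_w_per_kg * efficiency        # mechanical output only


# Same rider (VO2max 88.6, efficiency 24%), short climb vs long climb
short_climb = sustainable_w_per_kg(88.6, 0.24, 0.85)
long_climb = sustainable_w_per_kg(88.6, 0.24, 0.80)
print(f"at 85% of max: {short_climb:.2f} W/kg")
print(f"at 80% of max: {long_climb:.2f} W/kg")
```

Dropping the sustainable fraction from 85% to 80% pulls the same rider from the mid-6s down to roughly 6.0 W/kg, which is where the 6.0 to 6.1 W/kg figure for long climbs comes from.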
The point is that when many things are free to vary, the physiological prediction you make is a reasonable GUIDE, but not a perfect solution. Therefore, dogmatic conclusions about mutant physiology don’t do anyone any favors, least of all those of us trying to steer the conversation towards greater insight that is fair and realistic.
That said, the diagnosis, judgment and conviction of doping by performance IS possible, but the certainty only comes from truly eye-popping performances. If a guy produces 6.5 W/kg for 45 minutes, I’ll be calling shenanigans with confidence, because that’s a rider, however you play with the assumptions, who breaks the ‘rules’, as the table below shows. Then you can be more conclusive about calling out ‘mutants’ (but still not 100% sure, because we are limited by models; the big picture would complete this, taking it from 99% confident to 100% sure).
Froome’s performance on Ax-3-Domaines, whether it is 6.3 W/kg (Ferrari method), 6.4 W/kg (Vayer), or even 6.5 W/kg, lies below that line of “certainty”, but above a line of “clear plausibility”. If there are zones – red for certain doping, green for very realistic undoped, and orange for “could go either way”, we’re in orange on Saturday.
The historical analysis of the performances since 2009 (using the pVAM method) suggests that Froome’s performance, and the physiology driving it, lie closer to the limit than anything we’ve seen since the biological passport. And the way Sky went 1-2 on the day says they have found a competitive advantage, certainly. The source of the advantage is however not as clear cut, and so Vayer’s interpretation of a value as plausible vs suspicious vs mutant/miraculous is too certain.
With this in mind, the method of analyzing performance is not worthless, and I fully understand what Vayer is doing and support it. In concept. However, it does a disservice to the sensible interpretation of data when a number is categorically held up as proof of doping.
If I did that, and wrote that “Froome is certainly doping based on my calculations that his power output was 6.5 W/kg”, I would hope that many of you reading this would call it out and run me out of town on the basis of exceeding the strength of the analysis with my conclusion!
The value – the long term process, the hypothesis and the big picture
The value of this analysis is not the definitive answer it provides, but rather the process it asks of people. All these “tools” – VAM, pVAM, CPL, estimated power output, physiological implications – are merely designed to help generate theories and hypotheses that, over time, might either be proven or disproven by results. If that hypothesis is that the Tour has gotten slower as a result of being cleaner since the biological passport and sponsor/media driven pressure, then let the analysis show it, over time. Because over time, all that error that seems to freak people out would become smaller and smaller, and a better understanding of the big picture would emerge. Not in isolated efforts, which is the point of what I tried to express immediately after the stage on Saturday. It’s the process, folks, not the outcome, as is the case with sports science all the time.
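Why does the error shrink over time? A minimal sketch, assuming the error on each climb is random and independent of the others: the uncertainty on the average of n estimates falls with the square root of n. A systematic bias (say, a consistently wrong rider mass) would not average out this way, so treat this as the best case. The function name is hypothetical.

```python
import math


def pooled_error(single_error, n_performances):
    """Standard error of the mean of n independent estimates that each
    carry the same random relative error."""
    return single_error / math.sqrt(n_performances)


# Assumed 3% error on any single climb
for n in (1, 4, 9, 25):
    print(f"{n:>2} climbs: about {pooled_error(0.03, n):.1%} error on the average")
```

One climb in isolation carries the full 3%; a season or two of climbs viewed together carries a fraction of it. That is the whole argument for stepping back from individual performances.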
On that note, the performances of Sky riders (with the exception of Froome) on Sunday gave some cause for a bit more optimism, because they revealed that perhaps Saturday was a supra-maximal effort, the costs of which were paid on Sunday. The racing on Sunday was sustained through the whole stage, but the climbs were actually not as fast as they were Saturday – we were projecting between 5.0 and 5.2 W/kg on the final two climbs. The fact that the front group was as large as 30 pretty much the whole way up both those Category-1 climbs confirms that the pace was not fast. Yet Sky were absent. It is indeed ironic that we look for signs of weakness to dispel our cynicism. Once again, that day in isolation means little, but it was, after Saturday, a reason to think twice about Saturday’s performances. Avoid performance pixelation, step back, see the whole screen.
More reading and listening
I’ll call it a night there, and leave you, if you feel like more, with one more article to read and one podcast to listen to, on this subject.
This is a piece I wrote for a local paper – very short, very general, but summarizing what I’ve written about in the last few days.
Tomorrow, I have a more opinionated piece on cycling and why it should continue to attract questions, particularly for Sky.