Last updated on October 26th, 2013 at 04:02 pm
It’s been just over a week since the Court of Arbitration for Sport released the summary of its decision to issue bans to two Italian cyclists, Franco Pellizotti and Pietro Caucchioli. The decision is significant because it was based on suspicious blood values recorded as part of the biological passport system, which lends the system legal backing.
They have therefore been banned not for failing dope tests or falling foul of criminal investigations (the other, seemingly more common way for cyclists to be ‘caught’ in this era of fallible testing). Rather, their bans are the first to be imposed based on the values measured in the biological passport system. One of the most significant aspects was that in the Pellizotti case, the UCI was appealing to CAS, which then overturned the decision by CONI (the Italian Olympic Committee) to exonerate Pellizotti. (In the Caucchioli case, incidentally, CONI banned the rider, and so it was Caucchioli who appealed to CAS.) The verdicts give the passport some serious credibility.
In the words of the CAS summary: “the CAS Panel has reviewed in detail the biological passport program applied by the UCI and has found that the strict application of such program could be considered as a reliable means of detecting indirect doping methods.”
This article from Juliet Macur of the NY Times summarizes the case, and its significance, nicely, so I won’t add too much to it.
An opportunity to discuss the science of the biological passport
What I will say, however, is that the case, and the resulting discussion, represents a great opportunity to discuss the science of the passport, and its legal ‘clout’. I’ve been very fortunate to have struck up a relationship with Prof Yorck Olaf Schumacher, one of the premier experts on the passport, and he has been very helpful in answering some of my questions and steering me in the direction of publications that help to explain the system.
So since the verdict, I’ve been working on pulling together some information that I hope helps to explain the passport a little more clearly, so that we can all understand its strength, the effect it has had on cycling, and also the limitations, because they will inspire future development to give it even more strength.
The Science: How does the passport system work?
About 18 months ago, we did an interview with Prof Schumacher, in which he explained the basic concept behind the biological passport. Here’s an excerpt from that interview:
“In the biological passport, we try to identify suspect constellations of biological markers that can not be caused or explained by other means than doping. This applies to markers of the haematological system, but extends to endocrinology and other organs”
So what are these “suspect constellations of biological markers”, and how does the legal “burden” affect the evaluation of the values measured? This is the fundamental starting point in understanding where the biological passport is headed as an anti-doping tool. So let’s begin with some blood physiology.
Reticulocytes, blood doping and the off-score
The central “characters” in the biological passport story are reticulocytes and hemoglobin, which are combined statistically to produce an Off-score. Hemoglobin you know – the oxygen carrier, which picks O2 up in the lungs and delivers it to the tissues. Reticulocytes are immature red blood cells, which of course, carry hemoglobin. Their “life-span” before maturing is about one day, which means that at any moment, a certain percentage of your blood cells are reticulocytes, the rest are mature red blood cells. This percentage is important, as we shall see.
Both blood doping (the removal and re-infusion of red blood cells) and EPO use increase the oxygen carrying capacity of the blood in order to improve performance, and their use is the target of the biological passport.
Take a look at the following graph, to illustrate some concepts. I’ve redrawn (and simplified) the graph from a research study by Torben Pottgiesser, published in the journal Transfusion only a week ago (more on this later)[cite]10.1111/j.1537-2995.2011.03076.x[/cite]. The cyclist is known to be blood doping – the researchers withdrew 500ml (shown by the red “down” arrows) and reinfused 280ml of red blood cells (shown by the blue “up” arrows) at various points during a simulated 42-week cycling season.
Biological passport 101 says that:
- “Normal” reticulocyte % is between 0.5% and 1.5%, but it can quite naturally lie outside this range. Also, the absolute level by itself tells us nothing about doping – the graph above shows this, because not once did this subject’s reticulocyte % rise above 1.5%, yet he was blood doping for almost a year. I’ll cover this crucial aspect in a little more detail later.
- After a withdrawal, the percentage of reticulocytes generally goes UP. This is because the body responds to the sudden loss of red blood cells by stimulating more red blood cell formation. This means more new blood cells as a percentage of the total cell number, and it is the whole point, because when you re-infuse that blood later, you get a double benefit.
- On the other hand, the re-infusion of blood (the blue arrows) causes a drop in reticulocytes. Why? Because the cells that are being re-infused are “older” (they’ve been stored in a refrigerator!) and so the new blood, post infusion, has more red blood cells, but fewer of them are immature.
The opposite is true for hemoglobin, incidentally. Here, the withdrawal of blood is characterized by a fall in Hb concentration, while the re-infusion of blood increases Hb levels acutely.
These two measurements, which are affected by blood doping and also EPO use (since EPO will stimulate red blood cell formation, thus increasing reticulocyte %), provide nice “flags” for measurement. They are used to calculate what is called the “OFF-score”, or “stimulation index”, which combines hemoglobin and reticulocytes into a single number (the calculation, for those who are interested, is Hb x 10 – 60 x the square root of the reticulocyte %, with Hb in g/dL).
The Off-score is of interest because it would be able to pick up both withdrawal of blood (characterized by a rise in reticulocytes and a fall in Hb), as well as the re-infusion of blood (reticulocytes fall and Hb concentration rises).
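To make the arithmetic concrete, here is a minimal sketch of the Off-score calculation. It assumes Hb is expressed in g/dL, and the three sets of blood values are invented purely for illustration (they do not come from any rider or from the Pottgiesser study). The function name `off_score` is my own.

```python
import math


def off_score(hb_g_per_dl: float, retic_percent: float) -> float:
    """Off-score = Hb (g/dL) x 10 - 60 x sqrt(reticulocyte %)."""
    if hb_g_per_dl <= 0 or retic_percent < 0:
        raise ValueError("Hb must be positive and reticulocyte % non-negative")
    return hb_g_per_dl * 10 - 60 * math.sqrt(retic_percent)


# Invented example values, for illustration only.
baseline = off_score(15.0, 1.0)          # unremarkable Hb and reticulocyte %
after_withdrawal = off_score(14.0, 1.6)  # Hb falls, reticulocytes rise
after_reinfusion = off_score(16.0, 0.4)  # Hb rises, reticulocytes fall

print(round(baseline, 1), round(after_withdrawal, 1), round(after_reinfusion, 1))
```

With these made-up numbers the withdrawal pushes the score below the baseline and the re-infusion pushes it above, which is the two-directional behaviour described above.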
As is the case with reticulocytes, there is a “normal” or undoped range in Off-scores that lies between 80 and about 110, but because of differences between individuals, natural variation and probabilities, it’s not good enough just to set an upper limit and use it to ban cyclists. This is where the issue of probabilities and variation comes in – if you are going to enforce passport results to ban dopers, you need to be 99.9% sure that those measured values do not occur in an undoped athlete.
Probabilities and the legal considerations: False positives
The end result is that in order for the biological passport to stand up to forensic and legal scrutiny, one has to manage the risk of “false positives”, cases where a cyclist is not doping but their blood values are flagged as “suspicious”. The only ways to manage this are to:
- Set confidence limits or boundaries that are safe and unlikely to produce many false “strikes”. If you do this, and set a confidence limit of 99.9%, then you can define samples as suspicious or abnormal if they exceed the statistical individual threshold with a 99.9% probability. Put differently, if you set the limit at 99.9%, then the chance of finding values outside this boundary from an undoped athlete is 1 in 1000 samples. Those are pretty good odds, and are in line with most legal precedents. If you lowered your probability level to 99%, then the chances of finding a value outside this is 1 in 100, without doping. Obviously, not quite as good.
- Test and research what constitutes “normal” and how much variation is acceptable without doping, and progressively tighten the boundaries or limits so that you can begin to say with confidence that a given change in blood markers indicates doping.
Return for a moment to that study by Pottgiesser that I mentioned earlier – this study simulated a 42-week cycling season and split a group of cyclists into a blood-doping group and a non-doping group. Just to highlight that “false positives” do happen, the following figure is taken from their results – it shows the reticulocyte, Hb and Off-scores for a cyclist who is NOT doping. The darker line in the middle of each graph is the measured values, while the two lighter lines represent the confidence limits that I was talking about earlier. Here, they are set at 99%.
The key graph here is Hb on the top left – you will see that on the very first day, this cyclist would have been given a “strike” for a Hb value that lay outside the 99% limit. Every other value was fine, but that one, for whatever reason, lay above the threshold – clearly the threshold is “imperfect”. (Just a note on the reason – it may be related to the fact that this is the first reading, and therefore the boundaries are the ones set for all subjects with no previous values. Individuals who have naturally high levels may fall outside this range. The addition of more measurements, however, improves the probability limits.)
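The idea that limits start out as population-wide and then tighten around the individual as readings accumulate can be sketched with a simple normal model. To be clear, this is not the passport software: it is a textbook Bayesian update under assumptions I have chosen for illustration (normally distributed values, a known spread between athletes and within one athlete, a two-sided interval), and the function name and every number are hypothetical.

```python
from statistics import NormalDist, fmean


def adaptive_limits(history, pop_mean, between_sd, within_sd, level=0.99):
    """Two-sided limits for the athlete's NEXT reading, given earlier readings.

    Assumed model (illustrative only): the athlete's true mean is drawn from
    Normal(pop_mean, between_sd), and each reading scatters around that true
    mean with standard deviation within_sd.
    """
    n = len(history)
    prior_precision = 1 / between_sd ** 2
    data_precision = n / within_sd ** 2
    post_var = 1 / (prior_precision + data_precision)
    sample_mean = fmean(history) if history else 0.0
    post_mean = post_var * (pop_mean * prior_precision + sample_mean * data_precision)
    # Uncertainty about the true mean plus ordinary reading-to-reading scatter.
    predictive_sd = (post_var + within_sd ** 2) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return post_mean - z * predictive_sd, post_mean + z * predictive_sd


# Hypothetical Hb values (g/dL) for an athlete who runs naturally high.
first_reading_limits = adaptive_limits([], 14.8, 1.0, 0.5)
later_limits = adaptive_limits([16.2, 16.0, 16.3], 14.8, 1.0, 0.5)
print(first_reading_limits, later_limits)
```

Before any readings exist, the limits are centred on the population value and are wide. After three readings they are narrower and centred close to the athlete's own level, so a naturally high value is less likely to be flagged while a sudden change is more likely to be.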
For comparison’s sake, here is the doped subject whose graph of reticulocytes I redrew earlier in the post:
This cyclist would have received three “strikes” during the season – one for Hb and two for the Off-score. All three, of course, are legitimate in this case.
The conclusion of this Pottgiesser paper, incidentally, is that the off-score had high sensitivity in detecting autologous blood transfusions – in 11 cyclists, it caught 8 during this simulated season at a probability level of 99%. At 99.9%, as you might expect from more stringent limits, it picked up 5 out of 11 doping athletes over the ‘season’. The only false positive came from Hb in that one subject, not from the Off-score, which was recommended for future use in the biological passport model.
As I mentioned previously, this Off-score is attractive because it can pick up both withdrawal (reticulocyte % rises and Hb falls) and re-infusion (reticulocyte % falls and Hb rises) of blood. Just to emphasize and illustrate, here is the above doped subject’s Off-score again. I’ve highlighted the two “strikes” with orange diamonds, where the measured values lie beyond the boundaries that are set by a 99% probability level (the lines shown in light blue). You can see how the Off-score picked up both the infusion (first strike) and the withdrawal (second strike):
The biological passport process – multiple stages
OK, so having established what the passport is measuring, and also that there is this probability issue, the next piece of the puzzle is the process. And given that the passport works on the balance of probabilities, this is how the system is set up to run.
About 800 riders make up the collective covered by the biological passport programme, and they provide blood samples regularly – these samples are analysed as I explained above.
All measurements are analysed by a team of experts using software that is based on Bayesian statistics, and they calculate the probability of the values being found in a normal, undoped sample. As mentioned above, these limits are set at 99.9%, which means that values beyond those boundaries are found in only one of 1000 cases in an undoped individual.
Just to illustrate a key point, if they set the limits at 99%, then in the cycling collective of 800, you’d expect 8 cases of “false positives” where the rider’s value is flagged when there is no doping. This is why the limits have to be very strictly set in order to have legal “clout” – it’s too easy to dismiss a method that produces this many false positives. The downside, of course, is that cyclists who are doping can still go undetected, but there is this compromise between “cavalier” testing with high risk of false positives and the desire to catch every doper. There are built-in steps to manage this, however.
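The arithmetic behind that trade-off is simple enough to sketch. The snippet assumes one sample per rider, a single marker, every rider clean and every sample independent, which is a simplification of the real programme. The function names are mine.

```python
def expected_false_flags(n_samples: int, limit: float) -> float:
    """Expected number of clean samples falling outside the limits."""
    return n_samples * (1 - limit)


def prob_at_least_one_false_flag(n_samples: int, limit: float) -> float:
    """Chance that at least one clean sample is flagged, assuming independence."""
    return 1 - limit ** n_samples


for limit in (0.99, 0.999):
    print(
        f"limit {limit:.1%}: "
        f"about {expected_false_flags(800, limit):.1f} false flags expected, "
        f"P(at least one) = {prob_at_least_one_false_flag(800, limit):.2f}"
    )
```

Under these assumptions, moving from 99% to 99.9% cuts the expected number of false flags across 800 clean samples from about 8 to less than 1, though it does not reduce the chance of a stray flag to zero. That is one reason a single flagged value is not enough to open a case.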
Cases are usually opened only if several different variables are beyond these boundaries on more than one occasion. By this measure, our non-doping subject who was picked up in the research study I described earlier would not face investigation, which is how it should be.
When this happens, the experts get together to evaluate and analyze the values. If they feel that the profile is typical for a certain doping intervention, the athlete is contacted and questioned about potential reasons for his values. His justifications are again evaluated by the experts. Only if they are still convinced that the profile is typical of doping and is not caused by the explanations put forward by the athlete (as has happened for Pellizotti and co), do they suggest the opening of a procedure against the athlete.
Clearly, the process is quite long and has multiple “security” levels to protect the clean athlete or to filter out pathologies that might cause abnormal values.
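The first filter in that process, that a case is usually opened only when several variables fall outside the limits on more than one occasion, can be written as a simple rule. This is only a sketch: the cut-offs of two markers and two occasions are my reading of “several” and “more than one”, not published criteria, and the function name and data layout are hypothetical. The real decision rests with the expert panel.

```python
def should_go_to_expert_review(flags_by_sample, min_markers=2, min_occasions=2):
    """Return True if flags span enough different markers and sampling occasions.

    flags_by_sample: one dict per blood sample, mapping marker name to True
    when that marker lay outside the athlete's limits in that sample.
    """
    flagged_markers = set()
    flagged_occasions = 0
    for sample in flags_by_sample:
        out_of_range = {marker for marker, is_out in sample.items() if is_out}
        if out_of_range:
            flagged_occasions += 1
            flagged_markers |= out_of_range
    return len(flagged_markers) >= min_markers and flagged_occasions >= min_occasions


# A single Hb flag on the first sample, as for the non-doping subject above.
clean_profile = [
    {"Hb": True, "Off-score": False},
    {"Hb": False, "Off-score": False},
    {"Hb": False, "Off-score": False},
]
# One Hb strike and two Off-score strikes, assumed here to fall on separate occasions.
suspicious_profile = [
    {"Hb": False, "Off-score": True},
    {"Hb": True, "Off-score": False},
    {"Hb": False, "Off-score": True},
]

print(should_go_to_expert_review(clean_profile))
print(should_go_to_expert_review(suspicious_profile))
```

A lone out-of-range value does not trigger review under this rule, while repeated flags across different markers do. Even then, the outcome is only that the experts look at the profile, not that the rider is sanctioned.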
The effectiveness of the passport – if we can’t catch them all, is it worth it?
That’s a quick overview of the biological passport – how it works, what it measures, and why it is not as simple as just setting a limit and banning everyone who exceeds it. Sometimes, you can dope but still remain within those limits, whereas other times, people who do not dope may exceed them!
So the key is understanding probabilities. Having hopefully done that, you may now be wondering whether the biological passport is even worth anything. Is it effective, given that the probability has to be so high, and that there is so much physiological variation that we can “miss” dopers (even the Pottgiesser study caught 8 out of 11 – three fell through)? I have read a great deal on the internet and there seems to be an opinion that if the biological passport cannot assure detection and conviction of doping, then it is ineffective and should not be bothered with.
That’s a whole other debate. I want to say that the answer is a resounding “Yes, it is effective”, and I have some pretty cool data to back up my belief that the biological passport is having a significant effect on doping in cycling. And I also want to discuss where it may be headed in the future.
But right now, this is a lot to take in (and even more to try to write!), so I’ll leave it at this, and say join me soon for more on this topic!
Ross
So the officials evaluating cases like this are going to be asking questions like “How likely is it that a cyclist can produce an Off-score outside the 99.9% probability limit if they are not doping?”, and “What kind of conditions would explain the biological profiles that we’re being presented with?”
To explain the CAS process a little, what happens is that when a case is presented to CAS, a panel consisting of three judges is put together. Judges come from a pool of people who work for CAS, and one is proposed by each party, with a third neutral judge making up the three-person panel. These three are judges, not scientists, and so you can appreciate how a case like this presents them with some pretty serious challenges. When it comes to expertise, CAS has a choice. Usually, they will listen to the experts of either side and then come to a judgment. In some cases, they will bring in neutral experts who can explain and translate what is often very complex science in order to help them reach the right verdict. The same would have happened in the Oscar Pistorius case, incidentally, but as far as I know, they did not opt for a neutral expert in either case.
As a result, studies like those of Pottgiesser become incredibly important. The reality here is that we are entering uncharted waters, and the more data the better for the arguments of those wishing to use the passport as a tool.