A leading expert in pharmacoepidemiology and drug safety, Dr. Stephen Evans, MD, explains how to detect scientific fraud and misconduct in clinical research. He details the mindset and statistical methods required to uncover data fabrication. Dr. Evans discusses the motives behind fraud and contrasts its prevalence in clinical trials versus post-marketing studies. He also illustrates a powerful detection technique: the analysis of digit preference in reported numbers.
Detecting Scientific Fraud and Misconduct in Clinical Trials and Drug Safety Research
Jump To Section
- Fraud Detection Mindset
- Clinical Trial Monitoring
- Fraud in Trials vs. Post-Marketing Studies
- Motives for Research Fraud
- Digit Preference Analysis for Fraud Detection
- Statistical Detection Methods
- Full Transcript
Fraud Detection Mindset
Dr. Stephen Evans, MD, emphasizes that detecting scientific fraud begins with a specific mindset. Researchers and regulators must first allow for the possibility that fraud can occur. This awareness is the foundational step in developing effective detection strategies.
A proactive approach to fraud detection involves constant vigilance. Dr. Stephen Evans, MD, notes that assuming data integrity without verification is a critical error. The mindset must include skepticism and a commitment to rigorous data validation processes.
Clinical Trial Monitoring
Regulatory authorities like the FDA conduct careful monitoring of clinical trials. Dr. Stephen Evans, MD, explains that this often involves on-site visits to locations where data is collected. However, he suggests this method is not always the most effective approach.
Statistical analysis plays a crucial role in optimizing monitoring efforts. Dr. Evans recommends using statistical methods to determine which sites require on-site monitoring. This data-driven approach improves the efficiency and effectiveness of fraud detection in clinical research.
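The interview does not spell out which statistics are used to target monitoring, so the sketch below is a hypothetical Python illustration of a risk-based approach: score each site on a couple of simple red flags (terminal-digit preference and implausibly low variability) and send monitors to the highest-scoring sites first. The function name and the scoring rule are assumptions made for illustration, not Dr. Evans's method.

```python
import numpy as np
from scipy import stats

def site_monitoring_scores(site_values):
    """Rank sites by a crude anomaly score so on-site visits can be targeted.

    site_values maps a site identifier to the numeric measurements
    (e.g., systolic blood pressure readings) reported by that site.
    Two illustrative red flags feed the score:
      * terminal-digit preference (chi-square against uniform last digits)
      * unusually low within-site variability versus the pooled data
    """
    pooled_sd = np.std(np.concatenate([np.asarray(v, dtype=float)
                                       for v in site_values.values()]), ddof=1)
    scores = []
    for site, values in site_values.items():
        values = np.asarray(values, dtype=float)
        last_digits = np.abs(np.round(values)).astype(int) % 10
        observed = np.bincount(last_digits, minlength=10)
        chi2, _ = stats.chisquare(observed)               # departure from uniform digits
        variance_ratio = pooled_sd / max(np.std(values, ddof=1), 1e-9)
        scores.append((site, float(chi2 * variance_ratio)))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Sites at the top of the returned list would then be prioritised for on-site monitoring, rather than visiting every site with equal intensity.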
Fraud in Trials vs. Post-Marketing Studies
Dr. Stephen Evans, MD, identifies important differences in fraud occurrence between study types. Fraud is easier to detect in clinical trials than in observational or post-marketing studies. The structured nature of trials provides more opportunities for pattern recognition.
Post-marketing studies often use electronic health records created for clinical purposes. Dr. Stephen Evans, MD, notes that healthcare professionals rarely record fraudulent patient data in these systems. The greater risk in post-marketing research lies in deficient analysis rather than data fabrication.
Motives for Research Fraud
Understanding researcher motives is crucial for fraud detection. Dr. Stephen Evans, MD, explains that academic investigators may commit fraud seeking professional glory. Positive trial results can bring significant recognition and career advancement.
Financial incentives also drive research misconduct. Dr. Evans describes how industry-funded trials provide payment for participant data. Some investigators may invent data or take shortcuts to receive these payments, creating clear patterns that detection methods can identify.
Digit Preference Analysis for Fraud Detection
Dr. Stephen Evans, MD, illustrates a powerful fraud detection method using digit preference analysis. When humans invent numbers, they cannot create truly random distributions. This creates detectable patterns that differ from authentic data.
The technique involves examining the last digits of reported measurements. Dr. Stephen Evans, MD, explains that people show consistent preferences for certain numbers (like 7) and avoid others (like 0 or 9). These patterns become evident through statistical analysis of large datasets.
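As a rough illustration of the idea (not Dr. Evans's published method), the last digits of a batch of reported measurements can be compared with a uniform distribution using a chi-square goodness-of-fit test; a strongly non-uniform pattern flags the data for closer review. The sketch below assumes the measurements are available as a simple numeric list, and the example values are invented.

```python
import numpy as np
from scipy import stats

def last_digit_preference(values):
    """Chi-square test of the terminal digits of reported measurements.

    Returns the observed counts of last digits 0-9 and the p-value for
    the hypothesis that all ten digits are equally likely. A small
    p-value signals digit preference; as the transcript notes, that can
    also arise from innocent rounding, so it is a prompt for review
    rather than proof of fabrication.
    """
    last_digits = np.abs(np.round(np.asarray(values))).astype(int) % 10
    observed = np.bincount(last_digits, minlength=10)
    _, p_value = stats.chisquare(observed)  # H0: uniform last digits
    return observed, p_value

# Hypothetical example: a 7-heavy set of "blood pressure readings"
reported = [127, 137, 117, 147, 127, 137, 135, 122, 141, 133,
            127, 137, 147, 117, 127, 137, 127, 138, 127, 117]
counts, p = last_digit_preference(reported)
print(counts, round(p, 4))
```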
Statistical Detection Methods
Dr. Stephen Evans, MD, develops specialized statistical methods for fraud detection. These techniques identify anomalies that suggest data fabrication. The methods are particularly effective for subjective measurements like blood pressure readings.
Dr. Evans describes how comparing real trial data against invented data reveals clear differences. The statistical patterns in fabricated data consistently deviate from expected natural distributions. These detection methods continue to evolve as researchers develop new ways to identify research misconduct.
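The published comparison referred to in the transcript is not reproduced here; as a hedged sketch of the general approach, the terminal-digit counts of a reference dataset and a suspect dataset can be placed in a 2 x 10 contingency table and tested with a chi-square test of homogeneity. The helper function and dataset names are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def compare_digit_patterns(reference, suspect):
    """Chi-square test that two datasets share the same last-digit distribution.

    A very small p-value suggests the suspect data were generated by a
    different process (for example, invented by hand) than the reference data.
    """
    def digit_counts(values):
        return np.bincount(np.abs(np.round(np.asarray(values))).astype(int) % 10,
                           minlength=10)

    table = np.vstack([digit_counts(reference), digit_counts(suspect)])
    table = table[:, table.sum(axis=0) > 0]        # drop digits absent from both sets
    _, p_value, _, _ = stats.chi2_contingency(table)
    return float(p_value)
```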
Full Transcript
Dr. Anton Titov, MD: Closer to the conclusion of our most interesting discussion, Professor Evans, another area of your expertise is the discovery of scientific fraud and misconduct. How can one actually detect scientific fraud and misconduct in clinical trials or in post-marketing and drug safety analysis?
Dr. Stephen Evans, MD: I think that you have to have a mindset that allows for the possibility, first of all. At the moment, in many clinical trials, particularly those that are monitored by the FDA or regulatory authorities, there is careful monitoring of what goes on in those trials.
Monitoring by visiting the sites where the data are collected, though, is not the most effective way of doing it. Usually, statistical analysis is used to determine where you should carry out the on-site monitoring. So that I think can be improved.
You need a mindset, you need analysis, you need to know what to look for in the data. There are patterns when people invent data that do not occur in real data.
I wouldn't really, in some senses, want to go through all the tricks of detecting fraud. Someone said to me that I should be very careful in explaining what I do to detect fraud because otherwise, people will find ways to get around it.
I'm not sure that I agree with that. I think it's my job to invent new statistical methods to detect fraud and misconduct in trials.
It's actually easier to detect fraud in trials than it is in observational studies or in post-marketing drug safety analysis. But a lot of the post-marketing studies are done in electronic health records that are used for clinical purposes.
It will rarely then be the data themselves that are fraudulent, because on the whole doctors, and other health professionals recording the data, don't write down fraudulent data for their patients. But it is the analysis of the data that might be deficient.
We do not, from my experience, see as much fraud in post-marketing safety analysis as we see in academic trials, where the result of the trial gives glory to the investigator. You need to be aware of the motives of people when they commit fraud.
Many doctors participate in randomized trials that are funded by industry, and they like the money that comes from that. So they may be tempted, and sometimes fall into the temptation, to take shortcuts or to invent data in order to be paid for that data in a trial.
I think we have pretty good ways of detecting when that occurs. We have less good ways of detecting it when observational studies are done badly, but there are possibilities of looking at that as well.
Dr. Anton Titov, MD: One of the fascinating papers that you published—and I think it's an open secret since it has been published—is how you compared a trial of a certain nutrition intervention for cardiovascular disease with a medical intervention, and showed that analysis of the last digits in the data could really reveal whether some scientific misconduct was happening, because of the non-random distribution. Could you please briefly discuss that kind of approach as an illustration of one of the many methods of your analysis that can discover these situations?
Dr. Stephen Evans, MD: If I were to ask all your audience to think of a number, a single number between zero and nine, and ask them to write it down now, and I was able to go and look at those results, I would not find an even distribution of the numbers between zero and nine.
There would be, for example, very few zeros and relatively few nines; rather more sevens. As soon as human beings start to invent numbers, they cannot invent them randomly unless they use a computer to do so. And if they use a computer to do so, then there are ways of detecting that.
So when we end up with anything that is subjective—and it used to be particularly the case with blood pressures, or with heights and weights where somebody wrote down a number after making an examination of a patient—then you would find digit preference. And that wasn't necessarily fraudulent.
But if you are having to invent all your numbers for a randomized trial and write them down, the patterns that human beings have in writing those numbers down enable you to detect differences from what is likely to be real data.
In the example you found, we had a real data trial and data where it was very clearly invented. And we could detect the difference between them because the human beings involved in inventing the data couldn't reproduce what was seen in the real world.