The Big Data marketing campaign distracts us from our greatest opportunities involving data. As we chase the latest Big Data technologies to increase volume, velocity, and variety (the 3 V’s), we will never resolve the fundamental roadblocks that have been plaguing us all along. I’ve written a great deal over the last few years about the fundamental skills of data sensemaking and communication that are needed to evolve from the Data Age in which we live to the Information Age of our dreams. It is essential that we develop these basic skills, but we must face many other concerns and resolve them as well before collecting more data faster and in greater variety will matter.
I recently wrote about one of those concerns in a blog post titled Big Data Disaster about the problems created by credit bureaus that shroud their scoring methodologies in mystery and have largely ignored their responsibility to base credit ratings on accurate data. We have a right to know how these bureaus determine our creditworthiness, and we should never be denied opportunities due to data errors that they haven’t seriously attempted to prevent or correct.
Today, I want to raise another important concern about data: the suppression of data of interest to the public. I believe in data transparency. Information that concerns us, especially that which can make the difference between health and illness or between life and death, should not be held hostage and hidden. This is an ethical issue concerning our use of data.
Pharmaceutical companies routinely suppress the results of unfavorable clinical trials. They even make it difficult in many cases to know that those trials were ever conducted. This results not only in a great deal of wasted research to repeatedly find what was already discovered and hidden, but also in lost lives and false hope. This suppression of data should be criminal, but it isn’t. It is, however, deeply wrong.
While I was teaching my workshop recently in London, one of my students recommended that I read a new book titled Bad Pharma by Ben Goldacre. She did so, she said, because she saw Goldacre and me as similar in our willingness to speak out against wrong. I speak out mostly against data sensemaking technologies that fail to deliver useful functionality, but Goldacre, a medical doctor, is speaking out against a systemic problem in the pharmaceutical industry, which involves regulatory agencies, publications, and academic institutions as well. What he reveals, all based on well-documented facts, is chilling. It is an incredible example of science at its worst.
In the book’s introduction, Goldacre writes:
We like to imagine that medicine is based on evidence and the results of fair tests. In reality, those tests are often profoundly flawed. We like to imagine that doctors are familiar with the research literature, when in reality much of their education is funded by industry. We like to imagine that regulators let only effective drugs onto the market, when in reality they approve hopeless drugs, with data on side effects casually withheld from doctors and patients…
Drugs are tested by people who manufacture them, in poorly designed trials, on hopelessly small numbers of weird, unrepresentative patients, and analyzed using techniques which are flawed by design, in such a way that they exaggerate the benefits of treatments. Unsurprisingly, these trials tend to produce results that favour the manufacturer. When trials throw up results that companies don’t like, they are perfectly entitled to hide them from doctors and patients, so we only ever see a distorted picture of any drug’s true effects…
Good science has been perverted on an industrial scale.
Goldacre documents the problem and its effects in great detail and goes on to describe with equal clarity what we can do to correct it.
If we wish to usher in a true information age, we must first develop an ethical approach to data, its dissemination, and its use. What Goldacre reveals about the pharmaceutical industry is but one example of data being selfishly and harmfully held hostage by powerful organizations. Problems like this will go unresolved and, in fact, will never even be addressed, if we’re spending all of our time chasing Big Data. First things first: let’s learn to use data responsibly.