In any newspaper we may find a mention of some exciting new scientific discovery. It may be evidence of a promising new cure, or perhaps an association between our behaviour and health, or some aspect of human psychology which we can relate to ourselves. The scientists have been beavering away on our behalf. Newspaper mentions are a distillation of around 40,000 studies a year which are recorded in the public domain. But what weight can we place on this information: are we being offered knowledge or entertainment?
Such studies are the outcome of the scientific method. It can be summarised in three stages: observe the material world and identify possible patterns; formulate hypotheses which could reveal the rule behind these patterns; test the hypotheses and accept those that are verified by experience. The corpus of modern science is based on this methodology, and so we should be properly grateful. But we need to be wary.
The published study is not a gift just to us: it is meat and drink to the aspiring scientist because the progress of his career and his reputation can best be achieved through his public contribution to new and potentially useful knowledge. A negative study, often an important contribution in itself, gets no fanfares. As a result there is a sad history of positive studies, some of which are at least questionable. Problems may range from an optimistic massage of data to outright fraud. While many branches of science are susceptible, those concerned with psychology and sociology are particularly vulnerable because the concepts and outcomes are more difficult to measure than those in the physical sciences, even with recent assistance from fMRI brain scans.
Scientific fraud (search for examples on the internet if you are interested) is less important to us than earnest scientists who believe so strongly in their hypotheses that they are tempted to make their data fit the conclusion which they “know” to be true. One obvious way is to bin those studies that prove negative and start again – perhaps with minor alterations – until they get the results they want. We read the triumphant final study, but know nothing of the preceding failures. One authoritative source described data massage as “rife”. And I write as one who has occasionally been tempted to omit an awkward result which conflicts with the statistical confirmation I need.
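The temptation described above can be made concrete with a toy simulation (every number in it is invented for illustration). Each "study" here tests a fair coin, so there is no real effect to find; yet a researcher who quietly bins the failures and re-runs will, sooner or later, obtain a "significant" result.

```python
import math
import random

def p_value(heads: int, n: int) -> float:
    """Two-sided p-value for 'is this coin fair?' via the normal approximation."""
    z = (heads / n - 0.5) / math.sqrt(0.25 / n)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def run_study(n: int = 100) -> float:
    """One 'study': n flips of a fair coin, i.e. a world with no real effect."""
    heads = sum(random.random() < 0.5 for _ in range(n))
    return p_value(heads, n)

def persist_until_significant(max_tries: int = 200) -> int:
    """Bin negative results and repeat until p < 0.05; return the attempts used."""
    for attempt in range(1, max_tries + 1):
        if run_study() < 0.05:
            return attempt
    return max_tries

random.seed(1)
tries = [persist_until_significant() for _ in range(1000)]
# A couple of dozen attempts, on average, manufactures a 'discovery' from nothing.
print(sum(tries) / len(tries))
```

We read only the triumphant final attempt; the simulation simply makes visible how cheap that triumph is when the earlier attempts go unreported.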
Another problem is the need to account for other characteristics which may be affecting the outcomes. Take, as an example, a measurement of the benefits resulting from breastfeeding. If, as I understand to be the case, breastfeeding mothers are likely to have a higher level of education and better living standards, this may skew the results. One may employ a control for this, but omit other, less obvious, characteristics. And each control imposed adds to the cost of the study, while each confounder missed undermines its validity. Nor does such a study necessarily identify the drawbacks.
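A minimal sketch shows how an uncontrolled confounder manufactures a "benefit". In this toy world (the probabilities and effect sizes are invented) breastfeeding does nothing at all, yet a naive comparison shows a large advantage until we stratify by education:

```python
import random

random.seed(2)

def simulate_mother():
    # Hypothetical numbers, purely for illustration.
    educated = random.random() < 0.5
    # Breastfeeding is more common among the better-educated...
    breastfeeds = random.random() < (0.8 if educated else 0.3)
    # ...but in this toy world the child's outcome depends only on education.
    outcome = (2.0 if educated else 0.0) + random.gauss(0, 1)
    return educated, breastfeeds, outcome

data = [simulate_mother() for _ in range(20000)]

def mean_outcome(rows):
    return sum(o for *_, o in rows) / len(rows)

bf = [r for r in data if r[1]]
no_bf = [r for r in data if not r[1]]
print("naive 'benefit':", mean_outcome(bf) - mean_outcome(no_bf))

# Controlling for education (comparing within each stratum) makes it vanish.
for educated in (True, False):
    stratum = [r for r in data if r[0] == educated]
    diff = (mean_outcome([r for r in stratum if r[1]])
            - mean_outcome([r for r in stratum if not r[1]]))
    print("within stratum:", diff)
```

The stratified comparison here plays the role of the control; any confounder left out of the stratification would leave a spurious difference behind, exactly as the paragraph above warns.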
Few studies review complete populations – they use a sample. So, even if the study itself is gold-plated, the results can never be precisely reliable. Probability theory is used to calculate the margins of error. Avoiding technicality, the p-value indicates how likely a result at least as striking would be to arise by chance alone; it must be 0.05 or below for respectability. Oddly enough, there is always a large number of studies which just scrape in under this threshold – too large for coincidence. A similar statistical effect occurred when teacher discretion provided more C grades at GCSE than were warranted.
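The margin of error that sampling imposes can be sketched in a few lines (the trait rate and sample size are invented for illustration). Drawing 1,000 people from a population in which 40% have some trait, the standard 95% interval already spans several percentage points:

```python
import math
import random

random.seed(3)

population_rate = 0.40  # hypothetical true rate in the full population
n = 1000                # the sample size a study might afford

# Draw the sample and estimate the rate from it.
sample = [random.random() < population_rate for _ in range(n)]
p_hat = sum(sample) / n

# Standard 95% margin of error for a proportion (normal approximation).
margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"estimate {p_hat:.3f} +/- {margin:.3f}")
```

Even a perfectly conducted study, in other words, reports a band rather than a point, and the band shrinks only with the square root of the sample size.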
But there is a longstop. Substantial studies will be published in peer-reviewed journals. The intention is to check studies through expert criticism, and to provide the information needed for the study to be replicated by others. But even this system is flawed. Peer reviews, it has been strongly argued, are of questionable value and replication is a thankless task – thus too rarely done. In one case a pharmaceutical company decided to replicate 53 published studies of new drugs. Nine out of 10 failed.
What defence does the layperson have against inaccurate results? Certainly, one virtue is healthy scepticism. A reader with some close knowledge of a subject will be able to compare other studies and, if he has the skill, to analyse the study and its statistics; for that he may want to obtain the full paper. For most people, commentary in a responsible publication such as New Scientist or Scientific American is the best bet.
There is, however, a consolation by way of postscript. Religious believers are often accused of superstition, magic and claims resulting from wishful thinking. It may be a comfort to know that scientists, notwithstanding their steely regard for evidence, all too often prefer their own interests to the hard knocks of truth.