In the previous article we learned about Fisher’s original idea for using significance testing. Unfortunately, the current use of Null Hypothesis Significance Testing (NHST) hardly resembles Fisher’s original idea. In this article I will discuss some of the criticisms of NHST.
Disproving Null Hypotheses
The main aim of NHST seems to be disproving null hypotheses. A small (‘significant’) p value is taken as evidence against the null hypothesis, although strictly it cannot prove that the null hypothesis is false. Unfortunately, the converse does not hold: even a large, non-significant p value does not provide evidence for the null hypothesis. A large p value may simply reflect a small sample or imprecise measurements. This is captured by the dictum ‘one can reject a null hypothesis, but never accept it’.
Sometimes, however, one does not want to reject the null hypothesis. Studies of bioequivalence aim to determine whether one treatment is ‘as good’ as another. The researcher in such instances does not want to demonstrate the superiority of a particular treatment, but rather that the two treatments are equally effective. Similarly, non-inferiority trials attempt to demonstrate that a treatment is ‘not worse than’ another treatment.
Consider an example. The standard treatment for a disease cures 50% of those receiving it, at a cost of 100 units of currency. The disease is endemic in a limited-resource setting that cannot afford the treatment. Researchers develop a new treatment that costs only 10 currency units. They wish to establish that the new treatment is ‘as good’ as the original treatment. They do not want to reject the null (nil) hypothesis that there is no difference between the two treatments. On the contrary, they want to demonstrate that there is no difference between the two treatments.
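To see why a non-significant result cannot serve this purpose, here is a minimal sketch in Python built on the example above. Only the 50% cure rate comes from the example. Everything else is an assumption made for illustration: a new treatment that truly cures only 40%, hypothetical trials in which the observed counts exactly match those rates, and a two-proportion z-test based on the normal approximation (one common choice, not necessarily the test a real trial would use).

```python
from math import erf, sqrt


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def two_proportion_z_test(cured_a, n_a, cured_b, n_b):
    """Two-sided z-test of the nil hypothesis that both cure rates are equal."""
    p_a = cured_a / n_a
    p_b = cured_b / n_b
    # Pooled cure rate, which is the best estimate if the nil hypothesis holds.
    pooled = (cured_a + cured_b) / (n_a + n_b)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_a - p_b) / se
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p_value


def difference_interval(cured_a, n_a, cured_b, n_b, z_crit=1.96):
    """Approximate 95% confidence interval for the difference in cure rates."""
    p_a = cured_a / n_a
    p_b = cured_b / n_b
    se = sqrt(p_a * (1.0 - p_a) / n_a + p_b * (1.0 - p_b) / n_b)
    diff = p_a - p_b
    return diff - z_crit * se, diff + z_crit * se


# Hypothetical small trial: 20 patients per arm,
# standard treatment cures 10 (50%), new treatment cures 8 (40%).
z_small, p_small = two_proportion_z_test(10, 20, 8, 20)
ci_small = difference_interval(10, 20, 8, 20)

# Hypothetical large trial with the same observed cure rates:
# 2000 patients per arm, 1000 cured versus 800 cured.
z_large, p_large = two_proportion_z_test(1000, 2000, 800, 2000)

print(f"small trial: p = {p_small:.3f}, "
      f"95% CI for difference = ({ci_small[0]:.3f}, {ci_small[1]:.3f})")
print(f"large trial: p = {p_large:.2e}")
```

Under these assumptions the small trial gives a p value of about 0.5, far from significant, even though the new treatment is worse by ten percentage points. The confidence interval runs from about -0.21 to 0.41, so the data are compatible with anything from the new treatment being considerably better to it being much worse. The same observed cure rates in the large trial give a very small p value. Failing to reject the nil hypothesis therefore says more about the size of the trial than about whether the two treatments are equally effective.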