So what’s the deal with the conflict between Neyman and Fisher?
It appears to be this:
- Neyman wanted to be explicit about alternative hypotheses.
- Neyman used the explicit alternative hypothesis to generate and prove the optimality of tests.
- Neyman wanted to calculate power and saw it as an important part of the whole hypothesis testing set up.
- Fisher used p-values while Neyman used fixed \(\alpha\)s.
- Fisher talked about inductive evidence while Neyman talked about inductive behavior.
How should modern frequentists think about this?
First, let me say this. Neyman is simply correct on the first two points. Any p-value has a family of implied alternatives, and you should be explicit about them. The way he proved optimality of tests should be uncontroversial, at least in the very few settings where it is possible.
On the third point: maybe you don’t need to calculate power all the time, and at least not in the classical sense of calculating it for one alternative at a time. You can, for instance, average the power over some prior on the parameter values, or compute the expectation of the p-value under an alternative.
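To make those three notions of power concrete, here is a minimal sketch. The setup is my own assumption, not something Neyman or Fisher prescribed: a one-sided z-test of \(\mu = 0\) against \(\mu > 0\) with known unit variance and \(n\) observations, and a normal prior on \(\mu\). The function names are made up for illustration. In this setting all three quantities have closed forms.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power(mu, n, alpha=0.05):
    """Classical power at a single alternative mu.

    Assumed setup: Z = sqrt(n) * xbar ~ N(sqrt(n) * mu, 1), reject when
    Z exceeds the (1 - alpha) quantile of the standard normal.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(sqrt(n) * mu - z_crit)


def average_power(prior_mean, prior_sd, n, alpha=0.05):
    """Power averaged over an assumed N(prior_mean, prior_sd^2) prior on mu.

    Uses the identity E[Phi(a + b X)] = Phi(a / sqrt(1 + b^2)) for X ~ N(0, 1).
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    scale = sqrt(1 + n * prior_sd ** 2)
    return STD_NORMAL.cdf((sqrt(n) * prior_mean - z_crit) / scale)


def expected_p_value(mu, n):
    """Expectation of the one-sided p-value when the true mean is mu.

    E[1 - Phi(Z)] with Z ~ N(sqrt(n) * mu, 1) equals Phi(-sqrt(n) * mu / sqrt(2)).
    """
    return STD_NORMAL.cdf(-sqrt(n) * mu / sqrt(2))


if __name__ == "__main__":
    n = 25
    print("power at mu = 0.5:", power(0.5, n))
    print("average power, prior N(0.5, 0.2^2):", average_power(0.5, 0.2, n))
    print("expected p-value at mu = 0.5:", expected_p_value(0.5, n))
```

Two sanity checks fall out of the formulas: at \(\mu = 0\) the power equals \(\alpha\) and the expected p-value equals 0.5, and a prior with zero spread gives back the classical power. Note that the expected p-value needs no \(\alpha\) at all.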
I think you should report p-values. Why not? They contain more information, simple as that. The only exception is if your test does not have nested acceptance sets so that its p-value does not exist. But I doubt you’re in that situation.
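Here is a small sketch of what “more information” means, again assuming a one-sided z-test with known variance (my choice of example, with made-up function names). From the p-value you can recover the accept/reject decision at every \(\alpha\); from a single decision at a fixed \(\alpha\) you cannot recover the p-value.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_p_value(xbar, n, sigma=1.0):
    """P-value for testing mu = 0 against mu > 0 with known sigma (assumed setup)."""
    z = sqrt(n) * xbar / sigma
    return 1 - NormalDist().cdf(z)


def decisions(p_value, alphas=(0.10, 0.05, 0.01)):
    """Reject/accept decision at each level: reject exactly when p <= alpha."""
    return {alpha: p_value <= alpha for alpha in alphas}


if __name__ == "__main__":
    p = one_sided_p_value(xbar=0.4, n=25)  # z = 2.0
    print("p-value:", p)
    print("decisions:", decisions(p))
```

This only works because the rejection regions of the z-test are nested in \(\alpha\), which is the condition mentioned above.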
The fifth point is a joke. Who cares about “inductive behavior”? Next!