You should develop good habits as a student, it is often said. Still, I was never guided in this. And I honestly don’t think my teachers had any good habits to teach anyway.
As a statistician, what habits should you definitely learn right away? I can think of some:
i) Good programming habits. Make projects or R packages. Lint your code. Always write tests; plenty of tests. Document your code. Never skip documenting and testing your code! Yeah, testing and documenting code: how often did your teacher show you how to do that? I was taught, implicitly anyway, that teachers automatically write correct code and have no need to test it.
I also think it’s nice to stick to some style, but documentation and testing are much more important.
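To make "document and test" concrete, here is a minimal sketch in base R. The function `sample_variance` is a made-up example, the comments follow the roxygen2 convention, and the tests use plain `stopifnot` so nothing needs to be installed; in a real package you would probably use testthat instead.

```r
#' Sample variance with the unbiased (n - 1) denominator.
#'
#' @param x Numeric vector with at least two elements.
#' @return A single number: sum((x - mean(x))^2) / (length(x) - 1).
sample_variance <- function(x) {
  stopifnot(is.numeric(x), length(x) >= 2)
  sum((x - mean(x))^2) / (length(x) - 1)
}

# Tests: a hand-computed case, agreement with base R, and an input check.
stopifnot(isTRUE(all.equal(sample_variance(c(1, 2, 3, 4)), 5 / 3)))
x <- c(2.5, -1, 0, 7.25, 3)
stopifnot(isTRUE(all.equal(sample_variance(x), var(x))))
stopifnot(inherits(try(sample_variance(1), silent = TRUE), "try-error"))
```

The hand-computed case matters most: it is the only test that does not depend on other code being right.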
We also need some guidance on how to structure stuff. Typically you’ll have many small projects. Some here and some there. How do you keep track of them and remember where the code is?
ii) Good checking-your-calculations habits. Yeah, seriously! A lot of my time goes into checking my calculations. It often works like this: I write down some formula on a piece of paper, in LyX or, gasp, in LaTeX. Then I copy the formula into R, make a simulation or numerical approximation of the formula, and check the two for consistency. Now, 8 times out of 10 the two results do not agree. Why don’t they? There are a couple of potential causes here:
- The numerical approximation is not correctly implemented.
- The formula wasn’t copied correctly into R.
- The formula is wrong.
Obviously, all of these can be true at the same time.
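Here is a rough sketch of that workflow in R. The formula is my own pick for illustration: for X ~ Exp(rate), E[X^2] = 2 / rate^2. The formula is compared with a numerical integral and with a simulation, and the variable names are just placeholders.

```r
set.seed(1)  # make the simulation reproducible
rate <- 1.5

# 1. The formula as written on paper: E[X^2] = 2 / rate^2 for X ~ Exp(rate).
formula_value <- 2 / rate^2

# 2. Numerical approximation: integrate x^2 against the exponential density.
numeric_value <- integrate(function(x) x^2 * dexp(x, rate = rate),
                           lower = 0, upper = Inf, rel.tol = 1e-8)$value

# 3. Simulation, with a standard error so that "close" has a meaning.
draws <- rexp(1e5, rate = rate)^2
sim_value <- mean(draws)
sim_se <- sd(draws) / sqrt(length(draws))

# The formula and the integral should agree to numerical precision;
# the simulation should land within a few standard errors.
stopifnot(abs(formula_value - numeric_value) < 1e-6)
stopifnot(abs(formula_value - sim_value) < 5 * sim_se)
```

Having two independent checks helps with the diagnosis: if the integral agrees with the simulation but not with the formula, the formula (or its transcription) is the prime suspect.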
What you really need is structured testing. Because you know what’s going to happen? You will write your document, verify some formula, and put it down for a couple of days or weeks. When you get back to it, you no longer know which formulas have been verified and which have not. Ideally, you would also like to know the strength of each verification.
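One way this could look, as a sketch only: keep a small registry with one entry per formula, and record the observed error and the method next to the pass/fail flag. The labels, the `checks` list and `run_checks` are all hypothetical names; the labels are meant to mirror the equation labels in your LyX or LaTeX source.

```r
# Hypothetical registry: one entry per formula in the document.
# Each check returns the absolute error between formula and approximation.
checks <- list(
  list(label = "eq:exp-second-moment",
       check = function() {
         rate <- 2
         exact <- 2 / rate^2
         approx <- integrate(function(x) x^2 * dexp(x, rate = rate),
                             lower = 0, upper = Inf, rel.tol = 1e-8)$value
         abs(exact - approx)
       },
       tol = 1e-6, method = "numerical integration"),
  list(label = "eq:unif-variance",
       check = function() {
         set.seed(1)
         abs(1 / 12 - var(runif(1e5)))
       },
       tol = 5e-3, method = "simulation, n = 1e5")
)

# Run every check and record the observed error next to the tolerance,
# so the table says how strongly each formula was verified, not just pass/fail.
run_checks <- function(checks) {
  rows <- lapply(checks, function(ch) {
    err <- ch$check()
    data.frame(label = ch$label, method = ch$method, error = err,
               tol = ch$tol, verified = err < ch$tol,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

report <- run_checks(checks)
print(report)
```

Rerunning `run_checks` when you pick the document up again tells you which formulas are covered, and the `method` and `tol` columns give a crude measure of how strong each verification is. Any formula without an entry is simply unverified.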
The thing is, math is bug-ridden, and you need some method to be systematic in documenting and testing it.