Fri, 10 Mar 2023 00:00:00 +0000 /2023/03/10/markov-chain/
Introduction
Suppose you conduct an experiment of the following form. You start out with an \(X_0\), then do a binary experiment \(Y_1\) with conditional probability \(P(Y_1=1\mid X_0)=f_\theta(X_0)\). Then you choose your next value of \(X\) deterministically, based solely on the values of \(X_0\) and \(Y_1\), using a known function \(g(x,y)\), i.e., \(X_1 = g(X_0,Y_1)\). Doing this \(n\) times, you get a sequence of values \(X_0,X_1,X_2,\ldots,X_n\) and \(Y_1,Y_2,\ldots,Y_n\).
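A minimal simulation sketch of this scheme may help fix ideas. The logistic form of \(f_\theta\) and the staircase-style update \(g\) below are illustrative assumptions only; the setup above leaves both unspecified.

```python
import math
import random


def f(theta, x):
    """Assumed success probability: logistic in theta * x (illustrative only)."""
    return 1.0 / (1.0 + math.exp(-theta * x))


def g(x, y):
    """Assumed deterministic update: step down after a success, up after a failure."""
    return x - 1 if y == 1 else x + 1


def simulate(theta, x0, n, seed=0):
    """Return (xs, ys) with xs = [X_0, ..., X_n] and ys = [Y_1, ..., Y_n]."""
    rng = random.Random(seed)
    xs, ys = [x0], []
    for _ in range(n):
        # Draw the binary outcome given the current X.
        y = 1 if rng.random() < f(theta, xs[-1]) else 0
        ys.append(y)
        # The next X is a deterministic function of the current X and the new Y.
        xs.append(g(xs[-1], y))
    return xs, ys


xs, ys = simulate(theta=0.5, x0=0, n=10)
print(xs)
print(ys)
```

Only the \(Y\) draws are random here; once \(X_0\) and the outcomes are known, the whole \(X\) sequence can be reconstructed from \(g\).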
Our goal is to do inference on \(\theta\).

Some thoughts on spaced repetition systems and math.
Thu, 09 Mar 2023 00:00:00 +0000 /2023/03/09/srs-math/
Update: 2023-03-10. SRS should be especially suitable for math. Definitions, examples, and theorems are not only hard to understand, they are also hard to remember. Would you remember what a normal subgroup is \(\sim10\) years after taking my algebra class? Probably not! Unless you had to use the concept several times in the last few years.
Everyone says the same about spaced repetition systems (SRS). They work best for making sure you keep knowing what you already know.

On power analysis
Sat, 25 Feb 2023 00:00:00 +0000 /2023/02/25/power-analysis/
Here’s a demonstration that you should do integrated, or Bayesian, power analysis when making power calculations. Consider the case where, to the best of your knowledge, the true effect size follows a gamma distribution with shape \(2\) and rate \(2\), so that its mean is \(1\). To keep things simple, you know the standard deviation is 1.
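Here is a rough sketch of the comparison, under assumptions that are mine and not part of the setup: a one-sided z-test with known standard deviation 1, significance level 0.05, a hypothetical sample size of 10, and the shape/rate reading of the gamma parameters (which gives a mean effect size of 1). It computes the power at the mean effect size and the integrated power, i.e., the power averaged over the effect size distribution.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal; the sd is assumed known and equal to 1


def power(effect, n, alpha=0.05):
    """Power of a one-sided z-test with known sd 1 (an assumed test)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(effect * math.sqrt(n) - z_crit)


def gamma_pdf(d, shape=2.0, rate=2.0):
    """Gamma density in the shape/rate parameterization (mean = shape / rate)."""
    return rate**shape * d ** (shape - 1) * math.exp(-rate * d) / math.gamma(shape)


def integrated_power(n, alpha=0.05, upper=20.0, steps=20000):
    """Midpoint-rule approximation of E[power(D)] for D ~ Gamma(shape 2, rate 2)."""
    h = upper / steps
    return sum(
        power((i + 0.5) * h, n, alpha) * gamma_pdf((i + 0.5) * h) * h
        for i in range(steps)
    )


n = 10  # hypothetical sample size
power_at_mean = power(1.0, n)
power_integrated = integrated_power(n)
print(f"power at mean effect: {power_at_mean:.3f}")
print(f"integrated power:     {power_integrated:.3f}")
```

With these choices the integrated power comes out lower than the power at the mean effect size, because a fair share of the gamma mass sits on small effects where the test has little power.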
In this case, taking the mean of the effect size distribution (1) and doing a power analysis on that would be a mistake.

Social scientists need mostly elementary mathematics
Wed, 31 Aug 2022 00:00:00 +0000 /2022/08/31/social-science-math/
They need elementary math for many reasons.
- Figuring out precisely what you want to say. (Think of, e.g., the difference between the mean and the median, or what “it’s 90% nutrition” means.)
- Doing simple modeling. (I’m not an advocate of complicated economics-style modelling such as general equilibrium, most uses of differential equations, and so on.)
- Gaining awareness of important phenomena and understanding how to deal with them. An example would be range restriction.

Weight loss notes. Planning and keeping track.
Wed, 31 Aug 2022 00:00:00 +0000 /2022/08/31/weight-loss/
I wanted to get an idea of how long it will take to reach my body fat percentage goal of 8% (I’m a male; females shouldn’t go this low). So I wrote a small script.
We’ll need the following constant, an approximation of how many kilocalories there are in a kilogram of body fat (i.e., adipose tissue).
    kcal_per_kg <- 7700

Then you provide approximations for your average calorie expenditure and intake.

Latent Poisson process and Laplace's rule
Wed, 15 Jun 2022 00:00:00 +0000 /2022/06/15/poisson-laplace/
(Note: This is related to the post on Laplace’s rule.)
For each interval length $s$ we have a sequence of binary variables $Y_{1}(s),Y_{2}(s),\ldots,Y_{n/s}(s)$. We want to connect these variables, for each $s$, in a meaningful way and apply Laplace’s rule. One way to do so is by assuming a latent Poisson process.
Let $X_{t}$ be a Poisson process with parameter $\lambda$, $t\in[0,n]$. Then the number of observations in the time frame $[t,t+s]$ is distributed as $$ X_{t+s}-X_{t}\sim\text{Poisson}(\lambda s). $$

Time in Laplace's rule in the context of a "for all" problem
Mon, 06 Jun 2022 00:00:00 +0000 /2022/06/06/laplace/
We are working with a sequence of independent binary events on a discrete time-scale, say seconds, with success probability $\pi$. Call this sequence of events $X_{t}$, $t=1,2,\ldots$. For instance, $X_{t}=1$ if it rains at second $t$ and $0$ otherwise. For each positive integer $m$, define the “for all”-type variable
$$ Y_{t}^{m}=\begin{cases} 1 & \text{if }X_{(t-1)m+1}=\cdots=X_{tm}=1,\\ 0 & \text{otherwise}. \end{cases} $$
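A short worked step, using only the independence assumption above: $Y_{t}^{m}$ equals $1$ exactly when all $m$ events in the $t$th block succeed, so

$$ P(Y_{t}^{m}=1)=\prod_{i=1}^{m}\pi=\pi^{m}. $$

The success probability of the “for all” variable therefore shrinks geometrically in the block length $m$, unless $\pi$ is very close to $1$.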
For instance, the variable $Y_{t}^{60}=1$ if it rained during the entirety of the $t$th minute.

Calculating the variance
Tue, 22 Sep 2020 00:00:00 +0000 /2020/09/22/calculating-the-variance/
Here’s how to prove that $$E[(y-\mu)^{T}A(y-\mu)]=E(y^{T}Ay)-\mu^{T}A\mu,$$
when $A$ is a fixed (non-random) matrix and $y$ is a random vector with mean vector $\mu$.
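Before the proof, a quick exact sanity check of the identity on a toy example. The three-point distribution for $y$ and the matrix $A$ below are hypothetical choices made only for illustration; exact fractions are used so that the two sides can be compared without rounding error.

```python
from fractions import Fraction as F

# Hypothetical example: y takes three values in R^2 with the given probabilities.
values = [(F(0), F(1)), (F(2), F(-1)), (F(1), F(3))]
probs = [F(1, 2), F(1, 4), F(1, 4)]
A = [[F(2), F(1)], [F(0), F(3)]]  # arbitrary fixed matrix, not necessarily symmetric


def quad(u, A, v):
    """Return u^T A v for 2-vectors u, v and a 2x2 matrix A."""
    return sum(u[i] * A[i][j] * v[j] for i in range(2) for j in range(2))


def centered(y, mu):
    """Return y - mu as a tuple."""
    return tuple(y[k] - mu[k] for k in range(2))


# Mean vector mu = E(y).
mu = tuple(sum(p * y[k] for p, y in zip(probs, values)) for k in range(2))

# Left-hand side: E[(y - mu)^T A (y - mu)].
lhs = sum(p * quad(centered(y, mu), A, centered(y, mu)) for p, y in zip(probs, values))

# Right-hand side: E(y^T A y) - mu^T A mu.
rhs = sum(p * quad(y, A, y) for p, y in zip(probs, values)) - quad(mu, A, mu)

print(lhs, rhs)
```

The two printed values agree exactly, as the identity requires; the check does not rely on $A$ being symmetric.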
$$\begin{eqnarray*} E[(y-\mu)^{T}A(y-\mu)] & = & E(y^{T}Ay-\mu^{T}Ay-y^{T}A\mu+\mu^{T}A\mu)\\ & = & E(y^{T}Ay)-E(\mu^{T}Ay)-E(y^{T}A\mu)+\mu^{T}A\mu \end{eqnarray*}$$
Here $$E(\mu^{T}Ay)=\mu^{T}AE(y)=\mu^{T}A\mu,$$ by linearity of the expectation operator. Likewise, $$E(y^{T}A\mu)=E(y)^{T}A\mu=\mu^{T}A\mu,$$
and $$E(y^{T}Ay)-E(\mu^{T}Ay)-E(y^{T}A\mu)+\mu^{T}A\mu=E(y^{T}Ay)-\mu^{T}A\mu$$ as claimed.

Decision theory
Wed, 04 Mar 2020 00:00:00 +0000 /2020/03/04/decision-theory/
Decision theory confuses me. On one hand, evidential decision theory is obviously wrong. On the other hand, causal decision theory is obviously applied incorrectly.
Evidential decision theory is wrong since it conditions on the act made. One reason you shouldn’t do this is that it presupposes that your act is a random variable, which is strange in itself. But it is also conceptually wrong: decision theory is about making choices, in most cases choices between random variables.

Some advice
Thu, 16 Jan 2020 00:00:00 +0000 /2020/01/16/some-advice/
It is not my place to give advice about being an academic, but here I go anyway.
Use a reference manager. Paperpile serves me well. Have a label or a folder for each paper you’re working on. You want to use a reference manager since your PDFs will turn into a garbled mess in no time. And it’s a pain to write the references into BibTeX again and again and again!

Fisher and Neyman
Sun, 05 Jan 2020 00:00:00 +0000 /2020/01/05/fisher-and-neyman/
So what’s the deal with the conflict between Neyman and Fisher?
It appears to be this:
- Neyman wanted to be explicit about alternative hypotheses.
- Neyman used the explicit alternative hypothesis to generate tests and prove their optimality.
- Neyman wanted to calculate power and saw it as an important part of the whole hypothesis testing setup.
- Fisher used p-values while Neyman used fixed \(\alpha\)s.
- Fisher talked about inductive evidence while Neyman talked about inductive behavior.

Problems with confidence intervals
Sat, 04 Jan 2020 00:00:00 +0000 /2020/01/04/problems-with-confidence-intervals/
In a sense, it is not reasonable to expect finite and well-behaved confidence intervals for parameters. The argument goes as follows:
When testing parameters in classical settings such as the t-test, the assumption of normality is crucial, or at least semi-crucial. It is possible to construct acceptable confidence intervals using the Berry-Esseen theorem, but it is at least sometimes possible to show that no well-behaved approximate confidence interval exists when we cannot reliably bound the third moment.

Business science
Thu, 02 Jan 2020 00:00:00 +0000 /2020/01/02/business-science/
How would you evaluate the quality of a typical business school scientific field? I am talking about, for instance, management, marketing, organizational psychology, accounting (is that a field though?), finance, strategy, et cetera.
The first rule of thumb could be to look at how quantitative the field is, using the idea that quantitative science is better than qualitative science. But there are some good reasons not to trust this rule of thumb.

Good habits
Fri, 20 Dec 2019 00:00:00 +0000 /2019/12/20/good-habits/
You should develop good habits as a student, it is often said. Still, I was never guided in this. And I honestly don’t think my teachers had any good habits to teach anyway.
As a statistician, what habits should you definitely learn right away? I can think of some:
i) Good programming habits. Make projects or R-packages. Lint your code. Always write tests; plenty of tests. Document your code. Never skip documenting and testing your code!

Papers are too long
Thu, 12 Dec 2019 00:00:00 +0000 /2019/12/12/papers-are-too-long/
Papers are too long.
The introductions are too long, the digressions are too long. There are too many results, there are too many words.
Here’s an ideal paper for me: Explain the problem in one paragraph. Say something about how other people have handled the problem in the next paragraph. Say something about related problems, if applicable. Then solve the problem.
How long will such a paper be? About two pages!

Computer algebra systems
Thu, 05 Dec 2019 00:00:00 +0000 /2019/12/05/computer-algebra-systems/
There are at least two reasons to learn Maple or Mathematica, or maybe an open source alternative such as Sage. The two reasons I’m thinking of are i) to actually be able to calculate difficult stuff, such as integrals with difficult integrands or horrible determinants, and ii) to have a reliable way to verify your computations.
Obviously both of these are important. So why aren’t these programs more widely used? There might be a reason I am not aware of, but I actually think there is some low-hanging fruit here.

Automatic Testing
Wed, 04 Dec 2019 00:00:00 +0000 /2019/12/04/automatic-testing/
I believe it is important to improve the reliability of R programming. Packages such as testthat help with this, but these packages aren’t comprehensive enough. What I would like is more automatic testing. In some packages, univariateML being a great example, there are a lot of functions following the same basic prototype: every function has the same formals and they return similar objects. What I would like is to specify the prototype and automatically check that each of the approximately 25 functions in univariateML adheres to it.

Organizing Projects
Tue, 03 Dec 2019 00:00:00 +0000 /2019/12/03/organizing-projects/
My current main problem is that I don’t manage to handle all my silly ideas and projects. I often work, say, 5 to 10 days on a project, manage to write down some results (some half-assed), then get myself back on track on my “real” project. The problem is that the projects aren’t finalized properly. For instance, I wrote quite a bit about transferring parameters in meta-analysis earlier this year.

Rant about learning Japanese
Mon, 02 Dec 2019 00:00:00 +0000 /2019/12/02/japanese-rant/
The most difficult part of learning Japanese is knowing what to do. I have the motivation to do stuff, I just don’t know what that stuff is. That’s partly a lie though; I do plenty of stuff. It’s just that the stuff is inefficient, as all of it is memorization of kanji (Chinese characters) and vocabulary. I also distrust most resources without any SRS: How can I be reasonably sure I won’t forget what I’ve learned within a week without a program for reinforcing it?

Counterfactuals in the Probit Model
Wed, 27 Nov 2019 00:00:00 +0000 /2019/11/27/counterfactuals-in-the-probit-model/
This post is about counterfactuals in the probit model. I wrote this while reading Pearl et al.’s 2016 book Causal Inference in Statistics: A Primer.
The probit model with one normally distributed covariate can be written like this:
\[ \begin{eqnarray*} \epsilon & \sim & N\left(0,1\right)\\ X & \sim & N\left(0,1\right)\\ Y\mid X,\epsilon & = & 1_{X+\epsilon\geq0} \end{eqnarray*} \]
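A quick consequence of the model as written, using only that \(\epsilon\) is standard normal and assuming it is independent of \(X\) (the display above does not state the independence explicitly):

\[ P(Y=1\mid X=x)=P(\epsilon\geq-x)=1-\Phi(-x)=\Phi(x), \]

where \(\Phi\) is the standard normal distribution function. This is the usual probit link, with slope \(1\) and intercept \(0\).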
Now we wish to find the density of the counterfactual \(Y_{x}\), see Pearl et al.

School mathematics
Tue, 29 Oct 2019 00:00:00 +0000 /2019/10/29/skolematematikk/
Start with $\frac{3}{2y}-\frac{3z}{4}-\frac{1}{2}$. Now we use the fact that we can multiply the numerator and the denominator of a fraction by the same number (different from $0$):
$$\frac{3}{2y}=\frac{3\cdot2}{2y\cdot2}=\frac{6}{4y}$$
$$-\frac{3z}{4}=-\frac{3z\cdot y}{4\cdot y}=-\frac{3zy}{4y}$$
$$-\frac{1}{2}=-\frac{1\cdot2y}{2\cdot2y}=-\frac{2y}{4y}$$
Thus
$$\begin{eqnarray*} \frac{3}{2y}-\frac{3z}{4}-\frac{1}{2} & = & \frac{6}{4y}-\frac{3zy}{4y}-\frac{2y}{4y}\\ & = & \frac{6-3zy-2y}{4y}\\ & = & \frac{-2y-3zy+6}{4y} \end{eqnarray*}$$

About
Fri, 04 Oct 2019 00:00:00 +0000 /about/
This is my microblog. Each post has a specific purpose and will not always make sense in isolation. I am Jonas Moss, by the way.
Fri, 04 Oct 2019 00:00:00 +0000 /2019/10/04/first-post/
I made this small site to be able to share small documents with mathematical notation. When I want to share snippets of code I use Gist, but it’s hard to use LaTeX there. Think of this microblog as a Gist for LaTeX.
Example of LaTeX
Let $X$ be a metric space with metric $d$. A sequence of elements $x_i\in X$ converges to an $x\in X$ if for every $\epsilon > 0$ there is an $N$ such that $d\left(x_{n},x\right)<\epsilon$ whenever $n\geq N$.