library(dplyr)
library(ggplot2)

# ================== Setup ==================
n <- 500          # sample size
set.seed(1)       # random seed
mu_0 <- 2.4       # mean under H0
mu_true <- 2      # true mean
alpha <- 0.05     # significance level
# ===========================================

# Simulated sample Xn
Xn <- rnorm(n, mean = mu_true, sd = 4)

# Grid of points for the pdf
x <- seq(-4, 4, 0.01)
x_breaks <- seq(min(x), max(x), 1)
A statistical hypothesis test is a method of statistical inference used to decide whether the data sufficiently support a particular hypothesis. A hypothesis test typically involves the calculation of a test statistic which, under the null hypothesis \(H_0\), follows a known distribution. A decision is then made, either by comparing the test statistic to a critical value or, equivalently, by evaluating a p-value computed from the test statistic. The purpose of a test is therefore to reject or not reject the null hypothesis at a fixed significance level \(\alpha\). Note that a hypothesis is never accepted: it is either rejected or not rejected. In general, two kinds of tests are available:
A two-tailed test is appropriate if the estimated value may be greater or less than a certain reference value, for example, whether a test taker may score above or below a specific range of scores.
A one-tailed test is appropriate if the estimated value may depart from the reference value in only one direction, left or right, but not both.
Let’s consider a \(t\)-test for the mean of a sample \(X_n = (x_1, \dots, x_i, \dots, x_n)\), i.e. \[
T(X_n) = \frac{\mu(X_n) - \mu_{0}}{\frac{\sigma(X_n)}{\sqrt{n}}} \overset{H_0}{\sim} t_{n-1} \underset{n\to \infty}{\longrightarrow} \mathcal{N}(0,1)
\] where \(\mu(X_n)\) is the sample mean, \(\sigma(X_n)\) is an unbiased estimator of the population standard deviation and \(\mu_0\) is the mean under the null hypothesis \(H_0\). Under \(H_0\) the test statistic follows a Student-\(t\) distribution with \(n-1\) degrees of freedom. Notably, for large samples the statistic converges to a standard normal random variable.
1 Two-tailed test
For example, let’s simulate a sample \(X_n\) of \(n = 500\) observations from a normal distribution (i.e. \(X_n \sim \mathcal{N}(2, 4^2)\)). Then, let’s consider the hypotheses: \[
H_0: \mu(X) = 2.4 \quad\quad H_1: \mu(X) \neq 2.4
\] and the test statistic is then defined as: \[
T(X_n) = \sqrt{500}\frac{\mu(X_n) - 2.4}{\sigma(X_n)} \overset{H_0}{\sim} t(499) \text{.}
\]
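As a quick sanity check, this statistic can be compared with the one reported by base R's t.test(); a minimal sketch, reusing Xn, mu_0 and n from the setup chunk (t_manual and t_builtin are illustrative names):

# Manual test statistic vs. the one computed by t.test()
t_manual <- (mean(Xn) - mu_0) / (sd(Xn) / sqrt(n))
t_builtin <- unname(t.test(Xn, mu = mu_0)$statistic)
all.equal(t_manual, t_builtin)  # should be TRUE: the two computations coincide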
Since it is a two-tailed test, the critical value at significance level \(\alpha\), denoted as \(t_{\alpha/2}\), is such that: \[
\begin{align}
\alpha & = \mathbb{P}([T(X_n) < -t_{\alpha/2}] \cup [T(X_n) > t_{\alpha/2}]) = 2\,\mathbb{P}(T(X_n) > t_{\alpha/2}) \\
\Updownarrow & \\
t_{\alpha/2} & = \mathbb{P}^{-1}(1 - \alpha/2) \text{,}
\end{align}
\] where \(\mathbb{P}^{-1}\) and \(\mathbb{P}\) are respectively the quantile and distribution functions of a Student-\(t\). Hence, with \(\alpha = 0.05\), if \(-1.9647 < T(X_n) < 1.9647\) we do not reject the null hypothesis, i.e. the data are compatible with \(\mu(X_n)\) being equal to \(\mu_{0} = 2.4\); otherwise we reject it, i.e. the two means are different.
Two-tailed test
# Test statistic
Tz <- sqrt(n)*(mean(Xn) - mu_0)/sd(Xn)
pdf <- dt(x, df = n-1)
# Critical value (left)
z_left <- c(qt(alpha/2, df = n-1), dt(qt(alpha/2, df = n-1), df = n-1))
# Critical value (right)
z_right <- c(qt(1-alpha/2, df = n-1), dt(qt(1-alpha/2, df = n-1), df = n-1))

ggplot()+
  geom_segment(aes(x = z_left[1], xend = z_left[1], y = 0, yend = z_left[2]), color = "red")+
  geom_segment(aes(x = z_right[1], xend = z_right[1], y = 0, yend = z_right[2]), color = "red")+
  geom_ribbon(aes(x = x[x < z_left[1]], ymin = 0, ymax = dt(x[x < z_left[1]], df = n-1), fill = "rej"), alpha = 0.3)+
  geom_ribbon(aes(x = x[x > z_right[1]], ymin = 0, ymax = dt(x[x > z_right[1]], df = n-1), fill = "rej"), alpha = 0.3)+
  geom_ribbon(aes(x = x[x > z_left[1] & x < z_right[1]], ymin = 0, ymax = dt(x[x > z_left[1] & x < z_right[1]], df = n-1), fill = "norej"), alpha = 0.3)+
  geom_line(aes(x, pdf))+
  # Observed test statistic (Tz, the variable defined above)
  geom_point(aes(Tz, 0), color = "black")+
  scale_fill_manual(values = c(rej = "red", norej = "green"),
                    labels = c(rej = "Rejection Area", norej = "Non Rejection Area")) +
  scale_x_continuous(breaks = x_breaks) +
  labs(y = "", x = "x", fill = NULL)+
  theme_bw()+
  theme(legend.position = "top", panel.grid = element_blank())
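Equivalently, the decision can be based on the p-value instead of the critical values; a minimal sketch, reusing Tz from the chunk above and cross-checking against t.test() (p_two is an illustrative name):

# Two-tailed p-value of the observed statistic Tz
p_two <- 2 * pt(-abs(Tz), df = n - 1)
p_two < alpha  # TRUE -> reject H0 at level alpha, FALSE -> do not reject

# Cross-check with the built-in test
t.test(Xn, mu = mu_0, alternative = "two.sided")$p.value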
2 Left-tailed test
For example, let’s consider another hypothesis: \[
H_0: \mu(X) \ge 2.4 \quad\quad H_1: \mu(X) < 2.4
\] The test statistic \(T(X_n)\) does not change; however, the test is now left-tailed. Hence, the critical value \(t_{\alpha}\) is such that \(\mathbb{P}(T(X_n) < t_{\alpha}) = 0.05\). Applying the quantile function \(\mathbb{P}^{-1}\) of a Student-\(t\) we obtain: \[
\begin{align}
\alpha & = \mathbb{P}(T(X_n) < t_{\alpha}) \\
\Updownarrow & \\
t_{\alpha} & = \mathbb{P}^{-1}(\alpha) \text{,}
\end{align}
\] where \(\mathbb{P}^{-1}\) and \(\mathbb{P}\) are respectively the quantile and distribution functions of a Student-\(t\). Hence, for \(\alpha = 0.05\), if \(T(X_n) < -1.6479\) we reject the null hypothesis, i.e. \(\mu(X_n)\) is lower than \(\mu_{0}\); otherwise we do not reject it, i.e. the data are compatible with \(\mu(X_n)\) being greater than or equal to \(\mu_{0}\).
Left-tailed test
# Critical value (left)
z_left <- c(qt(alpha, df = n-1), dt(qt(alpha, df = n-1), df = n-1))

ggplot()+
  geom_segment(aes(x = z_left[1], xend = z_left[1], y = 0, yend = z_left[2]), color = "red")+
  geom_ribbon(aes(x = x[x < z_left[1]], ymin = 0, ymax = dt(x[x < z_left[1]], df = n-1), fill = "rej"), alpha = 0.3)+
  geom_ribbon(aes(x = x[x > z_left[1]], ymin = 0, ymax = dt(x[x > z_left[1]], df = n-1), fill = "norej"), alpha = 0.3)+
  geom_line(aes(x, pdf))+
  # Observed test statistic (Tz, defined in the two-tailed chunk)
  geom_point(aes(Tz, 0), color = "black")+
  scale_fill_manual(values = c(rej = "red", norej = "green"),
                    labels = c(rej = "Rejection Area", norej = "Non Rejection Area")) +
  scale_x_continuous(breaks = x_breaks) +
  labs(y = "", x = "x", fill = NULL)+
  theme_bw()+
  theme(legend.position = "top", panel.grid = element_blank())
In this case we reject the null hypothesis, hence \(\mu(X_n)\) is lower than \(\mu_{0}\).
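This conclusion can also be checked through the left-tailed p-value; a short sketch, again reusing Tz from the two-tailed chunk (p_left is an illustrative name):

# Left-tailed p-value: probability of a statistic as small as Tz under H0
p_left <- pt(Tz, df = n - 1)
p_left < alpha  # TRUE -> reject H0: mu >= mu_0

# Cross-check with the built-in test
t.test(Xn, mu = mu_0, alternative = "less")$p.value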
3 Right-tailed test
Let’s consider the other case, i.e. \[
H_0: \mu(X) \le 2.4 \quad\quad H_1: \mu(X) > 2.4
\] It is again a one-sided test, but in this case it is right-tailed. Hence, the critical value \(t_{\alpha}\) is such that \[
\begin{align}
1-\alpha & = \mathbb{P}(T(X_n) < t_{\alpha}) \\
\Updownarrow & \\
t_{\alpha} & = \mathbb{P}^{-1}(1-\alpha) \text{,}
\end{align}
\] where \(\mathbb{P}^{-1}\) and \(\mathbb{P}\) are respectively the quantile and distribution functions of a Student-\(t\). Hence, for \(\alpha = 0.05\), if \(T(X_n) > 1.6479\) we reject the null hypothesis, i.e. \(\mu(X_n)\) is greater than \(\mu_{0}\); otherwise we do not reject it, i.e. the data are compatible with \(\mu(X_n)\) being lower than or equal to \(\mu_{0}\).
Right-tailed test
# Critical value (right)
z_right <- c(qt(1-alpha, df = n-1), dt(qt(1-alpha, df = n-1), df = n-1))

ggplot()+
  geom_segment(aes(x = z_right[1], xend = z_right[1], y = 0, yend = z_right[2]), color = "red")+
  geom_ribbon(aes(x = x[x > z_right[1]], ymin = 0, ymax = dt(x[x > z_right[1]], df = n-1), fill = "rej"), alpha = 0.3)+
  geom_ribbon(aes(x = x[x < z_right[1]], ymin = 0, ymax = dt(x[x < z_right[1]], df = n-1), fill = "norej"), alpha = 0.3)+
  geom_line(aes(x, pdf))+
  # Observed test statistic (Tz, defined in the two-tailed chunk)
  geom_point(aes(Tz, 0), color = "black")+
  scale_fill_manual(values = c(rej = "red", norej = "green"),
                    labels = c(rej = "Rejection Area", norej = "Non Rejection Area")) +
  scale_x_continuous(breaks = x_breaks) +
  labs(y = "", x = "x", fill = NULL)+
  theme_bw()+
  theme(legend.position = "top", panel.grid = element_blank())
Consistently with the previous test, in this case we do not reject \(H_0\), hence the data are compatible with \(\mu(X_n)\) being lower than or equal to \(\mu_{0} = 2.4\).
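Analogously, a short sketch of the right-tailed p-value, reusing Tz from the two-tailed chunk (p_right is an illustrative name):

# Right-tailed p-value: probability of a statistic as large as Tz under H0
p_right <- pt(Tz, df = n - 1, lower.tail = FALSE)
p_right < alpha  # TRUE -> reject H0: mu <= mu_0

# Cross-check with the built-in test
t.test(Xn, mu = mu_0, alternative = "greater")$p.value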