Gaussian mixture

Author: Beniamino Sartini
Affiliation: University of Bologna
Published: May 1, 2024
Modified: July 25, 2024

Setup
library(dplyr)
# required for figures  
library(ggplot2)
library(gridExtra)
# required to render LaTeX
library(backports)
library(latex2exp)
# required to render tables
library(knitr)
library(kableExtra)
# Random seed 
set.seed(1)

Let’s consider a linear combination of a Bernoulli and two normal random variables, all assumed to be independent, i.e.

$$X \triangleq B Z_1 + (1-B) Z_2, \tag{1}$$

or in compact form $X \sim GM(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, p)$. Formally, $B \sim \mathrm{Ber}(p)$ is a Bernoulli random variable, while $Z_1 \sim N(\mu_1, \sigma_1^2)$ and $Z_2 \sim N(\mu_2, \sigma_2^2)$ are two independent Gaussian random variables.

Gaussian mixture simulation
# ================== Setups ==================
t_bar <- 5000 # number of steps ahead  
# parameters
par <- c(mu1=-2,mu2=2,sd1=1,sd2=1,p=0.5) 
# ============================================
# Gaussian mixture simulation 
N_1 <- rnorm(t_bar, mean = par[1], sd = par[3])
N_2 <- rnorm(t_bar, mean = par[2], sd = par[4])
B <- rbinom(t_bar, 1, prob = par[5])
Xt <- B*N_1 + (1 - B)*N_2
# Empirical pdf and cdf
ker <- density(Xt, from = min(Xt), to = max(Xt))
ker$cdf_emp <- cumsum(ker$y/sum(ker$y))
# Component normal pdfs
ker$pdf_Z1 <- dnorm(ker$x, mean = par[1], sd = par[3])
ker$pdf_Z2 <- dnorm(ker$x, mean = par[2], sd = par[4])
# Mixture pdf and cdf
ker$pdf <- par[5]*ker$pdf_Z1 + (1-par[5])*ker$pdf_Z2
ker$cdf <- cumsum(ker$pdf/sum(ker$pdf))
# =================== Plot ===================
# Plot trajectory
plot_gm <- ggplot()+
  geom_point(aes(1:t_bar, Xt), alpha = exp(-0.00009*t_bar))+
  labs(x = "t", y = TeX("$X_t$"))+
  theme_bw()
# Plot pdf 
plot_pdf <- ggplot()+
  geom_line(aes(ker$x, ker$y))+
  geom_line(aes(ker$x, ker$pdf), color = "red")+
  labs(x = NULL, y = "Pdf")+
  theme_bw()+
  coord_flip()
# Plot cdf 
plot_cdf <- ggplot()+
  geom_line(aes(ker$x, ker$cdf_emp))+
  geom_line(aes(ker$x, ker$cdf), color = "red")+
  labs(x = NULL, y = "Cdf")+
  theme_bw()+
  coord_flip()
plot_gm
gridExtra::grid.arrange(plot_pdf, plot_cdf, ncol = 2)
(a) Simulation.
(b) Pdf and Cdf.
Figure 1: Gaussian mixture simulation and density function with parameters $\mu_1 = -2$, $\mu_2 = 2$, $\sigma_1 = 1$, $\sigma_2 = 1$ and $p = 0.5$.

1 Distribution and density

The distribution function of a Gaussian mixture reads explicitly as:

$$F_X(x) = p\,\Phi\left(\frac{x-\mu_1}{\sigma_1}\right) + (1-p)\,\Phi\left(\frac{x-\mu_2}{\sigma_2}\right), \tag{2}$$

where $\Phi$ is the cumulative distribution function of a standard normal random variable. Taking the derivative, it can be easily shown that the density function reads:

$$f_X(x) = \frac{p}{\sigma_1}\,\phi\left(\frac{x-\mu_1}{\sigma_1}\right) + \frac{1-p}{\sigma_2}\,\phi\left(\frac{x-\mu_2}{\sigma_2}\right), \tag{3}$$

where $\phi$ is the density function of a standard normal random variable.

dnorm_mix <- function(params) {
  # parameters
  mu1 <- params[1]
  mu2 <- params[2]
  sd1 <- params[3]
  sd2 <- params[4]
  p <- params[5]
  # return a function of x computing the mixture density (Eq. 3)
  function(x, log = FALSE){
    probs <- p*stats::dnorm(x, mean = mu1, sd = sd1) + (1-p)*stats::dnorm(x, mean = mu2, sd = sd2)
    if (log) {
      probs <- base::log(probs)
    }
    return(probs)
  }
}
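Equation (2) can be implemented in the same closure style. The following pnorm_mix helper is a minimal sketch of ours (the function name is illustrative, not part of the original code):

Gaussian mixture distribution function
pnorm_mix <- function(params) {
  # parameters
  mu1 <- params[1]
  mu2 <- params[2]
  sd1 <- params[3]
  sd2 <- params[4]
  p <- params[5]
  # return a function of x computing the mixture cdf (Eq. 2)
  function(x){
    p*stats::pnorm(x, mean = mu1, sd = sd1) + (1-p)*stats::pnorm(x, mean = mu2, sd = sd2)
  }
}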

Proof. The distribution function of a Gaussian mixture is defined as:

$$F_X(x) = P(X \le x) = E\{\mathbf{1}_{\{X \le x\}}\}.$$

Hence, we can rewrite it in terms of the conditional expectation with respect to $B$, i.e.

$$\begin{aligned}
F_X(x) &= E\{E\{\mathbf{1}_{\{X \le x\}} \mid B\}\} \\
&= E\{\mathbf{1}_{\{X \le x\}} \mid B = 1\}\,P(B=1) + E\{\mathbf{1}_{\{X \le x\}} \mid B = 0\}\,P(B=0) \\
&= p\,P(Z_1 \le x) + (1-p)\,P(Z_2 \le x).
\end{aligned}$$

Hence, standardizing the normal random variables we obtain

$$F_X(x) = p\,\Phi\left(\frac{x-\mu_1}{\sigma_1}\right) + (1-p)\,\Phi\left(\frac{x-\mu_2}{\sigma_2}\right),$$

where $\Phi$ denotes the distribution function of a standard normal. Knowing that $f_X(x) = \frac{dF_X(x)}{dx}$ and that $\phi(x) = \frac{d\Phi(x)}{dx}$, where $\phi$ is the density function of a standard normal, we obtain the result, i.e.

$$f_X(x) = \frac{p}{\sigma_1}\,\phi\left(\frac{x-\mu_1}{\sigma_1}\right) + \frac{1-p}{\sigma_2}\,\phi\left(\frac{x-\mu_2}{\sigma_2}\right).$$

2 Moments

Given that $Z_1$, $Z_2$ and $B$ are independent, the expectation can be computed as:

$$E\{X\} = p\mu_1 + (1-p)\mu_2.$$

The second moment is computed as:

$$E\{X^2\} = p(\mu_1^2 + \sigma_1^2) + (1-p)(\mu_2^2 + \sigma_2^2).$$

Hence, the variance reads:

$$V\{X\} = p(1-p)(\mu_1 - \mu_2)^2 + p\sigma_1^2 + (1-p)\sigma_2^2.$$

Proof. Given that $Z_1$, $Z_2$ and $B$ are independent, the expectation can be computed as:

$$\begin{aligned}
E\{X\} &= E\{E\{X \mid B\}\} \\
&= E\{X \mid B=1\}\,P(B=1) + E\{X \mid B=0\}\,P(B=0) \\
&= p\,E\{Z_1\} + (1-p)\,E\{Z_2\} \\
&= p\mu_1 + (1-p)\mu_2.
\end{aligned}$$

The second moment is computed similarly to the first one, i.e.

$$\begin{aligned}
E\{X^2\} &= E\{E\{X^2 \mid B\}\} \\
&= E\{X^2 \mid B=1\}\,P(B=1) + E\{X^2 \mid B=0\}\,P(B=0) \\
&= E\{B\}\,E\{Z_1^2\} + E\{1-B\}\,E\{Z_2^2\} \\
&= p\,E\{Z_1^2\} + (1-p)\,E\{Z_2^2\} \\
&= p(\mu_1^2 + \sigma_1^2) + (1-p)(\mu_2^2 + \sigma_2^2).
\end{aligned}$$

The variance, by definition, is given by:

$$V\{X\} = E\{X^2\} - E\{X\}^2,$$

where the first moment squared is

$$\begin{aligned}
E\{X\}^2 &= [p\mu_1 + (1-p)\mu_2]^2 \\
&= p^2\mu_1^2 + (1-p)^2\mu_2^2 + 2p(1-p)\mu_1\mu_2.
\end{aligned}$$

Hence the variance,

$$\begin{aligned}
V\{X\} &= p(\mu_1^2+\sigma_1^2) + (1-p)(\mu_2^2+\sigma_2^2) - p^2\mu_1^2 - (1-p)^2\mu_2^2 - 2p(1-p)\mu_1\mu_2 \\
&= p\mu_1^2 + p\sigma_1^2 + \mu_2^2 + \sigma_2^2 - p\mu_2^2 - p\sigma_2^2 - p^2\mu_1^2 - (1-p)^2\mu_2^2 - 2p(1-p)\mu_1\mu_2 \\
&= \mu_1^2\,p(1-p) + p\sigma_1^2 + (1-p)\sigma_2^2 + p(1-p)\mu_2^2 - 2p(1-p)\mu_1\mu_2 \\
&= p(1-p)(\mu_1^2 + \mu_2^2 - 2\mu_1\mu_2) + p\sigma_1^2 + (1-p)\sigma_2^2 \\
&= p(1-p)(\mu_1 - \mu_2)^2 + p\sigma_1^2 + (1-p)\sigma_2^2.
\end{aligned}$$
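As a quick sanity check (ours, not in the original post), the closed-form moments can be compared with the sample moments of the simulated trajectory Xt:

Moments check
# Theoretical moments from the formulas above
p_mix <- par[["p"]]
e_x <- p_mix*par[["mu1"]] + (1-p_mix)*par[["mu2"]]
v_x <- p_mix*(1-p_mix)*(par[["mu1"]] - par[["mu2"]])^2 +
  p_mix*par[["sd1"]]^2 + (1-p_mix)*par[["sd2"]]^2
# Compare with the sample estimates
c(theoretical = e_x, sample = mean(Xt)) # expectation
c(theoretical = v_x, sample = var(Xt))  # variance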

3 Maximum likelihood

Minimizing the negative log-likelihood gives an estimate of the parameters, i.e.

$$\underset{\mu_1,\mu_2,\sigma_1,\sigma_2,p}{\arg\min}\left\{-\sum_{i=1}^{t}\log\left(f_X(x_i)\right)\right\},$$

or equivalently maximizing the log-likelihood, i.e.

$$\underset{\mu_1,\mu_2,\sigma_1,\sigma_2,p}{\arg\max}\left\{\sum_{i=1}^{t}\log\left(f_X(x_i)\right)\right\}.$$

Maximum likelihood for Gaussian mixture
# Initialize parameters 
init_params <- par*runif(5, 0.3, 1.1)
# Log-likelihood function
log_lik <- function(params, x){
  # Parameters
  mu1 <- params[1]
  mu2 <- params[2]
  sd1 <- params[3]
  sd2 <- params[4]
  p <- params[5]
  # Ensure that the probability is in (0,1) and the sds are non-negative
  if(p > 0.99 | p < 0.01 | sd1 < 0 | sd2 < 0){
    return(NA_real_)
  }
  # Mixture density
  pdf_mix <- dnorm_mix(params)
  # Log-likelihood
  loss <- sum(pdf_mix(x, log = TRUE), na.rm = TRUE)
  return(loss)
}
# Optimal parameters
# fnscale = -1 to maximize (or use negative likelihood)
ml_estimate <- optim(par = init_params, log_lik, 
                     x = Xt, control = list(maxit = 500000, fnscale = -1))
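The figures in Table 1 are read off the optim output; a quick way to inspect the fit (sketch of ours):

# Parameter estimates, maximized log-likelihood and bias
round(ml_estimate$par, 4)
ml_estimate$value        # log-likelihood at the optimum
ml_estimate$par - par    # bias with respect to the true parameters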
Table 1: Maximum likelihood estimates for a Gaussian Mixture.
Parameter True Estimate Log-lik Bias
μ1 -2.0 -1.9905803 -10332.56 0.0094197
μ2 2.0 2.0006381 -10332.56 0.0006381
σ1 1.0 1.0457653 -10332.56 0.0457653
σ2 1.0 1.0033373 -10332.56 0.0033373
p 0.5 0.4937459 -10332.56 -0.0062541

4 Moments matching

Let’s fix the parameters of the first component, namely $\mu_1$ and $\sigma_1^2$, and a certain probability $p$. Then let’s compute the sample estimate of the expectation of $X_t$, i.e.

$$\hat{E}\{X\} = \frac{1}{t}\sum_{i=1}^{t} x_i = \hat{\mu},$$

and the sample variance:

$$\hat{V}\{X\} = \frac{1}{t}\sum_{i=1}^{t} (x_i - \hat{\mu})^2 = \hat{\sigma}^2.$$

In order to obtain an estimate of the second component such that the Gaussian mixture moments match exactly the sample estimates, we solve the following system for $\mu_2$ and $\sigma_2^2$:

$$\begin{cases}
\hat{\mu} = p\mu_1 + (1-p)\mu_2 \\
\hat{\sigma}^2 = p(1-p)(\mu_1-\mu_2)^2 + p\sigma_1^2 + (1-p)\sigma_2^2
\end{cases}$$

which leads to a unique solution, i.e.

$$\mu_2 = \frac{\hat{\mu} - p\mu_1}{1-p}, \qquad \sigma_2^2 = \frac{\hat{\sigma}^2 - p\sigma_1^2}{1-p} - p(\mu_1-\mu_2)^2.$$
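A minimal sketch of this matching in R (the function name moment_match is ours), applied to the simulated Xt with the first component’s true parameters:

Moments matching for Gaussian mixture
moment_match <- function(x, mu1, sd1, p){
  # sample moments
  mu_hat <- mean(x)
  var_hat <- var(x)
  # unique solution of the system above
  mu2 <- (mu_hat - p*mu1)/(1-p)
  var2 <- (var_hat - p*sd1^2)/(1-p) - p*(mu1 - mu2)^2
  # assumes the implied variance is positive
  c(mu2 = mu2, sd2 = sqrt(var2))
}
moment_match(Xt, mu1 = par[["mu1"]], sd1 = par[["sd1"]], p = par[["p"]])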

5 Moment generating function

The moment generating function of a Gaussian mixture random variable (Equation 1) reads:

$$M_X(t) = p\exp\left\{\mu_1 t + \frac{t^2\sigma_1^2}{2}\right\} + (1-p)\exp\left\{\mu_2 t + \frac{t^2\sigma_2^2}{2}\right\}.$$

Proof. Applying the definition of the moment generating function to a Gaussian mixture we obtain:

$$\begin{aligned}
M_X(t) &= E\{e^{tX}\} \\
&= p\,E\{e^{tZ_1}\} + (1-p)\,E\{e^{tZ_2}\} \\
&= p\,M_{Z_1}(t) + (1-p)\,M_{Z_2}(t) \\
&= p\exp\left\{\mu_1 t + \frac{t^2\sigma_1^2}{2}\right\} + (1-p)\exp\left\{\mu_2 t + \frac{t^2\sigma_2^2}{2}\right\},
\end{aligned}$$

where $M_{Z_1}(t)$ and $M_{Z_2}(t)$ are the moment generating functions of Gaussian random variables.
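As a quick numerical check (ours, not in the original post), the empirical mgf of the simulated sample can be compared with the closed form at a small value of t:

Moment generating function check
t0 <- 0.1
# empirical mgf: sample average of exp(t0 * X)
mgf_emp <- mean(exp(t0*Xt))
# theoretical mgf of the mixture
mgf_theo <- par[["p"]]*exp(par[["mu1"]]*t0 + t0^2*par[["sd1"]]^2/2) +
  (1 - par[["p"]])*exp(par[["mu2"]]*t0 + t0^2*par[["sd2"]]^2/2)
c(empirical = mgf_emp, theoretical = mgf_theo)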

6 Esscher transform

The Esscher transform of a Gaussian mixture random variable reads:

$$\mathcal{E}_\theta\{f_X(x)\} = \tilde{p}\,N(\mu_1 + \theta\sigma_1^2, \sigma_1^2) + (1-\tilde{p})\,N(\mu_2 + \theta\sigma_2^2, \sigma_2^2),$$

where $N(\mu, \sigma^2)$ denotes the corresponding normal density and the tilted probability $\tilde{p}$ is defined as:

$$\tilde{p} = \frac{p\,M_{Z_1}(\theta)}{p\,M_{Z_1}(\theta) + (1-p)\,M_{Z_2}(\theta)}.$$

Proof. In general, the Esscher transform of a density function is computed as:

$$\mathcal{E}_\theta\{f_X(x)\} = \frac{e^{\theta x} f_X(x)}{M_X(\theta)} = \frac{e^{\theta x} f_X(x)}{\int e^{\theta x} f_X(x)\,dx}.$$

Substituting the density function of a Gaussian mixture we obtain:

$$\mathcal{E}_\theta\{f_X(x)\} = \frac{e^{\theta x}\left(p f_{Z_1}(x) + (1-p) f_{Z_2}(x)\right)}{\int e^{\theta x}\left(p f_{Z_1}(x) + (1-p) f_{Z_2}(x)\right) dx}.$$

Let’s examine the first component of the numerator:

$$p\,e^{\theta x} f_{Z_1}(x) = \frac{p\,e^{\theta x}}{\sqrt{2\pi}\,\sigma_1}\exp\left\{-\frac{(x-\mu_1)^2}{2\sigma_1^2}\right\} = \frac{p}{\sqrt{2\pi}\,\sigma_1}\exp\left\{-\frac{(x-\mu_1)^2}{2\sigma_1^2} + \theta x\right\}.$$

In order to rewrite it as the pdf of a normal with mean $\mu_1 + \theta\sigma_1^2$, we note that the expansion of the shifted square reads:

$$(x - \mu_1 - \theta\sigma_1^2)^2 = x^2 - 2x\mu_1 - 2\theta x\sigma_1^2 + \mu_1^2 + 2\mu_1\theta\sigma_1^2 + \theta^2\sigma_1^4.$$

Hence, let’s add and subtract $\pm\mu_1\theta \pm \frac{\theta^2\sigma_1^2}{2}$ inside the exponential, i.e.

$$\begin{aligned}
p\,e^{\theta x} f_{Z_1}(x) &= \frac{p}{\sqrt{2\pi}\,\sigma_1}\exp\left\{-\frac{(x-\mu_1)^2}{2\sigma_1^2} + \theta x - \mu_1\theta - \frac{\theta^2\sigma_1^2}{2}\right\}\exp\left\{\mu_1\theta + \frac{\theta^2\sigma_1^2}{2}\right\} \\
&= \frac{p}{\sqrt{2\pi}\,\sigma_1}\exp\left\{\mu_1\theta + \frac{\theta^2\sigma_1^2}{2}\right\}\exp\left\{-\frac{(x-\mu_1-\theta\sigma_1^2)^2}{2\sigma_1^2}\right\}.
\end{aligned}$$

Hence, let’s collect the first factor together with the denominator (the mgf of the Gaussian mixture evaluated at $\theta$) and define the new probability:

$$\tilde{p} = \frac{p\exp\left\{\mu_1\theta + \frac{\theta^2\sigma_1^2}{2}\right\}}{p\exp\left\{\mu_1\theta + \frac{\theta^2\sigma_1^2}{2}\right\} + (1-p)\exp\left\{\mu_2\theta + \frac{\theta^2\sigma_2^2}{2}\right\}},$$

which can be equivalently written in terms of the moment generating functions, i.e.

$$\tilde{p} = \frac{p\,M_{Z_1}(\theta)}{p\,M_{Z_1}(\theta) + (1-p)\,M_{Z_2}(\theta)}.$$

Hence, repeating the same steps for the second component, we obtain that the Esscher transform of a Gaussian mixture is given by:

$$\mathcal{E}_\theta\{f_X(x)\} = \tilde{p}\,N(\mu_1 + \theta\sigma_1^2, \sigma_1^2) + (1-\tilde{p})\,N(\mu_2 + \theta\sigma_2^2, \sigma_2^2).$$
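A minimal sketch in R (the function name esscher_mix is ours) of the transformed density, reusing the parameter vector par:

Esscher transform of a Gaussian mixture
esscher_mix <- function(params, theta) {
  mu1 <- params[["mu1"]]; mu2 <- params[["mu2"]]
  sd1 <- params[["sd1"]]; sd2 <- params[["sd2"]]
  p <- params[["p"]]
  # Gaussian mgfs evaluated at theta
  m1 <- exp(mu1*theta + theta^2*sd1^2/2)
  m2 <- exp(mu2*theta + theta^2*sd2^2/2)
  # tilted probability
  p_tilde <- p*m1/(p*m1 + (1-p)*m2)
  # mixture of normals with shifted means and tilted weight
  function(x){
    p_tilde*dnorm(x, mean = mu1 + theta*sd1^2, sd = sd1) +
      (1 - p_tilde)*dnorm(x, mean = mu2 + theta*sd2^2, sd = sd2)
  }
}
# e.g. transformed density with theta = 0.5, evaluated on a grid
esscher_mix(par, theta = 0.5)(seq(-4, 4, by = 2))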


Citation

BibTeX citation:
@online{sartini2024,
  author = {Sartini, Beniamino},
  title = {Gaussian Mixture},
  date = {2024-05-01},
  url = {https://greenfin.it/statistics/distributions/gaussian-mixture.html},
  langid = {en}
}
For attribution, please cite this work as:
Sartini, Beniamino. 2024. “Gaussian Mixture.” May 1, 2024. https://greenfin.it/statistics/distributions/gaussian-mixture.html.