Rachel Chen

Markov Chain Monte Carlo

Bayesian Statistics

MCMC vs Monte Carlo Simulation vs Monte Carlo Sampling

These three terms sound similar but represent nested concepts.

Monte Carlo Simulation

The broadest umbrella. Any method that uses random sampling to study a system or estimate a quantity: simulating neural noise, pricing financial options, estimating $\pi$ by throwing random darts. The goal is to mimic a stochastic process.
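As a small illustration of the dart-throwing idea, the sketch below estimates $\pi$ from the fraction of uniformly random points in the unit square that fall inside the quarter circle of radius 1. The function name `estimate_pi`, the seed, and the number of darts are illustrative choices, not part of the original text.

```python
import random


def estimate_pi(n_darts: int, seed: int = 0) -> float:
    """Estimate pi from random darts thrown at the unit square."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    inside = 0
    for _ in range(n_darts):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # dart landed inside the quarter circle
            inside += 1
    # (quarter-circle area) / (square area) = pi / 4
    return 4.0 * inside / n_darts


pi_hat = estimate_pi(100_000)
print(f"pi is approximately {pi_hat:.4f}")
```

The error shrinks roughly as $1/\sqrt{N}$, so each extra digit of accuracy costs about 100 times more darts.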

Example

Simulating the “Forest Ranger”

Signal Detection Theory (SDT) is widely applied in psychophysics, where it models an observer’s ability to distinguish signal from noise. Here the observer is a forest ranger watching for smoke.

1. Setting the model

Human perception is noisy and prone to internal or external interference, so we define:

  • Noise ($N$): when there is no smoke, factors like water vapor or shifting shadows create sensory noise. The noise distribution can be modeled as $\mathcal{N}(\mu_{\text{noise}}, \sigma_{\text{noise}}^2)$.
  • Signal + Noise ($S+N$): when actual smoke is present, the distribution of signal + noise is also a normal distribution, $\mathcal{N}(\mu_{\text{signal+noise}}, \sigma_{\text{signal+noise}}^2)$. Here, $\mu_{\text{signal+noise}} > \mu_{\text{noise}}$.
  • Decision criterion ($c$): the forest ranger's internal threshold. If the perceived smoke intensity exceeds $c$, the ranger sounds the alarm.

2. Monte Carlo iteration

We use a computer to act as the “virtual ranger” and conduct 10,000 simulated trials:

  • Random sampling: in each trial, randomly draw a value from either the $N$ or the $S+N$ distribution. This represents the instantaneous sensory strength perceived by the ranger.
  • Decision making: compare the sampled value against cc to determine the response.

3. Statistical output (results analysis)

Aggregating 10,000 trials gives the probabilities of the four possible outcomes:

| Outcome           | Truth         | Response |
|-------------------|---------------|----------|
| Hit               | Smoke present | Alarm    |
| Miss              | Smoke present | No alarm |
| False alarm       | No smoke      | Alarm    |
| Correct rejection | No smoke      | No alarm |
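The three steps above can be sketched as a short simulation. This is a minimal version under stated assumptions: both distributions share one standard deviation `sigma`, smoke is present on half of the trials (`p_smoke=0.5`), and the parameter values in the example call are illustrative. The function name `simulate_ranger` is hypothetical.

```python
import random


def simulate_ranger(mu_noise, mu_signal, sigma, criterion,
                    n_trials=10_000, p_smoke=0.5, seed=0):
    """Simulate yes/no detection trials; return rates conditional on the truth."""
    rng = random.Random(seed)
    counts = {"hit": 0, "miss": 0, "false_alarm": 0, "correct_rejection": 0}
    for _ in range(n_trials):
        smoke = rng.random() < p_smoke          # assumed prior on smoke
        mu = mu_signal if smoke else mu_noise   # pick S+N or N distribution
        percept = rng.gauss(mu, sigma)          # momentary sensory strength
        alarm = percept > criterion             # decision rule
        if smoke:
            counts["hit" if alarm else "miss"] += 1
        else:
            counts["false_alarm" if alarm else "correct_rejection"] += 1
    n_smoke = counts["hit"] + counts["miss"]
    n_clear = counts["false_alarm"] + counts["correct_rejection"]
    return {
        "hit": counts["hit"] / n_smoke,
        "miss": counts["miss"] / n_smoke,
        "false_alarm": counts["false_alarm"] / n_clear,
        "correct_rejection": counts["correct_rejection"] / n_clear,
    }


rates = simulate_ranger(mu_noise=0.0, mu_signal=2.0, sigma=1.0, criterion=1.0)
for outcome, rate in rates.items():
    print(f"{outcome:>17}: {rate:.3f}")
```

Rates are reported per truth condition (hit and miss sum to 1, as do false alarm and correct rejection), which is the usual SDT convention.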

Interactive visualization

The original page provides an interactive panel here: sliders set the two distribution means and the criterion, shaded regions show the four outcomes, and a button runs 10,000 samples to compare empirical with theoretical rates. The displayed slider values are 0.00, 2.00, and 1.00, which are consistent with $\mu_n = 0$, $\mu_s = 2$, $c = 1$ and $\sigma = 1$, giving sensitivity $d' = (\mu_s - \mu_n)/\sigma = 2.00$ and these rates:

  • Hit (smoke → alarm): 84.1%
  • Miss (smoke → no alarm): 15.9%
  • False alarm (no smoke → alarm): 15.9%
  • Correct rejection (no smoke → no alarm): 84.1%
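The theoretical rates come from the normal CDF: the hit rate is the area of the signal + noise distribution above the criterion, and the false-alarm rate is the area of the noise distribution above it. The sketch below assumes equal standard deviations for both distributions and uses the settings $\mu_n = 0$, $\mu_s = 2$, $c = 1$, $\sigma = 1$; the function name `theoretical_rates` is hypothetical.

```python
from statistics import NormalDist


def theoretical_rates(mu_noise, mu_signal, sigma, criterion):
    """Closed-form SDT outcome rates for equal-variance normal distributions."""
    noise = NormalDist(mu_noise, sigma)
    signal = NormalDist(mu_signal, sigma)
    hit = 1.0 - signal.cdf(criterion)         # P(percept > c | smoke)
    false_alarm = 1.0 - noise.cdf(criterion)  # P(percept > c | no smoke)
    return {
        "d_prime": (mu_signal - mu_noise) / sigma,
        "hit": hit,
        "miss": 1.0 - hit,
        "false_alarm": false_alarm,
        "correct_rejection": 1.0 - false_alarm,
    }


theory = theoretical_rates(mu_noise=0.0, mu_signal=2.0, sigma=1.0, criterion=1.0)
for name, value in theory.items():
    print(f"{name:>17}: {value:.3f}")
```

A Monte Carlo run with enough trials should land close to these values, which is a handy sanity check on the simulation.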

Monte Carlo Sampling

A narrower sub-category. Draws samples from a known probability distribution $p(x)$ to approximate expectations:

$$\mathbb{E}_{p}[f(x)] \;\approx\; \frac{1}{N}\sum_{i=1}^{N} f(x_i), \quad x_i \sim p(x)$$

Requires that $p(x)$ can be sampled directly, or handled through rejection or importance sampling. Directly drawn samples are i.i.d.
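A minimal sketch of this estimator, assuming $p(x)$ is a standard normal and $f(x) = x^2$, so the true expectation is 1. The helper name `mc_expectation`, the seed, and the sample count are illustrative.

```python
import random


def mc_expectation(f, sampler, n_samples):
    """Average f over i.i.d. draws produced by sampler()."""
    return sum(f(sampler()) for _ in range(n_samples)) / n_samples


rng = random.Random(0)  # fixed seed for repeatability
estimate = mc_expectation(
    f=lambda x: x * x,                    # f(x) = x^2
    sampler=lambda: rng.gauss(0.0, 1.0),  # x_i ~ N(0, 1)
    n_samples=50_000,
)
print(f"E[x^2] is approximately {estimate:.3f} (exact value: 1)")
```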

Markov Chain Monte Carlo (MCMC)

A technique for doing Monte Carlo sampling when you cannot sample from $p(x)$ directly. It constructs a Markov chain whose stationary distribution is $p(x)$; each draw depends only on the previous one. Heavily used for Bayesian posteriors, where the normalizing constant is unknown:

$$p(\theta \mid \text{data}) \;\propto\; p(\text{data} \mid \theta)\, p(\theta)$$

Samples are correlated, not i.i.d., so analysis requires burn-in and convergence diagnostics ($\hat{R}$, effective sample size (ESS)). Common algorithms: Metropolis–Hastings, Gibbs, HMC, NUTS.
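To make the idea concrete, here is a minimal random-walk Metropolis sketch (the symmetric-proposal special case of Metropolis–Hastings). It needs only the unnormalized log posterior. Everything specific is an assumption made for illustration: the five data points are made up, the likelihood is normal with known $\sigma = 1$, the prior on $\theta$ is $\mathcal{N}(0, 10^2)$, and the step size and burn-in length are arbitrary choices.

```python
import math
import random


def log_unnormalized_posterior(theta, data, sigma=1.0, prior_sd=10.0):
    """log p(data | theta) + log p(theta), dropping additive constants."""
    log_prior = -0.5 * (theta / prior_sd) ** 2
    log_lik = sum(-0.5 * ((y - theta) / sigma) ** 2 for y in data)
    return log_prior + log_lik


def metropolis(log_target, init, n_steps, step_sd, seed=0):
    """Random-walk Metropolis: each state depends only on the previous one."""
    rng = random.Random(seed)
    current, current_lp = init, log_target(init)
    chain = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_sd)  # symmetric proposal
        proposal_lp = log_target(proposal)
        log_ratio = proposal_lp - current_lp
        # Accept with probability min(1, target(proposal) / target(current))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current, current_lp = proposal, proposal_lp
        chain.append(current)  # a rejected step repeats the current state
    return chain


data = [1.8, 2.4, 2.1, 1.6, 2.6]  # made-up observations
chain = metropolis(lambda t: log_unnormalized_posterior(t, data),
                   init=0.0, n_steps=20_000, step_sd=1.0)
kept = chain[2_000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
print(f"posterior mean is approximately {posterior_mean:.3f}")
```

For this conjugate setup the exact posterior mean is $10.5 / 5.01 \approx 2.096$, so the chain's average can be checked against it. In real use the chain would also be checked with diagnostics such as $\hat{R}$ and ESS, which this sketch omits.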

MCMC ⊂ Monte Carlo Sampling ⊂ Monte Carlo Simulation