Chapter 4 — Introduction to Estimation Theory

Companion material for Chapter 4. Covers classical and Bayesian parameter estimation, MLE, MAP, MMSE, the Cramér–Rao bound, regression, and hypothesis testing.


Excursion: Plato’s Allegory of the Cave

Wikipedia

§ 4.1 Embedding in Statistics

Parameter Estimation Setup

Parameter estimation block diagram.

Placeholder: Figure

Diagram showing the two directions: probability (model → data) vs. statistics (data → model).

Likelihood

Source: https://probability4datascience.com/

Maximum Likelihood

Source: https://probability4datascience.com/

§ 4.3 Parameter Estimation

Quality Criteria

Source: https://probability4datascience.com/

Variance in Curve Fitting

Source: https://probability4datascience.com/

Placeholder: Computational Example

Compare estimators for the Gaussian mean: sample mean, trimmed mean, median. Compute bias and variance by Monte Carlo.
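
A minimal sketch of the comparison described above, under assumed settings that are not from the lecture: true mean 1, standard deviation 2, N = 25 samples per experiment, and a trimmed mean that drops 10% of the samples at each end. The function name `compare_mean_estimators` and all parameter values are illustrative.

```python
import numpy as np

def compare_mean_estimators(mu=1.0, sigma=2.0, n=25, trials=20000, trim=0.1, seed=0):
    """Monte Carlo bias and variance of three estimators of a Gaussian mean."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=(trials, n))   # one experiment per row
    k = int(trim * n)                             # samples dropped at each end
    x_sorted = np.sort(x, axis=1)
    estimates = {
        "sample mean": x.mean(axis=1),
        "trimmed mean": x_sorted[:, k:n - k].mean(axis=1),
        "median": np.median(x, axis=1),
    }
    # bias = average estimate minus true mean; variance taken across experiments
    return {name: (est.mean() - mu, est.var()) for name, est in estimates.items()}

results = compare_mean_estimators()
for name, (bias, var) in results.items():
    print(f"{name:12s}  bias={bias:+.4f}  variance={var:.4f}")
```

For Gaussian data all three estimators should come out close to unbiased, with the sample mean showing the smallest variance (about sigma^2 / N) and the median the largest.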

Confidence Intervals

Confidence interval example.

Placeholder: Computational Example

Simulate 100 experiments, compute 95% confidence intervals for each, and show that ~95 contain the true parameter.
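
The coverage experiment above could be sketched as follows. The sketch assumes a Gaussian mean with known standard deviation, so the interval is the z-interval with half-width 1.96 * sigma / sqrt(N); the function name `ci_coverage` and the numerical settings are illustrative choices.

```python
import numpy as np

def ci_coverage(mu=5.0, sigma=2.0, n=30, experiments=100, seed=1):
    """Count how many 95% z-intervals for the mean contain the true mean."""
    rng = np.random.default_rng(seed)
    z = 1.96                                   # approx. 97.5% standard normal quantile
    x = rng.normal(mu, sigma, size=(experiments, n))
    centre = x.mean(axis=1)                    # one sample mean per experiment
    half_width = z * sigma / np.sqrt(n)        # known-variance interval
    covered = (centre - half_width <= mu) & (mu <= centre + half_width)
    return int(covered.sum())

hits = ci_coverage()
print(f"{hits} of 100 intervals contain the true mean")
```

The count fluctuates from run to run around 95; it is the long-run fraction of covering intervals, not any single interval, that equals the confidence level.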


§ 4.4 Maximum Likelihood Estimation

Likelihood Function

Placeholder: Interactive

Likelihood surface explorer: given N coin flips, plot L(p) as a function of success probability p. Slider for N and number of heads.
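
A static stand-in for the explorer, as a sketch: it evaluates the coin-flip likelihood on a grid of p for one assumed data set (10 flips, 7 heads). The binomial coefficient is omitted because it does not depend on p; the function name `coin_likelihood` is illustrative.

```python
import numpy as np

def coin_likelihood(p, n_flips, n_heads):
    """Likelihood of success probability p, up to a factor that does not depend on p."""
    p = np.asarray(p, dtype=float)
    return p**n_heads * (1.0 - p)**(n_flips - n_heads)

p_grid = np.linspace(0.0, 1.0, 1001)
L = coin_likelihood(p_grid, n_flips=10, n_heads=7)
p_hat = p_grid[np.argmax(L)]        # grid maximizer of L(p)
print(f"grid maximizer: p = {p_hat:.3f}")
```

The maximizer sits at the fraction of heads, and repeating the evaluation with larger N at the same fraction gives a narrower peak.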

MLE in Action

Coin MLE vs. MAP comparison.

Bayesian reasoning about coin flips — the MLE and MAP perspectives on the same estimation problem. [From previous lecture version — review before using.]

Placeholder: Computational Example

Implement MLE for the Gaussian mean and variance. Compare biased vs. unbiased variance estimator.
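
One possible sketch of this example. The MLE of the variance divides by N and is therefore biased by the factor (N - 1)/N; rescaling by N/(N - 1) gives the unbiased estimator. The sample size N = 5, the true variance 4, and the name `gaussian_mle` are assumptions chosen to make the bias visible.

```python
import numpy as np

def gaussian_mle(x, axis=-1):
    """Maximum-likelihood estimates of the mean and variance of Gaussian samples."""
    x = np.asarray(x, dtype=float)
    mu = x.mean(axis=axis, keepdims=True)
    var_hat = ((x - mu) ** 2).mean(axis=axis)      # divides by N, not N - 1
    return mu.squeeze(axis=axis), var_hat

rng = np.random.default_rng(2)
true_var, n, trials = 4.0, 5, 50000
x = rng.normal(0.0, np.sqrt(true_var), size=(trials, n))

_, var_mle = gaussian_mle(x, axis=1)               # biased (MLE) estimator
var_unbiased = var_mle * n / (n - 1)               # bias-corrected estimator

print(f"mean of MLE variance:      {var_mle.mean():.3f}  (theory {(n - 1) / n * true_var:.3f})")
print(f"mean of unbiased variance: {var_unbiased.mean():.3f}  (theory {true_var:.3f})")
```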


§ 4.5 Bayesian Estimation

Choosing priors

What effect do the following priors have on the estimation?

Source: https://probability4datascience.com/

Which prior should we choose?

  • Based on prior knowledge, e.g., historical data tell you how the parameter should behave.
  • Based on physics, e.g., the parameter has a physical interpretation, so the prior must respect the physical laws.
  • Choose a prior that is computationally “friendlier”. This is the topic of the conjugate prior: a prior for which the posterior belongs to the same distribution family as the prior.

Prior to Posterior

Placeholder: Interactive

Prior-to-posterior updater: start with a Beta prior on p, observe coin flips one by one, watch the posterior sharpen.
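
A non-interactive sketch of the updater. With a Beta(alpha, beta) prior and Bernoulli observations, each head adds one to alpha and each tail adds one to beta. The Beta(2, 2) starting prior, the flip sequence, and the helper names are made up for illustration.

```python
def update_beta(alpha, beta, flip):
    """Posterior Beta parameters after one flip (1 = heads, 0 = tails)."""
    return (alpha + 1, beta) if flip else (alpha, beta + 1)

def beta_mean_var(alpha, beta):
    """Mean and variance of a Beta(alpha, beta) distribution."""
    s = alpha + beta
    return alpha / s, alpha * beta / (s * s * (s + 1))

alpha, beta = 2.0, 2.0                      # assumed prior
flips = [1, 1, 0, 1, 1, 0, 1, 1]            # assumed observations
history = [(alpha, beta)]
for flip in flips:
    alpha, beta = update_beta(alpha, beta, flip)
    history.append((alpha, beta))

for a, b in history:
    mean, var = beta_mean_var(a, b)
    print(f"Beta({a:.0f}, {b:.0f}): mean={mean:.3f}, variance={var:.4f}")
```

The printed variance shrinks as flips accumulate, which is the sharpening the interactive version is meant to show.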

MAP Estimator

MAP binary decoding.

Placeholder: Computational Example

Compare MAP and ML estimates for a coin bias problem with a Beta prior. Show how strong priors pull MAP toward the prior mean.
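
A sketch of this comparison, assuming a symmetric Beta(a, a) prior (prior mean 0.5) and an assumed data set of 9 heads in 10 flips. The MAP estimate is the mode of the Beta posterior, (heads + alpha - 1) / (n + alpha + beta - 2), which is valid when both posterior parameters exceed 1. Function names are illustrative.

```python
def coin_mle(heads, n):
    """Maximum-likelihood estimate of the heads probability."""
    return heads / n

def coin_map(heads, n, alpha, beta):
    """MAP estimate under a Beta(alpha, beta) prior (mode of the Beta posterior)."""
    return (heads + alpha - 1) / (n + alpha + beta - 2)

heads, n = 9, 10
print(f"MLE: {coin_mle(heads, n):.3f}")
for strength in (2, 20, 200):               # alpha + beta; larger means a stronger prior
    a = b = strength / 2
    print(f"Beta({a:.0f}, {b:.0f}) prior -> MAP: {coin_map(heads, n, a, b):.3f}")
```

With the uniform Beta(1, 1) prior the MAP estimate coincides with the MLE; as the prior strength grows, the MAP estimate moves from 0.9 toward 0.5.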

Source: https://probability4datascience.com/

Bayesian MMSE Estimator

Placeholder: Computational Example

Conjugate Gaussian model: compute the MMSE estimate analytically. Verify the posterior mean as a weighted combination of prior mean and observation.
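
A sketch under an assumed scalar model: prior theta ~ N(mu0, var0) and one observation y = theta + w with independent noise w ~ N(0, noise_var). The posterior mean is then (1 - k) * mu0 + k * y with weight k = var0 / (var0 + noise_var). The function name and the numerical values are illustrative.

```python
import numpy as np

def gaussian_mmse(y, mu0, var0, noise_var):
    """Posterior mean and variance of theta given y = theta + noise."""
    k = var0 / (var0 + noise_var)              # weight placed on the observation
    post_mean = (1.0 - k) * mu0 + k * y        # weighted combination
    post_var = var0 * noise_var / (var0 + noise_var)
    return post_mean, post_var

rng = np.random.default_rng(3)
mu0, var0, noise_var = 1.0, 4.0, 1.0
theta = rng.normal(mu0, np.sqrt(var0), 200000)               # draw from the prior
y = theta + rng.normal(0.0, np.sqrt(noise_var), theta.size)  # noisy observations

theta_hat, post_var = gaussian_mmse(y, mu0, var0, noise_var)
mse_mmse = np.mean((theta_hat - theta) ** 2)
mse_raw = np.mean((y - theta) ** 2)            # using y itself as the estimate
print(f"MMSE estimator MSE: {mse_mmse:.3f}  (posterior variance {post_var:.3f})")
print(f"raw observation MSE: {mse_raw:.3f}  (noise variance {noise_var:.3f})")
```

The empirical MSE of the posterior mean should match the posterior variance and fall below the MSE obtained by using the observation directly.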


§ 4.6 Cramér–Rao Bound

Fisher Information

Placeholder: Computational Example

Compute the Fisher information for Gaussian and Poisson models. Compare the CRB to the empirical variance of the MLE across Monte Carlo trials.
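
A sketch for two standard cases, both assumptions about which parameters the example targets: the mean of a Gaussian with known variance, where I(mu) = N / sigma^2, and the rate of a Poisson distribution, where I(lambda) = N / lambda. In both cases the MLE is the sample mean, and the CRB is the reciprocal of the Fisher information. Function names and settings are illustrative.

```python
import numpy as np

def fisher_gaussian_mean(sigma2, n):
    """Fisher information about the mean from n i.i.d. N(mu, sigma2) samples."""
    return n / sigma2

def fisher_poisson_rate(lam, n):
    """Fisher information about the rate from n i.i.d. Poisson(lam) samples."""
    return n / lam

rng = np.random.default_rng(4)
n, trials = 50, 20000
sigma2, lam = 2.0, 3.0

# The MLE is the sample mean in both models; one estimate per Monte Carlo trial.
gauss_mle = rng.normal(0.5, np.sqrt(sigma2), (trials, n)).mean(axis=1)
pois_mle = rng.poisson(lam, (trials, n)).mean(axis=1)

crb_gauss = 1.0 / fisher_gaussian_mean(sigma2, n)
crb_pois = 1.0 / fisher_poisson_rate(lam, n)
print(f"Gaussian: CRB={crb_gauss:.4f}, empirical variance={gauss_mle.var():.4f}")
print(f"Poisson:  CRB={crb_pois:.4f}, empirical variance={pois_mle.var():.4f}")
```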

Placeholder: Interactive

CRB visualizer: plot the bound as a function of sample size N and noise \sigma^2. Overlay the empirical MSE of the MLE.


§ 4.7 Regression Estimation

MMSE vs. Least Squares

MMSE vs. LS estimator comparison.

Linear Regression

Linear regression example.

Cross-correlation based peak detection for estimating delay between two noisy observations — a concrete least-squares estimation example.
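
A sketch of delay estimation by cross-correlation peak picking. It assumes a white-noise source signal, an integer delay of 17 samples, and independent noise on both observations; all of these, and the name `estimate_delay`, are illustrative. The lag convention is that a positive result means the second signal trails the first.

```python
import numpy as np

def estimate_delay(x, y):
    """Lag in samples by which y trails x, taken at the cross-correlation peak."""
    corr = np.correlate(y, x, mode="full")         # c[k] = sum_n y[n + k] * x[n]
    lags = np.arange(-(len(x) - 1), len(y))        # lag value for each entry of corr
    return int(lags[np.argmax(corr)])

rng = np.random.default_rng(5)
n, true_delay = 1000, 17
source = rng.normal(0.0, 1.0, n + true_delay)
x = source[true_delay:] + rng.normal(0.0, 0.5, n)  # earlier observation
y = source[:n] + rng.normal(0.0, 0.5, n)           # same signal, delayed by true_delay

estimated_delay = estimate_delay(x, y)
print(f"true delay = {true_delay}, estimated delay = {estimated_delay}")
```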

Placeholder: Computational Example

Fit a linear model with numpy.linalg.lstsq. Visualize the residuals and check that they look white (uncorrelated), as they should if the model is correct.
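
A sketch of the fit on synthetic data. The true slope 1.5, intercept -2, and noise level 0.5 are assumed values, and the lag-1 autocorrelation of the residuals is used here as a simple numerical stand-in for a visual whiteness check.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
t = np.linspace(0.0, 10.0, n)
true_slope, true_intercept = 1.5, -2.0
y = true_slope * t + true_intercept + rng.normal(0.0, 0.5, n)

# Design matrix with a column for the slope and a column of ones for the intercept.
A = np.column_stack([t, np.ones(n)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
slope_hat, intercept_hat = coef
residuals = y - A @ coef

# Lag-1 autocorrelation of the residuals; close to zero for white residuals.
r = residuals - residuals.mean()
lag1_autocorr = np.sum(r[1:] * r[:-1]) / np.sum(r * r)

print(f"slope={slope_hat:.3f}, intercept={intercept_hat:.3f}")
print(f"lag-1 residual autocorrelation={lag1_autocorr:+.3f}")
```

Fitting a straight line to data that actually follow a curve would leave slowly varying residuals and a lag-1 autocorrelation well away from zero.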


§ 4.8 Hypothesis Testing

Binary Decision Problem

Hypothesis testing setup.

Detect sinusoids in noise using autocorrelation and spectral analysis. Adjustable SNR and observation length — directly relates to the hypothesis testing framework.

Frequency-domain delay estimation using cross-PSD phase — connects detection theory to spectral methods.

Likelihood Ratio Test

Placeholder: Computational Example

Implement a likelihood ratio test for Gaussian shift detection. Compute empirical false alarm and detection rates.
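
A sketch assuming the following setup, which is not spelled out above: N i.i.d. samples that are N(0, sigma^2) under H0 and N(A, sigma^2) under H1, with A and sigma known. The log-likelihood ratio is then (A * sum(x) - N * A^2 / 2) / sigma^2, and H1 is chosen when it exceeds a log-threshold. The values A = 0.5, sigma = 1, N = 16 and a threshold of 0 are illustrative.

```python
import numpy as np

def log_likelihood_ratio(x, shift, sigma):
    """log p(x | H1) - log p(x | H0) for a known mean shift in Gaussian noise."""
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    return (shift * x.sum(axis=-1) - 0.5 * n * shift**2) / sigma**2

rng = np.random.default_rng(7)
shift, sigma, n, trials = 0.5, 1.0, 16, 50000
log_eta = 0.0                                   # decide H1 when the log-ratio exceeds this

x0 = rng.normal(0.0, sigma, (trials, n))        # data generated under H0
x1 = rng.normal(shift, sigma, (trials, n))      # data generated under H1

p_fa = np.mean(log_likelihood_ratio(x0, shift, sigma) > log_eta)
p_d = np.mean(log_likelihood_ratio(x1, shift, sigma) > log_eta)
print(f"empirical P_FA={p_fa:.4f}, empirical P_D={p_d:.4f}")
```

With these values the test reduces to comparing the sample mean with A/2, so the rates should land near Q(1) ≈ 0.159 and Q(-1) ≈ 0.841.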

ROC Curve

Placeholder: Interactive

ROC curve explorer: vary the decision threshold and observe P_D vs. P_{FA} trade-off across different SNR values.

Gaussian Shift Detection

Gaussian shift detection setup.

Placeholder: Computational Example

Monte Carlo verification of the theoretical ROC curve for Gaussian shift detection. Compare to the analytical Q-function expression.
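
A sketch of this verification, assuming the detector thresholds the sample mean of N i.i.d. Gaussian samples (mean 0 under H0, mean A under H1, known sigma). For a threshold gamma the analytical rates are P_FA = Q(gamma * sqrt(N) / sigma) and P_D = Q((gamma - A) * sqrt(N) / sigma), with Q computed from `math.erfc`. Parameter values and function names are illustrative.

```python
import math
import numpy as np

def q_function(x):
    """Gaussian tail probability Q(x) = P(Z > x) for standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def theoretical_roc(thresholds, shift, sigma, n):
    """Analytical (P_FA, P_D) for thresholding the sample mean."""
    scale = math.sqrt(n) / sigma
    p_fa = np.array([q_function(g * scale) for g in thresholds])
    p_d = np.array([q_function((g - shift) * scale) for g in thresholds])
    return p_fa, p_d

def empirical_roc(thresholds, shift, sigma, n, trials, rng):
    """Monte Carlo (P_FA, P_D) for the same detector."""
    t0 = rng.normal(0.0, sigma, (trials, n)).mean(axis=1)     # statistic under H0
    t1 = rng.normal(shift, sigma, (trials, n)).mean(axis=1)   # statistic under H1
    p_fa = (t0[:, None] > thresholds).mean(axis=0)
    p_d = (t1[:, None] > thresholds).mean(axis=0)
    return p_fa, p_d

shift, sigma, n = 0.5, 1.0, 16
thresholds = np.linspace(-0.5, 1.0, 21)
rng = np.random.default_rng(8)

pfa_th, pd_th = theoretical_roc(thresholds, shift, sigma, n)
pfa_mc, pd_mc = empirical_roc(thresholds, shift, sigma, n, 20000, rng)
max_gap = max(np.abs(pfa_th - pfa_mc).max(), np.abs(pd_th - pd_mc).max())
print(f"largest gap between theory and simulation: {max_gap:.4f}")
```

Plotting `pd_mc` against `pfa_mc` on top of `pd_th` against `pfa_th` gives the visual comparison; the gap shrinks roughly as one over the square root of the number of trials.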