Random Experiments, Random Variables and Random Processes
Key Concepts
- Random Variable
- Stochastic (Random) Process
- Cumulative Distribution Function (CDF) and Probability Density Function (PDF)
- Basic Distributions: Uniform, Gaussian, Laplacian
Uniform Distribution
A continuous real-valued RV uniformly distributed in the interval [x_{\text{min}}, x_{\text{max}}] has the PDF:
f_X(x) = \begin{cases} \frac{1}{x_{\text{max}} - x_{\text{min}}} & \text{for } x_{\text{min}} \leq x \leq x_{\text{max}} \\ 0 & \text{otherwise} \end{cases}
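The piecewise density can be checked numerically. The following is a minimal sketch using only the Python standard library; the interval [-1, 3], the sub-interval [0, 1], the seed, and the names `uniform_pdf`, `fraction`, and `expected` are illustrative assumptions, not part of the original notes.

```python
import random

def uniform_pdf(x, x_min, x_max):
    """Density of a uniform RV on [x_min, x_max], evaluated at x."""
    if x_min <= x <= x_max:
        return 1.0 / (x_max - x_min)
    return 0.0

rng = random.Random(0)            # fixed seed so the sketch is reproducible
x_min, x_max = -1.0, 3.0          # assumed example interval
samples = [rng.uniform(x_min, x_max) for _ in range(10_000)]

# The fraction of samples in [a, b] should approach (b - a) * f_X(x)
a, b = 0.0, 1.0
fraction = sum(a <= s <= b for s in samples) / len(samples)
expected = (b - a) * uniform_pdf(0.5, x_min, x_max)
print(fraction, expected)
```

Because the density is constant, the probability of any sub-interval depends only on its length relative to x_{\text{max}} - x_{\text{min}}.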
Applications of Uniform Distribution
Sinusoidal random process with uniformly distributed random phase: X(\eta,t) = \sin( \omega t + \phi(\eta)) \quad \text{ with } \quad \phi(\eta) \sim U[0,2\pi]
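A small ensemble of such sinusoids can be simulated as follows. This is a sketch under stated assumptions: the frequency of 5 Hz, the observation time `t0`, the ensemble size, and the helper name `random_phase_sinusoid` are chosen only for illustration.

```python
import math
import random

def random_phase_sinusoid(omega, rng):
    """Return one realization t -> sin(omega * t + phi) with phi ~ U[0, 2*pi]."""
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return lambda t: math.sin(omega * t + phi)

rng = random.Random(1)                      # fixed seed for reproducibility
omega = 2.0 * math.pi * 5.0                 # assumed angular frequency (5 Hz)
realizations = [random_phase_sinusoid(omega, rng) for _ in range(5000)]

# Average over the ensemble (over eta) at one fixed time instant
t0 = 0.123
values = [x(t0) for x in realizations]
ensemble_mean = sum(values) / len(values)
ensemble_power = sum(v * v for v in values) / len(values)
print(ensemble_mean, ensemble_power)
```

With a uniform phase, the ensemble mean at any fixed time should be close to 0 and the ensemble power close to 1/2, independent of the chosen `t0`.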
Random Directions on a Sphere
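The direction of arrival of a signal (for example, at a sensor or microphone array) can be represented by its azimuth angle \phi_i and its polar angle \theta_i (measured from the z-axis): \phi_i \sim U[0, 2\pi], \qquad \theta_i \sim U[0, \pi].
A random direction on the unit sphere can then be expressed in Cartesian coordinates as x_i = \sin(\theta_i)\cos(\phi_i), \quad y_i = \sin(\theta_i)\sin(\phi_i), \quad z_i = \cos(\theta_i).
Note that drawing \theta_i uniformly concentrates directions near the poles. For directions that are uniformly distributed over the surface of the sphere, \cos(\theta_i) \sim U[-1, 1] would be drawn instead.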
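The conversion from random angles to Cartesian coordinates can be sketched as below. The code follows the sampling exactly as written in the text (both angles drawn uniformly); the function name `random_direction`, the seed, and the sample count are assumptions made for illustration.

```python
import math
import random

def random_direction(rng):
    """Draw the two angles uniformly and convert to a unit vector."""
    phi = rng.uniform(0.0, 2.0 * math.pi)   # azimuth
    theta = rng.uniform(0.0, math.pi)       # angle measured from the z-axis
    x = math.sin(theta) * math.cos(phi)
    y = math.sin(theta) * math.sin(phi)
    z = math.cos(theta)
    return x, y, z

rng = random.Random(2)                      # fixed seed for reproducibility
directions = [random_direction(rng) for _ in range(1000)]

# Every generated vector should lie on the unit sphere
norms = [math.sqrt(x * x + y * y + z * z) for x, y, z in directions]
print(min(norms), max(norms))
```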
Random points in a shoebox-shaped room
In a room with dimensions L \times W \times H, a random position is a 3-dimensional vector whose components are independently and uniformly distributed: {\bf x}_i \sim U[0,L] \times U[0,W] \times U[0,H]
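Drawing random positions in such a room amounts to one independent uniform draw per coordinate. The sketch below assumes example room dimensions of 6 m by 4 m by 2.5 m; these values and the helper name `random_position` are hypothetical.

```python
import random

def random_position(L, W, H, rng):
    """Draw one position uniformly inside the box [0, L] x [0, W] x [0, H]."""
    return (rng.uniform(0.0, L), rng.uniform(0.0, W), rng.uniform(0.0, H))

rng = random.Random(7)                 # fixed seed for reproducibility
L, W, H = 6.0, 4.0, 2.5                # assumed room dimensions in metres
positions = [random_position(L, W, H, rng) for _ in range(500)]
print(positions[0])
```

Such positions could serve, for example, as randomly placed sources or receivers in a room simulation.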
Gaussian Distribution
The density of a Gaussian RV with mean m_{X} and standard deviation \sigma_{X} is given by the closed-form expression f_{X}(x) = \frac{1}{\sqrt{2\pi} \cdot \sigma_{X}} e^{-(x-m_{X})^2/ (2 \sigma_{X}^2)}
An example of a Gaussian is given below: \mathcal{N}(m_{X}=3,\, \sigma_{X}^2=2^2).
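The example \mathcal{N}(m_{X}=3,\, \sigma_{X}^2=2^2) can be evaluated and sampled with the standard library. This is a sketch, not the notebook's original plotting code; the grid spacing, sample count, seed, and the name `gaussian_pdf` are assumptions.

```python
import math
import random

def gaussian_pdf(x, m, sigma):
    """Closed-form Gaussian density with mean m and standard deviation sigma."""
    return math.exp(-(x - m) ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

m_X, sigma_X = 3.0, 2.0
peak = gaussian_pdf(m_X, m_X, sigma_X)      # maximum of the density, at x = m_X

# Riemann sum over +/- 10 standard deviations: the density should integrate to 1
dx = 0.01
n_points = int(20.0 * sigma_X / dx) + 1
grid = [m_X - 10.0 * sigma_X + k * dx for k in range(n_points)]
area = sum(gaussian_pdf(x, m_X, sigma_X) for x in grid) * dx

# Draw samples and compare their statistics with the parameters
rng = random.Random(3)
samples = [rng.gauss(m_X, sigma_X) for _ in range(20_000)]
sample_mean = sum(samples) / len(samples)
sample_std = math.sqrt(sum((s - sample_mean) ** 2 for s in samples) / len(samples))
print(peak, area, sample_mean, sample_std)
```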
Application of Gaussian Distribution
The Gaussian distribution commonly arises when many independent effects superpose, as described by the central limit theorem.
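The superposition effect can be illustrated by summing many independent uniform variables and comparing the result with Gaussian probabilities. In this sketch the number of superposed effects `K`, the number of trials `N`, and the choice of U[-1, 1] for each effect are assumptions made for illustration.

```python
import math
import random

rng = random.Random(4)                 # fixed seed for reproducibility
K = 30                                 # assumed number of superposed effects
N = 10_000                             # number of observed sums

# Each U[-1, 1] term has variance 1/3, so the sum has variance K/3
sigma_sum = math.sqrt(K / 3.0)
sums = [sum(rng.uniform(-1.0, 1.0) for _ in range(K)) for _ in range(N)]

# For a Gaussian, about 68.3 % of values lie within one standard deviation
# of the mean and about 95.4 % within two standard deviations
within_one_sigma = sum(abs(s) <= sigma_sum for s in sums) / N
within_two_sigma = sum(abs(s) <= 2.0 * sigma_sum for s in sums) / N
print(within_one_sigma, within_two_sigma)
```

Although each individual term is uniformly distributed, the sum already matches the Gaussian probabilities closely for a moderate number of terms.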
Sensor Noise - Thermal Noise in Microphones
Even in complete silence, the random thermal motion of electrons inside the microphone’s diaphragm and amplifier circuitry generates a small, fluctuating voltage — this is Johnson–Nyquist noise. It is broadband and approximately Gaussian-distributed, producing a faint hiss that can be heard if you amplify the signal enough. In high-quality condenser microphones, the equivalent input noise level is typically around 10–20 dBA SPL, setting the lower bound for measurable sound pressure. This unavoidable noise floor illustrates that every sensor, regardless of quality, introduces some randomness due to thermal agitation.
Recording of self-noise of a Sennheiser Ambeo microphone:
Reverberation
A classic example in audio is the reverberation of sound in a room. After the initial phase, the room is filled with sound, and the distribution of sound pressure over all positions in the room approximates a Gaussian distribution, because a large number of reflected sound waves superpose to constitute the resulting sound field.
Multidimensional Probability Distributions and Densities
Key Concepts
- Joint Distributions and Densities
- Marginal Distributions and Densities
- Statistical Independence
- Conditional Distributions and Densities
2D Gaussian Distribution
Visualize the joint Gaussian distribution with parameters \mathbf{m}_X = \begin{bmatrix} 1 \\[4pt] 0.5 \end{bmatrix} \quad \mathbf{C}_{XX} = \begin{bmatrix} 1.2 & 0.45 \\[4pt] 0.45 & 0.2 \end{bmatrix} Note that the marginals of the PDF are scaled for better plotting.
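Samples from this joint Gaussian can be generated from two independent standard normal variables via a Cholesky factor of \mathbf{C}_{XX}. The sketch below writes out the 2x2 factorization by hand and is not the plotting code of the original notes; the names `sample_joint`, `l11`, `l21`, `l22`, the seed, and the sample count are assumptions.

```python
import math
import random

m = (1.0, 0.5)                          # mean vector m_X
C = ((1.2, 0.45), (0.45, 0.2))          # covariance matrix C_XX

# Lower-triangular Cholesky factor of the 2x2 covariance matrix
l11 = math.sqrt(C[0][0])
l21 = C[1][0] / l11
l22 = math.sqrt(C[1][1] - l21 ** 2)

def sample_joint(rng):
    """Map two independent N(0, 1) draws to one draw of the joint Gaussian."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return m[0] + l11 * z1, m[1] + l21 * z1 + l22 * z2

rng = random.Random(5)                  # fixed seed for reproducibility
pts = [sample_joint(rng) for _ in range(20_000)]
n = len(pts)

# Sample mean and sample covariance for comparison with m and C
mean_x = sum(p[0] for p in pts) / n
mean_y = sum(p[1] for p in pts) / n
cov_xx = sum((p[0] - mean_x) ** 2 for p in pts) / n
cov_yy = sum((p[1] - mean_y) ** 2 for p in pts) / n
cov_xy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in pts) / n
print(mean_x, mean_y, cov_xx, cov_xy, cov_yy)
```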
The bird's-eye view of the joint distribution
Conditional Distribution
The conditional density of Y given X=x_0 (for f_X(x_0) > 0) is: f_{Y|X}(y|x_0) = \frac{f_{XY}(x_0,y)}{f_X(x_0)}
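The definition can be evaluated numerically for the 2D Gaussian example with \mathbf{m}_X = [1, 0.5]^T and \mathbf{C}_{XX} = [[1.2, 0.45], [0.45, 0.2]]. This is a sketch: the conditioning value `x0 = 2.0`, the integration grid, and all function names are assumptions chosen for illustration.

```python
import math

M_X, M_Y = 1.0, 0.5                     # means of X and Y
C_XX, C_XY, C_YY = 1.2, 0.45, 0.2       # entries of the covariance matrix

def joint_pdf(x, y):
    """Bivariate Gaussian density with the parameters above."""
    det = C_XX * C_YY - C_XY ** 2
    dx, dy = x - M_X, y - M_Y
    # Quadratic form d^T C^{-1} d, written out for the 2x2 case
    q = (C_YY * dx * dx - 2.0 * C_XY * dx * dy + C_XX * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def marginal_pdf_x(x):
    """Marginal density of X, a Gaussian with mean M_X and variance C_XX."""
    return math.exp(-(x - M_X) ** 2 / (2.0 * C_XX)) / math.sqrt(2.0 * math.pi * C_XX)

def conditional_pdf(y, x0):
    """Conditional density: joint density at (x0, y) divided by the marginal at x0."""
    return joint_pdf(x0, y) / marginal_pdf_x(x0)

x0 = 2.0                                # assumed conditioning value
dy = 0.001
ys = [-3.0 + k * dy for k in range(8001)]           # grid from -3 to 5
area = sum(conditional_pdf(y, x0) for y in ys) * dy
cond_mean = sum(y * conditional_pdf(y, x0) for y in ys) * dy
print(area, cond_mean)
```

The conditional density integrates to 1 over y, and for jointly Gaussian variables its mean equals m_Y + (C_{XY}/C_{XX})(x_0 - m_X), which is 0.875 for these values.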
Independence of random variables
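Two random variables X and Y are independent if, for all x and y: f_{XY}(x,y) = f_X(x) \cdot f_Y(y)
The marginals alone do not determine the joint distribution. For the joint Gaussian, two distributions with the same means and variances but different off-diagonal entries of \mathbf{C}_{XX} are different joint distributions with identical marginal distributions. Only the case with zero off-diagonal entries corresponds to independent X and Y.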
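This can be illustrated by sampling two joint Gaussians that share standard normal marginals but differ in their correlation. The construction below, including the correlation values 0 and 0.9 and the names `sample_pair` and `summarize`, is an assumption made for the sketch.

```python
import math
import random

def sample_pair(rho, rng):
    """Draw (X, Y) with standard normal marginals and correlation rho."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return z1, rho * z1 + math.sqrt(1.0 - rho ** 2) * z2

def summarize(rho, n=20_000, seed=6):
    """Estimate (var X, var Y, cov XY); the means are zero by construction."""
    rng = random.Random(seed)
    pairs = [sample_pair(rho, rng) for _ in range(n)]
    var_x = sum(x * x for x, _ in pairs) / n
    var_y = sum(y * y for _, y in pairs) / n
    cov_xy = sum(x * y for x, y in pairs) / n
    return var_x, var_y, cov_xy

independent = summarize(0.0)            # product of the marginals
correlated = summarize(0.9)             # same marginals, different joint
print(independent, correlated)
```

Both cases give variances close to 1 for X and Y, but only the first has a covariance close to 0.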
Empirical distribution from data
Distributions can also be estimated directly from data samples x^{(1)}, \dots, x^{(M)}. The empirical cumulative distribution function (CDF) is defined as \widehat{F}_X(x) = \frac{1}{M}\sum_{i=1}^{M} \mathbf{1}(x^{(i)} \leq x), where \mathbf{1}(\text{statement}) equals 1 if the statement is true and 0 otherwise. In words, \widehat{F}_X(x) is the fraction of samples that do not exceed x.
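The empirical CDF can be computed by sorting the samples once and counting with a binary search. The sketch below counts samples not exceeding x, which is the usual right-continuous convention; the helper name `make_ecdf` and the four-sample toy data set are assumptions.

```python
import bisect

def make_ecdf(samples):
    """Return a function x -> fraction of samples that are <= x."""
    ordered = sorted(samples)
    M = len(ordered)

    def ecdf(x):
        # bisect_right gives the number of sorted entries that are <= x
        return bisect.bisect_right(ordered, x) / M

    return ecdf

ecdf = make_ecdf([3.0, 1.0, 2.0, 2.0])          # toy data with one repeated value
values = [ecdf(x) for x in (0.5, 1.0, 2.0, 2.5, 3.0)]
print(values)                                    # [0.0, 0.25, 0.75, 0.75, 1.0]
```

The result is a staircase function that jumps by 1/M at every sample, or by a multiple of 1/M where samples coincide.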
To estimate a probability density function (PDF) empirically, some form of smoothing is required. The most straightforward approach is a histogram, which divides the data range into discrete bins of a chosen width h and counts the relative frequency of samples in each bin. Dividing by the bin width turns the relative frequency into a density: \widehat{f}_X(x) = \frac{\text{count in bin containing } x}{M \cdot h}.
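A histogram density estimate with bin width h can be sketched as follows. The test data (standard normal samples), the bin width of 0.2, the choice to start the bins at the smallest sample, and the name `histogram_density` are assumptions made for illustration.

```python
import random

def histogram_density(samples, h):
    """Return (bin edges, density per bin) using count / (M * h)."""
    lo = min(samples)
    n_bins = int((max(samples) - lo) // h) + 1
    counts = [0] * n_bins
    for s in samples:
        counts[int((s - lo) // h)] += 1          # index of the bin containing s
    M = len(samples)
    edges = [lo + k * h for k in range(n_bins + 1)]
    density = [c / (M * h) for c in counts]
    return edges, density

rng = random.Random(8)                           # fixed seed for reproducibility
samples = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
h = 0.2                                          # assumed bin width
edges, density = histogram_density(samples, h)

total_area = sum(density) * h                    # should equal 1 up to rounding
bin_of_zero = int((0.0 - edges[0]) // h)         # bin that contains x = 0
density_at_zero = density[bin_of_zero]
print(total_area, density_at_zero)
```

The estimate at x = 0 should be close to the true standard normal density of about 0.399. A smaller h gives finer resolution but noisier bin counts.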
Application: Speech
We illustrate the empirical distribution of a random variable using a speech signal as an example.