Graphical Lasso

Hurile Borjigin · 2025-10-07

Graphical Lasso

A. Covariance Matrix Σ\Sigma

The covariance matrix measures how multiple variables vary together. It is a symmetric square matrix whose size is determined by the number of variables. For a vector of random variables $\mathbf{X} = (X_1, X_2, \cdots, X_p)$, the covariance matrix $\Sigma$ is defined as:

$$\Sigma = \mathrm{Cov}(\mathbf{X}, \mathbf{X}) = \begin{pmatrix} \mathrm{Var}(X_1) & \mathrm{Cov}(X_1, X_2) & \cdots & \mathrm{Cov}(X_1, X_p) \\ \mathrm{Cov}(X_2, X_1) & \mathrm{Var}(X_2) & \cdots & \mathrm{Cov}(X_2, X_p) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}(X_p, X_1) & \mathrm{Cov}(X_p, X_2) & \cdots & \mathrm{Var}(X_p) \end{pmatrix}$$

B. Precision Matrix ($\Theta = \Sigma^{-1}$):

The precision matrix is simply the inverse of the covariance matrix:

$$\Theta = \Sigma^{-1}$$
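As a quick numerical illustration, the precision matrix can be obtained by inverting the covariance matrix; the 2×2 matrix below is made up purely for the example:

```python
import numpy as np

# A hypothetical 2x2 covariance matrix (illustrative values only)
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

# The precision matrix is its inverse
Theta = np.linalg.inv(Sigma)

# Sanity check: Theta @ Sigma recovers the identity matrix
print(np.allclose(Theta @ Sigma, np.eye(2)))  # True
```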

C. Empirical Covariance Matrix S

The empirical covariance matrix (also called the sample covariance matrix) is an estimate of the true covariance matrix based on observed data. For a dataset of $n$ samples and $p$ variables, the data matrix is:

$$X = \begin{pmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,p} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,p} \end{pmatrix}$$

Assuming the data is centred (mean-subtracted), the empirical covariance matrix $S$ is computed as:

$$S = \frac{1}{n-1} X^T X$$

This empirical covariance matrix is our practical tool for inferring the covariance structure, since the true covariance matrix of the whole population is generally unknown.
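The formula above can be checked in a few lines of NumPy; the random data and the sample sizes are assumptions made only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3                      # illustrative sample and variable counts
X = rng.normal(size=(n, p))

X = X - X.mean(axis=0)             # centre each column (mean-subtract)
S = X.T @ X / (n - 1)              # empirical covariance matrix

# Agrees with NumPy's built-in estimator
print(np.allclose(S, np.cov(X, rowvar=False)))  # True
```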

Graphical Models and Graphical Lasso

A. Gaussian Graphical Model (GGM):

A GGM assumes the multi-dimensional data vector $\mathbf{X}$ is drawn from a multivariate Gaussian distribution:

$$\mathbf{X} \sim \mathcal{N}(\mu, \Sigma)$$

The structure of conditional dependence is captured by the precision matrix: $\Theta_{i,j} = 0$ if and only if $X_i$ and $X_j$ are conditionally independent given all the other variables.

Thus, the main objective of graphical modelling is estimating $\Theta$, which reveals conditional independencies.
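This link between zeros in $\Theta$ and conditional independence can be seen numerically. Below, a hypothetical 3-variable precision matrix has a zero in the $(X_1, X_3)$ entry; the corresponding covariance entry is nonetheless nonzero, since $X_1$ and $X_3$ remain marginally correlated through $X_2$:

```python
import numpy as np

# Hypothetical precision matrix: the (X1, X3) entry is zero,
# i.e. X1 and X3 are conditionally independent given X2
Theta = np.array([[2.0, 0.6, 0.0],
                  [0.6, 2.0, 0.6],
                  [0.0, 0.6, 2.0]])

Sigma = np.linalg.inv(Theta)

print(Theta[0, 2])   # 0.0 -> conditional independence
print(Sigma[0, 2])   # nonzero -> X1 and X3 still marginally correlated
```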

B. Graphical Lasso:

The graphical lasso (glasso) is an estimator that estimates the precision matrix while simultaneously enforcing sparsity (many zeros), simplifying the graphical structure. It solves the following optimization problem:

$$\hat{\Theta}_{\text{glasso}} = \arg\max_{\Theta \succ 0} \left( \log\det\Theta - \mathrm{tr}(S\Theta) - \lambda \sum_{i \neq j} |\Theta_{i,j}| \right)$$

This L1 penalty is what makes the graphical lasso valuable compared to plain precision-matrix estimation: it yields sparse, interpretable estimates, aids dimension reduction, and produces a useful graphical network.
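A minimal sketch using scikit-learn's `GraphicalLasso` estimator; the chain-structured synthetic data and the penalty value `alpha=0.1` (playing the role of $\lambda$) are illustrative assumptions, not prescriptions:

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
n = 500

# Synthetic chain X1 -> X2 -> X3: X1 and X3 are linked only through X2
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
x3 = 0.8 * x2 + rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

# alpha is the L1 penalty; larger alpha -> sparser estimated precision matrix
model = GraphicalLasso(alpha=0.1).fit(X)
Theta_hat = model.precision_

# The chain structure should leave the (X1, X3) entry shrunk toward zero
print(np.round(Theta_hat, 2))
```

Reading off which off-diagonal entries of `Theta_hat` survive the penalty gives the edges of the estimated conditional-independence graph.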
