Matrix Algebra 👨‍💻

MATH 4780 / MSSC 5780 Regression Analysis

Dr. Cheng-Han Yu
Department of Mathematical and Statistical Sciences
Marquette University

Matrix

  • A matrix \({\bf A}\) that has \(n\) rows and \(m\) columns is defined as \[{\bf A} = (a_{ij})_{n \times m} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}_{n \times m}\]
(A <- matrix(data = 1:6, 
             nrow = 2, ncol = 3))
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
[1] "matrix" "array" 
$dim
[1] 2 3

But what is the geometrical meaning of matrices?

https://gfycat.com/cooperativequerulousfirefly (3Blue1Brown)

Column Vector

  • If \(m = 1\), it becomes a \(n\) by 1 matrix or a column vector of size \(n\). \[\small {\bf y} = \begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix}_{n \times 1} \quad {\bf 1} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}_{n \times 1}\]
A
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
## 2nd column
A[, 2]
[1] 3 4
## drops to an atomic (integer) vector
class(A[, 2]) 
[1] "integer"
## keep its matrix class
A[, 2, drop = FALSE] 
     [,1]
[1,]    3
[2,]    4
class(A[, 2, drop = FALSE])  
[1] "matrix" "array" 

Row Vector

  • If \(n = 1\), it becomes a \(1\) by \(m\) matrix or a row vector of size \(m\). \[{\bf x} = \begin{bmatrix} x_{1} & x_{2} & \dots & x_{m} \end{bmatrix}_{1 \times m}\]
  • By default, a vector means a column vector.
## 2nd row
A[2, , drop = FALSE]
     [,1] [,2] [,3]
[1,]    2    4    6
## keep its matrix class
class(A[2, , drop = FALSE]) 
[1] "matrix" "array" 

Transpose

  • Let \({\bf A}\) be an \(n \times m\) matrix. The transpose of \({\bf A}\), denoted by \({\bf A}'\) or \({\bf A}^T\), is an \(m \times n\) matrix whose columns are the rows of \({\bf A}\). \[{\bf A} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ \end{bmatrix} \quad {\bf A}' =\begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6\end{bmatrix}\] \[{\bf X}_{n \times 2} = \begin{bmatrix} 1 & x_1 \\ 1 & x_{2} \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} = \begin{bmatrix} {\bf 1} & {\bf x} \end{bmatrix}\] \[{\bf X}'_{2 \times n} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix} = \begin{bmatrix} {\bf 1}' \\ {\bf x}' \end{bmatrix}\]

Transpose in R

A
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
t(A)
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6

Linear Independence

  • Vectors \({\bf v}_1, {\bf v}_2, \dots, {\bf v}_k\) are said to be linearly independent if, for scalars \(c_{1},c_{2},\dots,c_{k} \in \mathbf{R},\) \[c_{1}{\bf v}_1 + c_{2}{\bf v}_2 + \cdots + c_{k}{\bf v}_k = \bf 0\] can only be satisfied by \(c_i = 0\) for all \(i = 1, 2, \dots, k\).

Are \({\bf v}_1 = (1, 1)'\) and \({\bf v}_2 = (-3, 2)'\) linearly independent?

Ways to check linear independence:

  • solve the homogeneous linear system \(\begin{bmatrix} 1 & -3 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \end{bmatrix}=\begin{bmatrix} 0 \\ 0 \end{bmatrix}\) for \(c_1\) and \(c_2\).

  • check if \(\text{det} \left( \begin{bmatrix} 1 & -3 \\ 1 & 2 \end{bmatrix} \right)\) is non-zero.

Linear Independence

V <- matrix(c(1, 1, -3, 2), 2, 2)
solve(V, b = rep(0, 2))
[1] 0 0
det(V)
[1] 5

Rank

  • The rank of a matrix \({\bf A} = \begin{bmatrix} {\bf a}_1 & {\bf a}_2 & \cdots & {\bf a}_m \end{bmatrix}\) is the number of linearly independent columns (dimension of the column space).

  • If \(k\) of the \(m\) column vectors \({\bf a}_1, {\bf a}_2, \dots, {\bf a}_m\) are linearly independent, the rank of \({\bf A}\) is \(k\).

  • The remaining \(m-k\) columns of \({\bf A}\) can be written as a linear combination of the \(k\) linearly independent columns.

Rank

(M <- matrix(1:9, nrow = 3, ncol = 3))
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
# install.packages("Matrix")
library(Matrix)
rankMatrix(M)[1]
[1] 2

Can you see why the rank is 2, meaning that one column can be written as a linear combination of the other two?

  • \({\bf a}_3 = 2{\bf a_2}-{\bf a}_1\)
2 * M[, 2] - 1 * M[, 1]
[1] 7 8 9

Operations: Addition

  • Addition: \({\bf A + B}\) adds the corresponding elements, \(a_{ij} + b_{ij}\).

  • \({\bf A}\) and \({\bf B}\) must have an equal number of rows and columns.

A
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
(B <- matrix(6:1, nrow = 2, ncol = 3))
     [,1] [,2] [,3]
[1,]    6    4    2
[2,]    5    3    1
A + B
     [,1] [,2] [,3]
[1,]    7    7    7
[2,]    7    7    7
(B <- matrix(1:4, nrow = 2, ncol = 2))
     [,1] [,2]
[1,]    1    3
[2,]    2    4
A + B
Error in A + B: non-conformable arrays

Operations: Multiplication

  • Multiplication: The product of matrices \({\bf A}\) and \({\bf B}\) is denoted as \({\bf AB}\).
  • The number of columns in \({\bf A}\) must equal the number of rows in \({\bf B}\).
  • The resulting matrix \({\bf C} = {\bf A}{\bf B}\) has the number of rows of \({\bf A}\) and the number of columns of \({\bf B}\).

Operations: Multiplication

A
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
(B <- matrix(1:12, nrow = 3, ncol = 4))
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
(C <- A %*% B)
     [,1] [,2] [,3] [,4]
[1,]   22   49   76  103
[2,]   28   64  100  136
(B <- matrix(1:8, nrow = 2, ncol = 4))
     [,1] [,2] [,3] [,4]
[1,]    1    3    5    7
[2,]    2    4    6    8
A %*% B
Error in A %*% B: non-conformable arguments

Operations: Multiplication

  • \({\bf A}{\bf B} \ne {\bf B}{\bf A}\) in general.

  • \(({\bf A}{\bf B})' = {\bf B}'{\bf A}'\).


(C <- matrix(1:4, 2, 2))
     [,1] [,2]
[1,]    1    3
[2,]    2    4
(D <- matrix(2:5, 2, 2))
     [,1] [,2]
[1,]    2    4
[2,]    3    5
C %*% D
     [,1] [,2]
[1,]   11   19
[2,]   16   28
D %*% C
     [,1] [,2]
[1,]   10   22
[2,]   13   29
t(C %*% D)
     [,1] [,2]
[1,]   11   16
[2,]   19   28
t(D) %*% t(C)
     [,1] [,2]
[1,]   11   16
[2,]   19   28

Special Matrices: Symmetric and Identity

  • A square matrix \((m = n)\) \({\bf A}_{n \times n}\) is a symmetric matrix if \({\bf A = A}'\).

\[{\bf A} = \begin{bmatrix} 1 & 2 \\ 2 & 5 \\ \end{bmatrix} \quad {\bf A}' =\begin{bmatrix} 1 & 2 \\ 2 & 5 \end{bmatrix}\]

  • A square matrix \({\bf I}_{n \times n}\) whose diagonal elements are 1’s and off-diagonal elements are 0’s is called an identity matrix of order \(n\). \[{\bf I} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}\]

Identity and Diagonal Matrix

## Identity matrix
I <- diag(5)
I
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    0
[2,]    0    1    0    0    0
[3,]    0    0    1    0    0
[4,]    0    0    0    1    0
[5,]    0    0    0    0    1
## A diagonal matrix
diag(c(4, 2, 5))
     [,1] [,2] [,3]
[1,]    4    0    0
[2,]    0    2    0
[3,]    0    0    5
## Extract diagonal elements
D <- matrix(1:4, 2, 2)
D
     [,1] [,2]
[1,]    1    3
[2,]    2    4
diag(D)
[1] 1 4

Inverse Matrix

  • The inverse of a square matrix \({\bf A}\), denoted by \({\bf A}^{-1}\), is a square matrix such that \[{\bf A}^{-1}{\bf A} = {\bf A}{\bf A}^{-1} = {\bf I}\]
D
     [,1] [,2]
[1,]    1    3
[2,]    2    4
D_inv <- solve(D)
D_inv
     [,1] [,2]
[1,]   -2  1.5
[2,]    1 -0.5
D_inv %*% D
     [,1] [,2]
[1,]    1    0
[2,]    0    1
D %*% D_inv
     [,1] [,2]
[1,]    1    0
[2,]    0    1

Idempotent, Orthogonal, Positive Definite

  • A square matrix \({\bf A}\) is idempotent if \({\bf A}{\bf A} = {\bf A}\).
  • A square matrix \({\bf A}\) is orthogonal if \({\bf A}^{-1} = {\bf A}'\) and hence \({\bf A}'{\bf A} = {\bf I}\).
  • Let \({\bf A}\) be an \(n \times n\) matrix and \({\bf y}\) be an \(n \times 1\) vector. \[{\bf y}'{\bf A} {\bf y} = \sum_{i=1}^n\sum_{j=1}^na_{ij}y_iy_j\] is called a quadratic form of \({\bf y}\).
  • A symmetric matrix \({\bf A}\) is said to be
    • positive definite if \({\bf y}'{\bf A} {\bf y} > 0\) for all \({\bf y \ne 0}\).
    • positive semi-definite if \({\bf y}'{\bf A} {\bf y} \ge 0\) for all \({\bf y \ne 0}\) and \({\bf y}'{\bf A} {\bf y} = 0\) for some \({\bf y \ne 0}\).
  • If \({\bf A}\) is symmetric and idempotent, then \({\bf A}\) is positive semi-definite.
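A handy numerical check of these definitions (a sketch; `S` below is a made-up symmetric matrix, not from the slides): a symmetric matrix is positive definite exactly when all of its eigenvalues are positive.

```r
## Positive definiteness via eigenvalues:
## a symmetric matrix is positive definite iff all eigenvalues > 0
S <- matrix(c(2, 1, 1, 2), 2, 2)  ## made-up symmetric example
all(eigen(S, symmetric = TRUE)$values > 0)  ## TRUE
## the quadratic form y'Sy is positive for a non-zero y
y <- c(1, -1)
drop(t(y) %*% S %*% y)  ## 2
```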

Special Matrices: Idempotent, Orthogonal

## Idempotent and symmetric matrix
(A <- matrix(1, 3, 3)/3)
     [,1] [,2] [,3]
[1,] 0.33 0.33 0.33
[2,] 0.33 0.33 0.33
[3,] 0.33 0.33 0.33
A %*% A
     [,1] [,2] [,3]
[1,] 0.33 0.33 0.33
[2,] 0.33 0.33 0.33
[3,] 0.33 0.33 0.33
## positive semi-definite
y <- c(-2, 5, 2)
t(y) %*% A %*% y
     [,1]
[1,]  8.3
## Orthogonal matrix
(E <- matrix(c(1, -1, 1, 1)/sqrt(2), 2, 2))
      [,1] [,2]
[1,]  0.71 0.71
[2,] -0.71 0.71
E %*% t(E)
        [,1]    [,2]
[1,] 1.0e+00 2.2e-17
[2,] 2.2e-17 1.0e+00
t(E) %*% E
         [,1]     [,2]
[1,]  1.0e+00 -2.2e-17
[2,] -2.2e-17  1.0e+00

Trace

  • The trace of \({\bf A}_{n \times n}\), denoted by \(\text{tr}({\bf A})\), is defined as \[\text{tr}({\bf A}) = a_{11} + a_{22} + \dots + a_{nn}\]

  • \(\text{tr}({\bf A}{\bf B}) = \text{tr}({\bf B}{\bf A})\)

(G <- matrix(1:9, 3, 3))
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
sum(diag(G))
[1] 15
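The cyclic property \(\text{tr}({\bf A}{\bf B}) = \text{tr}({\bf B}{\bf A})\) can be verified numerically (a sketch; `H` is an arbitrary made-up matrix):

```r
## tr(GH) = tr(HG), even though GH != HG in general
G <- matrix(1:9, 3, 3)  ## same G as above
H <- matrix(9:1, 3, 3)  ## arbitrary 3 x 3 example
sum(diag(G %*% H))  ## 189
sum(diag(H %*% G))  ## 189
```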

Eigenvalues and Eigenvectors

  • For a square matrix \({\bf A}\), if we can find a scalar \(\lambda\) and a non-zero vector \({\bf x}\) such that \[{\bf Ax} = \lambda {\bf x},\]
    • \(\lambda\) is an eigenvalue of \({\bf A}\)
    • \({\bf x}\) is a corresponding eigenvector of \({\bf A}\).
  • An \(n \times n\) matrix \({\bf A}\) has \(n\) eigenvalues and \(n\) corresponding eigenvectors.

https://gfycat.com/ifr/WickedSaltyAnemonecrab (3Blue1Brown)

Eigen-decomposition

  • Eigen-decomposition: Any symmetric matrix \({\bf A}_{n \times n}\) can be decomposed as \[{\bf A} = {\bf V\boldsymbol \Lambda V'}\]

  • \(\boldsymbol \Lambda\) is an \(n \times n\) diagonal matrix whose elements are the eigenvalues \(\lambda_j\) of \({\bf A}\) \[\boldsymbol \Lambda= \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}\]

  • \({\bf V} = [{\bf v}_1 \quad {\bf v}_2 \quad \dots \quad {\bf v}_n]\) is an \(n \times n\) orthogonal matrix whose columns are the eigenvectors of \({\bf A}.\)
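Because \({\bf V}\) is orthogonal, \({\bf V}'{\bf V} = {\bf I}\). A quick sanity check in R (a sketch, using the same symmetric matrix as the later example):

```r
## Columns of V (eigenvectors of a symmetric matrix) are orthonormal
A <- matrix(c(3, 1, 1, 2), 2, 2)
V <- eigen(A, symmetric = TRUE)$vectors
round(t(V) %*% V, 10)  ## 2 x 2 identity, up to floating-point error
```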

Eigenvalues and Eigenvectors

  • \(\text{tr}({\bf A}) = \sum_{i=1}^n\lambda_i\)
  • \(|{\bf A}| = \prod_{i=1}^n \lambda_i\)
  • \(\text{rank}({\bf A}) =\) the number of non-zero \(\lambda_i\) (when \({\bf A}\) is diagonalizable, e.g., symmetric)
  • If \({\bf A}\) is symmetric, all \(\lambda_i \in \mathbf{R}\)

Eigenvalues and Eigenvectors

(A <- matrix(c(3, 1, 1, 2), 2, 2))
     [,1] [,2]
[1,]    3    1
[2,]    1    2
eigen_decomp <- eigen(A)
(lam <- eigen_decomp$values)
[1] 3.6 1.4
(V <- eigen_decomp$vectors)
      [,1]  [,2]
[1,] -0.85  0.53
[2,] -0.53 -0.85
V %*% diag(lam) %*% t(V)
     [,1] [,2]
[1,]    3    1
[2,]    1    2
sum(diag(A))
[1] 5
sum(lam)
[1] 5
det(A)
[1] 5
prod(lam)
[1] 5
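The rank property can be checked the same way; a sketch with a made-up rank-1 symmetric matrix:

```r
## rank equals the number of non-zero eigenvalues (symmetric case)
S <- tcrossprod(c(1, 2, 3))  ## 3 x 3 symmetric matrix x x', rank 1
lam <- eigen(S, symmetric = TRUE)$values
sum(abs(lam) > 1e-8)  ## 1
```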