Multiplication $Ax$ using the columns of $A$:
$$ \begin{bmatrix} 2 & 3 \\ 2 & 4 \\ 3 & 7 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x_1 \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix} + x_2 \begin{bmatrix} 3 \\ 4 \\ 7 \end{bmatrix} $$
Multiplication by dot products with the rows is the low-level view, used for computing. The higher-level view, better for understanding, uses whole column vectors.
Thus $Ax$ is a linear combination of the columns of $A$. This is fundamental.
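The two views of $Ax$ can be checked numerically. A minimal sketch with NumPy, using the $3 \times 2$ matrix above and an arbitrary illustrative choice $x_1 = 1$, $x_2 = 2$:

```python
import numpy as np

A = np.array([[2, 3],
              [2, 4],
              [3, 7]])
x = np.array([1, 2])  # illustrative choice: x1 = 1, x2 = 2

# Low-level view: dot products of the rows of A with x.
by_rows = A @ x

# Higher-level view: x1 * (column 1) + x2 * (column 2).
by_columns = x[0] * A[:, 0] + x[1] * A[:, 1]

assert np.array_equal(by_rows, by_columns)
```

Both views give the same vector, but the column view makes it obvious that the result lies in the column space of $A$.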
The combinations of the columns fill out the column space of $A$. In the example above, the column space is a plane in $\mathbb{R}^3$.
The rank of a matrix is the dimension of its column space.
$$\mathbf{A} = \mathbf{CR}$$
$\mathbf{C}$ contains the first $r$ independent columns of $\mathbf{A}$ (a basis for its column space), and $\mathbf{R}$ holds the $r$ nonzero rows of the row-reduced echelon form of $\mathbf{A}$.
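A small sketch of $A = CR$, using a hypothetical rank-2 example where column 3 equals column 1 plus column 2, so $C$ keeps only the first two (independent) columns:

```python
import numpy as np

# Hypothetical example: column 3 = column 1 + column 2, so rank r = 2.
A = np.array([[1, 2, 3],
              [2, 2, 4],
              [3, 1, 4]])

C = A[:, :2]              # independent columns of A (a column-space basis)
R = np.array([[1, 0, 1],  # nonzero rows of rref(A): each column of A
              [0, 1, 1]]) # is this combination of the columns of C

assert np.array_equal(C @ R, A)
```

The columns of $R$ record how each column of $A$ is built from the basis columns in $C$.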
The big factorization for data science is the "SVD" of $A$, where the first factor $C$ has $r$ orthogonal columns and the second factor $R$ has $r$ orthogonal rows.
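In NumPy terms, `np.linalg.svd` returns $A = U \Sigma V^T$; grouping $U\Sigma$ as the first factor and $V^T$ as the second gives the orthogonal-columns/orthogonal-rows shape described above. A sketch using the $3 \times 2$ example matrix:

```python
import numpy as np

A = np.array([[2., 3.],
              [2., 4.],
              [3., 7.]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# U has orthonormal columns, Vt has orthonormal rows,
# and the product reconstructs A exactly.
assert np.allclose(U.T @ U, np.eye(2))
assert np.allclose(Vt @ Vt.T, np.eye(2))
assert np.allclose(U @ np.diag(s) @ Vt, A)
```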
Two ways to arrive at matrix-matrix multiplication $AB$: rows of $A$ times columns of $B$ (dot products), and columns of $A$ times rows of $B$ (a sum of outer products).
Both involve $mnp$ multiplications when $A$ is $m \times n$ and $B$ is $n \times p$.
The outer-product view helps us look for the important part of a matrix $A$. We do not usually want the biggest number in $A$; we want the largest piece of $A$. Those pieces are rank-one matrices.
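The two views of $AB$ can be compared directly. A sketch with arbitrary random matrices (shapes chosen only for illustration), computing the product entry by entry with dot products and again as a sum of rank-one outer products:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))  # m x n
B = rng.standard_normal((4, 2))  # n x p

# Inner-product view: entry (i, j) is (row i of A) dot (column j of B).
by_dots = np.array([[A[i, :] @ B[:, j] for j in range(2)]
                    for i in range(3)])

# Outer-product view: AB is the sum of n rank-one matrices,
# (column k of A) times (row k of B).
by_outers = sum(np.outer(A[:, k], B[k, :]) for k in range(4))

assert np.allclose(by_dots, by_outers)
assert np.allclose(by_dots, A @ B)
```

Each of the $n$ outer products is a rank-one matrix; the outer-product sum is what makes "largest piece of $A$" a meaningful idea.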
Every $m \times n$ matrix leads to four subspaces - two subspaces of $\mathbb{R}^m$ and two more of $\mathbb{R}^n$.
$r$ independent equations in $Ax = 0$ leave $n - r$ independent solutions.