About
Solving closest point in the span of many vectors
Goal:
- An algorithm that, given a vector b and vectors v1, . . . , vn, finds the vector in Span {v1, . . . , vn} that is closest to b.
Special case: We can use the algorithm to determine whether b lies in Span {v1, . . . , vn}: If the vector in Span {v1, . . . , vn} closest to b is b itself then clearly b is in the span; if not, then b is not in the span.
Testing if b is in Span {v1, . . . , vn} is equivalent to testing if the equation Ax = b has a solution.
More generally: Even if Ax = b has no solution, we can use the algorithm to find the point in <math>\{Ax : x \in \mathbb{R}\}</math> closest to b.
Computation Problem
Goal: find the coefficients x1, . . . , xn so that x1 v1 + · · · + xn vn is the vector in Span {v1, . . . , vn} closest to b.
Two methods:
Norm Minimization
Equivalent: Find the vector x that minimizes:
- <math> \| b - A.x \|</math>
- <math> {\| b - A.x \|}^2</math>
Given a matrix equation Ax = b that might have no solution, find the best solution available in the sense that the norm of the error b - Ax is as small as possible.
Two algo:
- one based on Gaussian elimination.
- one based on orthogonality that is much faster and more reliable than gradient descent
Least square
Find the vector x that minimizes:
- <math> \| b - A.x \|</math>
- <math> {\| b - A.x \|}^2</math>
- <math> {\| \begin{bmatrix}b\end{bmatrix} - \begin{bmatrix} \begin{array}{r|r|r} && \\ v_1 & \dots & v_n \\ && \end{array} \end{bmatrix}. \begin{bmatrix}x\end{bmatrix} ||}^2</math>
- <math> {|| \begin{bmatrix}b\end{bmatrix} - \begin{bmatrix} \begin{array}{r} a_1 \\ \hline \\ \dots \\ \hline \\ a_m \\ \end{array} \end{bmatrix}.\begin{bmatrix}x\end{bmatrix} \|}^2</math>
- <math>(b[1]-a_1.x)^2 + \dots + (b[m]-a_m.x)^2</math>
This problem is called least squares (”methode des moindres carres”, due to Adrien-Marie Legendre but often attributed to Gauss).
- This problem can be addressed using gradient descent.
Projection
High-dimensional projection onto/orthogonal to
For a vector b and a vector space V, we define the projection of b onto V (written <math>b^{||V}</math> ) and the projection of b orthogonal to V (written <math>b^{\perp V}</math> ) so that <MATH> b = b^{||V} + b^{\perp V} </MATH> where:
- the projection onto V: <math>b^{||V}</math> is in V, and
- the projection orthogonal to V: <math>b^{\perp V}</math> is orthogonal to every vector in V.
Lemma
High-Dimensional Lemma: The point in vector space V closest to b is <math>b^{||V}</math> and the distance is the norm <math>\|b^{\perp V}\|</math> .
Computation
Projection orthogonal
The below function will find the projection of b orthogonal to the space spanned by mutually orthogonal vectors.
# input: a vector b, and a list vlist of mutually orthogonal vectors
# output: the projection orthogonal of b (\perp) to the vectors in vlist
def project_orthogonal(b, vlist):
for v in vlist:
b = b - project_along(b, v)
return b
where:
- vlist consists of mutually orthogonal vectors: the <math>i^{th}</math> vector in the list is orthogonal to the <math>j^{th}</math> vector in the list for every <math>i \neq j</math> .
- project_along is the one-dimensional projection along the line
The projection of a vector b orthogonal to a vector space is unique, so in principle the order of vectors in vlist doesn’t affect the output of project orthogonal(b, vlist).
The below function will return the <math>\sigma</math> of all generators.
def aug_project_orthogonal(b, vlist):
alphadict = {len(vlist):1}
for i in range(len(vlist)):
v = vlist[i]
sigma = (b*v)/(v*v) if v*v > 0 else 0
alphadict[i] = sigma
b = b - sigma*v
return (b, alphadict)
Solution
- Apply orthogonalize to <math>{v_1} , \dots , {v_n}</math> and obtain a span of mutually orthogonal vector <math>v_1^* , \dots , v_n^*</math> . Then <math>V = Span\{v_1^* , \dots , v_n^*\}</math>
- Call project_orthogonal(b, [<math>v_1^* , \dots , v_n^*</math> ]) and obtain <math>b^{\perp}</math> as the result.
Exactly the same computations take place when orthogonalize is applied. Ie applying orthogonalize to the list of vector <math>[{v_1} , \dots , {v_n}, b]</math> will return <math>[v_1^* , \dots , v_n^*, b^{\perp}]</math>