
Section 4.1 General Orthogonal Expansion

The notion of mean square convergence makes sense in a general inner product space.

Definition 4.1.1. Inner Product Space.

A vector space \(V\) over the reals \(\bbR\) is called an inner product space if there is a function that assigns to each pair \((x, y)\in V\times V\) a real number \((x, y)\in \bbR\) such that
  1. \((a_{1}x_{1}+a_{2}x_{2},y)=a_{1}(x_{1},y)+a_{2}(x_{2}, y)\) for any \(x_{1}, x_{2}, y\in V\) and any \(a_{1}, a_{2}\in \bbR\text{;}\)
  2. \((x, y)=(y, x)\) for any \(x, y\in V\text{;}\)
  3. \((x, x) \ge 0\) for any \(x\in V\) and equals \(0\) iff \(x=0\text{.}\)
Note that the first two properties imply
\begin{equation*} (x, a_{1}y_{1}+a_{2}y_{2})= a_{1}(x, y_{1})+a_{2}(x, y_{2}) \text{ for any $x, y_{1}, y_{2}\in V$.} \end{equation*}
Due to this and the first property, we say an inner product on a vector space over the reals is bilinear.
The space of real-valued continuous functions on a finite interval \([a, b]\text{,}\) \(\cC[a, b]\text{,}\) has a natural inner product: \((f, g) :=\int_{a}^{b} f(x)g(x)\, dx\text{.}\)

Definition 4.1.2. Orthogonal Relation.

Two vectors \(x, y\) in an inner product space \(V\) are said to be orthogonal to each other if \((x, y)=0\text{.}\)
Note that, since \((y, x)=(x, y)\text{,}\) it follows that if \((x, y)=0\text{,}\) then \((y, x)=0\text{.}\) So the orthogonal relation is symmetric in \(x\) and \(y\text{.}\)
In the context of \(\cC[a, b]\text{,}\) two functions \(f, g\in \cC[a, b]\) are orthogonal in \(\cC[a, b]\) if \(\int_{a}^{b} f(x)g(x)\, dx=0\text{.}\) Note that it is important to specify the interval of integration. The orthogonality relation stated earlier says that, when \(n\ne m\text{,}\) the functions \(\sin \left(\frac{n \pi x}{l}\right), \sin \left(\frac{m \pi x}{l}\right)\) are orthogonal on \([0, l]\text{.}\) But these two functions may not be orthogonal on a different interval such as \([0, l/2]\text{.}\)
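As a numerical sanity check of this last point (not a proof), one can approximate both integrals with a simple midpoint rule; the choice of \(l\) below and the quadrature rule itself are illustrative assumptions.

```python
import math

def integrate(f, a, b, n=20_000):
    # midpoint-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

l = 2.0  # illustrative choice of l
sin2 = lambda x: math.sin(2 * math.pi * x / l)
sin3 = lambda x: math.sin(3 * math.pi * x / l)

# (sin(2 pi x / l), sin(3 pi x / l)) on [0, l]: should vanish
ip_full = integrate(lambda x: sin2(x) * sin3(x), 0.0, l)
# the same pair on [0, l/2]: in general nonzero
ip_half = integrate(lambda x: sin2(x) * sin3(x), 0.0, l / 2)
print(ip_full, ip_half)
```

The first value is numerically zero, while the second is not, illustrating that orthogonality depends on the interval of integration.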

Theorem 4.1.3. Cauchy-Schwarz Inequality, Triangle Inequality, and the Pythagorean Relation.

Let \(V\) be an inner product space, and define the norm \(\Vert x \Vert :=\sqrt{(x, x)}\) for \(x\in V\text{.}\) Then for any \(x, y\in V\text{:}\)
  1. \((x, y)^{2}\le (x, x)(y, y)\) (the Cauchy-Schwarz inequality);
  2. \(\Vert x+y \Vert \le \Vert x \Vert + \Vert y \Vert\) (the triangle inequality);
  3. if \((x, y)=0\text{,}\) then \(\Vert x+y \Vert^{2}=\Vert x \Vert^{2}+\Vert y \Vert^{2}\) (the Pythagorean relation).

Proof.

For any \(x, y\in V\) with \(y\ne 0\text{,}\) the function \(t\mapsto (x+t y, x+t y)\) is, by the bilinearity of the inner product, a quadratic function of \(t\text{,}\) and it is nonnegative. Its minimum is attained at \(t=- (x, y)/(y, y)\text{.}\) Evaluating \((x+t y, x+t y)\) at this \(t=- (x, y)/(y, y)\) gives
\begin{equation*} - \frac{(x, y)^{2}}{(y, y)}+ (x, x), \text{ which is $\ge 0$.} \end{equation*}
This proves the Cauchy-Schwarz inequality when \(y\ne 0\text{.}\) But the case of \(y=0\) is trivial.
The triangle inequality follows from the Cauchy-Schwarz inequality by
\begin{align*} \Vert x +y \Vert^{2} \amp =(x+y, x+y)=(x, x)+2(x, y)+(y, y) \\ \amp \le (x, x)+2\Vert x \Vert \Vert y \Vert +(y, y) =\left(\Vert x \Vert + \Vert y \Vert \right)^{2}. \end{align*}
The Pythagorean relation clearly follows from the above line of proof when \((x, y)=0\text{.}\)
In the context of \(\cC[a, b]\text{,}\) the Cauchy-Schwarz inequality takes the form of
\begin{equation*} \vert \int_{a}^{b} f(x) g(x)\, dx \vert \le \sqrt{ \int_{a}^{b} |f(x)|^{2}\, dx} \sqrt{ \int_{a}^{b} |g(x)|^{2}\, dx} \end{equation*}
for \(f, g\in \cC[a, b]\text{,}\) and the triangle inequality takes the form of
\begin{equation*} \sqrt{ \int_{a}^{b} |f(x)+g(x)|^{2}\, dx} \le \sqrt{ \int_{a}^{b} |f(x)|^{2}\, dx} + \sqrt{ \int_{a}^{b} |g(x)|^{2}\, dx}. \end{equation*}
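Both inequalities can be checked numerically for specific functions; the choices \(f(x)=x\text{,}\) \(g(x)=e^{x}\) on \([0, 1]\) below are illustrative assumptions, and the midpoint rule only approximates the integrals.

```python
import math

def integrate(f, a, b, n=20_000):
    # midpoint-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a0, b0 = 0.0, 1.0          # illustrative interval [a, b]
f = lambda x: x            # illustrative f
g = lambda x: math.exp(x)  # illustrative g

lhs_cs = abs(integrate(lambda x: f(x) * g(x), a0, b0))
norm_f = math.sqrt(integrate(lambda x: f(x) ** 2, a0, b0))
norm_g = math.sqrt(integrate(lambda x: g(x) ** 2, a0, b0))
norm_fg = math.sqrt(integrate(lambda x: (f(x) + g(x)) ** 2, a0, b0))
print(lhs_cs, norm_f * norm_g)   # Cauchy-Schwarz: first <= second
print(norm_fg, norm_f + norm_g)  # triangle: first <= second
```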

Exercise 4.1.4. A Set of Functions Orthogonal on \([0, l]\).

Verify that the functions in the set \(\left\{ \sin \left(\frac{n \pi x}{l}\right): n\in\mathbb N\right\}\) are mutually orthogonal on \([0, l]\) and that
\begin{equation*} \Vert \sin \left(\frac{n \pi x}{l}\right)\Vert = \sqrt{\frac l2} \text{ for $n\in\mathbb N$.} \end{equation*}
Then show that
\begin{equation*} \int_{0}^{l} \vert \sum_{n=1}^{N}b_{n} \sin \left(\frac{n \pi x}{l}\right)\vert^{2} \, dx = \frac l2 \left( \sum_{n=1}^{N} |b_{n}|^{2} \right). \end{equation*}
Hint.
Use the relation \(2\sin A \sin B=\cos(A-B)-\cos(A+B)\text{.}\)
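The claimed norm and the identity can also be spot-checked numerically (this is a sanity check, not a solution of the exercise); the values of \(l\) and of the coefficients \(b_{n}\) below are illustrative.

```python
import math

def integrate(f, a, b, n=20_000):
    # midpoint-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

l = 3.0                    # illustrative choice of l
b_coef = [0.7, -1.2, 0.4]  # illustrative coefficients b_1, b_2, b_3

def partial_sum(x):
    # sum of b_n * sin(n*pi*x/l) for n = 1, 2, 3
    return sum(bn * math.sin((n + 1) * math.pi * x / l)
               for n, bn in enumerate(b_coef))

norm1 = math.sqrt(integrate(lambda x: math.sin(math.pi * x / l) ** 2, 0.0, l))
lhs = integrate(lambda x: partial_sum(x) ** 2, 0.0, l)
rhs = (l / 2) * sum(bn ** 2 for bn in b_coef)
print(norm1, lhs, rhs)  # norm1 should be sqrt(l/2); lhs should equal rhs
```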

Exercise 4.1.5. A Set of Functions Orthogonal on \([-l, l]\).

Verify that the functions in the set \(\left\{ 1, \cos \left(\frac{n \pi x}{l}\right), \sin \left(\frac{n \pi x}{l}\right): n\in\mathbb N\right\}\) are mutually orthogonal on \([-l, l]\) and that
\begin{equation*} \Vert 1 \Vert =\sqrt{2l}, \Vert \cos \left(\frac{n \pi x}{l}\right) \Vert =\Vert \sin \left(\frac{n \pi x}{l}\right)\Vert = \sqrt{l} \text{ for $n\in\mathbb N$.} \end{equation*}
Then show that
\begin{align*} \amp \int_{-l}^{l} \vert a_{0}+\sum_{n=1}^{N}\left(a_{n} \cos \left(\frac{n \pi x}{l}\right) + b_{n} \sin \left(\frac{n \pi x}{l}\right)\right) \vert^{2} \, dx\\ = \amp l\left( 2 |a_{0}|^{2} + \sum_{n=1}^{N}\left(|a_{n}|^{2} + |b_{n}|^{2}\right) \right). \end{align*}
Hint.
Use the relations \(2\sin A \sin B=\cos(A-B)-\cos(A+B)\text{,}\) \(2 \cos A \cos B= \cos(A-B)+\cos(A+B)\text{,}\) and \(2\sin A \cos B=\sin(A+B)+\sin(A-B)\text{.}\)
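The claimed norms on \([-l, l]\) can likewise be checked numerically; the value of \(l\) and the mode number \(n=2\) below are illustrative.

```python
import math

def integrate(f, a, b, n=20_000):
    # midpoint-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

l = 1.5  # illustrative choice of l
norm_one = math.sqrt(integrate(lambda x: 1.0, -l, l))
norm_cos = math.sqrt(integrate(lambda x: math.cos(2 * math.pi * x / l) ** 2, -l, l))
norm_sin = math.sqrt(integrate(lambda x: math.sin(2 * math.pi * x / l) ** 2, -l, l))
print(norm_one, norm_cos, norm_sin)  # expect sqrt(2l), sqrt(l), sqrt(l)
```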
It is often necessary and productive to work with spaces of complex valued functions, which should be regarded as vector spaces over \(\bbC\text{.}\) The notion of an inner product can be extended to a vector space over \(\bbC\text{,}\) with some modification.

Definition 4.1.6. Hermitian Inner Product Space.

A vector space \(V\) over \(\bbC\) is called a Hermitian inner product space if there is a function that assigns to each pair \((x, y)\in V\times V\) a complex number \((x, y)\in \bbC\) such that
  1. \((a_{1}x_{1}+a_{2}x_{2},y)=a_{1}(x_{1},y)+a_{2}(x_{2}, y)\) for any \(x_{1}, x_{2}, y\in V\) and any \(a_{1}, a_{2}\in \bbC\text{;}\)
  2. \((x, y)=\overline{(y, x)}\) for any \(x, y\in V\text{;}\)
  3. \((x, x) \) is a nonnegative real number for any \(x\in V\) and equals \(0\) iff \(x=0\text{.}\)
Two vectors \(x, y\in V\) are (Hermitian) orthogonal if \((x, y)=0\text{.}\)
Note that the first two properties imply
\begin{equation*} (x, a_{1}y_{1}+a_{2}y_{2})=\overline{(a_{1}y_{1}+a_{2}y_{2}, x)}= \overline{a_{1}}(x, y_{1})+\overline{a_{2}}(x, y_{2}) \text{ for any $x, y_{1}, y_{2}\in V$.} \end{equation*}
Note that a Hermitian inner product on a vector space is not bilinear in both variables; it is linear in the first variable, but complex conjugate linear in the second variable.
The Cauchy-Schwarz and triangle inequalities and the notion of norm induced by the inner product extend readily to a Hermitian inner product space.
To distinguish between a Hermitian inner product and an inner product introduced earlier on a vector space over the reals, we will refer to the latter as a Euclidean inner product.
For complex valued functions \(f, g\) in \(\cC[a, b]\text{,}\) a natural Hermitian inner product is \((f, g)=\int_{a}^{b} f(x)\overline{g(x)}\,dx\text{.}\) This is consistent with the inner product introduced earlier on \(\cC[a, b]\) when \(f, g\) are real valued.

Exercise 4.1.7. The Orthogonal Family \(\{ e^{i\frac{n\pi x}{l}}: n\in \bbZ\}\) on \([-l, l]\).

Verify that the functions in the set \(\{ e^{i\frac{n\pi x}{l}}: n\in \bbZ\}\) are mutually orthogonal on \([-l, l]\) and that
\begin{equation*} \Vert e^{i\frac{n\pi x}{l}} \Vert= \sqrt{2l} \text{ for all $n\in \bbZ$.} \end{equation*}
Then show that
\begin{equation*} \int_{-l}^{l} \vert \sum_{n=-N}^{N}c_{n} e^{i\frac{n\pi x}{l}}\vert^{2}\, dx = 2l\left( \sum_{n=-N}^{N} |c_{n}|^{2}\right). \end{equation*}
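The orthogonality, the norms, and the identity for this family can be spot-checked numerically with the Hermitian inner product \((f, g)=\int_{-l}^{l} f(x)\overline{g(x)}\, dx\text{;}\) the parameters and coefficients below are illustrative.

```python
import cmath
import math

def cintegrate(f, a, b, n=20_000):
    # midpoint-rule approximation for a (possibly complex-valued) integrand
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

l = 1.0  # illustrative choice of l

def e(n):
    return lambda x: cmath.exp(1j * n * math.pi * x / l)

def ip(f, g):
    # Hermitian inner product (f, g) = integral of f * conj(g) over [-l, l]
    return cintegrate(lambda x: f(x) * g(x).conjugate(), -l, l)

orth = ip(e(2), e(-3))        # distinct indices: should vanish
norm_sq = ip(e(4), e(4))      # should equal 2*l
c = [0.5 - 0.25j, 1j, -0.75]  # illustrative c_{-1}, c_0, c_1

def p(x):
    return sum(cn * e(k - 1)(x) for k, cn in enumerate(c))

lhs = cintegrate(lambda x: abs(p(x)) ** 2, -l, l)
rhs = 2 * l * sum(abs(cn) ** 2 for cn in c)
print(abs(orth), norm_sq.real, lhs, rhs)
```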

Remark 4.1.8.

Two modifications are made in the defining properties of a Hermitian inner product: (i) allowing \((x, y)\) to take complex values, and (ii) replacing the symmetry property by the complex conjugate symmetry property \((x, y)=\overline{(y, x)}\text{.}\)
These are based on the following considerations. (a) It is preferable to keep some complex linearity for an inner product on a vector space over \(\bbC\text{,}\) such as given in the first property of an inner product, and this makes it necessary to allow \((x, y)\) to take complex values. (b) We would still like to use \(\sqrt{(x, x)}\) as a norm for a vector, so we need \((x, x)\) to be a nonnegative real number for any \(x\in V\text{.}\)
Let \(G(x, y), S(x, y)\) denote, respectively, the real and imaginary parts of \((x, y)\text{:}\)
\begin{equation*} (x, y) =G(x, y)+ i S(x, y), \quad x, y \in V. \end{equation*}
Then we need \(S(x, x)=0\) for all \(x\in V\text{.}\) This property, coupled with linearity over \(\bbR\text{,}\) implies that \(S(x, y)\) must be antisymmetric in \(x, y\text{.}\) In fact, it makes sense to require \(G(x, y)\) to be an inner product on \(V\) treating \(V\) as a vector space over \(\bbR\text{.}\) Thus we want \(G(x, y)\) to be symmetric in \(x, y\text{.}\) An additional desired property is that multiplication by \(i\) on both \(x\) and \(y\) should preserve the inner product:
\begin{equation*} (ix, iy)=(x, y) \text{ for all $x, y\in V$.} \end{equation*}
It turns out that these desired properties are encoded in, and are in fact equivalent to, the defining properties of a Hermitian inner product.

Exercise 4.1.9. Hermitian and Euclidean Inner Product.

Use the set up of \(G(x, y)+iS(x, y)=(x, y)\) for a Hermitian inner product. Verify that
  1. \(G(x, y)=G(y, x), S(x,y)=-S(y,x )\) for all \(x, y\in V\text{.}\)
  2. \(G(i x, i y)=G(x, y), S(i x, i y)=S(x, y)\) for all \(x, y\in V\text{.}\)
  3. \(S(x, y)=G(x, i y)\) for all \(x, y\in V\text{.}\)
  4. \(G(x, ix)=0\) for all \(x\in V\text{.}\)
Conversely, suppose \(G(x, y)\) is an inner product on \(V\) viewed as a vector space over the reals, and that it satisfies \(G(i x, i y)=G(x, y)\) for all \(x, y\in V\text{.}\) Define \(S(x, y)= G(x, i y)\) and \((x, y)=G(x, y)+ i S(x, y)\) for all \(x, y\in V\text{.}\) Verify that this \((x, y)\) is a Hermitian inner product on \(V\text{.}\) In other words, a Hermitian inner product is always associated with a Euclidean inner product which is preserved by multiplication by \(i\text{.}\)
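These identities can be spot-checked on the simplest example \(V=\bbC\) with \((z, w)=z\bar w\text{;}\) this is a sanity check on one example with particular numbers, not a verification of the exercise.

```python
# Sketch check of the G/S identities on V = C with (z, w) = z * conj(w)
def ip(z, w):
    return z * w.conjugate()

G = lambda z, w: ip(z, w).real  # real part of the Hermitian inner product
S = lambda z, w: ip(z, w).imag  # imaginary part

x, y = 2 + 3j, -1 + 4j  # illustrative sample vectors
checks = (
    G(x, y) == G(y, x),            # G is symmetric
    S(x, y) == -S(y, x),           # S is antisymmetric
    G(1j * x, 1j * y) == G(x, y),  # G preserved by multiplication by i
    S(1j * x, 1j * y) == S(x, y),  # S preserved by multiplication by i
    S(x, y) == G(x, 1j * y),       # S recovered from G
    G(x, 1j * x) == 0,             # x and ix are G-orthogonal
)
print(checks)
```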

Remark 4.1.10.

When two vectors \(x, y\) are orthogonal in a complex vector space \(V\) with a Hermitian inner product, the Pythagorean relation holds in the same way as in a vector space with a Euclidean inner product. In this sense it is a proper generalization of the orthogonal relation in Euclidean geometry. On the other hand, there is a subtle difference: suppose \(\{\bv_{1},\cdots, \bv_{m}\}\) is an orthonormal basis of a complex vector space \(V\) with a Hermitian inner product; then \(\bv_{j}\) and \(i \bv_{j}\) are not orthogonal in this Hermitian inner product. If, however, we endow \(V\) with the induced Euclidean inner product \(G(x, y)\) as discussed above, then \(\bv_{j}\) and \(i \bv_{j}\) are orthogonal in \(G(x, y)\text{;}\) in fact, \(\{ \bv_{1},\cdots, \bv_{m}; i \bv_{1},\cdots, i \bv_{m}\}\) becomes an orthonormal basis for the inner product \(G(x, y)\text{,}\) with \(V\) regarded as a vector space over the reals. This is the case for the canonical Hermitian inner product on \(\bbC\text{,}\) where \((z, w)=z \bar w\text{,}\) so \((1, i)=-i \ne 0\text{,}\) while in the geometric representation of complex numbers, \(1, i\) are orthogonal---they are orthogonal in the induced \(G\) inner product, which is the standard Euclidean inner product.
In the following we will not distinguish between a Hermitian inner product and a Euclidean inner product, and will let the context determine the appropriate one.

Remark 4.1.11.

Orthogonal systems of functions often arise as eigenfunctions of a β€œself-adjoint” differential operator, which is the analogue of real symmetric matrices or Hermitian matrices whose eigenvectors associated to distinct eigenvalues are automatically orthogonal to each other. For example, each function \(X_{n}(x)=\frac{e^{inx}}{\sqrt{2\pi} }\) in \(\{\frac{e^{inx}}{\sqrt{2\pi} }\}_{n=-\infty}^{\infty}\) is an eigenfunction of the following problem: \(X_{n}''(x)=- n^{2} X_{n}(x)\) for \(x\in (-\pi, \pi)\) and \(X_{n}(-\pi)=X_{n}(\pi)\text{,}\) \(X_{n}'(-\pi)=X_{n}'(\pi)\text{;}\) each \(Y_{n}(x)=\frac{\sin (nx)}{\sqrt \pi}\) in the family \(\{ \frac{\sin (nx)}{\sqrt \pi}, n=1,2,\ldots\}\) is an eigenfunction of the following problem: \(Y_{n}''(x)= - n^{2} Y_{n}(x)\) for \(x\in (0, \pi)\) and \(Y_{n}(0)=Y_{n}(\pi)=0\text{.}\) In each case, the operator is self-adjoint in the sense that the inner product of \(f''\) with \(g\) equals the inner product of \(f\) with \(g''\) (the boundary conditions play an essential role here), and this leads to the orthogonality of eigenfunctions associated to distinct eigenvalues.
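The self-adjointness identity \((f'', g)=(f, g'')\) under Dirichlet boundary conditions can be checked numerically; the test functions below, which both vanish at \(0\) and \(\pi\), and the quadrature rule are illustrative assumptions.

```python
import math

def integrate(f, a, b, n=20_000):
    # midpoint-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# f(x) = x*(pi - x) and g(x) = sin(x): both vanish at x = 0 and x = pi
f = lambda x: x * (math.pi - x)
f2 = lambda x: -2.0                  # f''(x)
g = lambda x: math.sin(x)
g2 = lambda x: -math.sin(x)          # g''(x)

lhs = integrate(lambda x: f2(x) * g(x), 0.0, math.pi)  # (f'', g)
rhs = integrate(lambda x: f(x) * g2(x), 0.0, math.pi)  # (f, g'')
print(lhs, rhs)  # the two values agree
```

Integrating by parts twice moves both derivatives from \(f\) to \(g\text{;}\) the boundary terms vanish because both functions satisfy the boundary conditions.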

Definition 4.1.12. Orthonormal Vectors.

A set of vectors \(\{\bv_{1}, \bv_{2},\cdots \}\) (finite or infinite) in an inner product space \(V\) is called an orthonormal set, if any two distinct vectors in this set are orthogonal to each other and each one is a unit vector.

Definition 4.1.13. Fourier Coefficients and Fourier Series.

Let \(\{\bv_{1}, \bv_{2},\cdots \}\) be a set of orthonormal vectors in an inner product space \(V\text{.}\) For any vector \(\bv\in V\text{,}\) define \(c_{k}=(\bv, \bv_{k})\text{.}\) Then \(\{c_{k}\}\) are called the Fourier coefficients of \(\bv\) with respect to this set of orthonormal vectors, and the series \(\sum_{k} c_{k} \bv_{k}\) is called the Fourier series of \(\bv\) with respect to this set of orthonormal vectors.
In the above definition, the convergence of \(\sum_{k} c_{k} \bv_{k}\) in the case of an infinite set of orthonormal vectors is not directly addressed; one either needs to show that the series converges or treats it as a formal series at this point. As will be seen soon, it is also appropriate to call this sum the orthogonal projection of \(\bv\) onto the span of this set of orthonormal vectors. Note that, in the setting of an infinite set of vectors, we take the span of such a set to mean the completion of the space of finite linear combinations of vectors from this set, which allows us to make sense of an infinite series of such vectors.

Remark 4.1.14.

Sometimes we work with a set of orthogonal vectors which are not necessarily unit vectors. This is the case with \(\left\{ 1, \cos \left(\frac{n \pi x}{l}\right), \sin \left(\frac{n \pi x}{l}\right) : n\in\mathbb N\right\}\) on \([-l, l]\text{.}\) Here we modify the definition of the Fourier coefficients of \(g\in \cR[-l, l]\) by defining
\begin{align*} a_{0} \amp = \frac{(g, 1)}{(1, 1)}=\frac {1}{2l} \int_{-l}^{l} g(x) \, dx,\\ a_{n}\amp = \frac{(g, \cos \left(\frac{n \pi x}{l}\right))}{(\cos \left(\frac{n \pi x}{l}\right),\cos \left(\frac{n \pi x}{l}\right))} = \frac 1l \int_{-l}^{l} g(x) \cos \left(\frac{n \pi x}{l}\right) \, dx \text{ for } n\ge 1,\\ b_{n} \amp =\frac{(g, \sin \left(\frac{n \pi x}{l}\right))}{(\sin \left(\frac{n \pi x}{l}\right),\sin \left(\frac{n \pi x}{l}\right))} = \frac{1}{l} \int_{-l}^{l} g(x) \sin \left(\frac{n \pi x}{l}\right) \, dx \text{ for } n\ge 1, \end{align*}
and
\begin{equation*} S_{N}[g](x):=a_0 +\sum_{n=1}^{N}\left[ a_n \cos \left(\frac{n \pi x}{l}\right) + b_n \sin \left(\frac{n \pi x}{l}\right) \right] \end{equation*}
as the partial sums of the Fourier series of \(g(x)\) on \([-l, l]\text{:}\)
\begin{equation*} g(x)\sim a_0 +\sum_{n=1}^{\infty}\left[ a_n \cos \left(\frac{n \pi x}{l}\right) + b_n \sin \left(\frac{n \pi x}{l}\right) \right]. \end{equation*}
When we work with the set of orthogonal functions \(\left\{\sin \left(\frac{n \pi x}{l}\right): n\in\mathbb N\right\}\) on \([0, l]\text{,}\) the Fourier coefficients of \(g\in \cR[0,l]\) with respect to this set of orthogonal functions on \([0, l]\) are defined by
\begin{equation*} b_{n}=\frac{(g, \sin \left(\frac{n \pi x}{l}\right))}{(\sin \left(\frac{n \pi x}{l}\right),\sin \left(\frac{n \pi x}{l}\right))} =\frac 2l \int_{0}^{l} g(x) \sin \left(\frac{n \pi x}{l}\right)\, dx. \end{equation*}
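For a concrete \(g\text{,}\) these coefficients can be computed numerically and compared against a closed form. For the illustrative choice \(g(x)=x\) below (not a function singled out in the text), integration by parts gives \(b_{n}=\frac{2l(-1)^{n+1}}{n\pi}\text{.}\)

```python
import math

def integrate(f, a, b, n=20_000):
    # midpoint-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

l = 2.0          # illustrative choice of l
g = lambda x: x  # illustrative g; closed form: b_n = 2*l*(-1)**(n+1)/(n*pi)

def b_coef(n):
    num = integrate(lambda x: g(x) * math.sin(n * math.pi * x / l), 0.0, l)
    return (2.0 / l) * num

b1, b2 = b_coef(1), b_coef(2)
print(b1, b2)  # expect 2l/pi and -l/pi
```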

Exercise 4.1.15. Find Fourier Coefficients.

Find the Fourier coefficients of the functions \(f(x)=1\) and \(g(x)=\cos x\) with respect to the set of orthogonal functions \(\left\{ \sin (nx): n\in\mathbb N\right\}\) on \([0, \pi]\text{.}\)

Exercise 4.1.16. Integral Representation of Fourier Partial Sums.

Let
\begin{equation*} S_{N}[g](x):=a_0 +\sum_{n=1}^{N}\left[ a_n \cos \left(\frac{n \pi x}{l}\right) + b_n \sin \left(\frac{n \pi x}{l}\right) \right] \end{equation*}
denote the partial sums of the Fourier series of \(g(x)\) with respect to the set of orthogonal functions \(\left\{ 1, \cos \left(\frac{n \pi x}{l}\right), \sin \left(\frac{n \pi x}{l}\right) : n\in\mathbb N\right\}\) on \([-l, l]\text{,}\) and let
\begin{equation*} c_{n}=\frac{1}{2l} \int_{-l}^{l} g(x) e^{-\frac{i n\pi x}{l}}\, dx \end{equation*}
denote the Fourier coefficients of \(g(x)\) with respect to the set of orthogonal functions \(\left\{ e^{\frac{i n\pi x}{l}}\right\}_{n=-N}^{N}\text{.}\) Verify that
\begin{equation*} S_{N}[g](x)=\sum_{n=-N}^{N} c_{n} e^{\frac{i n\pi x}{l}}= \frac{1}{2l} \int_{-l}^{l}g(t) D_{N}(x-t)\, dt \end{equation*}
where
\begin{equation*} D_{N}(t)= \sum_{n=-N}^{N} e^{\frac{i n\pi t}{l}}=\frac{\sin \frac{(N+\frac 12)\pi t}{l}}{\sin\frac {\pi t}{2l}}. \end{equation*}
Hint.
First, one needs to work out
\begin{align*} S_{N}[g](x)\amp =\frac{1}{2l} \int_{-l}^{l}g(t) \left(1 + \sum_{n=1}^{N} 2 \cos\left(\frac{n\pi (x-t)}{l}\right)\right)\, dt\\ \amp = \frac{1}{2l} \int_{-l}^{l}g(t) \left( \sum_{n=-N}^{N} e^{\frac{i n\pi (x-t)}{l}} \right)\, dt\text{,} \end{align*}
then establish
\begin{equation*} 1 + \sum_{n=1}^{N} 2 \cos\left(\frac{n\pi s}{l} \right)= \sum_{n=-N}^{N} e^{\frac{i n\pi s}{l}} =\frac{\sin \frac{(N+\frac 12)\pi s}{l}}{\sin\frac {\pi s}{2l}} \end{equation*}
using either the relation \(2\sin \frac{\pi s}{2l} \cos \frac{(N+\frac 12)\pi s}{l} =\sin(\frac{(N+1)\pi s}{l})-\sin(\frac{N\pi s}{l})\) or \(\cos \left(\frac{n\pi s}{l} \right)= \text{Re}(e^{\frac{i n\pi s}{l} })\text{.}\)
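The closed form for the Dirichlet kernel can be spot-checked numerically at a sample point; the values of \(l\text{,}\) \(N\text{,}\) and \(s\) below are illustrative.

```python
import cmath
import math

l, N = 1.0, 6  # illustrative parameters
s = 0.37       # illustrative sample point, not an integer multiple of 2l

# three expressions that should agree: the exponential sum,
# the cosine sum, and the closed form
exp_sum = sum(cmath.exp(1j * n * math.pi * s / l) for n in range(-N, N + 1))
cos_sum = 1 + sum(2 * math.cos(n * math.pi * s / l) for n in range(1, N + 1))
closed = math.sin((N + 0.5) * math.pi * s / l) / math.sin(math.pi * s / (2 * l))
print(exp_sum, cos_sum, closed)
```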

Theorem 4.1.17. Orthogonality of the Remainder and the Best Approximation Property.

Let \(\{\bv_{1},\cdots, \bv_{N}\}\) be a finite set of orthonormal vectors in an inner product space \(V\text{,}\) let \(\bv\in V\text{,}\) and set \(c_{k}=(\bv, \bv_{k})\text{.}\) Then
  1. \(\bv - \sum_{k=1}^{N} c_{k} \bv_{k}\) is orthogonal to each \(\bv_{j}\text{,}\) \(1\le j \le N\text{;}\)
  2. \(\Vert \bv \Vert^{2} = \Vert \bv - \sum_{k=1}^{N} c_{k} \bv_{k}\Vert^{2} + \sum_{k=1}^{N} |c_{k}|^{2}\text{;}\)
  3. (Best Approximation Property) for any scalars \(a_{1},\cdots, a_{N}\text{,}\) \(\Vert \bv - \sum_{k=1}^{N} a_{k} \bv_{k}\Vert \ge \Vert \bv - \sum_{k=1}^{N} c_{k} \bv_{k}\Vert\text{,}\) with equality iff \(a_{k}=c_{k}\) for all \(1\le k \le N\text{.}\)

Proof.

The first assertion follows directly from
\begin{equation*} ( \bv - \sum_{k=1}^{N} c_{k} \bv_{k}, \bv_{j})=(\bv, \bv_{j})- \sum_{k=1}^{N} c_{k} (\bv_{k}, \bv_{j})=c_{j}-c_{j}=0 \end{equation*}
using the orthonormal condition \((\bv_{k}, \bv_{j})=\delta_{kj}\text{.}\)
The second assertion follows by using the orthogonality relations above
\begin{align*} \Vert \bv \Vert^{2} = \amp \left( (\bv - \sum_{k=1}^{N} c_{k} \bv_{k})+ \sum_{k=1}^{N} c_{k} \bv_{k}, (\bv - \sum_{k=1}^{N} c_{k} \bv_{k})+ \sum_{k=1}^{N} c_{k} \bv_{k} \right)\\ = \amp ( \bv - \sum_{k=1}^{N} c_{k} \bv_{k}, \bv - \sum_{k=1}^{N} c_{k} \bv_{k})+ 2(\bv - \sum_{k=1}^{N} c_{k} \bv_{k}, \sum_{k=1}^{N} c_{k} \bv_{k})\\ \amp + (\sum_{k=1}^{N} c_{k} \bv_{k},\sum_{k=1}^{N} c_{k} \bv_{k})\\ = \amp \Vert \bv - \sum_{k=1}^{N} c_{k} \bv_{k}\Vert^{2}+ \Vert \sum_{k=1}^{N} c_{k} \bv_{k}\Vert^{2} \end{align*}
using \((\bv - \sum_{k=1}^{N} c_{k} \bv_{k}, \bv_{j})=0\) for any \(j\text{,}\) \(1\le j \le N\text{.}\)
Set \(\bw = \sum_{k=1}^{N} c_{k} \bv_{k}\text{.}\) Then \((\bv-\bw, \sum_{k=1}^{N} (c_{k}-a_{k}) \bv_{k})=0\text{,}\) so
\begin{align*} \Vert \bv - \sum_{k=1}^{N} a_{k} \bv_{k}\Vert^{2} \amp = \left(\bv-\bw +( \sum_{k=1}^{N} (c_{k}-a_{k}) \bv_{k}), \bv-\bw +( \sum_{k=1}^{N} (c_{k}-a_{k}) \bv_{k})\right)\\ \amp = \left(\bv-\bw, \bv-\bw\right)+ \left(\sum_{k=1}^{N} (c_{k}-a_{k}) \bv_{k},\sum_{k=1}^{N} (c_{k}-a_{k}) \bv_{k}\right)\\ \amp = \Vert \bv-\bw\Vert^{2}+\Vert \sum_{k=1}^{N} (c_{k}-a_{k}) \bv_{k}\Vert^{2}\\ \amp \ge \Vert \bv-\bw\Vert^{2}, \end{align*}
with equality iff \(c_{k}-a_{k}=0\) for all \(k, 1\le k \le N\text{.}\)
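The assertions of this proof can be illustrated concretely in \(\bbR^{3}\) with the first two standard basis vectors as the orthonormal set; the vector \(\bv\) and the random competing coefficients below are illustrative.

```python
import math
import random

# Orthonormal set {v1, v2} = first two standard basis vectors of R^3
v = (3.0, 4.0, 5.0)       # illustrative vector
c = (v[0], v[1])          # Fourier coefficients (v, v1), (v, v2)
proj = (c[0], c[1], 0.0)  # orthogonal projection onto span{v1, v2}

def dist(u, w):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, w)))

best = dist(v, proj)
random.seed(0)
# no random choice of coefficients beats the Fourier coefficients
worse = all(
    dist(v, (a1, a2, 0.0)) >= best
    for a1, a2 in ((random.uniform(-10, 10), random.uniform(-10, 10))
                   for _ in range(1000))
)
# Pythagorean relation: ||v||^2 = ||v - proj||^2 + sum of |c_k|^2
pyth = abs(sum(a * a for a in v) - (best ** 2 + c[0] ** 2 + c[1] ** 2)) < 1e-12
print(best, worse, pyth)
```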

Definition 4.1.18. Orthogonal Projection.

Let \(\{\bv_{1}, \bv_{2},\cdots, \bv_{N} \}\) be a finite set of orthonormal vectors in an inner product space \(V\text{.}\) For any vector \(\bv\in V\text{,}\) let \(\sum_{k=1}^{N} c_{k} \bv_{k}\) be the Fourier series of \(\bv\) with respect to this set of orthonormal vectors. Then \(\sum_{k=1}^{N} c_{k} \bv_{k}\) is also called the orthogonal projection of \(\bv\) in the span of this set of orthonormal vectors.

Exercise 4.1.19. Find Orthogonal Projections.

Find the orthogonal projections of the functions \(f(x)=1\) and \(g(x)=\cos x\) in the span of the set of orthogonal functions \(\left\{ \sin (nx): 1\le n \le N\right\}\) on \([0, \pi]\text{.}\)

Exercise 4.1.20.

Find the orthogonal projection of the function \(f(x)=x\) in the span of the set of orthogonal functions \(\left\{1, \cos (nx), \sin (nx): 1\le n \le N\right\}\) on \([-\pi, \pi]\text{.}\)

Proof.

Take any finite subset \(\{\bv_{1}, \bv_{2},\cdots, \bv_{N}\}\text{.}\) Then we have already proved that
\begin{equation*} \Vert \bv\Vert^{2}\ge \Vert \sum_{k=1}^{N } c_{k }\bv_{k}\Vert^{2}= \sum_{k=1}^{N} |c_{k}|^{2}. \end{equation*}
Since this holds for any finite \(N\text{,}\) Bessel’s inequality follows immediately.
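Bessel's inequality can also be observed numerically: the partial sums \(\sum_{k=1}^{N}|c_{k}|^{2}\) increase with \(N\) and remain below \(\Vert \bv \Vert^{2}\text{.}\) The function and the orthonormal family below are illustrative choices.

```python
import math

def integrate(f, a, b, n=20_000):
    # midpoint-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

l = 1.0                    # illustrative choice of l
g = lambda x: x * (l - x)  # illustrative function on [0, l]

def c_coef(k):
    # coefficient against the orthonormal family sqrt(2/l)*sin(k*pi*x/l)
    return integrate(
        lambda x: g(x) * math.sqrt(2.0 / l) * math.sin(k * math.pi * x / l),
        0.0, l)

norm_sq = integrate(lambda x: g(x) ** 2, 0.0, l)
partials = [sum(c_coef(k) ** 2 for k in range(1, N + 1)) for N in (1, 5, 20)]
print(norm_sq, partials)  # partial sums increase but stay below norm_sq
```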

Remark 4.1.23.

So far we have not addressed the question of whether \(S_{N}[g](x)\) converges to \(g(x)\) in the mean square sense, namely whether \(\int_{-l}^{l} |g(x)-S_{N}[g](x)|^{2}\, dx\to 0\) as \(N\to \infty\text{.}\) From the above we see that the answer depends on whether equality holds in Bessel's inequality. Another possible approach is to show that there exists a sequence \(p_{N}(x)=\sum_{n=-N}^{N}c_{n}' e^{\frac{i n\pi x}{l}}\) such that \(\int_{-l}^{l} |g(x)-p_{N}(x)|^{2}\, dx\to 0\) as \(N\to \infty\) and appeal to the Best Approximation Property of the Fourier series. This would be the case if the set of finite linear combinations of functions from \(\{ e^{\frac{i n\pi x}{l}}\}\) is dense, in the mean square norm, in the function space, such as \(\cR[-l, l]\) or \(C[-l, l]\text{,}\) in which we wish to carry out such a Fourier series expansion.
Given any \(g\) in \(\cR[-l, l]\) or \(C[-l, l]\text{,}\) using the orthogonality relations we have
\begin{equation*} \int_{-l}^{l} |S_{N'}[g](x)-S_{N}[g](x)|^{2}\, dx =l \sum_{n=N+1}^{N'} \left[ |a_{n}|^{2}+|b_{n}|^{2}\right] \end{equation*}
for \(N' \gt N\text{.}\) Bessel's inequality then implies that the sequence \(\{ S_{N}[g](x) \}\) is a Cauchy sequence in the mean square norm. At this point we need the completeness of the function space on which we are working. Unfortunately, neither \(\cR[-l, l]\) nor \(C[-l, l]\) is complete with respect to the mean square norm. The completion of either \(\cR[-l, l]\) or \(C[-l, l]\) turns out to be the space of Lebesgue square integrable functions \(L^{2}[-l, l]\text{.}\) So a more proper discussion of the issue of mean square convergence should take place in the complete space \(L^{2}[-l, l]\text{.}\)
Without discussing details of Lebesgue integrable functions, we may assume that there exists some \(\hat g\in L^{2}[-l, l]\) such that \(\int_{-l}^{l} |\hat g-S_{N}[g](x)|^{2}\, dx\to 0\) as \(N\to \infty\text{.}\) We claim the following property:
\begin{equation*} (g-\hat g, e^{\frac{i m\pi x}{l}})=0 \quad \text{for any $m\in \bbZ$.} \end{equation*}
This follows by noting that for any \(N\gt m\text{,}\)
\begin{align*} (g-S_{N}[g](x), e^{\frac{i m\pi x}{l}}) \amp =(g, e^{\frac{i m\pi x}{l}})-(S_{N}[g](x), e^{\frac{i m\pi x}{l}})\\ \amp = 2l\left(c_{m}-c_{m}\right)=0, \end{align*}
so
\begin{align*} (g-\hat g, e^{\frac{i m\pi x}{l}}) \amp =(g-S_{N}[g](x), e^{\frac{i m\pi x}{l}})+(S_{N}[g](x)-\hat g, e^{\frac{i m\pi x}{l}})\\ \amp = (S_{N}[g](x)-\hat g, e^{\frac{i m\pi x}{l}}), \end{align*}
while by the Cauchy-Schwarz inequality
\begin{equation*} \vert (S_{N}[g](x)-\hat g, e^{\frac{i m\pi x}{l}}) \vert \le \Vert S_{N}[g](x)-\hat g\Vert \Vert e^{\frac{i m\pi x}{l}}\Vert \to 0 \end{equation*}
as \(N\to \infty\text{,}\) which shows that
\begin{equation*} (g-\hat g, e^{\frac{i m\pi x}{l}})= 0. \end{equation*}
So a key to our question is whether the only function orthogonal to all \(e^{\frac{i m\pi x}{l}}\) is the \(0\) function. This discussion makes sense in a general inner product space.

Definition 4.1.24. A Complete (or Maximal) Orthonormal Set.

A set of orthonormal vectors in an inner product space is called complete (or maximal) if the only vector orthogonal to each of these vectors is the zero vector.