Subsection 7.3.1 Definition and Basic Properties of Vector Fields and Differential One-Forms
We start with an initial definition of a vector field in a Euclidean domain; it is to be modified later to be suitable in a more general context such as on a differentiable surface.
Definition 7.3.1. Vector Field in a Euclidean Domain.
Let
\(U\subset \bbR^{n}\) be open. A vector field in
\(U\) is an
\(\bbR^{n}\)-valued function
\(X:\bx\in U\mapsto X(\bx)\in \bbR^{n}\text{.}\) A vector field
\(X(\bx)\) is continuous (differentiable) in
\(U\) if
\(X(\bx)\) as an
\(\bbR^{n}\)-valued function is continuous (differentiable) in
\(U\text{.}\)
We will use a vector field mostly in the operation of taking directional derivatives of differentiable functions. First, recall the directional derivative of a differentiable function \(f\) in \(\bbR^{n}\) in the direction \({\mathbf v}\) at a point \(P\text{:}\)
\begin{equation*}
D_{{\mathbf v}}f(P)=\frac{d f(P+t{\mathbf v})}{dt}\Big|_{t=0}
=\sum_{i=1}^{n} v^{i}\frac{\partial f}{\partial x_{i}}(P),
\end{equation*}
where \({\mathbf v}=(v^{1},\ldots, v^{n})\text{.}\) More generally, for any differentiable curve \(\gamma(t)\) passing through \(P\) at \(t=0\) with \(\gamma'(0)={\mathbf v}\text{,}\) we have
\begin{equation*}
\frac{d (f\circ \gamma )}{dt}\Big|_{t=0}=D_{{\mathbf v}}f(P).
\end{equation*}
\({\mathbf v}\) is called a
tangent vector at
\(P\text{.}\) The set of tangent vectors at
\(P\) forms a vector space, called the
tangent space of
\({\mathbb R}^{n}\) at
\(P\text{,}\) and is denoted as
\({\mathbb R}^{n}_{P}\text{.}\) For the situation here,
\({\mathbb R}^{n}_{P}\) is simply a copy of
\(\bbR^{n}\text{,}\) and
\({\mathbb R}^{n}_{P}\) at different
\(P\) are identified with each other in a natural way. But we will see that this is not the case when we discuss the tangent space of a differentiable surface: there are no obvious natural relations between tangent vectors at different points, so it is important to associate a tangent vector with a specific point.
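As a quick check of the two expressions for the directional derivative, consider the following worked example; the function \(f(x_{1},x_{2})=x_{1}^{2}x_{2}\text{,}\) the point \(P=(1,2)\) and the direction \({\mathbf v}=(3,-1)\) are chosen here purely for illustration. The coordinate formula gives
\begin{equation*}
D_{{\mathbf v}}f(P)=3\cdot 2x_{1}x_{2}\Big|_{P}+(-1)\cdot x_{1}^{2}\Big|_{P}=12-1=11,
\end{equation*}
while along the straight line \(\gamma(t)=P+t{\mathbf v}=(1+3t,\, 2-t)\) we have \((f\circ \gamma)(t)=(1+3t)^{2}(2-t)\) and
\begin{equation*}
\frac{d (f\circ \gamma )}{dt}\Big|_{t=0}=\left[ 6(1+3t)(2-t)-(1+3t)^{2}\right]_{t=0}=12-1=11.
\end{equation*}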
Suppose that
\(X(\bx)\) is a continuous vector field on
\(U\) and
\(f(\bx)\) is a continuously differentiable function on
\(U\text{,}\) then at each
\(\bx\in U\text{,}\) \(X(\bx)\) is a tangent vector in
\(\bbR^{n}_{\bx}\) and
\(D_{X(\bx)}f(\bx)\) is a continuous function on
\(U\text{.}\)
The discussions in the previous paragraphs generalize to a (differentiable) surface in \({\mathbb R}^{n}\text{.}\) We will first work with a patch of a differentiable surface as given by a differentiable map \(\vg (\bu): U\mapsto {\mathbb R}^{n}\) defined for the parameter \(\bu\in U \subset \bbR^{k}\) for some open domain \(U\text{.}\) A differentiable curve on the surface through \(P_{0}=\vg(\bu_{0})\) is given in terms of \(\vg(\bu(t))\text{,}\) where \(\bu(t)\) is a differentiable curve in the parameter domain \(U\) with \(\bu(0)=\bu_{0}\text{.}\) Then the chain rule
\begin{equation*}
\left[ \vg(\bu(t)) \right]' =D_{\bu}\vg(\bu(t)) \bu'(t)
\end{equation*}
implies that the tangent of the curve \(\vg(\bu(t))\) at \(P_{0}\text{,}\) \(\left[ \vg(\bu(t)) \right]'|_{t=0}\text{,}\) is a linear combination of \(D_{u_{i}}\vg(\bu_0)\text{,}\) where \(D_{u_{i}}\vg(\bu_0)\) is an alternative notation for \(D_{\be_{i}}\vg(\bu_0)\text{,}\) namely, the partial derivative of \(\vg(\bu)\) with respect to its \(i\)th coordinate. The span of these \(k\) vectors forms the tangent space of the surface at \(P_{0}\text{.}\)
In order for such a parametrization
\(\vg(\bu)\) to represent a geometric
\(k\)-dimensional surface, we require that these
\(k\) vectors be linearly independent so the tangent space to the surface is
\(k\)-dimensional, namely the matrix
\([D_{u_{1}}\vg(\bu_0)\, \ldots\, D_{u_{k}}\vg(\bu_0)]\) has rank
\(k\text{.}\) Let
\(S\) denote this patch of differentiable surface. Then there is a well defined tangent space
\(S_{\vg(\bu)}\) for every
\(\bu\) (or rather for every point
\(\vg(\bu)\) on
\(S\)). When
\(\vg(\bu)\) is continuously differentiable, we have a sense that the tangent space
\(S_{\vg(\bu)}\) varies continuously with
\(\bu\text{.}\) But that is not a topic to be taken up now. Instead, we first focus on how a tangent vector is used to compute the directional derivative of a differentiable function.
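To illustrate the rank condition, consider the following sketch; the cone parametrization below is an example chosen here and is not taken from the text. Let \(\vg(u_{1},u_{2})=(u_{2}\cos u_{1},\, u_{2}\sin u_{1},\, u_{2})\) with \(k=2\) and \(n=3\text{.}\) Then
\begin{equation*}
D_{u_{1}}\vg(\bu)=(-u_{2}\sin u_{1},\, u_{2}\cos u_{1},\, 0), \qquad D_{u_{2}}\vg(\bu)=(\cos u_{1},\, \sin u_{1},\, 1).
\end{equation*}
For \(u_{2}\ne 0\) these two vectors are linearly independent, so the matrix \([D_{u_{1}}\vg(\bu)\, D_{u_{2}}\vg(\bu)]\) has rank \(2\) and the tangent space is two-dimensional. At \(u_{2}=0\) (the apex of the cone) we have \(D_{u_{1}}\vg(\bu)=\mathbf 0\text{,}\) the rank drops to \(1\text{,}\) and the parametrization fails the requirement there.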
If \(f\) is a differentiable function defined on \(\bbR^{n}\text{,}\) then its restriction to the surface \(S\) becomes a function on the surface. To study its differentiability properties on the surface, we use the parametrization \(\vg(\bu)\) to obtain the function \(f\circ \vg\) on the parameter domain of \(\bu\text{.}\) Its directional derivative at \(P_{0}\) in the direction of \(D_{u_{i}}\vg(\bu_0)\) is then given by
\begin{equation*}
D_{D_{u_{i}}\vg(\bu_0)} f \Big|_{\vg(\bu_0)}=\sum_{j=1}^{n}\frac{\partial f}{\partial x_{j}}(\vg(\bu_{0})) D_{\be_{i}}\vg_{j}(\bu_0)
=\frac{\partial \left( f\circ \vg\right) }{\partial u_{i}}(\bu_0).
\end{equation*}
It is this kind of consideration that makes it natural to use \(\frac{\partial }{\partial u_{i}}\Big|_{\vg(\bu_0)}\) to represent the tangent vector \(D_{u_{i}}\vg(\bu_0)\) to the surface at \(P_{0}=\vg(\bu_0)\text{,}\) when \(\vg(\bu)\) is a parametric representation of the surface. Namely,
-
\(\frac{\partial \vg}{\partial u_{i}}=D_{u_{i}}\vg\) is a geometric tangent vector to the surface \(\bx=\vg(\bu)\) at \(\vg(\bu)\) arising from a curve whose parametrization in the \(\bu\) parameter space runs along the \(u_{i}\) direction.
-
\(\frac{\partial (f\circ \vg)}{\partial u_{i}}=D_{D_{u_{i}}\vg}f\) means that, as an operator, \(\frac{\partial }{\partial u_{i}}\) represents directional derivative in the direction of the tangent \(D_{u_{i}}\vg\text{.}\)
Thus
\begin{equation*}
D_{D_{u_{i}}\vg(\bu_0)} f \Big|_{\vg(\bu_0)}=D_{\frac{\partial }{\partial u_{i}}\Big|_{\vg(\bu_0)}}f .
\end{equation*}
In this notation, \(\left\{ \frac{\partial }{\partial u_{1}}\Big|_P, \ldots, \frac{\partial }{\partial u_{k}}\Big|_P\right\}\) forms a basis of \(S_{P}\text{.}\) The advantage of this notation will become clear when a change of variable is used, which would cause a change of basis for the tangent space. Writing any \(\bv\in S_{P}\) as \(\bv=\sum_{i=1}^{k}v^{i}\frac{\partial }{\partial u_{i}}\Big|_P\text{,}\) we have
\begin{equation*}
D_{\bv}f (P)
=\sum_{i=1}^{k}v^{i}\frac{\partial (f\circ \vg)}{\partial u_{i}} (\bu)
\end{equation*}
for \(P=\vg(\bu)\text{.}\)
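As a sketch of how this formula is used, take the cone patch \(\vg(u_{1},u_{2})=(u_{2}\cos u_{1},\, u_{2}\sin u_{1},\, u_{2})\) with \(u_{2}\ne 0\) and the function \(f(\bx)=x_{1}x_{3}\text{;}\) both are examples chosen here for illustration only. Then \((f\circ \vg)(\bu)=u_{2}^{2}\cos u_{1}\text{,}\) so
\begin{equation*}
D_{\frac{\partial }{\partial u_{1}}}f=\frac{\partial (f\circ \vg)}{\partial u_{1}}=-u_{2}^{2}\sin u_{1}, \qquad
D_{\frac{\partial }{\partial u_{2}}}f=\frac{\partial (f\circ \vg)}{\partial u_{2}}=2u_{2}\cos u_{1}.
\end{equation*}
The same values come from the ambient chain rule with \(D_{u_{1}}\vg=(-u_{2}\sin u_{1},\, u_{2}\cos u_{1},\, 0)\) and \(D_{u_{2}}\vg=(\cos u_{1},\, \sin u_{1},\, 1)\text{:}\)
\begin{equation*}
x_{3}\cdot(-u_{2}\sin u_{1})+x_{1}\cdot 0=-u_{2}^{2}\sin u_{1}, \qquad
x_{3}\cdot \cos u_{1}+x_{1}\cdot 1=2u_{2}\cos u_{1},
\end{equation*}
where \(x_{1}=u_{2}\cos u_{1}\) and \(x_{3}=u_{2}\) on the patch. For a general \(\bv=v^{1}\frac{\partial }{\partial u_{1}}+v^{2}\frac{\partial }{\partial u_{2}}\) this gives \(D_{\bv}f=-v^{1}u_{2}^{2}\sin u_{1}+2v^{2}u_{2}\cos u_{1}\text{.}\)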
Definition 7.3.2. Vector Field on a Differentiable Surface.
A vector field
\(X\) on a differentiable surface
\(S\) is a map
\(P\in S\mapsto X(P)\in S_{P}\text{.}\) Namely, it assigns to each
\(P\in S\) a tangent vector
\(X(P)\) to
\(S\) at
\(P\text{.}\)
When
\(S\) is given by a differentiable parametrization
\(\vg :U\subset \bbR^{k}\mapsto S\text{,}\) any vector field
\(X\) on
\(S\) can be represented as
\(X(\vg(\bu))=\sum_{i=1}^{k} X_{i}(\bu) \frac{\partial }{\partial u_{i}}\Big|_{\vg(\bu)}.\) \(X\) is said to be a continuous (or continuously differentiable) vector field on
\(S\) if the coefficient functions
\(X_{i}(\bu)\) are continuous (or continuously differentiable) functions of
\(\bu \in U\text{.}\)
We will often write
\(\frac{\partial }{\partial u_{i}}\) for
\(\frac{\partial }{\partial u_{i}}\Big|_{\vg(\bu)}\) to simplify notations. Note that
\(\frac{\partial }{\partial u_{i}}\) is a vector field on
\(S\text{,}\) but if the
\(u_{i}\)'s are not used in connection with the parametrization, then
\(\frac{\partial }{\partial u_{i}}\) also represents a vector field in
\(U\text{,}\) which takes the vector
\(\be_{i}\) everywhere in
\(U\text{.}\) One should watch out for the context in which this notation is used. In the latter context, a vector field in
\(U\) is simply an
\(\bbR^{k}\)-valued function, so one can take its derivative and in this case
\(D_{\bv}\left(\frac{\partial }{\partial u_{i}}\right)=0\) for any
\(\bv\in \bbR^{k}_{\bu}\text{.}\) But in the former context,
\(\bv\) should be regarded as the tangent vector
\(\frac{d \vg(\bu+t\bv)}{dt}\Big|_{t=0}=D\vg(\bu)\bv \in S_{\vg(\bu)}\) on
\(S\text{,}\) and
\(D_{\bv}\left(\frac{\partial }{\partial u_{i}}\right)\) should be related to
\(D_{D\vg(\bu)\bv}\left(\frac{\partial \vg(\bu)}{\partial u_{i}}\right)\text{.}\) But this output in
\(\bbR^{n}\) may not be a tangent vector to
\(S\) at
\(\vg(\bu)\text{.}\) There is a way to obtain a tangent vector to
\(S\) at
\(\vg(\bu)\) through orthogonal projection. This will introduce the notion of
covariant differentiation of a vector field on
\(S\text{.}\) But we will not pursue that topic in this course.
Example 7.3.3. Vector fields on a graph.
A graph \(G\) of a differentiable function \(h(\bu)\) for \(\bu \in \bbR^{n-1}\) can be parametrized as \(\vg (\bu)=(\bu, h(\bu))\text{.}\) Then the vector field \(\frac{\partial }{\partial u_{i}}\) is simply a coordinate representation for the geometric vector field \((\mathbf e_{i}, \frac{\partial h(\bu) }{\partial u_{i}})\) on \(G\text{,}\) and
\begin{align*}
\amp D_{\frac{\partial }{\partial u_{i}}}f (\bu, h(\bu))=\amp
D_{(\mathbf e_{i}, \frac{\partial h(\bu) }{\partial u_{i}})}f(\bu, h(\bu))\\
=\amp \frac{\partial f(\bu, h(\bu) )}{\partial u_{i}}
=\amp \frac{\partial f}{\partial x_{i}}(\bu, h(\bu)) +
\frac{\partial f}{\partial x_{n}}(\bu, h(\bu))\frac{\partial h(\bu) }{\partial u_{i}}.
\end{align*}
\(\{\frac{\partial }{\partial u_{1}}, \ldots, \frac{\partial }{\partial u_{n-1}}\}\) is a basis of \(G_{(\bu, h(\bu))}\text{.}\) In this notation \((1, 0, \cdots, 0)\) is a coordinate representation in this basis for the vector field \(\frac{\partial }{\partial u_{1}}\) for \((\bu, h(\bu))\in G\text{,}\) yet its values at different \(\bu\) (or rather \((\bu, h(\bu))\)) may not be identified as equal to each other. As a consequence, we may not have \(D_{\bv}(\frac{\partial }{\partial u_{1}})=\mathbf 0\) in contrast to the case if \(\{\frac{\partial }{\partial u_{i}}=\be_{i}\}\) is used to represent a basis of the tangent space at a point in the flat Euclidean space.
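A concrete instance may help here; the paraboloid below is an example chosen for illustration and is not part of the text. Take \(n=3\) and \(h(u_{1},u_{2})=u_{1}^{2}+u_{2}^{2}\text{.}\) Then on \(G\)
\begin{equation*}
\frac{\partial }{\partial u_{1}} \leftrightarrow (1,\, 0,\, 2u_{1}), \qquad
\frac{\partial }{\partial u_{2}} \leftrightarrow (0,\, 1,\, 2u_{2}),
\end{equation*}
so the geometric vector represented by \(\frac{\partial }{\partial u_{1}}\) changes from point to point even though its coordinate representation \((1, 0)\) does not. Differentiating the \(\bbR^{3}\)-valued function \((1, 0, 2u_{1})\) in the \(u_{1}\) direction gives
\begin{equation*}
\frac{\partial }{\partial u_{1}}(1,\, 0,\, 2u_{1})=(0,\, 0,\, 2)\ne \mathbf 0.
\end{equation*}
At \(\bu=(0,0)\) the tangent space \(G_{(0,0,0)}\) is spanned by \((1,0,0)\) and \((0,1,0)\text{,}\) so this derivative is not even tangent to \(G\) there.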
Definition 7.3.4. Differential of a Function.
For any differentiable function \(f\) defined in \(U\subset \bbR^{n}\) and a fixed \(P\in U\text{,}\) the operation
\begin{equation*}
{\mathbf v} \mapsto D_{{\mathbf v}}f(P)
\end{equation*}
defines a linear function on \({\mathbb R}^{n}_{P}\text{,}\) thus defines a cotangent vector in \({{\mathbb R}^{n}_{P}}^{*}\text{,}\) called the differential of \(f\) at \(P\text{,}\) and is denoted as \(df(P)\text{.}\) Thus
\begin{equation*}
df(P)({\mathbf v})=D_{{\mathbf v}}f(P) \quad \text{ for all $\bv \in {\mathbb R}^{n}_{P}$.}
\end{equation*}
When
\(f\) is differentiable in a neighborhood of a point, this definition naturally produces a
field of cotangent vectors in
\({{\mathbb R}^{n}_{P}}^{*}\) as
\(P\) varies in this neighborhood. It is an example of a
one-form, and in this case, is called the differential of
\(f\text{.}\)
When \(f=x_{i}\) is a coordinate function, we find that
\begin{equation*}
dx_{i}({\mathbf v})=D_{{\mathbf v}}x_{i}=v_{i},
\end{equation*}
thus we find
\begin{equation*}
df(P)({\mathbf v})=\sum_{i=1}^{n}v_{i}\frac{\partial f}{\partial x_{i}}(P)=
\sum_{i=1}^{n}\frac{\partial f}{\partial x_{i}}(P)dx_{i}({\mathbf v}) \quad \text{ for all $\bv \in {\mathbb R}^{n}_{P}$,}
\end{equation*}
and as one-forms we have the classic formula
\begin{equation}
df=\sum_{i=1}^{n}\frac{\partial f}{\partial x_{i}} dx_{i}.\tag{7.3.1}
\end{equation}
In older texts the differential
\(df\) was often used interchangeably with
\(\sum_{i=1}^{n}\frac{\partial f}{\partial x_{i}} \Delta x_{i}\) and referred to as the linear approximation to
\(f\) at
\(P\text{.}\) In the modern treatment, the differential
\(df\) is a linear function on tangent vectors, so after taking a tangent vector
\(\mathbf v\) as input it gives the linear approximation
\(df(P)({\mathbf v})=\sum_{i=1}^{n}\frac{\partial f}{\partial x_{i}} dx_{i}({\mathbf v})
=\sum_{i=1}^{n}v_{i} \frac{\partial f}{\partial x_{i}}\) to the increment of
\(f\) at
\(P\) along
\(\mathbf v\text{.}\)
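The following numerical sketch illustrates the sense in which \(df(P)({\mathbf v})\) approximates the increment; the function \(f(x_{1},x_{2})=x_{1}^{2}x_{2}\text{,}\) the point \(P=(1,2)\) and the vector \({\mathbf v}=(0.1, -0.05)\) are chosen here only as an example. Since \(df=2x_{1}x_{2}\, dx_{1}+x_{1}^{2}\, dx_{2}\text{,}\)
\begin{equation*}
df(P)({\mathbf v})=4\cdot 0.1+1\cdot(-0.05)=0.35,
\end{equation*}
while the actual increment is
\begin{equation*}
f(P+{\mathbf v})-f(P)=(1.1)^{2}(1.95)-2=2.3595-2=0.3595.
\end{equation*}
The discrepancy \(0.0095\) is of second order in the size of \({\mathbf v}\text{.}\)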
Note that \(\left\{ dx_{1}\Big|_P, \ldots, dx_{n}\Big|_P\right\}\) forms the dual basis of \(\left\{ \frac{\partial }{\partial x_{1}}\Big|_P, \ldots, \frac{\partial }{\partial x_{n}}\Big|_P\right\}\text{,}\) as
\begin{equation}
dx_{i}(\frac{\partial }{\partial x_{j}})=D_{\frac{\partial }{\partial x_{j}}}x_{i}=\delta_{ij} \text{ for } 1\le i, j \le n.\tag{7.3.2}
\end{equation}
A general one-form can be expressed as \(\sum_{i=1}^{n}\xi_{i}(\bx) dx_{i}\) for some scalar functions \(\xi_{i}(\bx)\text{.}\) It is called continuous (differentiable) if these functions are continuous (differentiable). We will learn later that, on a general domain, not every continuous one-form is the differential \(df\) of some continuously differentiable function \(f\text{.}\)
The above discussion adapts easily to the context of a differentiable surface such as
\(S\) represented in terms of a parametric representation
\(\vg (\bu)\) for
\(\bu \in U\subset \bbR^{k}\text{.}\) Then
\(\left\{ du_{1}\Big|_P, \ldots, du_{k}\Big|_P\right\}\) forms the dual basis of
\(\left\{ \frac{\partial }{\partial u_{1}}\Big|_P, \ldots, \frac{\partial }{\partial u_{k}}\Big|_P\right\}\text{.}\) We will use
\(du_{i}\) for
\(du_{i}\Big|_P\text{.}\) Any one-form on
\(S\) can be represented as
\(\sum_{i=1}^{k} \alpha_{i}(\bu) du_{i}\) for some functions
\(\alpha_{i}(\bu)\text{.}\)
We next study the transformation laws of a vector field and a one-form under a change of coordinates. Suppose that
\(\vg^{\dagger}(\bv)\) for
\(\bv=( v_{1}, \cdots, v_{k})\in V\subset \bbR^{k}\) provides another parametrization for the same
\(S\) via the relation
\(\bv=F(\bu)\) for some continuously differentiable
\(F\) with continuously differentiable inverse:
\(\vg^{\dagger}(F(\bu))= \vg(\bu)\text{.}\) A parametrization with this property is called
admissible.
In this setup, for any differentiable function \(f\) defined in a domain of \(\bbR^{n}\) containing \(S\text{,}\) the chain rule implies
\begin{equation*}
\frac{\partial (f\circ \vg)}{\partial u_{j}}(\bu)
=\sum_{i=1}^{k} \frac{\partial v_{i}}{\partial u_{j}}
\frac{\partial (f\circ \vg^{\dagger})}{\partial v_{i}}(\bv)\text{.}
\end{equation*}
\((f\circ \vg)(\bu)\) and \((f\circ \vg^{\dagger})(\bv)\) are simply two different coordinate representations of the same function \(f\text{,}\) so we have the following
\begin{equation}
\frac{\partial }{\partial u_{j}}=\sum_{i=1}^{k} \frac{\partial v_{i}}{\partial u_{j}}
\frac{\partial }{\partial v_{i}}.\tag{7.3.3}
\end{equation}
This is the transformation law between the two bases
\(\left\{ \frac{\partial }{\partial u_{1}}\Big|_P, \ldots, \frac{\partial }{\partial u_{k}}\Big|_P\right\}\) and
\(\left\{ \frac{\partial }{\partial v_{1}}\Big|_P, \ldots, \frac{\partial }{\partial v_{k}}\Big|_P\right\}\) for any
\(P\in \vg (U)\cap \vg^{\dagger}(V)\subset S\text{.}\) Note that
(7.3.3) is simply a form of the chain rule.
Suppose that \(X\) is a vector field on \(S\text{,}\) namely, \(X(P) \in S_{P}\) is tangent to \(S\) at \(P=\vg(\bu)=\vg^{\dagger}(\bv)\text{,}\)
\begin{equation}
X(P)=\sum_{i=1}^{k}a^{i}(\bu)\frac{\partial }{\partial u_{i}}
=\sum_{i=1}^{k}b^{i}(\bv)\frac{\partial }{\partial v_{i}}\text{,}\tag{7.3.4}
\end{equation}
then in addition to the relation
\begin{equation*}
D_{X(P)} f= \sum_{i=1}^{k}b^{i}(\bv)\frac{\partial (f\circ \vg^{\dagger})}{\partial v_{i}}(\bv)
=\sum_{j=1}^{k}a^{j}(\bu)\frac{\partial ( f\circ \vg) }{\partial u_{j}}(\bu),
\end{equation*}
we also have
\begin{equation}
\begin{bmatrix} b^{1}(\bv)\\ \vdots \\ b^{k}(\bv)\end{bmatrix} =
\left[\frac{\partial v_{i}}{\partial u_{j}}\right] \begin{bmatrix} a^{1}(\bu)\\ \vdots \\ a^{k}(\bu)\end{bmatrix},\tag{7.3.5}
\end{equation}
which follows from
(7.3.4) and
(7.3.3).
(7.3.4) also encodes the geometric information that when
\(\bu'(t)=(a^{1}(\bu), \cdots, a^{k}(\bu))\text{,}\) then
\(\bv(t)=F(\bu(t))\) gives
\(\bv'(t)=DF(\bu(t))\bu'(t)\text{,}\) which is another form of
(7.3.5).
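As a sketch of how (7.3.5) works in practice, take \(k=n=2\) with \(\bu=(r, \theta)\) and \(\bv=F(\bu)=(r\cos\theta,\, r\sin\theta)\) on a domain where \(F\) is invertible, for instance \(r\gt 0\text{,}\) \(0\lt \theta \lt 2\pi\text{;}\) this choice of coordinates is an illustration added here. Then
\begin{equation*}
\left[\frac{\partial v_{i}}{\partial u_{j}}\right]=
\begin{bmatrix} \cos\theta \amp -r\sin\theta \\ \sin\theta \amp r\cos\theta \end{bmatrix},
\end{equation*}
and for the vector field \(X=\frac{\partial }{\partial \theta}\text{,}\) whose coefficients are \((a^{1}, a^{2})=(0, 1)\text{,}\) we get
\begin{equation*}
\begin{bmatrix} b^{1}\\ b^{2}\end{bmatrix}=
\begin{bmatrix} \cos\theta \amp -r\sin\theta \\ \sin\theta \amp r\cos\theta \end{bmatrix}
\begin{bmatrix} 0\\ 1\end{bmatrix}=
\begin{bmatrix} -r\sin\theta\\ r\cos\theta\end{bmatrix}=
\begin{bmatrix} -v_{2}\\ v_{1}\end{bmatrix}.
\end{equation*}
Thus \(\frac{\partial }{\partial \theta}=-v_{2}\frac{\partial }{\partial v_{1}}+v_{1}\frac{\partial }{\partial v_{2}}\text{,}\) the familiar rotation field.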
Note that if
\(X(\bx)\) is a continuous vector field on
\(S\) and
\(f(\bx)\) is a continuously differentiable function in a neighborhood of
\(S\text{,}\) then
\(D_{X(\bx)}f(\bx)\) is a continuous function on
\(S\text{,}\) which can be computed via any admissible parametrization of
\(S\text{.}\)
The corresponding transformation law for the dual bases is
\begin{equation}
d v_{i}= \sum_{j=1}^{k} \frac{\partial v_{i}}{\partial u_{j}} \, du_{j}.\tag{7.3.6}
\end{equation}
Thus a one-form
\(\sum_{i=1}^{k}\xi_{i}(\bu) du_{i}\) transforms to
\(\sum_{i=1}^{k}\eta_{i}(\bv) dv_{i}\text{,}\) where, using
(7.3.6) we have
\begin{equation*}
\sum_{i=1}^{k}\eta_{i}(\bv) dv_{i}=
\sum_{j=1}^{k} \sum_{i=1}^{k}\eta_{i}(\bv) \frac{\partial v_{i}}{\partial u_{j}} \, du_{j},
\end{equation*}
so it follows that
\begin{equation}
\xi_{j}(\bu)= \sum_{i=1}^{k}\eta_{i}(\bv) \frac{\partial v_{i}}{\partial u_{j}}.\tag{7.3.7}
\end{equation}
In matrix notation, this transformation takes the form of
\begin{equation*}
\begin{bmatrix} \xi_{1}(\bu)\\ \vdots \\ \xi_{k}(\bu)\end{bmatrix} =
\left[\frac{\partial v_{i}}{\partial u_{j}}\right]^{\rm T} \begin{bmatrix} \eta_{1}(\bv)\\ \vdots \\ \eta_{k}(\bv)\end{bmatrix}\text{.}
\end{equation*}
Treating
\(v_{i}\) as a function of
\(\bu\text{,}\) (7.3.6) is simply a case of
(7.3.1). But we should read both sides of
(7.3.6) as covectors in
\(S^{*}_{P}\) with
\(P=\vg(\bu)=\vg^{{\dagger}}(\bv)\text{.}\)
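To see (7.3.7) at work, here is a small example, with the coordinates chosen for illustration: let \(\bu=(r, \theta)\) and \(\bv=F(\bu)=(r\cos\theta,\, r\sin\theta)\) with \(r\gt 0\text{,}\) and consider the one-form \(v_{1}\, dv_{1}+v_{2}\, dv_{2}\text{,}\) so that \((\eta_{1}, \eta_{2})=(v_{1}, v_{2})=(r\cos\theta,\, r\sin\theta)\text{.}\) Then
\begin{equation*}
\begin{bmatrix} \xi_{1}\\ \xi_{2}\end{bmatrix}=
\begin{bmatrix} \cos\theta \amp \sin\theta \\ -r\sin\theta \amp r\cos\theta \end{bmatrix}
\begin{bmatrix} r\cos\theta\\ r\sin\theta\end{bmatrix}=
\begin{bmatrix} r\\ 0\end{bmatrix},
\end{equation*}
so \(v_{1}\, dv_{1}+v_{2}\, dv_{2}=r\, dr\text{.}\) This agrees with computing the differential of \(\frac{1}{2}(v_{1}^{2}+v_{2}^{2})=\frac{1}{2}r^{2}\) in either set of coordinates.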
As a consequence of
(7.3.6), for any differentiable function
\(f\) on
\(S\text{,}\) its differential
\(df\) computed in two different parametrizations satisfies
\begin{equation*}
df=\sum_{i=1}^{k}\frac{\partial f}{\partial u_{i}}\, du_{i}= \sum_{i=1}^{k}\frac{\partial f}{\partial v_{i}}\, dv_{i}.
\end{equation*}
In the context of Stokes' Theorem, suppose that in the integrand \(\xi(\bu)\cdot \bu'(t)\) of the line integral we treat \(\xi(\bu)\) as the coordinates of a one-form instead of a vector field, namely \(\Xi(\bu)=\sum_{i=1}^{k}\xi_{i}(\bu)\,du_{i}\text{,}\) and identify \(\bu'(t)\) with the tangent vector \(\sum_{i=1}^{k}u_{i}'(t)\frac{\partial }{\partial u_{i}}\text{.}\) Then \(\xi(\bu)\cdot \bu'(t)=\langle \Xi(\bu), \bu'(t)\rangle\text{,}\) and under the change of variable \(\bv=F(\bu)\text{,}\) \(\Xi(\bu)\) transforms to \(\sum_{i=1}^{k}\eta_{i}(\bv) dv_{i}\) and \(\bu'(t)\) transforms to \(\bv'(t)\text{.}\) We see that
\begin{equation*}
\langle \Xi(\bu), \bu'(t)\rangle=\sum_{i=1}^{k}\eta_{i}(\bv) v_{i}'(t).
\end{equation*}
This also follows directly from
(7.3.5) and
(7.3.7). Thus, when the integrand is treated as a one-form, the line integral in Stokes' Theorem is invariant under a change of variables.
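A short check of this invariance, using an example chosen here for illustration: in the coordinates \(\bu=(r, \theta)\) take \(\Xi=r\, dr\text{,}\) which by (7.3.7) corresponds to \(v_{1}\, dv_{1}+v_{2}\, dv_{2}\) under \(\bv=F(\bu)=(r\cos\theta,\, r\sin\theta)\text{,}\) and take the curve \(\bu(t)=(1+t,\, t)\text{.}\) In the \(\bu\) coordinates,
\begin{equation*}
\langle \Xi(\bu), \bu'(t)\rangle=r(t)\, r'(t)=1+t.
\end{equation*}
In the \(\bv\) coordinates, \(\bv(t)=((1+t)\cos t,\, (1+t)\sin t)\) and
\begin{align*}
v_{1}v_{1}'+v_{2}v_{2}'=\amp (1+t)\cos t\left( \cos t-(1+t)\sin t\right)
+(1+t)\sin t\left( \sin t+(1+t)\cos t\right)\\
=\amp 1+t,
\end{align*}
so the two integrands agree for every \(t\text{.}\)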
Now that we have introduced tangent vectors and cotangent vectors, the algebraic tensor operations, including tensor product and exterior product, apply to them. Thus in addition to the tangent space
\(S_{P}\) and cotangent space
\(S_{P}^{*}\) at each point
\(P\) of
\(S\text{,}\) there are also spaces of higher order tensors. In an admissible parametrization
\(\bu\in U\subset \bbR^{k}\mapsto \vg(\bu)\in S\text{,}\) \(\{ du_{i_{1}}\otimes \cdots \otimes du_{i_{m}}: 1\le i_{1},\cdots, i_{m}\le k\}\) forms a basis of the space
\({\mathcal T}^{m}(S_{\vg(\bu)})\) of covariant tensors of order
\(m\) of
\(S\) at
\(\vg(\bu)\text{,}\) while
\(\{ du_{i_{1}}\wedge \cdots \wedge du_{i_{m}}: 1\le i_{1}\lt \cdots \lt i_{m}\le k\}\) forms a basis of the space
\({\Lambda}^{m}(S_{\vg(\bu)})\) of covariant alternating tensors of order
\(m\) of
\(S\) at
\(\vg(\bu)\text{.}\)
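Counting the basis elements gives the dimensions of these spaces. For instance, when \(k=3\) and \(m=2\) (a case picked here for illustration), the tensor products \(du_{i}\otimes du_{j}\) with \(1\le i, j\le 3\) give \(9\) basis elements, while the exterior products are just
\begin{equation*}
du_{1}\wedge du_{2}, \quad du_{1}\wedge du_{3}, \quad du_{2}\wedge du_{3}.
\end{equation*}
In general,
\begin{equation*}
\dim {\mathcal T}^{m}(S_{\vg(\bu)})=k^{m}, \qquad \dim {\Lambda}^{m}(S_{\vg(\bu)})=\binom{k}{m}.
\end{equation*}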
Example 7.3.6.
In the case of two dimensions, if \(u_{1}=r, u_{2}=\theta\) are the polar coordinates of \((x_{1}, x_{2})\in {\mathbb R}^{2}\text{,}\) then
\begin{alignat*}{2}
\frac{\partial }{\partial r} \amp= \frac{\partial x}{\partial r} \frac{\partial }{\partial x} +
\frac{\partial y}{\partial r} \frac{\partial }{\partial y} \amp\amp
= \cos \theta \frac{\partial }{\partial x} + \sin \theta \frac{\partial }{\partial y},\\
\frac{\partial }{\partial \theta} \amp= \frac{\partial x}{\partial \theta} \frac{\partial }{\partial x} +
\frac{\partial y}{\partial \theta} \frac{\partial }{\partial y} \amp\amp
= -r \sin \theta \frac{\partial }{\partial x} + r \cos \theta \frac{\partial }{\partial y}.
\end{alignat*}
Since \(\begin{bmatrix} \cos \theta \amp \sin \theta \\ -\sin \theta \amp \cos \theta
\end{bmatrix}\) is an orthogonal matrix, it is convenient to write the above relation as
\begin{equation*}
\begin{bmatrix} \frac{\partial }{\partial r} \\ r^{-1} \frac{\partial }{\partial \theta} \end{bmatrix}
= \begin{bmatrix} \cos \theta \amp \sin \theta \\ -\sin \theta \amp \cos \theta
\end{bmatrix} \begin{bmatrix} \frac{\partial }{\partial x} \\ \frac{\partial }{\partial y} \end{bmatrix},
\end{equation*}
from which we obtain
\begin{equation*}
\begin{bmatrix} \frac{\partial }{\partial x} \\ \frac{\partial }{\partial y} \end{bmatrix}
= \begin{bmatrix} \cos \theta \amp -\sin \theta \\ \sin \theta \amp \cos \theta
\end{bmatrix} \begin{bmatrix} \frac{\partial }{\partial r} \\ r^{-1} \frac{\partial }{\partial \theta} \end{bmatrix}.
\end{equation*}
Thus a vector field in rectangular coordinates \(X(x,y) \frac{\partial }{\partial x} + Y(x, y) \frac{\partial }{\partial y}\text{,}\) when represented in polar coordinates, becomes
\begin{align*}
\amp \begin{bmatrix} X(x,y) \amp Y(x, y) \end{bmatrix}
\begin{bmatrix} \cos \theta \amp -\sin \theta \\ \sin \theta \amp \cos \theta
\end{bmatrix}\begin{bmatrix} \frac{\partial }{\partial r} \\ r^{-1} \frac{\partial }{\partial \theta} \end{bmatrix} \\
=\amp \left( X(r\cos\theta, r\sin\theta) \cos \theta + Y(r\cos\theta, r\sin\theta) \sin\theta\right)
\frac{\partial }{\partial r} \\
+ \amp r^{-1}\left( -X(r\cos\theta, r\sin\theta) \sin \theta + Y(r\cos\theta, r\sin\theta) \cos \theta\right) \frac{\partial }{\partial \theta}.
\end{align*}
If we treat
\(\bbR^{2}\) as equipped with the standard Euclidean metric, then
\(\{ \frac{\partial }{\partial x}, \frac{\partial }{\partial y}\}\) is orthonormal, so its dual basis
\(\{ dx, dy\}\) is also orthonormal in the induced metric on cotangent space
\(\bbR^{2*}\text{.}\) Since the relation between
\(\{ \frac{\partial }{\partial x}, \frac{\partial }{\partial y}\}\) and
\(\{ \frac{\partial }{\partial r} , r^{-1} \frac{\partial }{\partial \theta}\}\) is via an orthogonal matrix, the latter is also orthonormal in the tangent space. It follows that in metric notation we have
\(g(\frac{\partial}{\partial \theta}, \frac{\partial}{\partial \theta})=r^{2}\text{.}\)
Note that the dual basis of \(\{ \frac{\partial }{\partial r} , r^{-1} \frac{\partial }{\partial \theta}\}\) is \(\{dr, r d\theta\}\text{.}\) So \(\{dr, r d\theta\}\) is orthonormal in the induced metric on cotangent space \(\bbR^{2*}\text{.}\) This can also be confirmed by the computation
\begin{equation*}
dx\otimes dx + dy \otimes dy =dr\otimes dr + r^{2}d\theta\otimes d\theta,
\end{equation*}
where one uses
\begin{equation}
dx=\cos \theta \, dr- r\sin\theta\, d\theta, \quad
dy= \sin\theta\, dr + r\cos\theta\, d\theta\text{.}\tag{7.3.8}
\end{equation}
Treating \(r^{2}d\theta\otimes d\theta\) as \((r d\theta)\otimes (r d\theta)\text{,}\) one sees that \(\{dr, r d\theta\}\) is an orthonormal basis. In metric notation we have \(g(d\theta, d\theta)=r^{-2}\text{.}\)
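For completeness, the computation behind the identity \(dx\otimes dx + dy \otimes dy =dr\otimes dr + r^{2}d\theta\otimes d\theta\) can be written out as follows, using only (7.3.8) and the bilinearity of \(\otimes\text{:}\)
\begin{align*}
dx\otimes dx =\amp \cos^{2}\theta\, dr\otimes dr - r\sin\theta\cos\theta\left( dr\otimes d\theta + d\theta\otimes dr\right) + r^{2}\sin^{2}\theta\, d\theta\otimes d\theta,\\
dy\otimes dy =\amp \sin^{2}\theta\, dr\otimes dr + r\sin\theta\cos\theta\left( dr\otimes d\theta + d\theta\otimes dr\right) + r^{2}\cos^{2}\theta\, d\theta\otimes d\theta.
\end{align*}
Adding the two lines, the mixed terms cancel and \(\cos^{2}\theta+\sin^{2}\theta=1\) gives the stated result.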
Note that if
\(\{\alpha_{1},\cdots, \alpha_{n}\}\) and
\(\{\bu_{1},\cdots, \bu_{n}\}\) are
orthonormal dual bases, then
\(\sharp (\alpha_{i})=\bu_{i}\text{.}\) It follows in our case that
\(\sharp (r d\theta)=r^{-1} \frac{\partial }{\partial \theta}\text{,}\) so
\(\sharp (d\theta)=r^{-2} \frac{\partial}{\partial \theta}\text{.}\)
We now treat the vector field \(X(x,y) \frac{\partial }{\partial x} + Y(x, y) \frac{\partial }{\partial y}\) as \(\sharp (X(x,y)\, dx + Y(x,y)\, dy)\text{.}\) Using (7.3.8), the one-form \(X(x,y)\, dx + Y(x,y)\, dy\) can be computed in polar coordinates as
\begin{align*}
\amp
X(x,y)\,\left(\frac{\partial x}{\partial r}\, dr + \frac{\partial x}{\partial \theta}\, d\theta \right)
+ Y(x, y)\, \left(\frac{\partial y}{\partial r}\, dr + \frac{\partial y}{\partial \theta}\, d\theta \right)\\
=\amp X(x,y)\,\left( \cos \theta \, dr- r\sin\theta\, d\theta \right)
+ Y(x, y)\, \left( \sin\theta\, dr + r\cos\theta\, d\theta \right)\\
=\amp \left( X(x,y) \cos \theta + Y(x, y) \sin\theta \right)\,dr +
r \left( -X(x,y) \sin \theta + Y(x, y) \cos\theta \right)\,d\theta,
\end{align*}
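Applying the \(\sharp\) operation to this one-form, with \(\sharp(dr)=\frac{\partial }{\partial r}\) and \(\sharp(d\theta)=r^{-2}\frac{\partial }{\partial \theta}\text{,}\) gives the same result as before. In addition, if \(\Gamma\) is a parametric curve in \(\bbR^{2}\text{,}\) then the line integral can be computed in either coordinate system: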
\begin{align*}
\amp \int_{\Gamma} \left\{ X(x,y)\, dx + Y(x,y)\, dy \right\} \\
=\amp \int_{\Gamma}\left\{ \left( X(x,y) \cos \theta + Y(x, y) \sin\theta \right)\,dr +
r \left( -X(x,y) \sin \theta + Y(x, y) \cos\theta \right)\,d\theta \right\}.
\end{align*}
Similarly, for the exterior product of \(dx\) and \(dy\) we have
\begin{equation*}
dx\wedge dy =\left( \cos \theta \, dr- r\sin\theta\, d\theta\right)\wedge\left( \sin\theta\, dr + r\cos\theta\, d\theta\right) = r \, dr\wedge d\theta.
\end{equation*}
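The intermediate steps in this exterior product use only bilinearity together with \(dr\wedge dr=0\text{,}\) \(d\theta\wedge d\theta=0\) and \(d\theta\wedge dr=-dr\wedge d\theta\text{:}\)
\begin{align*}
\amp \left( \cos \theta \, dr- r\sin\theta\, d\theta\right)\wedge\left( \sin\theta\, dr + r\cos\theta\, d\theta\right)\\
=\amp \cos\theta\sin\theta\, dr\wedge dr + r\cos^{2}\theta\, dr\wedge d\theta - r\sin^{2}\theta\, d\theta\wedge dr - r^{2}\sin\theta\cos\theta\, d\theta\wedge d\theta\\
=\amp r\left( \cos^{2}\theta+\sin^{2}\theta\right) dr\wedge d\theta = r\, dr\wedge d\theta.
\end{align*}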
Finally, for a differentiable function \(f\text{,}\)
\begin{equation*}
df=\frac{\partial f}{\partial x}\, dx + \frac{\partial f}{\partial y}\, dy
=\frac{\partial f}{\partial r}\, dr + \frac{\partial f}{\partial \theta}\, d\theta,
\end{equation*}
so taking \(\sharp\) operation gives
\begin{equation*}
\frac{\partial f}{\partial x}\, \frac{\partial }{\partial x} + \frac{\partial f}{\partial y}\, \frac{\partial }{\partial y}
= \frac{\partial f}{\partial r}\, \frac{\partial }{\partial r} + r^{-2} \frac{\partial f}{\partial \theta}\,
\frac{\partial }{\partial \theta}.
\end{equation*}
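This gives the gradient of \(f\) in polar coordinates: its components with respect to the basis \(\{\frac{\partial }{\partial r}, \frac{\partial }{\partial \theta}\}\) are \((\frac{\partial f}{\partial r}, r^{-2} \frac{\partial f}{\partial \theta})\text{.}\)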
One also notes that \(\left(\frac{\partial f}{\partial x}\right)^{2}+\left(\frac{\partial f}{\partial y}\right)^{2}
\ne \left(\frac{\partial f}{\partial r}\right)^{2}+\left(\frac{\partial f}{\partial \theta}\right)^{2}\) in general. Instead,
\begin{equation*}
\left(\frac{\partial f}{\partial x}\right)^{2}+\left(\frac{\partial f}{\partial y}\right)^{2}
= \left(\frac{\partial f}{\partial r}\right)^{2}+\frac{1}{r^{2}}\left(\frac{\partial f}{\partial \theta}\right)^{2}\text{.}
\end{equation*}
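As a check of this identity on a specific function (chosen here only as an example), let \(f=xy=r^{2}\sin\theta\cos\theta\text{.}\) Then \(\frac{\partial f}{\partial x}=y\text{,}\) \(\frac{\partial f}{\partial y}=x\text{,}\) \(\frac{\partial f}{\partial r}=r\sin 2\theta\) and \(\frac{\partial f}{\partial \theta}=r^{2}\cos 2\theta\text{,}\) so
\begin{equation*}
\left(\frac{\partial f}{\partial x}\right)^{2}+\left(\frac{\partial f}{\partial y}\right)^{2}=x^{2}+y^{2}=r^{2}
=r^{2}\sin^{2} 2\theta+\frac{1}{r^{2}}\, r^{4}\cos^{2} 2\theta
=\left(\frac{\partial f}{\partial r}\right)^{2}+\frac{1}{r^{2}}\left(\frac{\partial f}{\partial \theta}\right)^{2},
\end{equation*}
whereas \(\left(\frac{\partial f}{\partial r}\right)^{2}+\left(\frac{\partial f}{\partial \theta}\right)^{2}=r^{2}\sin^{2} 2\theta+r^{4}\cos^{2} 2\theta\text{,}\) which differs from \(r^{2}\) unless \(r=1\) or \(\cos 2\theta=0\text{.}\)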
Exercises
1.
Use the relation between rectangular and spherical polar coordinates in \(\bbR^{3}\text{:}\)
\begin{equation*}
\begin{cases} x_{1} \amp = r\sin\theta\cos\phi \\
x_{2} \amp = r \sin\theta\sin\phi \\
x_{3} \amp = r \cos\theta\\
\end{cases}
\end{equation*}
to determine \(dx_{1}, dx_{2}, dx_{3}\) in terms of \(dr, d\theta, d\phi\text{.}\) Then for a differentiable function \(f\) determine \(\frac{\partial f}{\partial r}, \frac{\partial f}{\partial \theta}, \frac{\partial f}{\partial \phi}\) in terms of \(\frac{\partial f}{\partial x_{i}}, i=1, 2, 3\text{.}\)
Hint.
Use
\(df=\sum_{i=1}^{3}\frac{\partial f}{\partial x_{i}}dx_{i}\) and substitute
\(dx_{i}\) in terms of
\(dr, d\theta, d\phi\) to identify
\(\frac{\partial f}{\partial r}, \frac{\partial f}{\partial \theta}, \frac{\partial f}{\partial \phi}\text{.}\)
2.
On the sphere
\(\sum_{i=1}^{3}x_{i}^{2}=1\text{,}\) consider
\((x_{1}, x_{2})\) and
\((\theta, \phi)\) as coordinates for a portion of the upper hemisphere. Find the relations between
\(dx_{1}, dx_{2}\) and
\(d\theta, d\phi\text{,}\) as well as between
\(\frac{\partial }{\partial x_{1}}, \frac{\partial }{\partial x_{2}}\) and
\(\frac{\partial }{\partial \theta}, \frac{\partial }{\partial \phi}\text{.}\) Identify a choice of the domain on which the change of coordinates between these two sets of coordinates is continuously differentiable and has continuously differentiable inverse.
3.
Rewrite the system of ODE
\begin{equation*}
\begin{cases} x'(t) \amp =- \frac{\partial f (x, y)}{\partial x} \\
y'(t) \amp = - \frac{\partial f (x, y)}{\partial y} \\
\end{cases}
\end{equation*}
as a system in the polar coordinates \((r, \theta)\) of \((x, y)\text{,}\) expressed in terms of the partial derivatives of \(f\) with respect to \(r\) and \(\theta\text{.}\)
Subsection 7.3.2 Tangent Map/Differential and its Pull-Back of a Differentiable Map
Suppose that
\(F: U\subset \bbR^{m}\mapsto \bbR^{n}\) is differentiable. Then for any
\(P\in U\text{,}\) \(DF(P)\) is a linear map from
\(\bbR^{m}_{P} \) to
\(\bbR^{n}_{F(P)}\) mapping
\(\bv\) to
\(DF(P)\bv\text{.}\) This linear map is denoted as
\(F_{*}(P)\) and is often called the
tangent map of
\(F\) at
\(P\text{;}\) it is also called the differential of
\(F\) and denoted as
\(dF\text{.}\)
The action of \(F_{*}(P)\) can also be seen through how a differentiable function \(f\) on \(\bbR^{n}\) is differentiated through \(F\text{:}\)
\begin{equation*}
D_{\frac{\partial }{\partial u_{i}}}\left( f\circ F\right)=D_{F_{*}\left(\frac{\partial }{\partial u_{i}}\right) }f\text{,}
\end{equation*}
namely, to treat \(F_{*}\left(\frac{\partial }{\partial u_{i}}\right)\) as tangent vector \(D_{u_{i}}F(P)\) at \(F(P)\text{.}\) When \(F\) is a parametrization map, we have identified in our notation \(F_{*}\left(\frac{\partial }{\partial u_{i}}\right)\) with \(\left(\frac{\partial }{\partial u_{i}}\right)\text{.}\)
Note that
\(F_{*}(P)\) is determined in terms of the first derivatives of
\(F\text{,}\) so a more accurate notation would have been
\(DF(P)\) or
\(dF(P)\text{,}\) but it has been traditional to use
\(DF(P)\) as the matrix representation of
\(F_{*}\) under the chosen coordinates.
This notion and notation turn out to be very useful. Suppose \(F(P)=Q\text{.}\) Write out \(F\) in components
\begin{equation*}
x_{i}= F_{i}({\mathbf u})=F_{i}(u_{1},\ldots, u_{m}), 1\le i \le n.
\end{equation*}
Then each \(F_{i}\) is a differentiable function of \({\mathbf u}\text{,}\) thus
\begin{equation*}
dF_{i}({\mathbf u})=\sum_{j=1}^{m}\frac{\partial F_{i}}{\partial u_{j}}du_{j}.
\end{equation*}
The geometric interpretation of the linear map \(F_{*}: {\mathbb R}^{m}_{P} \mapsto
{\mathbb R}^{n}_{Q}\) is seen as follows. For any \({\bv}\in {\mathbb R}^{m}_{P}\text{,}\) \(P+t{\bv}\) is a curve in \({\mathbb R}^{m}\) passing through \(P\) with tangent \({\bv}\) at \(t=0\text{,}\) and \({\mathbf x}(t):=F(P+t{\bv})\) is a curve in \({\mathbb R}^{n}\) passing through \(Q\) at \(t=0\text{,}\) whose tangent at \(t=0\) is
\begin{equation*}
{\mathbf x}'(0)=DF(P) {\bv}.
\end{equation*}
Thus \({\mathbf x}'(0)=F_{*}(\bv)\text{.}\) In components we see
\begin{equation*}
x_{i}'(0)=\sum_{j=1}^{m} v_{j} \frac{\partial F_{i}}{\partial u_{j}}(P)=dF_{i}(P)(\bv)\text{.}
\end{equation*}
This is a reason for using \(dF\) as a common notation for \(F_{*}\text{.}\) Another way of writing this relation is to note that
\begin{align*}
d(f\circ F)({\bv})=\amp D_{{\bv}} (f\circ F) \\
=\amp \sum_{j=1}^{m} v_{j} \frac{\partial (f\circ F)}{\partial u_{j}}\\
=\amp \sum_{j=1}^{m} v_{j} \sum_{i=1}^{n}
\frac{\partial f}{\partial x_{i}}\frac{\partial F_{i}}{\partial u_{j}} \\
= \amp \sum_{i=1}^{n} \frac{\partial f}{\partial x_{i}}
\sum_{j=1}^{m} v_{j} \frac{\partial F_{i}}{\partial u_{j}}\\
= \amp \sum_{i=1}^{n} x_{i}'(0) \frac{\partial f}{\partial x_{i}}\\
= \amp D_{\mathbf w} f\\
=\amp df(\mathbf w) \\
=\amp df (dF(\bv)),
\end{align*}
where \(\bw=dF(\bv)\text{.}\) Namely,
\begin{equation*}
d(f\circ F)=df \circ dF \text{ and $D_{{\bv}} (f\circ F)=D_{dF(\bv)}f$}\text{.}
\end{equation*}
It is easier to understand this relation in terms of the following diagram.
\begin{equation*}
d(f\circ F): {\mathbb R}^{m}_{P} \xmapsto{dF}
{\mathbb R}^{n}_{Q} \xmapsto{df} \bbR_{f(Q)}.
\end{equation*}
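The relation \(d(f\circ F)=df\circ dF\) can be checked on a small example; the maps below are chosen here for illustration. Let \(m=n=2\text{,}\) \(F(u_{1},u_{2})=(u_{1}\cos u_{2},\, u_{1}\sin u_{2})\) and \(f(\bx)=x_{1}^{2}+x_{2}^{2}\text{,}\) so that \((f\circ F)(\bu)=u_{1}^{2}\) and \(d(f\circ F)(\bv)=2u_{1}v_{1}\text{.}\) On the other hand,
\begin{equation*}
\bw=dF(\bv)=\left( v_{1}\cos u_{2}-v_{2}u_{1}\sin u_{2},\; v_{1}\sin u_{2}+v_{2}u_{1}\cos u_{2}\right),
\end{equation*}
and, evaluating \(df=2x_{1}\, dx_{1}+2x_{2}\, dx_{2}\) at \(\bx=F(\bu)\text{,}\)
\begin{align*}
df(\bw)=\amp 2u_{1}\cos u_{2}\left( v_{1}\cos u_{2}-v_{2}u_{1}\sin u_{2}\right)
+2u_{1}\sin u_{2}\left( v_{1}\sin u_{2}+v_{2}u_{1}\cos u_{2}\right)\\
=\amp 2u_{1}v_{1},
\end{align*}
in agreement with \(d(f\circ F)(\bv)\text{.}\)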
Using the dual maps (or pull-backs), we have
\begin{equation*}
\bbR_{f(Q)}^{*} \xmapsto{(df)^{*}} {\mathbb R}^{n*}_{Q} \xmapsto{(dF)^{*}} {\mathbb R}^{m*}_{P},
\end{equation*}
where, if we denote \(z=f(\bx)\text{,}\) then \((df)^{*}(dz)=\sum_{i=1}^{n}\frac{\partial f}{\partial x_{i}} dx_{i}\text{,}\) and
\begin{equation*}
(dF)^{*}(dx_{i})=\sum_{j=1}^{m} \frac{\partial x_{i}}{\partial u_{j}} du_{j}.
\end{equation*}
We note that many books use \(f^{*}\) to denote \((df)^{*}\text{,}\) and \(F^{*}\) to denote \((dF)^{*}\text{.}\) Thus we have the corresponding relation \((f\circ F)^{*}=F^{*}\circ f^{*}\text{,}\) and the pull-back operation behaves like substitution in the differential: \(f^{*}(dz)=df(\bx)
=\sum_{i=1}^{n}\frac{\partial f}{\partial x_{i}} dx_{i}\text{,}\) etc.
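As a sketch of the substitution rule, let \(F(u_{1},u_{2})=(u_{1}\cos u_{2},\, u_{1}\sin u_{2})\text{;}\) this map and the one-form below are examples chosen here, and the coefficients of the one-form are understood to be evaluated at \(\bx=F(\bu)\text{.}\) Substituting \(x_{1}=u_{1}\cos u_{2}\text{,}\) \(x_{2}=u_{1}\sin u_{2}\) together with
\begin{equation*}
F^{*}(dx_{1})=\cos u_{2}\, du_{1}-u_{1}\sin u_{2}\, du_{2}, \qquad
F^{*}(dx_{2})=\sin u_{2}\, du_{1}+u_{1}\cos u_{2}\, du_{2},
\end{equation*}
we find
\begin{align*}
F^{*}(-x_{2}\, dx_{1}+x_{1}\, dx_{2})
=\amp -u_{1}\sin u_{2}\left( \cos u_{2}\, du_{1}-u_{1}\sin u_{2}\, du_{2}\right)
+u_{1}\cos u_{2}\left( \sin u_{2}\, du_{1}+u_{1}\cos u_{2}\, du_{2}\right)\\
=\amp u_{1}^{2}\, du_{2}.
\end{align*}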
Exercises
1.
Consider \((x_{1}, x_{2}, x_{3})=F(r, \theta, \phi)\) given by
\begin{equation*}
\begin{cases} x_{1} \amp = r\sin\theta\cos\phi \\
x_{2} \amp = r \sin\theta\sin\phi \\
x_{3} \amp = r \cos\theta.\\
\end{cases}
\end{equation*}
-
Compute \(F_{*}(\frac{\partial }{\partial r}), F_{*}(\frac{\partial }{\partial \theta}),
F_{*}(\frac{\partial }{\partial \phi})\text{.}\)
-
Compute \(F^{*}(dx_{i})\text{,}\) \(F^{*}(dx_{1}\wedge dx_{2})\text{,}\) \(F^{*}(dx_{2}\wedge dx_{3})\) and \(F^{*}(dx_{3}\wedge dx_{1})\text{.}\)
-
Compute \(F^{*}(dx_{1}\otimes dx_{1}+ dx_{2}\otimes dx_{2}+dx_{3}\otimes dx_{3})\text{.}\)
-
Let \(G(\theta, \phi)=F(1, \theta, \phi)\text{.}\) Compute \(G_{*}(\frac{\partial }{\partial \theta}),
G_{*}(\frac{\partial }{\partial \phi})\text{,}\) \(G^{*}(dx_{i})\text{,}\) \(G^{*}(dx_{1}\wedge dx_{2})\text{,}\) \(G^{*}(dx_{2}\wedge dx_{3})\text{,}\) and \(G^{*}(dx_{3}\wedge dx_{1})\text{.}\)
-
Compute \(G^{*}(x_{1}dx_{1}+x_{2}dx_{2}+x_{3}dx_{3})\) and \(G^{*}(dx_{1}\otimes dx_{1}+ dx_{2}\otimes dx_{2}+dx_{3}\otimes dx_{3})\text{.}\)
2.
Let
\((x, y, z)=F(u, v)=(u^{2}-v^{2}, 2u v, 1)\text{.}\) Compute
\(F_{*}(\frac{\partial }{\partial u})\text{,}\) \(F^{*}(y dx + z dy)\) and
\(F^{*}(dx\wedge dy + dy\wedge dz)\text{.}\)