
Section 6.6 Change of Variables in Multi-dimensional Integral

The change of variables formula for multi-dimensional integrals takes the following form:
\begin{equation} \int_{T(E)} f({\mathbf y})\, d{\mathbf y} = \int_{E} f(T({\mathbf x}))|\det DT({\mathbf x})|\, d{\mathbf x}\tag{6.6.1} \end{equation}
where the change of variables transformation \(T\) is continuously differentiable in an open domain \(U\) such that \(E\subset U\text{,}\) and further conditions on \(f, E, T\) will be spelled out later.
Although an appropriate formulation and proof for general domains of integration take a lot of work, there is a fairly intuitive idea behind (6.6.1), and the heart of the matter is reflected in the case of \(f=1\text{,}\) which gives the volume \(v(T(E))\) of \(T(E)\text{,}\) defined as \(\int_{T(E)}1\, d{\mathbf y}\text{:}\)
\begin{equation} v(T(E))=\int_{T(E)}1\, d{\mathbf y}= \int_{E} |\det DT({\mathbf x})|\, d{\mathbf x}.\tag{6.6.2} \end{equation}
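As a quick sanity check of (6.6.2), consider a linear map; the specific matrix and set below are chosen only for illustration and do not appear in the text. Take \(T({\mathbf x})=A{\mathbf x}\) on \(\mathbb R^{2}\) with
\begin{equation*} A=\begin{bmatrix} 2 \amp 1 \\ 0 \amp 3 \end{bmatrix}, \quad E=[0,1]\times[0,1]. \end{equation*}
Then \(DT({\mathbf x})=A\) for every \({\mathbf x}\text{,}\) and \(T(E)\) is the parallelogram spanned by the columns \((2,0)\) and \((1,3)\) of \(A\text{,}\) whose area is \(|2\cdot 3-1\cdot 0|=6\text{.}\) The right-hand side of (6.6.2) gives the same value:
\begin{equation*} \int_{E} |\det DT({\mathbf x})|\, d{\mathbf x}=|\det A|\, v(E)=6\cdot 1=6. \end{equation*}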
Here is a version of the change of variables formula.

Remark 6.6.2.

The formulation of the above theorem implicitly assumes that \(T(E)\) is Jordan measurable. This, together with the following, are consequences of (6.6.3) using the Inverse Function Theorem.
  1. \(T(U)\) is an open set and there is a well defined inverse of \(T\) on \(T(U)\text{,}\) which is continuously differentiable.
  2. For any subset \(E\subset U\text{,}\) \(\partial \left( T(E) \right) =T(\partial E)\text{.}\)
  3. \(T\) maps any subset of \(U\) of measure zero to a set of measure zero.
  4. \(T\) maps any Jordan-measurable subset of \(U\) to a Jordan-measurable set.
Recall that the volume of a set \(E\) is defined as \(\int_{E}\chi_{E}\) when the latter is well defined; this turns out to be equivalent to (a) \(\partial E\) has measure zero, and (b) \(v(E)=\sup\left\{\sum_{S_{i} \subset E}v(S_{i})\right\}\text{,}\) where \(S_{i}\) are the subrectangles of a partition. Condition (b) is simply a restatement of \(v(E)=\sup L(\chi_{E}, \cP)\text{,}\) noting that \(L(\chi_{E}, \cP)= \sum_{S_{i} \subset E}v(S_{i})\) for any partition \(\cP\text{.}\)

Remark 6.6.3.

In the one-variable case, the change of variables transformation \(y=T(x)\) is not required to be a bijection, and we don’t put the absolute value sign around the Jacobian, as our conventional notation encodes the orientation: if \(T: [a, b]\to [c, d]\text{,}\) then
\begin{equation*} \int_{T(a)}^{T(b)} f(y)\, dy = \int_{a}^{b} f(T(x)) T'(x)\, dx, \end{equation*}
where, in case \(T\) is a bijection and \(T(a)>T(b)\text{,}\) we have \(\int_{T(a)}^{T(b)} f(y)\, dy =- \int_{T(b)}^{T(a)} f(y)\, dy=-\int_{T[a, b]} f(y)\, dy\text{,}\) which is consistent with
\begin{equation*} \int_{T[a, b]} f(y)\, dy = \int_{[a, b]} f(T(x)) |T'(x)|\, dx. \end{equation*}
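To see the two conventions side by side, here is a small worked instance; the choice \(T(x)=1-x\) on \([0,1]\) with \(f(y)=y\) is an illustrative assumption. Here \(T'(x)=-1\text{,}\) \(T(0)=1>T(1)=0\text{,}\) and \(T[0,1]=[0,1]\text{.}\) The oriented formula reads
\begin{equation*} \int_{T(0)}^{T(1)} y\, dy=\int_{1}^{0} y\, dy=-\frac12, \qquad \int_{0}^{1} (1-x)(-1)\, dx=-\frac12, \end{equation*}
while the unoriented formula with the absolute value reads
\begin{equation*} \int_{T[0,1]} y\, dy=\int_{[0,1]} y\, dy=\frac12, \qquad \int_{[0,1]} (1-x)\,|-1|\, dx=\frac12. \end{equation*}
Each formula holds, and the two differ from each other exactly by the sign of \(T'\text{.}\)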

Proof.

(of properties in Remark 6.6.2) We only provide some details for item 3. We first construct a sequence of subsets \(U_{i}\) of \(U\text{,}\) each with compact closure in \(U\text{,}\) satisfying \(U=\cup_{i} U_{i}\text{.}\) In fact we can take each \(U_{i}\) to be a hypercube with compact closure \(\bar U_{i}\) in \(U\) and further require that \(\bar U_{i}\) is contained in another hypercube \(V_{i}\) with compact closure \(\bar V_{i}\subset U\text{.}\)
Let \(E\subset U\) be a set of measure \(0\text{.}\) Then \(E_{i} :=E\cap U_{i}\subset U\) is a set of measure \(0\) and it suffices to prove that each \(T(E_{i})\) is a set of measure \(0\text{.}\)
Since \(DT\) is continuous on \(U\) and \(\bar V_{i}\subset U\) is compact and convex, there exists a bound \(C_{i}>0\) such that \(\Vert DT(\bx)\Vert \le C_{i}\) for all \(\bx \in \bar V_{i}\text{.}\) This then implies that
\begin{equation*} \Vert T(\bx)-T(\by)\Vert \le C_{i} \Vert \bx -\by \Vert \text{ for any $\bx, \by\in \bar V_{i}$.} \end{equation*}
As a consequence, any hypercube \(Q\) in \(\bar V_{i} \) with all side lengths equal to \(l\) is mapped by \(T\) to \(T(Q)\text{,}\) which is contained in a hypercube \(Q'\) of side length no more than \(\sqrt n C_{i }l\) and \(|Q'|\le n^{n/2}C_{i}^{n} |Q|\text{.}\)
For any \(\epsilon >0\text{,}\) let \(\{Q_{\alpha}\}\) be a finite or countable collection of hypercubes covering \(E_{i}\) such that \(\sum_{\alpha} |Q_{\alpha}| \lt \epsilon\text{.}\) We may assume that each \(Q_{\alpha}\subset U_{i}\text{:}\) if some of the original \(Q_{\alpha}\) do not satisfy this, we can simply replace each of them by \(Q_{\alpha}\cap U_{i}\text{.}\)
Claim: We can replace this collection, if necessary, by a collection of hypercubes \(\{ Q^*_{\beta}\subset V_{i}\}\text{,}\) such that each \(Q^*_{\beta}\) is a hypercube with all side lengths equal and \(\sum_{\beta} | Q^*_{\beta}| \lt 2\epsilon\text{.}\)
This is seen by working with each individual \(Q_{\alpha}\text{:}\) suppose it is given by \([a_{1},b_{1}]\times\cdots \times [a_{n},b_{n}]\text{,}\) then choose \(\sigma >0\) and \(k\in \bbN\) such that
\begin{equation*} (1+\sigma)^{n} \le 1+\epsilon, \quad 2^{1-k} \lt (b_{j}-a_{j})\sigma \text{ for all $1\le j \le n$.} \end{equation*}
By partitioning each axis into intervals of length \(2^{-k}\text{,}\) we find \(\hat a_{j}\lt \hat b_{j} \in 2^{-k}\bbZ\) such that
\begin{equation*} \hat a_{j}\le a_{j}\lt b_{j}\le \hat b_{j}, \quad \hat b_{j}-\hat a_{j}\le b_{j}-a_{j} +2^{1-k} \le ( b_{j}-a_{j})(1+\sigma). \end{equation*}
The hypercube \(\hat Q_{\alpha}=[\hat a_1, \hat b_{1}]\times \cdots \times [\hat a_{n}, \hat b_{n}]\) contains \(Q_{\alpha}\) with
\begin{equation*} |\hat Q_{\alpha}|=\prod_{j=1}^{n}(\hat b_{j}-\hat a_{j})\le |Q_{\alpha}|(1+\epsilon). \end{equation*}
Finally we can make sure that each \(\hat Q_{\alpha}\subset V_{i}\) by working with a larger \(k\text{,}\) if necessary, and that each \(\hat Q_{\alpha}\) is the union of a finite number of hypercubes of side length \(2^{-k}\) with non-overlapping interior, therefore \(|\hat Q_{\alpha}|\) is the sum of the volumes of these hypercubes. Collecting these hypercubes we get an at most countable collection \(\{ Q^*_{\beta}\subset V_{i}\}\) such that
\begin{equation*} \sum_{\beta} | Q^*_{\beta}| = \sum_{\alpha}|\hat Q_{\alpha}| \lt (1+\epsilon)\epsilon \lt 2\epsilon\text{,} \end{equation*}
if we have chosen \(1>\epsilon>0\text{.}\)
Back to our proof of item 3 in Remark 6.6.2. \(T(E_{i})\) is contained in \(\cup_{\beta} T(Q^{*}_{\beta})\text{,}\) and each \(T(Q^{*}_{\beta})\) is contained in a cube \(Q'_{\beta}\) whose volume is \(\le n^{n/2}C_{i}^{n} |Q^{*}_{\beta}|\text{,}\) therefore we have \(\sum_{\beta}|Q'_{\beta}| \le 2n^{n/2}C_{i}^{n} \epsilon\text{.}\) Since \(n, C_{i}^{n}\) are fixed here and \(\epsilon >0\) is arbitrary, this shows that \(T(E_{i})\) has measure \(0\text{.}\)
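For a concrete feel for the constants in this argument, suppose, purely as an illustration, that \(n=2\) and \(C_{i}=3\text{.}\) A square \(Q\) of side length \(l\) has diameter \(\sqrt 2\, l\text{,}\) so by the Lipschitz estimate \(T(Q)\) has diameter at most \(3\sqrt 2\, l\) and is therefore contained in a square \(Q'\) of side length at most \(3\sqrt 2\, l\text{:}\)
\begin{equation*} |Q'|\le (3\sqrt 2\, l)^{2}=18\, l^{2}=n^{n/2}C_{i}^{n}|Q|, \quad n^{n/2}C_{i}^{n}=2^{1}\cdot 3^{2}=18. \end{equation*}
A cover of \(E_{i}\) by such squares of total area less than \(2\epsilon\) is thus mapped into a cover of \(T(E_{i})\) of total area less than \(36\epsilon\text{.}\)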

Remark 6.6.4.

In the above argument we choose to work with nested hypercubes \(U_{i}\subset \bar U_{i}\subset V_{i}\subset \bar V_{i}\subset U\) only to give us room in \(\bar V_{i}\) to modify the hypercubes \(Q_{\alpha}\) in the cover of \(E_{i}\) so that each is the non-overlapping union of a finite number of hypercubes of equal side lengths. Similar ideas can be used to prove the following lemma, which could have been used to simplify the above proof.

Proof.

(of (6.6.2)) The central idea in proving (6.6.2) is that for any open subset \(U'\subset U\) with a compact closure \(\bar U' \subset U\) and \(\epsilon>0\text{,}\) there exists a \(\delta >0\) such that for any sufficiently small cube \(Q\subset U'\text{,}\) in the sense that its side lengths are equal and no more than \(\delta\text{,}\) we have
\begin{equation*} v(T(Q))\le (1 +\Lambda \epsilon)^{n} \left|\det \left(D T(\bar {\mathbf x})\right)\right| v(Q), \end{equation*}
where \(\bar {\mathbf x}\) is any point in \(Q\text{,}\) but will be taken as the center of \(Q\text{,}\) and \(\Lambda \ge 1\) depends on \(T, U'\) such that
\begin{equation*} \Lambda^{-1}\Vert \mathbf u \Vert \le \Vert DT({\mathbf x}) \mathbf u \Vert \le \Lambda \Vert \mathbf u\Vert, \forall {\mathbf x} \in U', \mathbf u \in \mathbb R^{n}. \end{equation*}
This is seen by the linear Taylor approximation of \(T({\mathbf x})\text{:}\)
\begin{equation*} T({\mathbf x})=T(\bar{\mathbf x}) + \left(DT(\bar{\mathbf x}) +C({\mathbf x},\bar {\mathbf x})\right)({\mathbf x}-\bar {\mathbf x}) \text{ for any $\bx, \bar \bx \in U'$,} \end{equation*}
where \(|C({\mathbf x},\bar {\mathbf x})|\le \epsilon\) as long as \(\Vert {\mathbf x}-\bar {\mathbf x} \Vert \le \delta\) for some \(\delta >0\) which depends on \(T\text{,}\) \(U'\) and \(\epsilon >0\text{.}\) For any cube \(Q\) in \(U'\) with \(\bar {\mathbf x}\) as center and side lengths equal and no more than \(\delta\text{,}\)
\begin{equation*} \left\{ T(\bar {\mathbf x}) + DT(\bar{\mathbf x})({\mathbf x}-\bar {\mathbf x}): {\mathbf x} \in Q\right\} \end{equation*}
is a parallelepiped with volume \(|\det \left(D T(\bar {\mathbf x})\right)| v(Q)\text{.}\)
We will use the Taylor expansion above to estimate the volume of \(T(Q)\text{.}\) It’s easier to decompose \(T=T_{1}\circ T_{2}\) on \(Q\text{,}\) where \(T_{1}\) is the linear map given by the matrix \(\left(D T(\bar {\mathbf x})\right)\text{,}\) and \(T_{2}=T_{1}^{-1}\circ T\text{.}\) Note that the Jacobian matrix of \(T_{2}\) at \(\bar \bx\) is equal to the identity. If we apply \(T_{1}^{-1}\) to the Taylor expansion above we get
\begin{equation*} T_{2}(\bx)=T_{1}^{-1}\circ T(\bar{\mathbf x}) + \left(I+T_{1}^{-1}\circ C({\mathbf x},\bar {\mathbf x})\right)({\mathbf x}-\bar {\mathbf x}). \end{equation*}
It follows that \(T_{2}(Q)\) is contained in a hypercube \(Q'\) with \(T_{1}^{-1}\circ T(\bar {\mathbf x})\) as center and side lengths equal to the side lengths of \(Q\) multiplied by \((1+\Lambda \epsilon)\text{.}\) Furthermore, \(T(Q)=T_{1}\circ T_{2}(Q)\subset T_{1}(Q')\) and \(v(T_{1}(Q'))=|\det DT(\bar {\mathbf x})| |Q'|\text{.}\) Thus
\begin{equation*} v(T(Q)) \le \left|\det \left(DT(\bar {\mathbf x})\right)\right| (1+\Lambda \epsilon)^n v (Q). \end{equation*}
The approach here follows that in the article by J. Schwartz, The formula for change in variables in a multiple integral, Amer. Math. Monthly 61, (1954), 81--85.
The above estimate also holds for hypercubes whose ratios of side lengths are within \(1\pm \epsilon\text{,}\) with a somewhat modified constant replacing \((1+\Lambda \epsilon)^n\text{;}\) that constant still approaches \(1\) as \(\epsilon \to 0\text{,}\) and we will keep using \((1+\Lambda \epsilon)^n\) in the estimate for such hypercubes.
For any closed rectangle \(S\subset U\) and any \(\epsilon>0\text{,}\) we can take a fine enough partition \(\cP=\{Q_{\alpha}\}\) of \(S\) into hypercubes such that the above estimate holds for each \(Q_{\alpha}\text{.}\) Here we didn’t use Lemma 6.6.5 to produce subrectangles \(Q_{\alpha}\) of \(S\) with equal side lengths, as we want to work with a finite number of subrectangles in a partition.
It now follows that
\begin{equation*} v(T(S))=\sum_{\alpha}v(T(Q_{\alpha})) \le (1+\Lambda \epsilon)^n \sum_{\alpha} \left|\det \left(DT(\bar {\mathbf x}_{\alpha})\right)\right| v (Q_{\alpha})\text{.} \end{equation*}
The summation above is a Riemann sum for the integral of \(\left|\det \left(DT({\mathbf x})\right)\right|\) on \(S\text{,}\) so as the partition size tends to \(0\text{,}\) we get
\begin{equation*} v(T(S)) \le (1 +\Lambda \epsilon)^{n}\int_{S} \left|\det \left(D T( {\mathbf x})\right)\right|\, d{\mathbf x}. \end{equation*}
Since \(\epsilon>0\) is arbitrary, it follows that
\begin{equation*} v(T(S))\le \int_{S} \left|\det \left(DT( {\mathbf x})\right)\right|\, d{\mathbf x}. \end{equation*}
This argument works not only for rectangles, but for all Jordan-measurable sets. In fact, for any Jordan-measurable set \(E\) whose closure is in \(U\text{,}\) and any non-negative function \(f\text{,}\) integrable on \(T(E)\text{,}\)
\begin{equation} \int_{T(E)}f({\mathbf y})\, d{\mathbf y} \le \int_{E} f(T({\mathbf x}))|\det DT({\mathbf x})|\, d{\mathbf x}.\tag{6.6.4} \end{equation}
Here are some more details for proving (6.6.4). In defining \(\int_{T(E)}f({\mathbf y})\, d{\mathbf y}\text{,}\) we may work with partitions \(\cP\) in the \(y\)-space such that any of its subrectangles that has non-empty intersection with \(T(E)\) must be contained in \(T(U)\text{.}\) Then
\begin{align*} L(f \chi_{T(E)}, \cP) \amp = \sum_{S_{j}: S_{j}\cap T(E)^{c}\ne \emptyset} m_{S_{j}}(f \chi_{T(E)})v(S_{j}) + \sum_{S_{j}: S_{j}\subset T(E)} m_{S_{j}}(f \chi_{T(E)})v(S_{j})\\ \amp \le \sum_{S_{j}: S_{j} \subset T(E) } \int_{T^{-1}(S_{j})} f(T({\mathbf x})) |\det DT({\mathbf x})|\, d{\mathbf x}\\ \amp \le \int_{E} f(T({\mathbf x})) |\det DT({\mathbf x})|\, d{\mathbf x}, \end{align*}
where we have used \(m_{S_{j}}(f \chi_{T(E)})=0\) when \(S_{j}\cap T(E)^{c}\ne \emptyset\text{,}\) \(v(S_{j})\le \int_{T^{-1}(S_{j})} |\det DT({\mathbf x})|\, d{\mathbf x}\) and \(0\le m_{S_{j}}(f \chi_{T(E)}) \le f(T({\mathbf x}))\) for \({\mathbf x}\in T^{-1}(S_{j})\) when \(S_{j}\subset T(E)\text{,}\) as well as the fact that the sets \(T^{-1}(S_{j})\) with \(S_{j}\subset T(E)\) are non-overlapping subsets of \(E\text{.}\) It follows that
\begin{equation*} \int_{T(E)}f({\mathbf y})\, d{\mathbf y} =\sup_{\cP} L(f \chi_{T(E)}, \cP) \le \int_{E} f(T({\mathbf x})) |\det DT({\mathbf x})|\, d{\mathbf x}. \end{equation*}

Proof.

(of (6.6.1)) First, since the set of discontinuity of \(f(T(\bx))|\det DT(\bx)|\) is \(T^{-1}(D)\text{,}\) where \(D\) is the set of discontinuity of \(f(\by)\) in \(T(E)\text{,}\) and \(T^{-1}(D)\) has measure \(0\) due to \(D\) having measure \(0\text{,}\) it follows that \(f(T(\bx))|\det DT(\bx)|\) is integrable on \(E\text{.}\)
It suffices to prove (6.6.1) for the case with \(f\ge 0\text{.}\) For the general case, we split an integrable function as the difference of its positive and negative parts: \(f=f^{+}-f^{-}\text{.}\)
We can now apply (6.6.4) with \(T^{-1}: T(E)\to E\) in place of \(T\) and \(f(T({\mathbf x})) |\det DT({\mathbf x})|\) as the integrand to obtain
\begin{align*} \int_{E} f(T({\mathbf x})) |\det DT({\mathbf x})|\, d{\mathbf x} \amp \le \int_{T(E)} f({\mathbf y}) \lvert \det DT(T^{-1}({\mathbf y}))\rvert |\det D T^{-1}({\mathbf y})|\, d{\mathbf y} \\ \amp = \int_{T(E)} f({\mathbf y})\, d{\mathbf y}, \end{align*}
where we have used \(|\det DT(T^{-1}({\mathbf y}))| |\det D T^{-1} ({\mathbf y})|=1\text{.}\) Combined with the reverse inequality (6.6.4), this establishes (6.6.1) for non-negative integrable functions.
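The determinant identity used in the last step is a consequence of the chain rule; the short derivation below is added for convenience and uses only the differentiability of \(T^{-1}\) from Remark 6.6.2. Differentiating \(T(T^{-1}({\mathbf y}))={\mathbf y}\) on \(T(U)\) gives
\begin{equation*} DT(T^{-1}({\mathbf y}))\, DT^{-1}({\mathbf y})=I, \end{equation*}
and taking determinants and absolute values,
\begin{equation*} |\det DT(T^{-1}({\mathbf y}))|\, |\det DT^{-1}({\mathbf y})|=|\det I|=1. \end{equation*}
As an illustrative check (the example is an assumption of this note), take the polar coordinates map \(T(r,\theta)=(r\cos\theta, r\sin\theta)\) with \(r>0\text{,}\) for which \(\det DT(r,\theta)=r\text{.}\) On any region where \(\theta\) is a differentiable function of \((x,y)\text{,}\) with \(r=\sqrt{x^{2}+y^{2}}\text{,}\)
\begin{equation*} DT^{-1}(x,y)=\begin{bmatrix} x/r \amp y/r \\ -y/r^{2} \amp x/r^{2} \end{bmatrix}, \quad \det DT^{-1}(x,y)=\frac{x^{2}+y^{2}}{r^{3}}=\frac1r, \end{equation*}
so the product of the two determinants is \(r\cdot \frac1r=1\text{.}\)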

Remark 6.6.7.

In applications often we can’t apply the change of variables formula directly as the assumptions may not be satisfied in the form as stated, and we need to apply some approximation procedure.
One of the most commonly used change of variables is that from rectangular coordinates to polar coordinates:
\begin{equation*} \begin{bmatrix} x \\ y \end{bmatrix} = T \begin{bmatrix} r \\ \theta \end{bmatrix} = \begin{bmatrix} r \cos \theta \\ r\sin \theta \end{bmatrix}. \end{equation*}
The Jacobian of \(T\) is \(J_T(r,\theta)=r\text{.}\) If \(U=\left\{(r,\theta): 0 < r <R, 0 < \theta < 2\pi\right\}\text{,}\) then \(T\) fails to be injective on a portion of \(\partial U\) and \(T(U)\) is not quite the open disc \(D_R=\left\{(x,y):x^2+y^2 < R^2\right\}.\)
However, for any \(\epsilon >0\) small, our Theorem is applicable on
\begin{equation*} U_{\epsilon}= \left\{(r,\theta): \epsilon < r< R-\epsilon, \epsilon < \theta < 2\pi -\epsilon\right\}. \end{equation*}
Thus for any function \(f(x,y)\) which is continuous over \(\bar D_R\text{,}\)
\begin{equation*} \int_{T(U_{\epsilon})} f(x,y)dxdy = \int_{U_{\epsilon}} f( r \cos \theta, r\sin \theta) rdrd\theta. \end{equation*}
Then using
\begin{equation*} \int_{D_R} f(x,y)dxdy = \lim_{\epsilon \to 0} \int_{T(U_{\epsilon})} f(x,y)dxdy, \end{equation*}
and
\begin{align*} \amp \int_{U} f( r \cos \theta, r\sin \theta) rdrd\theta\\ =\amp \lim_{\epsilon \to 0} \int_{U_{\epsilon}} f( r \cos \theta, r\sin \theta) rdrd\theta, \end{align*}
we obtain
\begin{equation*} \int_{D_R} f(x,y)dxdy = \int_{U} f( r \cos \theta, r\sin \theta) rdrd\theta. \end{equation*}
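For reference, the Jacobian stated above can be computed directly, and the limiting argument can be checked by hand in the simplest case \(f=1\) (a choice made here only for illustration):
\begin{equation*} DT(r,\theta)=\begin{bmatrix} \cos\theta \amp -r\sin\theta \\ \sin\theta \amp r\cos\theta \end{bmatrix}, \quad \det DT(r,\theta)=r\cos^{2}\theta+r\sin^{2}\theta=r. \end{equation*}
With \(f=1\text{,}\)
\begin{equation*} \int_{U_{\epsilon}} r\, drd\theta=(2\pi-2\epsilon)\cdot\frac{(R-\epsilon)^{2}-\epsilon^{2}}{2}=(\pi-\epsilon)(R^{2}-2R\epsilon)\to \pi R^{2} \text{ as $\epsilon\to 0$,} \end{equation*}
which is the area of \(D_{R}\text{,}\) as expected.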
Just as in the one-variable case, when \(f\) is not necessarily continuous (or bounded), one can define an improper integral. An examination of the limiting argument above shows that if \(f(x,y)\) is known to be continuous away from the origin, and for some \(C>0\) and \(1>\delta >0\text{,}\) we have, with \(r=\sqrt{x^{2}+y^{2}}\text{,}\)
\begin{equation*} |f(x,y)| \le C r^{-1-\delta}, \end{equation*}
then the improper integral \(\int_{D_R} f(x,y)\, dxdy\) is well defined and the change of variables formula above is still valid.
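A worked instance of this criterion, under the illustrative assumption \(f(x,y)=(x^{2}+y^{2})^{-(1+\delta)/2}=r^{-1-\delta}\) with \(1>\delta>0\text{:}\) the factor \(r\) from the Jacobian cancels one power of \(r\) in the singularity, so
\begin{equation*} \int_{D_R} f(x,y)\, dxdy=\int_{0}^{2\pi}\int_{0}^{R} r^{-1-\delta}\, r\, drd\theta=2\pi\int_{0}^{R} r^{-\delta}\, dr=\frac{2\pi R^{1-\delta}}{1-\delta}. \end{equation*}
The value is finite precisely because \(\delta \lt 1\text{;}\) for \(\delta \ge 1\) the integral \(\int_{0}^{R} r^{-\delta}\, dr\) diverges, which is why the bound on \(|f|\) is required with an exponent strictly larger than \(-2\text{.}\)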