
Section 1.1 A Brief Summary of Riemann-Stieltjes Integral

There is a need to extend the usual Riemann integral to other contexts. Let us first briefly review the main ingredients in defining the usual Riemann integral of a bounded real-valued function \(f\) on an interval \([a, b]\text{.}\)
  1. Partition the interval \([a, b]\) into a union of finitely many non-overlapping subintervals.
  2. On each subinterval, find the infimum and supremum of \(f\text{.}\) Alternatively, one may choose to evaluate \(f\) at some point in each subinterval.
  3. Multiply the infimum, respectively the supremum, (or the function value at the chosen point) by the length of each subinterval, and add them up to form the lower sum, respectively the upper sum, (or Riemann sum) of \(f\) corresponding to the partition.
  4. Take the supremum of all lower sums, and respectively the infimum of all upper sums, to define the lower integral, and respectively the upper integral, of \(f\) on \([a, b]\) (or take the limit of the Riemann sums as the partition size goes to zero, to examine their convergence).
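The four steps above can be sketched numerically. The following is a minimal illustration, not part of the original text: the function name `riemann_sums_increasing` and the test function `square` are hypothetical, and the code assumes \(f\) is increasing so that the infimum and supremum on each subinterval are attained at the endpoints.

```python
def riemann_sums_increasing(f, a, b, n):
    """Lower and upper sums of an increasing function f on [a, b]
    over the uniform partition into n subintervals.

    Assumption: f is increasing, so on each subinterval the infimum is
    the value at the left endpoint and the supremum is the value at the
    right endpoint.
    """
    width = (b - a) / n                                   # step 1: equal subintervals
    points = [a + i * width for i in range(n + 1)]
    # steps 2 and 3: multiply inf (resp. sup) by the subinterval length and add
    lower = sum(f(points[i]) * width for i in range(n))
    upper = sum(f(points[i + 1]) * width for i in range(n))
    return lower, upper


def square(x):
    return x * x


# step 4 (informally): refining the partition squeezes both sums together
for n in (10, 100, 1000):
    lower, upper = riemann_sums_increasing(square, 0.0, 1.0, n)
    print(n, lower, upper, upper - lower)
```

For this example the gap between the upper and lower sums is \((f(1)-f(0))/n = 1/n\), so both sums approach the common value \(1/3\) as \(n\) grows.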

Observation 1.1.1.

This procedure necessarily requires the function \(f\) to be bounded and the interval \([a, b]\) to be compact. This is because the procedure starts with a finite sum, which requires each term to be finite. When either assumption fails, the lower sum and upper sum (or Riemann sum) of \(f\) may not be finite for a generic choice in steps 1 and 2.
To extend the usual Riemann integral, still for a bounded function defined on \([a, b]\text{,}\) we can modify step 3 above by replacing the length of each subinterval by the increment of a given monotone increasing function \(\alpha\) on \([a, b]\) over the subinterval. This leads to the Riemann-Stieltjes integral with respect to \(d\alpha\text{.}\) There are several contexts in which this extension is needed, one of which is in probability theory, and another in functional analysis.
In the context of probability, if a random variable has a probability density function \(p(x)\text{,}\) then the probability that the variable falls into \([a, b]\) is given by \(\int_{a}^{b} p(x)\, dx\text{.}\) Such a random variable cannot take any single value with positive probability. A general random variable \(X\) has a cumulative distribution function \(F(x) :=\cP(X\le x)\text{.}\) It is a non-decreasing, right-continuous function of \(x\) (we will use the term increasing function interchangeably with non-decreasing function). The probability that the variable falls into \((a, b]\) is given by \(\cP((a,b])=F(b)-F(a)\text{,}\) and the probability that the variable falls into \(\{a\}\) is \(\cP(\{a\})=F(a)-F(a-0)\text{,}\) where \(F(a-0)\) is the left-hand limit of \(F\) at \(a\text{.}\) The Riemann-Stieltjes integral with respect to such an increasing function is a step toward describing the probabilities associated with a general random variable in terms of a general integral. For example, we can use finite additivity to define
\begin{equation*} \cP([a, b])=\cP((a,b])+\cP(\{a\})=F(b)-F(a-0)\text{.} \end{equation*}
Since \((a,b)=\cup_{n=1}^{\infty} (a,b-\frac1n]\text{,}\) we use a limit process to define
\begin{equation*} \cP((a,b))=\lim_{n\to\infty} \cP((a,b-\frac1n])=F(b-0)-F(a)\text{.} \end{equation*}
This allows us to define \(\cP(I)\) for any interval \(I\text{.}\) The key property of such a probability measure \(\cP\) is that it is countably additive, meaning that if \(\{I_{i}\}\) is a countable collection of disjoint intervals, then
\begin{gather*} \cP(\cup_i I_i) =\sum_i \cP(I_i)\text{.} \end{gather*}
We will not pursue the verification of this property here, but it is relatively easy to verify it for a finite collection of disjoint intervals. Finite additivity or countable additivity defined on a suitable family of subsets of the domain of integration is the main basis for us to extend Riemann’s integral to more general contexts.
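As a small illustration of the interval formulas above, consider a hypothetical random variable (not from the text) that equals \(0\) with probability \(1/2\) and is otherwise uniform on \([0, 1]\text{.}\) Its distribution function has a jump at \(0\text{.}\) The sketch below hard-codes this \(F\) and its left-hand limit; all function names are illustrative assumptions.

```python
def cdf(x):
    """F(x) = P(X <= x) for the hypothetical mixed variable:
    X = 0 with probability 1/2, otherwise uniform on [0, 1]."""
    if x < 0:
        return 0.0
    if x <= 1:
        return 0.5 + 0.5 * x
    return 1.0


def cdf_left(x):
    """Left-hand limit F(x - 0), written out by hand for this particular F.
    It differs from F only at the jump point x = 0."""
    if x <= 0:
        return 0.0
    if x <= 1:
        return 0.5 + 0.5 * x
    return 1.0


def prob_half_open(a, b):
    return cdf(b) - cdf(a)            # P((a, b]) = F(b) - F(a)


def prob_point(a):
    return cdf(a) - cdf_left(a)       # P({a}) = F(a) - F(a - 0)


def prob_closed(a, b):
    return cdf(b) - cdf_left(a)       # P([a, b]) = F(b) - F(a - 0)


def prob_open(a, b):
    return cdf_left(b) - cdf(a)       # P((a, b)) = F(b - 0) - F(a)


print(prob_point(0.0), prob_closed(0.0, 0.5), prob_open(0.0, 0.5))
```

In this example \(\cP([0, 0.5])\) exceeds \(\cP((0, 0.5])\) by exactly the mass of the atom at \(0\text{,}\) which is the finite additivity used to define \(\cP([a, b])\text{.}\)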
In the context of functional analysis, there is a need to consider possible convergence of a sequence of linear functionals on \(C[a, b]\text{,}\) given in the form of \(g\in C[a, b]\mapsto \int_{a}^{b} g(x)f_{k}(x)\, dx\text{,}\) where \(\{f_{k}\}\) is a sequence of continuous or Riemann integrable functions on \([a, b]\) with certain bounds, say, \(\int_{a}^{b} |f_{k}(x)|\, dx\le M\) for all \(k\text{.}\) If the limit \(\lim_{k\to \infty} \int_{a}^{b} g(x)f_{k}(x)\, dx\) exists for every \(g\in C[a, b]\text{,}\) then it could be given in terms of the Riemann-Stieltjes integral with respect to some increasing function \(\alpha\) in the sense that \(\int_{a}^{b} g(x)f_{k}(x)\, dx\to \int_{a}^{b} g\, d\alpha\) (to be defined below).

Subsection 1.1.1 Summary of Riemann-Stieltjes Integral

In the following we briefly summarize the main definitions and properties of the Riemann-Stieltjes integral.
Given a monotone increasing function \(\alpha\) on \([a, b]\) (this will be a standing assumption throughout this chapter), we can mimic the steps in defining the usual Riemann integral of a given bounded real-valued function \(f\) on \([a, b]\) to define the lower sum
\begin{equation*} L(f, \cP, d \alpha) :=\sum_{i=1}^{k} m_{I_{i}} (f) d\alpha(I_{i}) \end{equation*}
and upper sum
\begin{equation*} U(f, \cP, d \alpha) :=\sum_{i=1}^{k} M_{I_{i}} (f) d\alpha (I_{i}) \end{equation*}
of \(f\) with respect to a partition \(\cP=\{a=x_{0} < x_{1} < \cdots < x_{k}=b\}\) of \([a, b]\text{,}\) with \(I_{i}=(x_{i-1}, x_{i}]\text{,}\) \(d\alpha (I_{i})=\alpha(x_{i})-\alpha(x_{i-1})\text{,}\) \(m_{I_{i}} (f)=\inf_{I_{i}}f\text{,}\) and \(M_{I_{i}} (f) =\sup_{I_{i}}f\text{.}\)
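The lower and upper sums just defined can be computed directly for simple cases. The sketch below is illustrative only: the name `rs_sums_increasing` and the choices \(f(x)=x\text{,}\) \(\alpha(x)=x^{2}\) are assumptions made for the example, and the code relies on \(f\) being continuous and increasing so that \(m_{I_{i}}(f)=f(x_{i-1})\) and \(M_{I_{i}}(f)=f(x_{i})\text{.}\)

```python
def rs_sums_increasing(f, alpha, points):
    """Lower and upper Riemann-Stieltjes sums L(f, P, d alpha), U(f, P, d alpha).

    Assumption: f is continuous and increasing, so on I_i = (x_{i-1}, x_i]
    the infimum of f is f(x_{i-1}) and the supremum is f(x_i).
    """
    lower = 0.0
    upper = 0.0
    for left, right in zip(points, points[1:]):
        d_alpha = alpha(right) - alpha(left)   # d alpha(I_i)
        lower += f(left) * d_alpha             # m_{I_i}(f) * d alpha(I_i)
        upper += f(right) * d_alpha            # M_{I_i}(f) * d alpha(I_i)
    return lower, upper


def identity(x):
    return x


def alpha_square(x):
    return x * x


n = 100
partition = [i / n for i in range(n + 1)]      # a = x_0 < x_1 < ... < x_n = b
lower, upper = rs_sums_increasing(identity, alpha_square, partition)
print(lower, upper, upper - lower)
```

Here \(U-L=\frac{1}{n}(\alpha(1)-\alpha(0))=\frac1n\text{,}\) and both sums bracket the value \(\int_{0}^{1} x\cdot 2x\, dx = 2/3\text{.}\)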

Remark 1.1.2.

In many treatments at this level, such as Rudin’s, the partition is defined as a finite set of points in the interval of integration, no distinction is made between closed and open (or half-open) subintervals, and finite additivity is used only implicitly rather than formulated explicitly. Also note that this definition allows \((a, b)\) to be a non-compact interval as long as \(d\alpha (I_{i})\) is finite for each subinterval \(I_{i}\text{.}\)

Definition 1.1.3.

Suppose that \(f\) is a bounded real-valued function defined on the interval \([a, b]\text{.}\) Then its upper integral, and respectively lower integral, on \([a, b]\) with respect to \(\alpha\) is defined as \(\inf_{\cP}U(f, \cP, d\alpha)\text{,}\) and respectively \(\sup_{\cP}L(f, \cP, d\alpha)\text{,}\) where \(\cP\) runs over all partitions of \([a, b]\text{.}\)
The upper integral is denoted as \(\upint_{a}^{b}\,f\, d\alpha\text{,}\) while the lower integral is denoted as \(\lowint_{\;a}^{\;b}\,f\, d\alpha\text{.}\)
When \(\alpha(x)=x\text{,}\) we write \(U(f, \cP)\) for \(U(f, \cP, dx)\text{,}\) and \(L(f, \cP)\) for \(L(f, \cP, dx)\text{.}\)

Exercise 1.1.4.

Let \(\cC\) denote the standard ternary Cantor set on \([0, 1]\) and \(\chi_{\cC}\) denote its characteristic function, which takes value \(1\) on \(\cC\) and \(0\) elsewhere. Let \(\cP\) be a partition of \([0, 1]\) into intervals of equal length \(3^{-k}\) for some \(k\in \bbN\text{.}\) Find \(U(\chi_{\cC}, \cP)\) and \(L(\chi_{\cC}, \cP)\text{.}\) Is there a positive lower bound of \(U(\chi_{\cC}, \cP)\) independent of \(\cP\text{?}\)

Proof.

Let \(\cP_{1}, \cP_{2}\) be two arbitrary partitions of \([a, b]\text{,}\) and \(\cP^{*}\) be a refinement of both \(\cP_{1}\) and \(\cP_{2}\text{.}\) Then
\begin{equation*} L(f, \cP_{1},d\alpha)\le L(f, \cP^{*}, d \alpha)\le U(f, \cP^{*}, d \alpha)\le U(f, \cP_{2}, d \alpha). \end{equation*}
As a result,
\begin{equation*} L(f, \cP_{1}, d \alpha)\le \upint_a^b \,f\, d \alpha =\inf_{\cP_{2}} U(f, \cP_{2}, d \alpha), \end{equation*}
and
\begin{equation*} \lowint_{\;a}^{\;b} \,f\, d \alpha =\sup_{\cP_{1}} L(f, \cP_{1}, d \alpha)\le \upint_a^b \,f\, d \alpha. \end{equation*}

Definition 1.1.6.

A bounded real-valued function \(f\) defined on an interval \([a, b]\) is called Riemann-Stieltjes integrable with respect to \(d\alpha\text{,}\) if
\begin{equation*} \lowint_{\;a}^{\;b}\,f\, d\alpha = \upint_{\;a}^{\;b}\,f\, d\alpha. \end{equation*}
In such a case we write \(f\in \cR(\alpha)\) on \([a, b]\text{,}\) and use \(\int_{a}^{b} \,f\, d\alpha\) for \(\lowint_{\;a}^{\;b}\,f\, d\alpha \text{.}\)

Remark 1.1.8.

A useful observation is that one often verifies criterion (1.1.1) by separating the intervals \(I_i\) into two groups: those intervals where \(M_{I_{i}} (f) -m_{I_{i}} (f)\) is smaller than a portion of \(\epsilon\) and those where it is larger, and shows that the sum over each group is less than a portion of \(\epsilon\text{.}\)

Exercise 1.1.9.

Define
\begin{equation*} H(x)=\begin{cases} 1 \amp x \ge 0 \\ 0 \amp x < 0 \end{cases} \text{ and } G(x) =\begin{cases} 1 \amp x > 0 \\ 0 \amp x \le 0 \end{cases} \end{equation*}
Determine \(\lowint_{\;-1}^{\;1} H(x)\, dH(x)\text{,}\) \(\upint_{-1}^{1} H(x)\, dH(x)\text{,}\) \(\lowint_{\;-1}^{\;1} G(x)\, dH(x)\text{,}\) \(\upint_{-1}^{1} G(x)\, dH(x)\text{,}\) and find out if either \(H\) or \(G\) is in \(\cR(dH)\) over \([-1, 1]\text{.}\)

Proof.

Note that in order to make
\begin{equation*} U(f, \cP, d \alpha) -L(f, \cP, d \alpha) =\sum_{i=1}^{k}\left( M_{I_{i}} (f) -m_{I_{i}} (f)\right) d\alpha (I_{i}) < \epsilon, \end{equation*}
to establish (1.1.1), we can use the uniform continuity of \(f\) on \([a, b]\) to find a partition \(\cP\) of \([a, b]\) such that each \(M_{I_{i}} (f) -m_{I_{i}} (f) < \epsilon/(\alpha(b)-\alpha(a))\text{.}\) It then follows that \(U(f, \cP, d \alpha) -L(f, \cP, d \alpha) < \epsilon\text{.}\)

Proof.

Here we make use of the uniform continuity of \(\alpha\) on \([a, b]\) to find a partition \(\cP\) of \([a, b]\) such that each \(\alpha(x_{i})-\alpha(x_{i-1}) < \epsilon/(f(b)-f(a))\text{.}\) Noting that \(M_{I_{i}} (f) -m_{I_{i}} (f)=f(x_{i})-f(x_{i-1}+0)\) and \(\sum_{i=1}^{k} (f(x_{i})-f(x_{i-1}+0))\le f(b)-f(a)\text{,}\) it then follows that \(U(f, \cP, d \alpha) -L(f, \cP, d \alpha) < \epsilon\text{.}\)

Proof.

Let \(M > 0\) be such that \(|f(x)| \le M\) for all \(x\in [a, b]\text{,}\) and let \(p_{1} < p_{2} < \cdots < p_{k}\) be all the points of discontinuity of \(f\) in \([a, b]\) (here we assume \(p_{1} > a\) and \(p_{k} < b\text{;}\) the proof easily adapts to the remaining cases). Using the continuity of \(\alpha\) at these points, we can find \(l_{i} < p_{i} < r_{i}\) such that \(r_{i} < l_{i+1}\) and
\begin{equation*} \sum_{i=1}^{k}\left(M_{[l_{i}, r_{i}]}(f)- m_{[l_{i}, r_{i}]}(f)\right)(\alpha(r_{i})-\alpha(l_{i})) < \epsilon/2. \end{equation*}
Using the uniform continuity of \(f\) on \([a, b]\setminus \cup_{i=1}^{k} (l_{i}, r_{i})\text{,}\) we can find a partition \(\cP\) on this finite union of closed intervals such that
\begin{equation*} \sum_{j} \left(M_{j}(f)-m_{j}(f)\right) (\alpha(x_{j})-\alpha(x_{j-1})) < \epsilon/2. \end{equation*}
Adjoining \(\cP\) with \(\cup_{i=1}^{k} [l_{i}, r_{i}]\) forms a partition of \([a, b]\) for which (1.1.1) holds.

Proof.

We only outline the main ingredients. Let \(M=\sup_{[a, b]}|f(x)|\text{.}\) For any given \(\epsilon > 0\text{,}\) first use \(\alpha' \in \cR\) on \([a, b]\) and (1.1.1) to find a partition \(\cP=\{a=x_{0} < x_{1} < \cdots < x_{k}=b\}\) such that
\begin{equation} U(\alpha', \cP, dx) - L(\alpha', \cP, dx) =\sum_{i=1}^{k}\left( M_{I_{i}} (\alpha') -m_{I_{i}} (\alpha')\right) (x_{i}-x_{i-1}) < \epsilon.\tag{1.1.3} \end{equation}
Use this to prove that for any choice \(s_{i}\in [x_{i-1}, x_{i}]\)
\begin{equation} \left| \sum_{i=1}^{k} f(s_{i}) (\alpha (x_{i})-\alpha (x_{i-1})) - \sum_{i=1}^{k} f(s_{i}) \alpha'(s_{i})(x_{i}-x_{i-1})\right| \le M\epsilon.\tag{1.1.4} \end{equation}
It follows from
\begin{align*} \sum_{i=1}^{k} f(s_{i}) (\alpha (x_{i})-\alpha (x_{i-1})) \amp \le \sum_{i=1}^{k} f(s_{i}) \alpha'(s_{i})(x_{i}-x_{i-1}) + M\epsilon\\ \amp \le U(f \alpha', \cP, dx) +M\epsilon \end{align*}
that
\begin{equation*} U(f, \cP, d\alpha) \le U(f \alpha', \cP, dx) +M\epsilon. \end{equation*}
Reversing the roles of \(U(f, \cP, d\alpha)\) and \(U(f \alpha', \cP, dx)\) leads to
\begin{equation} \left| U(f, \cP, d\alpha) - U(f \alpha', \cP, dx) \right| \le M\epsilon.\tag{1.1.5} \end{equation}
By definition there exist partitions \(\cP_{1}\) and \(\cP_{2}\) such that
\begin{equation} \upint_{a}^{b} f \, d\alpha \le U(f, \cP_{1}, d\alpha) < \upint_{a}^{b} f \, d\alpha + \epsilon\tag{1.1.6} \end{equation}
and
\begin{equation} \upint_{a}^{b} f \alpha' \, dx \le U(f \alpha', \cP_{2}, dx) < \upint_{a}^{b} f \alpha' \, dx + \epsilon.\tag{1.1.7} \end{equation}
Let \(\cP^{*}\) be a common refinement of \(\cP, \cP_{1}, \cP_{2}\text{;}\) then (1.1.3)--(1.1.7) continue to hold with \(\cP^{*}\) replacing \(\cP, \cP_{1}, \cP_{2}\text{,}\) respectively. It now follows that
\begin{align*} \upint_{a}^{b} f \alpha' \, dx \amp \le U(f \alpha', \cP^{*}, dx)\le U(f, \cP^{*}, d\alpha)+M\epsilon\\ \amp \le \upint_{a}^{b} f\, d\alpha + (M+1)\epsilon. \end{align*}
Reversing the roles of \(\upint_{a}^{b} f \alpha' \, dx\) and \(\upint_{a}^{b} f\, d\alpha \) would lead to
\begin{equation*} \upint_{a}^{b} f\, d\alpha \le \upint_{a}^{b} f \alpha' \, dx +(M+1)\epsilon. \end{equation*}
Since \(\epsilon > 0\) is arbitrary, we conclude that
\begin{equation} \upint_{a}^{b} f\, d\alpha = \upint_{a}^{b} f \alpha' \, dx.\tag{1.1.8} \end{equation}
Similarly,
\begin{equation} \lowint_{\;a}^{\;b} f\, d\alpha = \lowint_{\;a}^{\;b} f \alpha' \, dx.\tag{1.1.9} \end{equation}
Thus \(\upint_{a}^{b} f\, d\alpha=\lowint_{\;a}^{\;b} f\, d\alpha\) iff \(\upint_{a}^{b} f \alpha' \, dx= \lowint_{\;a}^{\;b} f \alpha' \, dx\text{.}\)
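The key estimate (1.1.4) in the proof above can be checked numerically in a simple case. The following sketch is only an illustration under assumed choices that are not in the text: \(\alpha(x)=x^{2}\text{,}\) \(f(x)=\cos(3x)\text{,}\) a uniform partition of \([0, 1]\text{,}\) and tags \(s_{i}=x_{i}\text{.}\) Since \(\alpha'\) is increasing here, the oscillation sum in (1.1.3) is computed from endpoint values.

```python
import math


def alpha(x):
    return x * x


def alpha_prime(x):
    return 2.0 * x


def f(x):
    return math.cos(3.0 * x)


n = 200
points = [i / n for i in range(n + 1)]
pairs = list(zip(points, points[1:]))          # consecutive (x_{i-1}, x_i)
M = 1.0                                        # |cos| <= 1 on [0, 1]

# U(alpha', P) - L(alpha', P); alpha' is increasing, so sup - inf on each
# subinterval is the difference of its endpoint values.
eps = sum((alpha_prime(r) - alpha_prime(l)) * (r - l) for l, r in pairs)

# The two sums compared in (1.1.4), with tags s_i = x_i (right endpoints).
stieltjes_sum = sum(f(r) * (alpha(r) - alpha(l)) for l, r in pairs)
density_sum = sum(f(r) * alpha_prime(r) * (r - l) for l, r in pairs)

gap = abs(stieltjes_sum - density_sum)
print(eps, gap, gap <= M * eps)
```

In this example the oscillation sum equals \(2/n\text{,}\) and the gap between the two sums stays below \(M\) times that quantity, as the estimate predicts.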

Remark 1.1.16.

If \(\alpha\) has a finite number of jump discontinuities, then the right-hand sides of (1.1.2) and (1.1.10) need to account for contributions from these points. For example, if \(\alpha\) has a single jump discontinuity at \(c\in (a, b)\text{,}\) then
\begin{equation*} \int_{a}^{b} f\, d\alpha = \int_{a}^{c} f \alpha' \, dx + \int_{c}^{b} f \alpha' \, dx + f(c)(\alpha(c+0)-\alpha(c-0)). \end{equation*}
The assumption \(\alpha' \in \cR\) on \([a, b]\) is needed. Using a construction similar to that of the Cantor set, Volterra constructed a function (not monotone, though) that is differentiable everywhere in \([0, 1]\) but whose derivative is not Riemann integrable. Cantor’s function is monotone increasing on \([0, 1]\) and constant on each of the removed middle-third intervals, and therefore has derivative \(0\) at every point of those intervals. Since the Cantor set has measure \(0\text{,}\) Cantor’s function has derivative \(0\) almost everywhere in \([0, 1]\text{.}\) If we can define an integral for such a derivative, its integral over any subinterval of \([0, 1]\) must be \(0\text{,}\) so the above equality cannot hold in such a case.
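The single-jump formula in this remark can be illustrated numerically. The sketch below uses assumed data that are not in the text: \(f(x)=x\) and \(\alpha(x)=x+H(x-\tfrac12)\) on \([0, 1]\text{,}\) where the unit jump sits at \(c=\tfrac12\text{.}\) The formula then predicts \(\int_{0}^{1} f\, d\alpha = \tfrac18+\tfrac38+f(\tfrac12)\cdot 1 = 1\text{.}\)

```python
def alpha(x):
    """x plus a unit jump at c = 0.5; right-continuous by construction."""
    return x + (1.0 if x >= 0.5 else 0.0)


def f(x):
    return x


def rs_tagged_sum(f, alpha, n):
    """Riemann-Stieltjes sum over the uniform partition of [0, 1] into n
    subintervals, tagged at right endpoints."""
    points = [i / n for i in range(n + 1)]
    return sum(f(r) * (alpha(r) - alpha(l)) for l, r in zip(points, points[1:]))


# Right-hand side of the jump formula: the two density integrals of
# f * alpha' = x, plus f(c) times the jump of alpha at c.
predicted = 0.125 + 0.375 + f(0.5) * 1.0

for n in (10, 100, 1000):
    print(n, rs_tagged_sum(f, alpha, n), predicted)
```

The sums approach the predicted value only because the jump term \(f(c)(\alpha(c+0)-\alpha(c-0))\) is included; dropping it would leave the limit at \(\tfrac12\text{.}\)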

Remark 1.1.17.

For any \(a < c < b\text{,}\) \(\int_{a}^{c}1\, d \alpha=\alpha(c)-\alpha(a)\) and \(\lim_{k\to\infty} \int_{a}^{c+ \frac 1k}1\, d \alpha=\alpha(c+0)-\alpha(a)\text{.}\) Since \([a, c]=\cap_{k}[a, c+ \frac 1k]\text{,}\) or, thinking of \(\int_{a}^{c}1\, d \alpha\) as \(\int \chi_{[a, c]} (x)\, d\alpha\text{,}\) since \(\lim_{k\to\infty} \chi_{[a, c+\frac 1k]} (x)=\chi_{[a, c]} (x)\text{,}\) one would expect \(\lim_{k\to\infty} \int_{a}^{c+ \frac 1k}1\, d \alpha= \int_{a}^{c}1\, d \alpha\text{.}\) This would require \(\alpha(c+0)=\alpha(c)\text{.}\) For this reason, one often chooses to work with an \(\alpha\) that is right-continuous.
Since \(\lim_{k\to\infty} \int_{a}^{c- \frac 1k}1\, d \alpha=\alpha(c-0)-\alpha(a)\) and \([a, c)=\cup_{k}[a, c- \frac 1k]\text{,}\) it is reasonable to treat \(\lim_{k\to\infty} \int_{a}^{c- \frac 1k}1\, d \alpha\) as \(\int_{[a, c)} 1\, d\alpha\) and \(\lim_{k\to\infty} \int_{a}^{c+ \frac 1k}1\, d \alpha\) as \(\int_{[a, c]} 1\, d\alpha\text{;}\) namely, \(\int_{[a, c)} 1\, d\alpha\) and \(\int_{a}^{c} d\alpha =\int_{[a, c]} 1\, d\alpha\) carry different meanings. The notation of integration over a set is more precise than that of integration between a lower and an upper limit.

Exercise 1.1.18.

Show that the lower Riemann-Stieltjes integrals satisfy the super-additivity property: for any \(f, g\) bounded on \([a, b]\text{,}\)
\begin{equation*} \lowint_{\;a}^{\;b} (f+g)\, d\alpha \ge \lowint_{\;a}^{\;b} f\, d\alpha + \lowint_{\;a}^{\;b} g\, d\alpha \end{equation*}
and give an example where the inequality is strict.

Remark 1.1.19.

In the context of Lebesgue integration, the partition of the interval \([a, b]\) is done with more general measurable sets rather than subintervals of \([a, b]\text{.}\) If one defines lower sums with respect to such partitions and defines the lower integral as the supremum of the lower sums over all such partitions, then, for non-negative measurable functions, the lower integral so defined satisfies the additivity property. See (2.14) and (2.15) in Folland’s Real Analysis book (2nd edition). We use the Dirichlet function \(f(x)\text{,}\) which is \(0\) on rationals and \(1\) on irrationals, to illustrate the difference made by allowing more general partitions. If only partitions into subintervals are allowed, then each lower sum is \(0\text{,}\) since \(f(x)\) is \(0\) on rationals and so \(m_{I_i}(f)=0\) for each interval \(I_i\text{.}\) However, if we allow partitions into measurable sets, then for any partition that includes the irrationals in \([0, 1]\) as one of the partition sets, the lower sum would be at least \(1\text{,}\) as \(m_{[0,1]\setminus Q}(f)=1\text{,}\) where \(Q\) is the set of rationals in \([0, 1]\text{.}\) Thus the lower integral of \(f\) defined with respect to measurable set partitions is at least \(1\text{,}\) which is different from that defined with respect to interval partitions.