In an undergraduate analysis class, one of the first results that is generally proved after the definition of differentiability is given is the fact that differentiable functions are continuous. We can justifiably ask if the converse holds—are there examples of functions that are continuous but not differentiable? If such examples exist, how “bad” can they be?
Many examples of continuous functions that are not differentiable spring to mind immediately: the absolute value function is not differentiable at zero; a sawtooth wave is not differentiable anywhere that it changes direction; the Cantor function (I’ll hopefully talk more about this later) is an example of a continuous function that is not differentiable on an uncountable set, though it does remain differentiable “almost everywhere.”
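As a quick numerical illustration of the first example, the following minimal Python sketch computes one-sided difference quotients of the absolute value function at zero. The helper name `difference_quotient` and the step sizes are choices made for this illustration only.

```python
def difference_quotient(f, x0, h):
    """Return (f(x0 + h) - f(x0)) / h for a nonzero step h."""
    return (f(x0 + h) - f(x0)) / h

# Approach 0 from the right (h > 0) and from the left (h < 0).
for k in (1, 10, 20):
    h = 2.0 ** -k
    right = difference_quotient(abs, 0.0, h)
    left = difference_quotient(abs, 0.0, -h)
    print(f"h = 2^-{k}: right quotient {right}, left quotient {left}")
```

The right-hand quotients stay at 1 and the left-hand quotients stay at -1, so the two one-sided limits disagree and the derivative at zero cannot exist.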
The goal is to show that there exist functions that are continuous, but that are nowhere differentiable. In fact, what we actually show is that the collection of such functions is, in some sense, quite large and that the set of functions that are differentiable—even if only at a single point—is quite small. First, we need a definition and an important result.
A Little Theory
Recall that a metric space is a set of points and a way of measuring the distance between those points. A sequence of points in a metric space is said to be Cauchy if the distance between any two (not necessarily consecutive) points in the sequence gets small for points sufficiently deep into the sequence. A metric space is said to be complete if every Cauchy sequence converges to some point in the space. The real numbers are a complete metric space, but the rational numbers are not. (Why?)
Definition: Let \(X\) be a complete metric space, and let \(M \subseteq X\). Then \(M\) is said to be
- nowhere dense in \(X\) if the closure of \(M\) has empty interior (this is a topological notion. In very broad strokes, if \(A\) is a subset of \(X\), then a point \(x\) is in the closure of \(A\) if we can find points of \(A\) that are arbitrarily “close” to \(x\). \(A\) is said to have empty interior if all of the points in \(A\) are arbitrarily “close” to points that are not in \(A\). For instance, the closure of the interval \((0,1)\) in \(\mathbb{R}\) is the interval \([0,1]\), and any finite collection of points in \(\mathbb{R}\) is nowhere dense. Why?);
- meager in \(X\) if it is the union of countably many nowhere dense sets; and
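- residual in \(X\) if it is not meager. (In a nonempty complete metric space the complement of a meager set is never meager, so such complements are residual in this sense; the converse fails, since \([0,1]\) and its complement are both non-meager in \(\mathbb{R}\).)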
This definition provides a topological notion of what it means for a subset of a metric space to be “small.” Nowhere dense sets are tiny—almost insignificant—subsets, while residual sets are quite large (relative to the ambient space). It is also worth noting that meager sets were originally called “sets of the first category,” and residual sets were originally called “sets of the second category,” leading to the name of the following theorem:
Baire’s Category Theorem: If a metric space \(X\ne\emptyset\) is complete, then it is residual in itself.
Proof: Suppose for contradiction that \(X\) is meager in itself. Then we may write \[X = \bigcup_{k=1}^{\infty} M_k,\] where each \(M_k\) is nowhere dense. As \(M_1\) is nowhere dense in \(X\), its closure has empty interior, and therefore contains no nonempty open sets. But \(X\) does contain at least one nonempty open set—\(X\) itself. Hence \(\overline{M}_1\ne X\) and so, since \(\overline{M}_1\) is closed, its complement is both open and nonempty. Choose some \(p_1\in X\setminus \overline{M}_1\). Since \(X\setminus \overline{M}_1\) is open, there exists some \(\varepsilon_1 \in (0,\frac{1}{2})\) such that \(B(p_1,\varepsilon_1)\subseteq X\setminus \overline{M}_1\). Let \(B_1 := B(p_1,\varepsilon_1)\).
Now consider \(\overline{M}_2\). It also has empty interior, and so it contains no open balls. In particular, it does not contain \(B_1\), so \(B_1\setminus\overline{M}_2\) is nonempty; it is also open, being the intersection of the open sets \(B_1\) and \(X\setminus\overline{M}_2\). Let \(p_2 \in B_1\setminus\overline{M}_2\) and choose \(\varepsilon_2 < \frac{1}{2}\varepsilon_1\) such that \(B_2 := B(p_2,\varepsilon_2) \subseteq B_1\setminus\overline{M}_2\).
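Continue this process by induction. That is, for each \(k\in\mathbb{N}\), choose \(B_{k+1} := B(p_{k+1},\varepsilon_{k+1})\) to be an open ball of radius \(\varepsilon_{k+1} < \frac{1}{2}\varepsilon_k\) such that \(B_{k+1} \subseteq B_k \setminus \overline{M}_{k+1}\). Shrinking each radius if necessary, we may also assume that the closed ball \(\{x\in X : d(x,p_{k+1}) \le \varepsilon_{k+1}\}\) is contained in \(B_k\setminus\overline{M}_{k+1}\). By this construction we have \(\varepsilon_k < 2^{-k}\) for each \(k\). In particular we have from the triangle inequality that \(d(p_m,p_n) < 2\varepsilon_{k+1} < 2^{-k}\) for all \(m,n > k\), as \(p_m,p_n\in B_{k+1}\). Hence the sequence of points \((p_k)\) is Cauchy in \(X\). Since \(X\) is complete, there exists some \(p\in X\) such that \(p_k\to p\). For each \(k\), the points \(p_m\) with \(m > k\) all lie in \(B_{k+1}\), so their limit \(p\) lies in the closed ball of radius \(\varepsilon_{k+1}\) about \(p_{k+1}\), which is contained in \(B_k\). Since \(B_k\) is disjoint from \(\overline{M}_k\), this implies that \(p\not\in M_k\) for all \(k\). Hence we have found a point \(p\in X\) such that \(p\not\in \bigcup M_k = X\), which is a contradiction.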
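To make the nested-ball construction concrete, here is a minimal Python sketch for the special case \(X = \mathbb{R}\) in which each nowhere dense set \(M_k\) is a single rational point \(q_k\). The helper names, the enumeration of the rationals, and the particular rule for picking centers and radii (step half a radius away from \(q_k\), then shrink the radius by a factor of four) are assumptions made for illustration; the proof only needs that some admissible choice exists.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals in [0, 1], listed by increasing denominator."""
    seen, out = set(), []
    denominator = 1
    while len(out) < count:
        for numerator in range(denominator + 1):
            q = Fraction(numerator, denominator)
            if q not in seen:
                seen.add(q)
                out.append(q)
                if len(out) == count:
                    break
        denominator += 1
    return out

def nested_balls(points, center=Fraction(1, 2), radius=Fraction(1, 2)):
    """For each q in `points`, pick a ball inside the previous one whose closure misses q.

    Returns a list of (center, radius) pairs, one per point.
    """
    balls = []
    for q in points:
        # Move the center half a radius to the side away from q ...
        center = center - radius / 2 if q >= center else center + radius / 2
        # ... and shrink, so the closed ball stays inside the old open ball.
        radius = radius / 4
        balls.append((center, radius))
    return balls

points = rationals_in_unit_interval(50)
balls = nested_balls(points)
final_center, final_radius = balls[-1]
print(float(final_center), float(final_radius))
```

Every listed rational lies outside the final closed ball, so any point of that ball avoids all fifty of them. Letting the number of points grow reproduces the Cauchy sequence of centers from the proof.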
The Existence of Nowhere Differentiable Continuous Functions
We now get to the main result, which has appeared on qualifying exams a few times in the past:
Exercise: Use Baire’s category theorem to prove the existence of continuous, nowhere differentiable functions on the unit interval.
Solution: For each natural number \(n\), define the set \[ E_n := \{ f\in C[0,1] : \exists x_0\in[0,1] \text{ s.t. } |f(x)-f(x_0)| \le n|x-x_0| \ \forall x\in[0,1]\}. \] This is rather a lot of notation. Let’s try to unpack it just a bit: the derivative of \(f\) at \(x_0\) is defined by the limit \[ f'(x_0) = \lim_{x\to x_0} \frac{f(x)-f(x_0)}{x-x_0}. \] If this limit exists, then there is some \(\delta > 0\) such that the difference quotient is bounded by some number, say \(L_1\), whenever \(0 < |x-x_0| < \delta\). When \(|x-x_0| \ge \delta\), the boundedness of \(f\) (a continuous function on a compact interval) ensures that the difference quotient is bounded by \(L_2 := 2\max_{[0,1]}|f|/\delta\). Taking the larger of these two bounds, we have that if \(f\) is differentiable at some point \(x_0\), then \[ \frac{|f(x)-f(x_0)|}{|x-x_0|} \le \max\{L_1,L_2\} \quad\text{for all } x\ne x_0. \] Thus if a function \(f\) is differentiable at any point \(x_0\in[0,1]\), then \(f\) lives in \(E_n\) for every \(n \ge \max\{L_1,L_2\}\).
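The condition defining \(E_n\) can be probed numerically. The sketch below is a hypothetical helper (the name `satisfies_bound`, the grid size, and the tolerance are assumptions of this illustration) that tests the inequality \(|f(x)-f(x_0)| \le n|x-x_0|\) on a finite grid for one candidate \(x_0\). A finite grid can refute the inequality but cannot prove it for all \(x\), so a `True` result is only evidence.

```python
import math

def satisfies_bound(f, x0, n, samples=10_000, tol=1e-12):
    """Check |f(x) - f(x0)| <= n * |x - x0| at the grid points k / samples."""
    fx0 = f(x0)
    return all(
        abs(f(k / samples) - fx0) <= n * abs(k / samples - x0) + tol
        for k in range(samples + 1)
    )

# x^2 at x0 = 1/2: |x^2 - 1/4| = |x - 1/2| * |x + 1/2| <= 1.5 * |x - 1/2|.
print(satisfies_bound(lambda x: x * x, 0.5, 2))
# |x - 1/2| at its kink: the bound holds with n = 1 even though f'(1/2) does not exist.
print(satisfies_bound(lambda x: abs(x - 0.5), 0.5, 1))
# sqrt at x0 = 0: the quotient sqrt(x) / x blows up near 0, so n = 5 fails.
print(satisfies_bound(math.sqrt, 0.0, 5))
```

The second call is a reminder that \(E_n\) is larger than the set of functions differentiable somewhere: membership only asks for a one-point Lipschitz bound, which is all the argument needs.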
What we want to show is that each \(E_n\) is nowhere dense in the space of continuous functions \(C[0,1]\) (which is a normed vector space with respect to the uniform norm). Since every function that is differentiable at even a single point is contained in \(\bigcup_n E_n\), it would then follow that the set of such functions is contained in a meager subset of \(C[0,1]\). From Baire’s category theorem, we could then conclude that nowhere differentiable functions exist and, indeed, that there is a residual set of nowhere differentiable functions.
To show that each \(E_n\) is nowhere dense, we have to show that the closure of each \(E_n\) has empty interior or, equivalently, that given an arbitrary function \(f\) in \(\overline{E}_n\) and any \(\varepsilon > 0\), we can find a continuous function that is within \(\varepsilon\) of \(f\) in the uniform norm, but which is not contained in \(\overline{E}_n\).
So, what is \(\overline{E}_n\)? We claim that it is just \(E_n\). To see this, suppose that \(\{f_k\}\) is a Cauchy sequence in \(E_n\). First, note that \(C[0,1]\) is complete, hence there is some \(f\in C[0,1]\) such that \(f_k\to f\). Now, since each \(f_k\in E_n\), for each \(k\) there is some \(x_k\in[0,1]\) such that \[ |f_k(x)-f_k(x_k)| \le n|x-x_k| \quad\forall x\in[0,1]. \] The numbers \(x_k\) form a sequence in \([0,1]\). By the Bolzano-Weierstrass theorem, this sequence has a convergent subsequence, say \(x_{k_j} \to x_0\in[0,1]\). Since \(f_{k_j}\to f\) uniformly and \(f\) is continuous, \[ |f_{k_j}(x_{k_j})-f(x_0)| \le \|f_{k_j}-f\|_u + |f(x_{k_j})-f(x_0)| \to 0, \] so \(f_{k_j}(x_{k_j})\to f(x_0)\). Hence for every \(x\in[0,1]\) we have \[ |f(x)-f(x_0)| = \lim_{j\to\infty} |f_{k_j}(x)-f_{k_j}(x_{k_j})| \le \lim_{j\to\infty} n|x-x_{k_j}| = n|x-x_0|. \] Therefore \(f \in E_n\), and so Cauchy sequences in \(E_n\) converge in \(E_n\), which shows that \(E_n\) is closed.
Now, given a function \(f\in E_n\), how do we find a function \(g\) that is “close” to \(f\), but not in \(E_n\)? That is actually somewhat delicate, and is dealt with in the following lemma:
Lemma: Given \(f\in C[0,1]\), \(n\in\mathbb{N}\), and \(\varepsilon > 0\), there exist a number \(s \ge 2n\) and a piecewise linear function \(g\) with only finitely many linear pieces such that each linear piece has slope \(\pm s\) and \(\|g-f\|_{u} < \varepsilon\). (The slope may need to exceed \(2n\): a function whose pieces all have slope \(\pm 2n\) is Lipschitz with constant \(2n\), and uniform limits of such functions are again Lipschitz with constant \(2n\), so they cannot approximate, say, \(\sqrt{x}\) arbitrarily well.)
Proof: Since \(f\) is uniformly continuous, there exists a \(\delta > 0\) such that for any \(x,y\in[0,1]\), if \(|x-y|< \delta\), then \(|f(x)-f(y)|<\varepsilon/3\). Choose \(m\in\mathbb{N}\) such that \(m > 1/\delta\), set \(s := \max\{2n, m\varepsilon/3\}\), and choose \(N\in\mathbb{N}\) such that \(s/(mN) \le \varepsilon/3\). On the interval \([0,1/m]\), define \(g\) as follows. Write \(\Delta := f(1/m)-f(0)\), so that \(|\Delta| < \varepsilon/3 \le s/m\), and split \([0,1/m]\) into \(N\) subintervals of equal length. Starting from \(g(0) = f(0)\), on each of these subintervals let \(g\) first rise with slope \(s\) over a length of \(u := \frac{1}{2N}\left(\frac{1}{m}+\frac{\Delta}{s}\right)\), and then fall with slope \(-s\) over the remaining length \(d := \frac{1}{2N}\left(\frac{1}{m}-\frac{\Delta}{s}\right)\). Both lengths are positive, and each tooth changes the value of \(g\) by \(s(u-d) = \Delta/N\), so after \(N\) teeth we arrive at \(g(1/m) = f(0)+\Delta = f(1/m)\).
Continue this procedure for each interval of the form \([k/m,(k+1)/m]\) for \(k=1,2,\ldots,m-1\). That is, set \(g(k/m) = f(k/m)\) and construct a sawtooth function on the given interval with slope \(\pm s\), with \(\Delta\) replaced by \(f((k+1)/m)-f(k/m)\). We claim that \(g\) is a function of the type desired.
We first note that \(g\) is piecewise linear, with each piece having slope \(\pm s\); this is explicit in the construction. Moreover, there are only finitely many pieces: the unit interval was broken into \(m\) subintervals, and each subinterval contains \(2N\) linear pieces. (The existence of suitable \(m\) and \(N\) follows from the Archimedean principle.) Next, fix \(k\) and \(x\in[k/m,(k+1)/m]\). Each tooth rises by \(su \le s/(mN) \le \varepsilon/3\) above its starting value, and the starting and ending values of the teeth lie between \(f(k/m)\) and \(f((k+1)/m)\), so \[ \left|g(x) - f\left(\tfrac{k}{m}\right)\right| \le \left|f\left(\tfrac{k+1}{m}\right) - f\left(\tfrac{k}{m}\right)\right| + \frac{\varepsilon}{3} < \frac{2\varepsilon}{3}. \] Finally, it follows from the choice of \(\delta\) and \(m\), and the triangle inequality, that \[ |g(x) - f(x)| \le \left|g(x) - f\left(\tfrac{k}{m}\right)\right| + \left|f\left(\tfrac{k}{m}\right) - f(x)\right| < \frac{2\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon. \] Since \(|g-f|\) is continuous on \([0,1]\), it attains its maximum, and hence \(\|g-f\|_u < \varepsilon\).
Therefore \(g\) is exactly the kind of function that we want, and so the proof is complete.
With the lemma proved, we now have the following: if \(f\in E_n\), then for any \(\varepsilon > 0\), there exists a piecewise linear function \(g\) consisting of finitely many linear pieces, each having slope \(\pm s\) for some \(s \ge 2n\), such that \(\|g-f\|_u < \varepsilon\). But no such \(g\) is in \(E_n\): every \(x_0\in[0,1]\) lies on some linear piece of \(g\), and for any other point \(x\) on that piece we have \(|g(x)-g(x_0)| = s|x-x_0| > n|x-x_0|\). So every neighborhood of \(f\) contains functions that are not in \(E_n\). Thus the closed set \(E_n\) contains no nonempty open sets, and is therefore nowhere dense.
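The sawtooth idea behind the lemma can be tried out numerically. The following Python sketch is one possible construction, not necessarily the one intended above: it takes the number of subintervals `m` as an input (assumed large enough that \(f\) varies by less than \(\varepsilon/3\) over any interval of length \(1/m\)), picks a slope magnitude `s` that is at least \(2n\) and steep enough to reach every node \((k/m, f(k/m))\), and splits each subinterval into `N` teeth. The function names and the specific formulas for `s` and `N` are assumptions of this illustration.

```python
import bisect
import math

def sawtooth_approximant(f, n, eps, m):
    """Breakpoints (xs, ys) of a zigzag with slopes +-s through the nodes (k/m, f(k/m))."""
    h = 1.0 / m
    nodes = [k / m for k in range(m + 1)]
    vals = [f(x) for x in nodes]
    jumps = [vals[k + 1] - vals[k] for k in range(m)]
    # Slope magnitude: at least 2n, and twice as steep as the largest secant of f.
    s = max(2.0 * n, 2.0 * m * max(abs(j) for j in jumps))
    # Teeth per subinterval, chosen so that each tooth rises by at most eps / 3.
    N = max(1, math.ceil(3.0 * s * h / eps))
    xs, ys = [], []
    for k in range(m):
        up = (h + jumps[k] / s) / 2.0      # total length spent rising on this subinterval
        for j in range(N):
            x0 = nodes[k] + j * h / N      # start of tooth j
            y0 = vals[k] + j * jumps[k] / N
            xs += [x0, x0 + up / N]        # tooth start, then tooth peak
            ys += [y0, y0 + s * up / N]
    xs.append(1.0)
    ys.append(vals[m])
    return xs, ys, s

def evaluate(xs, ys, x):
    """Value at x of the piecewise linear function with breakpoints (xs, ys)."""
    i = min(max(bisect.bisect_right(xs, x) - 1, 0), len(xs) - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

xs, ys, s = sawtooth_approximant(math.sqrt, n=3, eps=0.1, m=1000)
worst = max(abs(evaluate(xs, ys, i / 20000) - math.sqrt(i / 20000)) for i in range(20001))
print(f"slope magnitude {s:.1f}, pieces {len(xs) - 1}, sampled sup distance {worst:.4f}")
```

For \(f(x)=\sqrt{x}\) the chosen slope is much larger than \(2n = 6\), which reflects the fact that a zigzag must be at least as steep as the secants of \(f\) in order to follow it.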
Thus far, we have proved that each \(E_n\) is closed and nowhere dense, so we are ready to apply Baire’s category theorem: since \(C[0,1]\) is residual in itself, it follows that \[ C[0,1] \setminus \bigcup_{n=1}^{\infty} E_n \ne \emptyset. \] But, as noted above, every function that is differentiable anywhere, even if only at a single point, is contained in the union. Therefore \(C[0,1]\) contains at least one nowhere differentiable function (in fact, a residual set of them).