Isolation from Complexity in Differential Equations
Let statement A be one of the following formulas: [Leb, pp.247-250, (9.5.7)-(9.5.10)]. More specifically, let Ai
= [Leb, (9.5.i)]. Let B = [Guo, p.70, (13)]. The proof of [B ⇒ A7]
is difficult because it uses the theory of differential equations and because
the operation of linear transformation on the solutions of a differential equation
is not intuitive. All the proofs of [B ⇒ Ai]
can follow the same complicated method. For example, [Guo, p.161, l.9, l.-1]
follows the pattern of [Guo, p.159, l.9-p.160, l.11]. However, we can restrict
the complexity to the proof of [B ⇒ A7]
alone and prove the other Ai's by combining the
derived formulas [Leb, p.249, l.16-l.17].
(The Mehler-Dirichlet integral representation of the associate Legendre
functions) A good start is half of the battle. If we work
on something simple, the idea of the argument is easily recognized. If we work
on something complicated, we will have more problems to solve and our argument
will become fragmented. It is wise to start with a simpler
basis which shortens the argument, produces the same effect, and highlights the
essential ideas. [Guo, p.263, (2), (3); p.264, (6)] generalize [Hob, p.25, (24), (25); p.24, (23)].
However, Guo's proofs are awkward because he extends an associate Legendre function to the complex domain at the beginning of his proofs. If he
could first establish the forms of the above formulas on the real line using Hobson's
elementary method, and in the end use the concept of analytic continuation to prove that these
formulas are valid in the extended domain, the argument would be much
clearer and simpler. For example, the problem Guo worries about in [Guo, p.263,
l.-7] would not arise.
Remark. Nonetheless Guo's approach has the advantage of helping us replace the
integral on a line segment with a more flexible contour integral. For example, [Guo,
p.263, (4)] = [Guo, p.265, (7)].
If we just follow other people's
footsteps and try to figure out the meaning of each reasoning step, then we can only understand a topic superficially. To gain
complete understanding, we must study the topic from various perspectives and
see the big picture by digesting the material thoroughly. Then we should manage
to adopt an advantageous viewpoint and find a simpler approach to the original
How we articulate the proof of a theorem with depth.
Example. [Bir, p.163, Theorem 11; Pon, pp. 179-180, Theorem 15].
Simplify the argument. Let us compare the proof of [Bir,
p.163, Theorem 11] with [Pon, pp.179-180, Theorem
15]. Pontryagin first proves that the solution is differentiable with respect to
the initial points. Then he proves that the partial derivatives of the solution
with respect to a component of the initial point satisfy the variational
equation. Birkhoff merges these two arguments into one.
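For orientation, the variational equation mentioned above can be written out as follows. The notation is an assumption made here for illustration and is not copied from [Bir] or [Pon]: the system is x' = f(t, x) with f continuously differentiable in x, and φ(t; t0, ξ) denotes its solution with initial value ξ at time t0.

```latex
\[
\frac{d}{dt}\,\frac{\partial \varphi}{\partial \xi_j}(t;t_0,\xi)
  = \frac{\partial f}{\partial x}\bigl(t,\varphi(t;t_0,\xi)\bigr)\,
    \frac{\partial \varphi}{\partial \xi_j}(t;t_0,\xi),
\qquad
\frac{\partial \varphi}{\partial \xi_j}(t_0;t_0,\xi) = e_j .
\]
```

Here e_j is the jth standard basis vector, so each partial derivative with respect to a component of the initial point solves a linear system along the given solution.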
For approximations to the solution, Birkhoff uses the simple Lipschitz
condition once [Bir, p.164, l.1]. In contrast, Pontryagin
uses the complicated Picard approximation twice [Pon, p.172, l.6; p.175, l.-3].
The use of
Taylor's Theorem [Bir, p.164, l.13] provides a more direct and straightforward
argument than the use of Hadamard's lemma [Pon, p.174, l.-7]
and does not cause the problem occurring in [Pon, p.176, l.6].
Apply the key idea in a flexible way. Unless necessary, do not
add mathematical structures or stray from the point. In the proof of [Arn1, p.285, Theorem],
Arnold discusses the variational equation along with a differential equation and treats
them as an inseparable pair. Whenever he discusses one equation, he discusses the other
as well. This makes it difficult for the argument to proceed. When he
discusses the variational equation, the presumed inseparability makes him suddenly have
the urge to discuss the linear system [Arn1, p.287, Lemma 7]. The discussion of the
linear system is not even related to the theorem. Thus, this unnecessary
digression only obscures the key idea.
Do not reverse the argument's natural order.
In order to derive the formula for ∂/∂t, we introduce
the quotient of differences given in [Pon, p.174, l.16]. Then we successively
derive [Pon, p.175, (18); p.176, (21); p.177, (24)]. The argument's order is
natural. In contrast, Birkhoff
discusses [Bir, p.163, (25)] first. Only at the end of his proof [Bir, p.164, l.-3]
can we understand why he wants to discuss [Bir, p.163, (25)].
Do not use confusing notations.
In [Bir, p.163, (25)], Birkhoff introduces the notation hj.
I believe many readers will mistake what is meant to be the index of a component of a vector (x1,
… , xn) (j=1,…,n)
for the index of a sequence (j=1,2,…). This is because he
several times. In fact, h
is the variable and j is just its parameter.
In [Arn1, p.286, l.-6],
Arnold uses x1 to represent y in [Pon, p.178,
(28)]. In [Arn1, p.287, l.9], Arnold tries to use x1
to represent x in
[Pon, p.178, (28)], but he makes a mistake in his formulation (perhaps the
translator translated it incorrectly; see E). Thus, Arnold uses the same notation
to represent two different things. In fact, Arnold's proof of [Arn1, p.285,
Theorem] is pretty much the same as Pontryagin's proof of [Pon, pp.179-180,
Theorem 15] except Arnold uses confusing notations. The only improvement Arnold
has made is in his proof of [Arn1, p.286, Lemma 4].
Do not say "repeating verbatim the reasoning of Sects. 3, 4, and 5" [Arn1, p.285, l.15-l.16].
This will not help readers grasp the essence of the argument. If they read the
three sections long ago, they will not even understand
the reasoning. Give the key points instead: we prove Proposition (B) by induction on m by means of the
variational equation [Pon, p.177, l.
Do not make mistakes. In order to satisfy dC/dt =AC [Arn1,
p.287, l.23], x1 [Arn1, p.287, l.9] should have
referred to the initial value instead of the solution. Based on the proof of [Bir,
p.156, Corollary], Arnold has obviously made a mistake.
Do not leave out essential points and calculations. The remark given in
[Bir, p.165, l.10-l.13] should have been added to the end of the proof of [Arn1,
p.287, Lemma 7]. Otherwise, the proof is incomplete. dx1/dt = 0 in [Arn1, p.287, l.12] should have been proved as
dx1/dt = (dB/dt)x + B(dx/dt)
= (dB/dt)x + B(dC/dt)Bx (because dC/dt = AC [Bir, p.156,
⇒ B(dC/dt)B = BACB = BA) =
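The chain of equalities above breaks off in the source. One way it might be completed is sketched below. The sketch rests on assumptions that are not stated in the surviving text: B is taken to be the inverse of C (so that CB and BC are the identity), x1 = Bx, and x solves dx/dt = Ax.

```latex
\begin{align*}
\frac{dx_1}{dt}
  &= \frac{dB}{dt}\,x + B\,\frac{dx}{dt}
   = \frac{dB}{dt}\,x + BAx
   = \frac{dB}{dt}\,x + B\,\frac{dC}{dt}\,B\,x
   && \text{(since } B\,\tfrac{dC}{dt}\,B = BACB = BA\text{)} \\
  &= \Bigl(\frac{dB}{dt}\,C + B\,\frac{dC}{dt}\Bigr) B\,x
   = \frac{d(BC)}{dt}\,B\,x
   = 0
   && \text{(since } BC \text{ is constant).}
\end{align*}
```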
(Smoothness of roots)
[Cod, p.176, l.-14] says that
the eigenvalues of A0 are smooth in t. The
proof of this statement is founded on changing the basis. It is easiest to assume
that A0(t) is analytic in t first. For any
fixed t on the overlap of two consecutive intervals, the chosen kth columns of
B0 differ by a scalar factor because the
dimension of the eigenspace corresponding to each eigenvalue is 1. By
expanding A0(t) in the Taylor series at the
fixed t instead of using continuity [Cod, p.176, l.-6-l.-5],
we see that on the overlap of the two consecutive intervals the chosen kth columns
of B0 differ by a scalar factor that is an analytic
function in t [Cod, p.177, l.1-l.2]. Once we prove the analytic case, we may
similarly derive the case of infinite differentiability.
Remark. The Weierstrass preparation theorem [Whi, p.16, Theorem 5I] says that
the coefficients of the Weierstrass polynomial are
analytic. The proof uses [Whi, p.321, Lemma 5B]. Thus, the theory of complex
variables is useful in proving the smoothness of the elementary symmetric functions of
zeros, but is useless in proving the smoothness of zeros.
However, if the polynomial w has the linear form det (A(x
λE), where A is an n×n
matrix, then the above algebraic method can be used to prove that its roots
~) are analytic in x ~.
The big picture may help us get out of the mess of entanglement once and for all.
Removing irrelevant facts from our consideration. Suppose we want to prove the equality given in [Wat1,
p.281, l.16]. If we let sm be the mth
partial sum of the hypergeometric series given in [Wat1, p. 281, l.5], then
there are three quantities that can approach
b and m. However, the
fact that z → ∞ is
irrelevant in this case. If we allow z → ∞
to enter our consideration, it only confuses us. Consequently, we should fix z = z0,
> |z0| and then
use [Ru1, p.135, Theorem 7.11].
Avoiding complexity by ignoring insignificant contributions.
In order to find branch points of a hypergeometric function, Watson asks readers to read [Wat1,
(see [Wat1, p.281, l.8] and Guo asks readers to read [Guo, p.151, l.
approaches are too difficult to understand. Branch points are singularities. The
only possible singularities of [Guo, p.135, (1)] are 1 and
∞. In order to prove
that t = 1 is a branch point, we consider [Guo, p.151, (6)]. If we ignore
insignificant contributions, the integral given in [Guo, p.151, (6)] can be
considered the integral of (1-t)^(γ-β-1),
i.e., (1-t)^(γ-β).
Then t = 1 is a branch point as long as
γ - β is not an integer.
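The heuristic above can be made explicit with a short computation. This is only a sketch: it assumes that γ - β is nonzero and keeps just the factor of the integrand that is singular at t = 1, treating every other factor as a constant.

```latex
\[
\int (1-t)^{\gamma-\beta-1}\,dt
  = -\frac{(1-t)^{\gamma-\beta}}{\gamma-\beta} + \text{const},
\qquad
(1-t)^{\gamma-\beta} \;\longmapsto\; e^{2\pi i(\gamma-\beta)}\,(1-t)^{\gamma-\beta}
\]
```

The second relation describes what happens when t makes one counterclockwise circuit around 1. The multiplier equals 1 exactly when γ - β is an integer, which is why the non-integer case gives a branch point.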
When we try to solve a problem, we go straight toward our goal by ignoring
Saks puts [Sak1, p.396, Theorem 12.8] before [Sak1, p.399, Theorem 12.11; p.400, Theorem 12.12; Lemma 12.13], because we do not need
to use the latter theorems in order to prove [Sak1, p.396, Theorem 12.8]. In
contrast, González puts [Gon1, p.490, Theorem 5.64] before [Gon1, p.491, Theorem
5.65]. The reader may be under the false impression that [Gon1, p.490, Theorem 5.64]
is an indispensable tool for proving [Gon1, p.491, Theorem 5.65].
In order to prove [Sak1, p.395, Theorem 12.7], we use [Sak1, p.357, l.
p.394, Theorem 12.5]. In contrast, González makes a fuss about the latter
results and creates a complicated side theorem [Gon1, p.369, Theorem 5.6].
Do not ruin the effect by adding something superfluous; do not cut one's toes to
fit shoes. Once upon a time there was a painter who painted a snake. Then he added feet to it. Thus, he ruined the picture.
In [Wat1, §9.62], Watson used [Wat1, §3.35, Example 2] to prove Riemann's first
lemma. If he had used [Ru1, p.61, Theorem 3.42] instead, the complicated passage
given in [Wat1, p.184, l.-7-p.185,
l.11] could have been eliminated. In this case, Watson only asked for trouble when he
used the generalized
theorem [Wat1, §3.35, Example 2] instead of [Ru1, p.61, Theorem 3.42]. This is because a
generalized theorem weakens the hypotheses and the
weakened hypotheses are usually more involved and difficult to verify than the
When we solve a problem, we should focus on its essence and use only the tools that are
indispensable to the solution. In order to construct the Lebesgue
measure, Rudin uses the Riesz representation theorem [Ru2, p.42, Theorem 2.14]. See [Ru2, p.53, l.-8].
The construction given in [Roy, chap.3, §3] shows that it is unnecessary to use
the concept of integration [Ru2, p.42, Theorem 2.14 (a)] or Urysohn's Lemma
[Ru2, p.43, l.5]. Note that the proof of Urysohn's Lemma [Ru2, p.40] uses the
axiom of mathematical induction, which we should avoid using if possible. The essence of the Riesz representation theorem
is [Roy, p.121, Theorem 8]. The key to proving the latter theorem is that an
absolutely continuous function is an integral [Roy, p.106, Theorem 13]. The goal
of mathematics is not to build a huge machine that may solve everything. Even if
we could build such a machine, it would still be a failure. This is because
nothing would be clear. For
a college textbook, it is better to focus on the essence of a subject rather
than prove complicated theorems in the beginning part of a book. 
In order to prove Lusin's Theorem [Ru2, p.56, Theorem 2.23], Rudin also uses the Riesz representation theorem [Ru2, p.56,
l.7] and Urysohn's Lemma [Ru2, p.56, l.-13].
In contrast, [Roy, p.72, Problem 31] does not use these complications. Instead,
it uses the following simple method to prove Lusin's theorem: Every
measurable function is nearly a simple function [Roy, p.70, Problem 23 b]; Every
simple function is nearly a step function [Roy, p.70, Problem 23 c]; Every step
function is nearly a continuous function [Roy, p.70, Problem 23 d]. In
order to prove the last statement, all we have to do is use a line segment to
connect each gap of the graph of the step function.
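The last step can be sketched in code. The following is a minimal illustration, not taken from [Roy]: the function names, the representation of a step function by its breakpoints and values, and the choice of ramp width delta are all assumptions made here. Each jump is bridged by a line segment over an interval of length 2*delta, chosen so that the total length of the bridged intervals stays below eps.

```python
from bisect import bisect_right

def ramp_knots(breaks, values, eps):
    """Knots of a continuous piecewise-linear function built from a step function.

    breaks: a = x0 < x1 < ... < xn = b (hypothetical input format)
    values: the n constants taken on the open intervals (x_{k-1}, x_k)
    The result agrees with the step function outside ramps of total length < eps.
    """
    n = len(values)
    if n != len(breaks) - 1:
        raise ValueError("need exactly one value per interval")
    min_gap = min(b - a for a, b in zip(breaks, breaks[1:]))
    # 2*delta*(n-1) < eps, and delta <= min_gap/4 keeps the ramps disjoint.
    delta = min(eps / (2 * n), min_gap / 4)
    xs, ys = [breaks[0]], [values[0]]
    for k in range(1, n):
        # Bridge the jump at breaks[k] with one line segment.
        xs += [breaks[k] - delta, breaks[k] + delta]
        ys += [values[k - 1], values[k]]
    xs.append(breaks[-1])
    ys.append(values[-1])
    return xs, ys, delta

def evaluate(xs, ys, x):
    """Evaluate the piecewise-linear function with knots (xs, ys) at x."""
    i = max(0, min(bisect_right(xs, x) - 1, len(xs) - 2))
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])
```

For example, `ramp_knots([0, 1, 2, 3], [1, 4, 2], 0.1)` leaves the function equal to 1, 4 and 2 on most of the three intervals and replaces each of the two jumps by a short ramp.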
Suppose the proofs of Theorem A and Theorem B are similar and the proof of
Theorem A is given. In order to prove Theorem B, we only need to modify the part
of the proof of Theorem A that the new situation requires us to change.
In [Zyg, vol.1, p.104, l.
Zygmund says, "A minor modification in the proof of (7.15) shows that (7.16)
tends to 0 if (7.17) is replaced by
X(t) = o(t^2)."
He hints that we should use integration by parts to change an integral whose
integrand has the factor x into an integral whose integrand has the factor X.
[Zyg, vol.1, p.102, (7.16)] = [Zyg, vol.1, p.103, l.13]
= A + B2 [Zyg, vol.1, p.103, l.4, l.9, and l.12]. Therefore, we only need to
modify A and B2, but not B1. The observation that we need not
modify B1 allows us to avoid unnecessary computations.
The formulation and proof of a theorem should be centered around the essence of the theorem.
In a simply connected region, a harmonic function has a single-valued conjugate
function which is determined up to an additive constant [Ahl, p.162, l.
The key to proving this statement is the concept of exact differential [Ahl,
p.141, Theorem 15]. Ahlfors' approach centers around this key concept, so his
formulation and proof are direct and concise. In contrast, the formulations and
proofs of [Sak, p.444, Theorem 1.8; p.447, Theorem 1.11; p.447, l.-2-p.448,
l.1] are indirect and complicated without giving more information because Saks
fails to recognize the key concept.
Remark. The proof given in [Ahl, p.161, l.11-l.13] is simpler than the proof given in [Sak, p.445, l.5-l.17].
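The role of the exact differential can be sketched in one line. The symbols are chosen here for illustration and are not taken from [Ahl]: u is the harmonic function, v its conjugate, and z0 a base point in the simply connected region.

```latex
\[
\frac{\partial}{\partial y}\bigl(-u_y\bigr) = \frac{\partial}{\partial x}\bigl(u_x\bigr)
\iff u_{xx} + u_{yy} = 0,
\qquad
v(z) = \int_{z_0}^{z} \bigl(-u_y\,dx + u_x\,dy\bigr).
\]
```

The first relation says that the differential -u_y dx + u_x dy is closed precisely because u is harmonic. In a simply connected region a closed differential is exact, so the integral defining v does not depend on the path, and changing z0 changes v only by an additive constant.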
We want to prove that t1 maps the right half-plane Re(w) ≥ 0 onto the circular region given by the inequality [Wal, p.34, (7.3)].
Remark. There are many ways to prove the above statement; we try to find the simplest one.
Trick: Choose convenient points from the circle rather than from the imaginary axis.
Proof. Assume we know the fact given by [Ru2, p.298, l.18-l.19].
I. We want to find three points on the circle whose images are on the imaginary axis.
Solution: Take t = 0, (2 Re b1)^(-1)(1±i).
II. We want to find a point inside the circle whose image is on the right half-plane.
Solution: Take t = (2 Re b1)^(-1).
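The chosen points can be checked numerically. The sketch below rests on two assumptions that are not quoted from [Wal]: the map is taken to be t = 1/(b1 + w), and the circular region is taken to be the disc with centre and radius 1/(2 Re b1), with Re b1 > 0. The function names are hypothetical. The point t = 0 is omitted because under the assumed map it corresponds to w = ∞.

```python
def preimage(t, b1):
    # Inverse of the ASSUMED map t = 1/(b1 + w); this formula is a guess.
    return 1 / t - b1

def check_points(b1):
    beta = b1.real                 # Re b1, assumed positive
    c = 1 / (2 * beta)             # assumed centre and radius of the circle
    boundary = [c * (1 + 1j), c * (1 - 1j)]
    # Both boundary candidates should lie at distance c from the centre.
    on_circle = all(abs(abs(t - c) - c) < 1e-12 for t in boundary)
    # Their preimages should have zero real part (imaginary axis).
    boundary_real_parts = [preimage(t, b1).real for t in boundary]
    # The centre of the circle should come from a point with Re(w) > 0.
    centre_real_part = preimage(complex(c), b1).real
    return on_circle, boundary_real_parts, centre_real_part
```

With a sample value such as b1 = 2 + 3i, the two boundary points pull back to purely imaginary w, and the centre pulls back to a point whose real part is Re b1, which is consistent with steps I and II under the stated assumptions.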