Methodology in Differential Equations
- How we choose a setting that is most suitable for illustrating a method
(Partitions of unity [Mun, p.222, Theorem 5.1]) The construction of a partition of unity has wide
applications: topology, real analysis [Ru2, p.41, Theorem 2.13], and
differential geometry [Spi, vol. 1, p.69, Corollary 16]. In essence, the
construction of a partition of unity is a topological method. In order to
ensure the method's wide application, the setting must be general. In view of
[Dug, p.144, Proposition 3.2] and the diagram given in [Dug, p.311], a locally
compact, paracompact, or normal topological space meets the setting requirement.
In order to expressively illustrate a method's essence, the construction must be
simple. The method given in [Mun, p.222, Theorem 5.1] is simpler than that given
in [Ru2, p.41, Theorem 2.13]. The formulation and proof of Urysohn's lemma given
in [Dug, p.146, Theorem 4.1] is simpler than those given in [Ru2, p.40,
Proposition 2.12] and [Spi, vol. 1, p.44, Lemma 2]. Furthermore, choosing a finite partition of
unity will free us from considering the nuisance given in [Dug, p.170,
Definition 4.1(1)]. Apart from their settings, the methods of construction in [Mun,
p.222, Theorem 5.1], [Dug, p.170, Theorem 4.2] and [Spi, vol. 1, p.68, Theorem
15] are the same. All the above considerations make normal spaces the best
choice of a setting for illustrating "partitions of unity".
- Methods of solving PDE's.
- Adapting a method according to the occasion [Sne, p.226, l.17-l.19].
- We use the method of integral transforms when the membrane is of
infinite extent.
- We use separation of variables when the boundary has a
simple form.
- Using Lagrange's identity with various Green's functions and
integration contours.
- Dirichlet's problem for a circle (Poisson's integral [Sne, p.195, (12)]).
Green's function [Sne, p.195, l.-10].
Generalization [Sne, p.169, l.-16-p.170, l.-14].
- The Riemann-Volterra Solution of a one-dimensional wave equation [Sne, p.223, (12)].
Green's function [Sne, p.222, l.-3-l.-1].
The integration contour consists of characteristics and curves [Sne, p.222, Fig. 38].
Generalization: Sometimes the solution can be determined by the information on a part of the curve [Sne, pp.119-122].
- Solutions by Hilbert-space methods.
For a method, we would like to illustrate how it works in a simple model. If this new method expands the domain of application, we must show that it agrees with the old method in the old domain.
Example 1. $x^2 = 4$.
Define the inner product as (x,y) = xy.
The bounded linear functional $4^{1/2}x$ corresponds to the Cauchy sequence {2,2,2,...}.
The real solution {2,2,2,...} can be identified with the rational solution 2.
Example 2. Dirichlet's problem.
Define the inner product as in [Joh, p.119, (5.13)].
The bounded linear functional given in [Joh, p.120, (5.16)] corresponds to v in H.
There exists a $C^2$ solution V [Joh, p.123, (5.26); p.123, l.9] such that the Hilbert-space solution v=V a.e.
Example 3. Symmetric hyperbolic systems.
Define the inner product as in [Joh, p.166, (3.18a)].
The bounded linear functional given in [Joh, p.167, l.5] corresponds to U in H.
The weak solution $u = A\tilde{L}U$ can be identified with a strict solution in the ordinary sense [Joh, p.167, l.12].
- Analyticity allows us to solve a DE algebraically (see [Bir, p.96, Theorem 3] & [Pet, p.19, l.10-l.13]).
- (Comparison) Examples are used to show the advantage of a method. The method of successive approximations may solve both [Arn1, p.270, Example 2] and [Bur, p.4, Example 4], but the latter
problem cannot be solved by a finite combination of elementary functions [Bur, p.1, l.-12]. [Arn1] tends to emphasize the theoretical consistency between two methods and ignores the reason why we need a better method: To solve more ODE's.
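The method of successive approximations mentioned above can be sketched on a small model problem. The example below is not taken from [Arn1] or [Bur]; it assumes the illustrative initial value problem y' = y, y(0) = 1, and the helper names (`integrate_from_zero`, `picard_iterates`, `evaluate`) are hypothetical. Each step replaces y_n by 1 plus the integral of y_n from 0 to x, so the iterates are the Taylor polynomials of the exponential function.

```python
from fractions import Fraction


def integrate_from_zero(poly):
    """Coefficients of the integral from 0 to x of a polynomial.

    A polynomial is stored as [c0, c1, c2, ...] meaning c0 + c1*x + c2*x**2 + ...
    """
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]


def picard_iterates(steps):
    """Successive approximations for the assumed model problem y' = y, y(0) = 1."""
    y = [Fraction(1)]                      # y_0(x) = 1, the initial value
    for _ in range(steps):
        integral = integrate_from_zero(y)  # integral of y_n from 0 to x
        integral[0] += 1                   # add the initial value y(0) = 1
        y = integral                       # this is y_{n+1}
    return y


def evaluate(poly, x):
    """Evaluate the polynomial at x."""
    return sum(c * x**k for k, c in enumerate(poly))


if __name__ == "__main__":
    y10 = picard_iterates(10)
    print([str(c) for c in y10[:5]])           # leading coefficients 1/k!
    print(float(evaluate(y10, Fraction(1))))   # approximates e
```

The same loop applies to any right-hand side whose integral can be computed or approximated; only the integration step changes.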
- Providing a proof without a method involves giving the final answer first and then justifying
it from hindsight (see the proof of [Mun, p.322, Theorem 1.2]). We may distill a
method from the above justification. The method will enable us to systematically
proceed toward the solution by analyzing patterns and taking advantage of the
circumstances. [Mun00, p.327, l.7-l.10; l.11-l.14; p.328, l.-14-p.329,
l.6] are parts of the method. A method is useful for generalization [Mun00,
p.329, Theorem 51.3].
- The method given in [Wat1, §3.6] is a synthetic method.
Unless concrete examples are given [Ru1, pp.34-35, Proof of Theorem 2.40; Ru2,
p.227, Proof of Theorem 10.13], it is difficult to understand the method's true
meaning. There are two schemes in this method: [Wat1, p.53, l.23-25] is the
minor scheme; [Wat1, p.54, l.11-l.15] is the major scheme. Without examples, one
cannot tell which scheme is more important. Examples of the minor scheme are given in
[Ru1, p.35, l.2-l.3; Ru2, p.222, l.-12-l.-10].
The meanings of the minor scheme in these two examples are quite different.
Based on the proof of Cauchy's theorem given in [Ru2, pp.221-223, Theorem 10.13],
we need not use the method of reduction to absurdity. In contrast, the proof of
Cauchy's theorem given in [Wat1, §5.2] uses the method of reduction to
absurdity in [Wat1, p.84, l.-3].
- Refined methods
We say that the most effective proof of Theorem A is more
refined than the most effective proof of Theorem B if the hypothesis of Theorem
A is weaker than that of Theorem B, but the conclusion of theorem A is stronger
than that of theorem B. We may use the proof of Theorem A to prove Theorem B
even though it may not be the most effective method to prove Theorem B.
Examples. [Zyg, vol.1, p.78, Theorem 1.26] is more refined than the statement
given in [Zyg, vol.1, p.78, l.-7-l.-6];
[Zyg, vol.1, p.81, Theorem 1.38] is more refined than [Zyg, vol.1, p.81, Theorem
1.36]; [Zyg, vol.1, p.90, Theorem 3.9] is more refined than [Zyg, vol.1, p.89,
Theorem 3.4].
Remark 1. There are two versions of [Zyg, vol.1, p.90, Theorem 3.9]:
A. $\sigma_n(x) \to f(x)$ for every x where $\Phi_x(h) = o(h)$.
B. $\sigma_n(x) \to f(x)$ a.e.
If we adopt version A, we may use it to prove [Zyg, vol.1, p.89, Theorem
3.4]. In contrast, if we adopt version B, we will reach a point of no return.
Namely, we can no longer use version B to prove [Zyg, vol.1,
p.89, Theorem 3.4]. This is because the existence of x in version A is
constructive (more specifically, x is fixed), while the existence of x in version B is only logical. Modern
mathematicians love to use the term "almost everywhere" in real analysis simply
because the meaning of this term is easier to remember than the meaning of $\Phi_x(h)$.
This is the reason why refined methods have almost become endangered species in real
analysis.
Remark 2. Let T be $\{z \in \mathbb{C} : |z| = 1\}$ and $U \in L^1(T)$.
[Ahl, p.168, Theorem 25] is a refinement of [Ahl, p.166, Theorem 24]. See [Ahl,
p.167, l.9-l.10]. In contrast, the formulation of [Ru2, p.258, Corollary] loses
the advantage of refinement.
- A method may apply to different cases in a theorem or to theorems with
similar hypotheses but different conclusions.
Example. [Zyg, vol.1, p.66, Theorem 11.9] contains two cases. [Zyg, vol.1, p.105, Theorem
8.1] contains two cases. The statement given in [Zyg, vol.1, p.106, l.-11-l.-10] contains two cases.
All the proofs of these six cases use the same method.
- Mathematical training may help us gain deep knowledge or sharpen our senses. Sometimes these two purposes go hand-in-hand; sometimes
one does not accompany the other.
"Sharpen our senses" means "Be able to recognize a method's subtlety and
distinguish nuances between two methods".
Fourier series apply to integrable functions of period $2\pi$.
Fourier-Stieltjes series apply to functions of bounded variation defined in $[0,2\pi]$.
Let F be a function of bounded variation defined in $[0,2\pi]$.
When we discuss Fourier-Stieltjes series, it is advantageous to somehow make F periodic. The first method is to restrict our consideration to $[0,2\pi]$
and replace F(t) by $F(t) - c_0 t$
[Zyg, vol.1, p.11, l.16-l.18]. The benefit of making F periodic this way is that
$c_0 = 0$. The second method is to extend F to the
real line by using [Zyg, vol.1, p.11, (5.2)]. Although the new function produced
by the second method is not exactly periodic, it has the advantage similar to
that of a periodic
function [Zyg, vol.1, p.11, l.11].
Remark. The above two methods do not
contradict each other. We may apply both methods in one theorem. We should be
able to distinguish the
nuances of the above two methods not only when we read their definitions [Zyg,
vol.1, ch.1, §5] but also when we apply them
to a theorem [Zyg, vol.1, p.107, Theorem 9.3; l.16].
- We may express a function as an integral using Fourier transforms.
- In order to grasp the essence of the solution, we must know that Laplace transforms [Tit, pp.6-7, §1.4] and
Mellin transforms [Tit, pp.7-9, §1.6] are variants
of Fourier transforms.
- How we choose contours
- Fourier transforms [Tit, p.4, (1.2.5) & (1.2.6)]
- Generalized Fourier transforms [Tit, p.5, (1.3.4)]: For $f(x) = e^x$,
we use the contour $[-R+ai, R+ai]+[R+ai, R+bi]+[R+bi, -R+bi]+[-R+bi, -R+ai]$.
- Laplace transforms [Tit, p.7, (1.4.3)]: we use a closed curve C given in
[Tit, p.6, l.-1].
- Mellin transforms [Tit, p.7, (1.5.1) & (1.5.2)]: For [Tit, p.7, (1.5.3) &
(1.5.4)], we use the contour $\Gamma$ given in [Tit, p.8,
l.-10].
Remark 1.
It is important to understand the physical meanings of Fourier transforms [Coh, p.1462, (28) & (29); Dit,
pp.74-76, 4(9)-4(12); Dit, p.71, l.14-l.15].
Remark 2. [Guo, pp.101-103, §3.7] emphasizes
the details of an integral representation rather than the motive and method of
the solution. The discussion of Laplace transforms and that of Mellin transforms in [Lang1, p.272 and p.283] are not organized.
- For the proofs of a theorem, on the one hand we should seek the one that is simplest,
most effective and most insightful. On the other hand, in order to avoid being
trapped by the same pitfalls, we should
learn historical lessons by criticizing the shortcomings of each available proof
with a single pertinent remark that hits the nail on the head.
Example. (Mean value theorems)
The first mean value theorem essentially says that a continuous function maps a connected set onto a connected set. The second mean value theorem is derived from the first mean value theorem using integration by parts.
For applications, the most
convenient form of the second mean value theorem is that given in [Wid, p.163, Corollary 4.1 or Wat1,
p.66, l.-7-l.-5]. The
most insightful formulation of the second mean value theorem is that given in [Wid, p.162,
Theorem 3; p.163, Theorem 4]. Rudin's formulation of the second mean value
theorem [Ru1, p.124, Theorem 6.32] is not as closely related to the first mean
value theorem as is Widder's formulation. The proof given in [Wat1, p.66, l.1-l.-9]
is too complicated. In order to focus on the essence of the proof, Watson should have used
the method of integration by parts [Wid, p.160, Theorem 2] instead of the
method of summation by parts [Zyg, vol.1, p.3, (2.1)]. In his proof of the
second mean value theorem, Watson should not have repeated the proof of [Wid,
p.160, Theorem 2] and the proof of [Ru1, p.112, Theorem 6.14].
- Constructing continuous functions that are non-differentiable.
In each of the sections [Tit2, §11.21, §11.22 and
§11.23], Titchmarsh constructs a continuous function
that is not differentiable. The first one is simplest. This shows that we
should start a project with a small task. The first and the third example show
that if the derivative were to exist, it would have two different values. The
second example shows that if the derivative were to exist, its value would be $+\infty$.
Thus, the three constructions and proofs are similar. The reduction to
absurdity Titchmarsh uses can be considered trivial. Thus, Titchmarsh's proofs
are effective. In contrast, modern mathematicians love to use a non-trivial [1]
reduction to absurdity to construct continuous functions that are non-differentiable. Due to their
negligence, the method of construction in modern textbooks has deteriorated.
- When we try to solve a problem, we should focus on the essence of the problem.
[Sak, p.108, Theorem 3.4] is a corollary of the general theorem given in [Sak, p.107, Theorem 3.1]. The latter theorem
specifies the conditions under which we may differentiate under the integral sign.
Thus, the latter theorem highlights the essence of [Sak, p.108, Theorem 3.4]. In
contrast, [Ahl, p.121, Lemma 3] provides only a trick to solve a particular
problem. After studying Ahlfors's proof, we do not know the general method of solving this type
of problem.
Another drawback of Ahlfors's proof is that the scope of mathematical induction used in his proof is too broad. We should limit the scope of mathematical induction used in a proof as narrowly as possible.
In the proof of [Sak, p.108, Theorem 3.4], mathematical induction is used to
derive
$d[z^{-(k+1)}]/dz = -(k+1)z^{-(k+2)}$.
In contrast, Ahlfors tries to justify his differentiation in each induction
step [Ahl, p.121, l.-8-p.122, l.8]. In other words, he differentiates under
the integral sign countably many times. On
the one hand, it makes us spend too much time figuring out trivial details; on
the other hand, it allows the scope of mathematical induction to run out of
control.
- Specifying a simple procedure to produce the entire list
(Contiguous hypergeometric relations)
$z\,(dF/dz) = z\,(ab/c)\,F(a+, b+, c+)$
$= a\,[F(a+) - F]$
$= b\,[F(b+) - F]$
$= (c-1)\,[F(c-) - F]$
$= [(c-a)F(a-) + (a-c+bz)F]/(1-z)$
$= [(c-b)F(b-) + (b-c+az)F]/(1-z)$
$= z\,[(c-a)(c-b)F(c+) + c(a+b-c)F]/[c(1-z)]$.
The above identities can be proved by using the power
expansion of the hypergeometric function. Thus,
F can be written as a linear combination of any two of its contiguous functions, with
rational coefficients in terms of a, b, c, and z. Equating any two of the last
six expressions gives fifteen relations.
Remark. [Guo, p.202, Exercise 4.2] lists fifteen relations, but fails to provide
a simple method of producing them in an organized manner.
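As a numerical sanity check of the chain of equalities above, the following sketch sums the power series of F directly and evaluates z(dF/dz) in all seven forms. The function names (`hyp2f1`, `contiguous_expressions`) and the sample parameters are illustrative assumptions, and the truncated series is only meant for |z| well inside the unit disk.

```python
def hyp2f1(a, b, c, z, terms=300):
    """Truncated power series of the hypergeometric function F(a, b; c; z), |z| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # ratio of consecutive series terms: (a+n)(b+n) z / ((c+n)(n+1))
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
    return total


def contiguous_expressions(a, b, c, z):
    """Seven expressions that should all equal z * dF/dz."""
    F = hyp2f1(a, b, c, z)
    return [
        z * (a * b / c) * hyp2f1(a + 1, b + 1, c + 1, z),
        a * (hyp2f1(a + 1, b, c, z) - F),
        b * (hyp2f1(a, b + 1, c, z) - F),
        (c - 1) * (hyp2f1(a, b, c - 1, z) - F),
        ((c - a) * hyp2f1(a - 1, b, c, z) + (a - c + b * z) * F) / (1 - z),
        ((c - b) * hyp2f1(a, b - 1, c, z) + (b - c + a * z) * F) / (1 - z),
        z * ((c - a) * (c - b) * hyp2f1(a, b, c + 1, z) + c * (a + b - c) * F)
        / (c * (1 - z)),
    ]


if __name__ == "__main__":
    values = contiguous_expressions(0.5, 1.25, 2.75, 0.3)  # arbitrary sample point
    for v in values:
        print(v)
    print("spread:", max(values) - min(values))
```

Pairing any two of the last six entries of the returned list corresponds to one of the fifteen contiguous relations.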
- If a method is merely restricted to a specific case, we should ask what
method can be used for the general case.
[Guo, p.253, l.4-l.7] proves the equality in case Re(n) > -1.
This restriction ensures that the integrals along the two small circles given in
[Hob, p.195, Figure] converge to zero. In contrast, [Hob, p.193, l.-2-p.194, l.8]
can be used for the general case.
I.
The statement given in [Hob, p.194, l.1] can be proved as follows:
Proof. Let $\gamma(\theta) = -1 + re^{i\theta}$, where $\theta \in [-\pi, \pi]$, and let $\alpha(t) = -t$. Then
$\int_{\gamma} (t^2-1)^n t^r\,dt = -\int_{\alpha\circ\gamma} (t_1^2-1)^n (-t_1)^r\,dt_1$
[$dt = t'-t = (-t_1') - (-t_1) = -dt_1$]
$= \int_{\alpha\circ\gamma} (t_1^2-1)^n t_1^r\,dt_1$.
$\alpha\circ\gamma$
is positively oriented, while the contour of integration for $\int_0^{(1-)}$ is negatively oriented.
Although $\alpha$ starts from 0 and goes around t = -1 counter-clockwise once, the argument of t+1 at t = 0 returns to its original value from the viewpoint of the Riemann surface for $(t-1)^n$.
Consequently, the integration for $\int_0^{(-1+)}$ will not affect that for $\int_0^{(1-)}$.
II.
The fact that the phase of t'-1 increases from $-\pi$ to $\pi$ [Hob, p.194, l.4-l.5] can be verified by substituting t = 0, 1-i, 2, and 1+i into $t^2-1$.
III. We can prove the statement given in [Hob, p.194, l.7-l.8] for the case
Re(n+1)>0. After putting $\Gamma^{-1}$
on each side of the equality, we prove the general case by analytic continuation.
Remark. The equality given in [Hob, p.195, l.-10] should have been proved step by step as given in [Guo, p.253,
l.4-l.5].
- When we study a new concept, the first thing we should do is relate it to a
familiar concept by establishing a major link between them. This is because
analogy provides
a vantage point to see the big picture. Before the link is established, every
task is difficult. Once the link is established, every task becomes easy.
[Wat1, §15.8] discusses three theorems in the following order:
A. The function given in [Wat1, p.329, l.21] satisfies the differential equation given in [Wat1, p.329, l.18].
B. [Wat1, p.329, l.-11-l.-9, Theorem (I)]
C. [Wat1, p.329, l.-7-l.-5, Theorem (II)]
In my opinion, following the above order is a bad approach. It is better if we
discuss C first.
This approach will enable us to quickly establish a relationship between $C_{n-r}^{r+1/2}$ and $P_n^r$.
We can prove the equality given on [Wat1, p.329, l.-5] using [Hob, p.189, (12)] and [Guo, p.276, (10)]. Without using C,
it is difficult to prove A and B if one tries to follow the proof patterns given
in [Wat1, p.303, l.-15-l.-9; p.304, l.6-l.13].
This is because $C_{n-r}^{r+1/2}$
differs from $P_n^r$ by a factor containing $(z^2-1)^{-r/2}$. The difference will cause the number of terms to explode
when we use differential operators [Wat1, p.304, l.9].
It may also cause other problems when we compare coefficients [Wat1, p.303,
l.-13].
However, once C is proved, the proof of A and that of B will become easy. This
is because there are corresponding properties of $P_n^r$
ready for use. A
follows from C and [Guo, p.250, (16); Wat1, p.326, l.11]. B follows from C and [Guo,
p.256, (8); Wat1, p.324, l.18]. The formula given in [Wat1, p.324, l.18] is
based on Ferrers' definition [Wat1, p.323, l.-5].
When we apply the equality to the proof of B, we must add a factor of (-1)(1/4)-n
to the right-hand side of the equality. This is because the notation $P_n^r$ given in
[Wat1, p.329, l.-5] is based on Hobson's definition
[Wat1, p.325, l.-3] instead of Ferrers' definition [Wat1, p.323, l.-5].
- In order to gain flexibility, we should apply the essential idea rather than the exact form of the general theorem to a specific case.
Example. Coddington proves [Cod, p.20, Theorem 5.1] using the exact form of
the general theorem [Cod, p.12, Theorem 3.1]. However, the domain of the
solution in the general case is not large enough to meet the requirement.
Consequently, he uses [Cod, p.15, Theorem 4.1] to extend the domain. In
contrast, the proof of [Pon, p.167, l.-10-p.169, l.-1]
applies the same idea
directly to the entire domain. Consequently, it is immune from the fuss of domain expansion.
The techniques of domain expansion given in [Cod, p.15, Theorem 4.1] and [Cod,
p.20, Theorem 5.2] are unnecessary and insignificant.
- Methods of solving ODE's
- The homogeneous linear equations with constant coefficients
- Simple roots: [Detail: Inc1, § 6.11. Method: Inc1, p.134, l.2]
- Multiple roots: [Detail: Example 5.2.3.6 of http://www.lcwangpress.com/papers/absurdity.pdf; Inc1, §6.13,
§6.14. Method: Inc1, p.137, l.12-l.13]
Remark 1. The scope of the method extends beyond the case in which the coefficients are constants: [Wat1, p.201, l.6-l.7; p.208, l.-11].
Remark 2. The Euler linear equation can be transformed into a linear equation with constant coefficients by means of the substitution $x = e^z$ [Inc1, p.141, l.-6].
- The nonhomogeneous linear equations of order n
- The method of variation of parameters [Inc1, §5.23]
- The resulting formula that constructs Green's function [Cod, p.101, Problem 21]
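To make the substitution in Remark 2 concrete, here is a short worked sketch for the second-order case. The model equation and the constants a and b are illustrative assumptions and are not quoted from [Inc1]; the computation uses only the chain rule.

```latex
% Assumed model equation (a, b constants, x > 0):
%   x^2 y'' + a x y' + b y = 0.
\begin{align*}
x = e^{z} \;&\Longrightarrow\; \frac{d}{dx} = e^{-z}\,\frac{d}{dz},\\
x\,\frac{dy}{dx} &= \frac{dy}{dz},\\
x^{2}\,\frac{d^{2}y}{dx^{2}} &= \frac{d^{2}y}{dz^{2}} - \frac{dy}{dz},\\
x^{2}y'' + a\,x\,y' + b\,y = 0 \;&\Longrightarrow\;
\frac{d^{2}y}{dz^{2}} + (a-1)\,\frac{dy}{dz} + b\,y = 0.
\end{align*}
```

The transformed equation has constant coefficients, so the simple-root and multiple-root methods listed above apply to it directly, with $z = \log x$ substituted back at the end.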
- A formal solution allows us to quickly obtain a solution candidate
and have a crude blueprint for solving the problem.
If the conclusion of a theorem is valid in most cases, then we simply apply the conclusion to a problem without checking whether the situation satisfies the hypothesis of the theorem. Thus, we use the theorem first and justify the application later. This
formal procedure allows us to quickly obtain a solution
candidate and have a crude blueprint for solving the problem.
Example 1. In [Guo, p.81, l.12], we assume that the interchange of the
integral sign and $L_z$ is valid. This procedure allows us to quickly obtain a formal solution u(z) [Guo,
p.81, (5) & (6); p.82, (7)]. The assumption will be justified later case by
case. For example, in order to prove both that [Guo,
p.302, (2)] satisfies [Guo, p.302, (1)] and that the integral given in
[Wat1, p.339, l.-5] satisfies [Wat1, p.337, (B)], we use [Ru2, p.27, Theorem
1.34] to justify the differentiation under the integral sign.
After we obtain a
formal solution, it is easy to
forget to prove it to be a true solution. In [Guo, §6.4],
Guo fails to rigorously prove that the formal solution given in [Guo, p.302, (2)] is a solution of [Guo, p.302, (1)]. Since the integral given in [Guo, p.305, (1)] and the
left-hand side of the equality given in [Guo, p.305, (2)] are obtained by replacing zt by -t in
[Guo, p.32, (2) and (3)], [Guo, p.305, (1)] can only be considered a formal
solution of the Whittaker equation [Guo, p.300, (1)]. Guo also fails to rigorously prove that this formal solution is indeed a solution. In contrast, [Wat1, p.339, l.-6-p.340, l.6] rigorously proves that the integral given in [Wat1, p.339, l.-5] is a solution of the Whittaker equation [Wat1, p.337, (B)]. Note that Watson leaves out a factor, $(-1)^{-k-1/2+m}$, on the right-hand side of the equality given in [Wat1, p.340, l.2-l.3].
Example 2. We can quickly derive [Guo, p.298, (6)] from [Guo, p.143, (10)] by replacing z by z/b, and letting $b \to \infty$.
This formal procedure is justified in [Guo, p.302, l.2-p.303, l.5].
Example 3. We can quickly derive [Guo, p.303, (6)] from [Guo, p.153, (7)] by
interchanging a and b, replacing z by z/b, and letting $b \to \infty$.
This formal procedure is justified in [Guo, p.303, l.10-l.17].
- Finishing multiple goals with one shot
Example. [Guo, p.324, l.-2-p.325, l.7] finds the integrals given in [Guo, p.325, (17)] with one shot, while the proof given in [Wat1, p.350, l.-11-p.351, l.12] is divided into two cases.
- If there are several methods available, we should choose the simplest one because it is
the most effective one. Sometimes one method is always simpler than others; sometimes the choice for simplicity
varies from case to case.
Example 1. It is simpler to use [Inc1, p.161,
l.21-l.23] rather than [Wat1, p.202, l.16] to find out whether $z = \infty$ is a regular
singular point of the second-order ODE.
Example 2. (The root test vs. the ratio test)
For convergence tests, we choose the ratio test for $\sum z^n/n!$,
and choose the root test for $\sum ([2n^2+1]/[n^2+1])^n$.
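The choice in Example 2 can be seen numerically. In the sketch below (function names are illustrative), the ratio of consecutive terms of $\sum z^n/n!$ collapses to |z|/(n+1), and the n-th root of the n-th term of the second series collapses to $(2n^2+1)/(n^2+1)$, so in each case the chosen test needs no further estimation.

```python
def ratio_test_quotient(z, n):
    """|a_{n+1} / a_n| for a_n = z**n / n!; the factorials cancel to |z| / (n + 1)."""
    return abs(z) / (n + 1)


def root_test_quantity(n):
    """|a_n|**(1/n) for a_n = ((2n^2 + 1) / (n^2 + 1))**n; the root undoes the power."""
    return (2 * n * n + 1) / (n * n + 1)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # the first column tends to 0, the second tends to 2
        print(n, ratio_test_quotient(3.0, n), root_test_quantity(n))
```

Since the ratio tends to 0 for every fixed z, the first series converges for all z; since the root tends to 2 > 1, the second series diverges.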
- Specialization makes mathematical formulations effective. Specialization
discusses a topic case by case and reduces results to the simplest form for each
case. The form should be precise,
definite, exclusive, and final.
Example. Although the form given in [Wat1, p.365, l.2-l.3] is good for generalization [Wat1, p.368, l.-9-l.-1], it is not as effective as the forms given in [Guo, p.351, (5) & (6)].
First, the former form has not been reduced to simple form for each case, so it
is not good for direct application. Second, if a series terminates, we want to
know how many terms it has, otherwise the answer is not complete. [1]
- If a method can be used only for simple cases, we should change perspectives to find another method that provides easy access to all the cases.
Example. Let $m \le [n/2]$. Then $\sum_{k=m}^{[n/2]} C(n, 2k)\, C(k, m) = 2^{n-2m-1}\, n\, [(n-m-1)(n-m-2)\cdots(n-2m+1)]\,(m!)^{-1}$.
Proof. Using the expansion of $(1+x)^n$ and then letting x=1 or -1, we can prove case m=0.
Using the expansion of $(d/dx)[(1+x)^n]$ and then letting x=1 or -1, we can prove case m=1.
However, it will become difficult to prove case $m \ge 2$
if we continue to use the above combinatorial method. We shall resort to Bessel
functions.
$(\sinh\theta + \cosh\theta)^n + (\sinh\theta - \cosh\theta)^n = 2\sum_{k=0}^{[n/2]}\sum_{r=0}^{k} C(n, 2k)\, C(k, r)\, \sinh^{n-2k+2r}\theta$.
The result follows from [Wat1, p.375, l.18-l.20; Wat, p.272, (4)]. The properties of Bessel functions match our needs naturally and perfectly
just as the properties of the Riemann zeta function match the needs for proving
the prime number theorem.
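The identity in the example above can be checked by brute force for small n. The sketch below is only a numerical check, not a proof. It rewrites the bracketed product as a quotient of factorials, $(n-m-1)!/(n-2m)!$, which is an assumed reading that also covers the boundary case m = 0 for n ≥ 1; the function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def lhs(n, m):
    """Sum over k from m to [n/2] of C(n, 2k) * C(k, m)."""
    return sum(comb(n, 2 * k) * comb(k, m) for k in range(m, n // 2 + 1))


def rhs(n, m):
    """2^(n-2m-1) * n * (n-m-1)! / ((n-2m)! * m!), kept exact with Fraction.

    The quotient (n-m-1)! / (n-2m)! stands for the product
    (n-m-1)(n-m-2)...(n-2m+1) in the text (assumed reading for m = 0).
    """
    power = Fraction(2) ** (n - 2 * m - 1)   # may be 1/2 when n = 2m
    return power * n * factorial(n - m - 1) / (factorial(n - 2 * m) * factorial(m))


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, [(lhs(n, m), rhs(n, m)) for m in range(n // 2 + 1)])
```

Using exact rational arithmetic avoids rounding issues in the case n = 2m, where the power of 2 is 1/2.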
- In order to have a theory of good quality, we should keep theorems as strong as possible.
Acquiring a weaker theorem by choice or ineffective methods will affect the quality of the theory's development from then on.
Suppose $A \Rightarrow B \Rightarrow C$. It will result
in a stronger theory if we state A and B as theorems than if we state A and C as theorems because, according to syllogism, C cannot return to B.
- Physical methods
- Physical interpretations of a problem [Laplace's equation: Wat1, p.386, (I); Born, p.11, (7)]
- Selecting a coordinate system: We should choose a convenient coordinate system based on an object's geometric
symmetry [Jack, p.104, l.6-l.12].
- Physical interpretation of a solution [Born, p.16, (8); Wat1, p.397, l.8-l.19]
- Physical considerations help select meaningful solutions [Jack, p.107, l.-2-l.-1; Coh, p.648, (C-9); p.652, l.17, p.664, l.2].
- Solutions must be well-defined: In [Jack, p.104, l.15], we consider x (= $\cos\theta$) instead of $\theta$;
in [Jack, p.105, l.-5], we restrict r to be greater than 0.
- Choosing an appropriate solution form [Jack, p.104, l.-6-l.-5].
- Physical proofs
Color painting adds more dimensions and varieties to black-and-white drawing. Similarly, physical and geometric proofs provide richer
meanings, insights, and interesting stories than analytic proofs. Let us study
the following three proofs of the addition theorem for spherical harmonics:
[Wat1, p.395, l.7-l.21], [Coh, vol. 1, p.688-689], and [Jack, p.110, l.12-p.111,
l.10]. In terms of the publishing dates of the above textbooks, the proofs of
the later published books are better. The improvements are as follows:
- The choices of notations, coordinate systems, and orthonormal functions become more compatible with the
physical theme of the theorem.
- Notations: For spherical harmonics, the notation given in [Jack, p.108, l.-11] is concise, while the notation given in [Wat1, p.392, l.-4] is awkward. The formula given in [Jack, p.110, (3.63)] is concise, while the formula given in [Wat1, p.393, l.-7] is awkward. The awkward formulas given in [Wat1, p.394, l.3-l.-9] may blur the essential ideas.
- Orthonormal functions: Since we are discussing the solutions of Laplace's equation in spherical coordinates [Jack, p.95, l.-13;
Wat1, p.391, l.-1],
it is more appropriate to choose $Y_{lm}$ on the unit sphere instead of $P_n^m$ on [-1, 1] as the desired set of orthonormal functions [Jack, p.108, l.16].
- Ideally, the best physical proof is the one each of whose steps has a
pertinent physical interpretation. The development of physical methods shows the tendency toward such an ideal:
- The choice of n given in [Wat1, p.395, l.12] lacks physical motivation, while
the proof of [Coh, vol.1, p.688, (72)] supplies a physical reason: An eigenfunction of the angular momentum $L^2$ remains an eigenfunction with the same eigenvalue after a rotation. The fact that the rotation operators commute with $L^2$ [Coh, vol.1, p.688, l.-15-l.-14; p.699, (57)] is more obvious than the fact that
$\nabla^2$ is invariant under the rotation operators [Jack, p.110,
l.-8].
- Strictly speaking, Watson leaves a gap in the proof of the formula given in [Wat1, p.395, l.15]. Because $\theta_1'$ is a function of $(\theta,\phi)$ and $(\theta',\phi')$, he should have expressed $P_n(\cos\theta_1')$ as an expansion
of spherical harmonics in a form similar to that of the formula given in [Coh, vol.1, p.688, (74)]. If the expansion involved a term $Y_{km}$, where k is
other than n, then he would not be able to derive the formula given in [Wat1,
p.395, l.15]. Either poor notation or the lack of physical motivation keeps
him from detecting the said gap.
- [Coh, vol.1, p.689, (77)(i)] is derived from the fact that rotations form a group. [Coh, vol.1, p.689, (79)] is the Schwarz inequality. Therefore,
the discussion given in [Coh, vol.1, pp.688-689, §g(iii)] is purely analytical.
Actually, its idea is worse than that given in [Wat1, p.395, l.12-l.17]. Consequently, [Jack, §3.6] replaces it with [Jack, p.111, l.1-l.6] which
reveals more insights about rotation and angular momentum.
- If we correct the above drawbacks and make the following changes, the proof given in [Jack, §3.6] would be perfect.
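$[4\pi(2l+1)^{-1}]^{1/2}\, Y_{lm}^*(\theta(\gamma,\beta), \phi(\gamma,\beta)) = \sum_{m=-l}^{l} A_{lm} Y_{lm}(\gamma,\beta)$ [Jack, p.109, (3.58); Coh, vol.1, p.688, l.-12].
Letting $\gamma = 0$, we have $[4\pi(2l+1)^{-1}]^{1/2}\, Y_{lm}^*(\theta', \phi') = A_{l0} = A_m(\theta', \phi')$ [Jack, p.366, (3.66) and (3.60)].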
- Natural methods
- If we have choices for quoting, proving or classifying a theorem, we select the simplest, most direct, special, natural and original one.
Example. Pringsheim's criteria for convergence
[1].
- In order to characterize a concept, we should seek a minimal set of its
necessary conditions strong enough to become its sufficient conditions.
In the proof of [Perr, p.276, Satz 38], Perron shows that the sufficient conditions
for convergence originate from its necessary conditions [Perr, p.274, l.-6-l.-4]. This approach enables us to see how the theorem is
manufactured and formulated. In contrast, the proof given in [Wal, p.37, Theorem 8.1] fails to explain how the sufficient conditions are obtained
because it fails to provide a reason for the artificial classification of cases. See [Wal, p.38, l.4-l.5; l.17].
- How we recognize and appreciate the value of refined methods
In order to fully understand a refined method, we should not only know what it
is, but also recognize its value and key points.
- We should distinguish a refined method from other crude
methods by observing the effects it creates
First, we want to know from where the method comes. What problem
motivates mathematicians to create such a device? What obstacle can this
refined method overcome that older methods cannot?
Suppose that $z = \infty$ is a singularity of the second kind, that we know the solutions of [Cod,
p.151, (4.1)] for the real case, and that we want to find the solutions for the complex case
[Cod, p.161, l.-8-l.-5]. Then we need to replace the boundedness of f at $z = \infty$
in [Con, p.125, Theorem 1.4] with a growth condition, i.e., to prove [Con, p.135, Corollary 4.2].
See [Con, p.124, l.-2-p.125, l.1; Cod, p.164,
l.-10].
- We should not take a musket to kill a butterfly
We should highlight the amazing effects that a refined
method produces. If an old, crude method suffices, it is unnecessary to use a new, refined method. Using refined methods to do crude things is simply a waste.
For example, it is unnecessary to use the Phragmen-Lindelöf method to prove
[Ru2, p.274, Theorem 12.8]: we can prove the statement given in [Ru2, p.275,
l.11] using [Con, p.125, Theorem 1.4].
To specify a bound given the boundedness of f at $z = \infty$ is
not as amazing as to specify a bound given the growth condition of f. Compare [Ru2,
p.274, Theorem 12.8] with [Con, p.135, Corollary 4.2].
- How to highlight the key idea of a refined method
- Use the method of standardization to eliminate unnecessary complications. For example, use a symmetric case
[Con, p.135, Corollary 4.2] to represent the general case [Cod, p.162, Theorem
A] without loss of generality. See [Con, p.135, l.-3-l.-1].
- For the formulation of a refined method, we should follow the method's origin
and preserve its original setting. For example, [Con,
p.135, Corollary 4.2] is the right version; see [Con, p.124, l.-1-p.125,
l.1]. Adopting other versions such as [Con, pp.134-135, Theorem 4.1] or [Ru2,
p.276, Theorem 12.9] may
distract us from the essence of the Phragmen-Lindelöf method.
- A proof should be well-structured and insightful.
Both [Hob, p.360, l.12-p.361, l.21] and [Guo, p.268, l.2-l.16] prove the same formula. The former proof uses
the integral of sin and that of cos, while the latter proof uses the residue theorem.
In order to see why the statement given in [Hob, p.361, l.18-l.19] is true, we
have to do some calculations from the viewpoint of the former proof. However,
from the viewpoint of the latter proof, we can see the reason directly [Guo,
p.268, l.12-l.16]. Consequently, the latter proof is well-structured and insightful.
- (Structures: the natural order of discussion)
(Hyperboloid of one sheet, one piece, $x^2/a^2+y^2/b^2-z^2/c^2 = 1$)
→ (Cone, two pieces with one-point intersection, $x^2/a^2+y^2/b^2-z^2/c^2 = 0$)
→ (Hyperboloid of two sheets, two disjoint pieces, $x^2/a^2+y^2/b^2-z^2/c^2 = -1$)
The arrangement [Fin, §311] → [Fin, §313] → [Fin, §315] is good, while
the arrangement [Bell, §59-§63] → [Bell, p.99, (2) → (3)] is not good.
- Links {1, 2 (the method of majorants)}.