Proofs in Mechanics

(Equilibrium)

[Rei, p.75, l.1-l.12] gives an analytic proof of [Rei, p.75, (2.9.4)]. In contrast, the proof of [Kit, p.66, (26)] gives the physical meaning behind the mathematical formula.
[Pat, p.101, (3)] and [Pat, p.102, (4)] are similar. The similarity leads to the guess that [Pat, p.102, (5)] is valid. However, a guess is not a proof. Pathria's failure to transform the guess into a proof causes him to neglect many subtle ideas. A good illustration for the first equality in [Pat, p.102, (5)] can be found in [Rei, p.218, (6.6.22)]. The proof of the second equality can be found in [Kit, p.67, (32)].
The proof of [Rei, p.114, (3.8.9)] involves a difficult concept [Rei, p.114, l.5]. The proof of [Kit, p.67, (32)] nicely avoids this difficulty. In addition, Reif’s proof is only an estimate, while Kittel’s proof gives a precise equality.

Two approaches to a particular problem may lead to different conclusions. By comparing conclusions, we can determine which approach more accurately reaches its target.
    By Fourier transforms [Jack, p.330, (7.104)], we derive
[ε(ω)/ε₀]−1=−[G'(0)/ω²].
By [Jack, pp.309-310, (7.49), (7.50) & (7.51)], we derive
[ε(ω)/ε₀]−1=−[ω_p²/ω²]   [Jack, p.313, (7.59)].
Thus, if we adopt the viewpoint of dipole moment rather than the viewpoint of Fourier transforms, we will be able to obtain a stronger and more precise conclusion.
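The comparison can be written out as a short worked equation. This is only a sketch: it assumes that both expressions are high-frequency approximations of the same quantity, and it takes [Jack, p.313, (7.59)] in the form ε(ω)/ε₀ ≃ 1 − ω_p²/ω².

```latex
\frac{\epsilon(\omega)}{\epsilon_0} - 1 \simeq -\frac{G'(0)}{\omega^{2}}
\quad\text{(Fourier transform)},
\qquad
\frac{\epsilon(\omega)}{\epsilon_0} - 1 \simeq -\frac{\omega_p^{2}}{\omega^{2}}
\quad\text{(dipole moment)}
\quad\Longrightarrow\quad
G'(0) = \omega_p^{2}.
```

The Fourier-transform argument fixes only the 1/ω² dependence and leaves the coefficient G'(0) unspecified, whereas the dipole-moment argument also supplies the value of that coefficient.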

Cohen-Tannoudji derives the formula for the center of a wave packet by showing figures [Coh, p.26, Fig. 5] or using the concepts in optics [Coh, p.25, l.−11]. He only points out the key ideas in deriving the formula. However, from the perspective of mathematics, an important factor which leads to a conclusion is only a part of the argument. It is not a complete proof.

Complete proof.

The left-hand side achieves its maximum only when x=x₀.
    Although extrapolation may allow one to extend one's knowledge to an unexplored domain, it does not mean that we may leave out the steps necessary to complete a proof.
To understand a theorem [Go2, p.317, (7-125)] from various approaches, we must figure out the potential of each argument. Some arguments are crude [Go2, p.317, l.−6] but intuitive, so they can be used as a double check.

Legitimate renaming saves one from having to repeat the proof [Coh, p.706, l.5]. The powers of the cyclic permutation (x, y, z) preserve the orientation of the coordinate axes. Thus it does not matter whether we name the first coordinate axis x [i.e., (x, y, z)], y [i.e., (y, z, x)], or z [i.e., (z, x, y)].
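The orientation claim can be checked mechanically. The sketch below is an illustration rather than anything taken from [Coh]: it labels the axes x, y, z as 0, 1, 2 (an assumed encoding) and computes the sign of each permutation, which equals the determinant of the corresponding permutation matrix. A sign of +1 means the orientation is preserved.

```python
from itertools import permutations

def permutation_sign(perm):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return 1 if inversions % 2 == 0 else -1

# Assumed encoding: index 0, 1, 2 stands for the x, y, z axis.
CYCLIC = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # (x,y,z), (y,z,x), (z,x,y)
SWAPS = [p for p in permutations(range(3)) if p not in CYCLIC]

cyclic_signs = [permutation_sign(p) for p in CYCLIC]   # all +1
swap_signs = [permutation_sign(p) for p in SWAPS]      # all -1
```

The three cyclic renamings are even, so a proof written for one naming carries over to the others. The remaining three renamings are odd and would reverse the orientation.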

Any theorem has its own theme. Although a building block (trees, fundamental cuts) for the proof of Theorem A [Win, p.27, l.21, Theorem 1.5] can be used to prove Theorem B, the approach may bury the central idea of Theorem B in unnecessary complications [Win, p.83, l.12]. A simple proof [Cor, p. 156, l.−3] is the one that can clearly reveal the theorem's central idea (A branch current is the difference between two neighboring loop currents).

Proof that provides an insight vs. rigorous proof. Insight: [Sym, p.315, (8.115) & (8.116)]. Rigorous proof: [Cho, p.11, Lemma; p.14, l.11-l.13].

Each of the following proofs can be characterized by the type of the strategy it employs.
(Easy Start) It puts one in an advantageous position so that one may easily make progress. Easy start: Consider a small cube [Mar, p.185, l.1-l.−7]. Not easy start: [Cho, p.11, Lemma].

(Straightforward proof) It proceeds directly toward one's goal and prevents one from becoming entrapped in a maze of formalism. Straightforward: [Mar, p.180, l.5]. Not straightforward: [Sym, p.320, l.3-l.6].

(Proof that can determine the contributing factor) It pinpoints the key component that affects the conclusion after one decomposes the object under study into several components. Proof in this category: [Cho, p.24, (2.1); p.27, l.3]. Proof not in this category: [Cho, p.11, Lemma].

Proof vs. Simulation
    Computer simulation requires that the construction of a proof be straightforward. It is very difficult to implement reduction to absurdity on a computer. Sometimes the argument of a proof does not give a specific method. In that case, we must choose the method that is most suitable for the problem. For example, it is difficult to implement the proof of [Ru2, p.254, Theorem 11.9] because Rudin fails to specify the construction of g_k's [Ru2, p.255, l.1]. In contrast, it is easier to implement [Col, p.280, §6] because Collatz provides the specification in [Col, p.280, (V.19)].

An intuitive statement should require little validation. If the validation is too complicated, the proof may have the following symptoms [Lin, p.448, Theorem 2]: First, the language is too technical or professional. Few people understand it. Second, the setting is too complicated. Third, the method of validation may include unnecessary steps.

Only the original proof for a concrete case is significant because it uses physical milestones to initiate ideas that lead to the proof's details.

    Reif's proof [Rei, p.37, §1.11] of the central limit theorem is organized in a simple manner, but it involves many concepts such as random walk, the Dirac δ-function, Fourier transform [Rei, p.36, (1.10.5)], expectation and variance. I would like to call his physics-style proof a proof with milestones. There are a few distinctive rest stops between the beginning and the end of his proof: (1) The random walk case: [Rei, (1.4.6), (1.4.9), (1.5.19)] [Q.E.D. for this special case]. (2) The random variable changes from discrete to continuous [Rei, (1.6.3)]. (3) The general case: [Rei, (1.9.3), (1.9.12), (1.10.8), (1.11.8)]. We start with a special case because it has a clear purpose and eliminates unnecessary complicated considerations. Furthermore, the net movement for a random walk is bounded and traceable. However, understanding the sum of general random variables requires the concept of convolution. Consequently, the image of the general sum is difficult to visualize. In contrast, a math proof [Boro, p.152, Proof of Theorem 2] is different: It only considers the beginning and the end and does not care which path it takes. Generally speaking, it avoids big theorems and prefers an elementary approach or a short cut. Thus it lacks milestones to distinguish its journey. This is a serious flaw in a math proof because it means that the proof lacks the big picture to inspire, to guide and to facilitate memory.
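The first milestone, the random walk, is also the easiest one to explore numerically. The following sketch is not Reif's derivation. It assumes steps of ±1 with an assumed probability `p_right` of stepping right, and the function names and default parameters are illustrative. It estimates the mean and standard deviation of the net displacement, together with the fraction of walks that end within one standard deviation of the mean, which a Gaussian would put at roughly 0.68 (up to discreteness effects).

```python
import random
import statistics

def net_displacement(n_steps, p_right, rng):
    """Sum of n_steps independent +1/-1 steps, +1 with probability p_right."""
    return sum(1 if rng.random() < p_right else -1 for _ in range(n_steps))

def simulate(n_walks=5000, n_steps=400, p_right=0.5, seed=0):
    """Return (mean, standard deviation, fraction within one std of the mean)."""
    rng = random.Random(seed)
    samples = [net_displacement(n_steps, p_right, rng) for _ in range(n_walks)]
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)
    within = sum(abs(s - mean) <= std for s in samples) / n_walks
    return mean, std, within
```

For the default parameters the theoretical values are a mean of n(2p−1) = 0 and a standard deviation of √(4npq) = 20, so the simulated numbers can be compared against them directly.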

The purpose of a proof is to point out the key idea (i.e., the simple and direct reason that leads to the conclusion) rather than complicate the original problem more than necessary.
    Consider the proof of the laws of refraction and reflection. From the mathematical point of view the key idea is [Born, p.37, l.5-l.7]. From the physical point of view the key idea is [Jack, p.304, l.9]. The boundary conditions [Hec, p.112, (4.15)] provide only the secondary factor instead of the key idea. This is because the boundary conditions are difficult to prove and because e^{ix}=e^{iy} implies x=y+2nπ rather than x=y [Hec, p.112, left column, l.−6]. Except for the part [Hec, p.112, left column, l.1-l.22], the proof of the laws of refraction and reflection in [Hec, pp.111-112, §4.6.1] is very good.
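The remark about the phase factors is easy to confirm numerically. In the sketch below the values of `x` and `n` are arbitrary choices made for illustration. Two phases that differ by a multiple of 2π give the same exponential even though the phases themselves are unequal.

```python
import cmath
import math

def same_phase_factor(x, y, tol=1e-12):
    """True when exp(ix) and exp(iy) coincide within a numerical tolerance."""
    return abs(cmath.exp(1j * x) - cmath.exp(1j * y)) < tol

x = 0.7                      # arbitrary illustrative phase
y = x + 2 * math.pi * 3      # differs from x by 2*n*pi with n = 3
```

Here `same_phase_factor(x, y)` holds while `x != y`, so equality of the exponentials alone does not justify equating the phases.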

[Bra, p.109, l.13] fails to explain why a_k⁽¹⁾ is assigned to be 0. [Lan3, p.134, l.−5-l.−1] points out the subtle reason. The choice in [Coh, p.1099, l.6] makes the proof rigorous.

The classical proof vs. the quantum mechanical proof.
    Any theorem in classical mechanics can be interpreted using quantum mechanics. Therefore, a theorem in classical mechanics has its corresponding version in quantum mechanics.

The conclusion of the quantum mechanical version must agree with the conclusion of the classical or semi-classical version [Coh, p.1239, l.1-l.3].
The axioms of classical mechanics are less universal than those of quantum mechanics because they do not apply to the microscopic world. In other words, a quantum mechanical proof puts the theorem on a more solid foundation.
When generalizing a classical proof to a quantum mechanical one, we should modify the concepts of state, measurement, etc. accordingly.
The classical proof of Maxwell’s law of distribution of velocities is obvious [Pat, p.149, (12)]. In view of Kittel’s argument [Kit, pp.392-393] it can be said that his quantum mechanical proof is much ado about nothing. Nevertheless, let us review Kittel’s proof. Using the density of states, Kittel changes from dv to dn and then switches back to dv. In classical mechanics, any physical quantity such as velocity can have its own probability distribution. In quantum mechanics, the probability distribution only refers to the number of microstates. In this sense, quantum mechanics unifies the concept of probability distributions.

A good proof enables us to gain insight into the big picture through simple devices.
    Both [Coh, pp.792-797, §3] and [Lan3, pp.117-120; pp.659-663, §d] derive the formulae for the radial functions. In Landau’s derivation, the simple forms [Lan3, p.661, (d.11) & (d.12)] help us visualize the solution of the differential equation [Lan3, p.659, (d.1)]. [Lan3, p.662, (d.14)] is derived by a simple contour integration [Lan3, p.662, (d.13)] and a simple substitution t → t+z [Lan3, p.662, l.17]. In contrast, Cohen-Tannoudji replaces [Lan3, p.662, (d.14)] by a lengthy calculation [Coh, p.795, (C-31)] and a comparison between [Coh, p.795, (C-26)] and [Coh, p.796, (C-33)]. A series is an abstract entity because it is difficult to visualize its geometric or analytic [say, its zeros] properties. Therefore, Cohen-Tannoudji’s series approach obscures the big picture although it is more directly oriented toward the specified estimation [Lan3, p.118, (36.5)].
Remark. There is a big difference in logic between "rigorous argument" and "precise estimation". In [Coh, p.794, l.−13], Cohen-Tannoudji should have claimed, "This estimation is not precise when r does not approach +∞, …" instead of "This argument is not rigorous, …" In fact, the argument is rigorous for the case r → +∞. When r → +∞, the contribution of r^l to r^l e^{±rl} is not as important as that of e^{±rl}, so the former factor can be ignored. When r does not approach +∞, [Coh, p.794, (C-22)] is no longer applicable. We need a more refined argument to obtain the precise solution, see [Lan3, p.662, (d.14)]. [Coh, p.796, (C-32)] treats ~ as =. The justification of this treatment needs extra work. In this sense, it is appropriate to say that Cohen-Tannoudji’s derivation of his conclusion in [Coh, p.796, l.16] is not as rigorous as Landau’s [Lan3, p.119, l.6].

The method of synthesis consists of listing similar examples and then identifying their common feature [Ashc, p.140, l.-10-l.-9]. However, no matter how long the list, the method can only be used to predict a general theorem [Bir, p.273, Theorem 5]. A proof is still required to validate the theorem.

For an argument, we must clarify our point of view. In other words, we must organize the levels of assumptions in an appropriate order. Only then may we discuss the purpose (or the key point) of the argument.
Examples.
The viewpoint of [Ashc, p.224, l.-25-l.-19] is classical. The quoted passage explains why the argument is confusing and why it limits our capabilities.
The viewpoint of [Ashc, p.224,l.-18-p.225, l.-25] is semiclassical. The quoted passage explains why the acceleration of the electron is opposite to the external electric force when it approaches a Bragg plane.
Suppose we apply external fields to electrons. [Ashc, p.227, l.-19-l.-15] compares the orbit of a Bloch electron with that of a free (not under the influence of the periodical potential in a solid) charge. The quoted passage specifies when the orbit of a Bloch electron resembles that of a free electron and when it resembles that of a free hole.

[Jen, p.30, l.-6-p.31,l.3] proves a theorem on minimum deviation using both the theoretical argument and an experimental result. The additional experimental knowledge makes Jenkins' proof valuable for the following two reasons: First, it establishes the truth of the theorem when a more rigorous proof [Hec, p.188] is not immediately available. Second, it directly points out the physical reason (symmetry) why the experimental result would be contradicted if the theorem were false.

[Jen, p.52, l.-8-l.-1] provides a graphical method to simulate the real process. [Jen, p.53, l.1-l.9] proves that the resulting diagram is in proportion to the real picture. That is, the various lengths in [Jen, p.57, Fig. 3K] satisfy [Jen, p.56, (3n)]. However, a purely mathematical design that satisfies the formula [Jen, p.56, (3n)] lacks a physical foundation (i.e., no physical meanings can be attached to the construction in [Jen, p.53, Fig. 3I]), so [Jen, p.53, l.1-l.9] is not a proof of the physical formula [Jen, p.56, (3n)].

The derivation of the lens formula in [Jen, p.73, §4.14] is simpler than that in [Hec, p.158, (5.17)].

(The Gaussian formula)
    The derivation of [Hec, p.154, (5.8)] is based on Fermat's principle, while the derivation of [Jen, p.56, (3n)] is based on Snell's law. The proof of Snell's law requires very little knowledge of wave theory. However, an understanding of Fermat's principle requires deeper and more abstract knowledge of wave theory. Since geometrical optics like classical mechanics prefers to formulate its theory without using the wave concept, Jenkins' proof is more pertinent and intuitive.
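The relation between the two starting points can be illustrated numerically for a single flat interface. The sketch below is not taken from either book. It assumes a source at (0, a) in a medium of index n1, a target at (d, −b) in a medium of index n2, and an interface along y = 0, and all function names are hypothetical. It minimizes the optical path by ternary search and then evaluates n1 sin θ1 − n2 sin θ2 at the minimizing point.

```python
import math

def optical_path(x, n1, n2, a, b, d):
    """Optical path length from (0, a) to (d, -b) via the interface point (x, 0)."""
    return n1 * math.hypot(x, a) + n2 * math.hypot(d - x, b)

def fermat_crossing(n1, n2, a, b, d, iterations=200):
    """Ternary search on [0, d] for the crossing point of least optical path."""
    lo, hi = 0.0, d
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if optical_path(m1, n1, n2, a, b, d) < optical_path(m2, n1, n2, a, b, d):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def snell_residual(n1, n2, a, b, d):
    """n1 sin(theta1) - n2 sin(theta2) at the least-path crossing point."""
    x = fermat_crossing(n1, n2, a, b, d)
    sin1 = x / math.hypot(x, a)
    sin2 = (d - x) / math.hypot(d - x, b)
    return n1 * sin1 - n2 * sin2
```

The residual is exactly the derivative of the optical path with respect to x, so it vanishes at the minimum up to the accuracy of the search. In this simple geometry the least-path condition and Snell's law are the same statement.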

One should not base one's proof on a shaky assumption.
    [Lan2, pp.101-103, §43] proves Biot-Savart's law [Sad, p.263, l.2-l.6] using Maxwell's equations [Lan2, p.102, (43.1) & (43.2)] derived from special relativity. In contrast, [Sad, p.290, l.-13-p.291, l.5] proves Biot-Savart's law using [Sad, p.285, (7.41)], an analogue of [Sad, p.285, (7.40)]. Analogy can be used to formulate a guess, but should not be used to validate the analogue of a theorem. Therefore, Sadiku's proof is based on a shaky assumption. Given his desire to use analogy, one also wonders why he does not derive Biot-Savart's law directly from Coulomb's law [Jack, p.175, l.25] in the first place. It would use a similar analogy and the proof would be much simpler. Sadiku's reasoning can, at best, be considered a detailed check for consistency, but should not be regarded as a proof.
Remark. One can derive relativistic momentum and energy of a charged particle either from the principle of least action [Lan2, pp.24-29, §8 & §9] or from the law of conservation of energy and momentum [Jack, p.533, l.6]. You may say these two methods also use analogy. However, the main point is that we must push the use of analogy to a level as basic in logic and universal in applications as possible.

If a theorem can be formulated in several forms, one should not use one form to prove another form because it is tantamount to using a statement to prove itself.
    Sadiku's derivation of Ampere's law in [Sad, p.291, l.6-p.292, l.3] basically follows the idea in [Cor, pp.351-352, §19.2, §19.3, §19.4]. The only difference is that Sadiku uses the differential form of Ampere's law to establish [Sad, p.291, (7.60)]. Thus, Sadiku actually uses the differential form of Ampere's law to prove the integral form of the same law. The validity of a theorem should not be established by random substitutions without distinguishing the assumption from the conclusion. Personally, I prefer Jackson's derivation [Jack, p.178, l.-10-p.179, l.-5] of Ampere's law because Jackson's proof is more effective than Corson's.

Proofs centered around the key point vs. proofs initiated by formalism.
(The Doppler Effect) [Dit, p.27, 2(20)] fails to distinguish between the two cases in [Rob, p.22, l.6-l.7]. Robinson's proof [Rob, p.22, l.8-l.25] is limited by formal definitions and the habitual practice that is not suitable to this problem. The formalism causes unnecessary complications and difficulties. In contrast, Halliday's proof [Hall, p.329, l.19-p.331, l.3] emphasizes the key idea and gives a clear picture.

Strictly speaking, the lattice vibration energy is given by [Hoo, p.60, (2.51)], not [Kit2, p.123, (29)]. The difference between the above two formulas is the zero-point energy which is nonzero. If the zero-point energy were 0, the uncertainty principle would be contradicted [Lev2, p.70, l.15-p.71, l.4]. Since the zero-point energy is a constant, we can still derive the correct heat capacity by using Kittel's incorrect formulation of lattice vibration energy.
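The last point can be checked for a single vibrational mode. The sketch below is a simplified stand-in for the cited formulas, not a transcription of them. It assumes one oscillator with Planck occupation, measures energy in units of ħω and temperature as t = k_B T/(ħω), and differentiates numerically with an assumed step `h`.

```python
import math

def mean_energy(t, zero_point=True):
    """Mean energy of one mode in units of hbar*omega; t = k_B T / (hbar*omega)."""
    occupation = 1.0 / math.expm1(1.0 / t)
    return occupation + (0.5 if zero_point else 0.0)

def heat_capacity(t, zero_point=True, h=1e-5):
    """Central-difference dE/dT for one mode, in units of k_B."""
    up = mean_energy(t + h, zero_point)
    down = mean_energy(t - h, zero_point)
    return (up - down) / (2.0 * h)
```

Because the zero-point term does not depend on temperature, `heat_capacity(t, True)` and `heat_capacity(t, False)` agree to within rounding error at every temperature tried.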

A potential difference is path independent.
    The proof in [Wangs, p.69, l.-3] points out the theorem's governing mathematical principle for the general case, while the proof in [Hall, p.464, l.11-l.3] gives a clear picture of the proof's key idea for a simple model (a point charge) without even using calculus.
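For the point-charge model the statement can also be verified by direct numerical integration. The sketch below works in a plane and in units where kq = 1. The end points and the detour are arbitrary illustrative choices, and the integral of E · dl is approximated by the midpoint rule along straight segments.

```python
import math

def field(x, y):
    """Field of a point charge at the origin, in units where k*q = 1."""
    r3 = math.hypot(x, y) ** 3
    return x / r3, y / r3

def line_integral(points, steps=20000):
    """Midpoint-rule integral of E . dl along a polyline through the points."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for i in range(steps):
            ex, ey = field(x0 + (i + 0.5) * dx, y0 + (i + 0.5) * dy)
            total += ex * dx + ey * dy
    return total

A, B = (1.0, 0.0), (0.0, 3.0)                        # illustrative end points
direct = line_integral([A, B])                       # straight path
detour = line_integral([A, (4.0, 0.5), (2.0, 5.0), B])
expected = 1.0 / math.hypot(*A) - 1.0 / math.hypot(*B)   # V(A) - V(B)
```

Both paths give the same value, which depends only on the distances of A and B from the charge. That is the content of the elementary argument for the point-charge model.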

A valuable proof is based on careful and comprehensive analysis.
    A correct proof is not necessarily a valuable proof. Both [Iba, pp.38-40, §3.2] and [Ashc, pp.86-87, §The reciprocal lattice is a Bravais lattice] show that [Ashc, p.86, (5.3)] is the basis of the reciprocal lattice. The former proof considers all the possibilities. After eliminating the impossible candidates, we finally obtain [Iba, p.39, (3.19)]. In contrast, the potential answer is given right at the beginning of the latter proof and its correctness needs to be substantiated. No clue is given about how Ashcroft acquires this answer in the first place. Thus, from the analytical point of view, the latter proof is not as valuable as the former proof.

The keys to a good proof are clarity, directness and precision, not fancy words whose meaning is unclear.
    The proof of [Haw, p.196, (13-8)] is clear and to the point [Haw, p.196, l.6-l.8]. In contrast, the explanation of [Pee, p.234, (8.32)] uses fancy but vague words and is not to the point [Pee, p.234, l.10-l.11]. The proof of a special case does not necessarily imply the validity of the general case.

A neat argument requires ingenuity rather than the omission of a few indispensable key words. Many physics professors at Princeton University and MIT like to leave gaps in their arguments for students to fill. They claim that this method helps students to think. I consider this practice dishonest because it makes an argument's shortcomings difficult to detect. For example, Huang leaves some gaps in the proof of [Hua, p.167, (8.48)]. However, his proof has the following shortcomings:

It lacks clarity.
    Classical mechanics has a problem with counting physical states because only quantum mechanics has a rigorous definition of physical states. Nevertheless, an author should pinpoint the exact logical step that requires a correction by quantum mechanics. However, Huang fails to do so. More specifically, [Reif, p.347, (9.6.4)] is justified by [Reif, p.348, (9.6.10)], while the derivation of [Hua, p.163, (8.25)] is not clear. Therefore, the definition of a grand partition function given in [Reif, p.347, (9.6.5)] is clear, while the definition given in [Hua, p.164, (8.34)] is not.
It is either a repetitive argument or a detour argument (In [Man, §11.7.1], Mandl derives [Man, (11.130)] without using [Hua, p.167, (8.47)]).
It deviates from the central theme (It is unnecessary to discuss ∂²A/∂N'² [Hua, p.166, (8.41)] when we study ΔN).
    If an author leaves gaps in his argument and later his argument turns out to have drawbacks, does he want his readers to make a strenuous effort to learn an awkward argument? If an author cannot predict whether his argument will remain neat in the future, he had better make it as clear as possible. At least, he will have clarity to his credit.
    When a derivation is not step by step, there are negative consequences. From [Hua, p.189, l.-3] to [Hua, p.190, l.16], all the derivations are incorrect. Only the final answers are correct. For the correct derivations, see [Reic, p.355, (7.45) & (7.46)]. To repair their academic credit, the professors in a privileged university must quit this bad habit.

(The linear electric quadrupole)
    The derivation of [Cor, p.87, (5-14)] is based on Coulomb's law, while the derivation of [Chou, p.66, l.-4] is based on the Taylor series expansion of the electric potential in multipole terms. The former derivation is simpler in concept and in calculation.
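The two routes can be compared numerically. The sketch below assumes a particular arrangement that may differ in detail from the one in the cited texts: charges +q at z = +s and z = −s and −2q at the origin, in units where kq = 1. It evaluates the Coulomb sum directly and compares it with the leading term of the expansion in s/r, which for this arrangement is s²(3cos²θ − 1)/r³.

```python
import math

def exact_potential(r, theta, s=1.0):
    """Coulomb sum for +q at z = +s and z = -s and -2q at the origin (k*q = 1)."""
    x, z = r * math.sin(theta), r * math.cos(theta)
    return 1.0 / math.hypot(x, z - s) + 1.0 / math.hypot(x, z + s) - 2.0 / r

def quadrupole_term(r, theta, s=1.0):
    """Leading term of the expansion in s/r: s^2 (3 cos^2(theta) - 1) / r^3."""
    return s**2 * (3.0 * math.cos(theta) ** 2 - 1.0) / r**3
```

At r = 50 s the two functions agree to a small fraction of a percent, and the agreement improves as r grows, since the first neglected term is smaller by a factor of order (s/r)².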

[Guo, p.212, l.7-p.213, l.-1] uses the power series representation to prove that the Legendre polynomial satisfies the Legendre equation. Guo's approach is straightforward and intuitive. In contrast, the proof given in [Chou, p.601, l.-7-p.602, l.9] uses recurrence formulas many times. Thus, the direction that Choudhury's proof follows is not as clear as the direction that Guo's proof follows.
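The directness of the power-series route shows in how little is needed to check it by machine. The sketch below is not Guo's text. It assumes the standard closed form 2ⁿ Pₙ(x) = Σₖ (−1)ᵏ C(n,k) C(2n−2k, n) x^(n−2k), which keeps all coefficients integral, and it substitutes the series into (1 − x²)y″ − 2xy′ + n(n+1)y coefficient by coefficient.

```python
from math import comb

def legendre_coeffs(n):
    """Integer coefficients c[m] of x**m in 2**n * P_n(x), from the power series."""
    c = [0] * (n + 1)
    for k in range(n // 2 + 1):
        c[n - 2 * k] = (-1) ** k * comb(n, k) * comb(2 * n - 2 * k, n)
    return c

def legendre_residual(n):
    """Coefficients of (1 - x^2) y'' - 2 x y' + n (n + 1) y for y = 2**n * P_n."""
    c = legendre_coeffs(n)
    res = [0] * (n + 1)
    for m, cm in enumerate(c):
        # -x^2 y'', -2 x y' and n(n+1) y all feed the x**m coefficient.
        res[m] += (n * (n + 1) - m * (m + 1)) * cm
        # y'' feeds m (m - 1) c[m] into the x**(m - 2) coefficient.
        if m >= 2:
            res[m - 2] += m * (m - 1) * cm
    return res
```

Every entry of `legendre_residual(n)` is zero for the values of n tried, and the arithmetic is exact because only integers are involved.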

A criterion still requires justification. When a solution is required to satisfy a certain form, any specific choice that satisfies the required form is not automatically a solution. A step-by-step specification with appropriate justification is required. One cannot avoid justifying his choice under the pretext of the word "try" when making a choice which leads to the correct answer.
    [Chou, p.108, (3.5)] is stated as a criterion, but no proof is supplied. [Wangs, p.192, l.-7-l.-1] proves the criterion. In [Chou, p.147, l.2-l.3], Choudhury says that we try …, then he attaches the final answer directly after the word "try" without a proper justification. [Wangs, p.193, l.-18-l.-1] or [Cor, p.230, l.10-p.231, l.5] fills that gap.

[Zem, p.113, l.-4; p.114, l.2] shows (∂P/∂V)_S ≠ (∂P/∂V)_θ. Newton mistakenly thought the quantity ΔV/ΔP in [Zem, p.119, l.6] was (∂V/∂P)_θ. It was Laplace who pointed out that the quantity ΔV/ΔP in [Zem, p.119, l.6] was (∂V/∂P)_S [Zem, p.119, l.7-p.120, l.15]. In the contemporary literature of physics, this type of mistake still occurs quite often. For example, the proof in [Man, p.20, l.-3-l.-2] is incorrect. For a correct proof, see [Hua1, p.9, (b) & (c)].
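The size of the discrepancy between the isothermal and the adiabatic derivative can be illustrated with the speed of sound in an ideal gas, which is presumably the context of the Newton and Laplace remark. The sketch below rests on assumed round values for air at 0 °C (pressure, density and heat-capacity ratio are not taken from [Zem]) and on the ideal-gas results −V(∂P/∂V) = P on an isotherm and γP on an adiabat.

```python
import math

GAMMA = 1.4        # assumed heat-capacity ratio for air
P0 = 101325.0      # assumed pressure, Pa
RHO0 = 1.293       # assumed density of air at 0 degrees C, kg/m^3

def bulk_modulus(adiabatic):
    """-V (dP/dV) for an ideal gas: P on an isotherm, GAMMA * P on an adiabat."""
    return GAMMA * P0 if adiabatic else P0

def sound_speed(adiabatic):
    """Speed of sound sqrt(B / rho) for the chosen compression process, m/s."""
    return math.sqrt(bulk_modulus(adiabatic) / RHO0)
```

With these inputs the isothermal choice gives about 280 m/s and the adiabatic choice about 331 m/s, so the two derivatives lead to clearly distinguishable predictions.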

Our final goal in solving the problem [Hec, p.147, Problem 4.71] is to make the denominator and the numerator equal. Our strategy is to keep r and t in simple and similar expressions [Matv, (18.6) & (18.10)]. If the proof is coordinated with this strategy, it will greatly reduce our calculations. The proof given in [Matv, p.141, (18.12)] is well coordinated with our strategy, while the proof using [Hec, p.147, (4.98) & (4.99)] is not (see [Hec, p.147, Problem 4.71]). The calculations given in the latter proof are more complicated. Furthermore, the validity of the former proof is easily recognized once we know [Matv, (18.6) & (18.10)]. In contrast, if we proceed via [Hec, p.147, (4.98) & (4.99)], we see the light at the end of the tunnel only in the late stage of the calculations.
