
Using Dirac Delta Formalism to Calculate Shear Force and Bending Moment

Editor’s note: This article was written by Valentin Fadeev, a physics PhD candidate in the UK. He also co-authored this article on equivalences and isomorphisms in category theory. You can find him on Twitter via the handle @ValFadeev.

The delta function was introduced by P.A.M. Dirac, one of the founders of quantum electrodynamics. The delta function is one of the abstract concepts of function theory. In a rigorous sense it is a functional that picks out the value of a given function at a given point. However, it also arises as the result of differentiating discontinuous functions. One of the most appealing properties of the delta function is that it allows us to work with discretely and continuously distributed quantities in the same way, replacing discrete sums with integrals. The delta function and its derivative match physical intuition in cases where we deal with quantities of large magnitude whose impact is limited to a small region of space or a very short amount of time (impulses). The inspiration for this particular article came during a course on mechanics in university.

In structural mechanics, it is often desired to understand the relationship between bending moment and shear force for objects, particularly beams, under applied loads. An explanation of the typical method is given here.

In the text I was using at the time, the proof of the relation between bending moment and shear force did not make explicit use of the delta function. The final result was valid, but there was a serious fault in the proof. Point forces were considered as mere constant terms in the sum and differentiated away, despite the fact that their contribution is accounted for only starting from certain points (suggesting delta-terms). I gave this problem a mathematically rigorous treatment which led to some nice closed formulae applicable to a very general distribution of load.

Relationship Between Shear Force and Bending Moment

A beam with various forces

Consider a rigid horizontal beam of length L, and introduce a horizontal axis x directed along it. Assume the following types of forces are applied:

  • point forces F^{p}_{i} at x_i, i=1,2,…,n, 0<x_i<L
  • distributed forces with linear density given by continuous functions \phi^{d}_{j}(x) on intervals (a_j,b_j),(j=1,2,…,m), 0<a_{j-1}<b_{j-1}<a_j<b_j<L
  • moments (pairs of forces) M_k with axes of rotation at x_k, (k=1,2,…,p), 0<x_k<L.

Note that we ignore the beam's own mass for the sake of clarity, although it can easily be accounted for if necessary.

Point Forces

Although the point forces F^{p}_{i} are applied at certain points by definition, in reality the force is always applied to a certain finite area. In this case we can consider the density of the force being very large within this area and dropping to 0 outside it. Hence, it is convenient to define the distribution density as follows: \phi^{p}_{i}(x)=F^{p}_{i}\delta(x-x_i). Shear force F at point x is defined as the sum of all forces applied before that point (we are moving from the left end of the beam to the right one): F=\sum_{x_i<x}F_{i}
F^{p}(x)=\sum_{i}\int_0^x \phi^{p}_{i}(z)\mathrm{d}z=\sum_{i}F^{p}_{i}e(x-x_i), where e(x)=\begin{cases}1,\qquad x>0\\0, \qquad x\leq 0\end{cases} is the step function.
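This step-function expression is easy to sanity-check numerically. Below is a minimal Python sketch; the load magnitudes and positions are invented for illustration:

```python
def e(x):
    """Unit step function: 1 for x > 0, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def shear_point_forces(x, loads):
    """Shear F(x) = sum_i F_i * e(x - x_i) for point loads given as (F_i, x_i)."""
    return sum(F * e(x - xi) for F, xi in loads)

# Hypothetical beam of length 4 with 10 N at x = 1 and 5 N at x = 3
loads = [(10.0, 1.0), (5.0, 3.0)]
print(shear_point_forces(0.5, loads))  # 0.0: no loads to the left yet
print(shear_point_forces(2.0, loads))  # 10.0: only the first load counts
print(shear_point_forces(3.5, loads))  # 15.0: both loads contribute
```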

Distributed Forces

Now we shall find the contribution from the distributed forces. \phi^{d}_j(x) may be formally defined on the whole axis, so we must cut off the unnecessary intervals outside (a_j,b_j). Consider the following expression: \phi^{d}_{j}(x,a_j,b_j)=\phi^{d}_j(x)[e(x-a_j)-e(x-b_j)]. Indeed it is easy to ascertain that the right side is equal to \phi^{d}_j(x) within (a_j,b_j) and vanishes everywhere else. Calculating shear force due to distributed forces demonstrates some useful properties of the delta function: \begin{aligned}F^d&=\sum_{j}\int_{0}^{x}\phi_{j}^{d}(z)[e(z-a_j)-e(z-b_j)]\mathrm{d}z\\&=\sum_{j}\left[\int_0^x\phi^{d}_j(z)e(z-a_j)\mathrm{d}z-\int_0^x \phi^{d}_j(z)e(z-b_j)\mathrm{d}z\right]\\&=\sum_{j}\left[\int_{a_j}^x\phi^{d}_{j}(z)e(z-a_j)\mathrm{d}z-\int_{b_j}^{x}\phi^{d}_j(z)e(z-b_j)\mathrm{d}z\right]\\&=\sum_{j}\left[\left.\left(e(z-a_j)\int_{a_j}^z\phi^{d}_j(t)\mathrm{d}t\right)\right|_{a_j}^x-\int_{a_j}^x\left(\int_{a_j}^z\phi^{d}_j(t)\mathrm{d}t\right)\delta(z-a_j)\mathrm{d}z\right.\\&\left.\quad-\left.\left(e(z-b_j)\int_{b_j}^z\phi^{d}_j(t)\mathrm{d}t\right)\right|_{b_j}^x+\int_{b_j}^x\left(\int_{b_j}^z\phi^{d}_j(t)\mathrm{d}t\right)\delta(z-b_j)\mathrm{d}z\right]\\&=\sum_{j}\left[e(x-a_j)\int_{a_j}^x\phi^{d}_j(z)\mathrm{d}z+e(x-b_j)\int_{x}^{b_j}\phi^{d}_j(z)\mathrm{d}z\right]\end{aligned}

Here we used the defining property of the delta function: f(x)\delta(x-x_0)=f(x_0)\delta(x-x_0), from which it follows, in particular, that \left(\int_{a_j}^z\phi^{d}_j(t)\mathrm{d}t\right)\delta(z-a_j)=\left(\int_{a_j}^{a_j}\phi^{d}_j(t)\mathrm{d}t\right)\delta(z-a_j)=0.
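The closed form in the last line of the derivation above can be checked numerically. Here is a sketch with a hypothetical triangular density \phi(z)=2z supported on (1,2), using a crude midpoint-rule integral (both choices are arbitrary):

```python
def e(x):
    """Unit step function: 1 for x > 0, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def integral(f, lo, hi, n=10000):
    """Midpoint-rule integral of f on [lo, hi] (handles reversed limits)."""
    h = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * h) for k in range(n)) * h

def shear_distributed(x, phi, a, b):
    """F^d(x) = e(x - a) * int_a^x phi + e(x - b) * int_x^b phi."""
    return e(x - a) * integral(phi, a, x) + e(x - b) * integral(phi, x, b)

phi = lambda z: 2.0 * z  # hypothetical triangular load on (1, 2)
print(shear_distributed(0.5, phi, 1.0, 2.0))  # 0.0: before the loaded interval
print(shear_distributed(1.5, phi, 1.0, 2.0))  # 1.25: int_1^1.5 2z dz
print(shear_distributed(3.0, phi, 1.0, 2.0))  # 3.0: the full load int_1^2 2z dz
```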

Bending Moments

Now we shall calculate bending moments created by all types of forces involved. Consider F^{p}_{i} applied at x_i. The bending moment created by this force evaluated at x>x_i can be determined as follows:

\begin{aligned}M_{i}^{p}(x)&=F_{i}^{p}e(x-x_i)(x-x_i),\\M_{i}^{p}(x+\mathrm{d}x)&=F_{i}^{p}e(x+\mathrm{d}x-x_i)(x+\mathrm{d}x-x_i)=F^{p}_{i}e(x-x_i)(x-x_i+\mathrm{d}x),\\\mathrm{d}M_{i}^{p}&=F_{i}^{p}e(x-x_i)\mathrm{d}x,\\M_{i}^{p}&=F_{i}^{p}\int_0^{x}e(z-x_i)\mathrm{d}z=F^{p}_{i}\frac{x-x_i+|x-x_i|}{2}.\end{aligned} Hence, the total moment due to point forces is M^{p}=\sum_{i}F_{i}^{p}\frac{x-x_i+|x-x_i|}{2}.
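In code, the point-force moment is just the force multiplied by the positive part of the lever arm; the values below are invented for illustration:

```python
def moment_point_force(x, F, xi):
    """M(x) = F * (x - xi + |x - xi|) / 2, i.e. F * max(x - xi, 0)."""
    return F * (x - xi + abs(x - xi)) / 2.0

# Hypothetical 10 N load applied at x = 1
print(moment_point_force(0.5, 10.0, 1.0))  # 0.0: point lies before the load
print(moment_point_force(3.0, 10.0, 1.0))  # 20.0: lever arm of length 2
```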

To calculate the moment created by the distributed forces we use the following approach adopted in mechanics. Replace the distributed force to the left of the point of summation with a point force applied at the center of gravity of the figure enclosed by the graph of \phi_j(x) and the lines x=a_j, x=b_j. If a_j<x<b_j: \begin{aligned}F^{d}_{j}(x)&=\int_{a_j}^x\phi^{d}_{j}(z)\mathrm{d}z,\\ M^{d}_{j}&=\int_{a_j}^x\phi^{d}_{j}(z)\mathrm{d}z\left(x-\frac{\int_{a_j}^x z\phi^{d}_{j}(z)\,\mathrm{d}z}{\int_{a_j}^x\phi^{d}_{j}(z)\mathrm{d}z}\right)\\&=x\int_{a_j}^x\phi^{d}_{j}(z)\mathrm{d}z-\int_{a_j}^x z\phi^{d}_{j}(z)\mathrm{d}z.\end{aligned} Differentiating both parts with respect to x we obtain \frac{\mathrm{d}M^{d}_{j}}{\mathrm{d}x}=\int_{a_j}^x\phi^{d}_{j}(z)\mathrm{d}z+x\phi^{d}_{j}(x)-x\phi^{d}_{j}(x)=\int_{a_j}^x\phi^{d}_{j}(z)\mathrm{d}z=F^{d}_{j}(x)

In fact, we could as well include point forces in the above derivation by considering the total density \phi(x)=\phi^{p}(x)+\phi^{d}(x), which is the correct way to account for point forces in this proof. We can therefore derive the general relation between shear force and bending moment: F=\frac{\mathrm{d}M}{\mathrm{d}x}. Now we need to calculate the contribution made by moments of pairs of forces. Each of these moments can be considered as a pair of equal large forces F applied at a small distance h from each other and oppositely directed. They make the following contribution to the expression for total density: \mu(x)=F\delta(x+h)-F\delta(x)=Fh\frac{\delta(x+h)-\delta(x)}{h}=M\frac{\delta(x+h)-\delta(x)}{h}. Taking the limit as h \to 0: \mu(x)=M\delta'(x)

Note that the derivative of \delta(x) is called the unit doublet.
This does not imply that the expression for shear force will contain terms of the form M\delta(x). Shear force consists of the vertical components of all forces, and for pairs of forces those components cancel each other out.
Therefore, the total bending moment of the pairs of forces is expressed as follows: M=\sum_{k}M_ke(x-x_k). Finally we can write the expressions for shear force and bending moment in the most general case: \begin{aligned}F(x)&=\sum_{i}F^{p}_{i}e(x-x_i)\\&\qquad+\sum_{j}\left[e(x-a_j)\int_{a_j}^x\phi^{d}_j(z)\mathrm{d}z+e(x-b_j)\int_{x}^{b_j}\phi^{d}_j(z)\mathrm{d}z\right],\\M(x)&=\sum_{k}M_ke(x-x_k)+\sum_{i}F^{p}_{i}\frac{x-x_i+|x-x_i|}{2}\\&\qquad+\sum_{j}\int_0^x\left[e(t-a_j)\int_{a_j}^t \phi^{d}_j(z)\mathrm{d}z+e(t-b_j)\int_{t}^{b_j}\phi^{d}_j(z)\mathrm{d}z\right]\mathrm{d}t.\end{aligned} In an important particular case where \phi^{d}_j(x)\equiv \phi^{d}_j=\mathrm{const} these expressions reduce to the following: \begin{aligned}F(x)&=\sum_{i}F^{p}_{i}e(x-x_i)+\sum_{j}\phi^{d}_j\left[(x-a_j)e(x-a_j)-(x-b_j)e(x-b_j)\right], \\ M(x)&=\sum_{k}M_ke(x-x_k)+\sum_{i}F^{p}_{i}\frac{x-x_i+|x-x_i|}{2}\\ &\qquad+\sum_{j}\frac{\phi^{d}_j}{2}\left[(x-a_j)^2e(x-a_j)-(x-b_j)^2e(x-b_j)\right].\end{aligned}
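As a sanity check of the constant-density case (the numbers below are arbitrary), this sketch evaluates both closed forms, with the second term entering with a minus sign so that for x>b_j the shear equals the total load \phi^{d}_j(b_j-a_j), and verifies F=\frac{\mathrm{d}M}{\mathrm{d}x} by a central difference:

```python
def e(x):
    """Unit step function: 1 for x > 0, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def shear_const(x, phi, a, b):
    """F(x) for a single constant density phi on (a, b), no point loads."""
    return phi * ((x - a) * e(x - a) - (x - b) * e(x - b))

def moment_const(x, phi, a, b):
    """M(x) for the same load."""
    return (phi / 2.0) * ((x - a) ** 2 * e(x - a) - (x - b) ** 2 * e(x - b))

phi, a, b = 4.0, 1.0, 2.0
print(shear_const(3.0, phi, a, b))  # 4.0: the total load phi * (b - a)

# F = dM/dx, checked with a central difference at a point inside (a, b)
x, h = 1.7, 1e-6
dMdx = (moment_const(x + h, phi, a, b) - moment_const(x - h, phi, a, b)) / (2 * h)
print(abs(dMdx - shear_const(x, phi, a, b)) < 1e-6)  # True
```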


In this example we have demonstrated the application of the Dirac delta and related functions to a problem where the input data combines continuous and discrete distributions. Singular functions allow us to work with this data in a uniform and relatively rigorous way. The advantage is that we get the final results in closed form, as opposed to an algorithmic step-by-step method of solution. We can also obtain concise proofs, such as that for the relation between shear force and bending moment, and extend the results to very general inputs.


This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Should I accept this shipment?

Suppose you work for an engineering or manufacturing firm, and you receive shipments of different parts from various suppliers. It’s not good business practice to just blindly accept a shipment, because some parts in your batch may be defective. Too many defective parts, and you’ll slow down manufacturing (in addition to wasting valuable time and money). You come up with a brilliant plan (which is used in industry already): you’ll take a sample from the shipment, test the sample, and if the number of defective parts in the sample is below some predetermined number, you’ll accept the shipment. Otherwise, you’ll send it back and probably place an angry call to your supplier.


(1) Why not just test everything?

For one, shipments may contain thousands of parts, so testing everything would take far too long. Secondly, some tests are destructive, meaning you push an item to its breaking point, which renders it unusable. (In these cases, you're testing to ensure the breaking point is high enough.) Thirdly, certain kinds of tests may be expensive or time-consuming. These are real costs (time, money, and physical scarcity) that we must consider.

(2) How big should our sample be?

There are statistical considerations and business/cost considerations. There's no perfect answer here. Sampling theory is a large branch of statistics, and there are tools for analyzing optimal sample size. However, we must also consider the specific case. If we have a destructive test, the "statistically optimal" sample size may destroy too many items. If the test is time-consuming, we may lack the time to perform it on a large number of items.

(3) What should be the “cutoff” for shipment rejection?

This discussion is the main focus of this article. We’re going to take a pretty small and simplistic overview of it, just to illustrate how powerful even basic probability can be for engineering and business problems. To do this, we’ll briefly describe the hypergeometric distribution, and then illustrate its use in an operating characteristic curve.

The Hypergeometric Distribution

Suppose we have N objects belonging to one of two classes, with N_{1} in the first class, and N_{2} in the second class, so N = N_{1} + N_{2}. (In this example, N_{1} is the total number of defective parts in the batch, and N_{2} is the number of good parts.) We select n objects without replacement. Find the probability that exactly x belong to the first class and n-x to the second.

We can select x objects from the N_{1} in class one in {N_{1} \choose x} = \frac{N_{1}!}{x!(N_{1}-x)!} ways. There are n-x left, and they must come from the N_{2} objects in class two. We can select those in {N_{2} \choose n-x} ways. Then P(X=x) = \frac{{N_{1} \choose x}{N_{2} \choose n-x}}{{N \choose n}}.
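This pmf is a one-liner with Python's math.comb (the batch numbers below are invented; note that math.comb returns 0 when x exceeds N_{1}, which handles impossible draws for free):

```python
from math import comb

def hypergeom_pmf(x, N1, N2, n):
    """P(X = x) when drawing n without replacement from N1 class-one
    and N2 class-two objects."""
    return comb(N1, x) * comb(N2, n - x) / comb(N1 + N2, n)

# Hypothetical lot: 4 defective, 21 good, sample of 5
probs = [hypergeom_pmf(x, 4, 21, 5) for x in range(6)]
print(probs[5])                       # 0.0: cannot draw 5 defectives from 4
print(abs(sum(probs) - 1.0) < 1e-12)  # True: the probabilities sum to 1
```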

Evaluating a shipment acceptance plan

We’ll create a small example with manageable numbers to illustrate the use of the hypergeometric distribution in acceptance sampling.

A supplier ships parts to another company in lots of 25. Some items in the shipment may be defective. The receiving company has an acceptance sampling plan to inspect n=5 parts from the lot without replacement. If none of the sample is defective, they accept the shipment, otherwise they reject it. Suppose we want to evaluate this plan.

If X is the random number of defectives in the sample, then X has a hypergeometric distribution. Except this time, N_{1} and N_{2} (the numbers of defective and nondefective parts, respectively) are unknown. We only know N_{1} + N_{2} = 25.

In designing an acceptance plan, we want the probability of accepting the lot to be large if N_{1} is small. That is, we want to have a high probability of accepting the lot if the true number of defective parts is very small. We also want the probability of “falsely accepting” the lot to be low. (That is, we want the probability of acceptance to be low when N_{1} is high).

When we treat these probabilities as a function of N_{1} (or equivalently, the fraction defective given by p = \frac{N_{1}}{25} in the lot) we call this the operating characteristic curve. Mathematically, the operating characteristic curve, denoted OC(p) is given in this case by:

OC(p) = P(X=0) = \frac{{N_{1} \choose 0}{25-N_{1} \choose 5}}{{25 \choose 5}}

Note here that OC(p) = P(X=0) because that’s the plan in this fictional example. If the plan changed, the operating characteristic curve would be defined by something different. For example, if the plan was to accept shipments that contain 1 or fewer defects, then OC(p) = P(X\leq 1) and we would recalculate those probabilities using the hypergeometric probability mass function given above.

Let’s look at some numerical values of OC(p) for our fictional example, computed from the formula above to three decimal places. Remember that p = \frac{N_{1}}{25}.

  • N_{1}=0: p=0.00, OC(p)=1.000
  • N_{1}=1: p=0.04, OC(p)=0.800
  • N_{1}=2: p=0.08, OC(p)=0.633
  • N_{1}=3: p=0.12, OC(p)=0.496
  • N_{1}=4: p=0.16, OC(p)=0.383
  • N_{1}=5: p=0.20, OC(p)=0.292
Is the acceptance plan satisfactory? With N_{1} = 1, OC(0.04) = 0.8, which may be considered too low (we may reject perfectly valid shipments), and with N_{1} = 4, OC(0.16) = 0.383 may be too high (we may not want that high a probability of accepting a shipment with that many defects). Thus, we may want to reconsider the number of items we test, or reconsider our acceptance plan.
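The OC values quoted above are easy to reproduce with a short script (standard library only; N=25 and n=5 are the numbers of this fictional plan):

```python
from math import comb

def oc(N1, N=25, n=5):
    """OC(p) = P(X = 0): probability of accepting a lot with N1 defectives."""
    return comb(N - N1, n) / comb(N, n)

for N1 in range(6):
    print(N1, N1 / 25, round(oc(N1), 3))
# N1 = 1 gives OC = 0.8 and N1 = 4 gives OC = 0.383, as quoted above
```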

Usually lot sizes are far larger than the numbers seen here, and sample sizes are in the hundreds, so as the values get large, this computation becomes cumbersome. We can approximate the hypergeometric distribution with the Poisson distribution, which we won’t cover here.


This is a small illustration of the very practical use of the hypergeometric distribution to devise an intelligent strategy for accepting/rejecting shipments of goods. This type of work falls within the purview of the services we offer.

Energy Levels of Molecules are Bounded Below

Editor’s Note: This guest submission is from Nikita Lisitsa, a professional software developer and mathematician. You can follow him on Twitter.

Are Molecules Perpetual Motion Machines?

Short answer: no, of course not. Perpetual motion machines do not exist. There are deep theoretical reasons for that, as well as less deep but nevertheless convincing experimental data. That’s why we care so much about renewable energy and stuff: we know that energy supply is limited in any real physical system.

From the perspective of mechanics (classical, relativistic, or quantum) when you take some energy from a system, it jumps to a state with lower energy. If there were a system with no lower bound on energy levels, we could withdraw energy forever, forcing the system to fall lower and lower on the energy ladder (this is roughly what forced Dirac to introduce the concept of the Dirac sea). Thus, a sufficiently good mechanical system should have a lower bound on possible energy states.

We now focus on atoms and molecules. What would bound their energy levels from below? To begin, we need a good framework to describe these energy levels. Unfortunately, classical mechanics fails here: the virial theorem implies that for a potential of the form V(r) \sim \frac{1}{r} the time averages of the kinetic and potential energy, denoted \langle T\rangle and \langle V\rangle, respectively, are related by 2\langle T \rangle = -\langle V \rangle, and the total energy E (which is conserved) is given by E = \langle E \rangle = \langle T \rangle + \langle V \rangle = \frac{1}{2} \langle V \rangle.

For a single electron (which has charge -e) moving in a Coulomb field of a nucleus with charge Ze, the potential energy is V(r) = -\frac{kZe^2}{r}, where k is the Coulomb constant. Thus, if the distance r to the nucleus tends to zero, the potential energy tends to -\infty, and so does the total energy E. This, together with the fact that an electron orbiting a nucleus would lose energy in the form of electromagnetic waves, was one of the major problems of classical mechanics that needed a “quantum treatment”.

Annoyingly Fast and Heavy Introduction to Quantum Mechanics

Let’s dive into the (non-relativistic) quantum world. Unfortunately, we are immediately forced to accept an approximation; as of this writing, I’m unaware of a solution that doesn’t use one. The Born-Oppenheimer approximation suggests that since the protons and neutrons are about 2000 times heavier than electrons, they move much slower and can be considered static. Thus, we must now prove that a system of N electrons moving in a field of M fixed nuclei has a lower bound on possible values of energy.

Under this framework, a molecule is a carefully chosen separable Hilbert space of possible states together with a “nice” essentially self-adjoint operator acting on it. The Hilbert space we will use is denoted (L^2(\mathbb{R}^3))^{\otimes N} \cong L^2(\mathbb{R}^{3N}), where \cong denotes an isomorphism. The notation indicates that we can represent an element of our Hilbert space in two equivalent ways; we’ll use the second, which is a square-integrable function of 3N real variables: 3 coordinates for each of the N electrons. The inner product of two functions \psi and \phi is defined as an integral over the whole 3N-dimensional space:

\langle\psi,\phi\rangle = \int\limits_{\mathbb{R}^{3N}} \overline\psi(r_1\dots r_N) \phi(r_1 \dots r_N) dr_1 \dots dr_N

Remark: The actual space should be tensored by \mathbb C^{2^N} to account for spin, but the effect of spin on energy is too small for us to care about in this case. Nevertheless, it exists. Furthermore, the elements of the space are actually not functions; they are equivalence classes of functions modulo functions of zero norm. Following the usual colloquial terminology, we’ll call them functions nevertheless; we’ll refer to them as wavefunctions.

A physical state is not the same as a wavefunction: you can multiply the wavefunction by an arbitrary non-zero complex constant, and the result represents the same state. States form some peculiar kind of space; we shall stick to wavefunctions, which are elements of a vector space so we can use the full power of linearity. Thus, it is common to assume wavefunctions to be normalized so that \langle \psi,\psi\rangle=1. We shall assume this too.

Next, we need to define the Hamiltonian. Here, the Hamiltonian is similar to that in non-quantum mechanics, with the exception that we must quantize it: turn numbers into operators. The general solution to this problem is unknown. (See Groenewold’s theorem on the inconsistency of canonical quantization and John Baez’s article series on some modern approaches). In our case, however, the recipe is simple: turn the kinetic energy \frac{p_i^2}{2m} into -\frac{\hbar^2}{2m} \Delta_i and turn the potential energy V(r_1,r_2,\dots,r_N) into an operator that multiplies a wavefunction by V.

The energy consists of 

  • the kinetic energy of electrons,
  • the attraction of these electrons to the nuclei, and
  • the repulsion energy between electrons

The usual quantum-mechanical way to write this is:
H = -\sum\limits_i \frac{\hbar^2}{2m} \Delta_i - \sum\limits_{i,A} \frac{kZ_Ae^2}{|r_i - R_A|} + \sum\limits_{i<j} \frac{ke^2}{|r_i - r_j|} We now detail each item in the above equation:

  • i,j are indices that run over the electrons in the system
  • A runs over nuclei
  • \Delta_{i} is the Laplacian with respect to i-th electron coordinates, i.e. \frac{\partial^2}{\partial x_i^2}+\frac{\partial^2}{\partial y_i^2}+\frac{\partial^2}{\partial z_i^2}
  • m is the electron mass
  • k is the Coulomb constant
  • Z_A is the number of protons in the A-th nucleus
  • e is the electron charge
  • r_i is the position of i-th electron
  • R_A is the position of A-th nucleus (which are fixed by Born-Oppenheimer)

Nobody likes constants, right? Let’s stick to atomic units instead, where m=k=e=\hbar=1, and the speed of light is approximately 137 (exactly \frac{1}{\alpha}, where \alpha is the fine-structure constant). The Hamiltonian becomes a bit cleaner: H = -\sum\limits_i \frac{1}{2} \Delta_i-\sum\limits_{i,A} \frac{Z_A}{|r_i - R_A|} + \sum\limits_{i<j} \frac{1}{|r_i - r_j|}

What does it mean? An operator should act on wavefunctions. For a function \psi, the operator takes some second derivatives of \psi, multiplies \psi by other functions, and adds all this up. The result is another function, the result of the operator acting on \psi, called H\psi. One important thing is that this operator is essentially self-adjoint, though we shall treat it as if it were self-adjoint: this follows from the Laplacian \Delta_i being self-adjoint, and multiplication by a real-valued function being a self-adjoint operator as well.

The energy (well, the average or expectation value energy), of a state \psi is \langle \psi, H\psi\rangle (the inner product of \psi and H\psi), which happens to be equal to \langle H\psi , \psi\rangle, thanks to self-adjointness. Physicists love to use bra-ket notation for expressions like this, and write it as \langle \psi | H | \psi\rangle.

Concerning Energies

We have a space of functions, a differential operator acting on these functions, and an integral that we wish to bound from below whenever the functions are normalized.

Our way to prove that the expression \langle \psi, H\psi\rangle is indeed bounded below will be broken up into three parts.

Ignoring Repulsion

A good thing about the average energy is that it is linear in the energy operator. This means that, for a sum of two operators, we have \langle \psi, (H_1+H_2)\psi \rangle = \langle \psi, H_1\psi \rangle + \langle \psi, H_2\psi \rangle, which means that we can analyze the terms in the Hamiltonian separately.

Now, the last term is electron-electron repulsion. Let’s write it down again:
\sum\limits_{i<j} \frac{1}{|r_i - r_j|} The only thing that matters for our purposes is that this operator is multiplication by a nonnegative function, which we’ll call f(r_1,\dots,r_N)=\sum\limits_{i<j} \frac{1}{|r_i - r_j|} for now. Non-negativity implies that the expectation value of this term is never below zero:
\langle \psi, f \cdot \psi \rangle = \int\limits_{\mathbb R^{3N}} |\psi|^2 f \, dr_1 \dots dr_N \geq 0

Thus, since the expectation value is linear, we can simply drop the electron-electron repulsion term: this action can only lower the energy; since we want to find a lower bound on energy levels, this is okay.

Separating Electrons

Now the simplified Hamiltonian looks like this:
H' = -\sum\limits_i \frac{1}{2} \Delta_i - \sum\limits_{i,A} \frac{Z_A}{|r_i - R_A|} The electrons do not interact anymore, which makes the total Hamiltonian a sum of single-particle Hamiltonians of individual electrons: H' = \sum\limits_i \left[ -\frac{1}{2} \Delta_i - \sum\limits_{A} \frac{Z_A}{|r_i - R_A|}  \right] = \sum\limits_i H_i,

where each of H_i depends only on the i-th electron coordinates. In the tensor-product-version of the description of our Hilbert space of states, this means that H_i acts on the i-th component of the tensor product (L^2(\mathbb R^3))^{\otimes N}. Whatever Hilbert space description you use, this implies that the whole system Hamiltonian H' is separable, so analyzing the expectation values of H_i is, in essence, a single-particle problem: the wavefunction \psi(r_1, \dots, r_N) does still depend on all electrons’ coordinates, but they are ignored by H_i. So, if we find a lower bound for each of H_i, we are done.

Reducing to a Hydrogen-Like Atom

Everything before this section was actually pretty straightforward; it is now that we’ll do a simple trick. Look at H_i again:
H_i = -\frac{1}{2} \Delta_i - \sum\limits_{A} \frac{Z_A}{|r_i - R_A|} It would be awesome if we could put the Laplacian under the sum, for the resulting summands would be just the Hamiltonians of a hydrogen-like atom: a quantum system with a charged particle orbiting another charged particle. This is one of the most important quantum systems and one of the few that have a known, exact analytical solution.

So, let’s actually put the Laplacian under the sum!

H_i = \sum\limits_{A} \left[ -\frac{1}{2M} \Delta_i - \frac{Z_A}{|r_i - R_A|} \right] = \sum\limits_A H_{i,A},

where M is the number of nuclei. Now, each term of the sum is the Hamiltonian of an atom-like system, with a nucleus of charge Z_A and an electron orbiting it, except that the electron has mass M instead of 1. Thankfully, this is not a problem: plug m_e=M into the exact solutions of the hydrogen-like atom, and we get solutions for each H_{i,A} term of the sum above. We don’t need the actual solutions, though; all we need is the fact that the expectation value of energy is bounded below in this solution.
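The boundedness claim for a single hydrogen-like term can be illustrated numerically. The sketch below is a variational check, not part of the original argument: for the l=0 radial hydrogen problem in atomic units (the H_{i,A} above with Z_A=1 and unit mass), it evaluates the Rayleigh quotient of H = -\frac{1}{2}\frac{d^2}{dr^2}-\frac{1}{r} on the arbitrary trial family u(r) = r e^{-\alpha r}; the values never drop below the exact ground-state energy -\frac{1}{2}:

```python
import math

def energy(alpha, r_max=40.0, n=20000):
    """Rayleigh quotient <u, H u> / <u, u> for u(r) = r * exp(-alpha * r),
    H = -1/2 d^2/dr^2 - 1/r (l = 0 radial hydrogen, atomic units),
    via a midpoint-rule integral on (0, r_max)."""
    h = r_max / n
    num = den = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        u = r * math.exp(-alpha * r)
        upp = (alpha * alpha * r - 2 * alpha) * math.exp(-alpha * r)  # u''(r)
        num += (-0.5 * u * upp - u * u / r) * h
        den += u * u * h
    return num / den

# Scan alpha: analytically E(alpha) = alpha^2 / 2 - alpha >= -1/2,
# with the minimum at alpha = 1 (the exact hydrogen ground state)
energies = {a / 10: energy(a / 10) for a in range(5, 21)}
print(min(energies.values()))  # close to -0.5
```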

Summing It Up

So, what we’ve done is show H \geq H' = \sum\limits_i H_i = \sum\limits_{i,A} H_{i,A}, and the expectation value of each H_{i,A} is bounded below; therefore the expectation value of H itself is bounded below.
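For concreteness, the known hydrogen-like spectrum makes this bound explicit (a standard closed form, not spelled out in the argument above): in atomic units, a hydrogen-like atom with nuclear charge Z_A and electron mass M has energy levels E_n = -\frac{M Z_A^2}{2n^2} \geq -\frac{M Z_A^2}{2}. Hence each \langle \psi, H_{i,A}\psi \rangle \geq -\frac{M Z_A^2}{2}, and summing over the N electrons and the nuclei gives \langle \psi, H\psi \rangle \geq -\frac{NM}{2}\sum\limits_A Z_A^2.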

This whole idea was inspired by Tosio Kato’s paper “Fundamental properties of Hamiltonian operators of Schrödinger type”. You’ll find many more details on the topic there.

All this is just a tiny bit of the whole story of why the world around us doesn’t explode/freeze/break apart. For a full exposition, check out “The stability of matter: from atoms to stars” by Elliott H. Lieb. I should thank Valentin Fadeev for introducing me to this awesome work.