Theorem of the Euler-Lagrange equation for the case of one dependent variable

Let $\ J:C^{2}\left[x_{0},x_{1}\right]\rightarrow\mathbb{R}$ be a functional of the form

$\ J\left(y\right)=\int_{x_{0}}^{x_{1}}f\left(x,y,y^{\prime}\right)dx$,

where $\ f$ has continuous partial derivatives of second order and $\ x_{0}<x_{1}$.

Let $\ S=\left\{y\in C^{2}\left[x_{0},x_{1}\right]:y\left(x_{0}\right)=y_{0}\;\textrm{and}\;y\left(x_{1}\right)=y_{1}\right\}$,

where $\ y_{0}$ and $\ y_{1}$ are given real numbers. If $\ y\in S$ is a local extremum for $\ J$, then

$\ \frac{d}{dx}\left(\frac{\partial f}{\partial y^{\prime}}\right)-\frac{\partial f}{\partial y}=0$

for all $\ x\in\left[x_{0},x_{1}\right]$.

The Proof for the case of one dependent variable

Here we consider a particular class of problem called the fixed endpoint variational problem and work in $\ C^{2}\left[x_{0},x_{1}\right]$.

Let $\ J:C^{2}\left[x_{0},x_{1}\right]\rightarrow\mathbb{R}$ be a functional of the form

$\ J\left(y\right)=\int_{x_{0}}^{x_{1}}f\left(x,y,y^{\prime}\right)dx$,

where $\ f$ is a function assumed to have at least second-order continuous partial derivatives.

Given two values $\ y_{0},y_{1}\in\mathbb{R}$, the fixed endpoint variational problem consists of determining the functions $\ y\in C^{2}\left[x_{0},x_{1}\right]$ such that $\ y\left(x_{0}\right)=y_{0},y\left(x_{1}\right)=y_{1}$, and $\ J$ has a local extremum in $\ S$ at $y\in S$, where

$\ S=\left\{ y\in C^{2}\left[x_{0},x_{1}\right]:y\left(x_{0}\right)=y_{0}\;\textrm{and}\; y\left(x_{1}\right)=y_{1}\right\}$,

and

$\ H=\left\{ \eta\in C^{2}\left[x_{0},x_{1}\right]:\eta\left(x_{0}\right)=\eta\left(x_{1}\right)=0\right\}$.

We want to derive necessary conditions for $\ J$ to have a local extremum in $\ S$ at $\ y$.

Suppose that $\ J$ has a local extremum in $\ S$ at $\ y$.

Then there is an $\ \epsilon>0$ such that $\ J\left(\hat{y}\right)-J\left(y\right)$ does not change sign for all $\ \hat{y}\in S$ such that $\ \left\Vert \hat{y}-y\right\Vert <\epsilon$.

For any $\ \hat{y}\in S$ there is an $\ \eta\in H$ such that $\ \hat{y}=y+\epsilon\eta$, and for $\ \epsilon$ small Taylor's theorem implies that

$\ f\left(x,\hat{y},\hat{y}^{\prime}\right)=f\left(x,y+\epsilon\eta,y^{\prime}+\epsilon\eta^{\prime}\right)=f\left(x,y,y^{\prime}\right)+\epsilon\left(\eta\frac{\partial f}{\partial y}+\eta^{\prime}\frac{\partial f}{\partial y^{\prime}}\right)+O\left(\epsilon^{2}\right)$.

Note that the partial derivatives in the above expression are all evaluated at the point $\ \left(x,y\left(x\right),y^{\prime}\left(x\right)\right)$.

Now,

$\ J\left(\hat{y}\right)-J\left(y\right)=\int_{x_{0}}^{x_{1}}f\left(x,\hat{y},\hat{y}^{\prime}\right)dx-\int_{x_{0}}^{x_{1}}f\left(x,y,y^{\prime}\right)dx$

$\ =\int_{x_{0}}^{x_{1}}\left(\left(f\left(x,y,y^{\prime}\right)+\epsilon\left(\eta\frac{\partial f}{\partial y}+\eta^{\prime}\frac{\partial f}{\partial y^{\prime}}\right)+O\left(\epsilon^{2}\right)\right)-f\left(x,y,y^{\prime}\right)\right)dx$

$\ =\epsilon\int_{x_{0}}^{x_{1}}\left(\eta\frac{\partial f}{\partial y}+\eta^{\prime}\frac{\partial f}{\partial y^{\prime}}\right)dx+O\left(\epsilon^{2}\right)$

$\ =\epsilon\delta J\left(\eta,y\right)+O\left(\epsilon^{2}\right)$.

The quantity

$\ \delta J\left(\eta,y\right)=\int_{x_{0}}^{x_{1}}\left(\eta\frac{\partial f}{\partial y}+\eta^{\prime}\frac{\partial f}{\partial y^{\prime}}\right)dx$

is called the first variation of $\ J$.

It is clear that if $\ \eta\in H$ then $\ -\eta\in H$, and $\ \delta J\left(-\eta,y\right)=-\delta J\left(\eta,y\right)$. For $\ \epsilon$ small, the sign of $\ J\left(\hat{y}\right)-J\left(y\right)$ is determined by the sign of the first variation, unless $\ \delta J\left(\eta,y\right)=0$ for all $\ \eta\in H$.

Since $\ J$ has a local extremum in $\ S$ at $\ y$, $\ J\left(\hat{y}\right)-J\left(y\right)$ does not change sign for all $\ \hat{y}\in S$ such that $\ \left\Vert \hat{y}-y\right\Vert <\epsilon$. Hence, $\ \delta J\left(\eta,y\right)=\int_{x_{0}}^{x_{1}}\left(\eta\frac{\partial f}{\partial y}+\eta^{\prime}\frac{\partial f}{\partial y^{\prime}}\right)dx=0$ for all $\ \eta\in H$.
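As a concrete illustration of the vanishing first variation (a sketch assuming SymPy is available; the functional $J(y)=\int_{0}^{1}(y')^{2}dx$ with $y(0)=0$, $y(1)=1$ and the perturbation $\eta=\sin(\pi x)$ are illustrative choices, not from the text above), the difference $J(\hat{y})-J(y)$ around the extremal $y=x$ has no first-order term in $\epsilon$:

```python
import sympy as sp

x, eps = sp.symbols('x epsilon')

# Sample functional J(y) = int_0^1 (y')^2 dx with y(0) = 0, y(1) = 1;
# the extremal is the straight line y = x (assumed example)
y = x
eta = sp.sin(sp.pi * x)      # vanishes at both endpoints, so eta is in H
yhat = y + eps * eta

J = lambda w: sp.integrate(sp.diff(w, x)**2, (x, 0, 1))

# J(yhat) - J(y) contains only second-order terms in epsilon:
# the first variation at the extremal is zero
difference = sp.expand(J(yhat) - J(y))
print(difference)
```

The output is purely quadratic in $\epsilon$, in agreement with $J\left(\hat{y}\right)-J\left(y\right)=\epsilon\delta J\left(\eta,y\right)+O\left(\epsilon^{2}\right)$ with $\delta J=0$.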

Using integration by parts,

$\ \int_{x_{0}}^{x_{1}}\eta^{\prime}\frac{\partial f}{\partial y^{\prime}}dx=\left[\eta\frac{\partial f}{\partial y^{\prime}}\right]_{x_{0}}^{x_{1}}-\int_{x_{0}}^{x_{1}}\eta\frac{d}{dx}\left(\frac{\partial f}{\partial y^{\prime}}\right)dx$

$\ =-\int_{x_{0}}^{x_{1}}\eta\frac{d}{dx}\left(\frac{\partial f}{\partial y^{\prime}}\right)dx$,

since $\ \eta\left(x_{0}\right)=\eta\left(x_{1}\right)=0$.

Hence, $\ \delta J\left(\eta,y\right)=\int_{x_{0}}^{x_{1}}\eta\left(\frac{\partial f}{\partial y}-\frac{d}{dx}\left(\frac{\partial f}{\partial y^{\prime}}\right)\right)dx=0$ for all $\ \eta\in H$.

Note that $\ \frac{\partial f}{\partial y}-\frac{d}{dx}\left(\frac{\partial f}{\partial y^{\prime}}\right)=\frac{\partial f}{\partial y}-\frac{\partial^{2}f}{\partial x\partial y^{\prime}}-\frac{\partial^{2}f}{\partial y\partial y^{\prime}}y^{\prime}-\frac{\partial^{2}f}{\partial y^{\prime}\partial y^{\prime}}y^{\prime\prime}$,

and given that $\ f$ has at least two continuous derivatives, we see that for any $\ y\in C^{2}\left[x_{0},x_{1}\right]$ the function $\ E:\left[x_{0},x_{1}\right]\rightarrow\mathbb{R}$ defined by

$\ E\left(x\right)=\frac{\partial f}{\partial y}-\frac{d}{dx}\left(\frac{\partial f}{\partial y^{\prime}}\right)$

is continuous on the interval $\ \left[x_{0},x_{1}\right]$. Note again that for a given function $\ y$ the partial derivatives defining $\ E$ are evaluated at the point $\ \left(x,y\left(x\right),y^{\prime}\left(x\right)\right)$.

Hence, $\ \int_{x_{0}}^{x_{1}}\eta\left(x\right)E\left(x\right)dx=0$ for all $\ \eta\in H$.

Since $\ E$ is continuous and $\ \eta\in H$ is arbitrary, the Fundamental Lemma of the Calculus of Variations gives $\ E=0$ on $\ \left[x_{0},x_{1}\right]$.

Hence, $\ \frac{d}{dx}\left(\frac{\partial f}{\partial y^{\prime}}\right)-\frac{\partial f}{\partial y}=0\qquad\left(1\right)$

and equation $\left(1\right)$ is called the Euler-Lagrange equation.
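The equation can be checked symbolically (a sketch assuming SymPy is available; the arc-length integrand $f=\sqrt{1+(y')^{2}}$ is an assumed example): straight lines satisfy the Euler-Lagrange equation for that $f$.

```python
import sympy as sp

x, a, b = sp.symbols('x a b')
y = sp.Function('y')
yp = sp.Derivative(y(x), x)

# Integrand of the arc-length functional (assumed example)
f = sp.sqrt(1 + yp**2)

# Euler-Lagrange expression: d/dx(df/dy') - df/dy
EL = sp.diff(sp.diff(f, yp), x) - sp.diff(f, y(x))

# A straight line y = a*x + b should make the expression vanish
residual = sp.simplify(EL.subs(y(x), a*x + b).doit())
print(residual)  # 0
```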

Theorem of the Euler-Lagrange equation for the case of several dependent variables

Let $\ J:\mathbf{C}^{2}\left[t_{0},t_{1}\right]\rightarrow\mathbb{R}$ be a functional of the form

$\ J\left(\mathbf{q}\right)=\int_{t_{0}}^{t_{1}}L\left(t,\mathbf{q},\dot{\mathbf{q}}\right)dt$,

where $\ \mathbf{q}=\left(q_{1},q_{2},\ldots,q_{n}\right)$, and $\ L$ has continuous second order partial derivatives.

Let $\ S=\left\{\mathbf{q}\in\mathbf{C}^{2}\left[t_{0},t_{1}\right]:\mathbf{q}\left(t_{0}\right)=\mathbf{q}_{0}\;\textrm{and}\;\mathbf{q}\left(t_{1}\right)=\mathbf{q}_{1}\right\}$,

where $\ \mathbf{q}_{0},\mathbf{q}_{1}\in\mathbb{R}^{n}$ are given vectors.

If $\ \mathbf{q}$ is a local extremum for $\ J$ in $\ S$ then

$\ \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_{k}}-\frac{\partial L}{\partial q_{k}}=0$

for $\ k=1,2,\ldots,n$.

The proof for the case of several dependent variables

Let $\ \mathbf{C}^{2}\left[t_{0},t_{1}\right]$ denote the set of functions $\ \mathbf{q}:\left[t_{0},t_{1}\right]\rightarrow\mathbb{R}^{n}$ such that, writing $\ \mathbf{q}=\left(q_{1},q_{2},\ldots,q_{n}\right)$, each component satisfies $\ q_{k}\in C^{2}\left[t_{0},t_{1}\right]$ for $\ k=1,2,\ldots,n$.

Consider a functional of the form

$\ J\left(\mathbf{q}\right)=\int_{t_{0}}^{t_{1}}L\left(t,\mathbf{q},\dot{\mathbf{q}}\right)dt$,

where $\ \dot{}$ denotes differentiation with respect to $\ t$, and $\ L$ is a function having continuous partial derivatives of second order.

Given two vectors $\ \mathbf{q}_{0},\mathbf{q}_{1}\in\mathbb{R}^{n}$, the fixed endpoint problem consists of determining the local extrema for $\ J$ subject to the conditions $\ \mathbf{q}\left(t_{0}\right)=\mathbf{q}_{0}$ and $\ \mathbf{q}\left(t_{1}\right)=\mathbf{q}_{1}$.

Again, we have

$\ S=\left\{\mathbf{q}\in\mathbf{C}^{2}\left[t_{0},t_{1}\right]:\mathbf{q}\left(t_{0}\right)=\mathbf{q}_{0}\;\textrm{and}\;\mathbf{q}\left(t_{1}\right)=\mathbf{q}_{1}\right\}$ and

$\ H=\left\{\boldsymbol{\eta}\in\mathbf{C}^{2}\left[t_{0},t_{1}\right]:\boldsymbol{\eta}\left(t_{0}\right)=\boldsymbol{\eta}\left(t_{1}\right)=0\right\}$.

For $\ \epsilon$ small Taylor's theorem implies that

$\ L\left(t,\hat{\mathbf{q}},\dot{\hat{\mathbf{q}}}\right)=L\left(t,\mathbf{q}+\epsilon\boldsymbol{\eta},\dot{\mathbf{q}}+\epsilon\dot{\boldsymbol{\eta}}\right)$

$\ =L\left(t,\mathbf{q},\dot{\mathbf{q}}\right)+\epsilon\sum_{k=1}^{n}\left(\eta_{k}\frac{\partial L}{\partial q_{k}}+\dot{\eta}_{k}\frac{\partial L}{\partial\dot{q}_{k}}\right)+O\left(\epsilon^{2}\right)$,

where $\ \hat{\mathbf{q}}=\mathbf{q}+\epsilon\boldsymbol{\eta}$.

Hence,

$\ J\left(\hat{\mathbf{q}}\right)-J\left(\mathbf{q}\right)=\int_{t_{0}}^{t_{1}}L\left(t,\hat{\mathbf{q}},\dot{\hat{\mathbf{q}}}\right)dt-\int_{t_{0}}^{t_{1}}L\left(t,\mathbf{q},\dot{\mathbf{q}}\right)dt$

$\ =\epsilon\int_{t_{0}}^{t_{1}}\sum_{k=1}^{n}\left(\eta_{k}\frac{\partial L}{\partial q_{k}}+\dot{\eta}_{k}\frac{\partial L}{\partial\dot{q}_{k}}\right)dt+O\left(\epsilon^{2}\right)$.

Therefore the first variation for this functional is

$\ \delta J\left(\boldsymbol{\eta},\mathbf{q}\right)=\int_{t_{0}}^{t_{1}}\sum_{k=1}^{n}\left(\eta_{k}\frac{\partial L}{\partial q_{k}}+\dot{\eta}_{k}\frac{\partial L}{\partial\dot{q}_{k}}\right)dt$.

If $\ J$ has a local extremum at $\ \mathbf{q}$ then similar arguments imply that $\ \delta J\left(\boldsymbol{\eta},\mathbf{q}\right)=0$ for all $\ \boldsymbol{\eta}\in H$.

Define $\ H_{1}=\left\{\left(\eta_{1},0,\ldots,0\right)\in H\right\}$. For any $\ \boldsymbol{\eta}\in H_{1}$ the above condition reduces to

$\ \int_{t_{0}}^{t_{1}}\left(\eta_{1}\frac{\partial L}{\partial q_{1}}+\dot{\eta}_{1}\frac{\partial L}{\partial\dot{q}_{1}}\right)dt=0$.

From the one variable case, we know this condition leads to the Euler-Lagrange equation

$\ \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_{1}}-\frac{\partial L}{\partial q_{1}}=0$.

Applying the same argument to each component, with $\ H_{k}$ defined analogously, we have

$\ \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_{k}}-\frac{\partial L}{\partial q_{k}}=0$ for $\ k=1,2,\ldots,n$.
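The component-wise equations can be illustrated symbolically (a sketch assuming SymPy is available; the Lagrangian of two uncoupled harmonic oscillators is an assumed example, not taken from the text):

```python
import sympy as sp

t = sp.symbols('t')
m, k = sp.symbols('m k', positive=True)
q1, q2 = sp.Function('q1'), sp.Function('q2')

qd1 = sp.Derivative(q1(t), t)
qd2 = sp.Derivative(q2(t), t)

# Lagrangian of two uncoupled harmonic oscillators (assumed example)
L = sp.Rational(1, 2)*m*(qd1**2 + qd2**2) - sp.Rational(1, 2)*k*(q1(t)**2 + q2(t)**2)

# One Euler-Lagrange equation per dependent variable q_k
EL1 = sp.diff(sp.diff(L, qd1), t) - sp.diff(L, q1(t))
EL2 = sp.diff(sp.diff(L, qd2), t) - sp.diff(L, q2(t))

# q1(t) = cos(sqrt(k/m) t) solves the first equation
residual = sp.simplify(EL1.subs(q1(t), sp.cos(sp.sqrt(k/m)*t)).doit())
print(residual)  # 0
```

Each `EL` expression is $m\ddot{q}_{k}+kq_{k}$, i.e. one Euler-Lagrange equation per component, as the theorem states.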

Special cases

Case 1

Suppose $\ f$ does not depend on $\ y$.

Then, the Euler-Lagrange equation simplifies to

${d\over dx}\left({\partial f\over\partial {y'}}\right) = 0$

and hence

${\partial f\over\partial {y'}} = C$

where $\ C$ is some constant.

Solving for $\ y'$, we get

$\ y' = g(x, C)$

for some function $\ g$. This can be solved by a quadrature.
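The quadrature step can be carried out symbolically (a sketch assuming SymPy is available; the integrand $f=x(y')^{2}$, which has no explicit $y$-dependence, is an assumed example):

```python
import sympy as sp

x, C = sp.symbols('x C')
y = sp.Function('y')
yp = sp.Derivative(y(x), x)

# Sample integrand with no explicit dependence on y (assumed example)
f = x*yp**2

# First integral: df/dy' = C is a first-order ODE for y,
# solvable by quadrature
first_integral = sp.Eq(sp.diff(f, yp), C)   # 2*x*y' = C
sol = sp.dsolve(first_integral, y(x))
print(sol)
```

Here `dsolve` returns $y(x)=C_{1}+\frac{C}{2}\ln x$, i.e. the antiderivative of $g(x,C)=C/(2x)$.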

Case 2

We will use a more compact notation to improve readability.

Suppose the integrand does not depend on $\ x$, in which case $\ f = f(y, y')$.

First, we notice the following

${d\over dx}(f-y'f_{y'})=(f_yy'+f_{y'}y'')-(y''f_{y'}+y'(f_{y'y}y'+f_{y'y'}y''))$

again by the chain rule. The second and third terms cancel, and we obtain

${d\over dx}(f-y'f_{y'})=f_yy'-f_{y'y}{y'}^2-f_{y'y'}y'y''=(f_y-f_{y'y}y'-f_{y'y'}y'')y'$ (*)

Now, from the Euler-Lagrange equation, we get

$0=f_y-{d\over dx}f_{y'} = f_y-f_{y'y}y'-f_{y'y'}y''$

by the chain rule.

But the right side of this equation is exactly the factor in parentheses in (*). Hence (*) is also zero. That is,

${d\over dx}(f-y'f_{y'})=0$

or

$\ f-y'f_{y'} = C$

where $\ C$ is a constant.
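The identity (*) behind this first integral can be verified symbolically (a sketch assuming SymPy is available; the $x$-independent integrand $f=(y')^{2}+y^{2}$ is an assumed example):

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
yx = y(x)
yp = sp.Derivative(yx, x)

# Sample integrand with no explicit x-dependence (assumed example)
f = yp**2 + yx**2

# Left side: d/dx (f - y' f_{y'}); right side: y' (f_y - d/dx f_{y'})
lhs = sp.diff(f - yp*sp.diff(f, yp), x)
rhs = yp*(sp.diff(f, yx) - sp.diff(sp.diff(f, yp), x))
print(sp.simplify(lhs - rhs))  # 0
```

So along any solution of the Euler-Lagrange equation the right side vanishes, and $f-y'f_{y'}$ is indeed constant.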

Case 3

Suppose $\ f$ does not depend on $\ y'$.

Then, the Euler-Lagrange equation takes the form

$\ {\partial f\over\partial y}(x,y)=0$,

which is an algebraic (not a differential) equation for $\ y$ in terms of $\ x$.

Examples

Straight Lines

Result: The shortest curve between two points $\ P$ and $\ Q$ in $\ \mathbb R^n$ is a straight line.

Proof: Consider a twice continuously differentiable function $\ \vec x:[t_1, t_2] \to \mathbb R^n$ that parametrizes a curve with the required endpoints.

We will let $\ x_i$ denote the $\ i$th component of the function for $\ i = 1, 2, \ldots, n$.

We have the arc length function

$\ S = \int_{t_1}^{t_2} {\left(\sum_{i = 1}^n \dot{x_i}^2\right)}^{1/2} dt$

where $\ f$ is the integrand.

Now, for every $\ i = 1, 2, ..., n$

${\partial f \over \partial x_i} = 0$

and

${\partial f \over \partial \dot{x_i}} = \dot{x_i} {\left(\sum_{j = 1}^n \dot{x_j}^2\right)}^{-1/2}$

Substituting into the Euler-Lagrange equation, we get $\ {d\over dt}\left(\dot{x_i} {\left(\sum_{j = 1}^n \dot{x_j}^2\right)}^{-1/2}\right) = 0$ for $\ i = 1, 2, \ldots, n$. Hence each component of the unit tangent vector is constant, and a curve whose tangent direction never changes is a straight line.

We have only shown that the straight line with the required endpoints is an extremum. Nonetheless, in three (or fewer) dimensions, it is intuitively clear that such a line has the minimum length.
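The constancy of the unit tangent can be checked symbolically even for a non-uniform parametrization (a sketch assuming SymPy is available; the particular line and parametrization are assumed examples):

```python
import sympy as sp

t = sp.symbols('t', positive=True)

# A straight line through the origin traversed at non-uniform speed
# (assumed example): x1 = t^2, x2 = 2 t^2
x1, x2 = t**2, 2*t**2
xd1, xd2 = sp.diff(x1, t), sp.diff(x2, t)
speed = sp.sqrt(xd1**2 + xd2**2)

# df/dx_i-dot = x_i-dot / |x-dot| must be constant along the line,
# so its t-derivative vanishes
print(sp.simplify(sp.diff(xd1/speed, t)))  # 0
print(sp.simplify(sp.diff(xd2/speed, t)))  # 0
```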

See Geodesic.

See Catenary.

Cycloid

For a discussion on the brachistochrone property of the cycloid, see Cycloid.

Hypocycloid

Result: Among frictionless tunnels connecting two points $\ A$ and $\ B$ on the surface of a (spherical, homogeneous) planet, the one shaped like a hypocycloid is the one through which a ball dropped from rest at one point reaches the other point the fastest.

Proof: We will study the system partly in polar co-ordinates $\ (r, \theta)$ with the planet centered at the origin and partly in Cartesian co-ordinates $\ (x,y)$, whereby we assume that the problem is 2-dimensional.

Our problem is to minimize the total time

$\ t_{AB} = \int_A^B \frac1vds$

where

$\ v$ denotes the speed of the ball at time $\ t$
$\ s$ denotes the total distance traveled by time $\ t$

We now find $\ v$ by first finding the potential energy $\ U(r)$, where we set $\ U(0) = 0$.

Consider a spherical surface of radius $\ r$ concentric to the planet such that $\ r<R$, where $\ R$ is the radius of the planet. Then, applying Gauss' law on this surface, we obtain

$4\pi r^2 F = - 4\pi G {\left(\frac43 \pi r^3 \rho\right)}m$

where

$\ F$ is the gravitational force (positive outward) at radius $\ r$
$\ \rho$ is the planet's density
$\ m$ is the mass of the ball
$\ G$ is the gravitational constant

The left side is the surface flux of the gravitational force, and the right side is $-4\pi Gm$ times the total mass enclosed by the surface.

Substituting

$\ \rho = {M\over\frac43\pi R^3}$

we obtain

$\ F = -\frac{GMm}{R^3}r$

Hence,

$U(r) = -\int_0^r F(u)du = -\int_0^r -\frac{GMm}{R^3}udu = \frac12\frac{GMm}{R^3}r^2$
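This integration can be checked symbolically (a sketch assuming SymPy is available):

```python
import sympy as sp

r, u = sp.symbols('r u', positive=True)
G, M, m, R = sp.symbols('G M m R', positive=True)

# Inside the planet, F(u) = -G M m u / R^3; then U(r) = -int_0^r F(u) du
F = -G*M*m*u/R**3
U = -sp.integrate(F, (u, 0, r))
print(sp.simplify(U - G*M*m*r**2/(2*R**3)))  # 0
```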

To find $\ v$, we apply the conservation of mechanical energy, so that

$\frac12 mv^2 = U(R) - U(r) = \frac12\frac{GMm}{R^3}R^2-\frac12\frac{GMm}{R^3}r^2$

where $\ m$ is the mass of the ball; recall that the ball starts from rest at the surface, so $\ v = 0$ when $\ r = R$.

Letting $\ g = \frac{GM}{R^2}$, we get

$\ v = \sqrt{\frac{g(R^2 - r^2)}R} = \sqrt{\frac{g(R^2 - (x^2 + y^2))}R}$

Furthermore,

$\ ds = \sqrt{{\dot x}^2 + {\dot y}^2}\,dt$

Hence, we conclude that

$\ t_{AB} = \int_A^B \frac1vds = \int_0^{t_{AB}} \sqrt{\frac{\left({\dot x}^2+{\dot y}^2\right)R}{g(R^2-x^2-y^2)}}\, dt$

where

$\cdot$ denotes the time derivative

and we let $\ f$ denote the integrand, which we minimize using the Euler-Lagrange equation.

Since $\ f$ is independent of $\ t$, our equations reduce to (Case 2 above, applied to each dependent variable)

$\ f - \dot x{\partial f\over\partial \dot x} = C_1$
$\ f - \dot y{\partial f\over\partial \dot y} = C_2$

where $\ C_1$ and $\ C_2$ are constants.

Evaluating the partial derivatives and simplifying, we obtain

$\ C_1 = \frac{{\dot y}^2 z}{{\dot x}^2 + {\dot y}^2}$
$\ C_2 = \frac{{\dot x}^2 z}{{\dot x}^2 + {\dot y}^2}$

where

$\ z = \sqrt{\frac{\left({\dot x}^2+{\dot y}^2\right)R}{g(R^2-x^2-y^2)}}$

is the integrand itself. Adding the two equations gives $\ z = C_1 + C_2$. Squaring both sides, we get

$\ R({\dot x}^2+{\dot y}^2)=gC^2(R^2-x^2-y^2)$

where $\ C = C_1 + C_2$.

The following hypocycloid with outer radius $\ R$ and inner radius $\ b$ satisfies the required equation:

$\vec r = (R-b)\left(\cos t, \sin t\right) + b\left(\cos {\frac{R - b}b t}, -\sin{\frac{R - b}b t}\right)$

This completes the proof.
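The final step can be checked symbolically (a sketch assuming SymPy is available): for the hypocycloid above, the ratio $({\dot x}^2+{\dot y}^2)/(R^2-x^2-y^2)$ should reduce to the constant $(R-b)/b$, so the required equation holds with $\ gC^2 = R(R-b)/b$.

```python
import sympy as sp

t = sp.symbols('t')
R, b = sp.symbols('R b', positive=True)

# Hypocycloid with outer radius R and inner radius b
k = (R - b)/b
x = (R - b)*sp.cos(t) + b*sp.cos(k*t)
y = (R - b)*sp.sin(t) - b*sp.sin(k*t)

xd, yd = sp.diff(x, t), sp.diff(y, t)

# This ratio should reduce to the constant (R - b)/b
ratio = (xd**2 + yd**2) / (R**2 - x**2 - y**2)
print(sp.simplify(ratio))
```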