


Advance of the perihelion of Mercury
in General Relativity

Christian Magnan

Together with the deflection of light rays near the Sun, the slow change in the orientation of Mercury's orbit, century after century, is one of the first dramatic confirmations of the general theory of relativity conceived by Einstein. It is, however, quite difficult to give the reader the details of the calculation, as there is no simple way to do it: to get the result one has to dive into the full theory of general relativity. This is a good opportunity to discover that theory while illustrating it with an example!

The derivation of the equations given here closely follows the presentation of Edwin F. Taylor and John Archibald Wheeler in their delightful book "Exploring Black Holes: Introduction to General Relativity" (Addison Wesley Longman, 2000). The calculation of the integral giving the precession of the orbit is adapted from the one developed by Steven Weinberg ("Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity", John Wiley & Sons, 1972).



Principle of the calculation

Whereas the qualitative aspect and the historical impact of the precession of the orbit of Mercury, namely the advance of the perihelion (the orbital point closest to the Sun) of the planet with each revolution, are widely described, the calculation of the amount of the shift is rarely given. This page is dedicated to this numerical exercise.

General relativity teaches us that a free particle (that is, a particle not subject to the acceleration of an engine) propagating through space follows what is called a "geodesic" of spacetime. We therefore have to examine in turn the metric of curved spacetime around a massive object, the equations of a geodesic, and the orbit that results from them.

Metric for curved spacetime around a massive object

Newtonian mechanics describes the motion of a particle in an absolute space with respect to an absolute time. The position of the moving object is given by its coordinates with respect to some frame of reference, as a function of time $t$. General relativity asserts that there is no absolute time and that time cannot be dissociated from space. The theory bases its reasoning on events, each event being characterized by a point M (where it happens!) and a time $t$ (when it happens!). The events attached to a moving particle constitute what is called a worldline.

Let us consider for example a spaceship moving freely through space, which means that all its engines are turned off. Let us imagine that regular flashes are emitted in step with a clock located inside the spaceship. The time interval between two successive flashes will be denoted by $\tau$ (this quantity is thus measured with respect to the proper time of the spaceship). Think now of another frame of reference made up of a set of space beacons, also free from acceleration and at constant distances from one another (each free beacon stays at the same distance from its neighbours). Every beacon displays its position in space (for instance its distance from some origin) and carries its own clock. The clocks of this second frame are synchronized with one another. In that frame the interval between two flashes (i.e. two events) is characterized by two numbers: the space interval $s$ and the time interval $t$. To determine those two quantities it suffices to record which beacon faces Flash #1 and which beacon faces Flash #2, while noting the times of those events.

Special relativity is based on the following principle. The proper time interval $\tau$ between Event #1 and Event #2 is given by the formula

\begin{displaymath}
\tau^2 = t^2 - s^2
\end{displaymath} (1)

and this quantity does not depend on the frame in which it is evaluated. In other words all observers agree on the value of $\tau$ computed by Formula (1), although the values of $s$ and $t$ differ from one system of reference to another.
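The invariance stated above can be illustrated numerically. The sketch below is only an illustration: it assumes the standard Lorentz transformation along the line of motion (which this page does not write out), takes $c=1$ as in Formula (1), and uses made-up values for $t$, $s$ and the relative speed `beta`.

```python
import math

def boost(t, s, beta):
    """Standard Lorentz boost along the line of motion, with c = 1.

    This transformation is an assumption of the sketch; it is not derived here.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (t - beta * s), gamma * (s - beta * t)

def proper_time_squared(t, s):
    """Formula (1): tau^2 = t^2 - s^2, with distances measured in units of time."""
    return t * t - s * s

# Interval between two flashes as recorded by the beacons (illustrative numbers).
t, s = 5.0, 3.0
# The same interval seen from a frame moving at 0.6 times the speed of light.
t_moving, s_moving = boost(t, s, beta=0.6)

print(proper_time_squared(t, s))
print(proper_time_squared(t_moving, s_moving))
```

With these particular numbers the second frame moves along with the spaceship, so the space interval between the flashes is (numerically) zero there and the time interval is the proper time itself.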

Be careful: unless otherwise indicated, distances will be measured in units of time, as is often done in astronomy. We have chosen to do so in writing Equation (1). If instead a distance $s$ is expressed in conventional units, for instance in centimeters, it is converted to our distance $s$ expressed in seconds via the formula s (in seconds) = s (in centimeters)/c, where c is the speed of light in conventional units, namely $3\times10^{10}$ cm/s. (Expressing distances and times in the same unit amounts to taking the speed of light equal to unity.)

In general relativity the invariance of the proper time interval under a change of coordinates remains valid, but only locally, i.e. provided one stays in a sufficiently small region of spacetime (its size depends on the accuracy of the measurements). The main novelty concerns the expression of the proper time given by Formula (1). The coefficients entering this formula now depend on the point of spacetime under consideration, and the resulting expression is called the metric. In fact the whole structure of spacetime, and especially its curvature, is contained in the local expression of $\tau$ and in the form of its coefficients.

We are interested here in the structure of spacetime around the sun. In order to describe the physics locally we consider two nearby events separated by infinitesimal increments $dt$, $dx$, $dy$ and $dz$ of the time and space coordinates. If spacetime were flat, the metric would have the form

\begin{displaymath}(d\tau)^2 = (dt)^2 - (dx)^2 - (dy)^2 - (dz)^2\end{displaymath}

which, by a slightly sloppy convention, is usually written as
\begin{displaymath}d\tau^2 = dt^2 - dx^2 - dy^2 - dz^2\end{displaymath} (2)

By working in spherical coordinates, in a plane containing the center of the sun (this choice removes one spatial coordinate), that formula becomes

\begin{displaymath}d\tau^2 = dt^2 - dr^2 - r^2 d\phi^2\end{displaymath}

where $r$ denotes the distance to the center and $\phi$ an azimuthal angle in the plane of the orbit (see the figure below).

[Figure: polar coordinates]

But spacetime around a center of attraction of mass $M$ (for instance a black hole or the vicinity of the sun) is not flat. It is characterized by the Schwarzschild metric

\begin{displaymath}
d\tau^2 = (1 - 2M/r) dt^2 - (1 - 2M/r)^{-1} dr^2 - r^2 d\phi^2
\end{displaymath} (3)

The story is really fantastic: the whole structure of spacetime is embodied in this "simple" formula (3). Even the famous black hole lurks behind those apparently innocuous symbols.

One question: in which units is the mass $M$ expressed in that formula? The formula shows that $M$ has the dimension of a length, a quantity that we measure here in seconds. Therefore $M$ will also be measured in seconds. The formula converting grams into seconds is

\begin{displaymath}M\ \mbox{(in seconds)} = (G/c^3)\, M\ \mbox{(in grams)}\end{displaymath}

where $G/c^3 = 2.5\times10^{-39}\ \mbox{s/g}$.
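The two unit conversions used so far (lengths and masses into seconds) can be sketched as follows. The CGS values of `G` and `C` are inserted here as assumptions, and the function names are only illustrative; the solar mass and the semimajor axis of Mercury's orbit are the rounded values used later on this page.

```python
# CGS constants, rounded (assumed values, not taken from this page).
G = 6.674e-8   # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10   # speed of light, cm/s

def length_cm_to_seconds(length_cm):
    """s (in seconds) = s (in centimeters) / c."""
    return length_cm / C

def mass_g_to_seconds(mass_g):
    """M (in seconds) = (G / c^3) M (in grams)."""
    return (G / C**3) * mass_g

conversion_factor = G / C**3                      # about 2.5e-39 s/g
sun_mass_seconds = mass_g_to_seconds(2e33)        # a few microseconds
mercury_axis_seconds = length_cm_to_seconds(5.8e12)

print(conversion_factor, sun_mass_seconds, mercury_axis_seconds)
```

Seen this way, the sun "weighs" about five microseconds, while Mercury orbits at a distance of roughly two hundred seconds: the ratio of the two numbers already shows how weak the relativistic effects will be.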

The equations of a geodesic

The metric, that is (exactly) the formula expressing at a given point of spacetime the proper time interval between two nearby events, reveals the presence of curvature as soon as the expression deviates from Formula (2), which corresponds to flat spacetime. That metric will allow us to find the properties of the motion of a test particle free from acceleration. Indeed both special and general relativity teach us that between two given events $E_1$ and $E_2$ a freely moving body follows the path for which the proper time interval $\tau$ is maximum. Equivalently one can say that a freely moving particle follows a geodesic of spacetime, since a geodesic is precisely defined by this property of maximizing the proper time interval.


Definition of a geodesic: the geodesic between two events $E_1$ and $E_2$ is the worldline for which the interval of proper time between $E_1$ and $E_2$ is maximum.

That property of maximizing the proper time will allow us to derive the equations of a geodesic. It will also yield the expressions of the energy and angular momentum of a particle in orbit around the center of attraction.

Energy of the particle

Let us apply the principle of maximization of the proper time interval in the following manner. Suppose that a free spaceship (whose rockets are turned off) falls radially, therefore along a straight path, towards the central attracting mass. Imagine that three successive flashes, with nearby time and space coordinates, are emitted inside the spaceship. We observe those three events in some external frame. In that frame the event $E_1$ is the emission of a flash at time $t = 0$, when the spaceship is located at radius $r_1$. The flash $E_2$ is emitted at time $t$, when the spaceship is at radius $r_2$. The flash $E_3$ is emitted at time $T$, when the spaceship is at radius $r_3$. The quantity $T$ is assumed to be small. We now vary the coordinates of the intermediate event $E_2$. The principle of maximal aging says that the geodesic starting from $E_1$ and ending at $E_3$ passes through the event $E_2$ for which the proper time interval
\begin{displaymath}
\tau = \tau_A + \tau_B,
\end{displaymath} (4)

is maximum. Here $\tau_A$ measures the interval over the first spacetime segment $A$, which connects $E_1$ to $E_2$ and $\tau_B$ measures the time interval over the second segment $B$, which connects $E_2$ to $E_3$.

In order to avoid varying all quantities at the same time, we assume in this experiment that the radii $r_1$, $r_2$ and $r_3$ are fixed and that only the time $t$, at which the second flash is emitted, is allowed to change. We denote by $r_A$ a mean radius over the first segment $A$ and by $r_B$ a mean radius over the second segment $B$. According to Formula (3) the interval of proper time over the first segment $A$ is given, through its square, by

\begin{displaymath}
{\tau_A}^2 = (1 - 2M/r_A) t^2 + \mbox{(terms without $t$)}
\end{displaymath} (5)

from which we deduce
\begin{displaymath}
\tau_A d\tau_A = (1 - 2M/r_A)tdt
\end{displaymath} (6)

The lapse of time over Segment $B$ between the events $E_2$ and $E_3$ is $(T-t)$, and therefore the proper time duration $\tau_B$ is given by

\begin{displaymath}
{\tau_B}^2 = (1 - 2M/r_B) (T - t)^2 + \mbox{(terms without $t$)}
\end{displaymath} (7)

from which we deduce
\begin{displaymath}
\tau_B d\tau_B = - (1 - 2M/r_B)(T-t) dt.
\end{displaymath} (8)

To make the total time interval $\tau = \tau_A + \tau_B$ maximum with respect to a variation $dt$ of the time $t$, we write

\begin{displaymath}
\frac{d\tau}{dt} = \frac{d\tau_A}{dt} + \frac{d\tau_B}{dt} = 0
\end{displaymath} (9)

Deducing $d\tau_A$ and $d\tau_B$ from Equations (6) and (8) and letting quite naturally $t=t_A$ and $T-t = t_B$, we easily get
\begin{displaymath}
(1 -2M/r_A) (t_A/\tau_A) = (1 - 2M/r_B) (t_B/\tau_B)\ .
\end{displaymath} (10)

The left side of that equation depends only on parameters characterizing the first segment A (which connects $E_1$ to $E_2$). The right side depends only on parameters related to the second segment B (which connects $E_2$ to $E_3$).

We have discovered in Equation (10) a quantity that is the same for both segments. This quantity is thus a constant of the motion for the free particle under consideration. For good physical reasons (especially to recover the formulae of special relativity), one is led to identify that constant of motion with the ratio of the energy of the particle to its mass. We write this very important result in the form

\begin{displaymath}
E/m = (1-2M/r) (dt/d\tau)
\end{displaymath} (11)

an expression in which we have returned to the differential notation for the intervals $t$ and $\tau$.
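As a quick check of that identification (a sketch, in which the symbol $\beta = ds/dt$ for the speed of the particle is introduced only for this remark): far from any mass we may set $M=0$, the metric reduces to $d\tau^2 = dt^2 - ds^2 = (1-\beta^2)\,dt^2$, and Formula (11) becomes

\begin{displaymath}
\frac{E}{m} = \frac{dt}{d\tau} = \frac{1}{\sqrt{1-\beta^2}}
\end{displaymath}

which is the familiar special-relativistic expression of the energy per unit mass when the speed of light is taken equal to unity.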

Incidentally we may notice that with the units we have chosen, energy $E$ and mass $m$ are expressed in the same unit (for instance the centimeter), so that the ratio $E/m$ is a pure number.

Angular momentum of the particle

We have applied the principle of maximizing the proper time interval by varying the time of the intermediate event $E_2$. We now perform the same operation, but this time we vary the angle $\phi$ of that intermediate event. We recall that $\phi$ measures the direction of the moving particle with respect to some direction chosen as the origin. We call it the azimuth.

We consider again three events consisting of the emission of flashes inside a spaceship floating freely in space. The first segment $A$ connects Event $E_1$ to Event $E_2$. The second segment $B$ connects $E_2$ to $E_3$. The azimuthal angle of the first event is fixed at $\phi=0$. The angle of the last one is fixed at $\phi = \Phi$. The intermediate azimuth is taken as the variable $\phi$. Again, in order not to vary everything at the same time, we assume that the radius $r$ at which the second flash is emitted stays constant.

We follow the same chain of reasoning as in the previous section. From the metric (3), the time interval $\tau_A$ over the first segment is given by its square

\begin{displaymath}
{\tau_A}^2 = - {r_A}^2 \phi^2 + \mbox{(terms without $\phi$)}
\end{displaymath} (12)

and the interval $\tau_B$ over the second by
\begin{displaymath}
{\tau_B}^2 = - {r_B}^2 (\Phi - \phi)^2 + \mbox{(terms without $\phi$)}
\end{displaymath} (13)

from which we get
\begin{displaymath}
\tau_A d\tau_A = -{r_A}^2\phi\, d\phi
\end{displaymath} (14)
\begin{displaymath}
\tau_B d\tau_B = {r_B}^2(\Phi - \phi)\, d\phi
\end{displaymath} (15)

By writing $d\tau/d\phi = d(\tau_A + \tau_B)/d\phi = 0$ one easily obtains, similarly to Formula (10)
\begin{displaymath}
{r_A}^2 \phi_A/\tau_A = {r_B}^2 \phi_B/\tau_B
\end{displaymath} (16)

after having written quite naturally $\phi=\phi_A$ and $\Phi - \phi = \phi_B$. The left side, which contains only terms that are specific to the first segment, is equal to the right side, which contains only terms relative to the second segment. We thus exhibit another constant of motion, namely $r^2d\phi/d\tau$ (by shifting back to the differential notation), a quantity that turns out to be identified with the ratio of the angular momentum $L$ of the particle to its mass $m$, which we write as
\begin{displaymath}
L/m = r^2 (d\phi/d\tau)
\end{displaymath} (17)

Computing the orbit

Technically speaking, to determine the trajectory of a moving body free from acceleration we apply the following strategy. Knowing the energy $E$ and the angular momentum $L$ of the particle of mass $m$ ($E$ and $L$ depend on the initial conditions), we can follow the position of that particle by computing the increments of its spacetime coordinates $t$, $r$ and $\phi$ as the proper time $\tau$ advances. Algebraically, for each increment $d\tau$ of the proper time we compute (or the computer calculates) the corresponding increments $dt$, $dr$ and $d\phi$ of the coordinates of the moving body. The squares of the increments $dt$ and $d\phi$ are extracted from Equations (11) and (17) in the following form:

\begin{displaymath}
dt^2 = (E/m)^2 (1-2M/r)^{-2} d\tau^2
\end{displaymath} (18)
\begin{displaymath}
d\phi^2 = (L/m)^2 r^{-4} d\tau^2
\end{displaymath} (19)

We notice that the expression of $dr$ is missing. We get it by substituting the values of $dt$ and $d\phi$ into the metric equation (3) and solving for $dr$. This yields

\begin{displaymath}
dr^2 = \left\{ (E/m)^2 - (1 - 2M/r)[1 + (L/m)^2r^{-2}]\right\} d\tau^2
\end{displaymath} (20)
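A simple way to convince oneself that Equations (18), (19) and (20) fit together is to insert them back into the metric (3) and check that the proper time increment is recovered. The sketch below does this numerically; the values of $M$, $r$, $E/m$ and $L/m$ are arbitrary illustrative choices (in units where $M=1$), and the function names are not part of the original derivation.

```python
import math

def squared_increments(E_over_m, L_over_m, M, r):
    """Squares of dt, dr and dphi per unit (d tau)^2, from Equations (18)-(20)."""
    f = 1.0 - 2.0 * M / r
    dt2 = E_over_m**2 / f**2                                # Equation (18)
    dphi2 = L_over_m**2 / r**4                              # Equation (19)
    dr2 = E_over_m**2 - f * (1.0 + L_over_m**2 / r**2)      # Equation (20)
    return dt2, dr2, dphi2

def metric_interval_squared(dt2, dr2, dphi2, M, r):
    """Right side of the Schwarzschild metric (3)."""
    f = 1.0 - 2.0 * M / r
    return f * dt2 - dr2 / f - r * r * dphi2

# Illustrative values only: a particle at r = 10 M with E/m = 0.97 and L/m = 4 M.
M, r = 1.0, 10.0
dt2, dr2, dphi2 = squared_increments(0.97, 4.0, M, r)
dtau2 = metric_interval_squared(dt2, dr2, dphi2, M, r)

print(dr2)     # must be positive for the motion to be possible at this radius
print(dtau2)   # should be very close to 1
```

Equation (20) only gives the square of $dr$; the sign of $dr$ (inward or outward motion) has to be tracked separately when stepping along the orbit.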

Dividing Equation (20) by Equation (19), side by side, we directly arrive at the equation of the orbit in polar coordinates:

\begin{displaymath}
\left(\frac{1}{r^2} \frac{dr}{d\phi}\right)^2 = \left(\frac{E}{L}\right)^2 - \left(1 - \frac{2M}{r}\right) \left[\left(\frac{m}{L}\right)^2 + \frac{1}{r^2}\right]
\end{displaymath} (21)

The trajectory of the planet

By making the change of variable

\begin{displaymath}u = 1/r, \ \ du=-dr/r^2\end{displaymath}

Equation (21) becomes
\begin{displaymath}
\left(\frac{du}{d\phi}\right)^2 = \frac{E^2}{L^2} - (1 -2Mu)\left(\frac{m^2}{L^2} + u^2\right) \,.
\end{displaymath} (22)

Consider now a test particle following its closed orbit around the sun. Its distance $r$ to the central mass $M$ necessarily passes through a minimum $r_-$ and a maximum $r_+$, which correspond respectively to the perihelion and the aphelion. In conformity with the change of variable $u=(1/r)$ we let

\begin{displaymath}u_-=1/r_-= v\ ,u_+=1/r_+=w \ ,\ \mbox{with}\ \ r_-<r<r_+\ , w < u < v\ .\end{displaymath}

At both points the derivative $dr/d\phi$, and hence $du/d\phi$, vanishes, which reads
\begin{displaymath}
\frac{E^2}{L^2} - (1 - 2Mv)\left(\frac{m^2}{L^2} + v^2\right) = 0
\end{displaymath} (23)
\begin{displaymath}
\frac{E^2}{L^2} - (1 - 2Mw)\left(\frac{m^2}{L^2} + w^2\right) = 0 \ .
\end{displaymath} (24)

It is easy to extract $E^2/L^2$ and $m^2/L^2$ from those equations as

\begin{displaymath}
\frac{m^2}{L^2} = [v+w - 2M(v^2 +vw + w^2)]/2M
\end{displaymath} (25)
\begin{displaymath}
\frac{E^2}{L^2} = (v+w)(1-2Mv)(1-2Mw)/2M
\end{displaymath} (26)
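These two expressions can be checked numerically: inserted into the right side of Equation (22), they must make $du/d\phi$ vanish at $u=v$ and $u=w$ and leave it real in between. The sketch below uses an illustrative orbit with $M=1$, $r_-=800$ and $r_+=1200$ (values chosen for the example, not Mercury's).

```python
def orbit_constants(M, v, w):
    """m^2/L^2 and E^2/L^2 from Equations (25) and (26)."""
    m2_over_L2 = (v + w - 2.0 * M * (v * v + v * w + w * w)) / (2.0 * M)
    E2_over_L2 = (v + w) * (1.0 - 2.0 * M * v) * (1.0 - 2.0 * M * w) / (2.0 * M)
    return m2_over_L2, E2_over_L2

def du_dphi_squared(u, M, m2_over_L2, E2_over_L2):
    """Right side of Equation (22)."""
    return E2_over_L2 - (1.0 - 2.0 * M * u) * (m2_over_L2 + u * u)

# Illustrative orbit: perihelion at 800 M, aphelion at 1200 M.
M = 1.0
v, w = 1.0 / 800.0, 1.0 / 1200.0
m2_over_L2, E2_over_L2 = orbit_constants(M, v, w)

at_perihelion = du_dphi_squared(v, M, m2_over_L2, E2_over_L2)
at_aphelion = du_dphi_squared(w, M, m2_over_L2, E2_over_L2)
in_between = du_dphi_squared(0.5 * (v + w), M, m2_over_L2, E2_over_L2)

print(at_perihelion, at_aphelion, in_between)
```

The first two numbers are zero up to rounding errors, and the third is positive, as it must be for the planet to move between the two turning points.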

Expression (22) thus takes the form
\begin{displaymath}
\left(\frac{du}{d\phi}\right)^2 = \frac{(v+w)(1-2Mv)(1-2Mw)}{2M} - \frac{(1-2Mu)[v+w -2M(v^2 + vw + w^2) + 2Mu^2]}{2M}
\end{displaymath} (27)

which does vanish for $u=v$ and $u=w$.

Multiplying both sides by the factor $(1 - 2Mu)^{-1}$ and expanding that factor up to second order in $M$

\begin{displaymath}(1 - 2Mu)^{-1} \simeq 1 + 2Mu + 4M^2u^2\end{displaymath}

we get
\begin{displaymath}
(1 - 2Mu)^{-1} \left(\frac{du}{d\phi}\right)^2 = \frac{(v+w)(1-2Mv)(1-2Mw)(1+2Mu+4M^2u^2)}{2M} - \frac{v+w -2M(v^2 + vw + w^2) + 2Mu^2}{2M}
\end{displaymath} (28)

The trick to simplify the apparently quite complicated right side consists in noticing that we are dealing with a quadratic function of $u$ which vanishes at $u=v$ and $u=w$ (if we neglect the terms of order $M^3$). Therefore it has the form

\begin{displaymath}C(v-u)(u-w)\end{displaymath}

The value of the constant $C$ is immediately obtained by setting $u=0$:

\begin{displaymath}C = 1 - 2M (v+w)\end{displaymath}

which allows us to write Equation (28) for the trajectory in the following quite compact form
\begin{displaymath}
\left(\frac{du}{d\phi}\right)^2 = (1-2Mu)[1 -2M(v+w)](v-u)(u-w)
\end{displaymath} (29)

By taking the square root of both sides of that equation and by neglecting terms of order $M^2$ or higher the trajectory of the particle around the sun can be computed by integrating the expression

\begin{displaymath}
\frac{d\phi}{du}= \pm\frac{1 + M(u+v+w)}{\sqrt{(v-u)(u-w)}} \ .
\end{displaymath} (30)

The integration is trivial if one makes the change of variable $u \rightarrow\psi$ defined by

\begin{displaymath}
u = \frac{1}{2} (v+w) + \frac{1}{2} (v-w)\cos\psi
\end{displaymath} (31)

which easily leads to

\begin{displaymath}\sqrt{(v-u)(u-w)} = \frac{1}{2} (v-w) \sin\psi = - du/d\psi\end{displaymath}

or
\begin{displaymath}
\frac{du}{\sqrt{(v-u)(u-w)}} = -d\psi
\end{displaymath} (32)

We may count the angles starting at the perihelion. At that point $\phi=\psi=0$ and $u=v$. From that point $r$ increases and thus $u=1/r$ decreases while the angle $\psi$ increases. Taking into account Formulae (30), (31) and (32) we arrive at the very simple integral
\begin{displaymath}
\phi(u) = \int_0^{\psi(u)} \left[1 + \frac{3M}{2}(v+w) + \frac{M}{2}(v-w)\cos\psi\right]\,d\psi
\end{displaymath} (33)

The first term inside the brackets, which is equal to unity, leads to the classical Newtonian ellipse. Indeed, if $\phi=\psi$, then Expression (31) yields the equation of the trajectory in polar coordinates $(r,\, \phi)$ in the form

\begin{displaymath}
u \equiv \frac{1}{r} = \frac{1}{2} (v+w) + \frac{1}{2} (v-w)\cos\phi
\end{displaymath} (34)

which does correspond to an ellipse. Ordinarily the equation is written as
\begin{displaymath}
r = \frac{p}{1 + e \cos\phi}
\end{displaymath} (35)

where $p$ is the ellipse parameter (sometimes called the semi-latus rectum) and $e$ its eccentricity.

Identifying Formulae (34) and (35) we see that

\begin{displaymath}
\frac{1}{p} = \frac{1}{2} (v+w) = \frac{1}{2}\left(\frac{1}{r_-} + \frac{1}{r_+}\right)
\end{displaymath} (36)

and
\begin{displaymath}
e = \frac{v - w}{v + w} = \frac{r_+ - r_-}{r_+ + r_-}
\end{displaymath} (37)

With the usual notation the major axis of the ellipse is
\begin{displaymath}
2a = r_- + r_+
\end{displaymath} (38)

and the following relation holds

\begin{displaymath}p=a(1- e^2) \ .\end{displaymath}
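The relations (36), (37) and (38) are easy to try out on numbers. The sketch below computes $p$, $e$ and $a$ from the two extreme distances and checks the relation $p=a(1-e^2)$; the distances 800 and 1200 are arbitrary illustrative values and the function name is hypothetical.

```python
import math

def ellipse_from_apsides(r_minus, r_plus):
    """Parameter p, eccentricity e and semimajor axis a from Equations (36)-(38)."""
    v, w = 1.0 / r_minus, 1.0 / r_plus
    p = 2.0 / (v + w)              # Equation (36)
    e = (v - w) / (v + w)          # Equation (37)
    a = 0.5 * (r_minus + r_plus)   # Equation (38)
    return p, e, a

# Illustrative ellipse with perihelion distance 800 and aphelion distance 1200.
p, e, a = ellipse_from_apsides(800.0, 1200.0)
print(p, e, a)
print(a * (1.0 - e * e))   # should reproduce p
```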

Advance of the perihelion

The second term in the integral (33), namely $(3M/2)(v+w)$, reveals that the angle $\phi$ increases by more than $2\pi$ by the time the planet returns to the perihelion from which it started. This excess is precisely the precession of the orbit. Doubling the increase in the angle $\phi$ between the perihelion $r_-$ (corresponding to $\psi=0$) and the aphelion $r_+$ (corresponding to $\psi=\pi$), and subtracting $2\pi$, we obtain a shift per revolution equal to
\begin{displaymath}
\Delta\phi = \frac{6M\pi}{p}
\end{displaymath} (39)

where $p$ is defined in (36).
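For the reader who wishes to see the intermediate step, the integral (33) can be carried out explicitly:

\begin{displaymath}
\phi = \left[1 + \frac{3M}{2}(v+w)\right]\psi + \frac{M}{2}(v-w)\sin\psi
\end{displaymath}

At the aphelion, $\psi=\pi$ and the sine vanishes, so that $\phi = \pi + (3\pi M/2)(v+w)$. Over a full revolution the angle therefore increases by $2\pi + 3\pi M(v+w)$, and since $v+w = 2/p$ according to (36) the excess over $2\pi$ is

\begin{displaymath}
\Delta\phi = 3\pi M (v+w) = \frac{6\pi M}{p}
\end{displaymath}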

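The third term in the integral (33), in $\cos\psi$, integrates to a term in $\sin\psi$. It adds only a periodic perturbation, which vanishes at each perihelion and aphelion and therefore does not accumulate from one revolution to the next.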
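Formula (39) can also be tested against a direct numerical integration of the orbit. The sketch below is only an illustration under stated assumptions: it uses the second-order equation $d^2u/d\phi^2 = M\,m^2/L^2 - u + 3Mu^2$, obtained by differentiating Equation (22) with respect to $\phi$ (a step not written out on this page), integrates it with a classical fourth-order Runge-Kutta scheme, and takes a deliberately "strong" illustrative orbit ($M=1$, $r_-=800$, $r_+=1200$) so that the shift is large enough to be seen. The function names and the step size are choices made for this example.

```python
import math

def perihelion_shift_formula(M, r_minus, r_plus):
    """Equation (39), with p taken from Equation (36)."""
    p = 2.0 / (1.0 / r_minus + 1.0 / r_plus)
    return 6.0 * math.pi * M / p

def perihelion_shift_numeric(M, r_minus, r_plus, h=1e-3):
    """Integrate u(phi) from one perihelion to the next and return phi - 2 pi."""
    v, w = 1.0 / r_minus, 1.0 / r_plus
    m2_over_L2 = (v + w - 2.0 * M * (v * v + v * w + w * w)) / (2.0 * M)  # Eq. (25)

    def accel(u):
        # d^2u/dphi^2, from differentiating Equation (22) (assumption of this sketch)
        return M * m2_over_L2 - u + 3.0 * M * u * u

    phi, u, du = 0.0, v, 0.0            # start at perihelion: u = v, du/dphi = 0
    for _ in range(int(4.0 * math.pi / h)):
        # One classical Runge-Kutta step for the pair (u, du/dphi).
        k1u, k1d = du, accel(u)
        k2u, k2d = du + 0.5 * h * k1d, accel(u + 0.5 * h * k1u)
        k3u, k3d = du + 0.5 * h * k2d, accel(u + 0.5 * h * k2u)
        k4u, k4d = du + h * k3d, accel(u + h * k3u)
        u_new = u + h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        du_new = du + h * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        # Next perihelion: du/dphi goes from positive back to zero or below.
        if du > 0.0 and du_new <= 0.0:
            fraction = du / (du - du_new)   # linear interpolation inside the step
            return phi + fraction * h - 2.0 * math.pi
        phi, u, du = phi + h, u_new, du_new
    raise RuntimeError("no return to perihelion found")

numeric = perihelion_shift_numeric(1.0, 800.0, 1200.0)
formula = perihelion_shift_formula(1.0, 800.0, 1200.0)
print(numeric, formula)
```

The two numbers should agree to within a small relative difference, of the order of $M/p$, which is the size of the terms of order $M^2$ neglected in deriving Formula (39).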

Let us calculate the numerical value of the angle of precession. The semimajor axis of the elliptical orbit of Mercury is $a=5.8\times10^{12}\ \mbox{cm}$ and its eccentricity is $e=0.206$. Thus $p=5.55\times10^{12}\ \mbox{cm}$. The mass of the sun in centimeters is $M(\mbox{cm})=(G/c^2)M(\mbox{g})$, which yields (with $M=2\times10^{33}$ g) $M=1.5\times10^5$ cm. Thus

\begin{displaymath}\Delta\phi=5\times10^{-7}\ \mbox{radian/revolution} = 0.103''\mbox{/revolution}\end{displaymath}

Knowing that there are 415 revolutions per century we conclude that the advance of the perihelion amounts to

\begin{displaymath}\Delta\phi = 43''\ \mbox{per century}.\end{displaymath}
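The arithmetic above can be reproduced in a few lines. The sketch below uses the rounded data quoted in the text ($a$, $e$, the solar mass and 415 revolutions per century); the CGS values of `G` and `C` are assumptions added here, and the function name is only illustrative.

```python
import math

# CGS constants, rounded (assumed values, not taken from this page).
G = 6.674e-8   # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10   # speed of light, cm/s

def precession_arcsec_per_century(a_cm, e, mass_g, revolutions_per_century):
    """Formula (39) applied with lengths and mass both expressed in centimeters."""
    mass_cm = G * mass_g / C**2                   # mass of the sun in centimeters
    p_cm = a_cm * (1.0 - e * e)                   # ellipse parameter
    shift_rad = 6.0 * math.pi * mass_cm / p_cm    # radians per revolution
    shift_arcsec = math.degrees(shift_rad) * 3600.0
    return shift_arcsec * revolutions_per_century

mercury = precession_arcsec_per_century(5.8e12, 0.206, 2e33, 415)
print(mercury)   # close to 43 arcseconds per century
```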

Period of revolution

Let us take advantage of those formulae to calculate the period of revolution $T$ of a planet around the sun and recover the Newtonian result. Formula (17) teaches us that the increment $(1/2)r^2 d\phi=(1/2)(L/m)d\tau$ is proportional to the time increment $d\tau$. But that quantity is nothing else than the elementary area $dA$ swept out by the radius joining the sun to the planet. Therefore the total area swept out at time $\tau$ since time $\tau=0$ is
\begin{displaymath}
A= (1/2)(L/m)\tau
\end{displaymath} (40)

After one complete revolution the total time elapsed is $T$ and the total swept out area equals the area of the ellipse, namely $\pi ab$ if $a$ and $b$ denote the semiaxes. We can thus conclude that
\begin{displaymath}
T= 2\pi (m/L)\, a b
\end{displaymath} (41)

Taking into account Formula (25) which gives (ignoring the $M$ term inside the parentheses)

\begin{displaymath}\frac{m^2}{L^2}=\frac{1}{2}(v+w)/M = \frac{1}{pM} \end{displaymath}

and the known relations

\begin{displaymath}p = a(1-e^2)\ ,\ b=a\sqrt{1-e^2}\ ,\end{displaymath}

we find
\begin{displaymath}T^2 =\frac{4\pi^2 a^3}{M} \ .\end{displaymath} (42)

In conventional units the formula reads

\begin{displaymath}T_{\mbox{\small {s}}}^2 /a_{\mbox{\small {cm}}}^3 = 4\pi^2/(G M_{\mbox{\small {g}}}) \ ,\end{displaymath}

an expression in which the velocity of light does not appear.
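As an illustration, the period of Mercury can be computed both from Formula (42), with all quantities in seconds, and from its conventional form. The sketch below uses the rounded values of $a$ and of the solar mass quoted earlier; the CGS values of `G` and `C` are assumptions added here, and the function names are only illustrative.

```python
import math

# CGS constants, rounded (assumed values, not taken from this page).
G = 6.674e-8   # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10   # speed of light, cm/s

def period_geometric(a_cm, mass_g):
    """Formula (42): T^2 = 4 pi^2 a^3 / M, with a and M both in seconds."""
    a_s = a_cm / C
    mass_s = (G / C**3) * mass_g
    return 2.0 * math.pi * math.sqrt(a_s**3 / mass_s)

def period_conventional(a_cm, mass_g):
    """Conventional form: T^2 / a^3 = 4 pi^2 / (G M), with a in cm and M in grams."""
    return 2.0 * math.pi * math.sqrt(a_cm**3 / (G * mass_g))

t_geometric = period_geometric(5.8e12, 2e33)
t_conventional = period_conventional(5.8e12, 2e33)
period_days = t_conventional / 86400.0

print(t_geometric, t_conventional)   # the speed of light drops out
print(period_days)                   # close to 88 days
```

A period of about 88 days corresponds to roughly 415 revolutions per century, the number used above for the advance of the perihelion.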


Christian Magnan
2007-01-09
