9.4 Numerical Solution Methods, Initial Value Problem

Provided one also allows for non-linear relationships, most of the differential equations that are important for physics are quite simple. This is perhaps not really a surprise: in its deepest relationships, nature is simple! The causes and effects expressed by these differential equations can thus be derived and understood clearly and quickly from the physics.

In spite of this simplicity, the solution of differential equations can become highly complex, especially if they are non-linear. In full generality they can be solved (integrated) analytically only in individual cases. One therefore makes simplifying assumptions about the form of the differential equations, or first calculates simply solvable special cases and treats general cases that deviate only slightly from these by means of the so-called "perturbation theory".

Due to the ubiquity of the personal computer, many of these restrictions fall away. Suitable numerical programs calculate the solutions of non-linear and implicit differential equations just as quickly and accurately as those of equations that can easily be solved in the classical analytical way.

Using a computer it is also possible to solve systems of many differential equations in reasonable time. Such systems arise, for example, when the interaction between many bodies is to be calculated. An N-body problem leads in general to 6N differential equations and a corresponding number of initial conditions, since every body has six degrees of freedom: three for the position and three for the momentum. An extreme example is a simulation of the gravitational collapse in the early cosmos that was performed at the Max Planck Institute for Astrophysics in Munich. In this calculation the interaction between 10¹⁰ mass points was simulated, and the computation time on the supercomputer used was about one month! On the data carrier for the digital version of the essay edited by Martienssen and Röss, which was announced in the introduction to the present volume, there is a video of this simulation, which is described in a contribution by Günter Hasinger.

In the following we first sketch the general approach to the solution for the example of an explicit differential equation of first order and compare it with the familiar direct integration.

Given an initial value of the function and the relationship between derivative, function and variable we proceed as follows:

direct integration:
    differential equation:  y' = f(x)
    initial value:          y0 = C
    solution:               y = ∫₀ˣ f(x) dx = g(x) + C,   with f(x) = dg(x)/dx

differential equation:
    differential equation:  y' = f(x, y)
    initial value:          y0,   y'0 = f(0, y0)
    required:               solution y(x) with y(0) = y0

For the normal integration task the derivative is known a priori in the whole interval as a function of the variable, while for the differential equation it can initially only be computed from the given initial value. For other values of x one does not yet know y, and therefore also not y'. Thus one has to determine y and y' for the whole interval at the same time. To achieve this one has available, in addition to the differential equation that is valid everywhere, only the initial value as well as the initial value of the derivative obtained from the differential equation.

The numerical methods correspond to a careful stepping from the first point to the second, from there to the next, and so on. Depending on the method, one arrives at a more or less suitable estimate of where the next point could lie, given the initial values and the initial slope. At this point one then uses the differential equation to calculate an estimate of the derivative for the step to the following point. In every step errors are created, and it is therefore quite astonishing what accuracy can be reached by the advanced methods with rather simple algorithms. This is helped by the fact that many interesting tasks deal with periodic problems (orbits of planets, pendulums, periodic electric fields), such that positive and negative errors from the two half-periods quite often compensate each other.

9.4.1 Explicit Euler Method

For the simplest method, the classical Euler method, one assumes that the next value of y lies on the tangent that starts at the initial value y0 and has the slope y'0 obtained from the differential equation:

initial value given as y0
differential equation:  y' = f(x, y)

y'0 = f(x0, y0)
y1 = y0 + Δx·y'0;    x1 = x0 + Δx
y'1 = f(x1, y1)
y2 = y1 + Δx·y'1
...
yn = yn-1 + Δx·y'n-1;    y'n = f(xn, yn)

The method is called "explicit", since only data of the (n−1)-th point are used to calculate the n-th point.
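The scheme above can be sketched in a few lines of code. The following fragment is in Python (the simulations accompanying this volume are written in Java; Python is used here only for brevity, and the function name `euler` is our own). It applies the method to the differential equation y' = y with y(0) = 1, whose exact solution is the exponential function:

```python
def euler(f, x0, y0, dx, n):
    """Explicit Euler method: advance y' = f(x, y) by n steps of width dx."""
    xs, ys = [x0], [y0]
    for _ in range(n):
        y0 = y0 + dx * f(x0, y0)   # next value lies on the tangent at the current point
        x0 = x0 + dx
        xs.append(x0)
        ys.append(y0)
    return xs, ys

# Example: y' = y with y(0) = 1; exact solution y = exp(x).
xs, ys = euler(lambda x, y: y, 0.0, 1.0, 0.001, 1000)
print(ys[-1])  # about 2.71692, slightly below e = 2.71828...
```

As discussed for Fig. 9.5, the numerical curve stays slightly below the exact one, since the slope of the exponential function keeps increasing within every interval.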

The Euler method is analogous to the integration of a known function y' using the previously discussed method of trapezoidal steps. The additional complication in the analogous use for the initial value problem of a differential equation is that both the function and its derivative are unknown except at the initial point. The knowledge of the relationship between function and derivative is, however, sufficient to determine both of them approximately. One pays the price that the determination of y'1 at the next point is affected by the error committed when estimating y1 itself from the initial values.

In Fig.9.5 the situation is clarified graphically for the example of the exponential function, drawn in red. At the initial abscissa x0 the initial value of the function y0 is known. The differential equation yields the slope of the tangent drawn in blue. Its intersection with the interval boundary x1 gives the next value according to the Euler method, ỹ1, marked by a blue circle. In this example this value is clearly smaller than the actual value y1 of the exact curve, since the exponential function does not have a constant but rather a constantly increasing slope. The Euler method does not take into account changes of the derivative during the interval. Therefore one must make Δx as small as possible to limit the error.

According to the construction of the algorithm, one does not use the slope of the exact curve at the initial point x1 of the new interval, but a value that is obtained from abscissa and ordinate of the calculated point via the differential equation, i.e. y'1 = f(x1, ỹ1).

We have chosen the exponential function for the example, since the abscissa x does not appear explicitly in the corresponding differential equation y' = y. Therefore a simple graphical construction is possible for the second value of the derivative: it is equal to the slope of the dashed green tangent to the red curve at the ordinate of the second point ỹ1. With this slope we continue (blue), parallel to the dashed green line, from the first calculated point to the next one.

In the general case the relationship would be less clear.


Figure 9.5: One step of the Euler method; for details see text.

As known from the analogous integration method, the error of this simple method is quite substantial. It can be controlled to some extent, at the price of larger computational effort, by choosing the intervals Δx sufficiently small: the error decreases linearly with the width of the integration intervals, i.e. with growing resolution the method converges linearly to the correct solution. For periodic functions the errors partially compensate each other over the two half-periods, since the deviation is negative for a concave graph, while it is positive for a convex one.
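The linear convergence can be checked numerically. A small Python sketch (the helper name `euler_value` is our own) integrates y' = y over the unit interval with ever finer resolution and compares the result with the exact value e:

```python
import math

def euler_value(f, x0, y0, dx, n):
    """Value of y after n explicit Euler steps of width dx."""
    for _ in range(n):
        y0 += dx * f(x0, y0)
        x0 += dx
    return y0

# Integrate y' = y from x = 0 to x = 1; the exact value is e.
errors = {}
for n in (10, 100, 1000):
    errors[n] = abs(euler_value(lambda x, y: y, 0.0, 1.0, 1.0 / n, n) - math.e)
    print(n, errors[n])
# Ten times as many intervals reduce the error by roughly a factor of ten:
# first-order (linear) convergence.
```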

9.4.2 Heun Method


The Heun method also calculates the next point in such a way that it lies on a straight line through the initial point x0. (This also applies for the Runge-Kutta method, which is described next and has especially large practical importance.) In contrast to the Euler method, a more favourable slope is used. For the Euler method this slope was simply the result of the differential equation at the respective initial point of the new interval. For the so-called multi-stage methods, of which the Heun method is one, the slope is determined as a mean value from several calculations; thus the slope is obtained at more than one point using the differential equation.

As shown in Fig.9.6, the Heun method uses the differential equation both at the initial point and at the endpoint of the interval for one step. Like the Euler method, it first calculates the so-called Euler point (the blue point in Fig.9.6) using the slope of the tangent at the start point. Then it calculates the corresponding derivative at this point. In the figure this slope corresponds to the dashed blue tangent. Now the mean value of the two slopes is calculated (not of the angles, but of their tangents), which is indicated by a dashed line with the corresponding slope in magenta. With this average slope one now calculates forward from the initial point (solid green line, shifted in parallel). Its intersection with the interval boundary at x1 is the next point of the Heun method (green point). It is considerably closer to the "true" value than the result of the Euler approximation.


Figure 9.6: One step of the Heun method; for details see text.

Expressed in formulae:

forward step:               y'0 = f(x0, y0)
Euler point as
intermediate step:          y1,Euler = y0 + Δx·y'0
                            y'1,Euler = f(x1, y1,Euler)
calculation of the mean:    ȳ'0 = (y'0 + y'1,Euler)/2
                            y1 = y0 + Δx·ȳ'0;    y'1 = f(x1, y1)

In its strict form the method is implicit, since the new point to be calculated appears on both sides of the equation; such equations have to be solved with iterative methods. The Heun scheme presented above avoids this iteration by substituting the Euler point for the unknown endpoint.

The Heun method proceeds analogously to the integration of a known function with the help of the trapezoidal chord method. As shown when that method was discussed, the accuracy is considerably better than for the Euler method. The error of the Heun method decreases quadratically with the interval width; the method converges quadratically. It takes the change of the derivative within the interval into account in a linear approximation, thus assuming a kink rather than a smooth curve.
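A minimal Python sketch of one Heun step (the helper name `heun_step` is our own) makes the averaging of the two slopes explicit:

```python
def heun_step(f, x, y, dx):
    """One Heun step: average of the slopes at the start point and at the Euler point."""
    s0 = f(x, y)                  # slope at the start point
    y_euler = y + dx * s0         # Euler point as intermediate step
    s1 = f(x + dx, y_euler)       # slope at the Euler point
    return y + dx * (s0 + s1) / 2

# y' = y from 0 to 1 in 100 steps; the exact value is e = 2.71828...
x, y, dx = 0.0, 1.0, 0.01
for _ in range(100):
    y = heun_step(lambda x, y: y, x, y, dx)
    x += dx
print(y)  # about 2.71815 -- the Euler method with the same dx gives 2.70481
```

With the same interval width the error shrinks from about 1.3·10⁻² (Euler) to about 1.4·10⁻⁴, illustrating the quadratic convergence.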

9.4.3 Runge-Kutta Method


The Euler and Heun methods have been described for historical and systematic reasons, but even more so for didactic reasons. In their simple form they are nowadays no longer used: the larger computational effort per interval of the more advanced multi-stage methods is no longer an argument today, and with them one achieves much more accurate results for the same interval width.

The most popular royal road to the integration of differential equations is the Runge-Kutta method. In its four-stage basic version it is analogous to the parabolic approximation for the integration of known functions and takes the change of the slope within the interval into account in a quadratic approximation; thus it uses a parabolic curvature. Like the integration using the parabolic approximation, it converges with the fourth power of the interval width, Δx⁴.

For the parabolic method one uses, as described above, three points to fix the parabola that approximates the true curve in the interval: the initial point x0, the midpoint of the interval x1/2 and the endpoint x1.

For the integration of a known function, the derivative of the anti-derivative to be determined is known in the whole interval, thus also at those three points. When solving the differential equation, the derivative is initially only known at the start point of the first interval. The derivatives at the following points first have to be found. We now compare, for the start point y0 and the interval width Δx = δ, the structure of the formulas for calculating the next point y1.

integration using parabolic method:    y0 = y(x0)
                                       y1 = y0 + (δ/6)·(y'0 + 4·y'1/2 + y'1)

Runge-Kutta method:                    y0 = y(x0)
                                       y1 = y0 + (δ/6)·(y'0 + 2·y'1/2a + 2·y'1/2b + y'1c)

One recognizes the formal similarity. However, for the Runge-Kutta method the listed derivatives are not the actual differential quotients of the desired solution, but auxiliary variables that are obtained using the differential equation. In addition, instead of the derivative in the middle of the interval, the mean value of two corresponding quantities with indices a and b is used.

Runge-Kutta method for one interval:
    interval width δ,  initial abscissa x0,  initial ordinate y0 = y(x0)

    y'0   = f(x0, y0)
    y1/2a = y0 + (δ/2)·y'0;       y'1/2a = f(x0 + δ/2, y1/2a)
    y1/2b = y0 + (δ/2)·y'1/2a;    y'1/2b = f(x0 + δ/2, y1/2b)
    y1c   = y0 + δ·y'1/2b;        y'1c  = f(x0 + δ, y1c)
    y1    = y0 + (δ/6)·(y'0 + 2·y'1/2a + 2·y'1/2b + y'1c);    y'1 = f(x0 + δ, y1)

One defines an auxiliary abscissa in the middle of the interval and calculates for it, in a two-step procedure, two points a and b with their ordinates and derivatives. The first intermediate auxiliary point in the middle of the interval (index a) corresponds to the Euler point for half the interval width. Using the derivative at this Euler point one determines, beginning again at the initial point, a second point in the middle of the interval (b). With the derivative at this point one determines a third point at the endpoint of the interval, with its associated derivative (c). After taking the average of the two derivatives at the midpoint, one has three points for the integration according to the parabolic method.
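The scheme translates almost line by line into code. A Python sketch (the function name `rk4_step` is our own), again tested on y' = y:

```python
def rk4_step(f, x0, y0, delta):
    """One classical four-stage Runge-Kutta step, following the scheme above."""
    d0 = f(x0, y0)                     # derivative at the start point
    ya = y0 + delta / 2 * d0           # midpoint a: Euler point for half the width
    da = f(x0 + delta / 2, ya)
    yb = y0 + delta / 2 * da           # midpoint b, from the derivative at a
    db = f(x0 + delta / 2, yb)
    yc = y0 + delta * db               # endpoint c, from the derivative at b
    dc = f(x0 + delta, yc)
    return y0 + delta / 6 * (d0 + 2 * da + 2 * db + dc)

# y' = y from 0 to 1 with only ten intervals:
x, y = 0.0, 1.0
for _ in range(10):
    y = rk4_step(lambda x, y: y, x, y, 0.1)
    x += 0.1
print(y)  # about 2.7182797, accurate to roughly five decimal places
```

Even with this coarse resolution the result is far more accurate than the Euler or Heun results with a hundred intervals.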

9.4.4 Further Developments

The four-stage Runge-Kutta method just described converges so well that it is used for many applications.

One can improve the convergence of the method further by including additional points - similar to an approximation using polynomials of a higher order.

The speed of computation can be increased considerably by not keeping the interval width constant, but adapting it to the slope and curvature of the function to be integrated ("adaptive interval width"). This possibility is contained in popular five-stage Runge-Kutta programs and other numerical programs. The interval can thus be selected automatically in such a way that a given error per interval is not exceeded.
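One simple way to realize such automatic interval control is step doubling: one compares a full Runge-Kutta step with two half steps and uses the difference as an error estimate. The following Python sketch is our own simplified construction for illustration, not the algorithm of any particular program; production codes usually use embedded Runge-Kutta pairs instead:

```python
def rk4_step(f, x, y, d):
    """One four-stage Runge-Kutta step of width d."""
    d0 = f(x, y)
    da = f(x + d / 2, y + d / 2 * d0)
    db = f(x + d / 2, y + d / 2 * da)
    dc = f(x + d, y + d * db)
    return y + d / 6 * (d0 + 2 * da + 2 * db + dc)

def adaptive_step(f, x, y, delta, tol):
    """Step doubling: compare one full step with two half steps; halve the
    interval until the estimated error stays below tol, widen it again
    when the error is far below tol."""
    while True:
        full = rk4_step(f, x, y, delta)
        half = rk4_step(f, x + delta / 2, rk4_step(f, x, y, delta / 2), delta / 2)
        err = abs(half - full)
        if err <= tol:
            grow = 2.0 if err < tol / 32 else 1.0
            return x + delta, half, delta * grow
        delta /= 2   # error too large: shrink the interval and try again

# Integrate y' = y from 0 to 1 with automatic interval control:
x, y, delta = 0.0, 1.0, 0.5
while x < 1.0 - 1e-12:
    x, y, delta = adaptive_step(lambda x, y: y, x, y, min(delta, 1.0 - x), 1e-9)
print(y)  # close to e = 2.718281828...
```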

The approximation rules used in the Runge-Kutta method are well tested, but they are not the only feasible ones. One can work with other criteria that are more favourable for special classes of functions. In addition there are quite a few approximation methods that have been derived in very different ways; they are also part of commercial programs and are discussed in the literature.

The computation speed of all methods depends on whether one works in higher-level languages or with languages that are closer to the operating system. Programs in Java or in Mathematica therefore run faster than algorithms written, for example, in Visual Basic for EXCEL. The speed of the following Java simulations is not limited by the computation speed, but is selected in such a way that one can easily follow the time development.

A program that one has written from scratch has, compared to prebuilt algorithms that run in the background, the didactic advantage that one can follow the development accurately and intervene in it.