### 5.9 Integral

#### 5.9.1 Definition of Anti-derivative via its Differential Equation

The first derivative, the differential quotient, describes the change of a given function $f(x)$ as a function of the variable $x$. We can now ask the converse question:

Is there a function $F(x)$ whose change is described by $f(x)$, and what properties does this function have? If such a function exists, it is called the anti-derivative of $f(x)$ or its indefinite integral. It is described by a very simple differential equation:

$$\frac{dF(x)}{dx} = f(x) \quad\Longleftrightarrow\quad F(x) = \int f(x)\,dx$$

The integral sign serves as a reminder that the calculation proceeds via a summation, and the notation $dx$ reminds us that the calculation involves a limiting process in which the interval of the variable becomes infinitesimally small, that is $\Delta x \to 0$; we will visualize this soon.

This differential equation obviously defines a whole family of functions that differ from one another by a constant value, because the derivative (change) of a constant vanishes. Thus the indefinite integral of a given function is known only up to a constant:

$$\int f(x)\,dx = F(x) + C$$

If the differential equation has a meaningful solution, i.e. if the function is integrable, then the indefinite integral is, in analogy to the differential quotient, a function that describes, up to a constant, a local property of the integrated function $f(x)$.

#### 5.9.2 Definite Integral and Initial Value

What is the meaning of the integration constant? As long as we do not decide on the range of the variable $x$, it is simply an arbitrary number.

If, however, we start at a certain initial value $x_0$ and take into account that $f(x)$ is the change $dF/dx$ of the anti-derivative, then the anti-derivative describes the accumulated changes in $F(x)$ given by $f(x)$ from the variable value $x_0$ onwards.

We now show this in a simple example from physics: we assume that $f(x)$ is the time dependent velocity $v(t)$ of an object. The result of this time dependent velocity, which can also have negative values, is the distance traveled $s(t)$, i.e. $s(t) = \int v(t)\,dt$. Thus $s(t)$ determines the distance from the initial point as a function of time.

The constant $C$ is the initial value $F(x_0)$ of the integral for the variable value $x_0$, in our example the position from which we start.

Provided the range of the variable is open, i.e. $x \ge x_0$ with no fixed upper limit, the definite integral defined in this way is a function of the variable $x$.

If we are interested in the behaviour of the anti-derivative in a closed interval $a \le x \le b$, the definite integral becomes a fixed value. The value at the end of the integration range is the result of the initial value and of all changes up to the final value of $x$, and is given by the anti-derivative $F(b)$. The change within the interval is the difference from the initial value $F(a)$. Calculating this difference also eliminates the unknown integration constant, as one sees by repeating the same line of thought for the initial and the final value, starting from an arbitrary initial value outside the interval:

$$\int_a^b f(x)\,dx = \big(F(b) + C\big) - \big(F(a) + C\big) = F(b) - F(a)$$

This relationship is known as the fundamental theorem of calculus (the main theorem of differential and integral calculus).
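The cancellation of the integration constant can be checked numerically. The following is a minimal sketch, assuming the velocity $v(t) = \cos t$ with anti-derivative $s(t) = \sin t + C$; the function names `definite_integral` and `position` are chosen here for illustration only.

```python
import math

def definite_integral(antiderivative, a, b):
    """Definite integral from a to b, computed as F(b) - F(a)."""
    return antiderivative(b) - antiderivative(a)

def position(t, C=0.0):
    """Anti-derivative of the assumed velocity v(t) = cos(t), with constant C."""
    return math.sin(t) + C

a, b = 0.0, math.pi / 2

# Two different integration constants (two different starting positions).
d1 = definite_integral(lambda t: position(t, C=0.0), a, b)
d2 = definite_integral(lambda t: position(t, C=5.0), a, b)

# Both differences describe the same distance covered between a and b.
print(d1, d2)
```

Whatever starting position is chosen, the distance covered between the two times is the same, which is exactly the statement that the constant drops out of the difference.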

Thus, in order to calculate a definite integral we “only” need to know its anti-derivative. Determining the anti-derivative of an arbitrary function $f(x)$ is in general not as easy as determining its derivative. Basic functions can be integrated easily by inverting the well known relations for their derivatives; for many complicated functions there are tables. There are also quite a few useful general rules that can help to find the anti-derivative, for example “integration by parts”. Unfortunately there is no rule that always succeeds.

Therefore numerical methods play an especially important role for integration, as we will discuss later.

#### 5.9.3 Integral as Limit of a Sum

In analogy to calculating the partial sums of a series, one can define the integral as a measure of the area under the function within an interval of the variable. Obviously one cannot simply sum the function values, since there are infinitely many of them; each value has to be weighted with a factor. This factor is analogous to the index difference for series and is equal to the width of the interval. Multiplying this factor by a suitably chosen function value gives a measure for the area under the function in the interval.

Since functions in general change when the variable changes, choosing an arbitrary function value from the interval (for example at the beginning, in the middle or at the end) can only yield an approximation. One therefore decomposes a larger interval $[a, b]$ into $n$ sub-intervals, chosen for convenience to be of equal width $\Delta x = (b - a)/n$, and sums the approximate measures of the sub-intervals. The integral is then defined as the limit of this sum for vanishingly small sub-interval width.

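Note that the notation in the following three lines is somewhat inexact.

$$\begin{aligned}
\Delta x &= \frac{b - a}{n}, \qquad x_i = a + i\,\Delta x \\
\int_a^b f(x)\,dx &\approx \sum_{i=0}^{n-1} f(x_i)\,\Delta x \\
\int_a^b f(x)\,dx &= \lim_{n \to \infty} \sum_{i=0}^{n-1} f(x_i)\,\Delta x
\end{aligned}$$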

The definite integral gives the area between the function $f(x)$ and the $x$-axis in the region of integration, with parts below the axis counted as negative.

The limiting process is shown in the interactive simulation of Fig. 5.10.

The sine function to be integrated is drawn in blue, while the analytical integral function is drawn in red. The small blue point, which can be moved with the slider, indicates the initial point for the integration and thus at the same time the zero point of the formal integral. The thick end point in magenta can be adjusted with the mouse. The second slider determines the number of sub-intervals.

The green rectangles represent the contributions of the individual intervals if the function value at the beginning of each interval is assumed to be constant over the whole interval. The sum of the contributions of all intervals yields the big green point. With decreasing interval width it approaches the analytically calculated integral. For a sufficiently large number of intervals this value runs along the integral curve when the end point is dragged.

You will find further instructions for experiments in the description pages of the simulation.
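The same limiting process can be reproduced in a few lines of code. The sketch below is not the simulation itself; it assumes the sine function on the arbitrarily chosen interval $[0, 2]$ and uses the function value at the left edge of each sub-interval, as the green rectangles do. The name `left_sum` is chosen here for illustration.

```python
import math

def left_sum(f, a, b, n):
    """Sum of n rectangles, each using the function value at its left edge."""
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(n)) * dx

a, b = 0.0, 2.0
exact = math.cos(a) - math.cos(b)  # the anti-derivative of sin(x) is -cos(x)

# The deviation from the analytical integral shrinks as n grows.
for n in (4, 16, 64, 256):
    approx = left_sum(math.sin, a, b, n)
    print(n, approx, abs(approx - exact))
```

Increasing `n` plays the role of the second slider in the simulation.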

#### 5.9.4 The Definition of the Integral due to Riemann

We still require a criterion to decide whether a function can be integrated at all in a given region. In the classical sense this is provided by Riemann's definition of the integral.

For this purpose we define, for the intervals given by the points $x_i$ with interval widths $\Delta x_i$, two sums, namely the upper sum and the lower sum. The first uses the largest function value, the supremum, in each interval; the second uses the smallest function value, the infimum, in each interval. If both sums converge to the same value for $\Delta x_i \to 0$, one from above and the other from below, the function is said to be integrable in the Riemannian sense.

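The notation here is again inexact.

$$\lim_{\Delta x_i \to 0} \sum_i \Big(\sup_{x_i \le x \le x_{i+1}} f(x)\Big)\,\Delta x_i = \lim_{\Delta x_i \to 0} \sum_i \Big(\inf_{x_i \le x \le x_{i+1}} f(x)\Big)\,\Delta x_i = \int_a^b f(x)\,dx$$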

In the following interactive simulation shown in Fig. 5.11 the construction of the Riemannian sums is demonstrated using the example of the sine function. In the left window the upper sum (supremum) is used and in the right window the lower sum (infimum). The width of all intervals is the same. The formal integral is shown in yellow. The initial and final $x$-values can again be adjusted, as can the number of intervals. With increasing resolution both sums tend to the same value.

The initial $x$-value can again be adjusted with a slider and the final $x$-value (coloured magenta) can be dragged with the mouse. The number $n$ of sub-intervals in the integration region is adjusted with the second slider. The analytically determined integral is indicated in yellow. Its initial value is given by the initial ordinate of the integration region. The point surrounded by a square shows the sum over the approximating intervals.

If it is known that a function is Riemann-integrable, then any sum that uses as its measure any value of the function in each sub-interval converges to the integral. Thus one has a lot of freedom in the choice of numerical integration method. You are urged to compare the last two figures. The step-function approximation is equal neither to the approximation with the supremum nor to that with the infimum, but it converges to the same limit.
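A minimal sketch of the upper and lower sums follows. It assumes a function that increases monotonically on the integration interval, so that the infimum of each sub-interval lies at its left edge and the supremum at its right edge; for a general function both would have to be searched for in every sub-interval. The sine function on $[0, \pi/2]$, whose integral is $1$, satisfies this assumption. The name `lower_upper_sums` is illustrative.

```python
import math

def lower_upper_sums(f, a, b, n):
    """Lower and upper sums for a function that is increasing on [a, b].

    For an increasing function the infimum of each sub-interval is its
    left-edge value and the supremum is its right-edge value.
    """
    dx = (b - a) / n
    lower = sum(f(a + i * dx) for i in range(n)) * dx
    upper = sum(f(a + (i + 1) * dx) for i in range(n)) * dx
    return lower, upper

# Both sums enclose the integral and approach each other as n grows.
for n in (4, 16, 64, 256):
    lower, upper = lower_upper_sums(math.sin, 0.0, math.pi / 2, n)
    print(n, lower, upper, upper - lower)
```

For equal interval widths the gap between the two sums is $\Delta x\,\big(f(b) - f(a)\big)$, which makes it plain why both converge to the same value here.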

As an example of a function that cannot be integrated in the Riemannian sense, consider the exotic function mentioned above:

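$$f(x) = \begin{cases} 1 & \text{for } x \text{ irrational} \\ 0 & \text{for } x \text{ rational} \end{cases} \qquad 0 \le x \le 1$$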

In its domain of definition it obviously has the upper sum 1 and the lower sum 0, since every interval of arbitrarily small length $\Delta x$ contains both rational and irrational numbers, and thus function values of both $1$ and $0$. The upper sum and the lower sum therefore converge, but not to the same value, and the function is not Riemann-integrable.

#### 5.9.5 Lebesgue Integral

The previous statement is not really satisfactory. The rational numbers are far fewer than the irrational ones (the rationals are countable, the irrationals are not), and therefore the function $f(x)$ has the value $1$ for nearly all values of $x$. Therefore the integral of this function should be close to 1.

This question can be answered more easily with the alternative notion of the Lebesgue integral. For this approach one subdivides the integration region into stripes parallel to the $x$-axis, i.e. into intervals on the ordinate, and asks for the limit of the sum over these stripes. Each stripe contributes the product of the function value in the stripe and the Lebesgue measure of the set of values of the variable for which the function lies in that stripe:

$$\int f\,d\mu = \lim_{\Delta y \to 0} \sum_i y_i\,\mu\big(\{x : y_i \le f(x) < y_i + \Delta y\}\big)$$

In the exotic example the top stripe has the function value $1$, and the measure of its variable interval is (for the moment approximately) $1$, since nearly all numbers are irrational. The lowest stripe has the function value $0$ and therefore contributes nothing, independent of the measure of its variable interval.

The exotic function is therefore Lebesgue-integrable and the result is $1$.

The advantage of Lebesgue's definition of the integral is that it allows the notion of the integral to be extended beyond the domain of numbers to sets in general, provided these sets can be decomposed into subsets, each of which can be measured in the sense of a finite area. The following holds: a function that is Riemann-integrable is also Lebesgue-integrable, but the converse is not true. Thus the Lebesgue integral is the more general notion.

For the following simple example we visualize the integration of a parabola, on the left hand side using Riemann’s approach and on the right hand side using Lebesgue’s approach. For the Lebesgue integral the interval measure was calculated in such a way that the measure is exact irrespective of the width of the interval.
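Lebesgue's approach for a parabola can be sketched in code as follows. The specific parabola is an assumption made here, namely $f(x) = x^2$ on $[0, 1]$, because for it the measure of each horizontal stripe is known exactly: the set of $x$ with $y_{\text{low}} \le x^2 < y_{\text{high}}$ has length $\sqrt{y_{\text{high}}} - \sqrt{y_{\text{low}}}$. The function name is illustrative.

```python
import math

def lebesgue_sum_parabola(n):
    """Lebesgue-style sum for f(x) = x**2 on [0, 1] with n horizontal stripes."""
    dy = 1.0 / n
    total = 0.0
    for i in range(n):
        y_low, y_high = i * dy, (i + 1) * dy
        # Exact measure of the set {x in [0, 1] : y_low <= x**2 < y_high}.
        measure = math.sqrt(y_high) - math.sqrt(y_low)
        # Each stripe contributes (function value) * (measure of its x-set).
        total += y_low * measure
    return total

# The sums approach the Riemann result 1/3 from below.
for n in (10, 100, 1000):
    print(n, lebesgue_sum_parabola(n))
```

Because the lower edge of each stripe is used as the function value, the sum stays below the true value $1/3$ and approaches it as the stripes become thinner.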

#### 5.9.6 Rules for the Analytical Integration

As for derivatives, there are a number of important general rules (for clarity we drop the integration constant in the following).

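$$\begin{aligned}
\int c\,f(x)\,dx &= c \int f(x)\,dx \\
\int \big(f(x) + g(x)\big)\,dx &= \int f(x)\,dx + \int g(x)\,dx \\
\int f(x)\,g'(x)\,dx &= f(x)\,g(x) - \int f'(x)\,g(x)\,dx && \text{(integration by parts)} \\
\int f\big(g(x)\big)\,g'(x)\,dx &= \int f(u)\,du, \quad u = g(x) && \text{(substitution)}
\end{aligned}$$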

For the especially useful rules of integration by parts and substitution of a new variable it is important to find functions that can be integrated easily, for example the exponential function, powers of $x$ and the trigonometric functions.

The following formulas for basic functions, given without the integration constant, follow very easily from the formulas given above for the first derivatives; we therefore list only those of the greatest practical importance:

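$$\begin{aligned}
\int x^n\,dx &= \frac{x^{n+1}}{n+1} \quad (n \neq -1) \\
\int \frac{1}{x}\,dx &= \ln |x| \\
\int e^x\,dx &= e^x \\
\int \sin x\,dx &= -\cos x \\
\int \cos x\,dx &= \sin x
\end{aligned}$$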

The analytical integration of functions that can be integrated in principle but are rather complex is, as a rule, more tedious than differentiation, which is always easily achieved. Therefore voluminous collections of integrals exist in the corresponding text books and manuals and on the internet. Computer programs such as Mathematica also have a wide range of formal integrals built in, which one can obtain as formulas by entering the function to be integrated.

Numerical integration methods obviously play a very important role: for their application it does not matter whether the integral of the function is known analytically, and one can even integrate functions that are known only as discrete measured values $(x_i, y_i)$.

#### 5.9.7 Numerical Integration Methods

Integrals often have to be calculated numerically if it is not possible to determine the anti-derivative analytically. In this case the sums obtained using step functions converge only relatively slowly as the interval widths decrease; one would therefore have to subdivide the integration region into many sub-intervals to achieve a high accuracy.

Therefore one uses approximations of the function $f(x)$ other than step functions in order to reach convergence faster. An obvious approximation, looking at the last figure, is not to take the value $f(x_i)$ at the beginning of the interval as constant over the interval (step-function approximation), but to use the mean of the initial and final values, $\tfrac{1}{2}\big(f(x_i) + f(x_{i+1})\big)$. This corresponds to a trapezoidal approximation, where one adds to the staircase the triangle leading to the next function value; the curve is now approximated by the initial value in the interval and the secant connecting the initial and final values, with the slope $\big(f(x_{i+1}) - f(x_i)\big)/\Delta x$.

The approximation of the function becomes even more accurate if one uses a parabola (Simpson’s/Kepler’s method) that is fixed by three consecutive function values. This now also takes the curvature (second derivative) in each interval into account approximately. Thus those regions of the function that, like a parabola, possess no inflection points in the respective sub-intervals ($f''(x) \neq 0$) are approximated well. One can continue in this manner with polynomials of third or fourth degree, which then also allow for the representation of inflection points. However, one then needs more and more intermediate points in each sub-interval. Therefore one usually restricts this approach to the parabola and chooses the interval sufficiently small.

All these methods have the advantage that the approximations of the function by constants, secants and parabolas can be integrated quite easily:

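$$\begin{aligned}
\text{step function:}\quad & f(x) \approx f(x_i), & \int_{x_i}^{x_{i+1}} f(x)\,dx &\approx f(x_i)\,\Delta x \\
\text{trapezoid:}\quad & f(x) \approx f(x_i) + \frac{f(x_{i+1}) - f(x_i)}{\Delta x}\,(x - x_i), & \int_{x_i}^{x_{i+1}} f(x)\,dx &\approx \frac{f(x_i) + f(x_{i+1})}{2}\,\Delta x \\
\text{parabola:}\quad & f(x) \approx a\,(x - x_i)^2 + b\,(x - x_i) + c, & \int_{x_i}^{x_{i+2}} f(x)\,dx &\approx \frac{8a}{3}\,\Delta x^3 + 2b\,\Delta x^2 + 2c\,\Delta x
\end{aligned}$$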

The simulation in Fig. 5.13 compares the three methods for two adjacent sub-intervals. As an example we again consider the sine function (blue) with its analytical integral (red). The initial and end points of the integration region can be changed. The sum over both sub-intervals is shown as a green point. The simulation shows the great superiority of the parabolic approximation, whose result agrees with the red curve even for a coarse subdivision of the interval.

It is a bit tedious to calculate the parameters of a parabola that passes through three points, but this is only necessary if, as in this simulation, the approximating curves themselves are to be drawn. The following steps are required for the calculation in each double interval $[x_i, x_{i+2}]$:

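$$\begin{aligned}
y_0 &= f(x_i), \quad y_1 = f(x_i + \Delta x), \quad y_2 = f(x_i + 2\Delta x) \\
p(x) &= a\,(x - x_i)^2 + b\,(x - x_i) + c \\
c &= y_0 \\
b &= \frac{-3y_0 + 4y_1 - y_2}{2\,\Delta x} \\
a &= \frac{y_0 - 2y_1 + y_2}{2\,\Delta x^2}
\end{aligned}$$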

Inserting the parameters of the parabola and integrating over the double interval $[x_i, x_{i+2}]$ yields a surprisingly simple formula for the approximation to the integral, for which only the three function values and the width of the interval are required:

$$\int_{x_i}^{x_{i+2}} f(x)\,dx \approx \frac{\Delta x}{3}\,\big(f(x_i) + 4 f(x_{i+1}) + f(x_{i+2})\big)$$
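The three methods of this section can be compared directly in code. The following is a minimal sketch with equal sub-intervals; the function names and the test interval $[0, 2]$ for the sine function are choices made here for illustration and are not taken from the simulation.

```python
import math

def step_rule(f, a, b, n):
    """Step-function method: left-edge value held constant in each sub-interval."""
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(n)) * dx

def trapezoid_rule(f, a, b, n):
    """Trapezoidal method: mean of the two edge values in each sub-interval."""
    dx = (b - a) / n
    inner = sum(f(a + i * dx) for i in range(1, n))
    return (0.5 * f(a) + inner + 0.5 * f(b)) * dx

def simpson_rule(f, a, b, n):
    """Parabola (Simpson) method: one parabola per pair of sub-intervals."""
    if n % 2 != 0:
        raise ValueError("n must be even: each parabola spans two sub-intervals")
    dx = (b - a) / n
    total = 0.0
    for i in range(0, n, 2):
        x0 = a + i * dx
        total += dx / 3.0 * (f(x0) + 4.0 * f(x0 + dx) + f(x0 + 2.0 * dx))
    return total

a, b, n = 0.0, 2.0, 10
exact = math.cos(a) - math.cos(b)
for rule in (step_rule, trapezoid_rule, simpson_rule):
    print(rule.__name__, abs(rule(math.sin, a, b, n) - exact))
```

Even with only ten sub-intervals the parabola method is far closer to the analytical value than the other two, in line with what the simulation shows.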

#### 5.9.8 Error Estimate for Numerical Integration

To get an idea of the accuracy of the different integration methods, we expand the function in a Taylor series and, assuming the interval is sufficiently small, use the first neglected term as an estimate for the error. To simplify the notation, we expand the function in a Taylor series around $x = 0$ up to 5th order:

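$$\text{1)}\quad f(x) = f(0) + f'(0)\,x + \frac{f''(0)}{2!}\,x^2 + \frac{f'''(0)}{3!}\,x^3 + \frac{f^{(4)}(0)}{4!}\,x^4 + \frac{f^{(5)}(0)}{5!}\,x^5 + \dots$$

$$\text{2)}\quad \int_0^{\Delta x} f(x)\,dx = f(0)\,\Delta x + \frac{f'(0)}{2!}\,\Delta x^2 + \frac{f''(0)}{3!}\,\Delta x^3 + \frac{f'''(0)}{4!}\,\Delta x^4 + \frac{f^{(4)}(0)}{5!}\,\Delta x^5 + \dots$$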

For the step-function method we use only the first term $f(0)$ in 1). The error for each interval is thus of the order $\Delta x^2$. If one wants to know the error for the whole integration region, one has to sum over $n \propto 1/\Delta x$ intervals. Thus the total error is proportional to $\Delta x$. Doubling the resolution halves the error, i.e. doubles the accuracy.

For the trapezoidal method the first two terms in 1) are used. The error per interval is then proportional to $\Delta x^3$, and thus the total error is proportional to $\Delta x^2$. Doubling the resolution improves the accuracy by a factor of $4$.

For the parabola method we expand the function from the middle of the double interval once to the right and once to the left, and the integral over the whole interval is the sum over both sub-intervals. The result then contains only odd powers of $\Delta x$. For the parabola we also take the curvature into account, i.e. $f''(0)$. The error for each interval is then proportional to $\Delta x^5$, and the total error is thus proportional to $\Delta x^4$; doubling the resolution increases the accuracy by a factor of $16$. In addition, the large numerical factor $90$ in the denominator of the error term contributes to a small error.

Important hint: the approximating parabola used for the integration is not identical with the third partial sum of the Taylor series. The latter agrees with the function only at the expansion point, while the approximating parabola used for the integration is equal to the function at all three points.

The following figure compares, on a double logarithmic scale, the deviation from the analytic integral of the sine function for the trapezoidal and parabolic methods as a function of the resolution (number $n$ of sub-intervals). The points represent the numerical integration results over a constant integration region; the lines represent the functions $c_1/n^2$ and $c_2/n^4$, with $c_1$ and $c_2$ chosen in such a way that both lines coincide with the numerical error for the smallest number of intervals. The further behaviour of both functions and of the points confirms the expected dependence on $n$.
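The predicted scaling can also be checked without a plot by doubling the number of sub-intervals and looking at the ratio of the errors. The sketch below assumes the sine function on the interval $[0, 2]$ (an arbitrary choice made here); `composite_errors` is an illustrative name. Ratios close to 2, 4 and 16 are expected for the step-function, trapezoidal and parabola methods respectively.

```python
import math

def composite_errors(n, a=0.0, b=2.0):
    """Absolute errors of the step, trapezoid and parabola methods for sin(x).

    n must be even so that the parabola method can pair up the sub-intervals.
    """
    exact = math.cos(a) - math.cos(b)
    dx = (b - a) / n
    y = [math.sin(a + i * dx) for i in range(n + 1)]
    step = dx * sum(y[:-1])
    trapezoid = dx * (sum(y) - 0.5 * (y[0] + y[-1]))
    # Simpson weights: 1 at both ends, 4 at odd and 2 at even interior points.
    simpson = dx / 3.0 * (y[0] + y[-1] + 4.0 * sum(y[1:-1:2]) + 2.0 * sum(y[2:-1:2]))
    return tuple(abs(v - exact) for v in (step, trapezoid, simpson))

coarse = composite_errors(16)
fine = composite_errors(32)
ratios = [c / f for c, f in zip(coarse, fine)]
print(ratios)
```

On a double logarithmic plot these ratios correspond to slopes of $-1$, $-2$ and $-4$.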

This example should demonstrate how versatile the Taylor series of fifth order is, which is why we have treated it in such depth.