× Binary Unit Round-off Roundoff Error Number System with base β Floating point error analysis Orders of Convergence Newton's Method Error Analysis for Newton's Method Theorems on Newton's Method Horner's Algorithm Contractive Mapping Polynomial Interpretation Hermite Interpolation Change of Intervals Gaussian Quadrature Error Analysis for Gaussian Quadrature Composite Formulae θ-Method Error Initial Value Problem Stability Local Truncation Error for Multistep Methods Schur's Form Hermite Matrix QR Factorization Householder LU Decomposition Norm Positive Definite Gerschgorin Theorem Rayleigh Quotient August 2017 August 2018 January 2019 August 2019 January 2020 August 2020 January 2021 August 2021 January 2022 August 2022 January 2023 August 2023 January 2024 References
☰ Menu

My name is Hanzhang Yin, and I have developed this website as a resource to facilitate the review of key concepts in abstract algebra, and it is not fully completed. Feel free to email me at hanyin@ku.edu if there are any errors or suggestions for improvement.

Numerical Analysis

Binary

x=±(0.a1an)2×2m where a1=1

Note. For Marc-32, n=24, 126m127. For Marc-64, n=53, 1021m1024.


Unit Roundoff

fl(1+ε)>1 ε=224 108 ε=253 1016


Roundoff Error

Let δ=xfl(x)x. define |δ|ε Suppose that x=±(0.a1an)2×2m, then we pick two points such that x is less than x and x+ is greater than x. Then we have x=(0,a1an)2×2m x+=(0,a1an+0.0001n)2×2m fl(x)=x if an+1=0(x is closer to x_-than x_+) and fl(x)=x+ if an+1=1 (x is closer tox_+ than x_-) |fl(x)x|12×(x+x)=122n2m |fl(x)xx|122n2m(0,a1an)2×2m=2n


Number System with base β

The unit roundoff is ε=12β1t where t is the number of digits in the mantissa.


Floating point error analysis

Let x,y be two machine numbers. Let is a binary operation. fl(xy)=(xy)(1+δ) where |δ|ε is the roundoff error. fl(x)=x(1+δ)

Example.


Orders of Convergence

Let {xn} be a sequence of real numbers tending to a limit x. We say that the rate of convergence is at least linear if there are a constant c<1 and an integer N such that |xn+1x|c|xnx|(nN) We say that the rate of convergence is at least superlinear if there exist a sequence ϵn tending to 0 and an integer N such that |xn+1x|ϵn|xnx|(nN) The convergence is at least quadratic if there are a constant C (not necessarily less than 1) and an integer N such that |xn+1x|C|xnx|2(nN) In general, if there are positive constants C and α and an integer N such that |xn+1x|C|xnx|α(nN) we say that the rate of convergence is of order α at least.

"Numerical Analysis: Mathematics of Scientific Computing" page. 17.

Newton's Method

Definition. Newton's method, also known as the Newton-Raphson method, is an iterative root-finding algorithm used to find successively better approximations to the roots (or zeros) of a real-valued function. Given a function f(x) and its derivative f(x), Newton's method starts with an initial guess x0 and iteratively refines this guess using the following formula: xn+1=xnf(xn)f(xn), where:

Convergence Criteria. The method typically converges if the initial guess x0 is sufficiently close to the actual root, and f(x) is continuously differentiable in the neighborhood of the root.

Example. Consider finding the root of the function f(x)=x22. The derivative of the function is f(x)=2x. Firstly, we start with an initial guess x0. Secondly, we apply the Newton's iteration: xn+1=xnxn222xn=2xnxn22xn2=xn+2xn2 So, the iterative formula becomes: xn+1=12(xn+2xn) By repeatedly applying this formula, the sequence {xn} will converge to the square root of 2.


Error Analysis for Newton's Method

Let r be a solution of f(x)=0 (i.e. f(r)=0). Suppose that we have already computed xn and the error in xn is en=|xnr|. We now derive a formula that relates the error after the next step, |xn+1r|, to |xnr|. According to Taylor's theorem, there is a c between xn and r such that 0=f(r)=f(xn)+f(xn)(rxn)+12f(c)(rxn)2(1) By the definition of Newton's method, we have 0=f(xn)+f(xn)(xn+1xn)(2) Subtracting (2) from (1). 0=f(xn)(rxn+1)+12f(c)(rxn)2xn+1r=f(c)2f(xn)(xnr)2en+1=|xn+1r|=|f(c)2f(xn)||xnr|2 If xn is close to r, then c, which must be between xn and r, is also close to r and en+1=|xn+1r||f(r)2f(r)||xnr|2=|f(r)2f(r)|en2=Cen2

Thus, we can see that Newton's Method is quadratically convergent.

However, the disadvantage of Newton's Method is that if the initial value is closed enough to the root.


Theorems on Newton's Method

Theorem. Let f be continuous and let r be a simple zero of f. Then there is a neighborhood of r and a constant C such that if Newton's method is started in that neighborhood, the successive points become steadily closer to r and satisfy |xn+1r|C(xnr)2(n0)

$\textbf{Theorem on Newton's Method for a Convex Function. }$ If f belongs to C2(R), is increasing, is convex, and has a zero, then the zero is unique, and the Newton iteration will converge to it from any starting point.

$\textbf{Convergence of Newton's method.}$ Assume that f(x), f(x), and f(x) are continuous for all x in some neighborhood of α and f(α)=0 and f(α)0. If x0 is sufficiently close to α, then xnα as n. Moreover, limn|xn+1α||xnα|2=12|f(α)f(α)|.


Horner's Algorithm

This method is also known as Horner's method, nested multiplication, or synthetic division.

Definition.  Horner's algorithm is a method for evaluating a polynomial at a point. Given a polynomial p(x)=a0+a1x+a2x2++anxn, Horner's algorithm evaluates p(x) at a point x=c as follows: p(c)=a0+c(a1+c(a2++c(an1+can))).

Given a polynomial p(x)=anxn+an1xn1++a1x+a0. Then bn1=anbn2=an1+z0bn1b0=a1+z0b1p(z0)=a0+z0b0 Horner's algorithm evaluates p(x) at a point x=z0 as follows: anan1an2a0z0z0bn1z0bn2z0b0bn1bn2bn3b1 The boxed number satisfies p(z0)=b1.

Example.  Given the polynomial p(x)=4x33x2+2x1 and z0=2, evaluate p(z0). 4321281024451223 Hence, we have p(2)=23.

Theorem on Horner's Method.  Let p(x)=anxn++a1x+a0. Define pairs (αj,βj) for j=n,n1,,0 by the algorithm {(αn,βn)=(an,0)(αj,βj)=(aj+xαj+1,αj+1+xβj+1)(n1j0) Then α0=p(x) and β0=p(x).


Contractive Mapping

Newton's method and Steffensen's method are examples of procedures whereby a sequence of points is computed by a formula of the form (1)xn+1=F(xn)(n0) The algorithm defined by such an equation is called functional iteration.

"Numerical Analysis: Mathematics of Scientific Computing" page. 100.

Definition.  A mapping F of a metric space (X,d) into itself is called a contraction if there exists a constant k with 0k<1 such that d(F(x),F(y))kd(x,y) for all x,y in X.

"Numerical Analysis: Mathematics of Scientific Computing" page. 101.

Contractive Mapping Theorem

Theorem (Contractive Mapping Theorem). Let C be a closed subset of the real line. If F is a contractive mapping of C into C, then F has a unique fixed point. Moreover, this fixed point is the limit of every sequence obtained from Equation (1) with a starting point x0C.

"Numerical Analysis: Mathematics of Scientific Computing" page. 102.


Polynomial Interpretation

Theorem on Polynomial Interpolation Error

Theorem. Let f be a function in Cn+1[a,b], and let p be the polynomial of degree at most n that interpolates the function f at n+1 distinct points x0,x1,,xn in the interval [a,b]. To each x in [a,b] there corresponds a point ξx in (a,b) such that f(x)p(x)=1(n+1)!f(n+1)(ξx)i=0n(xxi).


Lagrange's Form

The Lagrange form of the interpolation polynomial is a method for constructing a polynomial that passes through a given set of points. Specifically, given a set of n+1 distinct data points (x0,y0),(x1,y1),,(xn,yn), the Lagrange interpolation polynomial P(x) is defined as: P(x)=i=0nyii(x) where i(x) are the Lagrange basis polynomials given by: i(x)=0jnjixxjxixj Each i(x) is a polynomial of degree n that is 1 at x=xi and 0 at x=xj for ji.

Example.  Given points: (1,1),(2,4),(3,9). Here x0=1,x1=2,x2=3, and y0=1,y1=4,y2=9.

Compute the Lagrange basis polynomials i(x): 0(x)=(x2)(x3)(12)(13)=(x2)(x3)2 1(x)=(x1)(x3)(21)(23)=(x1)(x3) 2(x)=(x1)(x2)(31)(32)=(x1)(x2)2

Construct the interpolation polynomial P(x): P(x)=10(x)+41(x)+92(x). P(x)=1(x2)(x3)24(x1)(x3)+9(x1)(x2)2. Simplifying, we get: P(x)=(x2)(x3)24(x1)(x3)+9(x1)(x2)2.


Divided Differences

The divided differences of a set of data points are a sequence of numbers that are used to construct the Newton form of the interpolation polynomial.

f[xi,xi+1,,xi+j]=f[xi+1,xi+2,,xi+j]f[xi,xi+1,,xi+j1]xi+jxi The Following table shows the divided differences for a set of data points: x0f[x0]f[x0,x1]f[x0,x1,x2]f[x0,x1,x2,x3]x1f[x1]f[x1,x2]f[x1,x2,x3]x2f[x2]f[x2,x3]x3f[x3]

Example.  Given the following table of data points: x3156f(x)1324 Solution: We arrange the given table vertically and compute divided differences by the use of Formula above, arriving at 3123/87/40135/43/2052264

Theorem on Permutations in Divided Differences. The divided difference is a symmetric function of its arguments. Thus, if (z0,z1,,zn) is a permutation of (x0,x1,,xn), then f[z0,z1,,zn]=f[x0,x1,,xn].

Theorem on Derivatives and Divided Differences. If f is n times continuously differentiable on [a,b] and if x0,x1,,xn are distinct points in [a,b], then there exists a point ξ in (a,b) such that f[x0,x1,,xn]=1n!f(n)(ξ).


Newton's Form

The Newton form of the interpolation polynomial is a method for constructing a polynomial that passes through a given set of points. Specifically, given a set of n+1 distinct data points (x0,y0),(x1,y1),,(xn,yn), the Newton interpolation polynomial P(x) is defined as: P(x)=a0+a1(xx0)+a2(xx0)(xx1)++an(xx0)(xx1)(xxn1) where a0,a1,,an are the divided differences of the data points.

Remark. Here, an=f[x0,x1,,xn].

Example.  Given the following table of data points: x3156f(x)1324 We already found the divided differences in the previous example. The Newton form of the interpolation polynomial is p(x)=1+2(x3)38(x3)(x1)+740(x3)(x1)(x5).


Theorem on Polynomial Interpolation Error

Theorem. Let f be a function in Cn+1[a,b], and let p be the polynomial of degree at most n that interpolates the function f at n+1 distinct points x0,x1,,xn in the interval [a,b]. To each x in [a,b] there corresponds a point ξx in (a,b) such that f(x)p(x)=1(n+1)!f(n+1)(ξx)i=0n(xxi).


Hermite Interpolation

Hermite Interpolation Theorem. Let n0, and suppose that xi, i=0,,n, are distinct real numbers. Then, given two sets of real numbers yi, i=0,,n, and zi, i=0,,n, there is a unique polynomial p2n+1 in P2n+1 such that p2n+1(xi)=yi,p2n+1(xi)=zi,i=0,,n.(6.13)

Remark. In general, if there is a set of m distinct points, there is a unique polynomial of degree 2m1 that interpolates the function at those points for hermite interpolation.

"Introduction to Numerical Analysis", page. 188.

Theorem. Suppose that n0 and let f be a real-valued function, defined, continuous and 2n+2 times differentiable on the interval [a,b], such that f(2n+2) is continuous on [a,b]. Further, let p2n+1 denote the Hermite interpolation polynomial of f defined by (6.16). Then, for each x[a,b] there exists ξ=ξ(x) in (a,b) such that f(x)p2n+1(x)=f(2n+2)(ξ)(2n+2)![πn+1(x)]2,(6.17) where πn+1 is as defined in (6.9). Moreover, |f(x)p2n+1(x)|M2n+2(2n+2)![πn+1(x)]2,(6.18) where M2n+2=maxζ[a,b]|f(2n+2)(ζ)|.

"Introduction to Numerical Analysis", page. 190.

Theorem on Hermite Interpolation Error Estimate Let x0,x1,,xn be distinct nodes in [a,b] and let fC2n+2[a,b]. If p is the polynomial of degree at most 2n+1 such that p(xi)=f(xi)andp(xi)=f(xi)(0in) then to each x in [a,b] there corresponds a point ξ in (a,b) such that f(x)p(x)=f(2n+2)(ξ)(2n+2)!i=0n(xxi)2

"Numerical Analysis: Mathematics of Scientific Computing" page. 344.


Change of Intervals

Suppose that a numerical integration formula is given: cdf(t)dti=0nAif(ti). Let us assume that it is exact for all polynomials of degree m or less. If we want to integrate the function f over the interval [a,b], we can use the change of interval formula. We first define a linear function λ of t such that if t traverses [c,d], λ(t) will traverse [a,b]. The function λ is given explicitly by λ(t)=badct+adbcdc. Now in the integral abf(x)dx we make the change of variable x=λ(t). Then dx=λ(t)dt=(ba)(dc)1dt, and so we have abf(x)dx=badccdf(λ(t))dt badci=0nAif(λ(ti)) Hence, we have abf(x)dxbadci=0nAif(badcti+adbcdc) Observe that, because λ is linear, f(λ(t)) is a polynomial in t if f is a polynomial, and the degrees are the same. Hence, the new formula is exact for polynomials of degree m. For Simpson’s rule, this procedure yields Equation (6) as a consequence of Equation (5) using λ(t)=(ba)t+a.

"Numerical Analysis: Mathematics of Scientific Computing," page. 486.


Gaussian Quadrature

The theory can be formulated for quadrature rules of a slightly more general form; namely, abf(x)w(x)dxi=0nAif(xi), where w is a fixed positive weight function.

The Gauss quadrature rule: (10.6)abw(x)f(x)dxGn(f)=k=0nWkf(xk), where the quadrature weights are (10.7)Wk=abw(x)[Lk(x)]2dx, and the quadrature points xk,k=0,,n, are chosen as the zeros of the polynomial of degree n+1 from a system of orthogonal polynomials over the interval (a,b) with respect to the weight function w. Since this quadrature rule was obtained by exact integration of the Hermite interpolation polynomial of degree 2n+1 for f, it gives the exact result whenever f is a polynomial of degree 2n+1 or less.

"Introduction to Numerical Analysis", page. 279.


Error Analysis for Gaussian Quadrature

Given the Gaussian quadrature formula abf(x)dx=i=1nwif(xi)+Rn(f), where Rn(f)=(ba)2n+1(n!)4(2n+1)[(2n)!]3f(2n)(c).

Theorem. Suppose that w is a weight function, defined, integrable, continuous and positive on (a,b), and that f is defined and continuous on [a,b]; suppose further that f has a continuous derivative of order 2n+2 on [a,b], n0. Then, there exists a number η in (a,b) such that (10.18)abw(x)f(x)dxk=0nWkf(xk)=Knf(2n+2)(η), and Kn=1(2n+2)!abw(x)[πn+1(x)]2dx. Consequently, the integration formula (10.6), (10.7) will give the exact result for every polynomial of degree 2n+1.

"Introduction to Numerical Analysis," page. 282.

Theorem on Gaussian Formula with Error Term Consider a Gaussian formula with error term: abf(x)w(x)dx=i=0n1Aif(xi)+E For an fC2n[a,b], we have E=f(2n)(ξ)(2n)!abq2(x)w(x)dx where a<ξ<b and q(x)=i=0n1(xxi).

Proof. By Hermite interpolation (Section 6.3, p. 338), a polynomial p of degree at most 2n1 exists such that p(xi)=f(xi),p(xi)=f(xi)(0in1) The error formula for this interpolation is f(x)p(x)=f(2n)(ζ(x))q2(x)(2n)!(11) as given in Theorem 2 of Section 6.3 (p. 344). Here q(x) is the polynomial of degree n which is w-orthogonal to the polynomials of degree at most n1. It follows that abf(x)w(x)dxabp(x)w(x)dx=1(2n)!abf(2n)(ζ(x))q2(x)w(x)dx Now use the fact that p is of degree at most 2n1 to see that abp(x)w(x)dx=i=0n1Aip(xi)=i=0n1Aif(xi) Furthermore, the Mean-Value Theorem for Integrals can be used to write abf(2n)(ζ(x))q2(x)w(x)dx=f(2n)(ξ)abq2(x)w(x)dx This requires continuity of f(2n)(ζ(x)), which can be inferred from Equation (11). The desired error formula results from making the substitution described.

"Numerical Analysis: Mathematics of Scientific Computing" page. 498.

Theorem 8. Let w(x) be a weight function and pn(x) be the monic polynomial of degree n orthogonal to all polynomials of smaller degrees. Let xj, j=1,2,,n, be zeros of pn(x). Let Q(f) be the quadrature rule defined in Eqs. (36) and (37). Suppose fC2n[a,b]. Then one can find λ(a,b) such that abf(x)w(x)dx=Q(f)+γnf(2n)(λ)(2n)!,whereγn=abpn2(x)w(x)dx.(43)

Proof. We use the Hermite interpolation to prove the error estimate. There exists a unique polynomial h2n1(x) of degree 2n1 such that h2n1(xj)=f(xj)andh2n1(xj)=f(xj). In addition, there exists ζ(a,b) depending on x such that f(x)=h2n1(x)+f(2n)(ζ)(2n)!(xx1)2(xxn)2. Note that (xx1)(xxn)=pn(x) as pn is monic and xj, j=1,,n, are its roots. Therefore, abf(x)w(x)dx=abh2n1(x)w(x)dx+abf(2n)(ζ(x))(2n)!pn2(x)w(x)dx. Since the quadrature is exact for polynomials of degree 2n1, it is exact for h2n1(x). Hence abh2n1(x)w(x)dx=Q(h2n1)=j=1nh2n1(xj)wj=j=1nf(xj)wj=Q(f). On the other hand, because pn2(x)w(x)0 on [a,b], we can apply the mean value theorem and get abf(2n)(ζ(x))(2n)!pn2(x)w(x)dx=f(2n)(λ)(2n)!abpn2(x)w(x)dx. for some λ(a,b). This completes the proof.


Composite Formulae

Definition(Composite trapezium rule). abf(x)dxh[12f(x0)+f(x1)++f(xm1)+12f(xm)].

Introduction to Numerical Analysis page. 209.

Definition (Composite Simpson rule). abf(x)dxh3[f(x0)+4f(x1)+2f(x2)+4f(x3)++2f(x2m2)+4f(x2m1)+f(x2m)].

Introduction to Numerical Analysis page. 210.

Definition (Composite Midpoint Rule). The composite midpoint rule is the composite Gauss formula with w(x)1 and n=0 defined by abf(x)dxhj=1mf(a+(j12)h). This follows from the fact that when n=0 there is one quadrature point ξ0=0 in (1,1), which is at the midpoint of the interval, and the corresponding quadrature weight W0 is equal to the length of the interval (1,1), i.e., W0=2. It follows from (10.24) with n=0 and C0=11t2dt=23 that the error in the composite midpoint rule is E0,m=(ba)324m2f(η), where η(a,b), provided that the function f has a continuous second derivative on [a,b].

"Introduction to Numerical Analysis", page. 286.


θ-Method

Theta methods. We consider methods of the form yn+1=yn+h[θf(tn,yn)+(1θ)f(tn+1,yn+1)],n=0,1,, where θ[0,1] is a parameter: - If θ=1, we recover Euler's method. - If θ(0,1), then the theta method (3.3) is implicit: Each time step requires the solution of N (in general, nonlinear) algebraic equations for the unknown vector yn+1. - The choices θ=0 and θ=12 are known as Backward Euler: yn+1=yn+hf(tn+1,yn+1), Trapezoidal rule: yn+1=yn+12h[f(tn,yn)+f(tn+1,yn+1)]. Solution of nonlinear algebraic equations can be done by iteration. For example, for backward Euler, letting yn+1[0]=yn, we may use $\textit{Direct iteration}$: yn+1[j+1]=yn+hf(tn+1,yn+1[j]); $\textit{Newton–Raphson}$: yn+1[j+1]=yn+1[j][Ihf(tn,yn)y]1[yn+1[j]ynhf(tn+1,yn+1[j])]; $\textit{Modified Newton–Raphson}$: yn+1[j+1]=yn+1[j][Ihf(tn,yn)y]1[yn+1[j]ynhf(tn+1,yn+1[j])].


Error

In numerically solving a differential equation, several types of errors arise. These are conveniently classified as follows:

Local Truncation Error. The local truncation error is the error made in a single step of a numerical method.

{Local Roundoff Error. The local roundoff error is the error made in a single step of a numerical method due to roundoff.

Global Truncation Error. The global truncation error is the accumulation of the local truncation errors in the previous steps.

Global Roundoff Error. The global roundoff error is the accumulation of the local roundoff errors in the previous steps.

Total Error. The total error is the sum of the global truncation error and the global roundoff error.

"Numerical Analysis: Mathematics of Scientific Computing" page. 533.


Richardson Extrapolation

Definition.  Richardson extrapolation is a method for improving the accuracy of a numerical method. Given a numerical method that approximates a quantity with error E(h) for a step size h, Richardson extrapolation constructs a new approximation with error E(h2) by combining two approximations with step sizes h and h/2.


Initial value problems for ODEs

Our model is an initial-value problem written in the form (1){x=f(t,x)x(t0)=x0 Here x is an unknown function of t that we hope to construct from the information given in Equations (1), where x=dx(t)dt. The second of the two equations in (1) specifies one particular value of the function x(t). The first equation gives the slope.

"Numerical Analysis: Mathematics of Scientific Computing" page.524.

Existence

$\textbf{First Existence Theorem, Initial-Value Problem}$ If f is continuous in a rectangle R centered at (t0,x0), say (4)R={(t,x):|tt0|α,|xx0|β} then the initial-value problem above has a solution x(t) for |tt0|min(α,β/M), where M is the maximum of |f(t,x)| in the rectangle R.

"Numerical Analysis: Mathematics of Scientific Computing" page 525.

$\textbf{Second Existence Theorem, Initial-Value Problem.}$ If f is continuous in the strip atb,<x< and satisfies there an inequality (5)|f(t,x1)f(t,x2)|L|x1x2| then the initial-value problem (1) has a unique solution in the interval [a,b].

"Numerical Analysis: Mathematics of Scientific Computing" page.526.

Lipschitz Condition

Definition: A function f:RnR satisfies the Lipschitz condition on a domain DRn if there exists a constant L0 such that for all x,yD: f(x)f(y)Lxy where denotes a norm (commonly the Euclidean norm). And L is called the Lipschitz constant.

Note. Hence, inequality (5) is the Lipschitz condition.

Uniqueness

Uniqueness Theorem, Initial-Value Problem. If f and f/x are continuous in the rectangle R defined by $\textbf{(4)}$, then the initial-value problem (1) has a unique solution in the interval |tt0|<min(α,β/M).

$\textbf{Theorem (Picard's Theorem).}$ Suppose that the real-valued function (x,y)f(x,y) is continuous in the rectangular region D defined by x0xXM, y0Cyy0+C; that |f(x,y0)|K when x0xXM; and that f satisfies the Lipschitz condition: there exists L>0 such that |f(x,u)f(x,v)|L|uv|for all(x,u)D,(x,v)D. Assume further that (12.3)CKL(eL(XMx0)1). Then, there exists a unique function yC1[x0,XM] such that y(x0)=y0 and y=f(x,y) for x[x0,XM]; moreover, |y(x)y0|C,x0xXM.

"Introduction to Numerical Analysis", page. 311.

One-step methods

Definition. Generally, a one-step method may be written in the form (12.13)yn+1=yn+hΦ(xn,yn;h),n=0,1,,N1,y(x0)=y0, where Φ(,;) is a continuous function of its variables.

"Introduction to Numerical Analysis", page. 317.

Definition. In order to find the accuracy of the numerical method (12.13), we define the global error, en, by en=y(xn)yn.

Definition. The truncation error, Tn, is defined by (12.14)Tn=y(xn+1)y(xn)hΦ(xn,y(xn);h).

"Introduction to Numerical Analysis", page. 317.

Theorem. Consider the general one-step method (12.13) where, in addition to being a continuous function of its arguments, Φ is assumed to satisfy a Lipschitz condition with respect to its second argument, that is, there exists a positive constant LΦ such that, for 0hh0 and for all (x,u) and (x,v) in the rectangle D={(x,y):x0xXM,|yy0|C}, we have that |Φ(x,u;h)Φ(x,v;h)|LΦ|uv|.(12.15) Then, assuming that |yny0|C, n=1,2,,N, it follows that |en|TLΦ(eLΦ(xnx0)1),n=0,1,,N,(12.16) where T=max0nN1|Tn|.

"Introduction to Numerical Analysis", page. 318.

$\textbf{Euler's method.}$ Given that y(x0)=y0, let us suppose that we have already calculated yn, up to some n, 0nN1, N1; we define yn+1=yn+hf(xn,yn). Thus, taking in succession n=0,1,,N1, one step at a time, the approximate values yn at the mesh points xn can be easily obtained. This numerical method is known as Euler's method.

"Introduction to Numerical Analysis", page. 317.


Consistency

Definition. The numerical method (12.13) is consistent with the differential equation (12.1) if the truncation error, defined by (12.14), is such that for any ϵ>0 there exists a positive h(ϵ) for which |Tn|<ϵ for 0<h<h(ϵ) and any pair of points (xn,y(xn)),(xn+1,y(xn+1)) on any solution curve in D.

"Introduction to Numerical Analysis", page. 321.

Theorem. Suppose that the initial value problem (12.1), (12.2) satisfies the conditions of Picard's Theorem, and also that its approximation generated from (12.13) when hh0 lies in the region D. Assume further that the function Φ(,;) is continuous on D×[0,h0], and satisfies the consistency condition (12.21) and the Lipschitz condition (12.22)|Φ(x,u;h)Φ(x,v;h)|LΦ|uv|onD×[0,h0]. Then, if successive approximation sequences (yn), generated by using the mesh points xn=x0+nh, n=1,2,,N, are obtained from (12.13) with successively smaller values of h, each h less than h0, we have convergence of the numerical solution to the solution of the initial value problem in the sense that limnyn=y(x)asxnx[x0,XM]whenh0andn.

"Introduction to Numerical Analysis", page. 322.

Adams-Bashforth Formula

Suppose the resulting formula is of the following type: xn+1=xn+afn+bfn1+cfn2+ where fi denotes f(ti,xi). An equation of this type is called an Adams-Bashforth formula.

Note. The Adams-Bashforth formula is an explicit method since it does not include $f_{n+1}$.

Adams-Bashforth Formula of Order $5$

xn+1=xn+h720[1901fn2774fn1+2616fn21274fn3+251fn4]

Adams-Moulton Formula

Definition. Suppose the resulting formula is of the following type: xn+1=xn+afn+1+bfn+cfn+ where fi denotes f(ti,xi). An equation of this type is called an Adams-Moulton formula.

Note. The Adams-Moulton formula is an implicit method since it includes fn+1.

Adams-Moulton Formula of Order $5$

xn+1=xn+h720[251fn+1+646fn264fn1+106fn219fn3]

"Numerical Analysis: Mathematics of Scientific Computing", page. 551.

Linear Multistep Methods

Linear $k$-step method

The format of any such method is (9)akxn+ak1xn1++a0xnk=h[bkfn+bk1fn1++b0fnk]. Or in general, it can be written as j=0kajxn+j=hj=0kbjf(tn+j,xn+j), where ak0. This is called a k-step method.

Note. The left side of the equation (9) is called the corresponding characteristic polynomial of the method.

Note

Examples.

"Introduction to Numerical Analysis" page. 331.

Order

The accuracy of a numerical solution of a differential equation is largely determined by the order of the algorithm used.

Definition.The order indicates how many terms in a Taylor-series solution are being simulated by the method.

For example, the Adams-Bashforth method: $x_{n+1} = x_n + \frac{h}{720} \left[ 1901 f_n - 2774 f_{n-1} + 2616 f_{n-2} - 1274 f_{n-3} + 251 f_{n-4} \right]$ is said to be of order 5 because it produces approximately the same accuracy as the Taylor-series method with terms h,h2,h3,h4, and h5. The error can then be expected to be O(h6) in each step of a solution using the above method.

In order to have better understanding of the order of the Multistep methods, we will define the following function Lx=i=0k[aix(ih)hbix(ih)]a. However, in the following analysis, we assume that x is represented by its Taylor series at t=0. By using the Taylor series for x, one can express L in this form: (11)Lx=d0x(0)+d1hx(0)+d2h2x(0)+ To compute the coefficients, di, in Equation (11), we write the Taylor series for x and x: x(ih)=j=0(ih)jj!x(j)(0) x(ih)=j=0(ih)jj!x(j+1)(0) These series are then substituted into Equation (10). Rearranging the result in powers of h, we obtain an equation having the form of (11), with these values of di: (12)d0=i=0kaid1=i=0k(iaibi)d2=i=0k(12i2aiibi)dj=i=0k(ijj!aiij1(j1)!bi)(j1) Here we use 0!=1 and i0=1.

Theorem on Multistep Method Properties. These three properties of the multistep method (9) are equivalent:

  1. d0=d1==dm=0
  2. Lp=0 for each polynomial p of degree m
  3. Lx is O(hm+1) for all xCm+1

"Numerical Analysis: Mathematics of Scientific Computing", page. 552 & 553.

Alternate Definition of Order

Now, we can define the order of the multistep method as follows: The order of the multistep method in Equation (9) is the unique natural number m such that d0=d1==dm=0dm+1.

EXAMPLE 1 What is the order of the method described by the following equation? xnxn2=13h(fn+4fn1+fn2). Solution. Firstly, we rewrite the equation in the form of Equation (9): xnxn2=13h(fn+4fn1+fn2)xn+0xn1xn2=h(13fn+43fn1+13fn2) And we have to match it to a2xn+a1xn1+a0xn2=h[b2fn+b1fn1+b0fn2]. The vector (a0,a1,a2) is (1,0,1) and the vector (b0,b1,b2) is (13,43,13). Thus, the di are d0=a0+a1+a2=0d1=b0+(a1b1)+(2a2b2)=0d2=(12a1b1)+(2a22b2)=0d3=(16a112b1)+(43a22b2)=0d4=(124a116b1)+(23a243b2)=0d5=(1120a1124b1)+(415a223b2)=190 The order of the method is 4.

"Numerical Analysis: Mathematics of Scientific Computing", page 554.

Root Condition

Definition. The roots of the first characteristic polynomial lie in the closed unit disc and those on the unit circle are simple is often called the Root Condition.

Stability

Definition. The roots of $\rho(z)$ of modulus one are called essential roots. The root z=1 is called the principal root. The roots of ρ(z) of modulus <1 are called nonessential roots.

Weakly Stability

Definition. A linear multistep method is strongly stable if all roots of ρ(z) are 1 in magnitude and only one root has magnitude one. If more than one root has magnitude one, the method is called weakly or conditionally stable.

Note. We still require only simple roots of magnitude one. Also, note these definitions refer to the case h=0.

Strong Stability

Definition. A multistep method is said to be strongly stable if p(1)=0,p(1)0, and all other roots of p satisfy the inequality |z|<1.

"Numerical Analysis: Mathematics of Scientific Computing", page 564.

Absolute Stability

Definition. A linear multistep method is said to be absolutely stable for a given value of λh if each root zr=zr(λh) of the associated stability polynomial π(;λh) satisfies |zr(λh)|<1.

Region of Absolute Stability

Definition. The region of absolute stability of a linear multistep method is the set of all points λh in the complex plane for which the method is absolutely stable.

A-stable

Definition. A linear multistep method is said to be A-stable if its region of absolute stability contains the negative (left) complex half-plane.

Second Barrier Theorem:

"Introduction to Numerical Analysis", page 348.


Local Truncation Error, Multistep Method

Theorem (Theorem on Local Truncation Error, Multistep Method). If the multistep method (2) is of order m, if xCm+2, and if f/x is continuous, then under the hypotheses of the preceding paragraph, (13)x(tn)xn=(dm+1ak)hm+1x(m+1)(tnk)+O(hm+2) (The coefficients dk are defined in Section 8.4, p. 553.)

"Numerical Analysis: Mathematics of Scientific Computing", page 560.

Proof It suffices to prove the equation when n=k, because xn can be interpreted as the value of a numerical solution that began at the point tnk. Using the linear functional L, Equation (10) in Section 8.4 (p. 552), we have ((14))Lx=i=0k[aix(ti)hbix(ti)]=i=0k[aix(ti)hbif(ti,x(ti))] On the other hand, the numerical solution satisfies the equation ((15))0=i=0k[aixihbif(ti,xi)] Because we have assumed that xi=x(ti) for i<k, the result of subtracting Equation (15) from Equation (14) will be ((16))Lx=ak[x(tk)xk]hbk[f(tk,x(tk))f(tk,xk)] To the last term of Equation (16) we apply the Mean-Value Theorem, obtaining ((17))Lx=ak[x(tk)xk]hbkfx(tk,ξ)[x(tk)xk]=[akhbkF][x(tk)xk] where F=f(tk,ξ)x for some ξ between x(tk) and xk. Now recall from Equation (13) in Section 8.4 (p. 553) that if the method being used is of order m, then Lx will have the form ((18))Lx=dm+1hm+1x(m+1)(t0)+O(hm+2) Combining Equations (17) and (18), we obtain (13). Here we can ignore hbkF in the denominator. (See Problem 8.5.8, p. 564.)

"Numerical Analysis: Mathematics of Scientific Computing", page 563.


Schur Form

(Schur Decomposition). For any square matrix A, there exists a unitary matrix U such that UAU=T where T is upper triangular.

Remark. That is, every square matrix is similar to an upper-triangular matrix.


Hermite Matrix

Definition. A matrix A is called a Hermite matrix if it is equal to its conjugate transpose, that is, A=A.

Theorem. Every Hermitian n×n matrix A can be diagonalized by a unitary matrix, UAU=Λ, where U is unitary and Λ is a diagonal matrix.

Proof. Since we know that the hermite matrix has the schur form of A=UTU, where T is upper triangular. Since A=A, we have A=UTU=(UTU)=UTU=UTU. Thus, T=T. Since T is upper triangular, we have T is lower triangular. Thus, T=T is diagonal.

Gram-Schmidt

QR Factorization

Reduced QR Factorization

Definition [Reduced QR factorization]. Let AFm×n (mn). There is an orthonormal matrix Q^Fm×n and a square upper triangular matrix R^Fn×n such that A=Q^R^.

There are several ways to compute the QR factorization of a matrix.

Householder Transformation

Example. Let A=[a1a2a3]=[200110111]. Compute a QR factorization by using Householder Transformation.

Solution. Let v1=a1+[a1200]=[2+211]., we define our first Householder reflector as H1=I2v1v1v1v1=[12+22121212112(2+2)12(2+2)1212(2+2)112(2+2)]=[2/21/21/21/2(2+2)/4(22)/41/2(22)/4(2+2)/4] Then, we have a new matrix M2 as M2=H1A=[12+22121212112(2+2)12(2+2)1212(2+2)112(2+2)][200110111]=[21120112+212+20112+2112(2+2)]=[211/202/2(22)/202/2(2+2)/4] Now we pick vector a2=[2/2,2/2]T, and we have a22=1. Now, we define v2=a2+[10]=[2/2+12/2]. Then, we define our second reduced Householder reflector as H2=I2v2v2v2v2=[12+2222221222]=[2/22/22/22/2] And we extend H2 to a 3×3 matrix H2 as H2=[10002/22/202/22/2] Then we have M3=H2M2 as M3=H2M2=[10002/22/202/22/2][211/202/2(22)/202/2(2+2)/4]=[211/2011/2001/2]. Given that for any Householder reflector H, we have H1=H (Proved in the next problem). Thus, we have QR factorization as [200110111]=[211/202/2(22)/202/2(2+2)/4][10002/22/202/22/2][211/2011/2001/2]=[2/22/201/21/22/21/21/22/2][211/2011/2001/2].

LU Decomposition

The LU decomposition of a matrix A is a factorization of A into a product of a lower triangular matrix L and an upper triangular matrix U. The LU decomposition with partial pivoting is given by PA=LU where P is a permutation matrix, L is a lower triangular matrix with ones on the diagonal, and U is an upper triangular matrix. The LU decomposition is computed using Gaussian elimination with partial pivoting. One thing we need to know is when the LU decomposition exists and the following proposition gives us the answer.

Proposition 5. Let A=(aij) be an n×n matrix and for k=1,,n1, let A(k) denote the k×k submatrix of A consisting of the first k rows and k columns of A. If A(k) is invertible for k=1,2,,n1, then A has a unique LU-decomposition.

LU Decomposition provides a good explanation of the LU decomposition.

Bauer-Fike Theorem

Consider a diagonalizable matrix ACn×n with non-singular eigenvector matrix VCn×n, such that A=VΛV1, where Λ is a diagonal matrix. If XCn×n is invertible, its condition number in the p-norm is denoted by κp(X) and defined as: κp(X)=|X|p|X1|p

Theorem.  Let A be an n×n matrix with a complete set of linearly independent eigenvectors and suppose the V1AV=D where V is non-singular and D is diagonal. Let δA be a perturbation of A and let μ be an eigenvalue of A+δA. Then A has an eigenvalue λ such that |μλ|κp(V)δAp,1p where κp(V) is the p-norm condition number of V.

The Bauer-Fike Theorem:

The Bauer–Fike theorem is a powerful result, and its proof involves several key concepts from linear algebra and perturbation theory. Let's outline the main steps of the proof:

Theorem [Bauer-Fike Theorem] Suppose A,ECm×m and A=XΛX1, where Λ=[λ1λm]. Then for any eigenvalue λ~ of A~=A+E, there exists λi, an eigenvalue of A, such that |λ~λi|κ2(X)E2, where κ2(X)=X2X12.

Proof. Suppose 0xCm is an eigenvector of A~ corresponding to λ~. Then for y=X1x, (A+Eλ~I)x=0(XΛX1λ~I+E)x=0(Λλ~I+X1EX)y=0, and from which (Λλ~I)y=X1EXy. Then σmin(Λλ~I)y2(Λλ~I)y2=X1EXy2X1EX2y2. Since Λλ~I is diagonal, σmin(Λλ~I)=minj|λjλ~|. Suppose the minimum is attained when j=i. Then |λiλ~|X1EX2κ2(X)E2.

"Lecture note No. 22"


Norms

Definition (Forbenius norm). For a matrix ACm×n, the Forbenius norm is defined as AF=i=1mj=1n|aij|2.

Definition (p-norm). For a matrix ACm×n, the p-norm is defined as Ap=maxx0Axpxp=maxxp=1Axp, where x is a vector in Cn.

Theorem. For any ACm×n and unitary QCm×m, we have QA2=A2,QAF=AF.

"Numerical Linear Algebra", Page. 23

Theorem. For any ACm×n and unitary QCn×n, we have AQ2=A2,AQF=AF.


Positive Definite

The following definitions all involve the term xMx. Notice that this is always a real number for any Hermitian square matrix M. An n×n Hermitian complex matrix M is said to be positive-definite if xMx>0 for all non-zero x in Cn. Formally, Mpositive-definitexMx>0 for all xCn{0} An n×n Hermitian complex matrix M is said to be positive semi-definite or non-negative-definite if xMx0 for all x in Cn. Formally, Mpositive semi-definitexMx0 for all xCn

Theorem. A matrix M is positive-definite if and only if it satisfies any of the following equivalent conditions:


Gerschgorin’s Theorem

Theorem [Row version Gerschgorin’s Theorem] Let ACm×m. Consider m disks in the complex plane: Di={z|zaii|ji|aij|=|ai1|++|ai,i1|+|ai,i+1|++|aim|}. Then we have the following results.


Theorem [Column version Gerschgorin’s Theorem] Let ACm×m. Consider m disks in the complex plane: D~i={z|zaii|ji|aji|=|a1i|++|ai1,i|+|ai+1,i|++|ami|}. Then we have the following results.

Theorem [Row version Generalized Gerschgorin’s Theorem] Let ACm×m. Consider m disks in the complex plane: Di={z|zaii|1|di|ji|aij||dj|}, where d1,,dn are arbitrary nonzero numbers. Then we have the following results.

"Lecture note No. 22"


Rayleigh Quotient

Definition. Let ACm×m. For a given vector 0xCm, the Rayleigh quotient of x is the scalar defined by r(x)=xAxxx

Algorithm: Rayleigh quotient iteration

Choose an initial vector x(0)Cm with x(0)2=1.

Compute μ(0)=(x(0))Ax(0).

For k=1,2,

End


Upper Hessenberg Matrix

Definition. A matrix H=[hij]Cm×m is upper Hessenberg if hij=0 for all ij+2, or equivalently, H=[h11h12h1mh21h22h2mhm,m1hmm] H is called unreduced upper Hessenberg if h21,,hm,m10. Otherwise H is reduced upper Hessenberg.

References