My name is Hanzhang Yin, and I developed this website as a resource for reviewing key concepts in numerical analysis. It is not yet complete, so feel free to email me at hanyin@ku.edu with any errors or suggestions for improvement.
\[ x = \pm (0. a_1\cdots a_n\cdots)_2\times 2^m \] where \(a_1 = 1\)
\(\textbf{Note. }\)For Marc-32, \(n = 24\), \(-126\leq m\leq 127\). For Marc-64, \(n = 53\), \(-1021\leq m\leq 1024\).
The machine epsilon \(\varepsilon\) is the smallest positive machine number such that \[ fl(1 + \varepsilon) > 1. \] For Marc-32, \(\varepsilon = 2^{-24} \approx 10^{-8}\); for Marc-64, \(\varepsilon = 2^{-53} \approx 10^{-16}\).
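As a concrete check, the unit roundoff can be located by testing when \(fl(1 + \varepsilon) > 1\) first fails. A minimal Python sketch (assuming IEEE double precision, so the expected values are \(2^{-52}\) and \(2^{-53}\)):

```python
# Halve eps until adding eps/2 to 1.0 is rounded away, so that
# fl(1 + eps) > 1 but fl(1 + eps/2) = 1.
eps = 1.0
while 1.0 + eps / 2 > 1.0:
    eps /= 2
print(eps)        # 2.220446049250313e-16 = 2**-52 (machine epsilon)
print(eps / 2)    # 1.1102230246251565e-16 = 2**-53 (unit roundoff)
```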
Let \(\delta = \frac{fl(x) - x}{x}\) be the relative rounding error; we show that \(\left|\delta\right|\leq \varepsilon\). Suppose that \(x = \pm (0.a_1\cdots a_n\cdots)_2\times 2^m\), and pick the two adjacent machine numbers \(x_-\leq x\leq x_+\): \[ x_- = (0.a_1\cdots a_n)_2\times 2^m \] \[ x_+ = \left((0.a_1\cdots a_n)_2 + 2^{-n}\right)\times 2^m \] Rounding to nearest gives \(fl(x) = x_-\) if \(a_{n+1} = 0\) (\(x\) is closer to \(x_-\) than to \(x_+\)) and \(fl(x) = x_+\) if \(a_{n+1} = 1\) (\(x\) is closer to \(x_+\) than to \(x_-\)). In either case \[ |fl(x) - x|\leq \frac{1}{2}(x_+ - x_-) = \frac{1}{2}\cdot 2^{-n}\cdot 2^m \] so that, since \((0.a_1\cdots a_n)_2\geq \frac{1}{2}\), \[ \left|\frac{fl(x) - x}{x}\right|\leq \frac{\frac{1}{2}\, 2^{-n}\, 2^m}{(0.a_1\cdots a_n)_2\times 2^m} \leq 2^{-n}. \]
The unit roundoff is \[ \varepsilon = \frac{1}{2}\beta^{1-t} \] where \(t\) is the number of digits in the mantissa.
Let \(x, y\) be two machine numbers and let \(\circ\) be a binary operation. Then \[ fl(x\circ y) = (x\circ y)(1 + \delta), \] where \(|\delta|\leq\varepsilon\) is the roundoff error. In the same way, for any real \(x\) in the machine's range, \[ fl(x) = x(1 + \delta). \]
Let \(\{x_n\}\) be a sequence of real numbers tending to a limit \( x^* \). We say that the rate of convergence is at least linear if there are a constant \( c \lt 1 \) and an integer \( N \) such that \[ |x_{n+1} - x^*| \le c |x_n - x^*| \quad (n \ge N) \] We say that the rate of convergence is at least superlinear if there exist a sequence \( \epsilon_n \) tending to 0 and an integer \( N \) such that \[ |x_{n+1} - x^*| \le \epsilon_n |x_n - x^*| \quad (n \ge N) \] The convergence is at least quadratic if there are a constant \( C \) (not necessarily less than 1) and an integer \( N \) such that \[ |x_{n+1} - x^*| \le C |x_n - x^*|^2 \quad (n \ge N) \] In general, if there are positive constants \( C \) and \( \alpha \) and an integer \( N \) such that \[ |x_{n+1} - x^*| \le C |x_n - x^*|^\alpha \quad (n \ge N) \] we say that the rate of convergence is of order \(\alpha\) at least.
"Numerical Analysis: Mathematics of Scientific Computing" page. 17.
Definition. Newton's method, also known as the Newton-Raphson method, is an iterative root-finding algorithm used to find successively better approximations to the roots (or zeros) of a real-valued function. Given a function \( f(x) \) and its derivative \( f'(x) \), Newton's method starts with an initial guess \( x_0 \) and iteratively refines this guess using the formula \[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \] where \( x_n \) is the current approximation and \( x_{n+1} \) is the next approximation.
Convergence Criteria. The method typically converges if the initial guess \( x_0 \) is sufficiently close to the actual root, and \( f(x) \) is continuously differentiable in the neighborhood of the root.
Example. Consider finding the root of the function \( f(x) = x^2 - 2 \). The derivative of the function is \( f'(x) = 2x \). Firstly, we start with an initial guess \( x_0 \). Secondly, we apply Newton's iteration: \[ x_{n+1} = x_n - \frac{x_n^2 - 2}{2x_n} = \frac{2x_n - \frac{x_n^2 - 2}{x_n}}{2} = \frac{x_n + \frac{2}{x_n}}{2} \] So, the iterative formula becomes: \[ x_{n+1} = \frac{1}{2} \left( x_n + \frac{2}{x_n} \right) \] By repeatedly applying this formula, the sequence \( \{x_n\} \) converges to the square root of 2.
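A minimal Python sketch of the iteration (the function names and stopping rule are my own choices):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton's method: iterate x <- x - f(x)/f'(x) until |f(x)| <= tol."""
    x = x0
    for _ in range(max_iter):
        x = x - f(x) / fprime(x)
        if abs(f(x)) <= tol:
            break
    return x

# Root of f(x) = x^2 - 2 starting from x0 = 1: converges to sqrt(2)
print(newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0))  # 1.4142135623730951
```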
Let \( r \) be a solution of \( f(x) = 0 \) (i.e. \( f(r) = 0 \)). Suppose that we have already computed \( x_n \) and the error in \( x_n \) is \(e_n = |x_n - r| \). We now derive a formula that relates the error after the next step, \( |x_{n+1} - r| \), to \( |x_n - r| \). According to Taylor's theorem, there is a \( c \) between \( x_n \) and \( r \) such that \[ 0 = f(r) = f(x_n) + f'(x_n)(r - x_n) + \frac{1}{2}f''(c)(r - x_n)^2 \quad (1) \] By the definition of Newton's method, we have \[ 0 = f(x_n) + f'(x_n)(x_{n+1} - x_n) \quad (2) \] Subtracting (2) from (1) gives \[ \begin{align} 0 &= f'(x_n)(r - x_{n+1}) + \frac{1}{2}f''(c)(r - x_n)^2\\ x_{n+1} - r &= \frac{f''(c)}{2f'(x_n)}(x_n - r)^2 \\ e_{n+1} &= |x_{n+1} - r| = \left| \frac{f''(c)}{2f'(x_n)} \right||x_n - r|^2 \\ \end{align} \] If \( x_n \) is close to \( r \), then \( c \), which must be between \( x_n \) and \( r \), is also close to \( r \) and \[ e_{n+1} = |x_{n+1} - r| \approx \left| \frac{f''(r)}{2f'(r)} \right||x_n - r|^2 = \left| \frac{f''(r)}{2f'(r)}\right| e_n^2 = Ce_n^2 \]
Thus, we can see that Newton's Method is quadratically convergent.
However, a disadvantage of Newton's method is that convergence is guaranteed only when the initial value is close enough to the root; a poor starting point may cause the iteration to diverge.
Theorem. Let \(f''\) be continuous and let \(r\) be a simple zero of \(f\). Then there is a neighborhood of \(r\) and a constant \(C\) such that if Newton's method is started in that neighborhood, the successive points become steadily closer to \(r\) and satisfy \[ |x_{n+1} - r| \leq C (x_n- r)^2 \quad (n \geq 0) \]
$\textbf{Theorem on Newton's Method for a Convex Function. }$ If \( f \) belongs to \( C^2(\mathbb{R}) \), is increasing, is convex, and has a zero, then the zero is unique, and the Newton iteration will converge to it from any starting point.
$\textbf{Convergence of Newton's method.}$ Assume that \( f(x) \), \( f'(x) \), and \( f''(x) \) are continuous for all \( x \) in some neighborhood of \( \alpha \) and \( f(\alpha) = 0 \) and \( f'(\alpha) \neq 0 \). If \( x_0 \) is sufficiently close to \( \alpha \), then \( x_n \to \alpha \) as \( n \to \infty \). Moreover, \[ \lim_{n \to \infty} \frac{|x_{n+1} - \alpha|}{|x_n - \alpha|^2} = \frac{1}{2} \left| \frac{f''(\alpha)}{f'(\alpha)} \right|. \]
Horner's algorithm is also known as Horner's method, nested multiplication, or synthetic division.
\(\textbf{Definition. }\) Horner's algorithm is a method for evaluating a polynomial at a point. Given a polynomial \( p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \), Horner's algorithm evaluates \( p(x) \) at a point \( x = c \) as follows: \[ p(c) = a_0 + c(a_1 + c(a_2 + \cdots + c(a_{n-1} + c a_n) \cdots)). \]
Given a polynomial \( p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \). Then \[ \begin{align} b_{n-1} &= a_n \\ b_{n-2} &= a_{n-1} + z_0 b_{n-1} \\ &\vdots \\ b_0 &= a_1 + z_0 b_1 \\ p(z_0) &= a_0 + z_0 b_0 \end{align} \] Horner's algorithm evaluates \( p(x) \) at a point \( x = z_0 \) as follows: \[ \begin{array}{cccccc} & a_n & a_{n-1} & a_{n-2} & \cdots & a_0 \\ z_0 & & z_0 b_{n-1} & z_0 b_{n-2} & \cdots & z_0 b_0 \\ & b_{n-1} & b_{n-2} & b_{n-3} & \cdots & \boxed{b_{-1}} \\ \end{array} \] The boxed number satisfies \(p(z_0) = b_{-1}\).
\(\textbf{Example. }\) Given the polynomial \( p(x) = 4x^3 - 3x^2 + 2x - 1 \) and \( z_0 = 2 \), evaluate \( p(z_0) \). \[ \begin{array}{ccccc} & 4 & -3 & 2 & -1 \\ 2 & & 8 & 10 & 24 \\ & 4 & 5 & 12 & \boxed{23} \\ \end{array} \] Hence, we have \( p(2) = 23 \).
\(\textbf{Theorem on Horner's Method. }\) Let \( p(x) = a_n x^n + \cdots + a_1 x + a_0 \). Define pairs \((\alpha_j, \beta_j)\) for \( j = n, n-1, \ldots, 0 \) by the algorithm \[ \begin{cases} (\alpha_n, \beta_n) = (a_n, 0) \\ (\alpha_j, \beta_j) = (a_j + x \alpha_{j+1}, \alpha_{j+1} + x \beta_{j+1}) & (n-1 \ge j \ge 0) \end{cases} \] Then \(\alpha_0 = p(x)\) and \(\beta_0 = p'(x)\).
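This theorem translates directly into code. A Python sketch carrying the pair \((\alpha, \beta)\), checked against the example above, where \(p(2) = 23\) and \(p'(2) = 38\):

```python
def horner_with_derivative(coeffs, x):
    """coeffs = [a_n, ..., a_1, a_0] in descending powers.
    Returns (p(x), p'(x)) via the pair recurrence
    (alpha, beta) <- (a_j + x*alpha, alpha + x*beta)."""
    alpha, beta = coeffs[0], 0
    for a in coeffs[1:]:
        alpha, beta = a + x * alpha, alpha + x * beta
    return alpha, beta

# p(x) = 4x^3 - 3x^2 + 2x - 1 at x = 2
print(horner_with_derivative([4, -3, 2, -1], 2))  # (23, 38)
```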
Newton's method and Steffensen's method are examples of procedures whereby a sequence of points is computed by a formula of the form \[ x_{n+1} = F(x_n) \quad (n \geq 0) \tag*{(1)} \] The algorithm defined by such an equation is called functional iteration.
"Numerical Analysis: Mathematics of Scientific Computing" page. 100.
\(\textbf{Definition. }\) A mapping \( F \) of a metric space \( (X, d) \) into itself is called a contraction if there exists a constant \( k \) with \( 0 \leq k \lt 1 \) such that \[ d(F(x), F(y)) \leq k d(x, y) \] for all \( x, y \) in \( X \).
"Numerical Analysis: Mathematics of Scientific Computing" page. 101.
Theorem (Contractive Mapping Theorem). Let \( C \) be a closed subset of the real line. If \( F \) is a contractive mapping of \( C \) into \( C \), then \( F \) has a unique fixed point. Moreover, this fixed point is the limit of every sequence obtained from Equation (1) with a starting point \( x_0 \in C \).
"Numerical Analysis: Mathematics of Scientific Computing" page. 102.
\(\textbf{Theorem. }\)Let \( f \) be a function in \( C^{n+1}[a, b] \), and let \( p \) be the polynomial of degree at most \( n \) that interpolates the function \( f \) at \( n + 1 \) distinct points \( x_0, x_1, \ldots, x_n \) in the interval \([a, b]\). To each \( x \) in \([a, b]\) there corresponds a point \( \xi_x \) in \((a, b)\) such that \[ f(x) - p(x) = \frac{1}{(n + 1)!} f^{(n+1)}(\xi_x) \prod_{i=0}^{n} (x - x_i) . \]
The Lagrange form of the interpolation polynomial is a method for constructing a polynomial that passes through a given set of points. Specifically, given a set of \( n+1 \) distinct data points \((x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)\), the Lagrange interpolation polynomial \( P(x) \) is defined as: \[ P(x) = \sum_{i=0}^{n} y_i \ell_i(x) \] where \( \ell_i(x) \) are the Lagrange basis polynomials given by: \[ \ell_i(x) = \prod_{\substack{0 \le j \le n \\ j \ne i}} \frac{x - x_j}{x_i - x_j} \] Each \( \ell_i(x) \) is a polynomial of degree \( n \) that is 1 at \( x = x_i \) and 0 at \( x = x_j \) for \( j \ne i \).
\(\textbf{Example. }\) Given points: \( (1, 1), (2, 4), (3, 9) \). Here \(x_0 = 1, x_1 = 2, x_2 = 3\), and \(y_0 = 1, y_1 = 4, y_2 = 9\).
Compute the Lagrange basis polynomials \( \ell_i(x) \): \[ \ell_0(x) = \frac{(x-2)(x-3)}{(1-2)(1-3)} = \frac{(x-2)(x-3)}{2} \] \[ \ell_1(x) = \frac{(x-1)(x-3)}{(2-1)(2-3)} = - (x-1)(x-3) \] \[ \ell_2(x) = \frac{(x-1)(x-2)}{(3-1)(3-2)} = \frac{(x-1)(x-2)}{2} \]
Construct the interpolation polynomial \( P(x) \): \[ P(x) = 1 \cdot \ell_0(x) + 4 \cdot \ell_1(x) + 9 \cdot \ell_2(x). \] \[ P(x) = 1 \cdot \frac{(x-2)(x-3)}{2} - 4 \cdot (x-1)(x-3) + 9 \cdot \frac{(x-1)(x-2)}{2}. \] Simplifying, we get: \[ P(x) = \frac{(x-2)(x-3)}{2} - 4(x-1)(x-3) + \frac{9(x-1)(x-2)}{2}. \] Expanding, this collapses to \( P(x) = x^2 \), as expected, since all three data points lie on \( y = x^2 \).
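A short Python sketch that evaluates the Lagrange form directly from the basis polynomials (written for clarity rather than efficiency):

```python
def lagrange_eval(xs, ys, t):
    """Evaluate the Lagrange interpolation polynomial at t."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        ell = 1.0                       # the basis polynomial ell_i(t)
        for j, xj in enumerate(xs):
            if j != i:
                ell *= (t - xj) / (xi - xj)
        total += yi * ell
    return total

xs, ys = [1, 2, 3], [1, 4, 9]           # data from the example (on y = x^2)
print([lagrange_eval(xs, ys, t) for t in (1.5, 2.5)])  # [2.25, 6.25] = t^2
```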
The divided differences of a set of data points are a sequence of numbers that are used to construct the Newton form of the interpolation polynomial.
\[f[x_i, x_{i+1}, \ldots, x_{i+j}] = \frac{f[x_{i+1}, x_{i+2}, \ldots, x_{i+j}] - f[x_i, x_{i+1}, \ldots, x_{i+j-1}]}{x_{i+j} - x_i}\] The following table shows the divided differences for a set of data points: \[\begin{array}{ccccc} x_0 & f[x_0] & f[x_0, x_1] & f[x_0, x_1, x_2] & f[x_0, x_1, x_2, x_3] \\ x_1 & f[x_1] & f[x_1, x_2] & f[x_1, x_2, x_3] & \\ x_2 & f[x_2] & f[x_2, x_3] & & \\ x_3 & f[x_3] & & & \end{array} \]
\(\textbf{Example. }\) Given the following table of data points: \[ \begin{array}{c|cccc} x & 3 & 1 & 5 & 6 \\ \hline f(x) & 1 & -3 & 2 & 4 \\ \end{array} \] Solution: We arrange the given table vertically and compute divided differences by the use of Formula above, arriving at \[ \begin{array}{ccccc} 3 & 1 & 2 & -3/8 & 7/40 \\ 1 & -3 & 5/4 & 3/20 & \\ 5 & 2 & 2 & & \\ 6 & 4 & & & \\ \end{array} \]
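The table can be reproduced with a short Python sketch of the divided-difference recursion (exact rational arithmetic avoids roundoff in the table entries):

```python
from fractions import Fraction

def divided_differences(xs, ys):
    """Return the top table row f[x_0], f[x_0,x_1], ..., f[x_0,...,x_n]."""
    n = len(xs)
    col = [Fraction(y) for y in ys]     # zeroth-order differences
    top = [col[0]]
    for j in range(1, n):
        col = [(col[i + 1] - col[i]) / (xs[i + j] - xs[i])
               for i in range(n - j)]   # j-th order differences
        top.append(col[0])
    return top

print(divided_differences([3, 1, 5, 6], [1, -3, 2, 4]))
# [Fraction(1, 1), Fraction(2, 1), Fraction(-3, 8), Fraction(7, 40)]
```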
Theorem on Permutations in Divided Differences. The divided difference is a symmetric function of its arguments. Thus, if \((z_0, z_1, \ldots, z_n)\) is a permutation of \((x_0, x_1, \ldots, x_n)\), then \[ f[z_0, z_1, \ldots, z_n] = f[x_0, x_1, \ldots, x_n]. \]
Theorem on Derivatives and Divided Differences. If \( f \) is \( n \) times continuously differentiable on \([a, b]\) and if \( x_0, x_1, \ldots, x_n \) are distinct points in \([a, b]\), then there exists a point \( \xi \) in \((a, b)\) such that \[ f[x_0, x_1, \ldots, x_n] = \frac{1}{n!} f^{(n)}(\xi). \]
The Newton form of the interpolation polynomial is a method for constructing a polynomial that passes through a given set of points. Specifically, given a set of \( n+1 \) distinct data points \((x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)\), the Newton interpolation polynomial \( P(x) \) is defined as: \[ P(x) = a_0 + a_1(x - x_0) + a_2(x - x_0)(x - x_1) + \cdots + a_n(x - x_0)(x - x_1) \cdots (x - x_{n-1}) \] where \( a_0, a_1, \ldots, a_n \) are the divided differences of the data points.
\(\textbf{Remark. }\)Here, \(a_n = f[x_0, x_1, \ldots, x_n]\).
\(\textbf{Example. }\) Given the following table of data points: \[ \begin{array}{c|cccc} x & 3 & 1 & 5 & 6 \\ \hline f(x) & 1 & -3 & 2 & 4 \\ \end{array} \] We already found the divided differences in the previous example. The Newton form of the interpolation polynomial is \[ p(x) = 1 + 2(x - 3) - \frac{3}{8}(x - 3)(x - 1) + \frac{7}{40}(x - 3)(x - 1)(x - 5). \]
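Evaluating the Newton form is again a nested multiplication. A sketch that checks the polynomial above reproduces the data:

```python
def newton_eval(xs, coeffs, t):
    """Evaluate a_0 + a_1(t-x_0) + ... + a_n(t-x_0)...(t-x_{n-1})
    by nested multiplication, working from the innermost term outward."""
    p = coeffs[-1]
    for a, x in zip(coeffs[-2::-1], xs[-2::-1]):
        p = p * (t - x) + a
    return p

xs = [3, 1, 5, 6]
coeffs = [1, 2, -3/8, 7/40]             # divided differences from above
print([newton_eval(xs, coeffs, t) for t in xs])  # ~[1, -3, 2, 4]
```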
\(\textbf{Theorem. }\)Let \( f \) be a function in \( C^{n+1}[a, b] \), and let \( p \) be the polynomial of degree at most \( n \) that interpolates the function \( f \) at \( n + 1 \) distinct points \( x_0, x_1, \ldots, x_n \) in the interval \([a, b]\). To each \( x \) in \([a, b]\) there corresponds a point \( \xi_x \) in \((a, b)\) such that \[ f(x) - p(x) = \frac{1}{(n + 1)!} f^{(n+1)}(\xi_x) \prod_{i=0}^{n} (x - x_i) . \]
Hermite Interpolation Theorem. Let \( n \geq 0 \), and suppose that \( x_i \), \( i = 0, \ldots, n \), are distinct real numbers. Then, given two sets of real numbers \( y_i \), \( i = 0, \ldots, n \), and \( z_i \), \( i = 0, \ldots, n \), there is a unique polynomial \( p_{2n+1} \) in \( \mathcal{P}_{2n+1} \) such that \[ p_{2n+1}(x_i) = y_i, \quad p'_{2n+1}(x_i) = z_i, \quad i = 0, \ldots, n. \quad (6.13) \]
Remark. In general, given \(m\) distinct points together with function values and first derivatives at those points, Hermite interpolation produces a unique polynomial of degree at most \(2m-1\).
"Introduction to Numerical Analysis", page. 188.
Theorem. Suppose that \( n \geq 0 \) and let \( f \) be a real-valued function, defined, continuous and \( 2n + 2 \) times differentiable on the interval \([a, b]\), such that \( f^{(2n+2)} \) is continuous on \([a, b]\). Further, let \( p_{2n+1} \) denote the Hermite interpolation polynomial of \( f \) defined by (6.16). Then, for each \( x \in [a, b] \) there exists \( \xi = \xi(x) \) in \((a, b)\) such that \[ f(x) - p_{2n+1}(x) = \frac{f^{(2n+2)}(\xi)}{(2n+2)!} [\pi_{n+1}(x)]^2, \quad (6.17) \] where \( \pi_{n+1} \) is as defined in (6.9). Moreover, \[ |f(x) - p_{2n+1}(x)| \leq \frac{M_{2n+2}}{(2n+2)!} [\pi_{n+1}(x)]^2, \quad (6.18) \] where \( M_{2n+2} = \max_{\zeta \in [a,b]} |f^{(2n+2)}(\zeta)| \).
"Introduction to Numerical Analysis", page. 190.
Theorem on Hermite Interpolation Error Estimate Let \( x_0, x_1, \ldots, x_n \) be distinct nodes in \([a, b]\) and let \( f \in C^{2n+2}[a, b] \). If \( p \) is the polynomial of degree at most \( 2n + 1 \) such that \[ p(x_i) = f(x_i) \quad \text{and} \quad p'(x_i) = f'(x_i) \quad (0 \leq i \leq n) \] then to each \( x \) in \([a, b]\) there corresponds a point \( \xi \) in \((a, b)\) such that \[ f(x) - p(x) = \frac{f^{(2n+2)}(\xi)}{(2n+2)!} \prod_{i=0}^{n} (x - x_i)^2 \]
"Numerical Analysis: Mathematics of Scientific Computing" page. 344.
Suppose that a numerical integration formula is given: \[ \int_c^d f(t) \, dt \approx \sum_{i=0}^n A_i f(t_i). \] Let us assume that it is exact for all polynomials of degree \( m \) or less. If we want to integrate the function \( f \) over the interval \([a, b]\), we can use the change of interval formula. We first define a linear function \(\lambda\) of \( t \) such that if \( t \) traverses \([c, d]\), \(\lambda(t)\) will traverse \([a, b]\). The function \(\lambda\) is given explicitly by \[ \lambda(t) = \frac{b - a}{d - c} t + \frac{ad - bc}{d - c}. \] Now in the integral \[ \int_a^b f(x) \, dx \] we make the change of variable \( x = \lambda(t) \). Then \( dx = \lambda'(t) \, dt = (b-a)(d-c)^{-1} \, dt \), and so we have \[ \int_a^b f(x) \, dx = \frac{b - a}{d - c} \int_c^d f(\lambda(t)) \, dt \] \[ \approx \frac{b - a}{d - c} \sum_{i=0}^n A_i f(\lambda(t_i)) \] Hence, we have \[ \int_a^b f(x) \, dx \approx \frac{b - a}{d - c} \sum_{i=0}^n A_i f \left( \frac{b - a}{d - c} t_i + \frac{ad - bc}{d - c} \right) \] Observe that, because \(\lambda\) is linear, \( f(\lambda(t)) \) is a polynomial in \( t \) if \( f \) is a polynomial, and the degrees are the same. Hence, the new formula is exact for polynomials of degree \( m \). For Simpson’s rule, this procedure yields Equation (6) as a consequence of Equation (5) using \( \lambda(t) = (b - a)t + a \).
"Numerical Analysis: Mathematics of Scientific Computing," page. 486.
The theory can be formulated for quadrature rules of a slightly more general form; namely, \[ \int_a^b f(x) w(x) \, dx \approx \sum_{i=0}^n A_i f(x_i), \] where \( w \) is a fixed positive weight function.
The Gauss quadrature rule: \[ \int_a^b w(x) f(x) \, dx \approx G_n(f) = \sum_{k=0}^{n} W_k f(x_k), \tag*{(10.6)} \] where the quadrature weights are \[ W_k = \int_a^b w(x) [L_k(x)]^2 \, dx, \tag*{(10.7)} \] and the quadrature points \( x_k, \, k = 0, \ldots, n \), are chosen as the zeros of the polynomial of degree \( n + 1 \) from a system of orthogonal polynomials over the interval \( (a, b) \) with respect to the weight function \( w \). Since this quadrature rule was obtained by exact integration of the Hermite interpolation polynomial of degree \( 2n + 1 \) for \( f \), it gives the exact result whenever \( f \) is a polynomial of degree \( 2n + 1 \) or less.
"Introduction to Numerical Analysis", page. 279.
Consider the Gaussian quadrature formula with weight \( w(x) \equiv 1 \) (Gauss-Legendre) and \( n \) quadrature points: \[ \int_a^b f(x) dx = \sum_{i=1}^{n} w_i f(x_i) + R_n(f), \] where the error term is \[ R_n(f) = \frac{(b-a)^{2n+1}(n!)^4}{(2n+1)[(2n)!]^3} f^{(2n)}(c) \] for some \( c \in (a, b) \).
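For the weight \(w(x) \equiv 1\) the points and weights are available in NumPy, and the exactness degree can be checked directly. A sketch using three points, i.e. exact through degree 5:

```python
import numpy as np

nodes, weights = np.polynomial.legendre.leggauss(3)   # 3-point rule on [-1, 1]

a, b = 0.0, 2.0
x = 0.5 * (b - a) * nodes + 0.5 * (b + a)             # linear change of interval
w = 0.5 * (b - a) * weights

# Exact through degree 5: integral of x^5 over [0, 2] is 32/3
print(np.sum(w * x**5), 32.0 / 3.0)                   # agree to machine precision
# Degree 6 is no longer exact: integral of x^6 over [0, 2] is 128/7
print(np.sum(w * x**6), 128.0 / 7.0)                  # close but not exact
```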
Theorem. Suppose that \( w \) is a weight function, defined, integrable, continuous and positive on \( (a,b) \), and that \( f \) is defined and continuous on \( [a,b] \); suppose further that \( f \) has a continuous derivative of order \( 2n+2 \) on \( [a,b] \), \( n \geq 0 \). Then, there exists a number \( \eta \) in \( (a,b) \) such that \[ \int_a^b w(x)f(x)dx - \sum_{k=0}^n W_k f(x_k) = K_n f^{(2n+2)}(\eta), \tag{10.18} \] and \[ K_n = \frac{1}{(2n+2)!} \int_a^b w(x) [\pi_{n+1}(x)]^2 dx. \] Consequently, the integration formula (10.6), (10.7) will give the exact result for every polynomial of degree \( 2n + 1 \).
"Introduction to Numerical Analysis," page. 282.
Theorem on Gaussian Formula with Error Term Consider a Gaussian formula with error term: \[ \int_a^b f(x)w(x) \, dx = \sum_{i=0}^{n-1} A_i f(x_i) + E \] For an \( f \in C^{2n}[a, b] \), we have \[ E = \frac{f^{(2n)}(\xi)}{(2n)!} \int_a^b q^2(x) w(x) \, dx \] where \( a < \xi < b \) and \( q(x) = \prod_{i=0}^{n-1} (x - x_i) \).
Proof. By Hermite interpolation (Section 6.3, p. 338), a polynomial \( p \) of degree at most \( 2n - 1 \) exists such that \[ p(x_i) = f(x_i), \quad p'(x_i) = f'(x_i) \quad (0 \leq i \leq n-1) \] The error formula for this interpolation is \[ f(x) - p(x) = \frac{f^{(2n)}(\zeta(x)) q^2(x)}{(2n)!} \quad (11) \] as given in Theorem 2 of Section 6.3 (p. 344). Here \(q(x)\) is the polynomial of degree \(n\) which is \(w\)-orthogonal to the polynomials of degree at most \(n-1\). It follows that \[ \int_a^b f(x) w(x) \, dx - \int_a^b p(x) w(x) \, dx = \frac{1}{(2n)!} \int_a^b f^{(2n)}(\zeta(x)) q^2(x) w(x) \, dx \] Now use the fact that \( p \) is of degree at most \( 2n - 1 \) to see that \[ \int_a^b p(x) w(x) \, dx = \sum_{i=0}^{n-1} A_i p(x_i) = \sum_{i=0}^{n-1} A_i f(x_i) \] Furthermore, the Mean-Value Theorem for Integrals can be used to write \[ \int_a^b f^{(2n)}(\zeta(x)) q^2(x) w(x) \, dx = f^{(2n)}(\xi) \int_a^b q^2(x) w(x) \, dx \] This requires continuity of \( f^{(2n)}(\zeta(x)) \), which can be inferred from Equation (11). The desired error formula results from making the substitution described. \(\blacksquare\)
"Numerical Analysis: Mathematics of Scientific Computing" page. 498.
Theorem 8. Let \( w(x) \) be a weight function and \( p_n(x) \) be the monic polynomial of degree \( n \) orthogonal to all polynomials of smaller degrees. Let \( x_j \), \( j = 1, 2, \ldots, n \), be zeros of \( p_n(x) \). Let \( Q(f) \) be the quadrature rule defined in Eqs. (36) and (37). Suppose \( f \in C^{2n}[a, b] \). Then one can find \( \lambda \in (a, b) \) such that \[ \int_a^b f(x) w(x) \, dx = Q(f) + \gamma_n \frac{f^{(2n)}(\lambda)}{(2n)!}, \quad \text{where} \quad \gamma_n = \int_a^b p_n^2(x) w(x) \, dx. \quad (43) \]
Proof. We use the Hermite interpolation to prove the error estimate. There exists a unique polynomial \( h_{2n-1}(x) \) of degree \( 2n - 1 \) such that \[ h_{2n-1}(x_j) = f(x_j) \quad \text{and} \quad h'_{2n-1}(x_j) = f'(x_j). \] In addition, there exists \( \zeta \in (a, b) \) depending on \( x \) such that \[ f(x) = h_{2n-1}(x) + \frac{f^{(2n)}(\zeta)}{(2n)!} (x - x_1)^2 \cdots (x - x_n)^2. \] Note that \( (x - x_1) \cdots (x - x_n) = p_n(x) \) as \( p_n \) is monic and \( x_j \), \( j = 1, \ldots, n \), are its roots. Therefore, \[ \int_a^b f(x) w(x) \, dx = \int_a^b h_{2n-1}(x) w(x) \, dx + \int_a^b \frac{f^{(2n)}(\zeta(x))}{(2n)!} p_n^2(x) w(x) \, dx. \] Since the quadrature is exact for polynomials of degree \( 2n - 1 \), it is exact for \( h_{2n-1}(x) \). Hence \[ \int_a^b h_{2n-1}(x) w(x) \, dx = Q(h_{2n-1}) = \sum_{j=1}^n h_{2n-1}(x_j) w_j = \sum_{j=1}^n f(x_j) w_j = Q(f). \] On the other hand, because \( p_n^2(x) w(x) \geq 0 \) on \([a, b]\), we can apply the mean value theorem and get \[ \int_a^b \frac{f^{(2n)}(\zeta(x))}{(2n)!} p_n^2(x) w(x) \, dx = \frac{f^{(2n)}(\lambda)}{(2n)!} \int_a^b p_n^2(x) w(x) \, dx. \] for some \( \lambda \in (a, b) \). This completes the proof. \(\square\)
Definition (Composite trapezium rule). With \(h = (b-a)/m\) and \(x_j = a + jh\), \[ \int_a^b f(x) \, dx \approx h \left[ \frac{1}{2} f(x_0) + f(x_1) + \cdots + f(x_{m-1}) + \frac{1}{2} f(x_m) \right]. \]
"Introduction to Numerical Analysis", page. 209.
Definition (Composite Simpson rule). With \(h = (b-a)/(2m)\) and \(x_j = a + jh\), \[ \int_a^b f(x) \, dx \approx \frac{h}{3} \left[ f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + \cdots + 2f(x_{2m-2}) + 4f(x_{2m-1}) + f(x_{2m}) \right]. \]
"Introduction to Numerical Analysis", page. 210.
Definition (Composite Midpoint Rule). The composite midpoint rule is the composite Gauss formula with \(w(x) \equiv 1\) and \(n = 0\), defined by \[ \int_a^b f(x) \, dx \approx h \sum_{j=1}^m f\left(a + \left(j - \frac{1}{2}\right) h \right), \qquad h = \frac{b-a}{m}. \] This follows from the fact that when \(n = 0\) there is one quadrature point \(\xi_0 = 0\) in \((-1,1)\), which is at the midpoint of the interval, and the corresponding quadrature weight \(W_0\) is equal to the length of the interval \((-1,1)\), i.e., \(W_0 = 2\). It follows from (10.24) with \(n = 0\) and \[ C_0 = \int_{-1}^1 t^2 \, dt = \frac{2}{3} \] that the error in the composite midpoint rule is \[ E_{0,m} = \frac{(b-a)^3}{24m^2} f''(\eta), \] where \(\eta \in (a, b)\), provided that the function \(f\) has a continuous second derivative on \([a, b]\).
"Introduction to Numerical Analysis", page. 286.
Theta methods. We consider methods of the form \[ y_{n+1} = y_n + h [ \theta f(t_n, y_n) + (1 - \theta) f(t_{n+1}, y_{n+1}) ], \quad n = 0, 1, \ldots, \] where \( \theta \in [0, 1] \) is a parameter. If \( \theta = 1 \), we recover Euler's method. If \( \theta \in [0, 1) \), the theta method (3.3) is implicit: each time step requires the solution of \( N \) (in general, nonlinear) algebraic equations for the unknown vector \( y_{n+1} \). The choices \( \theta = 0 \) and \( \theta = \frac{1}{2} \) are known as, respectively, Backward Euler: \[ y_{n+1} = y_n + h f(t_{n+1}, y_{n+1}), \] and the Trapezoidal rule: \[ y_{n+1} = y_n + \frac{1}{2} h [ f(t_n, y_n) + f(t_{n+1}, y_{n+1}) ]. \] The nonlinear algebraic equations can be solved by iteration. For example, for backward Euler, letting \( y_{n+1}^{[0]} = y_n \), we may use $\textit{Direct iteration}$: \[ y_{n+1}^{[j+1]} = y_n + h f(t_{n+1}, y_{n+1}^{[j]}); \] $\textit{Newton–Raphson}$: \[ y_{n+1}^{[j+1]} = y_{n+1}^{[j]} - \left[ I - h \frac{\partial f(t_{n+1}, y_{n+1}^{[j]})}{\partial y} \right]^{-1} \left[ y_{n+1}^{[j]} - y_n - h f(t_{n+1}, y_{n+1}^{[j]}) \right]; \] $\textit{Modified Newton–Raphson}$, in which the Jacobian is frozen at \((t_n, y_n)\): \[ y_{n+1}^{[j+1]} = y_{n+1}^{[j]} - \left[ I - h \frac{\partial f(t_n, y_n)}{\partial y} \right]^{-1} \left[ y_{n+1}^{[j]} - y_n - h f(t_{n+1}, y_{n+1}^{[j]}) \right]. \]
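A minimal sketch of backward Euler with direct iteration on the scalar problem \(y' = -10y\), \(y(0) = 1\) (the inner iteration count and the test problem are my own choices; the iteration contracts here since \(hL = 0.01 < 1\)):

```python
import math

def backward_euler(f, t0, y0, h, n_steps, n_inner=20):
    """Backward Euler; the implicit equation is solved by direct iteration
    y[j+1] = y_n + h f(t_{n+1}, y[j]), started from y[0] = y_n."""
    t, y = t0, y0
    for _ in range(n_steps):
        t_next, y_new = t + h, y
        for _ in range(n_inner):
            y_new = y + h * f(t_next, y_new)
        t, y = t_next, y_new
    return y

# y' = -10y, y(0) = 1 on [0, 1]
print(backward_euler(lambda t, y: -10.0 * y, 0.0, 1.0, 0.001, 1000))  # ~4.77e-05
print(math.exp(-10.0))          # exact value ~4.54e-05; the error is O(h)
```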
In numerically solving a differential equation, several types of errors arise. These are conveniently classified as follows:
Local Truncation Error. The local truncation error is the error made in a single step of a numerical method.
Local Roundoff Error. The local roundoff error is the error made in a single step of a numerical method due to roundoff.
Global Truncation Error. The global truncation error is the accumulation of the local truncation errors in the previous steps.
Global Roundoff Error. The global roundoff error is the accumulation of the local roundoff errors in the previous steps.
Total Error. The total error is the sum of the global truncation error and the global roundoff error.
"Numerical Analysis: Mathematics of Scientific Computing" page. 533.
\(\textbf{Definition. }\) Richardson extrapolation is a method for improving the accuracy of a numerical approximation. Given a method whose error has an asymptotic expansion in powers of the step size \(h\), Richardson extrapolation combines two approximations, computed with step sizes \(h\) and \(h/2\), so that the leading error term cancels, producing an approximation of higher order.
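As an illustration, here is a sketch applying one step of Richardson extrapolation to the central difference \(D(h) = \frac{f(x+h)-f(x-h)}{2h}\), whose error expansion contains only even powers of \(h\), so the combination \((4D(h/2) - D(h))/3\) has error \(\mathcal{O}(h^4)\):

```python
import math

def central_diff(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)

f, x, h = math.sin, 1.0, 0.1
d1 = central_diff(f, x, h)          # error O(h^2)
d2 = central_diff(f, x, h / 2)      # error O((h/2)^2)
d_extrap = (4 * d2 - d1) / 3        # leading h^2 term cancels: error O(h^4)

exact = math.cos(1.0)
print(abs(d1 - exact), abs(d2 - exact), abs(d_extrap - exact))
# the extrapolated value is several orders of magnitude more accurate
```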
Our model is an initial-value problem written in the form \[ \begin{cases} x' = f(t, x) \\ x(t_0) = x_0 \quad \tag{1} \end{cases} \] Here \( x \) is an unknown function of \( t \) that we hope to construct from the information given in Equations (1), where \( x' = \frac{dx(t)}{dt} \). The second of the two equations in (1) specifies one particular value of the function \( x(t) \). The first equation gives the slope.
"Numerical Analysis: Mathematics of Scientific Computing" page.524.
$\textbf{First Existence Theorem, Initial-Value Problem}$ If \( f \) is continuous in a rectangle \( R \) centered at \( (t_0, x_0) \), say \[ R = \{ (t, x) : |t - t_0| \leq \alpha, \quad |x - x_0| \leq \beta \}\tag{4} \] then the initial-value problem above has a solution \( x(t) \) for \( |t - t_0| \leq \min(\alpha, \beta / M) \), where \( M \) is the maximum of \( |f(t, x)| \) in the rectangle \( R \).
"Numerical Analysis: Mathematics of Scientific Computing" page 525.
$\textbf{Second Existence Theorem, Initial-Value Problem.}$ If \( f \) is continuous in the strip \( a \leq t \leq b, -\infty < x < \infty \) and satisfies there an inequality \[ |f(t, x_1) - f(t, x_2)| \leq L |x_1 - x_2| \tag{5} \] then the initial-value problem (1) has a unique solution in the interval \([a, b]\).
"Numerical Analysis: Mathematics of Scientific Computing" page.526.
Definition: A function \( f : \mathbb{R}^n \to \mathbb{R} \) satisfies the Lipschitz condition on a domain \( D \subset \mathbb{R}^n \) if there exists a constant \( L \geq 0 \) such that for all \( x, y \in D \): \[ \| f(x) - f(y) \| \leq L \| x - y \| \] where \( \| \cdot \| \) denotes a norm (commonly the Euclidean norm). And \( L \) is called the Lipschitz constant.
Note. Hence, inequality (5) is the Lipschitz condition.
Uniqueness Theorem, Initial-Value Problem. If \( f \) and \(\partial f / \partial x \) are continuous in the rectangle \( R \) defined by $\textbf{(4)}$, then the initial-value problem (1) has a unique solution in the interval \( |t - t_0| < \min(\alpha, \beta / M) \).
$\textbf{Theorem (Picard's Theorem).}$ Suppose that the real-valued function \((x,y) \mapsto f(x,y)\) is continuous in the rectangular region \(D\) defined by \(x_0 \leq x \leq X_M\), \(y_0 - C \leq y \leq y_0 + C\); that \(|f(x,y_0)| \leq K\) when \(x_0 \leq x \leq X_M\); and that \(f\) satisfies the Lipschitz condition: there exists \(L > 0\) such that \[ |f(x,u) - f(x,v)| \leq L|u - v| \quad \text{for all} \quad (x,u) \in D, \quad (x,v) \in D. \] Assume further that \[ C \geq \frac{K}{L} \left( e^{L(X_M - x_0)} - 1 \right). \tag{12.3} \] Then, there exists a unique function \(y \in C^1[x_0, X_M]\) such that \(y(x_0) = y_0\) and \(y' = f(x,y)\) for \(x \in [x_0, X_M]\); moreover, \[ |y(x) - y_0| \leq C, \quad x_0 \leq x \leq X_M. \]
"Introduction to Numerical Analysis", page. 311.
Definition. Generally, a one-step method may be written in the form \[ y_{n+1} = y_n + h \Phi (x_n, y_n; h), \quad n = 0, 1, \ldots, N-1,\qquad y(x_0) = y_0, \tag*{(12.13)} \] where \(\Phi (\cdot, \cdot; \cdot)\) is a continuous function of its variables.
"Introduction to Numerical Analysis", page. 317.
Definition. In order to find the accuracy of the numerical method (12.13), we define the global error, \( e_n \), by \[ e_n = y(x_n) - y_n. \]
Definition. The truncation error, \( T_n \), is defined by \[ T_n = \frac{y(x_{n+1}) - y(x_n)}{h} - \Phi(x_n, y(x_n); h). \tag*{(12.14)} \]
"Introduction to Numerical Analysis", page. 317.
Theorem. Consider the general one-step method (12.13) where, in addition to being a continuous function of its arguments, \(\Phi\) is assumed to satisfy a Lipschitz condition with respect to its second argument, that is, there exists a positive constant \( L_{\Phi} \) such that, for \( 0 \le h \le h_0 \) and for all \((x, u)\) and \((x, v)\) in the rectangle \[ D = \{(x, y): x_0 \le x \le X_M, |y - y_0| \le C\}, \] we have that \[ |\Phi(x, u; h) - \Phi(x, v; h)| \le L_{\Phi} |u - v|. \quad (12.15) \] Then, assuming that \( |y_n - y_0| \le C \), \( n = 1, 2, \ldots, N \), it follows that \[ |e_n| \le \frac{T}{L_{\Phi}} \left( e^{L_{\Phi} (x_n - x_0)} - 1 \right), \quad n = 0, 1, \ldots, N, \quad (12.16) \] where \( T = \max_{0 \le n \le N-1} |T_n| \).
"Introduction to Numerical Analysis", page. 318.
$\textbf{Euler's method.}$ Given that \( y(x_0) = y_0 \), let us suppose that we have already calculated \( y_n \), up to some \( n \), \( 0 \leq n \leq N - 1 \), \( N \geq 1 \); we define \[ y_{n+1} = y_n + h f(x_n, y_n). \] Thus, taking in succession \( n = 0, 1, \ldots, N - 1 \), one step at a time, the approximate values \( y_n \) at the mesh points \( x_n \) can be easily obtained. This numerical method is known as Euler's method.
"Introduction to Numerical Analysis", page. 317.
Definition. The numerical method (12.13) is consistent with the differential equation (12.1) if the truncation error, defined by (12.14), is such that for any \( \epsilon > 0 \) there exists a positive \( h(\epsilon) \) for which \( |T_n| < \epsilon \) for \( 0 < h < h(\epsilon) \) and any pair of points \((x_n, y(x_n)), (x_{n+1}, y(x_{n+1}))\) on any solution curve in \( D \).
"Introduction to Numerical Analysis", page. 321.
Theorem. Suppose that the initial value problem (12.1), (12.2) satisfies the conditions of Picard's Theorem, and also that its approximation generated from (12.13) when \( h \le h_0 \) lies in the region \( D \). Assume further that the function \(\Phi (\cdot, \cdot; \cdot)\) is continuous on \( D \times [0, h_0] \), and satisfies the consistency condition (12.21) and the Lipschitz condition \[ |\Phi(x, u; h) - \Phi(x, v; h)| \le L_{\Phi} |u - v| \quad \text{on} \quad D \times [0, h_0]. \tag*{(12.22)} \] Then, if successive approximation sequences \((y_n)\), generated by using the mesh points \( x_n = x_0 + nh \), \( n = 1, 2, \ldots, N \), are obtained from (12.13) with successively smaller values of \( h \), each \( h \) less than \( h_0 \), we have convergence of the numerical solution to the solution of the initial value problem in the sense that \[ \lim_{n \to \infty} y_n = y(x) \quad \text{as} \quad x_n \to x \in [x_0, X_M] \quad \text{when} \quad h \to 0 \quad \text{and} \quad n \to \infty. \]
"Introduction to Numerical Analysis", page. 322.
Integrating an interpolating polynomial for \(f\) over \([t_n, t_{n+1}]\) leads to a formula of the following type: \[ x_{n+1} = x_n + a\cdot f_n + b\cdot f_{n-1} + c\cdot f_{n-2} + \cdots \] where \( f_i \) denotes \( f(t_i, x_i) \) and the coefficients \(a, b, c, \ldots\) contain the step size \(h\). An equation of this type is called an Adams-Bashforth formula.
Note. The Adams-Bashforth formula is an explicit method since it does not include $f_{n+1}$.
The fifth-order Adams-Bashforth formula is \[x_{n+1} = x_n + \frac{h}{720} \left[ 1901 f_n - 2774 f_{n-1} + 2616 f_{n-2} - 1274 f_{n-3} + 251 f_{n-4} \right]\]
Definition. Suppose the resulting formula is of the following type: \[ x_{n+1} = x_n + a\cdot f_{n+1} + b\cdot f_{n} + c\cdot f_{n-1} + \cdots \] where \( f_i \) denotes \( f(t_i, x_i) \). An equation of this type is called an Adams-Moulton formula.
Note. The Adams-Moulton formula is an implicit method since it includes \( f_{n+1} \).
The fifth-order Adams-Moulton formula is \[ x_{n+1} = x_n + \frac{h}{720} \left[ 251 f_{n+1} + 646 f_n - 264 f_{n-1} + 106 f_{n-2} - 19 f_{n-3} \right] \]
"Numerical Analysis: Mathematics of Scientific Computing", page. 551.
The format of any such method is \[ a_k x_n + a_{k-1} x_{n-1} + \cdots + a_0 x_{n-k} = h \left[ b_k f_n + b_{k-1} f_{n-1} + \cdots + b_0 f_{n-k} \right]. \tag{9}\] Or in general, it can be written as \[\sum_{j=0}^{k} a_j x_{n+j} = h \sum_{j=0}^{k} b_j f(t_{n+j}, x_{n+j}), \] where \( a_k \neq 0 \). This is called a \( k \)-step method.
Note. The first characteristic polynomial of the method is \(\rho(z) = \sum_{j=0}^{k} a_j z^j\), formed from the coefficients on the left side of equation (9); the second characteristic polynomial is \(\sigma(z) = \sum_{j=0}^{k} b_j z^j\).
"Introduction to Numerical Analysis" page. 331.
The accuracy of a numerical solution of a differential equation is largely determined by the order of the algorithm used.
Definition. The order of a method indicates how many terms of a Taylor-series solution the method simulates.
For example, the Adams-Bashforth method: $x_{n+1} = x_n + \frac{h}{720} \left[ 1901 f_n - 2774 f_{n-1} + 2616 f_{n-2} - 1274 f_{n-3} + 251 f_{n-4} \right]$ is said to be of order 5 because it produces approximately the same accuracy as the Taylor-series method with terms \( h, h^2, h^3, h^4, \) and \( h^5 \). The error can then be expected to be \( \mathcal{O}(h^6) \) in each step of a solution using the above method.
In order to better understand the order of multistep methods, we define the linear functional \[ Lx = \sum_{i=0}^{k} \left[ a_i x(ih) - h b_i x'(ih) \right] \tag{10} \] In the following analysis, we assume that \( x \) is represented by its Taylor series at \( t = 0 \). By using the Taylor series for \( x \), one can express \( L \) in this form: \[ Lx = d_0 x(0) + d_1 h x'(0) + d_2 h^2 x''(0) + \cdots \tag{11} \] To compute the coefficients \( d_i \) in Equation (11), we write the Taylor series for \( x \) and \( x' \): \[ x(ih) = \sum_{j=0}^{\infty} \frac{(ih)^j}{j!} x^{(j)}(0) \] \[ x'(ih) = \sum_{j=0}^{\infty} \frac{(ih)^j}{j!} x^{(j+1)}(0) \] These series are then substituted into Equation (10). Rearranging the result in powers of \( h \), we obtain an equation having the form of (11), with these values of \( d_i \): \[ \begin{aligned} d_0 &= \sum_{i=0}^{k} a_i \\ d_1 &= \sum_{i=0}^{k} (i a_i - b_i) \\ d_2 &= \sum_{i=0}^{k} \left( \frac{1}{2} i^2 a_i - i b_i \right) \\ &\vdots \\ d_j &= \sum_{i=0}^{k} \left( \frac{i^j}{j!} a_i - \frac{i^{j-1}}{(j-1)!} b_i \right) \quad (j \geq 1) \end{aligned} \tag{12} \] Here we use \( 0! = 1 \) and \( i^0 = 1 \).
Theorem on Multistep Method Properties. These three properties of the multistep method (9) are equivalent: (i) \(d_0 = d_1 = \cdots = d_m = 0\); (ii) \(Lp = 0\) for every polynomial \(p\) of degree at most \(m\); (iii) \(Lx\) is \(\mathcal{O}(h^{m+1})\) for every \(x \in C^{m+1}\).
"Numerical Analysis: Mathematics of Scientific Computing", page. 552 & 553.
Now, we can define the order of the multistep method as follows: The order of the multistep method in Equation (9) is the unique natural number \( m \) such that \[ d_0 = d_1 = \cdots = d_m = 0 \neq d_{m+1}. \]
EXAMPLE 1 What is the order of the method described by the following equation? \[ x_n - x_{n-2} = \frac{1}{3}h(f_n + 4f_{n-1} + f_{n-2}). \] Solution. Firstly, we rewrite the equation in the form of Equation (9): \[ \begin{align} x_n - x_{n-2} &= \frac{1}{3}h(f_n + 4f_{n-1} + f_{n-2}) \\ x_n + 0\cdot x_{n-1} - x_{n-2} &= h\left(\frac{1}{3}f_n + \frac{4}{3}f_{n-1} + \frac{1}{3}f_{n-2}\right) \end{align} \] And we have to match it to \[ a_2 x_n + a_{1} x_{n-1} + a_0 x_{n-2} = h \left[ b_2 f_n + b_{1} f_{n-1} + b_0 f_{n-2} \right]. \] The vector \((a_0, a_1, a_2)\) is \((-1, 0, 1)\) and the vector \((b_0, b_1, b_2)\) is \(\left(\frac{1}{3}, \frac{4}{3}, \frac{1}{3}\right)\). Thus, the \( d_i \) are \[ \begin{align} d_0 &= a_0 + a_1 + a_2 = 0 \\ d_1 &= -b_0 + (a_1 - b_1) + (2a_2 - b_2) = 0 \\ d_2 &= \left( \frac{1}{2}a_1 - b_1 \right) + (2a_2 - 2b_2) = 0 \\ d_3 &= \left( \frac{1}{6}a_1 - \frac{1}{2}b_1 \right) + \left( \frac{4}{3}a_2 - 2b_2 \right) = 0 \\ d_4 &= \left( \frac{1}{24}a_1 - \frac{1}{6}b_1 \right) + \left( \frac{2}{3}a_2 - \frac{4}{3}b_2 \right) = 0 \\ d_5 &= \left( \frac{1}{120}a_1 - \frac{1}{24}b_1 \right) + \left( \frac{4}{15}a_2 - \frac{2}{3}b_2 \right) = -\frac{1}{90} \end{align} \] The order of the method is \(4\).
"Numerical Analysis: Mathematics of Scientific Computing", page 554.
Definition. The requirement that all roots of the first characteristic polynomial lie in the closed unit disc and that those on the unit circle are simple is often called the Root Condition.
Definition. The roots of $\rho(z)$ of modulus one are called essential roots. The root \( z = 1 \) is called the principal root. The roots of \(\rho(z)\) of modulus \( \lt 1 \) are called nonessential roots.
Definition. A linear multistep method is strongly stable if all roots of \(\rho(z)\) are \(\leq 1\) in magnitude and only one root has magnitude one. If more than one root has magnitude one, the method is called weakly or conditionally stable.
Note. We still require only simple roots of magnitude one. Also, note these definitions refer to the case \( h = 0 \).
Definition. A multistep method is said to be strongly stable if \( p(1) = 0, p'(1) \neq 0 \), and all other roots of \( p \) satisfy the inequality \( |z| < 1 \).
"Numerical Analysis: Mathematics of Scientific Computing", page 564.
Definition. A linear multistep method is said to be absolutely stable for a given value of \( \lambda h \) if each root \(z_r = z_r(\lambda h)\) of the associated stability polynomial \(\pi(\cdot; \lambda h)\) satisfies \(|z_r(\lambda h)| < 1\).
Definition. The region of absolute stability of a linear multistep method is the set of all points \(\lambda h\) in the complex plane for which the method is absolutely stable.
Definition. A linear multistep method is said to be A-stable if its region of absolute stability contains the negative (left) complex half-plane.
Second Barrier Theorem: no explicit linear multistep method is A-stable; the order of an A-stable (implicit) linear multistep method cannot exceed 2; and the second-order A-stable linear multistep method with the smallest error constant is the trapezoidal rule.
"Introduction to Numerical Analysis", page 348.
Theorem (Theorem on Local Truncation Error, Multistep Method). If the multistep method (2) is of order \( m \), if \( x \in C^{m+2} \), and if \( \partial f / \partial x \) is continuous, then under the hypotheses of the preceding paragraph, \[ x(t_n) - x_n = \left( \frac{d_{m+1}}{a_k} \right) h^{m+1} x^{(m+1)}(t_{n-k}) + \mathcal{O}(h^{m+2}) \tag*{(13)} \] (The coefficients \( d_k \) are defined in Section 8.4, p. 553.)
"Numerical Analysis: Mathematics of Scientific Computing", page 560.
Proof. It suffices to prove the equation when \( n = k \), because \( x_n \) can be interpreted as the value of a numerical solution that began at the point \( t_{n-k} \). Using the linear functional \( L \), Equation (10) in Section 8.4 (p. 552), we have \[ Lx = \sum_{i=0}^k [a_i x(t_i) - h b_i x'(t_i)] = \sum_{i=0}^k [a_i x(t_i) - h b_i f(t_i, x(t_i))] \tag{14} \] On the other hand, the numerical solution satisfies the equation \[ 0 = \sum_{i=0}^k [a_i x_i - h b_i f(t_i, x_i)] \tag{15} \] Because we have assumed that \( x_i = x(t_i) \) for \( i \lt k \), the result of subtracting Equation (15) from Equation (14) will be \[ Lx = a_k [x(t_k) - x_k] - h b_k [f(t_k, x(t_k)) - f(t_k, x_k)] \tag{16} \] To the last term of Equation (16) we apply the Mean-Value Theorem, obtaining \[ Lx = a_k [x(t_k) - x_k] - h b_k \frac{\partial f}{\partial x} (t_k, \xi) [x(t_k) - x_k] = [a_k - h b_k F] [x(t_k) - x_k] \tag{17} \] where \( F = \frac{\partial f(t_k, \xi)}{\partial x} \) for some \( \xi \) between \( x(t_k) \) and \( x_k \). Now recall from Equation (13) in Section 8.4 (p. 553) that if the method being used is of order \( m \), then \( Lx \) will have the form \[ Lx = d_{m+1} h^{m+1} x^{(m+1)} (t_0) + \mathcal{O}(h^{m+2}) \tag{18} \] Combining Equations (17) and (18), we obtain (13). Here we can ignore \( h b_k F \) in the denominator. (See Problem 8.5.8, p. 564.) \(\blacksquare\)
"Numerical Analysis: Mathematics of Scientific Computing", page 563.
Theorem (Schur Decomposition). For any square matrix \( A \), there exists a unitary matrix \( U \) such that \[ U^* A U = T \] where \( T \) is upper triangular.
Remark. That is, every square matrix is unitarily similar to an upper-triangular matrix.
Definition. A matrix \( A \) is called a Hermitian matrix if it is equal to its conjugate transpose, that is, \( A = A^* \).
Theorem. Every Hermitian \( n \times n \) matrix \( A \) can be diagonalized by a unitary matrix, \[ U^* A U = \Lambda, \] where \( U \) is unitary and \( \Lambda \) is a diagonal matrix.
Proof. By the Schur decomposition, \(A = U T U^*\), where \(U\) is unitary and \(T\) is upper triangular. Since \(A = A^*\), we have \(U T U^* = (U T U^*)^* = U T^* U^*\), and hence \(T = T^*\). Since \(T\) is upper triangular and \(T^*\) is lower triangular, \(T = T^*\) forces \(T\) to be diagonal. \(\blacksquare\)
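NumPy's eigh routine computes exactly this unitary diagonalization for Hermitian matrices. A quick check (the matrix is arbitrary test data):

```python
import numpy as np

A = np.array([[2.0, 1.0 + 1.0j],
              [1.0 - 1.0j, 3.0]])           # Hermitian: A = A*
eigenvalues, U = np.linalg.eigh(A)           # U unitary, eigenvalues real

Lam = U.conj().T @ A @ U                     # should be diagonal
print(np.round(Lam, 10))
print(np.allclose(U.conj().T @ U, np.eye(2)))  # True: U is unitary
```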
Definition [Reduced QR factorization]. Let \( A \in \mathbb{F}^{m \times n} \) (\( m \geq n \)). There are a matrix \( \hat{Q} \in \mathbb{F}^{m \times n} \) with orthonormal columns and a square upper triangular matrix \( \hat{R} \in \mathbb{F}^{n \times n} \) such that \[ A = \hat{Q} \hat{R}. \]
There are several ways to compute the QR factorization of a matrix, for example Gram-Schmidt orthogonalization, Householder reflections, and Givens rotations.
Example. Let \(A = [a_{1} \; a_{2} \; a_{3}] = \begin{bmatrix} \sqrt{2} & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}\). Compute a QR factorization by using Householder Transformation.
Solution. Let \[ v_1 = a_1 + \begin{bmatrix} \|a_1\|_2 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2+\sqrt{2} \\ 1 \\ 1\end{bmatrix}. \] We define our first Householder reflector as \[ H_1 = I - 2\dfrac{v_1v_1^*}{v_1^*v_1} = \begin{bmatrix} 1 - \frac{\sqrt{2}+2}{2} & -\frac{1}{2} & -\frac{1}{2} \\ -\frac{1}{2} & 1 - \frac{1}{2(\sqrt{2}+2)} & -\frac{1}{2(\sqrt{2}+2)} \\ -\frac{1}{2} & -\frac{1}{2(\sqrt{2}+2)} & 1-\frac{1}{2(\sqrt{2}+2)} \\ \end{bmatrix}= \begin{bmatrix} -\sqrt{2}/2 & -1/2 & -1/2 \\ -1/2 & (\sqrt{2}+2)/4 & (\sqrt{2}- 2)/4 \\ -1/2 & (\sqrt{2}-2)/4& (\sqrt{2}+2)/4\\ \end{bmatrix} \] Then, we have a new matrix \(M_2\) as \[ \begin{align} M_2 &= H_1\cdot A\\ &= \begin{bmatrix} -\sqrt{2}/2 & -1/2 & -1/2 \\ -1/2 & (\sqrt{2}+2)/4 & (\sqrt{2}-2)/4 \\ -1/2 & (\sqrt{2}-2)/4 & (\sqrt{2}+2)/4 \\ \end{bmatrix}\cdot \begin{bmatrix} \sqrt{2} & 0 & 0 \\ 1 & 1 & 0\\ 1 & 1 & 1\\ \end{bmatrix}\\ &= \begin{bmatrix} -2 & -1 & -1/2 \\ 0 & \sqrt{2}/2 & (\sqrt{2}-2)/4 \\ 0 & \sqrt{2}/2 & (2+\sqrt{2})/4 \\ \end{bmatrix} \end{align} \] Now we pick the vector \(a'_2 = [\sqrt{2}/2, \sqrt{2}/2]^T\), for which \(\|a'_2\|_2 = 1\). We define \(v_2 = a'_2 + \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} \sqrt{2}/2 + 1 \\ \sqrt{2}/2 \end{bmatrix}\). Then, we define our second reduced Householder reflector as \[ \begin{align} H'_2 &= I - 2\dfrac{v_2v_2^*}{v_2^*v_2} \\ &= \begin{bmatrix} 1 - \frac{2+\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & 1 - \frac{2-\sqrt{2}}{2} \\ \end{bmatrix}\\ &= \begin{bmatrix} -\sqrt{2}/2 & -\sqrt{2}/2 \\ -\sqrt{2}/2 & \sqrt{2}/2 \\ \end{bmatrix}\\ \end{align} \] And we extend \(H'_2\) to a \(3\times 3\) matrix \(H_2\) as \[ H_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -\sqrt{2}/2 & -\sqrt{2}/2 \\ 0 & -\sqrt{2}/2 & \sqrt{2}/2 \\ \end{bmatrix} \] Then we have \(M_3 = H_2M_2\) as \[ \begin{align} M_3 &= H_2M_2\\ &= \begin{bmatrix} 1 & 0 & 0 \\ 0 & -\sqrt{2}/2 & -\sqrt{2}/2 \\ 0 & -\sqrt{2}/2 & \sqrt{2}/2 \\ \end{bmatrix}\cdot \begin{bmatrix} -2 & -1 & -1/2 \\ 0 & \sqrt{2}/2 & (\sqrt{2}-2)/4 \\ 0 & \sqrt{2}/2 & (2+\sqrt{2})/4 \\ \end{bmatrix}\\ &= \begin{bmatrix} -2 & -1 & -1/2 \\ 0 & -1 & -1/2 \\ 0 & 0 & 1/\sqrt{2} \end{bmatrix} =: R. \end{align} \] Given that any Householder reflector \(H\) satisfies \(H^{-1} = H\) (proved in the next problem), we have \(A = H_1 H_2 M_3\), so the \(QR\) factorization is \[ \begin{align} \begin{bmatrix} \sqrt{2} & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} &= \begin{bmatrix} -\sqrt{2}/2 & -1/2 & -1/2 \\ -1/2 & (\sqrt{2}+2)/4 & (\sqrt{2}-2)/4 \\ -1/2 & (\sqrt{2}-2)/4 & (\sqrt{2}+2)/4 \\ \end{bmatrix}\cdot \begin{bmatrix} 1 & 0 & 0 \\ 0 & -\sqrt{2}/2 & -\sqrt{2}/2 \\ 0 & -\sqrt{2}/2 & \sqrt{2}/2 \\ \end{bmatrix} \begin{bmatrix} -2 & -1 & -1/2 \\ 0 & -1 & -1/2 \\ 0 & 0 & 1/\sqrt{2} \end{bmatrix} \\ &= \begin{bmatrix} -\sqrt{2}/2 & \sqrt{2}/2 & 0 \\ -1/2 & -1/2 & -\sqrt{2}/2 \\ -1/2 & -1/2 & \sqrt{2}/2 \\ \end{bmatrix}\cdot \begin{bmatrix} -2 & -1 & -1/2 \\ 0 & -1 & -1/2 \\ 0 & 0 & 1/\sqrt{2} \end{bmatrix}. \end{align} \]
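A hedged Python sketch of Householder QR that reproduces this factorization numerically (the sign convention matches the example: each reflector sends the working column to \(-\|a\| e_1\) when its leading entry is positive):

```python
import numpy as np

def householder_qr(A):
    """QR by Householder reflectors, using v = a + sign(a_0) ||a|| e_0."""
    m, n = A.shape
    R = A.astype(float).copy()
    Q = np.eye(m)
    for k in range(min(m - 1, n)):
        a = R[k:, k]
        v = a.copy()
        v[0] += np.copysign(np.linalg.norm(a), a[0])
        H = np.eye(m - k) - 2.0 * np.outer(v, v) / (v @ v)
        R[k:, k:] = H @ R[k:, k:]
        Q[:, k:] = Q[:, k:] @ H          # accumulate Q = H_1 H_2 ...
    return Q, R

A = np.array([[np.sqrt(2), 0, 0], [1, 1, 0], [1, 1, 1]], dtype=float)
Q, R = householder_qr(A)
print(np.round(R, 10))                   # upper triangular, matches the example
print(np.allclose(Q @ R, A))             # True
```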
The LU decomposition of a matrix \(A\) is a factorization of \(A\) into a product of a lower triangular matrix \(L\) and an upper triangular matrix \(U\). The LU decomposition with partial pivoting is given by \[ PA = LU \] where \(P\) is a permutation matrix, \(L\) is a lower triangular matrix with ones on the diagonal, and \(U\) is an upper triangular matrix. The LU decomposition is computed using Gaussian elimination with partial pivoting. One thing we need to know is when the LU decomposition exists; the following proposition gives the answer.
Proposition 5. Let \( A = (a_{ij}) \) be an \( n \times n \) matrix and for \( k = 1, \ldots, n-1 \), let \( A(k) \) denote the \( k \times k \) submatrix of \( A \) consisting of the first \( k \) rows and \( k \) columns of \( A \). If \( A(k) \) is invertible for \( k = 1, 2, \ldots, n-1 \), then \( A \) has a unique LU-decomposition.
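In practice the factorization is computed by a library routine. A quick sketch with SciPy, whose lu returns factors satisfying \(A = PLU\) (SciPy's convention for the permutation matrix):

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
P, L, U = lu(A)                    # partial pivoting: P @ L @ U = A
print(L)                           # unit lower triangular
print(U)                           # upper triangular
print(np.allclose(P @ L @ U, A))   # True
```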
Consider a diagonalizable matrix \(A \in \mathbb{C}^{n \times n}\) with non-singular eigenvector matrix \(V \in \mathbb{C}^{n \times n}\), such that \(A = V\Lambda V^{-1}\), where \(\Lambda\) is a diagonal matrix. If \(X \in \mathbb{C}^{n \times n}\) is invertible, its condition number in the \(p\)-norm is denoted by \(\kappa_p(X)\) and defined as: \[ \kappa_p(X) = \|X\|_p \|X^{-1}\|_p \]
\(\textbf{Theorem. }\) Let A be an \( n \times n \) matrix with a complete set of linearly independent eigenvectors, and suppose that \( V^{-1}AV = D \), where \( V \) is non-singular and \( D \) is diagonal. Let \( \delta A \) be a perturbation of \( A \) and let \( \mu \) be an eigenvalue of \( A + \delta A \). Then \( A \) has an eigenvalue \( \lambda \) such that \[ |\mu - \lambda| \leq \kappa_p(V) \| \delta A \|_p, \quad 1 \leq p \leq \infty \] where \( \kappa_p(V) \) is the \( p \)-norm condition number of \( V \).
The Bauer-Fike Theorem:
The Bauer–Fike theorem is a powerful result, and its proof involves several key concepts from linear algebra and perturbation theory. Let's outline the main steps of the proof:
Theorem [Bauer-Fike Theorem] Suppose \( A, E \in \mathbb{C}^{m \times m} \) and \( A = X \Lambda X^{-1} \), where \[ \Lambda = \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_m \end{bmatrix}. \] Then for any eigenvalue \( \tilde{\lambda} \) of \( \tilde{A} = A + E \), there exists \( \lambda_i \), an eigenvalue of \( A \), such that \[ |\tilde{\lambda} - \lambda_i| \leq \kappa_2(X) \|E\|_2, \] where \( \kappa_2(X) = \|X\|_2 \|X^{-1}\|_2 \).
Proof. Suppose \( 0 \neq x \in \mathbb{C}^m \) is an eigenvector of \( \tilde{A} \) corresponding to \( \tilde{\lambda} \). Then for \( y = X^{-1} x \), \[ (A + E - \tilde{\lambda} I)x = 0 \implies (X \Lambda X^{-1} - \tilde{\lambda} I + E)x = 0 \implies (\Lambda - \tilde{\lambda} I + X^{-1} EX)y = 0, \] from which \[ (\Lambda - \tilde{\lambda} I)y = -X^{-1} EX y. \] Then \[ \sigma_{\min}(\Lambda - \tilde{\lambda} I)\|y\|_2 \leq \|(\Lambda - \tilde{\lambda} I)y\|_2 = \|X^{-1} EX y\|_2 \leq \|X^{-1} EX\|_2 \|y\|_2. \] Since \( \Lambda - \tilde{\lambda} I \) is diagonal, \[ \sigma_{\min}(\Lambda - \tilde{\lambda} I) = \min_j |\lambda_j - \tilde{\lambda}|. \] Suppose the minimum is attained when \( j = i \). Then \[ |\lambda_i - \tilde{\lambda}| \leq \|X^{-1} EX\|_2 \leq \kappa_2(X) \|E\|_2. \] \(\blacksquare\)
"Lecture note No. 22"
Definition (Frobenius norm). For a matrix \( A \in \mathbb{C}^{m \times n} \), the Frobenius norm is defined as \[ \|A\|_F = \sqrt{\sum_{i = 1}^m \sum_{j = 1}^n |a_{ij}|^2}. \]
Definition (\(p\)-norm). For a matrix \( A \in \mathbb{C}^{m \times n} \), the \(p\)-norm is defined as \[ \|A\|_p = \max_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p} = \max_{\|x\|_p = 1} \|Ax\|_p, \] where \(x\) is a vector in \(\mathbb{C}^n\).
Theorem. For any \( A \in \mathbb{C}^{m \times n} \) and unitary \( Q \in \mathbb{C}^{m \times m} \), we have \[ \|QA\|_2 = \|A\|_2, \quad \|QA\|_F = \|A\|_F. \]
"Numerical Linear Algebra", Page. 23
Theorem. For any \( A \in \mathbb{C}^{m \times n} \) and unitary \( Q \in \mathbb{C}^{n \times n} \), we have \[ \|AQ\|_2 = \|A\|_2, \quad \|AQ\|_F = \|A\|_F. \]
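A quick numerical check of both invariances (random test matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # a random orthogonal Q

print(np.isclose(np.linalg.norm(Q @ A, 2), np.linalg.norm(A, 2)))          # True
print(np.isclose(np.linalg.norm(Q @ A, 'fro'), np.linalg.norm(A, 'fro')))  # True
```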
The following definitions all involve the term \( x^* M x \). Notice that this is always a real number for any Hermitian square matrix \( M \). An \( n \times n \) Hermitian complex matrix \( M \) is said to be positive-definite if \( x^* M x > 0 \) for all non-zero \( x \) in \( \mathbb{C}^n \). Formally, \[ M \text{ positive-definite} \iff x^* M x > 0 \text{ for all } x \in \mathbb{C}^n \setminus \{0\}. \] An \( n \times n \) Hermitian complex matrix \( M \) is said to be positive semi-definite or non-negative-definite if \( x^* M x \geq 0 \) for all \( x \) in \( \mathbb{C}^n \). Formally, \[ M \text{ positive semi-definite} \iff x^* M x \geq 0 \text{ for all } x \in \mathbb{C}^n. \]
Theorem. A Hermitian matrix \( M \) is positive-definite if and only if it satisfies any of the following equivalent conditions: all of its eigenvalues are positive; there exists an invertible matrix \( B \) such that \( M = B^* B \); all of its leading principal minors are positive.
Theorem [Row version Gerschgorin’s Theorem] Let \( A \in \mathbb{C}^{m \times m} \). Consider \( m \) disks in the complex plane: \[ D_i = \left\{ z \mid |z - a_{ii}| \leq \sum_{j \neq i} |a_{ij}| = |a_{i1}| + \ldots + |a_{i,i-1}| + |a_{i,i+1}| + \ldots + |a_{im}| \right\}. \] Then every eigenvalue of \( A \) lies in the union \( D_1 \cup \cdots \cup D_m \); moreover, if the union of \( k \) of the disks is disjoint from the union of the remaining \( m - k \) disks, then the former contains exactly \( k \) eigenvalues and the latter exactly \( m - k \) eigenvalues, counted with multiplicity.
Theorem [Column version Gerschgorin’s Theorem] Let \( A \in \mathbb{C}^{m \times m} \). Consider \( m \) disks in the complex plane: \[ \tilde{D}_i = \left\{ z \mid |z - a_{ii}| \leq \sum_{j \neq i} |a_{ji}| = |a_{1i}| + \ldots + |a_{i-1,i}| + |a_{i+1,i}| + \ldots + |a_{mi}| \right\}. \] Then the same conclusions hold with the disks \( D_i \) replaced by \( \tilde{D}_i \), since \( A \) and \( A^T \) have the same eigenvalues.
Theorem [Row version Generalized Gerschgorin’s Theorem] Let \( A \in \mathbb{C}^{m \times m} \). Consider \( m \) disks in the complex plane: \[ D_i = \left\{ z \mid |z - a_{ii}| \leq \frac{1}{|d_i|} \sum_{j \neq i} |a_{ij}||d_j| \right\}, \] where \( d_1, \ldots, d_n \) are arbitrary nonzero numbers. Then the same conclusions hold with the disks \( D_i \) replaced by these scaled disks.
"Lecture note No. 22"
Definition. Let \( A \in \mathbb{C}^{m \times m} \). For a given vector \( 0 \neq x \in \mathbb{C}^m \), the Rayleigh quotient of \( x \) is the scalar defined by \[ r(x) = \frac{x^* A x}{x^* x} \]
Algorithm: Rayleigh quotient iteration
Choose an initial vector \( x^{(0)} \in \mathbb{C}^m \) with \( \|x^{(0)}\|_2 = 1 \).
Compute \( \mu^{(0)} = (x^{(0)})^* A x^{(0)} \).
For \( k = 1, 2, \ldots \): solve \( \left( A - \mu^{(k-1)} I \right) w = x^{(k-1)} \) for \( w \); set \( x^{(k)} = w / \|w\|_2 \); compute \( \mu^{(k)} = \left(x^{(k)}\right)^* A x^{(k)} \).
End
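A minimal Python sketch of the algorithm (near convergence the shifted system is nearly singular, which is exactly what produces the very fast convergence; a dense solve is used for clarity):

```python
import numpy as np

def rayleigh_quotient_iteration(A, x0, n_iter=5):
    x = x0 / np.linalg.norm(x0)
    mu = x.conj() @ A @ x                   # Rayleigh quotient of x
    for _ in range(n_iter):
        w = np.linalg.solve(A - mu * np.eye(len(A)), x)
        x = w / np.linalg.norm(w)
        mu = x.conj() @ A @ x
    return mu, x

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 4.0]])
mu, x = rayleigh_quotient_iteration(A, np.array([1.0, 1.0, 1.0]))
print(mu)                               # an eigenvalue of A (cubic convergence
                                        # for Hermitian A)
print(np.linalg.norm(A @ x - mu * x))   # ~0: (mu, x) is an eigenpair
```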
Definition. A matrix \( H = [h_{ij}] \in \mathbb{C}^{m \times m} \) is upper Hessenberg if \( h_{ij} = 0 \) for all \( i \geq j + 2 \), or equivalently, \[ H = \begin{bmatrix} h_{11} & h_{12} & \cdots & h_{1m} \\ h_{21} & h_{22} & \cdots & h_{2m} \\ & \ddots & \ddots & \vdots \\ & & h_{m,m-1} & h_{mm} \end{bmatrix} \] \( H \) is called unreduced upper Hessenberg if \( h_{21}, \ldots, h_{m,m-1} \neq 0 \). Otherwise \( H \) is reduced upper Hessenberg.
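SciPy provides the unitary reduction of a square matrix to this form. A quick sketch:

```python
import numpy as np
from scipy.linalg import hessenberg

A = np.arange(16.0).reshape(4, 4) + np.eye(4)
H, Q = hessenberg(A, calc_q=True)           # Q* A Q = H, H upper Hessenberg
print(np.round(H, 8))                       # zeros below the first subdiagonal
print(np.allclose(Q @ H @ Q.conj().T, A))   # True
```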