My name is Hanzhang Yin, and I developed this website as a resource for reviewing key concepts in abstract algebra. It is still a work in progress, so feel free to email me at hanyin@ku.edu with any errors or suggestions for improvement.
1. (a) Write Newton’s iteration scheme for finding a root of an equation \( f(x) = 0 \).
(b) Suppose that \( f(x) \in C^m(a, b) \) for some integer \( m \geq 1 \) and \( p \in (a, b) \) is a root of \( f(x) = 0 \). Assume that the sequence \(\{x_n\} \subset (a, b) \) generated by Newton’s iteration converges to \( p \). Prove that if \( f'(p) \ne 0 \), and \( f''(p) = \ldots = f^{(m-1)}(p) = 0 \) when \( m \geq 3 \), then \[ \lim_{n \to \infty} \left| \frac{p - x_n}{(p - x_{n-1})^m} \right| = \frac{m - 1}{m!} \left| \frac{f^{(m)}(p)}{f'(p)} \right| \]
Solution (a). Newton's iteration scheme for finding a root of an equation \( f(x) = 0 \) is given by \[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \] where the initial guess \(x_0\) is chosen sufficiently close to the root of \(f(x) = 0\).
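For concreteness, here is a minimal Python sketch of the scheme; the tolerance, iteration cap, and stopping rule are illustrative choices, not part of the problem.

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=100):
    """Newton's iteration: x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x = x - step
        if abs(step) < tol:  # stop once the update is negligible
            break
    return x
```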
Solution (b). Let \(e = p - x_{n-1}\) (so \(e\) depends on \(n\)). Since the sequence \(\{x_n\}\) generated by Newton's iteration converges to \(p\), we have \(e \to 0\) as \(n \to \infty\). Expanding about \(p\), with \(x_{n-1} = p - e\), we have \[ \begin{align} f(x_{n-1}) &= f(p) + f'(p)(-e) + \frac{f''(p)}{2}(-e)^2 + \cdots + \frac{f^{(m-1)}(p)}{(m-1)!}(-e)^{m-1} + \frac{f^{(m)}(p)}{m!}(-e)^{m} + \mathcal{O}(e^{m+1}) \\ &= 0 + f'(p)(-e) + 0 + \cdots + 0 + \frac{f^{(m)}(p)}{m!}(-e)^{m} + \mathcal{O}(e^{m+1}) \\ &= f'(p)(-e) + \frac{f^{(m)}(p)}{m!}(-e)^{m} + \mathcal{O}(e^{m+1}), \\ f'(x_{n-1}) &= f'(p) + f''(p)(-e) + \cdots + \frac{f^{(m-1)}(p)}{(m-2)!}(-e)^{m-2} + \frac{f^{(m)}(p)}{(m-1)!}(-e)^{m-1} + \mathcal{O}(e^{m}) \\ &= f'(p) + \frac{f^{(m)}(p)}{(m-1)!}(-e)^{m-1} + \mathcal{O}(e^{m}), \end{align} \] since \(f(p) = 0\) and \(f''(p) = \cdots = f^{(m-1)}(p) = 0\). Then, writing \(e\) over the common denominator and using \((-e)^{m-1}\,e = -(-e)^{m}\), we have \[ \begin{align} p - x_n &= p - x_{n-1} + \frac{f(x_{n-1})}{f'(x_{n-1})} \\ &= e + \dfrac{f'(p)(-e) + \frac{f^{(m)}(p)}{m!}(-e)^{m} + \mathcal{O}(e^{m+1})}{f'(p) + \frac{f^{(m)}(p)}{(m-1)!}(-e)^{m-1} + \mathcal{O}(e^{m})} \\ &= \dfrac{f'(p)e - \frac{f^{(m)}(p)}{(m-1)!}(-e)^{m} + \mathcal{O}(e^{m+1})}{f'(p) + \frac{f^{(m)}(p)}{(m-1)!}(-e)^{m-1} + \mathcal{O}(e^{m})} + \dfrac{f'(p)(-e) + \frac{f^{(m)}(p)}{m!}(-e)^{m} + \mathcal{O}(e^{m+1})}{f'(p) + \frac{f^{(m)}(p)}{(m-1)!}(-e)^{m-1} + \mathcal{O}(e^{m})} \\ &= \dfrac{\frac{f^{(m)}(p)}{m!}(-e)^{m} - \frac{f^{(m)}(p)}{(m-1)!}(-e)^{m} + \mathcal{O}(e^{m+1})}{f'(p) + \frac{f^{(m)}(p)}{(m-1)!}(-e)^{m-1} + \mathcal{O}(e^{m})} \\ &= \dfrac{f^{(m)}(p)(-e)^{m} - m f^{(m)}(p)(-e)^{m} + \mathcal{O}(e^{m+1})}{f'(p)\,m! + m f^{(m)}(p)(-e)^{m-1} + \mathcal{O}(e^{m})}, \end{align} \] where the last step multiplies numerator and denominator by \(m!\). Dividing by \((p - x_{n-1})^{m} = e^{m}\) and using \((-e)^{m}/e^{m} = (-1)^{m}\), \[ \begin{align} \dfrac{p - x_{n}}{(p - x_{n-1})^{m}} &= \dfrac{f^{(m)}(p)(-1)^{m} - m f^{(m)}(p)(-1)^{m} + \mathcal{O}(e)}{f'(p)\,m! + m f^{(m)}(p)(-e)^{m-1} + \mathcal{O}(e^{m})} \\ &= (-1)^{m}\,\dfrac{f^{(m)}(p) - m f^{(m)}(p) + \mathcal{O}(e)}{f'(p)\,m! + m f^{(m)}(p)(-e)^{m-1} + \mathcal{O}(e^{m})}. \end{align} \] Hence, we have \[ \left|\dfrac{p - x_{n}}{(p - x_{n-1})^{m}}\right| = \dfrac{\left|f^{(m)}(p) - m f^{(m)}(p) + \mathcal{O}(e)\right|}{\left|f'(p)\,m! + m f^{(m)}(p)(-e)^{m-1} + \mathcal{O}(e^{m})\right|}. \] Since \(e \to 0\) as \(n \to \infty\), \(f'(p) \neq 0\), and \(|1 - m| = m - 1\), we conclude \[ \lim_{n\to\infty} \left|\dfrac{p - x_{n}}{(p - x_{n-1})^{m}}\right| = \dfrac{\left|f^{(m)}(p) - m f^{(m)}(p)\right|}{\left|f'(p)\,m!\right|} = \dfrac{m - 1}{m!} \left| \dfrac{f^{(m)}(p)}{f'(p)} \right|. \tag*{\(\blacksquare\)} \]
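As a sanity check (not part of the proof), the limit can be observed numerically. Below is a sketch for the ordinary case \(m = 2\) with the example function \(f(x) = e^x - 1\) (my own choice), which has root \(p = 0\), \(f'(0) = 1\), and \(f''(0) = 1\), so the predicted limit is \(\frac{2-1}{2!}\left|\frac{f''(0)}{f'(0)}\right| = \frac{1}{2}\).

```python
import numpy as np

f = lambda x: np.expm1(x)   # e^x - 1, computed without cancellation near 0
fp = lambda x: np.exp(x)

p, x = 0.0, 0.5
for n in range(1, 6):
    x_new = x - f(x) / fp(x)                  # Newton step
    ratio = abs((p - x_new) / (p - x) ** 2)   # |p - x_n| / |p - x_{n-1}|^2
    print(f"n = {n}: ratio = {ratio:.6f}")    # approaches 1/2
    x = x_new
```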
Problem 4. Let \( A \in \mathbb{C}^{n \times n} \) be a non-singular matrix, and let \( 0 \neq b \in \mathbb{C}^n \) be a fixed vector. Suppose that you have an algorithm for solving \( Ax = b \) that produces an approximate solution \( \tilde{x} (\neq 0) \) that satisfies \[ (A + \Delta A) \tilde{x} = b, \] for some matrix \( \Delta A \) such that \[ \frac{\|\Delta A\|}{\|A\|} \leq \epsilon, \] with any consistent matrix norm \( \|\cdot\| \).
(a) Prove that for every \( \epsilon > 0 \) \[ \frac{\|\tilde{x} - x\|}{\|\tilde{x}\|} \leq \kappa(A) \epsilon, \] where \( x \in \mathbb{C}^n \) is the exact solution of \( Ax = b \), and \( \kappa(A) \) is the condition number of \( A \).
(b) Suppose that (a) holds asymptotically as \( \epsilon \to 0 \). Prove that \[ \frac{\|\tilde{x} - x\|}{\|x\|} \leq \kappa(A) \epsilon + \mathcal{O}(\epsilon^2), \quad \text{as} \quad \epsilon \to 0. \]
Proof (a). Given that \((A + \Delta A) \tilde{x} = b \) and \(A x = b\), we have \[ \begin{align} Ax - (A + \Delta A) \tilde{x} &= 0 \\ Ax - A\tilde{x} - \Delta A \tilde{x} &= 0 \\ A(x - \tilde{x}) &= \Delta A \tilde{x}. \end{align} \] Since \(A\) is non-singular, we have \[ x - \tilde{x} = A^{-1} \Delta A \tilde{x}. \] By the consistency (submultiplicativity) of the norm, we have \[ \|x - \tilde{x}\| = \|A^{-1} \Delta A \tilde{x}\| \leq \|A^{-1}\| \|\Delta A\| \|\tilde{x}\|. \] Since \(\tilde{x} \neq 0\) by assumption, \(\|\tilde{x}\| > 0\), and dividing through gives \[ \frac{\|x - \tilde{x}\|}{\|\tilde{x}\|} \leq \|A^{-1}\| \|\Delta A\|. \] Since \(\frac{\|\Delta A\|}{\|A\|} \leq \epsilon\), we have \(\|\Delta A\| \leq \epsilon \|A\|\), and therefore \[ \frac{\|x - \tilde{x}\|}{\|\tilde{x}\|} \leq \|A^{-1}\| \|\Delta A\| \leq \|A^{-1}\| \|A\| \epsilon = \kappa(A) \epsilon. \tag*{\(\square\)} \]
Proof (b). Since \(A\) is non-singular and \(b \neq 0\), we have \(x \neq 0\) (otherwise \(b = Ax = 0\), a contradiction), so \(\|x\| > 0\). By part (a) and the triangle inequality, \[ \|\tilde{x} - x\| \leq \kappa(A)\epsilon \|\tilde{x}\| \leq \kappa(A)\epsilon \left( \|x\| + \|\tilde{x} - x\| \right). \] For \(\epsilon\) small enough that \(\kappa(A)\epsilon < 1\), rearranging gives \[ \frac{\|\tilde{x} - x\|}{\|x\|} \leq \frac{\kappa(A)\epsilon}{1 - \kappa(A)\epsilon} = \kappa(A)\epsilon + \mathcal{O}(\epsilon^2), \quad \text{as} \quad \epsilon \to 0. \tag*{\(\square\)} \]
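The bound in (a) can be illustrated numerically. The following sketch (with a random system and the 2-norm throughout, both my own choices) scales a random perturbation so that \(\|\Delta A\|/\|A\| = \epsilon\) and compares the two sides of the inequality.

```python
import numpy as np

rng = np.random.default_rng(1)
n, eps = 6, 1e-8
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

dA = rng.standard_normal((n, n))
dA *= eps * np.linalg.norm(A, 2) / np.linalg.norm(dA, 2)  # ||dA|| / ||A|| = eps

x = np.linalg.solve(A, b)             # exact solution (up to rounding)
x_tilde = np.linalg.solve(A + dA, b)  # satisfies (A + dA) x~ = b

lhs = np.linalg.norm(x_tilde - x, 2) / np.linalg.norm(x_tilde, 2)
rhs = np.linalg.cond(A, 2) * eps      # kappa(A) * eps
print(lhs <= rhs, lhs, rhs)
```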
5. Let matrix \( A \in \mathbb{R}^{n \times n} \) be positive definite and matrix \( X \in \mathbb{R}^{n \times k} \) be of full column rank, i.e., \(\text{rank}(X) = k \). (a) State the definition of a positive definite matrix. (b) Prove that the \(k \times k\) matrix \(X^T A X\) is positive definite.
Solution (a). We say \(A\) is positive definite if for all \(x\in \mathbb{R}^n\setminus \{0\}\), we have \(x^T A x > 0\).
Proof (b). Let \(v\in \mathbb{R}^k\setminus \{0\}\). Write \(X\) column-wise as \[ X = \begin{bmatrix} \vec{x_1} & \vec{x_2} & \cdots & \vec{x_k} \end{bmatrix}, \] where \(\vec{x_i}\in \mathbb{R}^n\). Since \(X\) is of full column rank, the columns \(\vec{x_1}, \vec{x_2}, \ldots, \vec{x_k}\) are linearly independent. Now, let \(v = \begin{bmatrix} v_1 & v_2 & \cdots & v_k \end{bmatrix}^T\). Then, we have \[ Xv = \begin{bmatrix} \vec{x_1} & \vec{x_2} & \cdots & \vec{x_k} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_k \end{bmatrix} = v_1 \vec{x_1} + v_2 \vec{x_2} + \cdots + v_k \vec{x_k}. \] Since \(\{ \vec{x_1}, \vec{x_2}, \ldots, \vec{x_k} \}\) are linearly independent, we have \[ a_1\vec{x_1} + a_2 \vec{x_2} + \cdots + a_k \vec{x_k} = 0 \quad \text{if and only if} \quad a_1 = a_2 = \cdots = a_k = 0, \] and since \(v \neq 0\) there exists some \(v_i\neq 0\). Therefore, \(Xv\neq 0\). Now, denote \(u = Xv\). Since \(A\) is positive definite and \(u\neq 0\), we have \[ v^T X^T A X v = (Xv)^TA(Xv) = u^TAu > 0. \tag*{\(\square\)} \]
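As a quick numerical illustration (with randomly generated \(A\) and \(X\), my own choices), one can verify that \(X^T A X\) is positive definite by attempting a Cholesky factorization, which succeeds exactly when the matrix is (numerically) positive definite:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 6, 3
B = rng.standard_normal((n, n))
A = B @ B.T + np.eye(n)          # positive definite by construction
X = rng.standard_normal((n, k))  # full column rank with probability 1

M = X.T @ A @ X
np.linalg.cholesky(M)            # raises LinAlgError if M were not positive definite
print("X^T A X is positive definite")
```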
Proof. In the \(LU\) factorization, the diagonal entries of \(L\) are all ones. Denote by \(A_k, L_k, U_k\) the \(k\)th leading principal submatrices of \(A, L, U\), respectively; then \(A_k = L_k U_k\). Since \(A\) is positive definite, its leading principal minors satisfy \(\det A_k > 0\), and since \(\det L_k = 1\) and \(\det A_k = \det L_k \det U_k\), it follows that \(\det U_k > 0\). Since \(\det U_k = u_{11} \cdots u_{kk}\), we have \(u_{11}\cdots u_{kk} > 0\) for every \(k\).

We now prove that each \(u_{ii} > 0\) by induction. Since \(\det U_1 = u_{11} > 0\), the base case holds. Suppose that \(u_{11}, \dots, u_{ii} > 0\). Then \(\det U_{i+1} = u_{11}\cdots u_{ii}\,u_{i+1,i+1} > 0\) forces \(u_{i+1,i+1} > 0\). Therefore, the diagonal entries of \(U\) are positive. Thus, \(D = \text{diag}(\sqrt{u_{11}}, \dots, \sqrt{u_{mm}})\) exists and is non-singular. Let \(R = D^{-1}U\). Since \(D^{-1} = \text{diag}(\frac{1}{\sqrt{u_{11}}}, \dots, \frac{1}{\sqrt{u_{mm}}})\) is diagonal, hence upper triangular, and \(U\) is upper triangular, their product \(R\) is upper triangular.

Next, we show that \(A = R^*R\). Since \(A\) is Hermitian, we have \(A = A^*\), so \(A = (LU)^* = U^*L^* = LU\). Since the diagonal entries of \(L\) and \(U\) are nonzero, \(L\) and \(U\) are non-singular, and hence so are \(L^*\) and \(U^*\). From \(U^*L^* = LU\) we obtain \(U(L^*)^{-1} = L^{-1}U^*\). By the properties of upper and lower triangular matrices, \((L^*)^{-1}\) is upper triangular and \(U^*\) is lower triangular, so \(U(L^*)^{-1}\) is upper triangular and \(L^{-1}U^*\) is lower triangular. Since the two products are equal, \(U(L^*)^{-1} = L^{-1}U^*\) is a diagonal matrix.

Now, consider the entries of \(L^{-1}U^*\). The diagonal entries of \(L^{-1}\) are still ones, and the diagonal entries of \(U^*\) are the complex conjugates of those of \(U\), hence equal to them since each \(u_{ii} > 0\) is real. By matrix multiplication, the diagonal entries of \(L^{-1}U^*\) are \(u_{11}, \dots, u_{mm}\), and since the product is a diagonal matrix, \[ L^{-1}U^* = \text{diag}(u_{11}, \dots, u_{mm}) = D^2. \] It follows that \(U^* = LD^2\). Since \(D^{-1}\) is a real diagonal matrix, \((D^{-1})^* = D^{-1}\), so \begin{align*} (D^{-1}U)^* &= U^*(D^{-1})^* = U^*D^{-1} = LD^2D^{-1} = LD. \end{align*} Therefore, \[ R^* R = (D^{-1}U)^*(D^{-1}U) = LDD^{-1}U = LU = A. \tag*{\(\square\)} \]
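The construction in the proof can be checked numerically. Below is a sketch (the test matrix and the no-pivoting Doolittle routine are my own illustrative choices) that forms \(R = D^{-1}U\) from the \(LU\) factorization of a real symmetric positive definite matrix and confirms \(R^*R = A\), recovering the Cholesky factor (for real matrices, \(R^* = R^T\)):

```python
import numpy as np

def lu_no_pivot(A):
    """Doolittle LU factorization without pivoting: A = L U,
    with L unit lower triangular and U upper triangular."""
    n = A.shape[0]
    L = np.eye(n)
    U = A.astype(float).copy()
    for j in range(n - 1):
        for i in range(j + 1, n):
            L[i, j] = U[i, j] / U[j, j]
            U[i, :] -= L[i, j] * U[j, :]
    return L, U

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B.T @ B + np.eye(5)               # symmetric positive definite by construction

L, U = lu_no_pivot(A)
D = np.diag(np.sqrt(np.diag(U)))      # diagonal entries of U are positive
R = np.linalg.inv(D) @ U              # R = D^{-1} U, upper triangular

print(np.allclose(R.T @ R, A))                  # True: A = R^* R
print(np.allclose(R, np.linalg.cholesky(A).T))  # R matches the Cholesky factor
```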