Friday, February 8, 2019

Algorithms for updating Cholesky factorization

The problem of updating a Cholesky factorization was treated in [1], [2] and [3]. Here the algorithms taken from those works are described. These algorithms compute a factorization $\tilde A = \tilde L \tilde L^T$, where $\tilde A$ is the matrix obtained from the matrix $A = L L^T$ after a row and the corresponding symmetric column have been added to or deleted from $A$.
Let $A \in R^{n \times n}$ be a symmetric positive definite matrix. Applying the Cholesky method to $A$ yields the factorization $A = L L^T$, where $L$ is a lower triangular matrix with positive diagonal elements.
Suppose $\tilde A$ is a positive definite matrix created by adding a row and the corresponding symmetric column to $A$:
\begin{eqnarray*}
    \tilde A = \left( \begin{array}{cc}
       A & d \\
       d^T & \gamma
      \end{array}
      \right) \ , \qquad d \in R^n \ , \quad \gamma \in R \ .
\end{eqnarray*}
Then its Cholesky factorization is [2]:
\begin{eqnarray*}
    \tilde L = \left( \begin{array}{cc}
       L  \\
       e^T & \alpha
      \end{array}
      \right) \ ,
\end{eqnarray*}
where
\begin{eqnarray*}
     e = L^{-1} d \ ,
\end{eqnarray*}
and
\begin{eqnarray*}
    \alpha = \sqrt{ \tau } \ , \quad \tau = \gamma - e^T e, \quad \tau > 0.
\end{eqnarray*}
If $\tau \le 0$ then $\tilde A$ is not positive definite.
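In code, this update amounts to one forward substitution and a square root. Here is a minimal sketch in Python/NumPy; the function name cholesky_add_row and the use of scipy.linalg.solve_triangular are my own choices for illustration, not taken from the cited works:

    import numpy as np
    from scipy.linalg import solve_triangular

    def cholesky_add_row(L, d, gamma):
        # Given A = L L^T with L lower triangular, return the Cholesky
        # factor of the bordered matrix [[A, d], [d^T, gamma]].
        e = solve_triangular(L, d, lower=True)   # e = L^{-1} d, forward substitution
        tau = gamma - e @ e
        if tau <= 0.0:
            raise ValueError("updated matrix is not positive definite")
        n = L.shape[0]
        L_new = np.zeros((n + 1, n + 1))
        L_new[:n, :n] = L
        L_new[n, :n] = e
        L_new[n, n] = np.sqrt(tau)
        return L_new

Since the Cholesky factor with positive diagonal is unique, the result can be checked against a full refactorization, e.g. np.allclose(cholesky_add_row(L, d, gamma), np.linalg.cholesky(A_tilde)), where A_tilde denotes the bordered matrix.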

Now assume $\tilde A$ is obtained from $A$ by deleting the $r^{th}$ row and column of $A$. The matrix $\tilde A$ is positive definite (as is any principal submatrix of a positive definite matrix). It is shown in [1], [3] how to get $\tilde L$ from $L$ in this case. Let us partition $A$ and $L$ along the $r^{th}$ row and column:
\begin{eqnarray*}
    A = \left( \begin{array}{ccc}
          A_{11} & a_{1r} & A_{12} \\
          a_{1r}^T & a_{rr} & a_{2r}^T \\
          A_{21} & a_{2r} & A_{22}
      \end{array}
     \right) \ , \qquad
    L = \left( \begin{array}{ccc}
   L_{11}   \\
   l_{1r}^T & l_{rr}   \\
   L_{21} & l_{2r} & L_{22}
    \end{array}
     \right) \ .
\end{eqnarray*}
Then $\tilde A$ can be written as
\begin{eqnarray*}
    \tilde A = \left( \begin{array}{cc}
    A_{11}  & A_{12} \\
    A_{21}  & A_{22}
    \end{array}
     \right) \ .
\end{eqnarray*}
By deleting the $r^{th}$ row from $L$ we get the matrix
\begin{eqnarray*}
    H = \left( \begin{array}{ccc}
  L_{11}     \\
  L_{21} & l_{2r} & L_{22}
    \end{array}
     \right) \
\end{eqnarray*}
with the property that $H H^T = \tilde A$. The factor $\tilde L$ can be obtained from $H$ by applying Givens rotations that zero out the elements $h_{r, r+1}, h_{r+1, r+2},$ $\dots , h_{n-1, n}$ and then deleting the last column of the resulting matrix $\tilde H$:
\begin{eqnarray*}
    \tilde H = H R = \left( \begin{array}{ccc}
  L_{11} \\
  L_{21} &  \tilde L_{22} & 0
    \end{array}
     \right) =  \Bigl( \tilde L \ 0 \Bigr) \ ,
\end{eqnarray*}
where $R=R_0 R_1 \dots R_{n - 1 - r}$ is a product of Givens rotation matrices, $L_{11}$ and $\tilde L_{22}$ are lower triangular matrices, and
\begin{eqnarray*}
&&  \tilde L = \left( \begin{array}{cc}
  L_{11}    \\
  L_{21} &  \tilde L_{22}
    \end{array}  \right) \, ,  \\ && \tilde H  \tilde H^T = H R R^T H^T = H H^T = \tilde A \, , \\ && \tilde H  \tilde H^T = \tilde L \tilde L^T \, , \quad \tilde L \tilde L^T = \tilde A \, .
\end{eqnarray*}
The orthogonal matrix $R_t \in R^{n \times n}$ $(0 \le t \le n - 1 - r)$, which defines the Givens rotation annihilating the $(r+t , r+t+1)$ entry of the matrix $H R_0 \dots R_{t-1}$, is given by
\begin{eqnarray*}
    R_t = \left( \begin{array}{cccccc}
   1      & \ldots & 0      & 0      & \ldots & 0 \\
   \vdots & \ldots & \ldots & \ldots & \ldots & \vdots \\
   0      & \ldots & c      & s      & \ldots & 0 \\
   0      & \ldots & -s     & c      & \ldots & 0 \\
   \vdots & \ldots & \ldots & \ldots & \ldots & \vdots \\
   0      & \ldots & 0      & 0      & \ldots & 1
    \end{array}
     \right) \ .
\end{eqnarray*}
where the entries $( r+t , r+t )$ and $( r+t+1 , r+t+1 )$ equal $c$, the $( r+t , r+t+1 )$ entry equals $s$, and the $( r+t+1 , r+t )$ entry equals $-s$, with $c^2 + s^2 = 1$. Let $\tilde l_{ij}^{t-1}$ denote the entries of the matrix $H R_0 \dots R_{t-1}$ (and $\tilde l_{ij}^{-1}$ the entries of $H$), and set
\begin{eqnarray*}
  c = \frac{ \tilde l_{r+t,r+t}^{t-1} }
{ \sqrt{ (\tilde l_{r+t,r+t}^{t-1})^2 + (\tilde l_{r+t,r+t+1}^{t-1})^2 } } \ ,
\quad
  s = \frac{ \tilde l_{r+t,r+t+1}^{t-1} }
{ \sqrt{ (\tilde l_{r+t,r+t}^{t-1})^2 + (\tilde l_{r+t,r+t+1}^{t-1})^2 } } \ ,
\end{eqnarray*}
Then the matrix $H R_0 \dots R_t$ differs from $H R_0 \dots R_{t-1}$ only in the entries of columns $( r+t )$ and $( r+t+1 )$, namely
\begin{eqnarray*}
 &&  \tilde l_{i,r+t}^t = \tilde l_{i,r+t}^{t-1} = 0 \, , \
     \tilde l_{i,r+t+1}^t = \tilde l_{i,r+t+1}^{t-1} = 0 \, ,
     \quad   1 \le i \le r + t  - 1  \, ,
\\
 &&  \tilde l_{i,r+t}^t = c \tilde l_{i,r+t}^{t-1} +  s \tilde l_{i,r+t+1}^{t-1} \, ,
\\
\nonumber
 &&  \tilde l_{i,r+t+1}^t = -s \tilde l_{i,r+t}^{t-1} +  c \tilde l_{i,r+t+1}^{t-1} \,
     \quad  r + t \le i \le n - 1 \, .
\end{eqnarray*}
In particular,
\begin{eqnarray*}
%\label{a1_ltrt}
  \tilde l_{r+t,r+t}^t =
\sqrt{ (\tilde l_{r+t,r+t}^{t-1})^2 + (\tilde l_{r+t,r+t+1}^{t-1})^2 } \ ,
\qquad
  \tilde l_{r+t,r+t+1}^t = 0   \,  .
\end{eqnarray*}
Note that the diagonal entry $\tilde l_{r+t,r+t}^t$ is nonnegative.
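Putting the pieces together, the deletion update can be written as a short routine. The sketch below is my own Python/NumPy illustration of the scheme above (the function name cholesky_delete_row is hypothetical, and $r$ is taken here as a 0-based index); np.hypot is used for the stable evaluation of $\sqrt{x^2 + y^2}$ discussed next:

    import numpy as np

    def cholesky_delete_row(L, r):
        # Given A = L L^T with L lower triangular (n x n), return the
        # Cholesky factor of A with row and column r (0-based) removed.
        n = L.shape[0]
        H = np.delete(L, r, axis=0)          # (n-1) x n matrix with H H^T = A-tilde
        # Sweep Givens rotations zeroing H[r, r+1], H[r+1, r+2], ..., H[n-2, n-1].
        for t in range(r, n - 1):
            x, y = H[t, t], H[t, t + 1]
            w = np.hypot(x, y)               # sqrt(x^2 + y^2) without overflow/underflow
            if w == 0.0:
                continue
            c, s = x / w, y / w
            # Only columns t and t+1 of rows t, ..., n-2 change.
            col0 = H[t:, t].copy()
            col1 = H[t:, t + 1].copy()
            H[t:, t] = c * col0 + s * col1
            H[t:, t + 1] = -s * col0 + c * col1
            H[t, t], H[t, t + 1] = w, 0.0    # enforce the exact pivot-row values
        return H[:, :n - 1]                  # the last column is now zero; drop it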
In order to avoid unnecessary overflow or underflow during the computation of $c$ and $s$, it is recommended in [3] to calculate the square root $w = \sqrt{x^2 + y^2}$ as follows:
\begin{eqnarray*}
 &&  v = \max \{ |x| , |y| \} \ , \qquad u = \min \{ |x| , |y| \} \ ,
\\
\nonumber
 && w = \cases{ v \sqrt{ 1 + \left( \frac{u}{v} \right)^2 } \ ,
                    \quad v \ne 0 \cr \cr
              0 \ , \quad v = 0 \cr
            } \ .
\end{eqnarray*}
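For reference, here is this computation written out in Python (in practice math.hypot or np.hypot already provide an equivalent safeguard):

    import math

    def safe_hypot(x, y):
        # Compute sqrt(x*x + y*y) without unnecessary overflow or underflow.
        v = max(abs(x), abs(y))
        u = min(abs(x), abs(y))
        if v == 0.0:
            return 0.0
        return v * math.sqrt(1.0 + (u / v) ** 2)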
Applying this technique to the Cholesky factorization significantly reduces the amount of computation. In the case of adding a row and the corresponding symmetric column to the original matrix, about $n^2$ flops are needed instead of about $\frac {(n + 1) ^ 3} {3}$ flops for the direct recomputation of the new Cholesky factor. In the case of deleting a row and the corresponding symmetric column from the original matrix, the new Cholesky factor can be obtained with about $3(n - r)^2$ flops (the worst case requires about $3 (n - 1) ^ 2$ operations) instead of about $\frac {(n - 1) ^ 3} {3}$ flops required for its direct calculation.
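For example, with $n = 100$ and $r = 50$ this means roughly $10^4$ flops instead of roughly $3.4 \times 10^5$ when adding a row and column, and roughly $7.5 \times 10^3$ flops instead of roughly $3.2 \times 10^5$ when deleting one.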

References

[1] T. F. Coleman, L. A. Hulbert, A direct active set algorithm for large sparse quadratic programs with simple bounds, Mathematical Programming, 45, 1989.
[2] J. A. George, J. W.-H. Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, 1981.
[3] C. L. Lawson, R. J. Hanson, Solving Least Squares Problems, SIAM, 1995.

