
Friday, February 8, 2019

Algorithms for updating Cholesky factorization

A problem of updating a Cholesky factorization was treated in [1], [2] and [3]. Here the algorithms taken from those works are described. These algorithms compute a factorization \tilde A = \tilde L \tilde L^T, where \tilde A is the matrix obtained from the matrix A = L L^T after a row and the corresponding symmetric column have been added to or deleted from A.
Let A \in R^{n \times n} be a symmetric positive definite matrix. Applying the Cholesky method to A yields the factorization A = L L^T, where L is a lower triangular matrix with positive diagonal elements.
Suppose \tilde A is a positive definite matrix created by adding a row and the symmetric column to A:
\begin{eqnarray} \tilde A = \begin{bmatrix} A & d \\ d^T & \gamma \end{bmatrix} \, , \quad d \in R^n \, , \ \gamma \in R \, . \end{eqnarray}
Then its Cholesky factorization is [2]:
\begin{eqnarray} \tilde L = \begin{bmatrix} L & \\ e^T & \alpha \end{bmatrix} \, , \end{eqnarray}
where
\begin{eqnarray} e = L^{-1} d \, , \end{eqnarray}
and
\begin{eqnarray} \alpha = \sqrt{\tau} \, , \quad \tau = \gamma - e^T e \, , \quad \tau > 0 \, . \end{eqnarray}
If \tau \le 0 then \tilde A is not positive definite.
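For illustration, here is a minimal NumPy/SciPy sketch of this update (the function name cholesky_add_row and the test at the bottom are my own choices, not taken from [1]-[3]); the whole update reduces to one forward substitution with L:

```python
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_add_row(L, d, gamma):
    """Return the Cholesky factor of [[A, d], [d^T, gamma]] given L with A = L L^T."""
    e = solve_triangular(L, d, lower=True)   # e = L^{-1} d by forward substitution
    tau = gamma - e @ e
    if tau <= 0.0:
        raise ValueError("updated matrix is not positive definite")
    n = L.shape[0]
    L_new = np.zeros((n + 1, n + 1))
    L_new[:n, :n] = L
    L_new[n, :n] = e                          # new last row is [e^T, alpha]
    L_new[n, n] = np.sqrt(tau)                # alpha = sqrt(tau)
    return L_new

# quick check against a direct factorization of the extended matrix
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4.0 * np.eye(4)                 # symmetric positive definite
L = np.linalg.cholesky(A)
d = rng.standard_normal(4)
gamma = d @ np.linalg.solve(A, d) + 1.0       # ensures tau = 1 > 0
A_ext = np.block([[A, d[:, None]], [d[None, :], np.array([[gamma]])]])
assert np.allclose(cholesky_add_row(L, d, gamma), np.linalg.cholesky(A_ext))
```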

Now assume \tilde A is obtained from A by deleting the r-th row and column of A. The matrix \tilde A is positive definite (as is any principal submatrix of a positive definite matrix). It is shown in [1], [3] how to obtain \tilde L from L in this case. Let us partition A and L along the r-th row and column:
\begin{eqnarray} A = \begin{bmatrix} A_{11} & a_{1r} & A_{12} \\ a_{1r}^T & a_{rr} & a_{2r}^T \\ A_{21} & a_{2r} & A_{22} \end{bmatrix} \, , \quad L = \begin{bmatrix} L_{11} & & \\ l_{1r}^T & l_{rr} & \\ L_{21} & l_{2r} & L_{22} \end{bmatrix} \, . \end{eqnarray}
Then \tilde A can be written as
\begin{eqnarray} \tilde A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \, . \end{eqnarray}
By deleting the r-th row from L we get the matrix
\begin{eqnarray} H = \begin{bmatrix} L_{11} & & \\ L_{21} & l_{2r} & L_{22} \end{bmatrix} \end{eqnarray}
with the property that H H^T = \tilde A. The factor \tilde L can be obtained from H by applying Givens rotations to the matrix elements h_{r,r+1}, h_{r+1,r+2}, \dots, h_{n-1,n} and deleting the last column of the resulting matrix \tilde H:
\begin{eqnarray} \tilde H = H R = \begin{bmatrix} L_{11} & & 0 \\ L_{21} & \tilde L_{22} & 0 \end{bmatrix} = \begin{bmatrix} \tilde L & 0 \end{bmatrix} \, , \end{eqnarray}
where R = R_0 R_1 \cdots R_{n-1-r} is the product of the Givens rotation matrices, L_{11} and \tilde L_{22} are lower triangular matrices, and
\begin{eqnarray} \tilde L = \begin{bmatrix} L_{11} & \\ L_{21} & \tilde L_{22} \end{bmatrix} \, , \quad \tilde H \tilde H^T = H R R^T H^T = H H^T = \tilde A \, , \quad \tilde H \tilde H^T = \tilde L \tilde L^T \, , \quad \tilde L \tilde L^T = \tilde A \, . \end{eqnarray}
The orthogonal matrix R_t \ (0 \le t \le n-1-r), \ R_t \in R^{n \times n}, which defines the Givens rotation annihilating the (r+t, r+t+1)-th element of the matrix H R_0 \cdots R_{t-1}, has the form
\begin{eqnarray} R_t = \begin{bmatrix} 1 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \ddots & & & & \vdots \\ 0 & & c & -s & & 0 \\ 0 & & s & c & & 0 \\ \vdots & & & & \ddots & \vdots \\ 0 & \cdots & 0 & 0 & \cdots & 1 \end{bmatrix} \, , \end{eqnarray}
where the entries (r+t, r+t) and (r+t+1, r+t+1) equal c, the (r+t, r+t+1) entry equals -s, the (r+t+1, r+t) entry equals s, and c^2 + s^2 = 1. Let \tilde l^{t-1}_{ij} denote the coefficients of the matrix H R_0 \cdots R_{t-1} (\tilde l^{-1}_{ij} being the coefficients of H) and
\begin{eqnarray} c = \frac{ \tilde l^{t-1}_{r+t, r+t} }{ \sqrt{ (\tilde l^{t-1}_{r+t, r+t})^2 + (\tilde l^{t-1}_{r+t, r+t+1})^2 } } \, , \quad s = \frac{ \tilde l^{t-1}_{r+t, r+t+1} }{ \sqrt{ (\tilde l^{t-1}_{r+t, r+t})^2 + (\tilde l^{t-1}_{r+t, r+t+1})^2 } } \, . \end{eqnarray}
Then the matrix H R_0 \cdots R_t differs from H R_0 \cdots R_{t-1} only in the entries of the (r+t)-th and (r+t+1)-th columns, so that
\begin{eqnarray} && \tilde l^t_{i, r+t} = \tilde l^{t-1}_{i, r+t} = 0 \, , \quad \tilde l^t_{i, r+t+1} = \tilde l^{t-1}_{i, r+t+1} = 0 \, , \quad 1 \le i \le r+t-1 \, , \\ && \tilde l^t_{i, r+t} = c \, \tilde l^{t-1}_{i, r+t} + s \, \tilde l^{t-1}_{i, r+t+1} \, , \quad \tilde l^t_{i, r+t+1} = -s \, \tilde l^{t-1}_{i, r+t} + c \, \tilde l^{t-1}_{i, r+t+1} \, , \quad r+t \le i \le n-1 \, , \end{eqnarray}
where
\begin{eqnarray} \tilde l^t_{r+t, r+t} = \sqrt{ (\tilde l^{t-1}_{r+t, r+t})^2 + (\tilde l^{t-1}_{r+t, r+t+1})^2 } \, , \quad \tilde l^t_{r+t, r+t+1} = 0 \, . \end{eqnarray}
In particular, the coefficient \tilde l^t_{r+t, r+t} is nonnegative.
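The whole procedure fits in a few lines of NumPy. The sketch below (0-based indexing, function name of my own choosing) rotates pairs of columns exactly as described above:

```python
import numpy as np

def cholesky_delete_row(L, r):
    """Return the Cholesky factor of A with row and column r removed, given L with A = L L^T."""
    n = L.shape[0]
    H = np.delete(L, r, axis=0)               # (n-1) x n matrix with H H^T = A~
    for k in range(r, n - 1):                 # annihilate h_{k,k+1} for k = r, ..., n-2
        x, y = H[k, k], H[k, k + 1]
        norm = np.hypot(x, y)                 # sqrt(x^2 + y^2) computed safely
        if norm == 0.0:
            continue
        c, s = x / norm, y / norm
        col_k = c * H[:, k] + s * H[:, k + 1]
        col_k1 = -s * H[:, k] + c * H[:, k + 1]
        H[:, k], H[:, k + 1] = col_k, col_k1
        H[k, k], H[k, k + 1] = norm, 0.0      # enforce the exact values on row k
    return H[:, : n - 1]                      # drop the last (now zero) column
```

As a quick check, cholesky_delete_row(np.linalg.cholesky(A), r) should agree, up to rounding, with np.linalg.cholesky(np.delete(np.delete(A, r, axis=0), r, axis=1)).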
In order to avoid unnecessary overflow or underflow during the computation of c and s, it was recommended in [3] to calculate the value of the square root w = \sqrt{x^2 + y^2} as follows:
\begin{eqnarray} v = \max \{ |x|, |y| \} \, , \quad w = \begin{cases} v \sqrt{ (x / v)^2 + (y / v)^2 } \, , & v \ne 0 \, , \\ 0 \, , & v = 0 \, . \end{cases} \end{eqnarray}
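In Python this scaled computation could look as follows (math.hypot and np.hypot already implement the same idea):

```python
import math

def safe_hypot(x, y):
    """Compute sqrt(x^2 + y^2) without intermediate overflow or underflow."""
    v = max(abs(x), abs(y))
    if v == 0.0:
        return 0.0
    return v * math.sqrt((x / v) ** 2 + (y / v) ** 2)
```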
Applying these updating techniques significantly reduces the complexity of the calculations. Thus, in the case of adding a row and the symmetric column to the original matrix, about n^2 flops are needed instead of about \frac{(n + 1)^3}{3} flops for the direct calculation of the new Cholesky factor. In the case of deleting a row and the symmetric column from the original matrix, the new Cholesky factor can be obtained with about 3(n - r)^2 flops (the worst case requires about 3(n - 1)^2 operations) instead of about \frac{(n - 1)^3}{3} flops required for its direct calculation.

References

[1] T. F. Coleman, L. A. Hulbert, A direct active set algorithm for large sparse quadratic programs with simple bounds, Mathematical Programming, 45, 1989.
[2] J. A. George and J. W-H. Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, 1981.
[3] C. L. Lawson, R. J. Hanson, Solving least squares problems, SIAM, 1995.


Simple normal splines examples

\setCounter{0}
In this post we illustrate the normal spline interpolation approach with a few simple examples.
Suppose the following information about a smooth function \varphi (x,y), \, (x,y) \in R^2, is known:
\begin{eqnarray}  && \varphi (0, 0) = 0 \, , \\  &&  \frac{ \partial{\varphi} }{ \partial{x} } (0, 0) =  1 \, , \\   && \frac{ \partial{\varphi} }{ \partial{y} } (0, 0) =  1 \, , \end{eqnarray}
and it is necessary to reconstruct \varphi using this data.
Assume this function is an element of the Hilbert space H^{2.5}_\varepsilon (R^2); it can therefore be treated as a continuously differentiable function \varphi \in C^1 (R^2). We construct a normal spline \sigma_1^{(2.5)} approximating \varphi:
\begin{eqnarray} \sigma_1^{(2.5)} = {\rm arg\,min}\{  \| \varphi \|^2_{H^{2.5}_\varepsilon} : (1), (2), (3),  \forall \varphi \in {H^{2.5}_\varepsilon} (R^2) \} \, . \end{eqnarray}
This spline can be represented as
\begin{eqnarray} \sigma_1^{(2.5)} = \mu_1  h_1 +  \mu'_1  h'_1 +  \mu'_2  h'_2 \, , \end{eqnarray}
where
\begin{eqnarray} &&  h_1 (\eta_1, \eta_2, \varepsilon) = \exp (-\varepsilon \sqrt{\eta_1^2 + \eta_2^2}) (1 + \varepsilon \sqrt{\eta_1^2 + \eta_2^2}) \, , \\  &&  h'_1 (\eta_1, \eta_2, \varepsilon) = \varepsilon^2 \exp (-\varepsilon \sqrt{\eta_1^2 + \eta_2^2}) (\eta_1 + \eta_2) \, , \\  &&  h'_2 (\eta_1, \eta_2, \varepsilon)  =   h'_1 (\eta_1, \eta_2, \varepsilon) \,  , \  (\eta_1, \eta_2) \in R^2 \, , \end{eqnarray}
and coefficients (\mu_1, \mu'_1, \mu'_2) are defined from the system (see the previous post):
\begin{eqnarray} \begin{bmatrix}    1 & 0 & 0  \\    0 & 2\varepsilon^2 & 0  \\    0 & 0 & 2\varepsilon^2  \\    \end{bmatrix} \left[ \begin{array}{c} \mu_1 \\ \mu'_1 \\ \mu'_2 \end{array} \right] = \left[ \begin{array}{c} 0 \\ 1 \\ 1 \end{array} \right] \, . \end{eqnarray}
Finally,
\begin{eqnarray} \sigma_1^{(2.5)} (x, y, \varepsilon) = \exp (-\varepsilon \sqrt{x^2 + y^2}) (x + y) \, , \quad (x,y) \in R^2. \end{eqnarray}
Fig.1 Spline \sigma_1^{(2.5)}, \, \varepsilon = 1

Fig.2 Spline \sigma_1^{(2.5)}, \, \varepsilon = 0.1
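A short numerical check of this result (a sketch; the function name sigma_25 is mine): the value at the origin and central-difference approximations of the partial derivatives reproduce conditions (1)-(3).

```python
import numpy as np

def sigma_25(x, y, eps=1.0):
    """sigma_1^{(2.5)}(x, y) = exp(-eps*sqrt(x^2 + y^2)) * (x + y)"""
    return np.exp(-eps * np.hypot(x, y)) * (x + y)

h = 1e-6
print(sigma_25(0.0, 0.0))                                # ~ 0
print((sigma_25(h, 0.0) - sigma_25(-h, 0.0)) / (2 * h))  # ~ 1
print((sigma_25(0.0, h) - sigma_25(0.0, -h)) / (2 * h))  # ~ 1
```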

Now let \varphi (x,y), \, (x,y) \in R^2, be a twice continuously differentiable function which satisfies the constraints:
\begin{eqnarray}  && \varphi (0, 0) = 0 \, , \\  &&  \frac{ \partial{\varphi} }{ \partial{x} } (0, 0) + \frac{ \partial{\varphi} }{ \partial{y} } (0, 0)  =  2 \, . \end{eqnarray}
We approximate it by constructing a normal spline \sigma_1^{(3.5)} in H^{3.5}_\varepsilon (R^2):
\begin{eqnarray} \sigma_1^{(3.5)} &=& {\rm arg\,min}\{  \| \varphi \|^2_{H^{3.5}_\varepsilon} : (11), (12),  \forall \varphi \in {H^{3.5}_\varepsilon} (R^2) \} \, , \\ \sigma_1^{(3.5)} &=& \mu_1  h_1 +  \mu'_1  h'_1 \, , \end{eqnarray}
where
\begin{eqnarray} &&  h_1 (\eta_1, \eta_2, \varepsilon) = \exp (-\varepsilon \sqrt{\eta_1^2 + \eta_2^2}) (3 + 3\varepsilon \sqrt{\eta_1^2 + \eta_2^2} + \varepsilon^2 (\eta_1^2 + \eta_2^2)) \, , \\  &&  h'_1 (\eta_1, \eta_2, \varepsilon) = \varepsilon^2 \exp (-\varepsilon \sqrt{\eta_1^2 + \eta_2^2}) (1 +\varepsilon \sqrt{\eta_1^2 + \eta_2^2}) (\eta_1 + \eta_2) \, , \end{eqnarray}
and coefficients (\mu_1, \mu'_1) are defined from the system:
\begin{eqnarray} \begin{bmatrix}    3 &  0  \\    0 & 2\varepsilon^2 \\    \end{bmatrix} \left[ \begin{array}{c} \mu_1 \\ \mu'_1 \end{array} \right] = \left[ \begin{array}{c} 0 \\ 2 \end{array} \right] \, . \end{eqnarray}
Therefore
\begin{eqnarray} \sigma_1^{(3.5)} (x, y, \varepsilon) &=& \exp (-\varepsilon \sqrt{x^2 + y^2}) (1 + \varepsilon \sqrt{x^2 + y^2}) (x + y) \, , \\ \nonumber && (x,y) \in R^2. \end{eqnarray}
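Again, a small numerical sketch (the name sigma_35 is mine): solving the diagonal system gives \mu_1 = 0, \mu'_1 = 1/\varepsilon^2, and the resulting spline satisfies conditions (11) and (12).

```python
import numpy as np

def sigma_35(x, y, eps=1.0):
    """sigma_1^{(3.5)}(x, y) = exp(-eps*r) * (1 + eps*r) * (x + y), r = sqrt(x^2 + y^2)"""
    r = np.hypot(x, y)
    return np.exp(-eps * r) * (1.0 + eps * r) * (x + y)

h = 1e-6
dx = (sigma_35(h, 0.0) - sigma_35(-h, 0.0)) / (2 * h)
dy = (sigma_35(0.0, h) - sigma_35(0.0, -h)) / (2 * h)
print(sigma_35(0.0, 0.0), dx + dy)                       # ~ 0 and ~ 2
```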

As the last example, consider the problem of reconstructing a continuously differentiable function \varphi (x), \, x \in R, which satisfies the constraint
\begin{eqnarray} \frac {d\varphi} {dx} (0) = 1 \, , \end{eqnarray}
and is closest to the function z(x) = 2 x, \, x \in R. We approximate it by constructing a normal spline \sigma_1^{(2)} in H^{2}_\varepsilon (R):
\begin{eqnarray} \sigma_1^{(2)} &=& {\rm arg\,min}\{  \| \varphi  - z \|^2_{H^{2}_\varepsilon} : (19)\, ,  \forall \varphi \in {H^{2}_\varepsilon} (R) \} \, , \\ \sigma_1^{(2)} &=& z + \mu'_1  h'_1 = 2x + \mu'_1  h'_1  \, , \end{eqnarray}
Performing calculations analogous to the previous ones, we obtain:
\begin{eqnarray} \sigma_1^{(2)} (x, \varepsilon) = 2 x - x \exp (-\varepsilon |x|) \, , \quad x \in R. \end{eqnarray}
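A final sketch (the function name sigma_2 is mine) verifying constraint (19) by a central difference at zero:

```python
import numpy as np

def sigma_2(x, eps=1.0):
    """sigma_1^{(2)}(x) = 2*x - x*exp(-eps*|x|)"""
    return 2.0 * x - x * np.exp(-eps * np.abs(x))

h = 1e-6
print((sigma_2(h) - sigma_2(-h)) / (2 * h))              # ~ 1
```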