
Friday, February 8, 2019

Algorithms for updating Cholesky factorization

The problem of updating a Cholesky factorization was treated in [1], [2] and [3]. Here the algorithms taken from those works are described. These algorithms compute a factorization $\tilde A = \tilde L \tilde L^T$, where $\tilde A$ is the matrix obtained from the matrix $A = LL^T$ after a row and the symmetric column have been added to or deleted from $A$.
Let $A \in \mathbb{R}^{n \times n}$ be a symmetric positive definite matrix. Applying the Cholesky method to $A$ yields the factorization $A = LL^T$, where $L$ is a lower triangular matrix with positive diagonal elements.
Suppose $\tilde A$ is a positive definite matrix created by adding a row and the symmetric column to $A$:
$$\tilde A = \begin{pmatrix} A & d \\ d^T & \gamma \end{pmatrix}, \qquad d \in \mathbb{R}^n.$$
Then its Cholesky factorization is [2]:
$$\tilde L = \begin{pmatrix} L & 0 \\ e^T & \alpha \end{pmatrix},$$
where
$$e = L^{-1} d,$$
and
$$\alpha = \sqrt{\tau}, \qquad \tau = \gamma - e^T e, \quad \tau > 0.$$
If $\tau \le 0$ then $\tilde A$ is not positive definite.
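The bordering update translates almost directly into code. The following NumPy/SciPy sketch is our own illustration (the helper name `cholesky_add` is hypothetical, not taken from [1]-[3]): it solves the triangular system $Le = d$ by forward substitution, checks that $\tau > 0$, and appends the new row $(e^T, \alpha)$ to $L$.

```python
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_add(L, d, gamma):
    """Cholesky factor of the matrix obtained by bordering A = L @ L.T
    with the column d and the diagonal entry gamma (a sketch)."""
    e = solve_triangular(L, d, lower=True)   # e = L^{-1} d
    tau = gamma - e @ e
    if tau <= 0.0:
        raise ValueError("the bordered matrix is not positive definite")
    n = L.shape[0]
    L_new = np.zeros((n + 1, n + 1))
    L_new[:n, :n] = L                        # the old factor is reused as is
    L_new[n, :n] = e                         # the new row e^T
    L_new[n, n] = np.sqrt(tau)               # the new diagonal entry alpha
    return L_new
```

Only the triangular solve costs $O(n^2)$ operations here, which is the source of the savings over refactoring the bordered matrix from scratch.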

Now assume $\tilde A$ is obtained from $A$ by deleting the $r$th row and column of $A$. The matrix $\tilde A$ is positive definite (as is any principal square submatrix of a positive definite matrix). It is shown in [1], [3] how to get $\tilde L$ from $L$ in this case. Let us partition $A$ and $L$ along the $r$th row and column:
$$A = \begin{pmatrix} A_{11} & a_{1r} & A_{12} \\ a_{1r}^T & a_{rr} & a_{2r}^T \\ A_{21} & a_{2r} & A_{22} \end{pmatrix}, \qquad L = \begin{pmatrix} L_{11} & & \\ l_{1r}^T & l_{rr} & \\ L_{21} & l_{2r} & L_{22} \end{pmatrix}.$$
Then $\tilde A$ can be written as
$$\tilde A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}.$$
By deleting the $r$th row from $L$ we get the matrix
$$H = \begin{pmatrix} L_{11} & 0 & 0 \\ L_{21} & l_{2r} & L_{22} \end{pmatrix}$$
with the property that $HH^T = \tilde A$. The factor $\tilde L$ can be obtained from $H$ by applying Givens rotations to the matrix elements $h_{r,r+1}, h_{r+1,r+2}, \ldots, h_{n-1,n}$ and deleting the last column of the resulting matrix $\tilde H$:
$$\tilde H = HR = \begin{pmatrix} L_{11} & 0 & 0 \\ L_{21} & \tilde L_{22} & 0 \end{pmatrix} = \begin{pmatrix} \tilde L & 0 \end{pmatrix},$$
where $R = R_0 R_1 \cdots R_{n-1-r}$ is the product of Givens rotation matrices, $L_{11}$ and $\tilde L_{22}$ are lower triangular matrices, and
$$\tilde L = \begin{pmatrix} L_{11} & \\ L_{21} & \tilde L_{22} \end{pmatrix}, \qquad \tilde H \tilde H^T = H R R^T H^T = H H^T = \tilde A, \quad \tilde H \tilde H^T = \tilde L \tilde L^T, \quad \tilde L \tilde L^T = \tilde A.$$
The orthogonal matrix $R_t$ $(0 \le t \le n-1-r)$, $R_t \in \mathbb{R}^{n \times n}$, which defines the Givens rotation annihilating the $(r+t, r+t+1)$th element of the matrix $H R_0 \cdots R_{t-1}$, is defined as
$$R_t = \begin{pmatrix} 1 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & \cdots & c & -s & \cdots & 0 \\ 0 & \cdots & s & c & \cdots & 0 \\ \vdots & & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & 0 & \cdots & 1 \end{pmatrix},$$
where the $(r+t, r+t)$ and $(r+t+1, r+t+1)$ entries equal $c$, the $(r+t, r+t+1)$ entry equals $-s$, the $(r+t+1, r+t)$ entry equals $s$, and $c^2 + s^2 = 1$. Let $\tilde l^{\,t-1}_{ij}$ denote the coefficients of the matrix $H R_0 \cdots R_{t-1}$ (with $\tilde l^{\,-1}_{ij}$ the coefficients of $H$) and set
$$c = \frac{\tilde l^{\,t-1}_{r+t,r+t}}{\sqrt{\left(\tilde l^{\,t-1}_{r+t,r+t}\right)^2 + \left(\tilde l^{\,t-1}_{r+t,r+t+1}\right)^2}}, \qquad s = \frac{\tilde l^{\,t-1}_{r+t,r+t+1}}{\sqrt{\left(\tilde l^{\,t-1}_{r+t,r+t}\right)^2 + \left(\tilde l^{\,t-1}_{r+t,r+t+1}\right)^2}}.$$
Then the matrix $H R_0 \cdots R_t$ differs from $H R_0 \cdots R_{t-1}$ only in the entries of columns $r+t$ and $r+t+1$, so that
$$\tilde l^{\,t}_{i,r+t} = \tilde l^{\,t-1}_{i,r+t} = 0, \quad \tilde l^{\,t}_{i,r+t+1} = \tilde l^{\,t-1}_{i,r+t+1} = 0, \qquad 1 \le i \le r+t-1,$$
$$\tilde l^{\,t}_{i,r+t} = c\, \tilde l^{\,t-1}_{i,r+t} + s\, \tilde l^{\,t-1}_{i,r+t+1}, \quad \tilde l^{\,t}_{i,r+t+1} = -s\, \tilde l^{\,t-1}_{i,r+t} + c\, \tilde l^{\,t-1}_{i,r+t+1}, \qquad r+t \le i \le n-1,$$
where
$$\tilde l^{\,t}_{r+t,r+t} = \sqrt{\left(\tilde l^{\,t-1}_{r+t,r+t}\right)^2 + \left(\tilde l^{\,t-1}_{r+t,r+t+1}\right)^2}, \qquad \tilde l^{\,t}_{r+t,r+t+1} = 0.$$
In particular, the coefficient $\tilde l^{\,t}_{r+t,r+t}$ is nonnegative.
In order to avoid unnecessary overflow or underflow during the computation of $c$ and $s$, it is recommended in [3] to calculate the value of the square root $w = \sqrt{x^2 + y^2}$ as follows:
$$v = \max\{|x|, |y|\}, \quad u = \min\{|x|, |y|\}, \quad w = \begin{cases} v \sqrt{1 + \left(\dfrac{u}{v}\right)^2}, & v \ne 0, \\ 0, & v = 0. \end{cases}$$
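The deletion update is equally compact in code. Below is a NumPy sketch of the whole procedure (again our own illustration; `cholesky_delete` is a hypothetical name, and indices are 0-based, unlike the 1-based formulas above). NumPy's `np.hypot` implements exactly the overflow-safe $\sqrt{x^2+y^2}$ recipe just described, so we use it to form $c$ and $s$:

```python
import numpy as np

def cholesky_delete(L, r):
    """Cholesky factor of A with row/column r removed, given A = L @ L.T
    (a sketch of the Givens-rotation scheme described above)."""
    n = L.shape[0]
    H = np.delete(L, r, axis=0)              # drop the r-th row: (n-1) x n
    for t in range(r, n - 1):                # annihilate H[t, t+1], t = r..n-2
        x, y = H[t, t], H[t, t + 1]
        w = np.hypot(x, y)                   # overflow-safe sqrt(x^2 + y^2)
        c, s = x / w, y / w
        # Rotate columns t and t+1; rows above t hold zeros there already.
        col_a = c * H[t:, t] + s * H[t:, t + 1]
        col_b = -s * H[t:, t] + c * H[t:, t + 1]
        H[t:, t], H[t:, t + 1] = col_a, col_b
        H[t, t + 1] = 0.0                    # zero by construction; set exactly
    return H[:, : n - 1]                     # the last column is now zero
```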
Applying this technique to the Cholesky factorization significantly reduces the amount of computation. In the case of adding a row and the symmetric column to the original matrix it is necessary to carry out about $n^2$ flops instead of the about $(n+1)^3/3$ flops required for the direct calculation of the new Cholesky factor. In the case of deleting a row and the symmetric column from the original matrix, the new Cholesky factor can be obtained with about $3(n-r)^2$ flops (the worst case requires about $3(n-1)^2$ operations) instead of the about $(n-1)^3/3$ flops required for its direct calculation.
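A quick way to convince oneself that both sketches above are correct is to compare them against a direct factorization of the modified matrix (a hypothetical test, assuming the two helpers defined earlier):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + 5.0 * np.eye(5)                # a symmetric positive definite matrix
L = np.linalg.cholesky(A)

# Adding a row and the symmetric column.
d, gamma = rng.standard_normal(5), 10.0
A_add = np.block([[A, d[:, None]], [d[None, :], np.array([[gamma]])]])
assert np.allclose(cholesky_add(L, d, gamma), np.linalg.cholesky(A_add))

# Deleting the r-th row and column (0-based r = 2).
r = 2
A_del = np.delete(np.delete(A, r, axis=0), r, axis=1)
assert np.allclose(cholesky_delete(L, r), np.linalg.cholesky(A_del))
```

Both comparisons go through because the Cholesky factor with positive diagonal is unique.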

References

[1] T. F. Coleman, L. A. Hulbert, A direct active set algorithm for large sparse quadratic programs with simple bounds, Mathematical Programming, 45, 1989.
[2] J. A. George, J. W.-H. Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, 1981.
[3] C. L. Lawson, R. J. Hanson, Solving Least Squares Problems, SIAM, 1995.


Simple normal splines examples


In this post we illustrate the normal spline interpolation approach with a few simple examples.
Suppose we have the following information about a smooth function $\varphi(x, y)$, $(x, y) \in \mathbb{R}^2$:
$$\varphi(0,0) = 0, \tag{1}$$
$$\frac{\partial \varphi}{\partial x}(0,0) = 1, \tag{2}$$
$$\frac{\partial \varphi}{\partial y}(0,0) = 1, \tag{3}$$
and it is necessary to reconstruct $\varphi$ using this data.
Assume this function is an element of the Hilbert space $H^{2.5}_\varepsilon(\mathbb{R}^2)$, thus it can be treated as a continuously differentiable function, $\varphi \in C^1(\mathbb{R}^2)$, and construct a normal spline $\sigma^{(2.5)}_1$ approximating $\varphi$:
$$\sigma^{(2.5)}_1 = {\rm arg\,min}\left\{ \|\varphi\|^2_{H^{2.5}_\varepsilon} : (1), (2), (3), \ \varphi \in H^{2.5}_\varepsilon(\mathbb{R}^2) \right\}.$$
This spline can be presented as
$$\sigma^{(2.5)}_1 = \mu_1 h_1 + \mu'_1 h'_1 + \mu'_2 h'_2,$$
here
$$h_1(\eta_1, \eta_2, \varepsilon) = \exp\left(-\varepsilon \sqrt{\eta_1^2 + \eta_2^2}\right)\left(1 + \varepsilon \sqrt{\eta_1^2 + \eta_2^2}\right),$$
$$h'_1(\eta_1, \eta_2, \varepsilon) = \varepsilon^2 \exp\left(-\varepsilon \sqrt{\eta_1^2 + \eta_2^2}\right) \eta_1,$$
$$h'_2(\eta_1, \eta_2, \varepsilon) = h'_1(\eta_2, \eta_1, \varepsilon) = \varepsilon^2 \exp\left(-\varepsilon \sqrt{\eta_1^2 + \eta_2^2}\right) \eta_2, \qquad (\eta_1, \eta_2) \in \mathbb{R}^2,$$
and the coefficients $(\mu_1, \mu'_1, \mu'_2)$ are determined from the system (see the previous post):
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & \varepsilon^2 & 0 \\ 0 & 0 & \varepsilon^2 \end{bmatrix} \begin{bmatrix} \mu_1 \\ \mu'_1 \\ \mu'_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}.$$
Eventually
$$\sigma^{(2.5)}_1(x, y, \varepsilon) = \exp\left(-\varepsilon \sqrt{x^2 + y^2}\right)(x + y), \qquad (x, y) \in \mathbb{R}^2.$$
Fig.1 Spline $\sigma^{(2.5)}_1$, $\varepsilon = 1$.

Fig.2 Spline $\sigma^{(2.5)}_1$, $\varepsilon = 0.1$.
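For reference, the two figures can be reproduced with a few lines of matplotlib (our own plotting sketch, not the original post's code):

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (needed for older matplotlib)

def sigma_2_5(x, y, eps):
    # The reconstructed spline sigma^(2.5)_1 from the formula above.
    return np.exp(-eps * np.sqrt(x**2 + y**2)) * (x + y)

grid = np.linspace(-5.0, 5.0, 101)
X, Y = np.meshgrid(grid, grid)
fig = plt.figure(figsize=(10, 4))
for i, eps in enumerate((1.0, 0.1), start=1):
    ax = fig.add_subplot(1, 2, i, projection="3d")
    ax.plot_surface(X, Y, sigma_2_5(X, Y, eps), cmap="viridis")
    ax.set_title(f"eps = {eps}")
plt.show()
```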

Now let $\varphi(x, y)$, $(x, y) \in \mathbb{R}^2$, be a twice continuously differentiable function which satisfies the constraints:
$$\varphi(0,0) = 0, \tag{11}$$
$$\frac{\partial \varphi}{\partial x}(0,0) + \frac{\partial \varphi}{\partial y}(0,0) = 2. \tag{12}$$
We approximate it by constructing a normal spline $\sigma^{(3.5)}_1$ in $H^{3.5}_\varepsilon(\mathbb{R}^2)$:
$$\sigma^{(3.5)}_1 = {\rm arg\,min}\left\{ \|\varphi\|^2_{H^{3.5}_\varepsilon} : (11), (12), \ \varphi \in H^{3.5}_\varepsilon(\mathbb{R}^2) \right\}, \qquad \sigma^{(3.5)}_1 = \mu_1 h_1 + \mu'_1 h'_1,$$
where
$$h_1(\eta_1, \eta_2, \varepsilon) = \exp\left(-\varepsilon \sqrt{\eta_1^2 + \eta_2^2}\right)\left(3 + 3\varepsilon \sqrt{\eta_1^2 + \eta_2^2} + \varepsilon^2 (\eta_1^2 + \eta_2^2)\right),$$
$$h'_1(\eta_1, \eta_2, \varepsilon) = \varepsilon^2 \exp\left(-\varepsilon \sqrt{\eta_1^2 + \eta_2^2}\right)\left(1 + \varepsilon \sqrt{\eta_1^2 + \eta_2^2}\right)(\eta_1 + \eta_2),$$
and the coefficients $(\mu_1, \mu'_1)$ are determined from the system:
$$\begin{bmatrix} 3 & 0 \\ 0 & 2\varepsilon^2 \end{bmatrix} \begin{bmatrix} \mu_1 \\ \mu'_1 \end{bmatrix} = \begin{bmatrix} 0 \\ 2 \end{bmatrix}.$$
Therefore
$$\sigma^{(3.5)}_1(x, y, \varepsilon) = \exp\left(-\varepsilon \sqrt{x^2 + y^2}\right)\left(1 + \varepsilon \sqrt{x^2 + y^2}\right)(x + y), \qquad (x, y) \in \mathbb{R}^2.$$
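A short numerical check (our own, using central differences) confirms that this spline satisfies constraints (11) and (12): it vanishes at the origin and its partial derivatives there sum to 2, for any $\varepsilon > 0$:

```python
import numpy as np

def sigma_3_5(x, y, eps=1.0):
    # The reconstructed spline sigma^(3.5)_1 from the formula above.
    r = np.sqrt(x**2 + y**2)
    return np.exp(-eps * r) * (1.0 + eps * r) * (x + y)

h = 1e-6                                     # central-difference step
dx = (sigma_3_5(h, 0.0) - sigma_3_5(-h, 0.0)) / (2 * h)
dy = (sigma_3_5(0.0, h) - sigma_3_5(0.0, -h)) / (2 * h)
print(sigma_3_5(0.0, 0.0), dx + dy)          # prints ~0.0 and ~2.0
```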

As the last example, consider the problem of reconstructing a continuously differentiable function $\varphi(x)$, $x \in \mathbb{R}$, which satisfies the constraint
$$\frac{d\varphi}{dx}(0) = 1, \tag{19}$$
and is closest to the function $z(x) = 2x$, $x \in \mathbb{R}$. We approximate it by constructing a normal spline $\sigma^{(2)}_1$ in $H^2_\varepsilon(\mathbb{R})$:
$$\sigma^{(2)}_1 = {\rm arg\,min}\left\{ \|\varphi - z\|^2_{H^2_\varepsilon} : (19), \ \varphi \in H^2_\varepsilon(\mathbb{R}) \right\}, \qquad \sigma^{(2)}_1 = z + \mu'_1 h'_1 = 2x + \mu'_1 h'_1.$$
Performing calculations analogous to the previous ones, we receive:
$$\sigma^{(2)}_1(x, \varepsilon) = 2x - x \exp(-\varepsilon |x|), \qquad x \in \mathbb{R}.$$
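The same kind of check (ours) works here: the derivative at the origin equals 1, while away from the origin the correction term dies out and $\sigma^{(2)}_1$ approaches $z(x) = 2x$:

```python
import numpy as np

def sigma_2(x, eps=1.0):
    # The reconstructed spline sigma^(2)_1 from the formula above.
    return 2.0 * x - x * np.exp(-eps * np.abs(x))

h = 1e-6
print((sigma_2(h) - sigma_2(-h)) / (2 * h))  # ~1.0, i.e. constraint (19)
print(sigma_2(10.0))                         # ~20.0: close to z(10) = 20
```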