
Friday, February 8, 2019

Algorithms for updating Cholesky factorization

A problem of updating a Cholesky factorization was treated in [1], [2] and [3]. Here the algorithms taken from those works are described. These algorithms compute a factorization \tilde A = \tilde L \tilde L^T, where \tilde A is the matrix obtained from the matrix A = L L^T after a row and the corresponding symmetric column have been added to or deleted from A.
Let A \in R^{n \times n} be a symmetric positive definite matrix. Applying the Cholesky method to A yields the factorization A = L L^T, where L is a lower triangular matrix with positive diagonal elements.
Suppose \tilde A is a positive definite matrix created by adding a row and the symmetric column to A:
\begin{eqnarray} \tilde A = \begin{bmatrix} A & d \\ d^T & \gamma \end{bmatrix} \, , \quad d \in R^n \, , \ \gamma \in R \, . \end{eqnarray}
Then its Cholesky factorization is [2]:
\begin{eqnarray} \tilde L = \begin{bmatrix} L & \\ e^T & \alpha \end{bmatrix} \, , \end{eqnarray}
where
\begin{eqnarray} e = L^{-1} d \, , \end{eqnarray}
and
\begin{eqnarray} \alpha = \sqrt{\tau} \, , \quad \tau = \gamma - e^T e \, , \quad \tau > 0 \, . \end{eqnarray}
If \tau \le 0 then \tilde A is not positive definite.
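For illustration, here is a minimal NumPy/SciPy sketch of this update (the function name cholesky_add_row and the test at the bottom are my own choices, not taken from [1]-[3]); the whole update reduces to one forward substitution with L:

```python
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_add_row(L, d, gamma):
    """Return the Cholesky factor of [[A, d], [d^T, gamma]] given L with A = L L^T."""
    e = solve_triangular(L, d, lower=True)   # e = L^{-1} d by forward substitution
    tau = gamma - e @ e
    if tau <= 0.0:
        raise ValueError("updated matrix is not positive definite")
    n = L.shape[0]
    L_new = np.zeros((n + 1, n + 1))
    L_new[:n, :n] = L
    L_new[n, :n] = e                          # new last row is [e^T, alpha]
    L_new[n, n] = np.sqrt(tau)                # alpha = sqrt(tau)
    return L_new

# quick check against a direct factorization of the extended matrix
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4.0 * np.eye(4)                 # symmetric positive definite
L = np.linalg.cholesky(A)
d = rng.standard_normal(4)
gamma = d @ np.linalg.solve(A, d) + 1.0       # ensures tau = 1 > 0
A_ext = np.block([[A, d[:, None]], [d[None, :], np.array([[gamma]])]])
assert np.allclose(cholesky_add_row(L, d, gamma), np.linalg.cholesky(A_ext))
```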

Now assume \tilde A is obtained from A by deleting the r-th row and column of A. The matrix \tilde A is positive definite (as is any principal submatrix of a positive definite matrix). It is shown in [1], [3] how to obtain \tilde L from L in this case. Let us partition A and L along the r-th row and column:
\begin{eqnarray} A = \begin{bmatrix} A_{11} & a_{1r} & A_{12} \\ a_{1r}^T & a_{rr} & a_{2r}^T \\ A_{21} & a_{2r} & A_{22} \end{bmatrix} \, , \quad L = \begin{bmatrix} L_{11} & & \\ l_{1r}^T & l_{rr} & \\ L_{21} & l_{2r} & L_{22} \end{bmatrix} \, . \end{eqnarray}
Then \tilde A can be written as
\begin{eqnarray} \tilde A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \, . \end{eqnarray}
By deleting the r-th row from L we get the matrix
\begin{eqnarray} H = \begin{bmatrix} L_{11} & & \\ L_{21} & l_{2r} & L_{22} \end{bmatrix} \end{eqnarray}
with the property that H H^T = \tilde A. The factor \tilde L can be obtained from H by applying Givens rotations to the matrix elements h_{r,r+1}, h_{r+1,r+2}, \dots, h_{n-1,n} and deleting the last column of the resulting matrix \tilde H:
\begin{eqnarray} \tilde H = H R = \begin{bmatrix} L_{11} & & 0 \\ L_{21} & \tilde L_{22} & 0 \end{bmatrix} = \begin{bmatrix} \tilde L & 0 \end{bmatrix} \, , \end{eqnarray}
where R = R_0 R_1 \cdots R_{n-1-r} is the product of the Givens rotation matrices, L_{11} and \tilde L_{22} are lower triangular matrices, and
\begin{eqnarray} \tilde L = \begin{bmatrix} L_{11} & \\ L_{21} & \tilde L_{22} \end{bmatrix} \, , \quad \tilde H \tilde H^T = H R R^T H^T = H H^T = \tilde A \, , \quad \tilde H \tilde H^T = \tilde L \tilde L^T \, , \quad \tilde L \tilde L^T = \tilde A \, . \end{eqnarray}
The orthogonal matrix R_t \ (0 \le t \le n-1-r), \ R_t \in R^{n \times n}, which defines the Givens rotation annihilating the (r+t, r+t+1)-th element of the matrix H R_0 \cdots R_{t-1}, has the form
\begin{eqnarray} R_t = \begin{bmatrix} 1 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \ddots & & & & \vdots \\ 0 & & c & -s & & 0 \\ 0 & & s & c & & 0 \\ \vdots & & & & \ddots & \vdots \\ 0 & \cdots & 0 & 0 & \cdots & 1 \end{bmatrix} \, , \end{eqnarray}
where the entries (r+t, r+t) and (r+t+1, r+t+1) equal c, the (r+t, r+t+1) entry equals -s, the (r+t+1, r+t) entry equals s, and c^2 + s^2 = 1. Let \tilde l^{t-1}_{ij} denote the coefficients of the matrix H R_0 \cdots R_{t-1} (\tilde l^{-1}_{ij} being the coefficients of H) and
\begin{eqnarray} c = \frac{ \tilde l^{t-1}_{r+t, r+t} }{ \sqrt{ (\tilde l^{t-1}_{r+t, r+t})^2 + (\tilde l^{t-1}_{r+t, r+t+1})^2 } } \, , \quad s = \frac{ \tilde l^{t-1}_{r+t, r+t+1} }{ \sqrt{ (\tilde l^{t-1}_{r+t, r+t})^2 + (\tilde l^{t-1}_{r+t, r+t+1})^2 } } \, . \end{eqnarray}
Then the matrix H R_0 \cdots R_t differs from H R_0 \cdots R_{t-1} only in the entries of the (r+t)-th and (r+t+1)-th columns, so that
\begin{eqnarray} && \tilde l^t_{i, r+t} = \tilde l^{t-1}_{i, r+t} = 0 \, , \quad \tilde l^t_{i, r+t+1} = \tilde l^{t-1}_{i, r+t+1} = 0 \, , \quad 1 \le i \le r+t-1 \, , \\ && \tilde l^t_{i, r+t} = c \, \tilde l^{t-1}_{i, r+t} + s \, \tilde l^{t-1}_{i, r+t+1} \, , \quad \tilde l^t_{i, r+t+1} = -s \, \tilde l^{t-1}_{i, r+t} + c \, \tilde l^{t-1}_{i, r+t+1} \, , \quad r+t \le i \le n-1 \, , \end{eqnarray}
where
\begin{eqnarray} \tilde l^t_{r+t, r+t} = \sqrt{ (\tilde l^{t-1}_{r+t, r+t})^2 + (\tilde l^{t-1}_{r+t, r+t+1})^2 } \, , \quad \tilde l^t_{r+t, r+t+1} = 0 \, . \end{eqnarray}
In particular, the coefficient \tilde l^t_{r+t, r+t} is nonnegative.
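The whole procedure fits in a few lines of NumPy. The sketch below (0-based indexing, function name of my own choosing) rotates pairs of columns exactly as described above:

```python
import numpy as np

def cholesky_delete_row(L, r):
    """Return the Cholesky factor of A with row and column r removed, given L with A = L L^T."""
    n = L.shape[0]
    H = np.delete(L, r, axis=0)               # (n-1) x n matrix with H H^T = A~
    for k in range(r, n - 1):                 # annihilate h_{k,k+1} for k = r, ..., n-2
        x, y = H[k, k], H[k, k + 1]
        norm = np.hypot(x, y)                 # sqrt(x^2 + y^2) computed safely
        if norm == 0.0:
            continue
        c, s = x / norm, y / norm
        col_k = c * H[:, k] + s * H[:, k + 1]
        col_k1 = -s * H[:, k] + c * H[:, k + 1]
        H[:, k], H[:, k + 1] = col_k, col_k1
        H[k, k], H[k, k + 1] = norm, 0.0      # enforce the exact values on row k
    return H[:, : n - 1]                      # drop the last (now zero) column
```

As a quick check, cholesky_delete_row(np.linalg.cholesky(A), r) should agree, up to rounding, with np.linalg.cholesky(np.delete(np.delete(A, r, axis=0), r, axis=1)).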
In order to avoid unnecessary overflow or underflow during the computation of c and s, it was recommended in [3] to calculate the value of the square root w = \sqrt{x^2 + y^2} as follows:
\begin{eqnarray} v = \max \{ |x|, |y| \} \, , \quad w = \begin{cases} v \sqrt{ (x / v)^2 + (y / v)^2 } \, , & v \ne 0 \, , \\ 0 \, , & v = 0 \, . \end{cases} \end{eqnarray}
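In Python this scaled computation could look as follows (math.hypot and np.hypot already implement the same idea):

```python
import math

def safe_hypot(x, y):
    """Compute sqrt(x^2 + y^2) without intermediate overflow or underflow."""
    v = max(abs(x), abs(y))
    if v == 0.0:
        return 0.0
    return v * math.sqrt((x / v) ** 2 + (y / v) ** 2)
```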
Applying these updating techniques significantly reduces the complexity of the calculations. Thus, in the case of adding a row and the symmetric column to the original matrix, about n^2 flops are needed instead of about \frac{(n + 1)^3}{3} flops for the direct calculation of the new Cholesky factor. In the case of deleting a row and the symmetric column from the original matrix, the new Cholesky factor can be obtained with about 3(n - r)^2 flops (the worst case requires about 3(n - 1)^2 operations) instead of about \frac{(n - 1)^3}{3} flops required for its direct calculation.

References

[1] T. F. Coleman, L. A. Hulbert, A direct active set algorithm for large sparse quadratic programs with simple bounds, Mathematical Programming, 45, 1989.
[2] J. A. George and J. W-H. Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, 1981.
[3] C. L. Lawson, R. J. Hanson, Solving least squares problems, SIAM, 1995.


Simple normal splines examples

\setCounter{0}
In this post we illustrate the normal spline interpolation approach with a few simple examples.
Suppose the following information about a smooth function \varphi (x,y), \, (x,y) \in R^2, is known:
\begin{eqnarray}  && \varphi (0, 0) = 0 \, , \\  &&  \frac{ \partial{\varphi} }{ \partial{x} } (0, 0) =  1 \, , \\   && \frac{ \partial{\varphi} }{ \partial{y} } (0, 0) =  1 \, , \end{eqnarray}
and it is necessary to reconstruct \varphi using this data.
Assume this function is an element of the Hilbert space H^{2.5}_\varepsilon (R^2); it can therefore be treated as a continuously differentiable function \varphi \in C^1 (R^2). We construct a normal spline \sigma_1^{(2.5)} approximating \varphi:
\begin{eqnarray} \sigma_1^{(2.5)} = {\rm arg\,min}\{  \| \varphi \|^2_{H^{2.5}_\varepsilon} : (1), (2), (3),  \forall \varphi \in {H^{2.5}_\varepsilon} (R^2) \} \, . \end{eqnarray}
This spline can be represented as
\begin{eqnarray} \sigma_1^{(2.5)} = \mu_1  h_1 +  \mu'_1  h'_1 +  \mu'_2  h'_2 \, , \end{eqnarray}
where
\begin{eqnarray} &&  h_1 (\eta_1, \eta_2, \varepsilon) = \exp (-\varepsilon \sqrt{\eta_1^2 + \eta_2^2}) (1 + \varepsilon \sqrt{\eta_1^2 + \eta_2^2}) \, , \\  &&  h'_1 (\eta_1, \eta_2, \varepsilon) = \varepsilon^2 \exp (-\varepsilon \sqrt{\eta_1^2 + \eta_2^2}) (\eta_1 + \eta_2) \, , \\  &&  h'_2 (\eta_1, \eta_2, \varepsilon)  =   h'_1 (\eta_1, \eta_2, \varepsilon) \,  , \  (\eta_1, \eta_2) \in R^2 \, , \end{eqnarray}
and coefficients (\mu_1, \mu'_1, \mu'_2) are defined from the system (see the previous post):
\begin{eqnarray} \begin{bmatrix}    1 & 0 & 0  \\    0 & 2\varepsilon^2 & 0  \\    0 & 0 & 2\varepsilon^2  \\    \end{bmatrix} \left[ \begin{array}{c} \mu_1 \\ \mu'_1 \\ \mu'_2 \end{array} \right] = \left[ \begin{array}{c} 0 \\ 1 \\ 1 \end{array} \right] \, . \end{eqnarray}
Finally,
\begin{eqnarray} \sigma_1^{(2.5)} (x, y, \varepsilon) = \exp (-\varepsilon \sqrt{x^2 + y^2}) (x + y) \, , \quad (x,y) \in R^2. \end{eqnarray}
Fig.1 Spline \sigma_1^{(2.5)}, \, \varepsilon = 1

Fig.2 Spline \sigma_1^{(2.5)}, \, \varepsilon = 0.1
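A short numerical check of this result (a sketch; the function name sigma_25 is mine): the value at the origin and central-difference approximations of the partial derivatives reproduce conditions (1)-(3).

```python
import numpy as np

def sigma_25(x, y, eps=1.0):
    """sigma_1^{(2.5)}(x, y) = exp(-eps*sqrt(x^2 + y^2)) * (x + y)"""
    return np.exp(-eps * np.hypot(x, y)) * (x + y)

h = 1e-6
print(sigma_25(0.0, 0.0))                                # ~ 0
print((sigma_25(h, 0.0) - sigma_25(-h, 0.0)) / (2 * h))  # ~ 1
print((sigma_25(0.0, h) - sigma_25(0.0, -h)) / (2 * h))  # ~ 1
```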

Now let \varphi (x,y), \, (x,y) \in R^2, be a twice continuously differentiable function which satisfies the constraints:
\begin{eqnarray}  && \varphi (0, 0) = 0 \, , \\  &&  \frac{ \partial{\varphi} }{ \partial{x} } (0, 0) + \frac{ \partial{\varphi} }{ \partial{y} } (0, 0)  =  2 \, . \end{eqnarray}
We approximate it by constructing a normal spline \sigma_1^{(3.5)} in H^{3.5}_\varepsilon (R^2):
\begin{eqnarray} \sigma_1^{(3.5)} &=& {\rm arg\,min}\{  \| \varphi \|^2_{H^{3.5}_\varepsilon} : (11), (12),  \forall \varphi \in {H^{3.5}_\varepsilon} (R^2) \} \, , \\ \sigma_1^{(3.5)} &=& \mu_1  h_1 +  \mu'_1  h'_1 \, , \end{eqnarray}
where
\begin{eqnarray} &&  h_1 (\eta_1, \eta_2, \varepsilon) = \exp (-\varepsilon \sqrt{\eta_1^2 + \eta_2^2}) (3 + 3\varepsilon \sqrt{\eta_1^2 + \eta_2^2} + \varepsilon^2 (\eta_1^2 + \eta_2^2)) \, , \\  &&  h'_1 (\eta_1, \eta_2, \varepsilon) = \varepsilon^2 \exp (-\varepsilon \sqrt{\eta_1^2 + \eta_2^2}) (1 +\varepsilon \sqrt{\eta_1^2 + \eta_2^2}) (\eta_1 + \eta_2) \, , \end{eqnarray}
and coefficients (\mu_1, \mu'_1) are defined from the system:
\begin{eqnarray} \begin{bmatrix}    3 &  0  \\    0 & 2\varepsilon^2 \\    \end{bmatrix} \left[ \begin{array}{c} \mu_1 \\ \mu'_1 \end{array} \right] = \left[ \begin{array}{c} 0 \\ 2 \end{array} \right] \, . \end{eqnarray}
Therefore
\begin{eqnarray} \sigma_1^{(3.5)} (x, y, \varepsilon) &=& \exp (-\varepsilon \sqrt{x^2 + y^2}) (1 + \varepsilon \sqrt{x^2 + y^2}) (x + y) \, , \\ \nonumber && (x,y) \in R^2. \end{eqnarray}
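Again, a small numerical sketch (the name sigma_35 is mine): solving the diagonal system gives \mu_1 = 0, \mu'_1 = 1/\varepsilon^2, and the resulting spline satisfies conditions (11) and (12).

```python
import numpy as np

def sigma_35(x, y, eps=1.0):
    """sigma_1^{(3.5)}(x, y) = exp(-eps*r) * (1 + eps*r) * (x + y), r = sqrt(x^2 + y^2)"""
    r = np.hypot(x, y)
    return np.exp(-eps * r) * (1.0 + eps * r) * (x + y)

h = 1e-6
dx = (sigma_35(h, 0.0) - sigma_35(-h, 0.0)) / (2 * h)
dy = (sigma_35(0.0, h) - sigma_35(0.0, -h)) / (2 * h)
print(sigma_35(0.0, 0.0), dx + dy)                       # ~ 0 and ~ 2
```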

As the last example, consider the problem of reconstructing a continuously differentiable function \varphi (x), \, x \in R, which satisfies the constraint
\begin{eqnarray} \frac {d\varphi} {dx} (0) = 1 \, , \end{eqnarray}
and is closest to the function z(x) = 2 x, \, x \in R. We approximate it by constructing a normal spline \sigma_1^{(2)} in H^{2}_\varepsilon (R):
\begin{eqnarray} \sigma_1^{(2)} &=& {\rm arg\,min}\{  \| \varphi  - z \|^2_{H^{2}_\varepsilon} : (19)\, ,  \forall \varphi \in {H^{2}_\varepsilon} (R) \} \, , \\ \sigma_1^{(2)} &=& z + \mu'_1  h'_1 = 2x + \mu'_1  h'_1  \, , \end{eqnarray}
Performing calculations analogous to the previous ones, we obtain:
\begin{eqnarray} \sigma_1^{(2)} (x, \varepsilon) = 2 x - x \exp (-\varepsilon |x|) \, , \quad x \in R. \end{eqnarray}
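A final sketch (the function name sigma_2 is mine) verifying constraint (19) by a central difference at zero:

```python
import numpy as np

def sigma_2(x, eps=1.0):
    """sigma_1^{(2)}(x) = 2*x - x*exp(-eps*|x|)"""
    return 2.0 * x - x * np.exp(-eps * np.abs(x))

h = 1e-6
print((sigma_2(h) - sigma_2(-h)) / (2 * h))              # ~ 1
```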