Splines in Bessel Potential Spaces
A blog dedicated to multivariate normal splines theory and applications.
Friday, November 29, 2019
Interpolating Normal Splines: One-dimensional case
This topic is concerned with the numerical solution of the following interpolation problems:
Problem 1. Given points $x_1 < x_2 < \dots < x_{n_1}$, find a function $f$ such that
$$ f(x_i) = u_i, \quad i = 1, 2, \dots, n_1. \tag{1} $$
Problem 2. Given points $x_1 < x_2 < \dots < x_{n_1}$ and $s_1 < s_2 < \dots < s_{n_2}$, find a function $f$ such that
$$ f(x_i) = u_i, \ i = 1, \dots, n_1, \qquad f'(s_j) = v_j, \ j = 1, \dots, n_2, \qquad n_1 \ge 0, \ n_2 \ge 0. \tag{2} $$
Problem 3. Given points $x_1 < x_2 < \dots < x_{n_1}$, $s_1 < s_2 < \dots < s_{n_2}$ and $t_1 < t_2 < \dots < t_{n_3}$, find a function $f$ such that
$$ f(x_i) = u_i, \quad f'(s_j) = v_j, \quad f''(t_k) = w_k, \qquad i = 1,\dots,n_1, \ j = 1,\dots,n_2, \ k = 1,\dots,n_3, \qquad n_1, n_2, n_3 \ge 0. \tag{3} $$
Note that the knots $\{x_i\}$, $\{s_j\}$ and $\{t_k\}$ may coincide.
Assume that the function $f$ is an element of a Hilbert space $H = H(X)$ over a set $X$ (here $X$ is $\mathbb R$ or an interval in $\mathbb R$), and that $H$ is selected in such a way that it is continuously embedded in the space $C^2(X)$ of functions continuous together with their first two derivatives. Then the functionals $F_i$, $F'_j$ and $F''_k$,
$$ F_i(\varphi) = \varphi(x_i), \quad F'_j(\varphi) = \varphi'(s_j), \quad F''_k(\varphi) = \varphi''(t_k), \qquad \forall \varphi \in H, \ x_i, s_j, t_k \in X, $$
$i = 1,\dots,n_1$, $j = 1,\dots,n_2$, $k = 1,\dots,n_3$, are linear continuous functionals on $H$; it is obvious that they are linearly independent. In accordance with the Riesz representation theorem [1], these linear continuous functionals can be represented in the form of inner products of some elements $h_i, h'_j, h''_k \in H$ with $\varphi$, for any $\varphi \in H$:
$$ F_i(\varphi) = \langle h_i, \varphi\rangle_H, \quad F'_j(\varphi) = \langle h'_j, \varphi\rangle_H, \quad F''_k(\varphi) = \langle h''_k, \varphi\rangle_H, \qquad \forall \varphi \in H. $$
The elements $h_i$, $h'_j$ and $h''_k$ are twice continuously differentiable functions.
The original constraint systems of Problems 1–3 can be written in the form:
$$ f(x_i) = \langle h_i, f\rangle_H = u_i, \quad i = 1,\dots,n_1, \quad f \in H, \tag{4} $$
$$ f(x_i) = \langle h_i, f\rangle_H = u_i, \quad f'(s_j) = \langle h'_j, f\rangle_H = v_j, \quad i = 1,\dots,n_1, \ j = 1,\dots,n_2, \quad f \in H, \tag{5} $$
$$ f(x_i) = \langle h_i, f\rangle_H = u_i, \quad f'(s_j) = \langle h'_j, f\rangle_H = v_j, \quad f''(t_k) = \langle h''_k, f\rangle_H = w_k, \quad i = 1,\dots,n_1, \ j = 1,\dots,n_2, \ k = 1,\dots,n_3, \quad f \in H. \tag{6} $$
Here all the functions $h_i, h'_j, h''_k \in H$ are linearly independent, and each of the constraint systems (4), (5), (6) defines a nonempty convex closed set (an intersection of hyperplanes in the Hilbert space $H$).
The problem of reconstructing a function $f$ satisfying system (4), (5) or (6) is underdetermined. We reformulate it as the problem of finding, among the solutions of the corresponding constraint system, a solution with minimal norm:
$$ \sigma_1 = \operatorname{argmin}\{\|f - z\|_H^2 : \text{(4)}, \ f \in H\}, \tag{7} $$
$$ \sigma_2 = \operatorname{argmin}\{\|f - z\|_H^2 : \text{(5)}, \ f \in H\}, \tag{8} $$
$$ \sigma_3 = \operatorname{argmin}\{\|f - z\|_H^2 : \text{(6)}, \ f \in H\}, \tag{9} $$
where $z \in H$ is a "prototype" function. The solution of each of the problems (7), (8) and (9) exists and is unique [1, 5] as the projection of the element $z$ onto a nonempty convex closed set in the Hilbert space $H$. The elements $\sigma_1$, $\sigma_2$ and $\sigma_3$ are interpolating normal splines. In accordance with the generalized Lagrange method ([4], [5]), the solution of problem (7) can be written as
$$ \sigma_1 = z + \sum_{i=1}^{n_1} \mu_i h_i, \tag{10} $$
where the coefficients $\mu_i$ are defined by the system
$$ \sum_{l=1}^{n_1} g_{il}\mu_l = u_i - \langle h_i, z\rangle_H, \quad 1 \le i \le n_1; \tag{11} $$
the solution of problem (8) can be written as
$$ \sigma_2 = z + \sum_{i=1}^{n_1} \mu_i h_i + \sum_{j=1}^{n_2} \mu'_j h'_j, \tag{12} $$
where the coefficients $\mu_i$ and $\mu'_j$ are defined by the system
$$ \sum_{l=1}^{n_1} g_{il}\mu_l + \sum_{j=1}^{n_2} g'_{ij}\mu'_j = u_i - \langle h_i, z\rangle_H, \quad 1 \le i \le n_1, $$
$$ \sum_{i=1}^{n_1} g'_{ij}\mu_i + \sum_{l=1}^{n_2} g''_{jl}\mu'_l = v_j - \langle h'_j, z\rangle_H, \quad 1 \le j \le n_2, \tag{13} $$
and the solution of problem (9) can be presented as
$$ \sigma_3 = z + \sum_{i=1}^{n_1} \mu_i h_i + \sum_{j=1}^{n_2} \mu'_j h'_j + \sum_{k=1}^{n_3} \mu''_k h''_k, \tag{14} $$
where the coefficients $\mu_i$, $\mu'_j$ and $\mu''_k$ are defined by the system
$$ \sum_{l=1}^{n_1} g_{il}\mu_l + \sum_{j=1}^{n_2} g'_{ij}\mu'_j + \sum_{k=1}^{n_3} g''_{ik}\mu''_k = u_i - \langle h_i, z\rangle_H, \quad 1 \le i \le n_1, $$
$$ \sum_{i=1}^{n_1} g'_{ij}\mu_i + \sum_{l=1}^{n_2} g''_{jl}\mu'_l + \sum_{k=1}^{n_3} g'''_{jk}\mu''_k = v_j - \langle h'_j, z\rangle_H, \quad 1 \le j \le n_2, \tag{15} $$
$$ \sum_{i=1}^{n_1} g''_{ik}\mu_i + \sum_{j=1}^{n_2} g'''_{jk}\mu'_j + \sum_{l=1}^{n_3} g^{iv}_{kl}\mu''_l = w_k - \langle h''_k, z\rangle_H, \quad 1 \le k \le n_3. $$
The matrix of each of the systems (11), (13) and (15) is the symmetric positive definite Gram matrix of the corresponding set of linearly independent elements $\{h_i\}$; $\{h_i\}, \{h'_j\}$; or $\{h_i\}, \{h'_j\}, \{h''_k\}$, and the coefficients $g_{il}, g'_{ij}, g''_{ik}, g''_{jl}, g'''_{jk}, g^{iv}_{kl}$ are defined as follows:
$$ g_{il} = \langle h_i, h_l\rangle_H, \quad g'_{ij} = \langle h_i, h'_j\rangle_H, \quad g''_{ik} = \langle h_i, h''_k\rangle_H, $$
$$ g''_{jl} = \langle h'_j, h'_l\rangle_H, \quad g'''_{jk} = \langle h'_j, h''_k\rangle_H, \quad g^{iv}_{kl} = \langle h''_k, h''_l\rangle_H. \tag{16} $$
Let $H = H(X)$ be a reproducing kernel Hilbert space with reproducing kernel $V(\eta,\xi)$. Recall ([2], [3]) the definition of the reproducing kernel: it is a function $V(\eta,\xi)$ such that
- for every $\xi \in X$, $V(\eta,\xi)$ as a function of $\eta$ belongs to $H$;
- for every $\xi \in X$ and every function $\varphi \in H$ the reproducing property holds:
$$ \varphi(\xi) = \langle V(\cdot,\xi), \varphi\rangle_H. \tag{17} $$
The reproducing kernel is a symmetric function, $V(\eta,\xi) = V(\xi,\eta)$; in the case considered here it is also twice continuously differentiable with respect to $\xi$ and to $\eta$. Differentiating the identity (17) gives the identities for the derivatives:
$$ \frac{d\varphi(\xi)}{d\xi} = \left\langle \frac{\partial V(\cdot,\xi)}{\partial\xi}, \varphi \right\rangle_H, \qquad \frac{d^2\varphi(\xi)}{d\xi^2} = \left\langle \frac{\partial^2 V(\cdot,\xi)}{\partial\xi^2}, \varphi \right\rangle_H. \tag{18} $$
Now it is possible to express the functions $h_i, h'_j, h''_k$ via the reproducing kernel $V$. Comparing (4), (5), (6) with (17) and (18), we receive:
$$ h_i(\eta) = V(\eta,x_i), \quad i = 1,\dots,n_1, \qquad h'_j(\eta) = \frac{\partial V(\eta,s_j)}{\partial\xi}, \quad j = 1,\dots,n_2, \qquad h''_k(\eta) = \frac{\partial^2 V(\eta,t_k)}{\partial\xi^2}, \quad k = 1,\dots,n_3. \tag{19} $$
The coefficients (16) of the Gram matrix can be presented as ([3], [11], [12]):
$$ g_{il} = \langle h_i, h_l\rangle_H = \langle V(\cdot,x_i), V(\cdot,x_l)\rangle_H = V(x_i,x_l), $$
$$ g'_{ij} = \langle h_i, h'_j\rangle_H = \left\langle V(\cdot,x_i), \frac{\partial V(\cdot,s_j)}{\partial\xi}\right\rangle_H = \frac{\partial V(x_i,s_j)}{\partial\xi}, $$
$$ g''_{ik} = \langle h_i, h''_k\rangle_H = \left\langle V(\cdot,x_i), \frac{\partial^2 V(\cdot,t_k)}{\partial\xi^2}\right\rangle_H = \frac{\partial^2 V(x_i,t_k)}{\partial\xi^2}. $$
With the help of (17) and (18), we can also calculate $g''_{jl}$ ([11], [12]):
$$ g''_{jl} = \langle h'_j, h'_l\rangle_H = \left\langle \frac{\partial V(\cdot,s_j)}{\partial\xi}, \frac{\partial V(\cdot,s_l)}{\partial\xi}\right\rangle_H = \frac{d}{d\xi}\left\langle V(\cdot,\xi), \frac{\partial V(\cdot,s_l)}{\partial\xi}\right\rangle_H \Bigg|_{\xi=s_j} = \frac{d}{d\xi}\left(\frac{\partial V(\xi,s_l)}{\partial\xi}\right)\Bigg|_{\xi=s_j} = \frac{\partial^2 V(s_j,s_l)}{\partial\eta\,\partial\xi}. $$
Further,
$$ g'''_{jk} = \langle h'_j, h''_k\rangle_H = \frac{\partial^3 V(s_j,t_k)}{\partial\eta\,\partial\xi^2}, \qquad g^{iv}_{kl} = \langle h''_k, h''_l\rangle_H = \frac{\partial^4 V(t_k,t_l)}{\partial\eta^2\,\partial\xi^2}, $$
and
$$ \langle h_i, z\rangle_H = \langle V(\cdot,x_i), z\rangle_H = z(x_i), \qquad \langle h'_j, z\rangle_H = z'(s_j), \qquad \langle h''_k, z\rangle_H = z''(t_k). $$
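To make the construction concrete, here is a minimal numpy sketch (not from the original text) that assembles and solves the Gram system (13) for Problem 2 with the prototype $z = 0$. The callables `V`, `dV_dxi` and `d2V_deta_dxi` are placeholders for a concrete kernel and its derivatives, assumed to support numpy broadcasting.

```python
import numpy as np

def normal_spline_problem2(x, s, u, v, V, dV_dxi, d2V_deta_dxi):
    """Sketch: interpolating normal spline for Problem 2 with z = 0.
    V(eta, xi), dV_dxi(eta, xi) = dV/dxi, d2V_deta_dxi = d2V/(deta dxi)
    are vectorized callables for some reproducing kernel."""
    n1 = len(x)
    # Gram blocks: g_il = V(x_i, x_l), g'_ij = dV(x_i, s_j)/dxi,
    # g''_jl = d2V(s_j, s_l)/(deta dxi), per the formulas above.
    G = np.block([
        [V(x[:, None], x[None, :]),        dV_dxi(x[:, None], s[None, :])],
        [dV_dxi(x[:, None], s[None, :]).T, d2V_deta_dxi(s[:, None], s[None, :])],
    ])
    mu = np.linalg.solve(G, np.concatenate([u, v]))
    # sigma(eta) = sum_i mu_i V(eta, x_i) + sum_j mu'_j dV(eta, s_j)/dxi
    return lambda eta: V(eta, x) @ mu[:n1] + dV_dxi(eta, s) @ mu[n1:]
```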
Here normal splines will be constructed in the Sobolev spaces $W^3_2[a,b]$ and $W^3_2[a,\infty)$, and in the Bessel potential space $H^3_\varepsilon(\mathbb R)$ (see [6, 7, 8, 9] for details). Elements of these spaces can be treated as twice continuously differentiable functions.
The reproducing kernel for the Sobolev spaces $W^l_2[0,1]$ (here $l$ is any positive integer) was constructed in [10]. Thus, the reproducing kernel for the Sobolev space $W^3_2[0,1]$ with norm
$$ \|f\| = \left( \sum_{i=0}^{2} \bigl(f^{(i)}(0)\bigr)^2 + \int_0^1 \bigl(f^{(3)}(s)\bigr)^2\,ds \right)^{1/2} $$
can be presented as
$$ V(\eta,\xi) = \begin{cases} \displaystyle\sum_{i=0}^{2} \frac{\xi^i}{i!}\left(\frac{\eta^i}{i!} + (-1)^i\,\frac{\eta^{5-i}}{(5-i)!}\right), & 0 \le \eta \le \xi \le 1, \\[8pt] \displaystyle\sum_{i=0}^{2} \frac{\eta^i}{i!}\left(\frac{\xi^i}{i!} + (-1)^i\,\frac{\xi^{5-i}}{(5-i)!}\right), & 0 \le \xi \le \eta \le 1, \end{cases} $$
or
$$ V(\eta,\xi) = \begin{cases} 1 + \eta\xi + \dfrac{\eta^5 - 5\eta^4\xi + 10\eta^3\xi^2 + 30\eta^2\xi^2}{120}, & 0 \le \eta \le \xi \le 1, \\[8pt] 1 + \eta\xi + \dfrac{\xi^5 - 5\xi^4\eta + 10\xi^3\eta^2 + 30\xi^2\eta^2}{120}, & 0 \le \xi \le \eta \le 1. \end{cases} $$
Correspondingly,
$$ \frac{\partial V(\eta,\xi)}{\partial\xi} = \frac{\eta\,(4\eta\xi(\eta+3) - \eta^3)}{24} + \eta, \qquad \frac{\partial^2 V(\eta,\xi)}{\partial\eta\,\partial\xi} = -\frac{\eta^3}{6} + \frac{\eta\xi(\eta+2)}{2} + 1, $$
$$ \frac{\partial^2 V(\eta,\xi)}{\partial\xi^2} = \frac{\eta^2(\eta+3)}{6}, \qquad \frac{\partial^3 V(\eta,\xi)}{\partial\eta\,\partial\xi^2} = \frac{\eta^2}{2} + \eta, \qquad \frac{\partial^4 V(\eta,\xi)}{\partial\eta^2\,\partial\xi^2} = \eta + 1, \qquad 0 \le \eta \le \xi \le 1. $$
In addition, the following formulae are required for computing the normal spline derivatives:
$$ \frac{\partial V(\eta,\xi)}{\partial\eta} = \frac{\eta^4 - 4\xi(\eta^3-6) + 6\eta\xi^2(\eta+2)}{24}, \qquad \frac{\partial^2 V(\eta,\xi)}{\partial\eta^2} = \frac{\eta^3 - 3\eta^2\xi + 3\xi^2(\eta+1)}{6}, $$
$$ \frac{\partial^3 V(\eta,\xi)}{\partial\eta^2\,\partial\xi} = -\frac{\eta^2}{2} + \eta\xi + \xi, \qquad 0 \le \eta \le \xi \le 1. $$
Thereby we can construct a normal interpolating spline on the interval $[0,1]$. Solving the interpolating Problems 1–3 on an arbitrary interval can be done by mapping the latter onto $[0,1]$ through an affine change of variable. Let's do it for Problem 3.
Define the constants $a$ and $b$ as
$$ a = \min(x_1, s_1, t_1), \qquad b = \max(x_{n_1}, s_{n_2}, t_{n_3}), $$
and introduce the values $\bar x_i, \bar s_j, \bar t_k$:
$$ \bar x_i = \frac{x_i - a}{b-a}, \quad \bar s_j = \frac{s_j - a}{b-a}, \quad \bar t_k = \frac{t_k - a}{b-a}, \qquad i = 1,\dots,n_1, \ j = 1,\dots,n_2, \ k = 1,\dots,n_3. $$
Then the original Problem 3 is transformed into
Problem $\bar 3$. Given points $0 \le \bar x_1 < \bar x_2 < \dots < \bar x_{n_1} \le 1$, $0 \le \bar s_1 < \bar s_2 < \dots < \bar s_{n_2} \le 1$ and $0 \le \bar t_1 < \bar t_2 < \dots < \bar t_{n_3} \le 1$, find a smooth function $\bar f$ such that
$$ \bar f(\bar x_i) = u_i, \quad \bar f'(\bar s_j) = v_j\,(b-a), \quad \bar f''(\bar t_k) = w_k\,(b-a)^2, \qquad i = 1,\dots,n_1, \ j = 1,\dots,n_2, \ k = 1,\dots,n_3. $$
Assuming $\bar\sigma_3(\bar\eta)$ is the normal spline constructed for Problem $\bar 3$, the normal spline $\sigma_3(\eta)$ is obtained as
$$ \sigma_3(\eta) = \bar\sigma_3\!\left(\frac{\eta - a}{b - a}\right), \qquad a \le \eta \le b. $$
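A small helper performing this change of variable for numpy arrays of sorted knots (an illustration, not part of the original text):

```python
import numpy as np

def map_problem3_to_unit(x, s, t, v, w):
    """Map the knots of Problem 3 to [0,1] and rescale the derivative
    data as in Problem 3-bar: v -> v*(b-a), w -> w*(b-a)^2."""
    a = min(x[0], s[0], t[0])
    b = max(x[-1], s[-1], t[-1])
    d = b - a
    return (x - a) / d, (s - a) / d, (t - a) / d, v * d, w * d ** 2, a, d

# After constructing sigma3_bar on [0,1], the spline on [a,b] is
# evaluated as sigma3(eta) = sigma3_bar((eta - a) / (b - a)).
```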
The reproducing kernel for the Sobolev space $W^3_2[0,\infty)$ with norm
$$ \|f\| = \left( \int_0^\infty \bigl[(f(s))^2 + (f^{(3)}(s))^2\bigr]\,ds \right)^{1/2} $$
was obtained in [11]. It is the symmetric function
$$ V(\eta,\xi) = \begin{cases} \displaystyle\sum_{i=1}^{6} y_i(\eta)\,c_i(\xi), & 0 \le \eta \le \xi < \infty, \\[8pt] \displaystyle\sum_{i=1}^{6} y_i(\xi)\,c_i(\eta), & 0 \le \xi \le \eta < \infty, \end{cases} $$
where
$$ \begin{aligned} y_1(\eta) &= \exp(\eta), & y_2(\eta) &= \exp(-\eta), \\ y_3(\eta) &= \exp(-\eta/2)\cos(\sqrt3\,\eta/2), & y_4(\eta) &= \exp(-\eta/2)\sin(\sqrt3\,\eta/2), \\ y_5(\eta) &= \exp(\eta/2)\cos(\sqrt3\,\eta/2), & y_6(\eta) &= \exp(\eta/2)\sin(\sqrt3\,\eta/2), \end{aligned} $$
$$ \begin{aligned} c_1(\xi) &= -\exp(-\xi)/6, \\ c_2(\xi) &= \exp(-\xi/2)\bigl(\sqrt3\sin(\sqrt3\,\xi/2) - \cos(\sqrt3\,\xi/2)\bigr)/3 - \exp(-\xi)/2, \\ c_3(\xi) &= \exp(-\xi/2)\bigl(\sin(\sqrt3\,\xi/2)/\sqrt3 - \cos(\sqrt3\,\xi/2)\bigr)/2 - \exp(-\xi)/3, \\ c_4(\xi) &= \exp(-\xi/2)\bigl(\sqrt3\cos(\sqrt3\,\xi/2) - 5\sin(\sqrt3\,\xi/2)\bigr)/6 + \exp(-\xi)/\sqrt3, \\ c_5(\xi) &= -\exp(-\xi/2)\bigl(\cos(\sqrt3\,\xi/2) - \sqrt3\sin(\sqrt3\,\xi/2)\bigr)/6, \\ c_6(\xi) &= -\exp(-\xi/2)\bigl(\sin(\sqrt3\,\xi/2) + \sqrt3\cos(\sqrt3\,\xi/2)\bigr)/6. \end{aligned} $$
Correspondingly,
$$ \frac{\partial V(\eta,\xi)}{\partial\xi} = \sum_{i=1}^{6} y_i(\eta)c'_i(\xi), \quad \frac{\partial^2 V(\eta,\xi)}{\partial\eta\,\partial\xi} = \sum_{i=1}^{6} y'_i(\eta)c'_i(\xi), \quad \frac{\partial^2 V(\eta,\xi)}{\partial\xi^2} = \sum_{i=1}^{6} y_i(\eta)c''_i(\xi), $$
$$ \frac{\partial^3 V(\eta,\xi)}{\partial\eta\,\partial\xi^2} = \sum_{i=1}^{6} y'_i(\eta)c''_i(\xi), \qquad \frac{\partial^4 V(\eta,\xi)}{\partial\eta^2\,\partial\xi^2} = \sum_{i=1}^{6} y''_i(\eta)c''_i(\xi), $$
$$ \frac{\partial V(\eta,\xi)}{\partial\eta} = \sum_{i=1}^{6} y'_i(\eta)c_i(\xi), \quad \frac{\partial^2 V(\eta,\xi)}{\partial\eta^2} = \sum_{i=1}^{6} y''_i(\eta)c_i(\xi), \quad \frac{\partial^3 V(\eta,\xi)}{\partial\eta^2\,\partial\xi} = \sum_{i=1}^{6} y''_i(\eta)c'_i(\xi), \qquad 0 \le \eta \le \xi < \infty, $$
where
$$ \begin{aligned} y'_1(\eta) &= \exp(\eta), & y'_2(\eta) &= -\exp(-\eta), \\ y'_3(\eta) &= -\exp(-\eta/2)\sin(\sqrt3\,\eta/2+\pi/6), & y'_4(\eta) &= \exp(-\eta/2)\cos(\sqrt3\,\eta/2+\pi/6), \\ y'_5(\eta) &= \exp(\eta/2)\sin(\pi/6-\sqrt3\,\eta/2), & y'_6(\eta) &= \exp(\eta/2)\cos(\pi/6-\sqrt3\,\eta/2), \end{aligned} $$
$$ \begin{aligned} y''_1(\eta) &= \exp(\eta), & y''_2(\eta) &= \exp(-\eta), \\ y''_3(\eta) &= \exp(-\eta/2)\bigl(\sin(\sqrt3\,\eta/2+\pi/6) - \sqrt3\cos(\sqrt3\,\eta/2+\pi/6)\bigr)/2, \\ y''_4(\eta) &= -\exp(-\eta/2)\bigl(\sin(\sqrt3\,\eta/2+\pi/6) + \sqrt3\cos(\sqrt3\,\eta/2+\pi/6)\bigr)/2, \\ y''_5(\eta) &= -\exp(\eta/2)\bigl(\sqrt3\sin(\sqrt3\,\eta/2) + \cos(\sqrt3\,\eta/2)\bigr)/2, \\ y''_6(\eta) &= \exp(\eta/2)\bigl(\sqrt3\cos(\sqrt3\,\eta/2) - \sin(\sqrt3\,\eta/2)\bigr)/2, \end{aligned} $$
$$ \begin{aligned} c'_1(\xi) &= \exp(-\xi)/6, \\ c'_2(\xi) &= 2\exp(-\xi/2)\cos(\sqrt3\,\xi/2)/3 + \exp(-\xi)/2, \\ c'_3(\xi) &= \exp(-\xi/2)\cos(\pi/6-\sqrt3\,\xi/2)/\sqrt3 + \exp(-\xi)/3, \\ c'_4(\xi) &= \exp(-\xi/2)\bigl(-3\sqrt3\cos(\sqrt3\,\xi/2) + 5\sin(\sqrt3\,\xi/2)\bigr)/6 - \exp(-\xi)/\sqrt3, \\ c'_5(\xi) &= \exp(-\xi/2)\cos(\sqrt3\,\xi/2)/3, \\ c'_6(\xi) &= \exp(-\xi/2)\sin(\sqrt3\,\xi/2)/3, \end{aligned} $$
$$ \begin{aligned} c''_1(\xi) &= -\exp(-\xi)/6, \\ c''_2(\xi) &= -2\exp(-\xi/2)\sin(\sqrt3\,\xi/2+\pi/6)/3 - \exp(-\xi)/2, \\ c''_3(\xi) &= -\exp(-\xi/2)\sin(\sqrt3\,\xi/2)/\sqrt3 - \exp(-\xi)/3, \\ c''_4(\xi) &= \exp(-\xi/2)\bigl(\sqrt3\cos(\sqrt3\,\xi/2) + 2\sin(\sqrt3\,\xi/2)\bigr)/3 + \exp(-\xi)/\sqrt3, \\ c''_5(\xi) &= -\exp(-\xi/2)\bigl(\cos(\sqrt3\,\xi/2) + \sqrt3\sin(\sqrt3\,\xi/2)\bigr)/6, \\ c''_6(\xi) &= \exp(-\xi/2)\bigl(\sqrt3\cos(\sqrt3\,\xi/2) - \sin(\sqrt3\,\xi/2)\bigr)/6. \end{aligned} $$
The reproducing kernel for the Bessel potential space was presented in [8], and its simplified variant was given in [16, 15, 17, 18]. Normal splines will be constructed in the Bessel potential space $H^3_\varepsilon(\mathbb R)$ defined as
$$ H^3_\varepsilon(\mathbb R) = \left\{ f \mid f \in S', \ \bigl(\varepsilon^2 + |s|^2\bigr)^{3/2} F[f] \in L_2(\mathbb R) \right\}, \qquad \varepsilon > 0, $$
where $S'(\mathbb R)$ is the space of L. Schwartz tempered distributions and $F[f]$ is the Fourier transform of $f$ [8, 19, 20]. The parameter $\varepsilon$ introduced here may be treated as a "scaling parameter": it allows one to control the approximation properties of the normal spline, which usually improve with smaller values of $\varepsilon$, and it may also be used to reduce the ill-conditioning of the related computational problem (in the traditional theory $\varepsilon = 1$); see [15, 9]. This is a Hilbert space with norm
$$ \|f\|_{H^3_\varepsilon} = \bigl\| (\varepsilon^2 + |s|^2)^{3/2} F[f] \bigr\|_{L_2}. $$
It is continuously embedded in the Hölder space $C^2_b(\mathbb R)$ of functions having bounded continuous derivatives up to order 2 ([19]). The reproducing kernel of this space is defined, up to a constant factor, as follows ([15, 9]):
$$ V(\eta,\xi,\varepsilon) = \exp(-\varepsilon|\xi-\eta|)\bigl(3 + 3\varepsilon|\xi-\eta| + \varepsilon^2|\xi-\eta|^2\bigr). $$
Correspondingly,
$$ \frac{\partial V(\eta,\xi,\varepsilon)}{\partial\xi} = -\varepsilon^2\exp(-\varepsilon|\xi-\eta|)\,(\xi-\eta)\,(\varepsilon|\xi-\eta|+1), $$
$$ \frac{\partial^2 V(\eta,\xi,\varepsilon)}{\partial\eta\,\partial\xi} = -\varepsilon^2\exp(-\varepsilon|\xi-\eta|)\,\bigl(\varepsilon|\xi-\eta|(\varepsilon|\xi-\eta|-1)-1\bigr), $$
$$ \frac{\partial^2 V(\eta,\xi,\varepsilon)}{\partial\xi^2} = \varepsilon^2\exp(-\varepsilon|\xi-\eta|)\,\bigl(\varepsilon|\xi-\eta|(\varepsilon|\xi-\eta|-1)-1\bigr), $$
$$ \frac{\partial^3 V(\eta,\xi,\varepsilon)}{\partial\eta\,\partial\xi^2} = \varepsilon^4\exp(-\varepsilon|\xi-\eta|)\,(\xi-\eta)\,(\varepsilon|\xi-\eta|-3), $$
$$ \frac{\partial^4 V(\eta,\xi,\varepsilon)}{\partial\eta^2\,\partial\xi^2} = \varepsilon^4\exp(-\varepsilon|\xi-\eta|)\,\bigl(\varepsilon|\xi-\eta|(\varepsilon|\xi-\eta|-5)+3\bigr). $$
In addition, the following formulae are required for computing the normal spline derivatives:
$$ \frac{\partial V(\eta,\xi,\varepsilon)}{\partial\eta} = \varepsilon^2\exp(-\varepsilon|\xi-\eta|)\,(\xi-\eta)\,(\varepsilon|\xi-\eta|+1), $$
$$ \frac{\partial^2 V(\eta,\xi,\varepsilon)}{\partial\eta^2} = \varepsilon^2\exp(-\varepsilon|\xi-\eta|)\,\bigl(\varepsilon|\xi-\eta|(\varepsilon|\xi-\eta|-1)-1\bigr), $$
$$ \frac{\partial^3 V(\eta,\xi,\varepsilon)}{\partial\eta^2\,\partial\xi} = -\varepsilon^4\exp(-\varepsilon|\xi-\eta|)\,(\xi-\eta)\,(\varepsilon|\xi-\eta|-3). $$
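As an illustration (made-up data, not from the original text), here is an end-to-end one-dimensional example in $H^3_\varepsilon(\mathbb R)$: three value constraints plus one first-derivative constraint, assembled directly from the kernel formulas above with $z = 0$.

```python
import numpy as np

eps = 1.0
r = lambda e, x: np.abs(x - e)
# Kernel V(eta, xi, eps) and the derivatives needed for the Gram matrix:
V   = lambda e, x: np.exp(-eps*r(e, x)) * (3 + 3*eps*r(e, x) + (eps*r(e, x))**2)
Vxi = lambda e, x: -eps**2 * np.exp(-eps*r(e, x)) * (x - e) * (eps*r(e, x) + 1)
Vex = lambda e, x: -eps**2 * np.exp(-eps*r(e, x)) * (eps*r(e, x)*(eps*r(e, x) - 1) - 1)

x = np.array([0.0, 1.0, 2.0]); u = np.sin(x)   # f(x_i) = u_i
s = np.array([0.5]);           v = np.cos(s)   # f'(s_j) = v_j

G = np.block([[V(x[:, None], x[None, :]),     Vxi(x[:, None], s[None, :])],
              [Vxi(x[:, None], s[None, :]).T, Vex(s[:, None], s[None, :])]])
mu = np.linalg.solve(G, np.concatenate([u, v]))
sigma = lambda e: V(e, x) @ mu[:3] + Vxi(e, s) @ mu[3:]
print(sigma(1.0), np.sin(1.0))   # the spline reproduces the data at a node
```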
The normal splines method for one-dimensional function interpolation and for linear ordinary differential and integral equations was proposed in [13] and developed in [10, 11, 12]. The general formula of the reproducing kernel for Bessel potential spaces was published in [8], and its simplified version was given in [15, 17, 18]. The multidimensional normal splines method was developed for the two-dimensional problem of low-range computerized tomography in [16] and applied to solving a mathematical economics problem in [14]. Further results were reported at seminars and conferences.
References
[1] A. Balakrishnan, Applied Functional Analysis, New York, Springer-Verlag, 1976.
[2] N. Aronszajn, Theory of reproducing kernels, Transactions of the AMS, Vol. 68, 1950.
[3] A. Bezhaev, V. Vasilenko, Variational Theory of Splines, Springer US, 2001.
[4] A. Ioffe, V. Tikhomirov, Theory of extremal problems, North-Holland, Amsterdam, 1979.
[5] P.-J. Laurent, Approximation et optimisation, Paris, 1972.
[6] R. Adams, J. Fournier, Sobolev Spaces. Pure and Applied Mathematics. (2nd ed.). Boston, MA: Academic Press, 2003.
[7] D.R. Adams, L.I. Hedberg, Function spaces and potential theory. Berlin, Springer, 1996.
[8] N. Aronszajn, K.T. Smith, Theory of Bessel potentials I, Ann. Inst. Fourier, 11, 1961.
[9] Reproducing Kernel of Bessel Potential space
[10] V. Gorbunov, V. Petrishchev, Improvement of the normal spline collocation method for linear differential equations, Comput. Math. Math. Phys., 43:8 (2003), 1099–1108.
[11] V. Gorbunov, V. Sviridov, The method of normal splines for linear DAEs on the number semi-axis. Applied Numerical Mathematics, 59(3-4), 2009, 656–670.
[12] V. Gorbunov, Extremum Problems of Measurements Data Processing, Ilim, 1990 (in Russian).
[13] V. Gorbunov, The method of normal spline collocation, USSR Computational Mathematics and Mathematical Physics,Volume 29, Issue 1, 1989, Pages 145-154.
[14] V. Gorbunov, I. Kohanovsky, K. Makedonsky, Normal splines in reconstruction of multi-dimensional dependencies. Papers of WSEAS International Conference on Applied Mathematics, Numerical Analysis Symposium, Corfu, 2004. (http://www.wseas.us/e-library/conferences/corfu2004/papers/488-312.pdf)
[15] I. Kohanovsky, Multidimensional Normal Splines and Problem of Physical Field Approximation, International Conference on Fourier Analysis and its Applications, Kuwait, 1998.
[16] I. Kohanovsky, Normal Splines in Computing Tomography, Avtometriya, 1995, N 2. (https://www.iae.nsk.su/images/stories/5_Autometria/5_Archives/1995/2/84-89.pdf)
[17] H. Wendland, Scattered Data Approximation. Cambridge University Press, 2005.
[18] R. Schaback, Kernel-based Meshless Methods, Lecture Notes, Goettingen, 2011.
[19] M.S. Agranovich, Sobolev Spaces, Their Generalizations and Elliptic Problems in Smooth and Lipschitz Domains, Springer, Switzerland, 2015.
[20] H. Triebel, Interpolation. Function Spaces. Differential Operators. North-Holland, Amsterdam, 1978.
Thursday, October 24, 2019
Quadratic Programming in Hilbert space. Part III. General case.
The algorithm described in the previous post can be applied to solving a more general quadratic programming problem [1], [2].
Let $H$ be a Hilbert space with inner product $\langle\cdot,\cdot\rangle_H$ and norm $\|\cdot\|_H$, and let $A$ be a linear positive definite self-adjoint operator acting in $H$. Let the linear manifold $D = D(A)$, $D \subset H$, be the domain of definition of this operator, and let the element $z$ belong to $D$.
It is required to minimize functional J:
$$ (J,\varphi) = \langle A\varphi,\varphi\rangle_H - 2\langle z,\varphi\rangle_H \to \min, \tag{1} $$
subject to constraints
$$ \langle h_i,\varphi\rangle_H = u_i, \quad 1 \le i \le L, \tag{2} $$
$$ \langle h_{i+L},\varphi\rangle_H \le u_{i+L}, \quad 1 \le i \le M, \qquad \varphi \in D, \ D \subset H. \tag{3} $$
The elements $h_i \in D$, $1 \le i \le M+L$, are assumed to be linearly independent (thereby the system of constraints is compatible).
Let $H_A$ be the energetic space generated by the operator $A$ [2], [3]. This space is the completion of the linear manifold $D$ with the scalar product
$$ [\varphi,\vartheta]_A = \langle A\varphi,\vartheta\rangle_H, \qquad \varphi,\vartheta \in H_A, $$
and the corresponding norm $\|\varphi\|_A = [\varphi,\varphi]_A^{1/2}$.
We introduce elements $\bar z$ and $\bar h_i$ of $H_A$ as the solutions of the following equations:
$$ A\bar z = z, \quad \bar z \in H_A, \qquad A\bar h_i = h_i, \quad \bar h_i \in H_A, \ 1 \le i \le M+L, $$
and the quadratic functional
$$ (\bar J,\varphi) = \|\varphi - \bar z\|_A^2. \tag{4} $$
It is easy to see that
$$ (J,\varphi) = (\bar J,\varphi) - \|\bar z\|_A^2, $$
and the constraints (2), (3) are transformed into
$$ [\bar h_i,\varphi]_A = u_i, \quad 1 \le i \le L, \tag{5} $$
$$ [\bar h_{i+L},\varphi]_A \le u_{i+L}, \quad 1 \le i \le M, \qquad \varphi \in H_A, \ \bar h_i \in H_A. \tag{6} $$
Thereby the problem of minimizing the functional (1) on the linear manifold $D$ subject to constraints (2), (3) is equivalent to the problem of minimizing the functional (4) in the Hilbert space $H_A$ under constraints (5), (6). This latter problem is the problem of finding the projection of the element $\bar z$ onto the nonempty convex closed set defined by (5), (6), and it has a unique solution. The solution can be found by using the algorithms described in the previous posts.
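A quick finite-dimensional sanity check of this reduction (with $A$ replaced by a random SPD matrix; illustrative only, not from the original post):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + 5 * np.eye(5)             # SPD stand-in for the operator A
z = rng.standard_normal(5)
zbar = np.linalg.solve(A, z)            # A zbar = z
phi = rng.standard_normal(5)
J    = phi @ A @ phi - 2 * z @ phi      # (J, phi)
Jbar = (phi - zbar) @ A @ (phi - zbar)  # ||phi - zbar||_A^2
print(np.isclose(J, Jbar - zbar @ A @ zbar))   # True
```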
References
[1] V. Gorbunov, Extremum Problems of Measurements Data Processing, Ilim, 1990 (in Russian).
[2] V.Lebedev, An Introduction to Functional Analysis in Computational Mathematics, Birkhäuser, 1997
[3] V. Agoshkov, P. Dubovski, V. Shutyaev, Methods for Solving Mathematical Physics Problems. Cambridge International Science Publishing Ltd, 2006.
Friday, July 19, 2019
Quadratic Programming in Hilbert space. Part II. Adaptive algorithm.
This is a continuation to the previous post.
One disadvantage of the algorithm described in the previous post is that it requires an initial feasible point, and computing such a point may require a significant amount of work. Also, when the number of inequality constraints is large, the rate of convergence may be slow. It often happens that the active set corresponding to the problem solution includes only a small number of the inequality constraints, significantly fewer than their total number. In such cases it makes sense to apply a different approach to solving problem (1), (4), (5). Its essence is as follows: we construct a sequence of approximations $\sigma^k$, each of which is a solution of an auxiliary problem of minimizing the functional (1) on a certain subset of the constraints (4), (5). The auxiliary minimization problems are defined in such a way that the value of the functional (1) strictly increases for each subsequent approximation. These minimization problems are solved with the algorithm outlined in the previous post. Thus, in order to get the solution of the original problem (1), (4), (5), it is necessary to solve a series of problems with a small number of constraints. Let's describe this in detail.
Assume there is a subset of constraints (4), (5):
$$ \langle h_i,\varphi\rangle_H = b_i, \quad i \in I_1, \tag{28} $$
$$ \langle h_i,\varphi\rangle_H \le b_i, \quad i \in I_2^k, \qquad I_2^k \subset I_2, \ I^k = I_1 \cup I_2^k, \tag{29} $$
where $I_1$ and $I_2$ are defined in (7). Denote by $\Phi^k$ the convex set defined by constraints (28) and (29); obviously $\Phi$ (the set defined by constraints (4) and (5)) is a subset of $\Phi^k$.
Let $\hat\sigma^k$ be the normal solution of system (28), (29). Assume that $\hat\sigma^k$ does not satisfy all the constraints (5), i.e. $\hat\sigma^k \notin \Phi$ (otherwise $\hat\sigma^k$ is the solution of problem (1), (4), (5)), and let $i_t \in I_2 \setminus I_2^k$ be the index of the most violated constraint. We set
$$ I_2^{k+1} = I_2^k \cup \{i_t\}, \qquad I^{k+1} = I_1 \cup I_2^{k+1}. $$
Let $\Phi^{k+1}$ be the convex set defined by the constraints with indices from $I^{k+1}$; then $\Phi^{k+1}$ is a subset of $\Phi^k$:
$$ \Phi^{k+1} \subset \Phi^k, $$
and therefore the following inequality holds
$$ \|\hat\sigma^k\|_H^2 < \|\hat\sigma^{k+1}\|_H^2. \tag{30} $$
This inequality is strict because $\hat\sigma^k \notin \Phi^{k+1}$. If $\hat\sigma^{k+1}$ does not satisfy some constraint with an index from $I_2 \setminus I_2^{k+1}$, the described process is repeated. Since inequality (30) always holds and the number of constraints is finite, the algorithm converges to the normal solution of the system (4), (5) in a finite number of steps.
The computational details of the algorithm are described below.
Let $\hat A^k$ be the active set corresponding to the normal solution $\hat\sigma^k$ of system (28), (29):
$$ \hat A^k = \{ i \in I^k : \langle h_i,\hat\sigma^k\rangle_H = b_i \}, \tag{31} $$
$$ \hat\sigma^k = \sum_{j\in\hat A^k} \mu_j h_j, \qquad \sum_{j\in\hat A^k} g_{ij}\mu_j = b_i, \quad i \in \hat A^k, \qquad g_{ij} = \langle h_i,h_j\rangle_H, \quad i,j \in \hat A^k. \tag{32} $$
For every $i \in I_2 \setminus I_2^k$ we compute the values
$$ \hat e_i^k = \langle h_i,\hat\sigma^k\rangle_H - b_i = \sum_{j\in\hat A^k} g_{ij}\mu_j - b_i, \tag{33} $$
and find the largest among them; its index is denoted by $i_t$:
$$ \hat e_{i_t}^k = \max_{i\in I_2\setminus I_2^k}\{\hat e_i^k\}. $$
If $\hat e_{i_t}^k \le 0$, then $\hat\sigma^k$ satisfies all the constraints (5) and is the solution of the original problem (1), (4), (5), so $\sigma = \hat\sigma^k$. Otherwise a new auxiliary problem should be solved, which consists in minimizing the functional (1) on the solutions of the augmented constraint subsystem
$$ \langle h_i,\varphi\rangle_H = b_i, \ i \in I_1, \qquad \langle h_i,\varphi\rangle_H \le b_i, \ i \in I_2^k, \qquad \langle h_{i_t},\varphi\rangle_H \le b_{i_t}, \qquad I_2^{k+1} = I_2^k \cup \{i_t\}, \ I^{k+1} = I_1 \cup I_2^{k+1}. $$
An initial feasible point required for solving this problem can be defined as the normal solution $\tilde\sigma^{k+1}$ of the system
$$ \langle h_i,\varphi\rangle_H = b_i, \ i \in \hat A^k, \qquad \langle h_{i_t},\varphi\rangle_H = b_{i_t}. $$
Let $\tilde I^{k+1} = \hat A^k \cup \{i_t\}$; then $\tilde\sigma^{k+1}$ is presented as
$$ \tilde\sigma^{k+1} = \sum_{j\in\tilde I^{k+1}} \mu_j h_j, \qquad \sum_{j\in\tilde I^{k+1}} g_{ij}\mu_j = b_i, \quad i \in \tilde I^{k+1}. \tag{34} $$
Since the matrix of system (34) differs from the matrix of system (32) only in the last row and column, its Cholesky decomposition can be obtained by modifying the Cholesky decomposition of the matrix of system (32), as described in the post "Algorithms for updating Cholesky factorization".
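One selection step of this adaptive scheme can be sketched as follows (a hedged numpy illustration; `G`, `b` and the index bookkeeping are assumed to come from the finite Gram system, and the names are placeholders):

```python
import numpy as np

def most_violated(G, b, A, mu_A, candidates):
    """Given the multipliers mu_A of the normal solution on the current
    constraint subset with active set A (system (32)), return the index i_t
    of the most violated constraint among `candidates` (formula (33)),
    or None if the current solution violates nothing."""
    if not candidates:
        return None
    e = {i: G[i, A] @ mu_A - b[i] for i in candidates}   # hat-e_i^k
    it = max(e, key=e.get)
    return it if e[it] > 0 else None
```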
The initial set $I^0$ can be defined as follows:
$$ I^0 = I_1 \cup \{i_0, i_0+M\}, $$
where $i_0$ is such that
$$ \delta_{i_0} = \min_{L+1\le i\le S}\{\delta_i\}. $$
The idea of this algorithm was originally published in the paper [2], where it was applied to the construction of "splines in a convex set". In that case the objective functional of the related variational problem was a semi-norm, and the existence and uniqueness of the problem solution was not trivial. The corresponding auxiliary quadratic programming problems were not strictly convex, and a variant of the coordinate descent algorithm was used for their numerical solution. The algorithm showed good results when solving one-dimensional problems.
References
[1] I. Kohanovsky, Data approximation using multidimensional normal splines. Unpublished manuscript. 1996.
[2] A. Kovalkov, On an algorithm for the construction of splines with discrete constraints of the type of inequalities, Serdica Bulgaricae mathematicae publicationes, Vol. 9, N. 4, 1983.
Wednesday, April 17, 2019
Quadratic Programming in Hilbert space. Part I.
This post describes an algorithm for the normal solution of a system of linear inequalities in a Hilbert space. An original version of this algorithm was developed by V. Gorbunov and published in [3]. A slightly modified version of that algorithm was described in [6].
Let $H$ be a Hilbert space with inner product $\langle\cdot,\cdot\rangle_H$ and the induced norm $\|\cdot\|_H$, and suppose there are given elements $h_i \in H$, numbers $u_i$, $1 \le i \le M+L$, and positive numbers $\delta_i$, $1 \le i \le M$.
It is required to minimize functional J:
$$ (J,\varphi) = \|\varphi\|_H^2 \to \min, \tag{1} $$
subject to constraints
$$ \langle h_i,\varphi\rangle_H = u_i, \quad 1 \le i \le L, \tag{2} $$
$$ |\langle h_{i+L},\varphi\rangle_H - u_{i+L}| \le \delta_i, \quad 1 \le i \le M, \qquad \varphi \in H. \tag{3} $$
The elements $h_i$, $1 \le i \le M+L$, are assumed to be linearly independent (thereby the system (2), (3) is compatible). The solution of problem (1)–(3) obviously exists and is unique as the solution of the problem of finding the projection of the zero element of the Hilbert space onto a nonempty convex closed set [1, 5]. Expanding the moduli in (3), we rewrite the system (2), (3) in the form:
$$ \langle h_i,\varphi\rangle_H = b_i, \quad 1 \le i \le L, \tag{4} $$
$$ \langle h_i,\varphi\rangle_H \le b_i, \quad L+1 \le i \le N, \tag{5} $$
where:
$$ N = L+2M, \quad S = L+M, \quad b_i = u_i, \ 1 \le i \le L, \tag{6} $$
$$ b_{i+L} = u_{i+L} + \delta_i, \quad b_{i+S} = -u_{i+L} + \delta_i, \quad h_{i+S} = -h_{i+L}, \quad 1 \le i \le M. $$
Let $\Phi$ be the convex set defined by constraints (4) and (5). We denote
$$ I_1 = \{1,\dots,L\}, \quad I_2 = \{L+1,\dots,N\}, \quad I = I_1 \cup I_2 = \{1,\dots,N\}, $$
$$ A(\varphi) = \{ i \in I : \langle h_i,\varphi\rangle_H = b_i \}, \quad P(\varphi) = I \setminus A(\varphi), \tag{7} $$
$$ \Psi(\varphi) = \{ \psi \in H : \langle h_i,\psi\rangle_H = b_i, \ i \in A(\varphi) \}, \qquad g_{ij} = \langle h_i,h_j\rangle_H, \quad 1 \le i,j \le S. $$
The feasible part of the set $\Psi(\varphi)$, $\varphi \in \Phi$, is a face of the set $\Phi$ containing $\varphi$. If $A(\varphi)$ is empty (which is possible when $L=0$) then $\Psi(\varphi) = H$.
In accordance with optimality conditions [4, 5] the solution of the problem (1), (4), (5) can be represented as:
$$ \sigma = \sum_{i=1}^{L} \mu_i h_i + \sum_{i=L+1}^{L+M} (\mu_i - \mu_{i+M}) h_i, \quad \mu_i \le 0, \ \mu_{i+M} \le 0, \ L+1 \le i \le L+M, \tag{8} $$
$$ \mu_i\,(\langle h_i,\sigma\rangle_H - b_i) = 0, \quad L+1 \le i \le N, \tag{9} $$
$$ \mu_i\,\mu_{i+M} = 0, \quad L+1 \le i \le S. \tag{10} $$
Here the complementary slackness conditions (9) mean that $\mu_k = 0$ for $k \in P(\sigma)$, and the relations (10) reflect the fact that a pair of constraints (5) with indices $i$ and $i+M$ cannot be simultaneously active at the solution. Thereby there are at most $S$ non-trivial coefficients $\mu_i$ in formula (8), and the actual dimension of problem (1), (4), (5) is $S$.
Let $\hat\sigma$ be the normal solution of system (4); then it can be written as
$$ \hat\sigma = \sum_{i=1}^{L} \mu_i h_i, $$
where the coefficients $\mu_i$ are determined from the system of linear equations with the symmetric positive definite Gram matrix $\{g_{ij}\}$ of the linearly independent elements $h_i$:
$$ \sum_{j=1}^{L} g_{ij}\mu_j = b_i, \quad 1 \le i \le L. $$
If $L=0$ then there are no constraints (4) and $\hat\sigma = 0$. If $\hat\sigma$ satisfies the constraints (5), that is, the inequalities
$$ \langle h_i,\hat\sigma\rangle_H \le b_i, \quad L+1 \le i \le N, $$
which can be written as
$$ \sum_{j=1}^{L} g_{ij}\mu_j \le b_i, \quad L+1 \le i \le N, $$
then $\hat\sigma$ is the solution of the original problem (1), (4), (5), i.e. $\sigma = \hat\sigma$.
Consider the nontrivial case when $\hat\sigma$ does not satisfy the constraints (5). In this case $\sigma$ belongs to the boundary of the set $\Phi$ and is the projection of the zero element of the space $H$ onto the set $\Psi(\sigma)$.
The projection $\vartheta$ of the zero element of $H$ onto the set $\Psi(\varphi)$ can be presented as
$$ \vartheta = \sum_{i\in A(\varphi)} \mu_i h_i, $$
where the factors $\mu_i$ are defined from the system of linear equations with symmetric positive definite matrix
$$ \sum_{j\in A(\varphi)} g_{ij}\mu_j = b_i, \quad i \in A(\varphi). $$
If the factors $\mu_i$, $i \in I_2 \cap A(\varphi)$, corresponding to the inequality constraints are nonpositive, then we can set $\mu_i = 0$ for $i \in P(\varphi)$ and all the conditions (8)–(10) are satisfied; thereby $\vartheta = \sigma$ is the solution of the problem under consideration. The following algorithm is based on this remark. Let's describe its iteration.
Let $\sigma^k$ be a feasible point of the system (4), (5):
$$ \sigma^k = \sum_{i=1}^{N} \mu_i^k h_i. \tag{16} $$
In this representation there are at most $S$ non-zero multipliers $\mu_i^k$.
Denote by $\vartheta^k$ the projection of the zero element of $H$ onto the set $\Psi(\sigma^k)$, and let $A^k = A(\sigma^k)$, $P^k = P(\sigma^k) = I \setminus A^k$. Then $\vartheta^k$ can be presented as
$$ \vartheta^k = \sum_{i\in A^k} \lambda_i^k h_i = \sum_{i=1}^{N} \lambda_i^k h_i, \quad \lambda_i^k = 0 \ \ \forall i \in P^k, \qquad \sum_{j\in A^k} g_{ij}\lambda_j^k = b_i, \quad i \in A^k. \tag{18} $$
There are two possible cases: the point ϑk is feasible or it is not feasible.
In the first case we check the optimality conditions, namely: if $\lambda_i^k \le 0$ for all $i \in I_2$ then $\vartheta^k$ is the solution of the problem. If the optimality conditions are not satisfied, we set $\sigma^{k+1} = \vartheta^k$, find an index $i \in A^k$ such that $\lambda_i^k > 0$, and remove it from $A^k$.
In the second case $\sigma^{k+1}$ will be defined as the feasible point of the ray $\vartheta(t)$,
$$ \vartheta(t) = \sigma^k + t\,(\vartheta^k - \sigma^k), \tag{19} $$
that is closest to $\vartheta^k$. Denote by $t_{\min}^k$ the corresponding value of $t$ and by $i_k \in P^k$ the index of the related violated constraint. This index $i_k$ will be added to $A^k$, forming $A^{k+1}$.
Thereby all $\sigma^k$ are feasible points; that is, the minimization process proceeds within the feasible region, and the value of $\|\sigma^k\|_H$ is not increasing. The minimization process is finite because of the linear independence of the constraints, which eliminates the possibility of the algorithm cycling.
The feasibility of $\vartheta^k$ can be checked as follows. Introduce the values $e_i^k$, $i \in P^k$:
$$ e_i^k = \langle h_i, \vartheta^k - \sigma^k\rangle_H = \sum_{j=1}^{N} g_{ij}\,(\lambda_j^k - \mu_j^k), \quad i \in P^k. $$
If $e_i^k < 0$ for all $i \in P^k$ then $\vartheta^k$ is a feasible point; additionally, if $\lambda_i^k \le 0$ for $i \in I_2 \cap A^k$, it is the solution of the problem.
In the case when some $e_i^k > 0$ exist, we calculate the values
$$ t_i^k = \frac{b_i - \langle h_i,\sigma^k\rangle_H}{e_i^k}, \qquad \langle h_i,\sigma^k\rangle_H = \sum_{j=1}^{N} g_{ij}\mu_j^k, \qquad \forall i \in P^k : e_i^k > 0. $$
Here all $t_i^k > 0$. Now the maximum feasible step $t_{\min}^k$ is computed as
$$ t_{\min}^k = \min_{i\in P^k}\{t_i^k\}, $$
and $\vartheta^k$ is feasible if and only if $t_{\min}^k \ge 1$ (see (19)).
Thereby the algorithm's iteration consists of seven steps:
1. Initialization. Let $\sigma^0$ be a feasible point of the system (4), (5) and $\mu_i^0$ the corresponding multipliers (16).
2. Compute the multipliers $\lambda_i^k$, $i \in A^k$, as the solution of system (18).
3. If $|A^k| = S$ then go to Step 6.
4. Compute the values $e_i^k$, $i \in P^k$, and $t_i^k$ for all $i \in P^k$ with $e_i^k > 0$; find $t_{\min}^k$ and the corresponding index $i_k$.
5. If $t_{\min}^k < 1$ (the projection $\vartheta^k$ is not feasible) then set
$$ \mu_i^{k+1} = \mu_i^k + t_{\min}^k(\lambda_i^k - \mu_i^k), \ i \in A^k, \qquad A^{k+1} = A^k \cup \{i_k\}, $$
and return to Step 2.
6. (The projection $\vartheta^k$ is feasible.) If there exists an index $i_p \in I_2 \cap A^k$ such that $\lambda_{i_p}^k > 0$ then set
$$ A^{k+1} = A^k \setminus \{i_p\}, \qquad \mu_i^{k+1} = \lambda_i^k, \ i \in A^k, $$
and return to Step 2. Otherwise go to Step 7.
7. Set $\sigma = \vartheta^k$. Stop.
The algorithm starts from an initial feasible point of the system (4), (5). Such a point $\sigma^0$ can be defined as the normal solution of the system
$$ \langle h_i,\varphi\rangle_H = u_i, \quad 1 \le i \le S, \tag{25} $$
and can be presented as
$$ \sigma^0 = \sum_{i=1}^{S} \mu_i^0 h_i, \tag{26} $$
where the multipliers $\mu_i^0$ are defined from the system
$$ \sum_{j=1}^{S} g_{ij}\mu_j^0 = u_i, \quad 1 \le i \le S. \tag{27} $$
The algorithm described here implements a variant of the active set method [2] with the simplest form of the minimized functional (1). It can also be treated as a variant of the conjugate gradient method [7], and it solves the considered problem in a finite number of operations. The conjugate gradient method in this case reduces to the problem of finding the projection $\vartheta^k$, that is, to solving system (18). The infinite dimensionality of the space $H$ is irrelevant for this algorithm.
The most computationally costly part of the algorithm is the calculation of the projections $\vartheta^k$ by solving systems (18). The matrices of all systems (18) are positive definite (by virtue of the linear independence of the elements $h_i$), and it is convenient to factorize them using the Cholesky decomposition. At every step of the algorithm a constraint is added to or removed from the current active set, and the matrix of system (18) gets modified by adding or deleting the corresponding row and symmetric column. It is possible to get the Cholesky factor of the modified matrix without computing the full matrix factorization (see the previous post), which greatly reduces the total volume of computations.
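For illustration, here is a compact dense numpy sketch of the whole iteration, specialized to the one-sided case discussed just below (no mirrored constraints, so N = S) and using plain solves instead of Cholesky updates. It is a hedged transcription of the seven steps, not a tuned solver.

```python
import numpy as np

def normal_solution_qp(G, b, L, tol=1e-12):
    """min ||phi||^2 s.t. <h_i,phi> = b_i (i < L), <h_i,phi> <= b_i (i >= L).
    G: S x S Gram matrix of linearly independent h_i; returns mu such that
    sigma = sum_i mu_i h_i."""
    S = G.shape[0]
    b = np.asarray(b, dtype=float)
    mu = np.linalg.solve(G, b)          # initial feasible point (25)-(27)
    active = list(range(S))
    while True:
        lam = np.zeros(S)               # projection on the face: system (18)
        lam[active] = np.linalg.solve(G[np.ix_(active, active)], b[active])
        passive = [i for i in range(S) if i not in active]
        e = G @ (lam - mu)              # e_i^k = <h_i, theta_k - sigma_k>
        viol = [i for i in passive if e[i] > tol]
        if viol:
            Gmu = G @ mu                # <h_i, sigma_k>
            t = [(b[i] - Gmu[i]) / e[i] for i in viol]
            tmin = min(t)
            ik = viol[int(np.argmin(t))]
        if viol and tmin < 1:           # theta_k infeasible: partial step
            mu = mu + tmin * (lam - mu)
            active.append(ik)
        else:                           # theta_k feasible
            mu = lam
            pos = [i for i in active if i >= L and lam[i] > tol]
            if not pos:
                return mu               # optimality conditions hold
            active.remove(pos[0])       # drop an inequality with lambda > 0
```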
This algorithm can also be applied in the case when, instead of the modular constraints (3), there are one-sided inequality constraints
$$ \langle h_{i+L},\varphi\rangle_H \le u_{i+L}, \quad 1 \le i \le M; $$
for that it is enough to set
$$ b_{i+L} = u_{i+L}, \qquad b_{i+S} = -\infty, \quad 1 \le i \le M. $$

References

[1] A. Balakrishnan, Applied Functional Analysis, Springer-Verlag, New York, 1976.
[2] P. Gill, W. Murray, M. Wright, Practical Optimization, Academic Press, London, 1981.
[3] V. Gorbunov, The method of reduction of unstable computational problems (Metody reduktsii neustoichivykh vychislitel'nkh zadach), Ilim, 1984.
[4] A. Ioffe, V. Tikhomirov, Theory of extremal problems, North-Holland, Amsterdam, 1979.
[5] P.-J. Laurent, Approximation et optimisation, Paris, 1972.
[6] I. Kohanovsky, Data approximation using multidimensional normal splines. Unpublished manuscript. 1996.
[7] B. Pshenichnyi, Yu. Danilin, Numerical methods in extremal problems, Mir Publishers, Moscow, 1978.
Friday, February 8, 2019
Algorithms for updating Cholesky factorization
The problem of updating a Cholesky factorization was treated in [1], [2] and [3]; the algorithms described here are taken from those works. These algorithms compute a factorization $\tilde A = \tilde L\tilde L^T$, where $\tilde A$ is the matrix obtained from $A = LL^T$ after a row and the symmetric column have been added to or deleted from $A$.
Let $A \in \mathbb R^{n\times n}$ be a symmetric positive definite matrix. Applying the Cholesky method to $A$ yields the factorization $A = LL^T$, where $L$ is a lower triangular matrix with positive diagonal elements.
Suppose $\tilde A$ is a positive definite matrix obtained by adding a row and the symmetric column to $A$:
$$ \tilde A = \begin{pmatrix} A & d \\ d^T & \gamma \end{pmatrix}, \qquad d \in \mathbb R^n. $$
Then its Cholesky factorization is [2]:
$$ \tilde L = \begin{pmatrix} L & 0 \\ e^T & \alpha \end{pmatrix}, $$
where
$$ e = L^{-1} d, $$
and
$$ \alpha = \sqrt{\tau}, \qquad \tau = \gamma - e^T e, \quad \tau > 0. $$
If τ≤0 then ˜A is not positive definite.
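A direct numpy/scipy transcription of these formulas (an illustrative sketch):

```python
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_add_row(L, d, gamma):
    """Cholesky factor of A-tilde obtained by appending the row (d^T, gamma)
    to A = L L^T: e = L^{-1} d, alpha = sqrt(gamma - e^T e)."""
    e = solve_triangular(L, d, lower=True)
    tau = gamma - e @ e
    if tau <= 0:
        raise ValueError("the augmented matrix is not positive definite")
    n = L.shape[0]
    Lt = np.zeros((n + 1, n + 1))
    Lt[:n, :n] = L
    Lt[n, :n] = e
    Lt[n, n] = np.sqrt(tau)
    return Lt
```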
Now assume $\tilde A$ is obtained from $A$ by deleting the $r$-th row and column of $A$. The matrix $\tilde A$ is positive definite (as any principal square submatrix of a positive definite matrix). It is shown in [1], [3] how to get $\tilde L$ from $L$ in this case. Let us partition $A$ and $L$ along the $r$-th row and column:
$$ A = \begin{pmatrix} A_{11} & a_{1r} & A_{12} \\ a_{1r}^T & a_{rr} & a_{2r}^T \\ A_{21} & a_{2r} & A_{22} \end{pmatrix}, \qquad L = \begin{pmatrix} L_{11} & & \\ l_{1r}^T & l_{rr} & \\ L_{21} & l_{2r} & L_{22} \end{pmatrix}. $$
Then ˜A can be written as
$$ \tilde A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}. $$
By deleting the rth row from L we get the matrix
$$ H = \begin{pmatrix} L_{11} & & \\ L_{21} & l_{2r} & L_{22} \end{pmatrix} $$
with the property that $HH^T = \tilde A$. The factor $\tilde L$ can be obtained from $H$ by applying Givens rotations to the matrix elements $h_{r,r+1}, h_{r+1,r+2}, \dots, h_{n-1,n}$ and deleting the last column of the resulting matrix $\tilde H$:
$$ \tilde H = HR = \begin{pmatrix} L_{11} & 0 & 0 \\ L_{21} & \tilde L_{22} & 0 \end{pmatrix} = \begin{pmatrix} \tilde L & 0 \end{pmatrix}, $$
where $R = R_0 R_1 \dots R_{n-1-r}$ is the matrix of Givens rotations, $L_{11}$ and $\tilde L_{22}$ are lower triangular matrices, and
$$ \tilde L = \begin{pmatrix} L_{11} & \\ L_{21} & \tilde L_{22} \end{pmatrix}, \qquad \tilde H\tilde H^T = HRR^TH^T = HH^T = \tilde A, \qquad \tilde H\tilde H^T = \tilde L\tilde L^T, \qquad \tilde L\tilde L^T = \tilde A. $$
The orthogonal matrix $R_t$ ($0 \le t \le n-1-r$), $R_t \in \mathbb R^{n\times n}$, which defines the Givens rotation annihilating the $(r+t, r+t+1)$-th element of the matrix $HR_0\dots R_{t-1}$, is defined as
$$ R_t = \begin{pmatrix} 1 & \dots & 0 & 0 & \dots & 0 \\ \vdots & \ddots & & & & \vdots \\ 0 & \dots & c & s & \dots & 0 \\ 0 & \dots & -s & c & \dots & 0 \\ \vdots & & & & \ddots & \vdots \\ 0 & \dots & 0 & 0 & \dots & 1 \end{pmatrix}, $$
where the entries $(r+t, r+t)$ and $(r+t+1, r+t+1)$ equal $c$, the $(r+t, r+t+1)$ entry equals $s$, and the $(r+t+1, r+t)$ entry equals $-s$; here $c^2 + s^2 = 1$. Let $\tilde l^{t-1}_{ij}$ be the coefficients of the matrix $HR_0\dots R_{t-1}$ (and $\tilde l^{-1}_{ij}$ the coefficients of $H$), and
$$ c = \frac{\tilde l^{t-1}_{r+t,r+t}}{\sqrt{\bigl(\tilde l^{t-1}_{r+t,r+t}\bigr)^2 + \bigl(\tilde l^{t-1}_{r+t,r+t+1}\bigr)^2}}, \qquad s = \frac{\tilde l^{t-1}_{r+t,r+t+1}}{\sqrt{\bigl(\tilde l^{t-1}_{r+t,r+t}\bigr)^2 + \bigl(\tilde l^{t-1}_{r+t,r+t+1}\bigr)^2}}. $$
Then the matrix $HR_0\dots R_t$ differs from $HR_0\dots R_{t-1}$ only in the entries of the $(r+t)$-th and $(r+t+1)$-th columns, so that
$$ \tilde l^t_{i,r+t} = \tilde l^{t-1}_{i,r+t} = 0, \qquad \tilde l^t_{i,r+t+1} = \tilde l^{t-1}_{i,r+t+1} = 0, \qquad 1 \le i \le r+t-1, $$
$$ \tilde l^t_{i,r+t} = c\,\tilde l^{t-1}_{i,r+t} + s\,\tilde l^{t-1}_{i,r+t+1}, \qquad \tilde l^t_{i,r+t+1} = -s\,\tilde l^{t-1}_{i,r+t} + c\,\tilde l^{t-1}_{i,r+t+1}, \qquad r+t \le i \le n-1, $$
where
$$ \tilde l^t_{r+t,r+t} = \sqrt{\bigl(\tilde l^{t-1}_{r+t,r+t}\bigr)^2 + \bigl(\tilde l^{t-1}_{r+t,r+t+1}\bigr)^2}, \qquad \tilde l^t_{r+t,r+t+1} = 0. $$
Note that the coefficient $\tilde l^t_{r+t,r+t}$ is nonnegative.
In order to avoid unnecessary overflow or underflow during the computation of $c$ and $s$, it was recommended in [3] to calculate the value of the square root $w = \sqrt{x^2 + y^2}$ as follows:
$$ v = \max\{|x|,|y|\}, \qquad u = \min\{|x|,|y|\}, \qquad w = \begin{cases} v\sqrt{1 + (u/v)^2}, & v \ne 0, \\ 0, & v = 0. \end{cases} $$
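The deletion case can be transcribed as follows (an illustrative numpy sketch; `r` is the 0-based index of the deleted row/column, and np.hypot implements the safe square root above):

```python
import numpy as np

def cholesky_delete_row(L, r):
    """Cholesky factor of A-tilde obtained from A = L L^T by deleting
    row/column r: apply Givens rotations to H = L without row r."""
    H = np.delete(L, r, axis=0)
    n = H.shape[0]                     # new dimension
    for t in range(r, n):
        x, y = H[t, t], H[t, t + 1]
        w = np.hypot(x, y)             # sqrt(x^2 + y^2) without overflow
        c, s = (x / w, y / w) if w > 0 else (1.0, 0.0)
        # rotate columns t and t+1 of rows t..n-1
        col0 = c * H[t:, t] + s * H[t:, t + 1]
        col1 = -s * H[t:, t] + c * H[t:, t + 1]
        H[t:, t], H[t:, t + 1] = col0, col1
        H[t, t + 1] = 0.0
    return H[:, :n]                    # drop the (numerically zero) last column
```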
Applying this technique to the Cholesky factorization significantly reduces the complexity of the calculations. In the case of adding a row and the symmetric column to the original matrix, about $n^2$ flops are needed instead of about $(n+1)^3/3$ flops for the direct calculation of the new Cholesky factor. In the case of deleting a row and the symmetric column from the original matrix, the new Cholesky factor can be obtained with about $3(n-r)^2$ flops (the worst case requires about $3(n-1)^2$ operations) instead of about $(n-1)^3/3$ flops required for its direct calculation.
References
[1] T. F. Coleman, L. A. Hulbert, A direct active set algorithm for large sparse quadratic programs with simple bounds, Mathematical Programming, 45, 1989.
[2] J. A. George and J. W-H. Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, 1981.
[3] C. L. Lawson, R. J. Hanson, Solving least squares problems, SIAM, 1995.
Simple normal splines examples
In this post we illustrate the normal spline interpolation approach with a few simple examples.
Suppose the following information about a smooth function $\varphi(x,y)$, $(x,y) \in \mathbb R^2$, is given:
$$ \varphi(0,0) = 0, \tag{1} $$
$$ \frac{\partial\varphi}{\partial x}(0,0) = 1, \tag{2} $$
$$ \frac{\partial\varphi}{\partial y}(0,0) = 1, \tag{3} $$
and it is necessary to reconstruct $\varphi$ using these data.
Assume this function is an element of the Hilbert space $H^{2.5}_\varepsilon(\mathbb R^2)$; thus it can be treated as a continuously differentiable function, $\varphi \in C^1(\mathbb R^2)$. We construct a normal spline $\sigma_1^{(2.5)}$ approximating $\varphi$:
$$ \sigma_1^{(2.5)} = \operatorname{argmin}\bigl\{ \|\varphi\|^2_{H^{2.5}_\varepsilon} : (1), (2), (3), \ \forall\varphi \in H^{2.5}_\varepsilon(\mathbb R^2) \bigr\}. $$
This spline can be presented as
$$ \sigma_1^{(2.5)} = \mu_1 h_1 + \mu'_1 h'_1 + \mu'_2 h'_2, $$
here
$$ h_1(\eta_1,\eta_2,\varepsilon) = \exp\bigl(-\varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr)\bigl(1 + \varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr), $$
$$ h'_1(\eta_1,\eta_2,\varepsilon) = \varepsilon^2\exp\bigl(-\varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr)\,\eta_1, \qquad h'_2(\eta_1,\eta_2,\varepsilon) = \varepsilon^2\exp\bigl(-\varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr)\,\eta_2, \qquad (\eta_1,\eta_2) \in \mathbb R^2, $$
and the coefficients $(\mu_1, \mu'_1, \mu'_2)$ are defined from the system (see the previous post):
$$ \begin{bmatrix} 1 & 0 & 0 \\ 0 & \varepsilon^2 & 0 \\ 0 & 0 & \varepsilon^2 \end{bmatrix} \begin{bmatrix} \mu_1 \\ \mu'_1 \\ \mu'_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}. $$
Eventually
$$ \sigma_1^{(2.5)}(x,y,\varepsilon) = \exp\bigl(-\varepsilon\sqrt{x^2+y^2}\bigr)\,(x+y), \qquad (x,y) \in \mathbb R^2. $$
Fig. 1. Spline $\sigma_1^{(2.5)}$, $\varepsilon = 1$.
Fig. 2. Spline $\sigma_1^{(2.5)}$, $\varepsilon = 0.1$.
Now let the function $\varphi(x,y)$, $(x,y) \in \mathbb R^2$, be a twice continuously differentiable function which satisfies the constraints:
$$ \varphi(0,0) = 0, \tag{11} $$
$$ \frac{\partial\varphi}{\partial x}(0,0) + \frac{\partial\varphi}{\partial y}(0,0) = 2. \tag{12} $$
We approximate it by constructing a normal spline $\sigma_1^{(3.5)}$ in $H^{3.5}_\varepsilon(\mathbb R^2)$:
$$ \sigma_1^{(3.5)} = \operatorname{argmin}\bigl\{ \|\varphi\|^2_{H^{3.5}_\varepsilon} : (11), (12), \ \forall\varphi \in H^{3.5}_\varepsilon(\mathbb R^2) \bigr\}, \qquad \sigma_1^{(3.5)} = \mu_1 h_1 + \mu'_1 h'_1, $$
where
$$ h_1(\eta_1,\eta_2,\varepsilon) = \exp\bigl(-\varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr)\bigl(3 + 3\varepsilon\sqrt{\eta_1^2+\eta_2^2} + \varepsilon^2(\eta_1^2+\eta_2^2)\bigr), $$
$$ h'_1(\eta_1,\eta_2,\varepsilon) = \varepsilon^2\exp\bigl(-\varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr)\bigl(1 + \varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr)\,(\eta_1+\eta_2), $$
and coefficients (μ1,μ′1) are defined from the system:
$$ \begin{bmatrix} 3 & 0 \\ 0 & 2\varepsilon^2 \end{bmatrix} \begin{bmatrix} \mu_1 \\ \mu'_1 \end{bmatrix} = \begin{bmatrix} 0 \\ 2 \end{bmatrix}. $$
Therefore
$$ \sigma_1^{(3.5)}(x,y,\varepsilon) = \exp\bigl(-\varepsilon\sqrt{x^2+y^2}\bigr)\bigl(1 + \varepsilon\sqrt{x^2+y^2}\bigr)\,(x+y), \qquad (x,y) \in \mathbb R^2. $$
As the last example, consider the problem of reconstructing a continuously differentiable function $\varphi(x)$, $x \in \mathbb R$, which satisfies the constraint
$$ \frac{d\varphi}{dx}(0) = 1 \tag{19} $$
and is closest to the function $z(x) = 2x$, $x \in \mathbb R$. We approximate it by constructing a normal spline $\sigma_1^{(2)}$ in $H^2_\varepsilon(\mathbb R)$:
$$ \sigma_1^{(2)} = \operatorname{argmin}\bigl\{ \|\varphi - z\|^2_{H^2_\varepsilon} : (19), \ \forall\varphi \in H^2_\varepsilon(\mathbb R) \bigr\}, \qquad \sigma_1^{(2)} = z + \mu'_1 h'_1 = 2x + \mu'_1 h'_1. $$
Performing calculations analogous to the previous ones, we receive:
$$ \sigma_1^{(2)}(x,\varepsilon) = 2x - x\exp(-\varepsilon|x|), \qquad x \in \mathbb R. $$
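A quick numerical check of the last formula (not in the original post): the central difference of $\sigma_1^{(2)}$ at $x = 0$ reproduces the constraint (19).

```python
import numpy as np

eps, h = 1.0, 1e-6
sigma = lambda x: 2 * x - x * np.exp(-eps * np.abs(x))
print((sigma(h) - sigma(-h)) / (2 * h))   # ~= 1.0, i.e. d(sigma)/dx(0) = 1
```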
Tuesday, January 29, 2019
Hermite-Birkhoff Interpolation of Scattered Data
Suppose $P_1 = \{p_i\}_{i=1}^{M_1}$ and $P_2 = \{p'_i\}_{i=1}^{M_2}$, $p_i, p'_i \in \mathbb R^n$, are sets of data locations lying in a domain $\Omega \subset \mathbb R^n$.
Let the functional $f_i$ be the value of a function $\varphi$ at a node $p_i \in P_1$, and the functional $f'_i$ be the value of the directional derivative (the slope) of that function at a node $p'_i \in P_2$.
Assume the values of the functionals $f_i$ and $f'_i$ at the nodes belonging to $P_1$ and $P_2$ are known:
$$ (f_i,\varphi) = \varphi(p_i) = u_i, \quad p_i \in P_1, \ 1 \le i \le M_1, \tag{1} $$
$$ (f'_i,\varphi) = \frac{\partial\varphi}{\partial e_i}(p'_i) = u'_i, \quad p'_i \in P_2, \ 1 \le i \le M_2, \tag{2} $$
where $\frac{\partial\varphi}{\partial e_i}(p'_i) = \sum_{k=1}^{n} \frac{\partial\varphi}{\partial x_k}(p'_i)\cos\theta_{ik}$, and $\cos\theta_{ik}$, $1 \le k \le n$, are the direction cosines of a unit vector $e_i$.
Consider the problem of reconstructing the function $\varphi$. Let the function to be approximated belong to the Hilbert space $H^s_\varepsilon(\mathbb R^n)$, where $s > r + n/2$, $r = 1, 2, \dots$. In that case the space $H^s_\varepsilon$ is continuously embedded in the Hölder space $C^r_b(\mathbb R^n)$, so $\varphi \in C^r(\mathbb R^n)$, and the values of the function $\varphi$ and of its partial derivatives at fixed points are linear continuous functionals on $H^s_\varepsilon$.
We assume that $P_1$ does not contain coincident nodes and that no two directional derivatives defined at the same node of $P_2$ have linearly dependent directions; a node from $P_1$ may coincide with a node from $P_2$. Under these restrictions the functionals $f_i, f'_i$ are linearly independent.
In accordance with the Riesz representation theorem [1], the linear continuous functionals $f_i, f'_i$ can be represented in the form of inner products of some elements $h_i, h'_i \in H^s_\varepsilon$ with $\varphi$, for any $\varphi \in H^s_\varepsilon$:
$$ (f_i,\varphi) = \langle h_i,\varphi\rangle_{H^{s_r}_\varepsilon}, \qquad (f'_i,\varphi) = \langle h'_i,\varphi\rangle_{H^{s_r}_\varepsilon}, \qquad \forall\varphi \in H^s_\varepsilon. \tag{3} $$
The notation $\langle\cdot,\cdot\rangle_{H^{s_r}_\varepsilon}$ in (3) denotes the inner product in the space $H^{s_r}_\varepsilon$ (see the previous post for details). It is easy to see that under the above assumptions the elements $h_i, h'_i$ are linearly independent.
Let $V_{s_r}(\eta,x,\varepsilon)$, $s_r = r + n/2 + 1/2$, $r = 1, 2, \dots$, be the reproducing kernel of the Hilbert space $H^{s_r}_\varepsilon(\mathbb R^n)$ (see the previous post). The kernel $V_{s_r}(\eta,x,\varepsilon)$, as a function of $\eta \in \mathbb R^n$, belongs to $C^r(\mathbb R^n)$. In accordance with the reproducing property of the reproducing kernel, for any $\varphi \in H^{s_r}_\varepsilon(\mathbb R^n)$ and $x \in \mathbb R^n$ the identity
$$ \varphi(x) = \langle V_{s_r}(\cdot,x,\varepsilon), \varphi\rangle_{H^{s_r}_\varepsilon} \tag{4} $$
holds. This means that $V_{s_r}(\cdot,x)$ is the Riesz representer of the evaluation functional $\delta_x$ defined as the value of a function $\varphi \in H^{s_r}_\varepsilon$ at a given point $x \in \mathbb R^n$. As $V_{s_r}$ is a continuously differentiable radial basis function, we can differentiate the identity (4) and receive the identity
$$ \frac{\partial\varphi(x)}{\partial x_i} = \left\langle \frac{\partial V_{s_r}(\cdot,x,\varepsilon)}{\partial x_i}, \varphi \right\rangle_{H^{s_r}_\varepsilon}, \tag{5} $$
which holds for any $\varphi \in H^{s_r}_\varepsilon$ and $x \in \mathbb R^n$. It means that the function $\frac{\partial V_{s_r}(\cdot,x)}{\partial x_i}$ represents the point-wise functional defined as the value of $\frac{\partial\varphi(\cdot)}{\partial x_i}$ at the point $x$.
Thereby
$$ h_i(\eta,\varepsilon) = V_{s_r}(\eta,p_i,\varepsilon), \quad p_i \in P_1, \ 1 \le i \le M_1, \tag{6} $$
$$ h'_i(\eta,\varepsilon) = \frac{\partial V_{s_r}(\eta,p'_i,\varepsilon)}{\partial e_i}, \quad p'_i \in P_2, \ 1 \le i \le M_2, \quad \eta \in \mathbb R^n, \tag{7} $$
and the system of constraints (1), (2) can be presented as a system of linear equations in the Hilbert space $H^{s_r}_\varepsilon$:
$$ \langle h_i,\varphi\rangle_{H^{s_r}_\varepsilon} = u_i, \quad 1 \le i \le M_1, \tag{8} $$
$$ \langle h'_i,\varphi\rangle_{H^{s_r}_\varepsilon} = u'_i, \quad 1 \le i \le M_2, \qquad h_i, h'_i, \varphi \in H^{s_r}_\varepsilon(\mathbb R^n). \tag{9} $$
This system of constraints defines a convex closed set in the Hilbert space, as an intersection of a finite number of hyperplanes (this set is not empty because of the linear independence of the elements $h_i, h'_i$).
The problem of reconstructing a function $\varphi$ satisfying the system of constraints (8), (9) is underdetermined. We reformulate it as the problem of finding the solution of the system (8), (9) that has minimal norm:
$$ \sigma_1 = \operatorname{argmin}\bigl\{ \|\varphi\|^2_{H^{s_r}_\varepsilon} : (8), (9), \ \forall\varphi \in H^{s_r}_\varepsilon(\mathbb R^n) \bigr\}; \tag{10} $$
such a solution exists and is unique [1, 6]. The element $\sigma_1$ is an interpolating normal spline.
In accordance with the generalized Lagrange method ([5], [6]), the solution of problem (10) can be written as follows:
$$ \sigma_1 = \sum_{j=1}^{M_1} \mu_j h_j + \sum_{j=1}^{M_2} \mu'_j h'_j, \tag{11} $$
here the coefficients $\mu_i, \mu'_i$ are defined by the system
$$ \sum_{j=1}^{M_1} g_{ij}\mu_j + \sum_{j=1}^{M_2} g'_{ij}\mu'_j = u_i, \quad 1 \le i \le M_1, \tag{12} $$
$$ \sum_{j=1}^{M_1} g'_{ji}\mu_j + \sum_{j=1}^{M_2} g''_{ij}\mu'_j = u'_i, \quad 1 \le i \le M_2, \tag{13} $$
where $g_{ij}, g'_{ij}, g''_{ij}$ are the coefficients of the symmetric positive definite Gram matrix of the set of linearly independent elements $h_i, h'_i \in H^{s_r}_\varepsilon(\mathbb R^n)$:
$$ g_{ij} = \langle h_i,h_j\rangle_{H^{s_r}_\varepsilon} = (f_i,h_j), \quad 1 \le i \le M_1, \ 1 \le j \le M_1, $$
$$ g'_{ij} = \langle h_i,h'_j\rangle_{H^{s_r}_\varepsilon} = (f_i,h'_j), \quad 1 \le i \le M_1, \ 1 \le j \le M_2, $$
$$ g''_{ij} = \langle h'_i,h'_j\rangle_{H^{s_r}_\varepsilon} = (f'_i,h'_j), \quad 1 \le i \le M_2, \ 1 \le j \le M_2. $$
Taking into account (3), (6), (7), we receive:
$$ g_{ij} = V_{s_r}(p_i,p_j,\varepsilon), \quad p_i,p_j \in P_1, \ 1 \le i,j \le M_1, $$
$$ g'_{ij} = \frac{\partial V_{s_r}(p_i,p'_j,\varepsilon)}{\partial e_j} = \sum_{k=1}^{n} \frac{\partial V_{s_r}(p_i,p'_j,\varepsilon)}{\partial x_k}\cos\theta_{jk}, \quad p_i \in P_1, \ p'_j \in P_2, \tag{18} $$
$$ g''_{ij} = \frac{\partial^2 V_{s_r}(p'_i,p'_j,\varepsilon)}{\partial e_i\,\partial e_j} = \sum_{l=1}^{n}\sum_{k=1}^{n} \frac{\partial^2 V_{s_r}(p'_i,p'_j,\varepsilon)}{\partial\eta_l\,\partial x_k}\cos\theta_{il}\cos\theta_{jk}, \quad p'_i,p'_j \in P_2. \tag{19} $$
Thus problem (10) is reduced to solving the system of linear equations (12), (13) with the Gram matrix defined by (18), (19), and constructing the normal spline (11).
Assume the function to be approximated, $\varphi$, belongs to the Hilbert space $H^{s_1}_\varepsilon(\mathbb R^n)$, $s_1 = 1 + n/2 + 1/2$ (see the previous post). It is a continuously differentiable function, $\varphi \in C^1(\mathbb R^n)$. Write down the expressions for $h_j, h'_j, g_{ij}, g'_{ij}, g''_{ij}$ in the space $H^{s_1}_\varepsilon(\mathbb R^n)$:
$$ h_j(\eta,\varepsilon) = \exp(-\varepsilon|\eta-p_j|)\,(1+\varepsilon|\eta-p_j|), \quad p_j \in P_1, \ 1 \le j \le M_1, $$
$$ h'_j(\eta,\varepsilon) = \varepsilon^2\exp(-\varepsilon|\eta-p'_j|)\sum_{k=1}^{n}(\eta_k - p'_{jk})\cos\theta_{jk}, \quad p'_j \in P_2, \ 1 \le j \le M_2, \quad \eta \in \mathbb R^n; $$
then
$$ g_{ij} = \exp(-\varepsilon|p_i-p_j|)\,(1+\varepsilon|p_i-p_j|), \quad p_i,p_j \in P_1, \ 1 \le i,j \le M_1, $$
$$ g'_{ij} = \varepsilon^2\exp(-\varepsilon|p_i-p'_j|)\sum_{k=1}^{n}(p_{ik} - p'_{jk})\cos\theta_{jk}, \quad p_i \in P_1, \ p'_j \in P_2, $$
$$ g''_{ij} = \sum_{l=1}^{n}\sum_{k=1}^{n} \frac{\partial^2 V_{s_1}(p'_i,p'_j,\varepsilon)}{\partial\eta_l\,\partial x_k}\cos\theta_{il}\cos\theta_{jk}, \quad p'_i,p'_j \in P_2, \ i \ne j, $$
where
$$ \frac{\partial^2 V_{s_1}(p'_i,p'_j,\varepsilon)}{\partial\eta_k\,\partial x_k} = \varepsilon^2\exp(-\varepsilon|p'_i-p'_j|)\left(1 - \frac{\varepsilon\,(p'_{ik}-p'_{jk})^2}{|p'_i-p'_j|}\right), $$
$$ \frac{\partial^2 V_{s_1}(p'_i,p'_j,\varepsilon)}{\partial\eta_l\,\partial x_k} = -\varepsilon^3\exp(-\varepsilon|p'_i-p'_j|)\,\frac{(p'_{ik}-p'_{jk})(p'_{il}-p'_{jl})}{|p'_i-p'_j|}, \quad l \ne k, $$
and
$$ g''_{ii} = \varepsilon^2\sum_{l=1}^{n}\cos^2\theta_{il} = \varepsilon^2, \quad 1 \le i \le M_2. $$
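These formulas translate directly into code. Below is a hedged numpy sketch (the function name and argument conventions are placeholders) that assembles the full Gram matrix for the $H^{s_1}_\varepsilon$ kernel from value nodes `P1` and directional-derivative nodes `P2` with unit directions `E2`:

```python
import numpy as np

def gram_s1(P1, P2, E2, eps):
    """Gram matrix for V(x,y) = exp(-eps r)(1 + eps r): value constraints
    at the rows of P1, directional-derivative constraints at the rows of P2
    along the unit vectors in E2 (one row per node)."""
    M1, M2, n = len(P1), len(P2), P1.shape[1]
    G = np.zeros((M1 + M2, M1 + M2))
    for i in range(M1):
        for j in range(M1):                      # g_ij
            r = np.linalg.norm(P1[i] - P1[j])
            G[i, j] = np.exp(-eps * r) * (1 + eps * r)
    for i in range(M1):
        for j in range(M2):                      # g'_ij
            d = P1[i] - P2[j]
            g = eps**2 * np.exp(-eps * np.linalg.norm(d)) * (d @ E2[j])
            G[i, M1 + j] = G[M1 + j, i] = g
    for i in range(M2):
        for j in range(M2):                      # g''_ij
            if i == j:
                G[M1 + i, M1 + j] = eps**2
                continue
            d = P2[i] - P2[j]
            r = np.linalg.norm(d)
            # matrix of mixed second derivatives d2V/(deta_l dx_k)
            D2V = eps**2 * np.exp(-eps * r) * (np.eye(n) - eps * np.outer(d, d) / r)
            G[M1 + i, M1 + j] = E2[i] @ D2V @ E2[j]
    return G
```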
Consider the case when the function to be approximated, $\varphi$, may be treated as an element of the Hilbert space $H^{s_2}_\varepsilon(\mathbb R^n)$, $s_2 = 2 + n/2 + 1/2$. In this case it is a twice continuously differentiable function, $\varphi \in C^2(\mathbb R^n)$. Let's write down the expressions for $h_j, h'_j$ in the space $H^{s_2}_\varepsilon(\mathbb R^n)$:
$$ h_j(\eta,\varepsilon) = \exp(-\varepsilon|\eta-p_j|)\bigl(3 + 3\varepsilon|\eta-p_j| + \varepsilon^2|\eta-p_j|^2\bigr), \quad p_j \in P_1, \ 1 \le j \le M_1, $$
$$ h'_j(\eta,\varepsilon) = \varepsilon^2\exp(-\varepsilon|\eta-p'_j|)\,(1+\varepsilon|\eta-p'_j|)\sum_{k=1}^{n}(\eta_k - p'_{jk})\cos\theta_{jk}, \quad p'_j \in P_2, \ 1 \le j \le M_2, \quad \eta \in \mathbb R^n, $$
and the corresponding Gram matrix elements $g_{ij}, g'_{ij}, g''_{ij}$ are as follows:
$$ g_{ij} = \exp(-\varepsilon|p_i-p_j|)\bigl(3 + 3\varepsilon|p_i-p_j| + \varepsilon^2|p_i-p_j|^2\bigr), \quad p_i,p_j \in P_1, \ 1 \le i \le M_1, $$
$$ g'_{ij} = \varepsilon^2\exp(-\varepsilon|p_i-p'_j|)\,(1+\varepsilon|p_i-p'_j|)\sum_{k=1}^{n}(p_{ik} - p'_{jk})\cos\theta_{jk}, \quad p_i \in P_1, \ p'_j \in P_2, $$
$$ g''_{ij} = \sum_{l=1}^{n}\sum_{k=1}^{n} \frac{\partial^2 V_{s_2}(p'_i,p'_j,\varepsilon)}{\partial\eta_l\,\partial x_k}\cos\theta_{il}\cos\theta_{jk}, \quad p'_i,p'_j \in P_2, \ i \ne j, $$
where
$$ \frac{\partial^2 V_{s_2}(p'_i,p'_j,\varepsilon)}{\partial\eta_k\,\partial x_k} = \varepsilon^2\exp(-\varepsilon|p'_i-p'_j|)\bigl(1 + \varepsilon|p'_i-p'_j| - \varepsilon^2(p'_{ik}-p'_{jk})^2\bigr), $$
$$ \frac{\partial^2 V_{s_2}(p'_i,p'_j,\varepsilon)}{\partial\eta_l\,\partial x_k} = -\varepsilon^4\exp(-\varepsilon|p'_i-p'_j|)\,(p'_{ik}-p'_{jk})(p'_{il}-p'_{jl}), \quad l \ne k, $$
and
$$ g''_{ii} = \varepsilon^2\sum_{l=1}^{n}\cos^2\theta_{il} = \varepsilon^2, \quad 1 \le i \le M_2. $$
If the original task does not contain constraints involving the derivatives of $\varphi$ ($P_2$ is an empty set, so there are no constraints (2) in (1)–(2)), we may construct an interpolating normal spline assuming $\varphi$ is an element of the Hilbert space $H^{s_0}_\varepsilon(\mathbb R^n)$, $s_0 = n/2 + 1/2$. In this case it is a continuous function, $\varphi \in C(\mathbb R^n)$, and the expressions for $h_i, g_{ij}$ are defined by:
$$ h_i(\eta,\varepsilon) = \exp(-\varepsilon|\eta-p_i|), \quad \eta \in \mathbb R^n, \ p_i \in P_1, \ 1 \le i \le M_1, $$
$$ g_{ij} = \exp(-\varepsilon|p_j-p_i|), \quad p_i,p_j \in P_1, \ 1 \le i \le M_1, \ 1 \le j \le M_1. $$
When the value of the parameter $\varepsilon$ is small, this spline is similar to a multivariate generalization of the one-dimensional linear spline.
We now consider the choice of values for the parameters $s$ and $\varepsilon$. The parameter $s$ defines the smoothness of the spline: if $s > n/2$ then the spline belongs to $C^{[s-n/2]}$, where $[\cdot]$ denotes the integer part of a number. Moreover, discontinuities of derivatives of higher order are observed only at the nodes $p_i, p'_i$; at all other points of $\mathbb R^n$ the spline is infinitely differentiable. With increasing $s$ the condition number of the Gram matrix increases, and the computational properties of the problem of approximating $\varphi$ worsen. Thereby it makes sense to choose a reasonably small value for the parameter $s$.
A normal spline is similar to Duchon's $D^m$-spline [3] when the value of $\varepsilon$ is small. The approximating properties of the spline improve with smaller values of $\varepsilon$; however, with decreasing $\varepsilon$ the condition number of the Gram matrix increases. Therefore, when choosing the value of $\varepsilon$, a compromise is needed: in practice, it is necessary to choose a value of $\varepsilon$ for which the condition number of the Gram matrix remains small enough. Numerical procedures for estimating a matrix condition number are well known (e.g. [4]).
Also, as a rule, it is useful to preprocess the source data of the problem by scaling the domain $\Omega \subset \mathbb R^n$, in which the interpolation nodes $p_i, p'_i$ are located, into a unit hypercube.
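The trade-off in $\varepsilon$ can be observed directly (a small illustration, not from the original text): the condition number of the Gram matrix of the $H^{s_0}_\varepsilon$ kernel grows as $\varepsilon$ decreases.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20)
for eps in (10.0, 1.0, 0.1, 0.01):
    G = np.exp(-eps * np.abs(x[:, None] - x[None, :]))   # g_ij for the s0 kernel
    print(f"eps = {eps:5}:  cond(G) = {np.linalg.cond(G):.3e}")
```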
Often the accurate values of $u_i, u'_i$ in (1), (2) are not known. Assuming that uniform error bounds $\delta_i, \delta'_i$ are given, we can formulate the problem of approximating $\varphi$ by constructing a uniform smoothing normal spline $\sigma_2$:
$$ \sigma_2 = \operatorname{argmin}\bigl\{ \|\varphi\|^2_{H^{s_r}_\varepsilon} : (35), (36), \ \forall\varphi \in H^{s_r}_\varepsilon(\mathbb R^n) \bigr\}, \tag{34} $$
$$ |\varphi(p_i) - u_i| \le \delta_i, \quad p_i \in P_1, \ 1 \le i \le M_1, \tag{35} $$
$$ \left|\frac{\partial\varphi}{\partial e_i}(p'_i) - u'_i\right| \le \delta'_i, \quad p'_i \in P_2, \ 1 \le i \le M_2. \tag{36} $$
In the case when a mean squared error $\delta^2$ is known, an approximating mse-smoothing normal spline $\sigma_3$ can be constructed as the solution of the following variational problem:
$$ \sigma_3 = \operatorname{argmin}\bigl\{ \|\varphi\|^2_{H^{s_r}_\varepsilon} : (38), \ \forall\varphi \in H^{s_r}_\varepsilon(\mathbb R^n) \bigr\}, \tag{37} $$
$$ \sum_{p_i\in P_1} \bigl(\varphi(p_i) - u_i\bigr)^2 + \sum_{p'_i\in P_2} \left(\frac{\partial\varphi}{\partial e_i}(p'_i) - u'_i\right)^2 \le \delta^2. \tag{38} $$
Constraints (35), (36) and (38) define convex closed sets in the Hilbert space $H^{s_r}_\varepsilon(\mathbb R^n)$, $s_r = r + n/2 + 1/2$, $r = 1, 2, \dots$. Under the assumptions made earlier these sets are not empty; thus problems (34) and (37) are problems of finding the projection of the zero element of the Hilbert space onto a nonempty closed convex set. The solution of such a problem exists and is unique.
Algorithms for solving problems (34) and (37) will be described in the next posts.
Earlier, the problem of constructing multivariate normal splines was treated in [2], [7], [8] and presented at [9] and [10].

References
[1] A. Balakrishnan, Applied Functional Analysis, New York, Springer-Verlag, 1976.
[2] V. Gorbunov, I. Kohanovsky, K. Makedonsky, Normal splines in reconstruction of multi-dimensional dependencies. Papers of WSEAS International Conference on Applied Mathematics, Numerical Analysis Symposium, Corfu, 2004. (http://www.wseas.us/e-library/conferences/corfu2004/papers/488-312.pdf)
[3] J. Duchon, Splines minimizing rotation-invariant semi-norms in Sobolev spaces, Lect. Notes in Math., Vol. 571, Springer, Berlin, 1977
[4] W.Hager, Condition Estimates, SIAM Journal on Scientific and Statistical Computing, Vol.5, N.2, 1984
[5] A. Ioffe, V. Tikhomirov, Theory of extremal problems, North-Holland, Amsterdam, 1979.
[6] P.-J. Laurent, Approximation et optimisation, Paris, 1972.
[7] I. Kohanovsky, Normal Splines in Computing Tomography, Avtometriya, 1995, N 2. (https://www.iae.nsk.su/images/stories/5_Autometria/5_Archives/1995/2/84-89.pdf)
[8] I. Kohanovsky, Data approximation using multidimensional normal splines. Unpublished manuscript. 1996.
[9] I. Kohanovsky, Multidimensional Normal Splines and Problem of Physical Field Approximation, International Conference on Fourier Analysis and its Applications, Kuwait, 1998.
[10] I. Kohanovsky, Normal splines in fractional order Sobolev spaces and some of its applications, The Third Siberian Congress on Applied and Industrial Mathematics (INPRIM-98), Novosibirsk, 1998.