Friday, July 19, 2019

Quadratic Programming in Hilbert space. Part II. Adaptive algorithm.

This is a continuation of the previous post.
$\setCounter{27}$
One disadvantage of the algorithm described in the previous post is that it requires an initial feasible point. Computing such a point may require a significant amount of work. In addition, when the number of inequality constraints is large, the rate of convergence may be slow. At the same time, it often happens that the active set corresponding to the problem solution contains only a small number of the inequality constraints, significantly fewer than their total number. In such cases it makes sense to apply a different approach to solving the problem (1), (4), (5). Its essence is as follows: we construct a sequence of approximations $\sigma^k$, each of which is a solution of an auxiliary problem of minimizing the functional (1) on a certain subset of the constraints (4), (5). The auxiliary problems are defined in such a way that the value of the functional (1) strictly increases for each subsequent approximation; they are solved with the algorithm outlined in the previous post. Thus, in order to get the solution of the original problem (1), (4), (5), it is necessary to solve a series of problems with a small number of constraints. Let's describe this approach in detail; a schematic outer loop is sketched just below.
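The following Python sketch illustrates this outer loop under stated assumptions: `solve_subproblem` stands for the algorithm from the previous post applied to the subsystem (28), (29), and `violation` evaluates ${\langle h_i , \sigma \rangle}_H - b_i$; both names are hypothetical placeholders, not part of the original algorithm description.
```python
# A sketch of the adaptive outer loop (hypothetical helper names).
# solve_subproblem(I1, I2k) -- returns the normal solution of the
#   subsystem (28), (29), e.g. via the algorithm from the previous post.
# violation(i, sigma) -- returns <h_i, sigma>_H - b_i for constraint i.

def adaptive_qp(I1, I2, I2_initial, solve_subproblem, violation):
    I2k = set(I2_initial)            # current subset of inequality constraints
    while True:
        sigma = solve_subproblem(I1, I2k)          # normal solution sigma^k
        rest = [i for i in I2 if i not in I2k]     # constraints not yet used
        if not rest:
            return sigma                           # all constraints included
        i_t = max(rest, key=lambda i: violation(i, sigma))
        if violation(i_t, sigma) <= 0.0:
            return sigma             # sigma^k is feasible: problem solved
        I2k.add(i_t)                 # add the most violated constraint
```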
Consider a subsystem of the constraints (4), (5):
\begin{eqnarray}
&&    {\langle h_i , \varphi \rangle}_H = b_i \,\quad  i \in I_1 \,  ,
\\
&&    {\langle h_i , \varphi \rangle}_H \le b_i \,\quad  i \in I_2^k \,   ,
\\ \nonumber
&& I_2^k \subset I_2 \ , \qquad I^k = I_1 \cup I_2^k \, ,
\end{eqnarray}
where $I_1$ and $I_2$ are defined in (7). Denote by $\Phi^k$ the convex set defined by constraints (28) and (29). Obviously $\Phi$, the set defined by constraints (4) and (5), is a subset of $\Phi^k$.
Let $\hat\sigma^k$ be the normal solution of system (28), (29). Assume that $\hat\sigma^k$ does not satisfy all constraints (5), i.e. $\hat\sigma^k \not\in \Phi$ (otherwise $\hat\sigma^k$ is the solution of the problem (1), (4), (5)), and let $i_t \in I_2 \setminus I_2^k$ be the index of the most violated constraint. We set
\begin{eqnarray*}
I_2^{k+1} = I_2^k  \cup \{ i_t \} \, , \qquad I^{k+1} = I_1 \cup I_2^{k+1} \, .
\end{eqnarray*}
Let $\Phi^{k+1}$ be the convex set defined by the constraints with indices from $I^{k+1}$; then
\begin{eqnarray*}
\Phi^{k+1} \subset \Phi^k \, ,
\end{eqnarray*}
and therefore the following inequality holds:
\begin{eqnarray}
\| \hat\sigma^k \|_H ^2 < \| \hat\sigma^{k+1} \|_H ^2  \, .
\label{eq:sk}
\end{eqnarray}
The inequality is strict because $\hat\sigma^k \notin \Phi^{k+1}$.
If $\hat\sigma^{k+1}$ does not satisfy some constraint with an index from $I_2 \setminus I_2^{k+1}$, the described process is repeated. Since the inequality (30) always holds and the number of constraints is finite, the algorithm converges to the normal solution of the system (4), (5) in a finite number of steps.
The computational details of the algorithm are described below.
Let $\hat A^k$ be the active set corresponding to the normal solution $\hat\sigma^k$ of system (28), (29):
\begin{eqnarray*}
\hat A^k = \{ i \in I^k : \langle h_i , \hat\sigma^k \rangle_H = b_i \} \, ,
\end{eqnarray*}
then
\begin{eqnarray}
 && \hat\sigma^k = \sum_{j \in \hat A^k } \mu_j h_j \, ,
\\
&& \sum_{j \in \hat A^k } g_{ij} \mu_j = b_i \ , \quad i \in \hat A^k \, ,
\\ \nonumber
&& g_{ij} = {\langle h_i , h_j \rangle}_H \ , \quad i \in \hat A^k \, , \ \  j \in \hat A^k \, .
\end{eqnarray}
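Since the Gram matrix $\{g_{ij}\}$, $i, j \in \hat A^k$, is symmetric positive definite, system (32) is naturally solved via a Cholesky factorization. A minimal NumPy/SciPy sketch (the dense array `G` and vector `b`, holding $g_{ij}$ and $b_i$ for the active set, are assumptions):
```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def solve_gram_system(G: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Solve G mu = b for the coefficients mu_j in (31), (32)."""
    c, low = cho_factor(G, lower=True)   # G = L L^T (G is s.p.d.)
    return cho_solve((c, low), b)        # coefficients mu_j, j in A^k
```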
For every $i \in I_2 \setminus I_2^k$ we compute the values
\begin{eqnarray*}
\hat e_i^k = \langle h_i, \hat\sigma^k \rangle_H - b_i = \sum _{j \in \hat A^k } g_{ij} \mu _j - b_i  \, ,
\end{eqnarray*}
and find the largest among them; its index is denoted by $i_t$:
\begin{eqnarray*}
\hat e_{i_t}^k = \max _{i \in I_2 \setminus I_2^k} \{ \hat e_i^k \} \, .
\end{eqnarray*}
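With the coefficients $\mu_j$ computed, the values $\hat e_i^k$ are linear in the already available Gram entries. A small sketch (array names are assumptions; row $i$ of `G_rest` holds $g_{ij}$, $j \in \hat A^k$, for a constraint $i \in I_2 \setminus I_2^k$):
```python
import numpy as np

def most_violated(G_rest: np.ndarray, b_rest: np.ndarray, mu: np.ndarray):
    """Return the position and value of the largest residual e_i."""
    e = G_rest @ mu - b_rest      # e_i = <h_i, sigma^k>_H - b_i
    t = int(np.argmax(e))         # index of the most violated constraint
    return t, e[t]                # if e[t] <= 0, sigma^k solves the problem
```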
If $\hat e_{i_t}^k \le 0$, then $\hat\sigma^k$ satisfies all constraints (5) and it is the solution of the original problem (1), (4), (5), that is, $\sigma = \hat\sigma^k$. Otherwise we have to solve a new auxiliary problem: minimize the functional (1) on the solutions of the augmented constraint subsystem
\begin{eqnarray*}
&& {\langle h_i , \varphi \rangle}_H = b_i \,\quad  i \in I_1 \, ,
\\
&& {\langle h_i , \varphi \rangle}_H \le b_i \,\quad  i \in I_2^k \,  ,
\\
&& {\langle h_{i_t} , \varphi \rangle}_H \le b_{i_t} \, ,
\\
&& I_2^{k+1} = I_2^k \cup \{ i_t \} \ ,  \quad  I^{k+1} = I_1 \cup I_2^{k+1} \, ,
\end{eqnarray*}
An initial feasible point required to solve this problem can be defined as the normal solution $\tilde\sigma^{k+1}$ of the system
\begin{eqnarray*}
&& {\langle h_i , \varphi \rangle}_H = b_i \, \quad  i \in \hat A^k \, ,
\\
&& {\langle h_{i_t} , \varphi \rangle}_H = b_{i_t} \, .
\end{eqnarray*}
Let $\tilde I^{k+1} = \hat A^k \cup \{i_t\}$; then $\tilde\sigma^{k+1}$ can be written as
\begin{eqnarray}
 && \tilde \sigma^{k+1} = \sum _{j \in \tilde I^{k+1} } \mu _j h_j \ ,
\\
 && \sum _{j \in \tilde I^{k+1}} g_{ij} \mu _j = b_i \ ,  \quad i \in \tilde I^{k+1} \ .
\end{eqnarray}
Since the matrix of system (34) differs from the matrix of system (32) only in its last row and column, its Cholesky factorization can be obtained by modifying the Cholesky factorization of the matrix of system (32), as described in the post "Algorithms for updating a Cholesky factorization". A minimal sketch of such an update is given below.
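For reference, here is a dense sketch of such a "bordering" update, under the assumption that `L` is the lower triangular Cholesky factor of the matrix of system (32), `g` is the vector $g_{i_t j}$, $j \in \hat A^k$, and `gamma` is $g_{i_t i_t}$:
```python
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_append(L: np.ndarray, g: np.ndarray, gamma: float) -> np.ndarray:
    """Factor [[G, g], [g.T, gamma]] given G = L L^T, in O(n^2) work."""
    l = solve_triangular(L, g, lower=True)   # solve L l = g
    d = np.sqrt(gamma - l @ l)               # new diagonal entry (s.p.d. case)
    n = L.shape[0]
    L_new = np.zeros((n + 1, n + 1))
    L_new[:n, :n] = L                        # old factor is kept unchanged
    L_new[n, :n] = l                         # appended row
    L_new[n, n] = d
    return L_new
```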
The initial set $I^0$ can be defined as follows:
\begin{eqnarray*}
     I^0 = I_1 \cup \{i_0, i_0 + M\} \, ,
\end{eqnarray*}
where $i_0$ is such that
\begin{eqnarray*}
   \delta_{i_0} = \min_{L+1 \le i \le S} \delta_i \, .
\end{eqnarray*}
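In code this choice could look as follows (a hedged sketch: `delta`, `L`, `S` and `M` are assumed to carry the meaning they have in the previous post, with `delta[i]` storing $\delta_i$ in 1-based positions):
```python
import numpy as np

def initial_index_set(I1, delta, L, S, M):
    # i0 minimizes delta_i over L+1 <= i <= S (1-based indexing as in the text)
    i0 = L + 1 + int(np.argmin(delta[L + 1 : S + 1]))
    return set(I1) | {i0, i0 + M}   # I^0 = I_1 U {i0, i0 + M}
```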

The idea of this algorithm was originally published in paper [2], where it was applied to the construction of "splines in a convex set". In that case the objective functional of the related variational problem was a semi-norm, and the existence and uniqueness of the problem solution were not trivial. The corresponding auxiliary quadratic programming problems were not strictly convex, and a variant of the coordinate descent algorithm was used for their numerical solution. The algorithm showed good results when solving one-dimensional problems.

Some numerical results and a comparison of the described algorithms will be presented in the next post.

References

[1] I. Kohanovsky, Data approximation using multidimensional normal splines. Unpublished manuscript. 1996.
[2] A. Kovalkov, On an algorithm for the construction of splines with discrete constraints of the type of inequalities, Serdica Bulgaricae Mathematicae Publicationes, Vol. 9, No. 4, 1983.
