
Wednesday, April 17, 2019

Quadratic Programming in Hilbert space. Part I.


This post describes an algorithm for the normal solution of a system of linear inequalities in a Hilbert space. An original version of this algorithm was developed by V. Gorbunov and published in [3]. A slightly modified version of that algorithm was described in [6].

Let $H$ be a Hilbert space with inner product $\langle \cdot \,, \cdot \rangle_H$ and induced norm $\| \cdot \|_H$. Suppose we are given elements $h_i \in H$, numbers $u_i \,, \ 1 \le i \le M+L$, and positive numbers $\delta_i \,, \ 1 \le i \le M$.
It is required to minimize the functional $J$:
$$ (J, \varphi) = \| \varphi \|_H^2 \to \min \ , \qquad (1) $$

subject to the constraints
$$ \langle h_i \,, \varphi \rangle_H = u_i \ , \quad 1 \le i \le L \ , \qquad (2) $$
$$ | \langle h_{i+L} \,, \varphi \rangle_H - u_{i+L} | \le \delta_i \ , \quad 1 \le i \le M \ , \quad \varphi \in H \ . \qquad (3) $$

The elements $h_i \,, \ 1 \le i \le M+L$, are assumed to be linearly independent (hence the system (2), (3) is consistent). A solution of problem (1)—(3) exists and is unique, being the projection of the zero element of the Hilbert space onto a nonempty convex closed set [1, 5]. Expanding the absolute values in (3), we rewrite the system (2)—(3) in the form:
$$ \langle h_i \,, \varphi \rangle_H = b_i \ , \quad 1 \le i \le L \ , \qquad (4) $$
$$ \langle h_i \,, \varphi \rangle_H \le b_i \ , \quad L+1 \le i \le N \ , \qquad (5) $$
where:
$$ N = L + 2M \ , \quad S = L + M \ , \quad b_i = u_i \ , \ 1 \le i \le L \ , $$
$$ b_{i+L} = u_{i+L} + \delta_i \ , \quad b_{i+S} = -u_{i+L} + \delta_i \ , \quad h_{i+S} = -h_{i+L} \ , \quad 1 \le i \le M \ . \qquad (6) $$
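For example (numbers chosen purely for illustration), a single constraint of type (3) with $u = 1$ and $\delta = \tfrac12$ unfolds into the pair
$$ \left| \langle h \,, \varphi \rangle_H - 1 \right| \le \tfrac12 \ \Longleftrightarrow \ \langle h \,, \varphi \rangle_H \le \tfrac32 \ \text{ and } \ \langle -h \,, \varphi \rangle_H \le -\tfrac12 \ , $$
which is exactly the pattern encoded by (6).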
Let $\Phi$ be the convex set defined by constraints (4) and (5). We denote
$$ I_1 = \{1, \dots, L\} \ , \quad I_2 = \{L+1, \dots, N\} \ , \quad I = I_1 \cup I_2 = \{1, \dots, N\} \ , $$
$$ A(\varphi) = \{ i \in I : \langle h_i \,, \varphi \rangle_H = b_i \} \ , \quad P(\varphi) = I \setminus A(\varphi) \ , $$
$$ \Psi(\varphi) = \{ \psi \in H : \langle h_i \,, \psi \rangle_H = b_i \ , \ i \in A(\varphi) \} \ , \qquad (7) $$
$$ g_{ij} = \langle h_i \,, h_j \rangle_H \ , \quad 1 \le i, j \le N \ . $$
The feasible part of the set $\Psi(\varphi) \,, \ \varphi \in \Phi$, is a face of the set $\Phi$ containing $\varphi$. If $A(\varphi)$ is empty (which is possible when $L = 0$), then $\Psi(\varphi) = H$.

In accordance with the optimality conditions [4, 5], the solution of problem (1), (4), (5) can be represented as:
$$ \sigma = \sum_{i=1}^{L} \mu_i h_i + \sum_{i=L+1}^{S} (\mu_i - \mu_{i+M}) h_i \ , \qquad (8) $$
$$ \mu_i \le 0 \ , \quad \mu_{i+M} \le 0 \ , \quad L+1 \le i \le S \ , $$
$$ \mu_i \, (\langle h_i \,, \sigma \rangle_H - b_i) = 0 \ , \quad L+1 \le i \le N \ , \qquad (9) $$
$$ \mu_i \cdot \mu_{i+M} = 0 \ , \quad L+1 \le i \le S \ . \qquad (10) $$
Here the complementary slackness conditions (9) mean that $\mu_k = 0$ for $k \in P(\sigma)$, and relations (10) reflect the fact that a pair of constraints (5) with indices $i$ and $i+M$ cannot be simultaneously active at the solution. Thereby there are at most $S$ non-trivial coefficients $\mu_i$ in formula (8), and the actual dimension of problem (1), (4), (5) is $S$.
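Where representation (8) comes from can be sketched in one line (the standard Lagrangian argument, cf. [4, 5]): for
$$ \mathcal{L}(\varphi, \lambda) = \| \varphi \|_H^2 + \sum_{i=1}^{N} \lambda_i \left( \langle h_i \,, \varphi \rangle_H - b_i \right) , \qquad \lambda_i \ge 0 \ , \ i \in I_2 \ , $$
stationarity in $\varphi$ gives $2\sigma + \sum_i \lambda_i h_i = 0$, i.e. $\sigma = \sum_i \mu_i h_i$ with $\mu_i = -\lambda_i / 2 \le 0$ for $i \in I_2$; substituting $h_{i+S} = -h_{i+L}$ then collapses the sum to the two-term form (8).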

Let $\hat{\sigma}$ be the normal solution of system (4); it can be written as
$$ \hat{\sigma} = \sum_{i=1}^{L} \mu_i h_i \ , \qquad (11) $$

where the coefficients $\mu_i$ are determined from the system of linear equations with the symmetric positive definite Gram matrix $\{ g_{ij} \}$ of the linearly independent elements $h_i$:
$$ \sum_{j=1}^{L} g_{ij} \mu_j = b_i \ , \quad 1 \le i \le L \ . \qquad (12) $$

If $L = 0$ then there are no constraints (4) and $\hat{\sigma} = 0$. If $\hat{\sigma}$ satisfies constraints (5), that is, the inequalities
$$ \langle h_i \,, \hat{\sigma} \rangle_H \le b_i \ , \quad L+1 \le i \le N \ , \qquad (13) $$

which can be written as:
$$ \sum_{j=1}^{L} g_{ij} \mu_j \le b_i \ , \quad L+1 \le i \le N \ , \qquad (14) $$

then $\hat{\sigma}$ is the solution of the original problem (1), (4), (5), i.e. $\sigma = \hat{\sigma}$.
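To make (11)—(14) concrete, here is a minimal finite-dimensional sketch (numpy assumed; `H_mat` and all data are hypothetical stand-ins, since in general only the Gram matrix is needed): computing $\hat{\sigma}$ and checking (14) amounts to a single solve over the Gram matrix.

```python
import numpy as np

# Hypothetical finite-dimensional example: h_i are rows of H_mat,
# so g_ij = <h_i, h_j> is the Gram matrix G = H_mat @ H_mat.T.
H_mat = np.array([[1.0, 0.0, 0.0],    # h_1 .. h_L : equality constraints (L = 1)
                  [0.0, 1.0, 0.0],    # the rest   : inequality constraints
                  [0.0, 0.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])         # right-hand sides b_i
L = 1
G = H_mat @ H_mat.T                   # Gram matrix {g_ij}

# System (12): solve for mu over the equality constraints only.
mu = np.linalg.solve(G[:L, :L], b[:L])
sigma_hat = mu @ H_mat[:L]            # normal solution of (4)

# Check (14): does sigma_hat satisfy the inequality constraints?
feasible = np.all(G[L:, :L] @ mu <= b[L:] + 1e-12)
print(sigma_hat, feasible)
```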

Consider the nontrivial case when $\hat{\sigma}$ does not satisfy constraints (5). In this case $\sigma$ belongs to the boundary of the set $\Phi$ and is the projection of the zero element of the space $H$ onto the set $\Psi(\sigma)$.
The projection $\vartheta$ of the zero element of the space $H$ onto the set $\Psi(\varphi)$ can be represented as
$$ \vartheta = \sum_{i \in A(\varphi)} \mu_i h_i \ , \qquad (15) $$

where the factors $\mu_i$ are determined from the system of linear equations with a symmetric positive definite matrix:
$$ \sum_{j \in A(\varphi)} g_{ij} \mu_j = b_i \ , \quad i \in A(\varphi) \ . \qquad (16) $$

If the factors $\mu_i \,, \ i \in I_2 \cap A(\varphi)$, corresponding to the inequality constraints are nonpositive, then we can set $\mu_i = 0$ for $i \in P(\varphi)$ and all the conditions (8)—(10) are satisfied; thereby $\vartheta = \sigma$ is the solution of the problem under consideration. The following algorithm is based on this remark. Let's describe its iteration.
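In coordinates the projection (15)—(16) is just a solve over the active subsystem; a sketch (numpy assumed; the function name and arguments are illustrative):

```python
import numpy as np

def project_onto_face(G, b, active):
    """Coefficients of the projection of 0 onto {phi : <h_i, phi> = b_i, i in active}.

    G      -- full Gram matrix {g_ij}
    b      -- right-hand sides b_i
    active -- indices of the active constraints A(phi)
    Returns mu with mu_i = 0 for i in P(phi), cf. (15)-(16).
    """
    idx = np.asarray(active, dtype=int)
    mu = np.zeros(len(b))
    # System (16): symmetric positive definite subsystem of the Gram matrix.
    mu[idx] = np.linalg.solve(G[np.ix_(idx, idx)], b[idx])
    return mu
```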

Let $\sigma^k$ be a feasible point of the system (4), (5):
$$ \sigma^k = \sum_{i=1}^{N} \mu_i^k h_i \ , \qquad (17) $$

In this representation there are at most $S$ non-zero multipliers $\mu_i^k$.
Denote by $\vartheta^k$ the projection of the zero element of the space $H$ onto the set $\Psi(\sigma^k)$, and let $A_k = A(\sigma^k)$ and $P_k = P(\sigma^k) = I \setminus A_k$. Then $\vartheta^k$ can be represented as
$$ \vartheta^k = \sum_{i \in A_k} \lambda_i^k h_i = \sum_{i=1}^{N} \lambda_i^k h_i \ , \quad \lambda_i^k = 0 \ , \ i \in P_k \ , $$
where the multipliers $\lambda_i^k$ solve the system
$$ \sum_{j \in A_k} g_{ij} \lambda_j^k = b_i \ , \quad i \in A_k \ . \qquad (18) $$
There are two possible cases: the point ϑk is feasible or it is not feasible.
In the first case we check the optimality conditions, namely: if $\lambda_i^k \le 0$ for all $i \in I_2 \cap A_k$, then $\vartheta^k$ is the solution of the problem. If the optimality conditions are not satisfied, then we set $\sigma^{k+1} = \vartheta^k$, find an index $i \in I_2 \cap A_k$ such that $\lambda_i^k > 0$, and remove it from $A_k$.
In the second case $\sigma^{k+1}$ will be defined as the feasible point of the ray $\vartheta(t)$,
$$ \vartheta(t) = \sigma^k + t \, (\vartheta^k - \sigma^k) \ , \qquad (19) $$
closest to $\vartheta^k$. Denote by $t^k_{min}$ the corresponding value of $t$ and by $i_k \in P_k$ the index of the violated constraint. This index $i_k$ will be added to $A_k$, forming $A_{k+1}$.
Thereby all $\sigma^k$ are feasible points, that is, the minimization process proceeds within the feasible region, and the value of $\| \sigma^k \|_H$ is non-increasing. The minimization process is finite because of the linear independence of the constraints, which eliminates the possibility of the algorithm cycling.
The feasibility of $\vartheta^k$ can be checked as follows. Introduce the values $e_i^k \,, \ i \in P_k$:
$$ e_i^k = \langle h_i \,, \vartheta^k - \sigma^k \rangle_H = \sum_{j=1}^{N} g_{ij} (\lambda_j^k - \mu_j^k) \ , \quad i \in P_k \ . \qquad (20) $$

If $e_i^k \le 0$ for all $i \in P_k$, then $\vartheta^k$ is a feasible point. If additionally $\lambda_i^k \le 0 \,, \ i \in I_2 \cap A_k$, it is the solution of the problem.
In the case when there exist some $e_i^k > 0$, we calculate the values
$$ t_i^k = \frac{b_i - \langle h_i \,, \sigma^k \rangle_H}{e_i^k} \ , \qquad \langle h_i \,, \sigma^k \rangle_H = \sum_{j=1}^{N} g_{ij} \mu_j^k \ , \qquad i \in P_k : \ e_i^k > 0 \ . \qquad (21) $$

Here all $t_i^k > 0$. Now the maximum feasible step $t^k_{min}$ is computed as
$$ t^k_{min} = \min_{i \in P_k, \ e_i^k > 0} \{ t_i^k \} \ , \qquad (22) $$

and $\vartheta^k$ is feasible if and only if $t^k_{min} \ge 1$ (see (19)).
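This criterion is immediate from (19): along the ray each passive constraint varies linearly,
$$ \langle h_i \,, \vartheta(t) \rangle_H = \langle h_i \,, \sigma^k \rangle_H + t \, e_i^k \le b_i \ \Longleftrightarrow \ t \le \frac{b_i - \langle h_i \,, \sigma^k \rangle_H}{e_i^k} = t_i^k \quad \text{for } e_i^k > 0 \ , $$
while for $e_i^k \le 0$ the constraint holds for every $t \ge 0$; hence $\vartheta(t)$ is feasible exactly for $t \le t^k_{min}$, and $\vartheta^k = \vartheta(1)$.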

Thereby the algorithm's iteration consists of seven steps:
  1. Initialization. Let $\sigma^0$ be a feasible point of the system (4), (5) and $\mu_i^0$ the corresponding multipliers in representation (17).
  2. Compute the multipliers $\lambda_i^k \,, \ i \in A_k$, as the solution of system (18).
  3. If $|A_k| = S$, go to Step 6.
  4. Compute $e_i^k \,, \ i \in P_k$. If all $e_i^k \le 0$, go to Step 6. Otherwise compute $t_i^k$ for $i \in P_k : \ e_i^k > 0$, and find $t^k_{min}$ and the corresponding index $i_k$.
  5. If $t^k_{min} < 1$ (the projection $\vartheta^k$ is not feasible), then set
    $$ \mu_i^{k+1} = \mu_i^k + t^k_{min} (\lambda_i^k - \mu_i^k) \ , \quad i \in A_k \ , \qquad A_{k+1} = A_k \cup \{ i_k \} \ , \qquad (23) $$
    and return to Step 2.
  6. (The projection $\vartheta^k$ is feasible.) If there exists an index $i_p \in I_2 \cap A_k$ such that $\lambda_{i_p}^k > 0$, then set
    $$ A_{k+1} = A_k \setminus \{ i_p \} \ , \qquad \mu_i^{k+1} = \lambda_i^k \ , \quad i \in A_k \ , \qquad (24) $$
    and return to Step 2. Otherwise go to Step 7.
  7. Set $\sigma = \vartheta^k$. Stop.
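To fix ideas, here is a compact numpy sketch of the whole iteration in coordinates (a finite-dimensional stand-in: only the Gram matrix and right-hand sides are used; the function name, the `tol`/`max_iter` parameters and the dense `np.linalg.solve` calls are illustrative assumptions, not the original implementation):

```python
import numpy as np

def active_set_projection(G, b, L, mu0, tol=1e-10, max_iter=200):
    """Active-set iteration sketched above, in coordinates (0-based indices).

    G    -- Gram matrix {g_ij}, shape (N, N)
    b    -- right-hand sides, length N; constraints 0..L-1 are equalities,
            the rest are inequalities <h_i, phi>_H <= b_i
    mu0  -- coefficients of a feasible starting point sigma^0, cf. (25)-(27)
    Returns the coefficient vector mu of the solution sigma = sum_i mu_i h_i.
    """
    N = len(b)
    mu = np.array(mu0, dtype=float)
    # Active set A_k: equalities plus inequalities satisfied as equalities.
    active = [i for i in range(N)
              if i < L or abs(G[i] @ mu - b[i]) <= tol]

    for _ in range(max_iter):
        # Step 2: projection of 0 onto the face Psi(sigma^k) -- system (18).
        idx = np.asarray(active, dtype=int)
        lam = np.zeros(N)
        lam[idx] = np.linalg.solve(G[np.ix_(idx, idx)], b[idx])

        # Values (20) for the passive constraints.
        passive = [i for i in range(N) if i not in active]
        e = G[passive] @ (lam - mu)

        # Step 4: maximal feasible step along the ray (19), cf. (21)-(22).
        blocking = [(j, i) for j, i in enumerate(passive) if e[j] > tol]
        if blocking:
            t = [(b[i] - G[i] @ mu) / e[j] for j, i in blocking]
            j_min = int(np.argmin(t))
            t_min, i_k = t[j_min], blocking[j_min][1]
        else:
            t_min, i_k = np.inf, None

        if t_min < 1.0:
            # Step 5: vartheta^k infeasible -- move to the closest feasible
            # point of the ray and add the blocking constraint, cf. (23).
            mu += t_min * (lam - mu)
            active.append(i_k)
        else:
            # Step 6: vartheta^k feasible -- multipliers of active inequality
            # constraints must be nonpositive at the solution.
            mu = lam
            positive = [i for i in active if i >= L and lam[i] > tol]
            if not positive:
                return mu              # Step 7: sigma = vartheta^k
            active.remove(positive[0]) # drop one offending constraint, cf. (24)
    raise RuntimeError("active set iteration did not converge")
```

For clarity the sketch re-solves system (18) from scratch at every step; the Cholesky-update scheme discussed below removes exactly that redundancy.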
The algorithm starts from an initial feasible point of the system (4), (5). Such a point $\sigma^0$ can be defined as the normal solution of the system
$$ \langle h_i \,, \varphi \rangle_H = u_i \ , \quad 1 \le i \le S \ , \qquad (25) $$

and can be represented as
$$ \sigma^0 = \sum_{i=1}^{S} \mu_i^0 h_i \ , \qquad (26) $$

where the multipliers $\mu_i^0$ are determined from the system
$$ \sum_{j=1}^{S} g_{ij} \mu_j^0 = u_i \ , \quad 1 \le i \le S \ . \qquad (27) $$

The algorithm described here implements a variant of the active set method [2] with the simplest form of the minimized functional (1). It can also be treated as a variant of the conjugate gradient method [7] and solves the considered problem in a finite number of operations. The conjugate gradient method in this case reduces to the problem of finding the projection $\vartheta^k$, that is, to the solution of system (18). The infinite dimensionality of the space $H$ is irrelevant for this algorithm.
The most computationally costly part of the algorithm is the calculation of the projections $\vartheta^k$ by solving systems (18). The matrices of all systems (18) are positive definite (by virtue of the linear independence of the elements $h_i$), and it is convenient to use the Cholesky decomposition to factorize them. At every step of the algorithm a constraint is added to or removed from the current active set, and the matrix of system (18) gets modified by the corresponding addition or deletion of a row and a symmetric column. It is possible to obtain the Cholesky factor of the modified matrix without recomputing the full factorization (see the previous post), which greatly reduces the total amount of computation.
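For the growing case the update is a simple bordering identity: if $G = L L^T$ and the matrix is extended by a column $g$ and diagonal entry $\gamma$, the new factor is obtained from $L w = g$ and $d = \sqrt{\gamma - w^T w}$. A sketch (scipy assumed; deleting a row/column additionally requires re-triangularizing the trailing block, e.g. by Givens rotations, as in the previous post):

```python
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_append(Lf, g, gamma):
    """Extend the Cholesky factor when a row/column is appended to the Gram matrix.

    Lf    -- lower-triangular factor of the current n x n matrix G = Lf @ Lf.T
    g     -- new off-diagonal column of length n
    gamma -- new diagonal entry
    Returns the (n+1) x (n+1) factor of the bordered matrix.
    """
    n = Lf.shape[0]
    w = solve_triangular(Lf, g, lower=True)   # solve Lf w = g
    d = np.sqrt(gamma - w @ w)                # positive by positive definiteness
    out = np.zeros((n + 1, n + 1))
    out[:n, :n] = Lf
    out[n, :n] = w
    out[n, n] = d
    return out
```

The cost is one triangular solve, $O(n^2)$, instead of the $O(n^3)$ of a full refactorization.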

This algorithm can also be applied in the case when, instead of the modulus constraints (3), there are one-sided inequality constraints
$$ \langle h_{i+L} \,, \varphi \rangle_H \le u_{i+L} \ , \quad 1 \le i \le M \ , \qquad (28) $$

for that it is enough to set
$$ b_{i+L} = u_{i+L} \ , \qquad b_{i+S} = \infty \ , \quad 1 \le i \le M \ . \qquad (29) $$
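In code this amounts to disabling the mirrored lower bounds, e.g. (a hypothetical numpy fragment; sizes and data are illustrative):

```python
import numpy as np

L, M = 1, 2                        # hypothetical sizes
u = np.array([1.0, 2.0, 3.0])      # u_i, 1 <= i <= L + M

# One-sided constraints (28): keep the upper bounds b_{i+L} = u_{i+L},
# disable the mirrored lower bounds by setting b_{i+S} = +infinity,
# so those constraints can never become active.
b = np.concatenate([u[:L], u[L:], np.full(M, np.inf)])
```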
References

[1] A. Balakrishnan, Applied Functional Analysis, Springer-Verlag, New York, 1976.
[2]  P. Gill, W. Murray, M. Wright, Practical Optimization, Academic Press, London, 1981.
[3] V. Gorbunov, The method of reduction of unstable computational problems (Metody reduktsii neustoichivykh vychislitel'nykh zadach), Ilim, 1984.
[4] A. Ioffe, V. Tikhomirov, Theory of extremal problems, North-Holland, Amsterdam, 1979.
[5] P.-J. Laurent, Approximation et optimisation, Paris, 1972.
[6] I. Kohanovsky, Data approximation using multidimensional normal splines. Unpublished manuscript. 1996.
[7] B. Pshenichnyi, Yu. Danilin, Numerical methods in extremal problems, Mir Publishers, Moscow, 1978.