Splines in Bessel Potential Spaces
A blog dedicated to multivariate normal splines theory and applications.
Friday, November 29, 2019
Interpolating Normal Splines: One-dimensional case
This topic is concerned with the numerical solution of the following interpolation problems:
Problem 1. Given points $x_1 < x_2 < \dots < x_{n_1}$, find a function $f$ such that
$$ f(x_i) = u_i, \quad i = 1, 2, \dots, n_1. \tag{1} $$
Problem 2. Given points $x_1 < x_2 < \dots < x_{n_1}$ and $s_1 < s_2 < \dots < s_{n_2}$, find a function $f$ such that
$$ f(x_i) = u_i, \ i = 1, \dots, n_1, \qquad f'(s_j) = v_j, \ j = 1, \dots, n_2, \qquad n_1 \ge 0, \ n_2 \ge 0. \tag{2} $$
Problem 3. Given points $x_1 < x_2 < \dots < x_{n_1}$, $s_1 < s_2 < \dots < s_{n_2}$ and $t_1 < t_2 < \dots < t_{n_3}$, find a function $f$ such that
$$ f(x_i) = u_i, \quad f'(s_j) = v_j, \quad f''(t_k) = w_k, \qquad i = 1,\dots,n_1, \ j = 1,\dots,n_2, \ k = 1,\dots,n_3, \qquad n_1, n_2, n_3 \ge 0. \tag{3} $$
Note that the knots $\{x_i\}$, $\{s_j\}$ and $\{t_k\}$ may coincide.
Assume that the function $f$ is an element of a Hilbert space $H = H(X)$ over a set $X$ (here $X$ is $\mathbb R$ or an interval in $\mathbb R$), and that $H$ is selected in such a way that it is continuously embedded in the space $C^2(X)$ of functions continuous together with their first two derivatives. Then the functionals $F_i$, $F'_j$ and $F''_k$,
$$ F_i(\varphi) = \varphi(x_i), \quad F'_j(\varphi) = \varphi'(s_j), \quad F''_k(\varphi) = \varphi''(t_k), \qquad \forall \varphi \in H, \ x_i, s_j, t_k \in X, $$
$i = 1,\dots,n_1$, $j = 1,\dots,n_2$, $k = 1,\dots,n_3$, are linear continuous functionals on $H$; it is obvious that they are linearly independent. In accordance with the Riesz representation theorem [1], these linear continuous functionals can be represented in the form of inner products of some elements $h_i, h'_j, h''_k \in H$ with $\varphi$, for any $\varphi \in H$:
$$ F_i(\varphi) = \langle h_i, \varphi\rangle_H, \quad F'_j(\varphi) = \langle h'_j, \varphi\rangle_H, \quad F''_k(\varphi) = \langle h''_k, \varphi\rangle_H, \qquad \forall \varphi \in H. $$
The elements $h_i$, $h'_j$ and $h''_k$ are twice continuously differentiable functions.
The original constraint systems of Problems 1–3 can be written in the form:
$$ f(x_i) = \langle h_i, f\rangle_H = u_i, \quad i = 1,\dots,n_1, \quad f \in H, \tag{4} $$
$$ f(x_i) = \langle h_i, f\rangle_H = u_i, \quad f'(s_j) = \langle h'_j, f\rangle_H = v_j, \quad i = 1,\dots,n_1, \ j = 1,\dots,n_2, \quad f \in H, \tag{5} $$
$$ f(x_i) = \langle h_i, f\rangle_H = u_i, \quad f'(s_j) = \langle h'_j, f\rangle_H = v_j, \quad f''(t_k) = \langle h''_k, f\rangle_H = w_k, \quad i = 1,\dots,n_1, \ j = 1,\dots,n_2, \ k = 1,\dots,n_3, \quad f \in H. \tag{6} $$
Here all the functions $h_i, h'_j, h''_k \in H$ are linearly independent, and each of the constraint systems (4), (5), (6) defines a nonempty convex closed set (an intersection of hyperplanes in the Hilbert space $H$).
The problem of reconstructing a function $f$ satisfying system (4), (5) or (6) is underdetermined. We reformulate it as the problem of finding, among the solutions of the corresponding constraint system, a solution with minimal norm:
$$ \sigma_1 = \operatorname{argmin}\{\|f - z\|_H^2 : \text{(4)}, \ f \in H\}, \tag{7} $$
$$ \sigma_2 = \operatorname{argmin}\{\|f - z\|_H^2 : \text{(5)}, \ f \in H\}, \tag{8} $$
$$ \sigma_3 = \operatorname{argmin}\{\|f - z\|_H^2 : \text{(6)}, \ f \in H\}, \tag{9} $$
where $z \in H$ is a "prototype" function. The solution of each of the problems (7), (8) and (9) exists and is unique [1, 5] as the projection of the element $z$ onto a nonempty convex closed set in the Hilbert space $H$. The elements $\sigma_1$, $\sigma_2$ and $\sigma_3$ are interpolating normal splines. In accordance with the generalized Lagrange method ([4], [5]), the solution of problem (7) can be written as
$$ \sigma_1 = z + \sum_{i=1}^{n_1} \mu_i h_i, \tag{10} $$
where the coefficients $\mu_i$ are defined by the system
$$ \sum_{l=1}^{n_1} g_{il}\mu_l = u_i - \langle h_i, z\rangle_H, \quad 1 \le i \le n_1; \tag{11} $$
the solution of problem (8) can be written as
$$ \sigma_2 = z + \sum_{i=1}^{n_1} \mu_i h_i + \sum_{j=1}^{n_2} \mu'_j h'_j, \tag{12} $$
where the coefficients $\mu_i$ and $\mu'_j$ are defined by the system
$$ \sum_{l=1}^{n_1} g_{il}\mu_l + \sum_{j=1}^{n_2} g'_{ij}\mu'_j = u_i - \langle h_i, z\rangle_H, \quad 1 \le i \le n_1, $$
$$ \sum_{i=1}^{n_1} g'_{ij}\mu_i + \sum_{l=1}^{n_2} g''_{jl}\mu'_l = v_j - \langle h'_j, z\rangle_H, \quad 1 \le j \le n_2, \tag{13} $$
and the solution of problem (9) can be presented as
$$ \sigma_3 = z + \sum_{i=1}^{n_1} \mu_i h_i + \sum_{j=1}^{n_2} \mu'_j h'_j + \sum_{k=1}^{n_3} \mu''_k h''_k, \tag{14} $$
where the coefficients $\mu_i$, $\mu'_j$ and $\mu''_k$ are defined by the system
$$ \sum_{l=1}^{n_1} g_{il}\mu_l + \sum_{j=1}^{n_2} g'_{ij}\mu'_j + \sum_{k=1}^{n_3} g''_{ik}\mu''_k = u_i - \langle h_i, z\rangle_H, \quad 1 \le i \le n_1, $$
$$ \sum_{i=1}^{n_1} g'_{ij}\mu_i + \sum_{l=1}^{n_2} g''_{jl}\mu'_l + \sum_{k=1}^{n_3} g'''_{jk}\mu''_k = v_j - \langle h'_j, z\rangle_H, \quad 1 \le j \le n_2, \tag{15} $$
$$ \sum_{i=1}^{n_1} g''_{ik}\mu_i + \sum_{j=1}^{n_2} g'''_{jk}\mu'_j + \sum_{l=1}^{n_3} g^{iv}_{kl}\mu''_l = w_k - \langle h''_k, z\rangle_H, \quad 1 \le k \le n_3. $$
The matrix of each of the systems (11), (13) and (15) is the symmetric positive definite Gram matrix of the corresponding set of linearly independent elements $\{h_i\}$; $\{h_i\}, \{h'_j\}$; or $\{h_i\}, \{h'_j\}, \{h''_k\}$, and the coefficients $g_{il}, g'_{ij}, g''_{ik}, g''_{jl}, g'''_{jk}, g^{iv}_{kl}$ are defined as follows:
$$ g_{il} = \langle h_i, h_l\rangle_H, \quad g'_{ij} = \langle h_i, h'_j\rangle_H, \quad g''_{ik} = \langle h_i, h''_k\rangle_H, $$
$$ g''_{jl} = \langle h'_j, h'_l\rangle_H, \quad g'''_{jk} = \langle h'_j, h''_k\rangle_H, \quad g^{iv}_{kl} = \langle h''_k, h''_l\rangle_H. \tag{16} $$
Let $H = H(X)$ be a reproducing kernel Hilbert space with reproducing kernel $V(\eta,\xi)$. Recall ([2], [3]) the definition of the reproducing kernel: it is a function $V(\eta,\xi)$ such that
- for every $\xi \in X$, $V(\eta,\xi)$ as a function of $\eta$ belongs to $H$;
- for every $\xi \in X$ and every function $\varphi \in H$ the reproducing property holds:
$$ \varphi(\xi) = \langle V(\cdot,\xi), \varphi\rangle_H. \tag{17} $$
The reproducing kernel is a symmetric function, $V(\eta,\xi) = V(\xi,\eta)$; in the case considered here it is also twice continuously differentiable with respect to $\xi$ and to $\eta$. Differentiating the identity (17) gives the identities for the derivatives:
$$ \frac{d\varphi(\xi)}{d\xi} = \left\langle \frac{\partial V(\cdot,\xi)}{\partial\xi}, \varphi \right\rangle_H, \qquad \frac{d^2\varphi(\xi)}{d\xi^2} = \left\langle \frac{\partial^2 V(\cdot,\xi)}{\partial\xi^2}, \varphi \right\rangle_H. \tag{18} $$
Now it is possible to express the functions $h_i, h'_j, h''_k$ via the reproducing kernel $V$. Comparing (4), (5), (6) with (17) and (18), we receive:
$$ h_i(\eta) = V(\eta,x_i), \quad i = 1,\dots,n_1, \qquad h'_j(\eta) = \frac{\partial V(\eta,s_j)}{\partial\xi}, \quad j = 1,\dots,n_2, \qquad h''_k(\eta) = \frac{\partial^2 V(\eta,t_k)}{\partial\xi^2}, \quad k = 1,\dots,n_3. \tag{19} $$
The coefficients (16) of the Gram matrix can be presented as ([3], [11], [12]):
$$ g_{il} = \langle h_i, h_l\rangle_H = \langle V(\cdot,x_i), V(\cdot,x_l)\rangle_H = V(x_i,x_l), $$
$$ g'_{ij} = \langle h_i, h'_j\rangle_H = \left\langle V(\cdot,x_i), \frac{\partial V(\cdot,s_j)}{\partial\xi}\right\rangle_H = \frac{\partial V(x_i,s_j)}{\partial\xi}, $$
$$ g''_{ik} = \langle h_i, h''_k\rangle_H = \left\langle V(\cdot,x_i), \frac{\partial^2 V(\cdot,t_k)}{\partial\xi^2}\right\rangle_H = \frac{\partial^2 V(x_i,t_k)}{\partial\xi^2}. $$
With the help of (17) and (18), we can also calculate $g''_{jl}$ ([11], [12]):
$$ g''_{jl} = \langle h'_j, h'_l\rangle_H = \left\langle \frac{\partial V(\cdot,s_j)}{\partial\xi}, \frac{\partial V(\cdot,s_l)}{\partial\xi}\right\rangle_H = \frac{d}{d\xi}\left\langle V(\cdot,\xi), \frac{\partial V(\cdot,s_l)}{\partial\xi}\right\rangle_H \Bigg|_{\xi=s_j} = \frac{d}{d\xi}\left(\frac{\partial V(\xi,s_l)}{\partial\xi}\right)\Bigg|_{\xi=s_j} = \frac{\partial^2 V(s_j,s_l)}{\partial\eta\,\partial\xi}. $$
Further,
$$ g'''_{jk} = \langle h'_j, h''_k\rangle_H = \frac{\partial^3 V(s_j,t_k)}{\partial\eta\,\partial\xi^2}, \qquad g^{iv}_{kl} = \langle h''_k, h''_l\rangle_H = \frac{\partial^4 V(t_k,t_l)}{\partial\eta^2\,\partial\xi^2}, $$
and
$$ \langle h_i, z\rangle_H = \langle V(\cdot,x_i), z\rangle_H = z(x_i), \qquad \langle h'_j, z\rangle_H = z'(s_j), \qquad \langle h''_k, z\rangle_H = z''(t_k). $$
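To make the construction concrete, here is a minimal numpy sketch (not from the original text) that assembles and solves the Gram system (13) for Problem 2 with the prototype $z = 0$. The callables `V`, `dV_dxi` and `d2V_deta_dxi` are placeholders for a concrete kernel and its derivatives, assumed to support numpy broadcasting.

```python
import numpy as np

def normal_spline_problem2(x, s, u, v, V, dV_dxi, d2V_deta_dxi):
    """Sketch: interpolating normal spline for Problem 2 with z = 0.
    V(eta, xi), dV_dxi(eta, xi) = dV/dxi, d2V_deta_dxi = d2V/(deta dxi)
    are vectorized callables for some reproducing kernel."""
    n1 = len(x)
    # Gram blocks: g_il = V(x_i, x_l), g'_ij = dV(x_i, s_j)/dxi,
    # g''_jl = d2V(s_j, s_l)/(deta dxi), per the formulas above.
    G = np.block([
        [V(x[:, None], x[None, :]),        dV_dxi(x[:, None], s[None, :])],
        [dV_dxi(x[:, None], s[None, :]).T, d2V_deta_dxi(s[:, None], s[None, :])],
    ])
    mu = np.linalg.solve(G, np.concatenate([u, v]))
    # sigma(eta) = sum_i mu_i V(eta, x_i) + sum_j mu'_j dV(eta, s_j)/dxi
    return lambda eta: V(eta, x) @ mu[:n1] + dV_dxi(eta, s) @ mu[n1:]
```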
Here normal splines will be constructed in the Sobolev spaces $W^3_2[a,b]$ and $W^3_2[a,\infty)$, and in the Bessel potential space $H^3_\varepsilon(\mathbb R)$ (see [6, 7, 8, 9] for details). Elements of these spaces can be treated as twice continuously differentiable functions.
The reproducing kernel for the Sobolev spaces $W^l_2[0,1]$ (here $l$ is any positive integer) was constructed in [10]. Thus, the reproducing kernel for the Sobolev space $W^3_2[0,1]$ with norm
$$ \|f\| = \left( \sum_{i=0}^{2} \bigl(f^{(i)}(0)\bigr)^2 + \int_0^1 \bigl(f^{(3)}(s)\bigr)^2\,ds \right)^{1/2} $$
can be presented as
$$ V(\eta,\xi) = \begin{cases} \displaystyle\sum_{i=0}^{2} \frac{\xi^i}{i!}\left(\frac{\eta^i}{i!} + (-1)^i\,\frac{\eta^{5-i}}{(5-i)!}\right), & 0 \le \eta \le \xi \le 1, \\[8pt] \displaystyle\sum_{i=0}^{2} \frac{\eta^i}{i!}\left(\frac{\xi^i}{i!} + (-1)^i\,\frac{\xi^{5-i}}{(5-i)!}\right), & 0 \le \xi \le \eta \le 1, \end{cases} $$
or
$$ V(\eta,\xi) = \begin{cases} 1 + \eta\xi + \dfrac{\eta^5 - 5\eta^4\xi + 10\eta^3\xi^2 + 30\eta^2\xi^2}{120}, & 0 \le \eta \le \xi \le 1, \\[8pt] 1 + \eta\xi + \dfrac{\xi^5 - 5\xi^4\eta + 10\xi^3\eta^2 + 30\xi^2\eta^2}{120}, & 0 \le \xi \le \eta \le 1. \end{cases} $$
Correspondingly,
$$ \frac{\partial V(\eta,\xi)}{\partial\xi} = \frac{\eta\,(4\eta\xi(\eta+3) - \eta^3)}{24} + \eta, \qquad \frac{\partial^2 V(\eta,\xi)}{\partial\eta\,\partial\xi} = -\frac{\eta^3}{6} + \frac{\eta\xi(\eta+2)}{2} + 1, $$
$$ \frac{\partial^2 V(\eta,\xi)}{\partial\xi^2} = \frac{\eta^2(\eta+3)}{6}, \qquad \frac{\partial^3 V(\eta,\xi)}{\partial\eta\,\partial\xi^2} = \frac{\eta^2}{2} + \eta, \qquad \frac{\partial^4 V(\eta,\xi)}{\partial\eta^2\,\partial\xi^2} = \eta + 1, \qquad 0 \le \eta \le \xi \le 1. $$
In addition, the following formulae are required for computing the normal spline derivatives:
$$ \frac{\partial V(\eta,\xi)}{\partial\eta} = \frac{\eta^4 - 4\xi(\eta^3-6) + 6\eta\xi^2(\eta+2)}{24}, \qquad \frac{\partial^2 V(\eta,\xi)}{\partial\eta^2} = \frac{\eta^3 - 3\eta^2\xi + 3\xi^2(\eta+1)}{6}, $$
$$ \frac{\partial^3 V(\eta,\xi)}{\partial\eta^2\,\partial\xi} = -\frac{\eta^2}{2} + \eta\xi + \xi, \qquad 0 \le \eta \le \xi \le 1. $$
Thereby we can construct a normal interpolating spline on the interval $[0,1]$. Solving the interpolating Problems 1–3 on an arbitrary interval can be done by mapping the latter onto $[0,1]$ through an affine change of variable. Let's do it for Problem 3.
Define the constants $a$ and $b$ as
$$ a = \min(x_1, s_1, t_1), \qquad b = \max(x_{n_1}, s_{n_2}, t_{n_3}), $$
and introduce the values $\bar x_i, \bar s_j, \bar t_k$:
$$ \bar x_i = \frac{x_i - a}{b-a}, \quad \bar s_j = \frac{s_j - a}{b-a}, \quad \bar t_k = \frac{t_k - a}{b-a}, \qquad i = 1,\dots,n_1, \ j = 1,\dots,n_2, \ k = 1,\dots,n_3. $$
Then the original Problem 3 is transformed into
Problem $\bar 3$. Given points $0 \le \bar x_1 < \bar x_2 < \dots < \bar x_{n_1} \le 1$, $0 \le \bar s_1 < \bar s_2 < \dots < \bar s_{n_2} \le 1$ and $0 \le \bar t_1 < \bar t_2 < \dots < \bar t_{n_3} \le 1$, find a smooth function $\bar f$ such that
$$ \bar f(\bar x_i) = u_i, \quad \bar f'(\bar s_j) = v_j\,(b-a), \quad \bar f''(\bar t_k) = w_k\,(b-a)^2, \qquad i = 1,\dots,n_1, \ j = 1,\dots,n_2, \ k = 1,\dots,n_3. $$
Assuming $\bar\sigma_3(\bar\eta)$ is the normal spline constructed for Problem $\bar 3$, the normal spline $\sigma_3(\eta)$ is obtained as
$$ \sigma_3(\eta) = \bar\sigma_3\!\left(\frac{\eta - a}{b - a}\right), \qquad a \le \eta \le b. $$
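A small helper performing this change of variable for numpy arrays of sorted knots (an illustration, not part of the original text):

```python
import numpy as np

def map_problem3_to_unit(x, s, t, v, w):
    """Map the knots of Problem 3 to [0,1] and rescale the derivative
    data as in Problem 3-bar: v -> v*(b-a), w -> w*(b-a)^2."""
    a = min(x[0], s[0], t[0])
    b = max(x[-1], s[-1], t[-1])
    d = b - a
    return (x - a) / d, (s - a) / d, (t - a) / d, v * d, w * d ** 2, a, d

# After constructing sigma3_bar on [0,1], the spline on [a,b] is
# evaluated as sigma3(eta) = sigma3_bar((eta - a) / (b - a)).
```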
The reproducing kernel for the Sobolev space $W^3_2[0,\infty)$ with norm
$$ \|f\| = \left( \int_0^\infty \bigl[(f(s))^2 + (f^{(3)}(s))^2\bigr]\,ds \right)^{1/2} $$
was obtained in [11]. It is the symmetric function
$$ V(\eta,\xi) = \begin{cases} \displaystyle\sum_{i=1}^{6} y_i(\eta)\,c_i(\xi), & 0 \le \eta \le \xi < \infty, \\[8pt] \displaystyle\sum_{i=1}^{6} y_i(\xi)\,c_i(\eta), & 0 \le \xi \le \eta < \infty, \end{cases} $$
where
$$ \begin{aligned} y_1(\eta) &= \exp(\eta), & y_2(\eta) &= \exp(-\eta), \\ y_3(\eta) &= \exp(-\eta/2)\cos(\sqrt3\,\eta/2), & y_4(\eta) &= \exp(-\eta/2)\sin(\sqrt3\,\eta/2), \\ y_5(\eta) &= \exp(\eta/2)\cos(\sqrt3\,\eta/2), & y_6(\eta) &= \exp(\eta/2)\sin(\sqrt3\,\eta/2), \end{aligned} $$
$$ \begin{aligned} c_1(\xi) &= -\exp(-\xi)/6, \\ c_2(\xi) &= \exp(-\xi/2)\bigl(\sqrt3\sin(\sqrt3\,\xi/2) - \cos(\sqrt3\,\xi/2)\bigr)/3 - \exp(-\xi)/2, \\ c_3(\xi) &= \exp(-\xi/2)\bigl(\sin(\sqrt3\,\xi/2)/\sqrt3 - \cos(\sqrt3\,\xi/2)\bigr)/2 - \exp(-\xi)/3, \\ c_4(\xi) &= \exp(-\xi/2)\bigl(\sqrt3\cos(\sqrt3\,\xi/2) - 5\sin(\sqrt3\,\xi/2)\bigr)/6 + \exp(-\xi)/\sqrt3, \\ c_5(\xi) &= -\exp(-\xi/2)\bigl(\cos(\sqrt3\,\xi/2) - \sqrt3\sin(\sqrt3\,\xi/2)\bigr)/6, \\ c_6(\xi) &= -\exp(-\xi/2)\bigl(\sin(\sqrt3\,\xi/2) + \sqrt3\cos(\sqrt3\,\xi/2)\bigr)/6. \end{aligned} $$
Correspondingly,
$$ \frac{\partial V(\eta,\xi)}{\partial\xi} = \sum_{i=1}^{6} y_i(\eta)c'_i(\xi), \quad \frac{\partial^2 V(\eta,\xi)}{\partial\eta\,\partial\xi} = \sum_{i=1}^{6} y'_i(\eta)c'_i(\xi), \quad \frac{\partial^2 V(\eta,\xi)}{\partial\xi^2} = \sum_{i=1}^{6} y_i(\eta)c''_i(\xi), $$
$$ \frac{\partial^3 V(\eta,\xi)}{\partial\eta\,\partial\xi^2} = \sum_{i=1}^{6} y'_i(\eta)c''_i(\xi), \qquad \frac{\partial^4 V(\eta,\xi)}{\partial\eta^2\,\partial\xi^2} = \sum_{i=1}^{6} y''_i(\eta)c''_i(\xi), $$
$$ \frac{\partial V(\eta,\xi)}{\partial\eta} = \sum_{i=1}^{6} y'_i(\eta)c_i(\xi), \quad \frac{\partial^2 V(\eta,\xi)}{\partial\eta^2} = \sum_{i=1}^{6} y''_i(\eta)c_i(\xi), \quad \frac{\partial^3 V(\eta,\xi)}{\partial\eta^2\,\partial\xi} = \sum_{i=1}^{6} y''_i(\eta)c'_i(\xi), \qquad 0 \le \eta \le \xi < \infty, $$
where
$$ \begin{aligned} y'_1(\eta) &= \exp(\eta), & y'_2(\eta) &= -\exp(-\eta), \\ y'_3(\eta) &= -\exp(-\eta/2)\sin(\sqrt3\,\eta/2+\pi/6), & y'_4(\eta) &= \exp(-\eta/2)\cos(\sqrt3\,\eta/2+\pi/6), \\ y'_5(\eta) &= \exp(\eta/2)\sin(\pi/6-\sqrt3\,\eta/2), & y'_6(\eta) &= \exp(\eta/2)\cos(\pi/6-\sqrt3\,\eta/2), \end{aligned} $$
$$ \begin{aligned} y''_1(\eta) &= \exp(\eta), & y''_2(\eta) &= \exp(-\eta), \\ y''_3(\eta) &= \exp(-\eta/2)\bigl(\sin(\sqrt3\,\eta/2+\pi/6) - \sqrt3\cos(\sqrt3\,\eta/2+\pi/6)\bigr)/2, \\ y''_4(\eta) &= -\exp(-\eta/2)\bigl(\sin(\sqrt3\,\eta/2+\pi/6) + \sqrt3\cos(\sqrt3\,\eta/2+\pi/6)\bigr)/2, \\ y''_5(\eta) &= -\exp(\eta/2)\bigl(\sqrt3\sin(\sqrt3\,\eta/2) + \cos(\sqrt3\,\eta/2)\bigr)/2, \\ y''_6(\eta) &= \exp(\eta/2)\bigl(\sqrt3\cos(\sqrt3\,\eta/2) - \sin(\sqrt3\,\eta/2)\bigr)/2, \end{aligned} $$
$$ \begin{aligned} c'_1(\xi) &= \exp(-\xi)/6, \\ c'_2(\xi) &= 2\exp(-\xi/2)\cos(\sqrt3\,\xi/2)/3 + \exp(-\xi)/2, \\ c'_3(\xi) &= \exp(-\xi/2)\cos(\pi/6-\sqrt3\,\xi/2)/\sqrt3 + \exp(-\xi)/3, \\ c'_4(\xi) &= \exp(-\xi/2)\bigl(-3\sqrt3\cos(\sqrt3\,\xi/2) + 5\sin(\sqrt3\,\xi/2)\bigr)/6 - \exp(-\xi)/\sqrt3, \\ c'_5(\xi) &= \exp(-\xi/2)\cos(\sqrt3\,\xi/2)/3, \\ c'_6(\xi) &= \exp(-\xi/2)\sin(\sqrt3\,\xi/2)/3, \end{aligned} $$
$$ \begin{aligned} c''_1(\xi) &= -\exp(-\xi)/6, \\ c''_2(\xi) &= -2\exp(-\xi/2)\sin(\sqrt3\,\xi/2+\pi/6)/3 - \exp(-\xi)/2, \\ c''_3(\xi) &= -\exp(-\xi/2)\sin(\sqrt3\,\xi/2)/\sqrt3 - \exp(-\xi)/3, \\ c''_4(\xi) &= \exp(-\xi/2)\bigl(\sqrt3\cos(\sqrt3\,\xi/2) + 2\sin(\sqrt3\,\xi/2)\bigr)/3 + \exp(-\xi)/\sqrt3, \\ c''_5(\xi) &= -\exp(-\xi/2)\bigl(\cos(\sqrt3\,\xi/2) + \sqrt3\sin(\sqrt3\,\xi/2)\bigr)/6, \\ c''_6(\xi) &= \exp(-\xi/2)\bigl(\sqrt3\cos(\sqrt3\,\xi/2) - \sin(\sqrt3\,\xi/2)\bigr)/6. \end{aligned} $$
The reproducing kernel for the Bessel potential space was presented in [8], and its simplified variant was given in [16, 15, 17, 18]. Normal splines will be constructed in the Bessel potential space $H^3_\varepsilon(\mathbb R)$ defined as
$$ H^3_\varepsilon(\mathbb R) = \left\{ f \mid f \in S', \ \bigl(\varepsilon^2 + |s|^2\bigr)^{3/2} F[f] \in L_2(\mathbb R) \right\}, \qquad \varepsilon > 0, $$
where $S'(\mathbb R)$ is the space of L. Schwartz tempered distributions and $F[f]$ is the Fourier transform of $f$ [8, 19, 20]. The parameter $\varepsilon$ introduced here may be treated as a "scaling parameter": it allows one to control the approximation properties of the normal spline, which usually improve with smaller values of $\varepsilon$, and it may also be used to reduce the ill-conditioning of the related computational problem (in the traditional theory $\varepsilon = 1$); see [15, 9]. This is a Hilbert space with norm
$$ \|f\|_{H^3_\varepsilon} = \bigl\| (\varepsilon^2 + |s|^2)^{3/2} F[f] \bigr\|_{L_2}. $$
It is continuously embedded in the Hölder space $C^2_b(\mathbb R)$ of functions having bounded continuous derivatives up to order 2 ([19]). The reproducing kernel of this space is defined, up to a constant factor, as follows ([15, 9]):
$$ V(\eta,\xi,\varepsilon) = \exp(-\varepsilon|\xi-\eta|)\bigl(3 + 3\varepsilon|\xi-\eta| + \varepsilon^2|\xi-\eta|^2\bigr). $$
Correspondingly,
$$ \frac{\partial V(\eta,\xi,\varepsilon)}{\partial\xi} = -\varepsilon^2\exp(-\varepsilon|\xi-\eta|)\,(\xi-\eta)\,(\varepsilon|\xi-\eta|+1), $$
$$ \frac{\partial^2 V(\eta,\xi,\varepsilon)}{\partial\eta\,\partial\xi} = -\varepsilon^2\exp(-\varepsilon|\xi-\eta|)\,\bigl(\varepsilon|\xi-\eta|(\varepsilon|\xi-\eta|-1)-1\bigr), $$
$$ \frac{\partial^2 V(\eta,\xi,\varepsilon)}{\partial\xi^2} = \varepsilon^2\exp(-\varepsilon|\xi-\eta|)\,\bigl(\varepsilon|\xi-\eta|(\varepsilon|\xi-\eta|-1)-1\bigr), $$
$$ \frac{\partial^3 V(\eta,\xi,\varepsilon)}{\partial\eta\,\partial\xi^2} = \varepsilon^4\exp(-\varepsilon|\xi-\eta|)\,(\xi-\eta)\,(\varepsilon|\xi-\eta|-3), $$
$$ \frac{\partial^4 V(\eta,\xi,\varepsilon)}{\partial\eta^2\,\partial\xi^2} = \varepsilon^4\exp(-\varepsilon|\xi-\eta|)\,\bigl(\varepsilon|\xi-\eta|(\varepsilon|\xi-\eta|-5)+3\bigr). $$
In addition, the following formulae are required for computing the normal spline derivatives:
$$ \frac{\partial V(\eta,\xi,\varepsilon)}{\partial\eta} = \varepsilon^2\exp(-\varepsilon|\xi-\eta|)\,(\xi-\eta)\,(\varepsilon|\xi-\eta|+1), $$
$$ \frac{\partial^2 V(\eta,\xi,\varepsilon)}{\partial\eta^2} = \varepsilon^2\exp(-\varepsilon|\xi-\eta|)\,\bigl(\varepsilon|\xi-\eta|(\varepsilon|\xi-\eta|-1)-1\bigr), $$
$$ \frac{\partial^3 V(\eta,\xi,\varepsilon)}{\partial\eta^2\,\partial\xi} = -\varepsilon^4\exp(-\varepsilon|\xi-\eta|)\,(\xi-\eta)\,(\varepsilon|\xi-\eta|-3). $$
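As an illustration (made-up data, not from the original text), here is an end-to-end one-dimensional example in $H^3_\varepsilon(\mathbb R)$: three value constraints plus one first-derivative constraint, assembled directly from the kernel formulas above with $z = 0$.

```python
import numpy as np

eps = 1.0
r = lambda e, x: np.abs(x - e)
# Kernel V(eta, xi, eps) and the derivatives needed for the Gram matrix:
V   = lambda e, x: np.exp(-eps*r(e, x)) * (3 + 3*eps*r(e, x) + (eps*r(e, x))**2)
Vxi = lambda e, x: -eps**2 * np.exp(-eps*r(e, x)) * (x - e) * (eps*r(e, x) + 1)
Vex = lambda e, x: -eps**2 * np.exp(-eps*r(e, x)) * (eps*r(e, x)*(eps*r(e, x) - 1) - 1)

x = np.array([0.0, 1.0, 2.0]); u = np.sin(x)   # f(x_i) = u_i
s = np.array([0.5]);           v = np.cos(s)   # f'(s_j) = v_j

G = np.block([[V(x[:, None], x[None, :]),     Vxi(x[:, None], s[None, :])],
              [Vxi(x[:, None], s[None, :]).T, Vex(s[:, None], s[None, :])]])
mu = np.linalg.solve(G, np.concatenate([u, v]))
sigma = lambda e: V(e, x) @ mu[:3] + Vxi(e, s) @ mu[3:]
print(sigma(1.0), np.sin(1.0))   # the spline reproduces the data at a node
```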
The normal splines method for one-dimensional function interpolation and for linear ordinary differential and integral equations was proposed in [13] and developed in [10, 11, 12]. The general formula of the reproducing kernel for Bessel potential spaces was published in [8], and its simplified version was given in [15, 17, 18]. The multidimensional normal splines method was developed for the two-dimensional problem of low-range computerized tomography in [16] and applied to solving a mathematical economics problem in [14]. Further results were reported at seminars and conferences.
References
[1] A. Balakrishnan, Applied Functional Analysis, New York, Springer-Verlag, 1976.
[2] N. Aronszajn, Theory of reproducing kernels, Transactions of the AMS, Vol. 68, 1950.
[3] A. Bezhaev, V. Vasilenko, Variational Theory of Splines, Springer US, 2001.
[4] A. Ioffe, V. Tikhomirov, Theory of extremal problems, North-Holland, Amsterdam, 1979.
[5] P.-J. Laurent, Approximation et optimisation, Paris, 1972.
[6] R. Adams, J. Fournier, Sobolev Spaces. Pure and Applied Mathematics. (2nd ed.). Boston, MA: Academic Press, 2003.
[7] D.R. Adams, L.I. Hedberg, Function spaces and potential theory. Berlin, Springer, 1996.
[8] N. Aronszajn, K.T. Smith, Theory of Bessel potentials I, Ann. Inst. Fourier, 11, 1961.
[9] Reproducing Kernel of Bessel Potential space
[10] V. Gorbunov, V. Petrishchev, Improvement of the normal spline collocation method for linear differential equations, Comput. Math. Math. Phys., 43:8 (2003), 1099–1108.
[11] V. Gorbunov, V. Sviridov, The method of normal splines for linear DAEs on the number semi-axis. Applied Numerical Mathematics, 59(3-4), 2009, 656–670.
[12] V. Gorbunov, Extremum Problems of Measurements Data Processing, Ilim, 1990 (in Russian).
[13] V. Gorbunov, The method of normal spline collocation, USSR Computational Mathematics and Mathematical Physics,Volume 29, Issue 1, 1989, Pages 145-154.
[14] V. Gorbunov, I. Kohanovsky, K. Makedonsky, Normal splines in reconstruction of multi-dimensional dependencies. Papers of WSEAS International Conference on Applied Mathematics, Numerical Analysis Symposium, Corfu, 2004. (http://www.wseas.us/e-library/conferences/corfu2004/papers/488-312.pdf)
[15] I. Kohanovsky, Multidimensional Normal Splines and Problem of Physical Field Approximation, International Conference on Fourier Analysis and its Applications, Kuwait, 1998.
[16] I. Kohanovsky, Normal Splines in Computing Tomography, Avtometriya, 1995, N 2. (https://www.iae.nsk.su/images/stories/5_Autometria/5_Archives/1995/2/84-89.pdf)
[17] H. Wendland, Scattered Data Approximation. Cambridge University Press, 2005.
[18] R. Schaback, Kernel-based Meshless Methods, Lecture Notes, Goettingen, 2011.
[19] M.S. Agranovich, Sobolev Spaces, Their Generalizations and Elliptic Problems in Smooth and Lipschitz Domains, Springer, Switzerland, 2015.
[20] H. Triebel, Interpolation. Function Spaces. Differential Operators. North-Holland, Amsterdam, 1978.
Thursday, October 24, 2019
Quadratic Programming in Hilbert space. Part III. General case.
The algorithm described in the previous post can be applied to solving a more general quadratic programming problem [1], [2].
Let $H$ be a Hilbert space with inner product $\langle\cdot,\cdot\rangle_H$ and norm $\|\cdot\|_H$, and let $A$ be a linear positive definite self-adjoint operator acting in $H$. Let the linear manifold $D = D(A)$, $D \subset H$, be the domain of definition of this operator, and let the element $z$ belong to $D$.
It is required to minimize functional J:
$$ (J,\varphi) = \langle A\varphi,\varphi\rangle_H - 2\langle z,\varphi\rangle_H \to \min, \tag{1} $$
subject to constraints
$$ \langle h_i,\varphi\rangle_H = u_i, \quad 1 \le i \le L, \tag{2} $$
$$ \langle h_{i+L},\varphi\rangle_H \le u_{i+L}, \quad 1 \le i \le M, \qquad \varphi \in D, \ D \subset H. \tag{3} $$
The elements $h_i \in D$, $1 \le i \le M+L$, are assumed to be linearly independent (thereby the system of constraints is compatible).
Let $H_A$ be the energetic space generated by the operator $A$ [2], [3]. This space is the completion of the linear manifold $D$ with the scalar product
$$ [\varphi,\vartheta]_A = \langle A\varphi,\vartheta\rangle_H, \qquad \varphi,\vartheta \in H_A, $$
and the corresponding norm $\|\varphi\|_A = [\varphi,\varphi]_A^{1/2}$.
We introduce elements $\bar z$ and $\bar h_i$ of $H_A$ as the solutions of the following equations:
$$ A\bar z = z, \quad \bar z \in H_A, \qquad A\bar h_i = h_i, \quad \bar h_i \in H_A, \ 1 \le i \le M+L, $$
and the quadratic functional
$$ (\bar J,\varphi) = \|\varphi - \bar z\|_A^2. \tag{4} $$
It is easy to see that
$$ (J,\varphi) = (\bar J,\varphi) - \|\bar z\|_A^2, $$
and the constraints (2), (3) are transformed into
$$ [\bar h_i,\varphi]_A = u_i, \quad 1 \le i \le L, \tag{5} $$
$$ [\bar h_{i+L},\varphi]_A \le u_{i+L}, \quad 1 \le i \le M, \qquad \varphi \in H_A, \ \bar h_i \in H_A. \tag{6} $$
Thereby the problem of minimizing the functional (1) on the linear manifold $D$ subject to constraints (2), (3) is equivalent to the problem of minimizing the functional (4) in the Hilbert space $H_A$ under constraints (5), (6). This latter problem is the problem of finding the projection of the element $\bar z$ onto the nonempty convex closed set defined by (5), (6), and it has a unique solution. The solution can be found by using the algorithms described in the previous posts.
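A quick finite-dimensional sanity check of this reduction (with $A$ replaced by a random SPD matrix; illustrative only, not from the original post):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + 5 * np.eye(5)             # SPD stand-in for the operator A
z = rng.standard_normal(5)
zbar = np.linalg.solve(A, z)            # A zbar = z
phi = rng.standard_normal(5)
J    = phi @ A @ phi - 2 * z @ phi      # (J, phi)
Jbar = (phi - zbar) @ A @ (phi - zbar)  # ||phi - zbar||_A^2
print(np.isclose(J, Jbar - zbar @ A @ zbar))   # True
```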
References
[1] V. Gorbunov, Extremum Problems of Measurements Data Processing, Ilim, 1990 (in Russian).
[2] V.Lebedev, An Introduction to Functional Analysis in Computational Mathematics, Birkhäuser, 1997
[3] V. Agoshkov, P. Dubovski, V. Shutyaev, Methods for Solving Mathematical Physics Problems. Cambridge International Science Publishing Ltd, 2006.
Friday, July 19, 2019
Quadratic Programming in Hilbert space. Part II. Adaptive algorithm.
This is a continuation to the previous post.
One disadvantage of the algorithm described in the previous post is that it requires an initial feasible point, and computing such a point may require a significant amount of work. Also, when the number of inequality constraints is large, the rate of convergence may be slow. It often happens that the active set corresponding to the problem solution includes only a small number of the inequality constraints, significantly fewer than their total number. In such cases it makes sense to apply a different approach to solving problem (1), (4), (5). Its essence is as follows: we construct a sequence of approximations $\sigma^k$, each of which is a solution of an auxiliary problem of minimizing the functional (1) on a certain subset of the constraints (4), (5). The auxiliary minimization problems are defined in such a way that the value of the functional (1) strictly increases for each subsequent approximation. These minimization problems are solved with the algorithm outlined in the previous post. Thus, in order to get the solution of the original problem (1), (4), (5), it is necessary to solve a series of problems with a small number of constraints. Let's describe this in detail.
Assume there is a subset of constraints (4), (5):
$$ \langle h_i,\varphi\rangle_H = b_i, \quad i \in I_1, \tag{28} $$
$$ \langle h_i,\varphi\rangle_H \le b_i, \quad i \in I_2^k, \qquad I_2^k \subset I_2, \ I^k = I_1 \cup I_2^k, \tag{29} $$
where $I_1$ and $I_2$ are defined in (7). Denote by $\Phi^k$ the convex set defined by constraints (28) and (29); obviously $\Phi$ (the set defined by constraints (4) and (5)) is a subset of $\Phi^k$.
Let $\hat\sigma^k$ be the normal solution of system (28), (29). Assume that $\hat\sigma^k$ does not satisfy all the constraints (5), i.e. $\hat\sigma^k \notin \Phi$ (otherwise $\hat\sigma^k$ is the solution of problem (1), (4), (5)), and let $i_t \in I_2 \setminus I_2^k$ be the index of the most violated constraint. We set
$$ I_2^{k+1} = I_2^k \cup \{i_t\}, \qquad I^{k+1} = I_1 \cup I_2^{k+1}. $$
Let $\Phi^{k+1}$ be the convex set defined by the constraints with indices from $I^{k+1}$; then $\Phi^{k+1}$ is a subset of $\Phi^k$:
$$ \Phi^{k+1} \subset \Phi^k, $$
and therefore the following inequality holds
$$ \|\hat\sigma^k\|_H^2 < \|\hat\sigma^{k+1}\|_H^2. \tag{30} $$
This inequality is strict because $\hat\sigma^k \notin \Phi^{k+1}$. If $\hat\sigma^{k+1}$ does not satisfy some constraint with an index from $I_2 \setminus I_2^{k+1}$, the described process is repeated. Since inequality (30) always holds and the number of constraints is finite, the algorithm converges to the normal solution of the system (4), (5) in a finite number of steps.
The computational details of the algorithm are described below.
Let $\hat A^k$ be the active set corresponding to the normal solution $\hat\sigma^k$ of system (28), (29):
$$ \hat A^k = \{ i \in I^k : \langle h_i,\hat\sigma^k\rangle_H = b_i \}, \tag{31} $$
$$ \hat\sigma^k = \sum_{j\in\hat A^k} \mu_j h_j, \qquad \sum_{j\in\hat A^k} g_{ij}\mu_j = b_i, \quad i \in \hat A^k, \qquad g_{ij} = \langle h_i,h_j\rangle_H, \quad i,j \in \hat A^k. \tag{32} $$
For every $i \in I_2 \setminus I_2^k$ we compute the values
$$ \hat e_i^k = \langle h_i,\hat\sigma^k\rangle_H - b_i = \sum_{j\in\hat A^k} g_{ij}\mu_j - b_i, \tag{33} $$
and find the largest among them; its index is denoted by $i_t$:
$$ \hat e_{i_t}^k = \max_{i\in I_2\setminus I_2^k}\{\hat e_i^k\}. $$
If $\hat e_{i_t}^k \le 0$, then $\hat\sigma^k$ satisfies all the constraints (5) and is the solution of the original problem (1), (4), (5), so $\sigma = \hat\sigma^k$. Otherwise a new auxiliary problem should be solved, which consists in minimizing the functional (1) on the solutions of the augmented constraint subsystem
$$ \langle h_i,\varphi\rangle_H = b_i, \ i \in I_1, \qquad \langle h_i,\varphi\rangle_H \le b_i, \ i \in I_2^k, \qquad \langle h_{i_t},\varphi\rangle_H \le b_{i_t}, \qquad I_2^{k+1} = I_2^k \cup \{i_t\}, \ I^{k+1} = I_1 \cup I_2^{k+1}. $$
An initial feasible point required for solving this problem can be defined as the normal solution $\tilde\sigma^{k+1}$ of the system
$$ \langle h_i,\varphi\rangle_H = b_i, \ i \in \hat A^k, \qquad \langle h_{i_t},\varphi\rangle_H = b_{i_t}. $$
Let $\tilde I^{k+1} = \hat A^k \cup \{i_t\}$; then $\tilde\sigma^{k+1}$ is presented as
$$ \tilde\sigma^{k+1} = \sum_{j\in\tilde I^{k+1}} \mu_j h_j, \qquad \sum_{j\in\tilde I^{k+1}} g_{ij}\mu_j = b_i, \quad i \in \tilde I^{k+1}. \tag{34} $$
Since the matrix of system (34) differs from the matrix of system (32) only in the last row and column, its Cholesky decomposition can be obtained by modifying the Cholesky decomposition of the matrix of system (32), as described in the post "Algorithms for updating Cholesky factorization".
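One selection step of this adaptive scheme can be sketched as follows (a hedged numpy illustration; `G`, `b` and the index bookkeeping are assumed to come from the finite Gram system, and the names are placeholders):

```python
import numpy as np

def most_violated(G, b, A, mu_A, candidates):
    """Given the multipliers mu_A of the normal solution on the current
    constraint subset with active set A (system (32)), return the index i_t
    of the most violated constraint among `candidates` (formula (33)),
    or None if the current solution violates nothing."""
    if not candidates:
        return None
    e = {i: G[i, A] @ mu_A - b[i] for i in candidates}   # hat-e_i^k
    it = max(e, key=e.get)
    return it if e[it] > 0 else None
```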
The initial set $I^0$ can be defined as follows:
$$ I^0 = I_1 \cup \{i_0, i_0+M\}, $$
where $i_0$ is such that
$$ \delta_{i_0} = \min_{L+1\le i\le S}\{\delta_i\}. $$
The idea of this algorithm was originally published in the paper [2], where it was applied to the construction of "splines in a convex set". In that case the objective functional of the related variational problem was a semi-norm, and the existence and uniqueness of the problem solution was not trivial. The corresponding auxiliary quadratic programming problems were not strictly convex, and a variant of the coordinate descent algorithm was used for their numerical solution. The algorithm showed good results when solving one-dimensional problems.
References
[1] I. Kohanovsky, Data approximation using multidimensional normal splines. Unpublished manuscript. 1996.
[2] A. Kovalkov, On an algorithm for the construction of splines with discrete constraints of the type of inequalities, Serdica Bulgaricae mathematicae publicationes, Vol. 9, N. 4, 1983.
Wednesday, April 17, 2019
Quadratic Programming in Hilbert space. Part I.
This post describes an algorithm for the normal solution of a system of linear inequalities in a Hilbert space. An original version of this algorithm was developed by V. Gorbunov and published in [3]. A slightly modified version of that algorithm was described in [6].
Let $H$ be a Hilbert space with inner product $\langle\cdot,\cdot\rangle_H$ and the induced norm $\|\cdot\|_H$, and suppose there are given elements $h_i \in H$, numbers $u_i$, $1 \le i \le M+L$, and positive numbers $\delta_i$, $1 \le i \le M$.
It is required to minimize functional J:
$$ (J,\varphi) = \|\varphi\|_H^2 \to \min, \tag{1} $$
subject to constraints
$$ \langle h_i,\varphi\rangle_H = u_i, \quad 1 \le i \le L, \tag{2} $$
$$ |\langle h_{i+L},\varphi\rangle_H - u_{i+L}| \le \delta_i, \quad 1 \le i \le M, \qquad \varphi \in H. \tag{3} $$
The elements $h_i$, $1 \le i \le M+L$, are assumed to be linearly independent (thereby the system (2), (3) is compatible). The solution of problem (1)–(3) obviously exists and is unique as the solution of the problem of finding the projection of the zero element of the Hilbert space onto a nonempty convex closed set [1, 5]. Expanding the moduli in (3), we rewrite the system (2), (3) in the form:
$$ \langle h_i,\varphi\rangle_H = b_i, \quad 1 \le i \le L, \tag{4} $$
$$ \langle h_i,\varphi\rangle_H \le b_i, \quad L+1 \le i \le N, \tag{5} $$
where:
$$ N = L+2M, \quad S = L+M, \quad b_i = u_i, \ 1 \le i \le L, \tag{6} $$
$$ b_{i+L} = u_{i+L} + \delta_i, \quad b_{i+S} = -u_{i+L} + \delta_i, \quad h_{i+S} = -h_{i+L}, \quad 1 \le i \le M. $$
Let $\Phi$ be the convex set defined by constraints (4) and (5). We denote
$$ I_1 = \{1,\dots,L\}, \quad I_2 = \{L+1,\dots,N\}, \quad I = I_1 \cup I_2 = \{1,\dots,N\}, $$
$$ A(\varphi) = \{ i \in I : \langle h_i,\varphi\rangle_H = b_i \}, \quad P(\varphi) = I \setminus A(\varphi), \tag{7} $$
$$ \Psi(\varphi) = \{ \psi \in H : \langle h_i,\psi\rangle_H = b_i, \ i \in A(\varphi) \}, \qquad g_{ij} = \langle h_i,h_j\rangle_H, \quad 1 \le i,j \le S. $$
The feasible part of the set $\Psi(\varphi)$, $\varphi \in \Phi$, is a face of the set $\Phi$ containing $\varphi$. If $A(\varphi)$ is empty (which is possible when $L=0$) then $\Psi(\varphi) = H$.
In accordance with optimality conditions [4, 5] the solution of the problem (1), (4), (5) can be represented as:
$$ \sigma = \sum_{i=1}^{L} \mu_i h_i + \sum_{i=L+1}^{L+M} (\mu_i - \mu_{i+M}) h_i, \quad \mu_i \le 0, \ \mu_{i+M} \le 0, \ L+1 \le i \le L+M, \tag{8} $$
$$ \mu_i\,(\langle h_i,\sigma\rangle_H - b_i) = 0, \quad L+1 \le i \le N, \tag{9} $$
$$ \mu_i\,\mu_{i+M} = 0, \quad L+1 \le i \le S. \tag{10} $$
Here the complementary slackness conditions (9) mean that $\mu_k = 0$ for $k \in P(\sigma)$, and the relations (10) reflect the fact that a pair of constraints (5) with indices $i$ and $i+M$ cannot be simultaneously active at the solution. Thereby there are at most $S$ non-trivial coefficients $\mu_i$ in formula (8), and the actual dimension of problem (1), (4), (5) is $S$.
Let $\hat\sigma$ be the normal solution of system (4); then it can be written as
$$ \hat\sigma = \sum_{i=1}^{L} \mu_i h_i, $$
where the coefficients $\mu_i$ are determined from the system of linear equations with the symmetric positive definite Gram matrix $\{g_{ij}\}$ of the linearly independent elements $h_i$:
$$ \sum_{j=1}^{L} g_{ij}\mu_j = b_i, \quad 1 \le i \le L. $$
If $L=0$ then there are no constraints (4) and $\hat\sigma = 0$. If $\hat\sigma$ satisfies the constraints (5), that is, the inequalities
$$ \langle h_i,\hat\sigma\rangle_H \le b_i, \quad L+1 \le i \le N, $$
which can be written as
$$ \sum_{j=1}^{L} g_{ij}\mu_j \le b_i, \quad L+1 \le i \le N, $$
then $\hat\sigma$ is the solution of the original problem (1), (4), (5), i.e. $\sigma = \hat\sigma$.
Consider the nontrivial case when $\hat\sigma$ does not satisfy the constraints (5). In this case $\sigma$ belongs to the boundary of the set $\Phi$ and is the projection of the zero element of the space $H$ onto the set $\Psi(\sigma)$.
The projection $\vartheta$ of the zero element of $H$ onto the set $\Psi(\varphi)$ can be presented as
$$ \vartheta = \sum_{i\in A(\varphi)} \mu_i h_i, $$
where the factors $\mu_i$ are defined from the system of linear equations with symmetric positive definite matrix
$$ \sum_{j\in A(\varphi)} g_{ij}\mu_j = b_i, \quad i \in A(\varphi). $$
If the factors $\mu_i$, $i \in I_2 \cap A(\varphi)$, corresponding to the inequality constraints are nonpositive, then we can set $\mu_i = 0$ for $i \in P(\varphi)$ and all the conditions (8)–(10) are satisfied; thereby $\vartheta = \sigma$ is the solution of the problem under consideration. The following algorithm is based on this remark. Let's describe its iteration.
Let $\sigma^k$ be a feasible point of the system (4), (5):
$$ \sigma^k = \sum_{i=1}^{N} \mu_i^k h_i. \tag{16} $$
In this representation there are at most $S$ non-zero multipliers $\mu_i^k$.
Denote by $\vartheta^k$ the projection of the zero element of $H$ onto the set $\Psi(\sigma^k)$, and let $A^k = A(\sigma^k)$, $P^k = P(\sigma^k) = I \setminus A^k$. Then $\vartheta^k$ can be presented as
$$ \vartheta^k = \sum_{i\in A^k} \lambda_i^k h_i = \sum_{i=1}^{N} \lambda_i^k h_i, \quad \lambda_i^k = 0 \ \ \forall i \in P^k, \qquad \sum_{j\in A^k} g_{ij}\lambda_j^k = b_i, \quad i \in A^k. \tag{18} $$
There are two possible cases: the point ϑk is feasible or it is not feasible.
In the first case we check the optimality conditions, namely: if $\lambda_i^k \le 0$ for all $i \in I_2$ then $\vartheta^k$ is the solution of the problem. If the optimality conditions are not satisfied, we set $\sigma^{k+1} = \vartheta^k$, find an index $i \in A^k$ such that $\lambda_i^k > 0$, and remove it from $A^k$.
In the second case $\sigma^{k+1}$ will be defined as the feasible point of the ray $\vartheta(t)$,
$$ \vartheta(t) = \sigma^k + t\,(\vartheta^k - \sigma^k), \tag{19} $$
that is closest to $\vartheta^k$. Denote by $t_{\min}^k$ the corresponding value of $t$ and by $i_k \in P^k$ the index of the related violated constraint. This index $i_k$ will be added to $A^k$, forming $A^{k+1}$.
Thereby all $\sigma^k$ are feasible points; that is, the minimization process proceeds within the feasible region, and the value of $\|\sigma^k\|_H$ is not increasing. The minimization process is finite because of the linear independence of the constraints, which eliminates the possibility of the algorithm cycling.
The feasibility of $\vartheta^k$ can be checked as follows. Introduce the values $e_i^k$, $i \in P^k$:
$$ e_i^k = \langle h_i, \vartheta^k - \sigma^k\rangle_H = \sum_{j=1}^{N} g_{ij}\,(\lambda_j^k - \mu_j^k), \quad i \in P^k. $$
If $e_i^k < 0$ for all $i \in P^k$ then $\vartheta^k$ is a feasible point; additionally, if $\lambda_i^k \le 0$ for $i \in I_2 \cap A^k$, it is the solution of the problem.
In the case when some $e_i^k > 0$ exist, we calculate the values
$$ t_i^k = \frac{b_i - \langle h_i,\sigma^k\rangle_H}{e_i^k}, \qquad \langle h_i,\sigma^k\rangle_H = \sum_{j=1}^{N} g_{ij}\mu_j^k, \qquad \forall i \in P^k : e_i^k > 0. $$
Here all $t_i^k > 0$. Now the maximum feasible step $t_{\min}^k$ is computed as
$$ t_{\min}^k = \min_{i\in P^k}\{t_i^k\}, $$
and $\vartheta^k$ is feasible if and only if $t_{\min}^k \ge 1$ (see (19)).
Thereby the algorithm's iteration consists of seven steps:
1. Initialization. Let $\sigma^0$ be a feasible point of the system (4), (5) and $\mu_i^0$ the corresponding multipliers (16).
2. Compute the multipliers $\lambda_i^k$, $i \in A^k$, as the solution of system (18).
3. If $|A^k| = S$ then go to Step 6.
4. Compute the values $e_i^k$, $i \in P^k$, and $t_i^k$ for all $i \in P^k$ with $e_i^k > 0$; find $t_{\min}^k$ and the corresponding index $i_k$.
5. If $t_{\min}^k < 1$ (the projection $\vartheta^k$ is not feasible) then set
$$ \mu_i^{k+1} = \mu_i^k + t_{\min}^k(\lambda_i^k - \mu_i^k), \ i \in A^k, \qquad A^{k+1} = A^k \cup \{i_k\}, $$
and return to Step 2.
6. (The projection $\vartheta^k$ is feasible.) If there exists an index $i_p \in I_2 \cap A^k$ such that $\lambda_{i_p}^k > 0$ then set
$$ A^{k+1} = A^k \setminus \{i_p\}, \qquad \mu_i^{k+1} = \lambda_i^k, \ i \in A^k, $$
and return to Step 2. Otherwise go to Step 7.
7. Set $\sigma = \vartheta^k$. Stop.
The algorithm starts from an initial feasible point of the system (4), (5). Such a point $\sigma^0$ can be defined as the normal solution of the system
$$ \langle h_i,\varphi\rangle_H = u_i, \quad 1 \le i \le S, \tag{25} $$
and can be presented as
$$ \sigma^0 = \sum_{i=1}^{S} \mu_i^0 h_i, \tag{26} $$
where the multipliers $\mu_i^0$ are defined from the system
$$ \sum_{j=1}^{S} g_{ij}\mu_j^0 = u_i, \quad 1 \le i \le S. \tag{27} $$
The algorithm described here implements a variant of the active set method [2] with the simplest form of the minimized functional (1). It can also be treated as a variant of the conjugate gradient method [7], and it solves the considered problem in a finite number of operations. The conjugate gradient method in this case reduces to the problem of finding the projection $\vartheta^k$, that is, to solving system (18). The infinite dimensionality of the space $H$ is irrelevant for this algorithm.
The most computationally costly part of the algorithm is the calculation of the projections $\vartheta^k$ by solving systems (18). The matrices of all systems (18) are positive definite (by virtue of the linear independence of the elements $h_i$), and it is convenient to factorize them using the Cholesky decomposition. At every step of the algorithm a constraint is added to or removed from the current active set, and the matrix of system (18) gets modified by adding or deleting the corresponding row and symmetric column. It is possible to get the Cholesky factor of the modified matrix without computing the full matrix factorization (see the previous post), which greatly reduces the total volume of computations.
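For illustration, here is a compact dense numpy sketch of the whole iteration, specialized to the one-sided case discussed just below (no mirrored constraints, so N = S) and using plain solves instead of Cholesky updates. It is a hedged transcription of the seven steps, not a tuned solver.

```python
import numpy as np

def normal_solution_qp(G, b, L, tol=1e-12):
    """min ||phi||^2 s.t. <h_i,phi> = b_i (i < L), <h_i,phi> <= b_i (i >= L).
    G: S x S Gram matrix of linearly independent h_i; returns mu such that
    sigma = sum_i mu_i h_i."""
    S = G.shape[0]
    b = np.asarray(b, dtype=float)
    mu = np.linalg.solve(G, b)          # initial feasible point (25)-(27)
    active = list(range(S))
    while True:
        lam = np.zeros(S)               # projection on the face: system (18)
        lam[active] = np.linalg.solve(G[np.ix_(active, active)], b[active])
        passive = [i for i in range(S) if i not in active]
        e = G @ (lam - mu)              # e_i^k = <h_i, theta_k - sigma_k>
        viol = [i for i in passive if e[i] > tol]
        if viol:
            Gmu = G @ mu                # <h_i, sigma_k>
            t = [(b[i] - Gmu[i]) / e[i] for i in viol]
            tmin = min(t)
            ik = viol[int(np.argmin(t))]
        if viol and tmin < 1:           # theta_k infeasible: partial step
            mu = mu + tmin * (lam - mu)
            active.append(ik)
        else:                           # theta_k feasible
            mu = lam
            pos = [i for i in active if i >= L and lam[i] > tol]
            if not pos:
                return mu               # optimality conditions hold
            active.remove(pos[0])       # drop an inequality with lambda > 0
```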
This algorithm can also be applied in the case when, instead of the modular constraints (3), there are one-sided inequality constraints
$$ \langle h_{i+L},\varphi\rangle_H \le u_{i+L}, \quad 1 \le i \le M; $$
for that it is enough to set
$$ b_{i+L} = u_{i+L}, \qquad b_{i+S} = -\infty, \quad 1 \le i \le M. $$

References

[1] A. Balakrishnan, Applied Functional Analysis, Springer-Verlag, New York, 1976.
[2] P. Gill, W. Murray, M. Wright, Practical Optimization, Academic Press, London, 1981.
[3] V. Gorbunov, The method of reduction of unstable computational problems (Metody reduktsii neustoichivykh vychislitel'nkh zadach), Ilim, 1984.
[4] A. Ioffe, V. Tikhomirov, Theory of extremal problems, North-Holland, Amsterdam, 1979.
[5] P.-J. Laurent, Approximation et optimisation, Paris, 1972.
[6] I. Kohanovsky, Data approximation using multidimensional normal splines. Unpublished manuscript. 1996.
[7] B. Pshenichnyi, Yu. Danilin, Numerical methods in extremal problems, Mir Publishers, Moscow, 1978.
Friday, February 8, 2019
Algorithms for updating Cholesky factorization
The problem of updating a Cholesky factorization was treated in [1], [2] and [3]; the algorithms described here are taken from those works. These algorithms compute a factorization $\tilde A = \tilde L\tilde L^T$, where $\tilde A$ is the matrix obtained from $A = LL^T$ after a row and the symmetric column have been added to or deleted from $A$.
Let $A \in \mathbb R^{n\times n}$ be a symmetric positive definite matrix. Applying the Cholesky method to $A$ yields the factorization $A = LL^T$, where $L$ is a lower triangular matrix with positive diagonal elements.
Suppose $\tilde A$ is a positive definite matrix obtained by adding a row and the symmetric column to $A$:
$$ \tilde A = \begin{pmatrix} A & d \\ d^T & \gamma \end{pmatrix}, \qquad d \in \mathbb R^n. $$
Then its Cholesky factorization is [2]:
$$ \tilde L = \begin{pmatrix} L & 0 \\ e^T & \alpha \end{pmatrix}, $$
where
$$ e = L^{-1} d, $$
and
$$ \alpha = \sqrt{\tau}, \qquad \tau = \gamma - e^T e, \quad \tau > 0. $$
If τ≤0 then ˜A is not positive definite.
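A direct numpy/scipy transcription of these formulas (an illustrative sketch):

```python
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_add_row(L, d, gamma):
    """Cholesky factor of A-tilde obtained by appending the row (d^T, gamma)
    to A = L L^T: e = L^{-1} d, alpha = sqrt(gamma - e^T e)."""
    e = solve_triangular(L, d, lower=True)
    tau = gamma - e @ e
    if tau <= 0:
        raise ValueError("the augmented matrix is not positive definite")
    n = L.shape[0]
    Lt = np.zeros((n + 1, n + 1))
    Lt[:n, :n] = L
    Lt[n, :n] = e
    Lt[n, n] = np.sqrt(tau)
    return Lt
```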
Now assume $\tilde A$ is obtained from $A$ by deleting the $r$-th row and column of $A$. The matrix $\tilde A$ is positive definite (as any principal square submatrix of a positive definite matrix). It is shown in [1], [3] how to get $\tilde L$ from $L$ in this case. Let us partition $A$ and $L$ along the $r$-th row and column:
$$ A = \begin{pmatrix} A_{11} & a_{1r} & A_{12} \\ a_{1r}^T & a_{rr} & a_{2r}^T \\ A_{21} & a_{2r} & A_{22} \end{pmatrix}, \qquad L = \begin{pmatrix} L_{11} & & \\ l_{1r}^T & l_{rr} & \\ L_{21} & l_{2r} & L_{22} \end{pmatrix}. $$
Then ˜A can be written as
$$ \tilde A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}. $$
By deleting the rth row from L we get the matrix
$$ H = \begin{pmatrix} L_{11} & & \\ L_{21} & l_{2r} & L_{22} \end{pmatrix} $$
with the property that $HH^T = \tilde A$. The factor $\tilde L$ can be obtained from $H$ by applying Givens rotations to the matrix elements $h_{r,r+1}, h_{r+1,r+2}, \dots, h_{n-1,n}$ and deleting the last column of the resulting matrix $\tilde H$:
$$ \tilde H = HR = \begin{pmatrix} L_{11} & 0 & 0 \\ L_{21} & \tilde L_{22} & 0 \end{pmatrix} = \begin{pmatrix} \tilde L & 0 \end{pmatrix}, $$
where $R = R_0 R_1 \dots R_{n-1-r}$ is the matrix of Givens rotations, $L_{11}$ and $\tilde L_{22}$ are lower triangular matrices, and
$$ \tilde L = \begin{pmatrix} L_{11} & \\ L_{21} & \tilde L_{22} \end{pmatrix}, \qquad \tilde H\tilde H^T = HRR^TH^T = HH^T = \tilde A, \qquad \tilde H\tilde H^T = \tilde L\tilde L^T, \qquad \tilde L\tilde L^T = \tilde A. $$
The orthogonal matrix $R_t$ ($0 \le t \le n-1-r$), $R_t \in \mathbb R^{n\times n}$, which defines the Givens rotation annihilating the $(r+t, r+t+1)$-th element of the matrix $HR_0\dots R_{t-1}$, is defined as
$$ R_t = \begin{pmatrix} 1 & \dots & 0 & 0 & \dots & 0 \\ \vdots & \ddots & & & & \vdots \\ 0 & \dots & c & s & \dots & 0 \\ 0 & \dots & -s & c & \dots & 0 \\ \vdots & & & & \ddots & \vdots \\ 0 & \dots & 0 & 0 & \dots & 1 \end{pmatrix}, $$
where the entries $(r+t, r+t)$ and $(r+t+1, r+t+1)$ equal $c$, the $(r+t, r+t+1)$ entry equals $s$, and the $(r+t+1, r+t)$ entry equals $-s$; here $c^2 + s^2 = 1$. Let $\tilde l^{t-1}_{ij}$ be the coefficients of the matrix $HR_0\dots R_{t-1}$ (and $\tilde l^{-1}_{ij}$ the coefficients of $H$), and
$$ c = \frac{\tilde l^{t-1}_{r+t,r+t}}{\sqrt{\bigl(\tilde l^{t-1}_{r+t,r+t}\bigr)^2 + \bigl(\tilde l^{t-1}_{r+t,r+t+1}\bigr)^2}}, \qquad s = \frac{\tilde l^{t-1}_{r+t,r+t+1}}{\sqrt{\bigl(\tilde l^{t-1}_{r+t,r+t}\bigr)^2 + \bigl(\tilde l^{t-1}_{r+t,r+t+1}\bigr)^2}}. $$
Then the matrix $HR_0\dots R_t$ differs from $HR_0\dots R_{t-1}$ only in the entries of the $(r+t)$-th and $(r+t+1)$-th columns, so that
$$ \tilde l^t_{i,r+t} = \tilde l^{t-1}_{i,r+t} = 0, \qquad \tilde l^t_{i,r+t+1} = \tilde l^{t-1}_{i,r+t+1} = 0, \qquad 1 \le i \le r+t-1, $$
$$ \tilde l^t_{i,r+t} = c\,\tilde l^{t-1}_{i,r+t} + s\,\tilde l^{t-1}_{i,r+t+1}, \qquad \tilde l^t_{i,r+t+1} = -s\,\tilde l^{t-1}_{i,r+t} + c\,\tilde l^{t-1}_{i,r+t+1}, \qquad r+t \le i \le n-1, $$
where
$$ \tilde l^t_{r+t,r+t} = \sqrt{\bigl(\tilde l^{t-1}_{r+t,r+t}\bigr)^2 + \bigl(\tilde l^{t-1}_{r+t,r+t+1}\bigr)^2}, \qquad \tilde l^t_{r+t,r+t+1} = 0. $$
Note that the coefficient $\tilde l^t_{r+t,r+t}$ is nonnegative.
In order to avoid unnecessary overflow or underflow during the computation of $c$ and $s$, it was recommended in [3] to calculate the value of the square root $w = \sqrt{x^2 + y^2}$ as follows:
$$ v = \max\{|x|,|y|\}, \qquad u = \min\{|x|,|y|\}, \qquad w = \begin{cases} v\sqrt{1 + (u/v)^2}, & v \ne 0, \\ 0, & v = 0. \end{cases} $$
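The deletion case can be transcribed as follows (an illustrative numpy sketch; `r` is the 0-based index of the deleted row/column, and np.hypot implements the safe square root above):

```python
import numpy as np

def cholesky_delete_row(L, r):
    """Cholesky factor of A-tilde obtained from A = L L^T by deleting
    row/column r: apply Givens rotations to H = L without row r."""
    H = np.delete(L, r, axis=0)
    n = H.shape[0]                     # new dimension
    for t in range(r, n):
        x, y = H[t, t], H[t, t + 1]
        w = np.hypot(x, y)             # sqrt(x^2 + y^2) without overflow
        c, s = (x / w, y / w) if w > 0 else (1.0, 0.0)
        # rotate columns t and t+1 of rows t..n-1
        col0 = c * H[t:, t] + s * H[t:, t + 1]
        col1 = -s * H[t:, t] + c * H[t:, t + 1]
        H[t:, t], H[t:, t + 1] = col0, col1
        H[t, t + 1] = 0.0
    return H[:, :n]                    # drop the (numerically zero) last column
```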
Applying this technique to the Cholesky factorization significantly reduces the complexity of the calculations. In the case of adding a row and the symmetric column to the original matrix, about $n^2$ flops are needed instead of about $(n+1)^3/3$ flops for the direct calculation of the new Cholesky factor. In the case of deleting a row and the symmetric column from the original matrix, the new Cholesky factor can be obtained with about $3(n-r)^2$ flops (the worst case requires about $3(n-1)^2$ operations) instead of about $(n-1)^3/3$ flops required for its direct calculation.
References
[1] T. F. Coleman, L. A. Hulbert, A direct active set algorithm for large sparse quadratic programs with simple bounds, Mathematical Programming, 45, 1989.
[2] J. A. George and J. W-H. Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, 1981.
[3] C. L. Lawson, R. J. Hanson, Solving least squares problems, SIAM, 1995.
Simple normal splines examples
In this post we illustrate the normal spline interpolation approach with a few simple examples.
Suppose the following information about a smooth function $\varphi(x,y)$, $(x,y) \in \mathbb R^2$, is given:
$$ \varphi(0,0) = 0, \tag{1} $$
$$ \frac{\partial\varphi}{\partial x}(0,0) = 1, \tag{2} $$
$$ \frac{\partial\varphi}{\partial y}(0,0) = 1, \tag{3} $$
and it is necessary to reconstruct $\varphi$ using these data.
Assume this function is an element of the Hilbert space $H^{2.5}_\varepsilon(\mathbb R^2)$; thus it can be treated as a continuously differentiable function, $\varphi \in C^1(\mathbb R^2)$. We construct a normal spline $\sigma_1^{(2.5)}$ approximating $\varphi$:
$$ \sigma_1^{(2.5)} = \operatorname{argmin}\bigl\{ \|\varphi\|^2_{H^{2.5}_\varepsilon} : (1), (2), (3), \ \forall\varphi \in H^{2.5}_\varepsilon(\mathbb R^2) \bigr\}. $$
This spline can be presented as
$$ \sigma_1^{(2.5)} = \mu_1 h_1 + \mu'_1 h'_1 + \mu'_2 h'_2, $$
here
$$ h_1(\eta_1,\eta_2,\varepsilon) = \exp\bigl(-\varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr)\bigl(1 + \varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr), $$
$$ h'_1(\eta_1,\eta_2,\varepsilon) = \varepsilon^2\exp\bigl(-\varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr)\,\eta_1, \qquad h'_2(\eta_1,\eta_2,\varepsilon) = \varepsilon^2\exp\bigl(-\varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr)\,\eta_2, \qquad (\eta_1,\eta_2) \in \mathbb R^2, $$
and the coefficients $(\mu_1, \mu'_1, \mu'_2)$ are defined from the system (see the previous post):
$$ \begin{bmatrix} 1 & 0 & 0 \\ 0 & \varepsilon^2 & 0 \\ 0 & 0 & \varepsilon^2 \end{bmatrix} \begin{bmatrix} \mu_1 \\ \mu'_1 \\ \mu'_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}. $$
Eventually
$$ \sigma_1^{(2.5)}(x,y,\varepsilon) = \exp\bigl(-\varepsilon\sqrt{x^2+y^2}\bigr)\,(x+y), \qquad (x,y) \in \mathbb R^2. $$
Fig. 1. Spline $\sigma_1^{(2.5)}$, $\varepsilon = 1$.
Fig. 2. Spline $\sigma_1^{(2.5)}$, $\varepsilon = 0.1$.
Now let the function $\varphi(x,y)$, $(x,y) \in \mathbb R^2$, be a twice continuously differentiable function which satisfies the constraints:
$$ \varphi(0,0) = 0, \tag{11} $$
$$ \frac{\partial\varphi}{\partial x}(0,0) + \frac{\partial\varphi}{\partial y}(0,0) = 2. \tag{12} $$
We approximate it by constructing a normal spline $\sigma_1^{(3.5)}$ in $H^{3.5}_\varepsilon(\mathbb R^2)$:
$$ \sigma_1^{(3.5)} = \operatorname{argmin}\bigl\{ \|\varphi\|^2_{H^{3.5}_\varepsilon} : (11), (12), \ \forall\varphi \in H^{3.5}_\varepsilon(\mathbb R^2) \bigr\}, \qquad \sigma_1^{(3.5)} = \mu_1 h_1 + \mu'_1 h'_1, $$
where
$$ h_1(\eta_1,\eta_2,\varepsilon) = \exp\bigl(-\varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr)\bigl(3 + 3\varepsilon\sqrt{\eta_1^2+\eta_2^2} + \varepsilon^2(\eta_1^2+\eta_2^2)\bigr), $$
$$ h'_1(\eta_1,\eta_2,\varepsilon) = \varepsilon^2\exp\bigl(-\varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr)\bigl(1 + \varepsilon\sqrt{\eta_1^2+\eta_2^2}\bigr)\,(\eta_1+\eta_2), $$
and coefficients (μ1,μ′1) are defined from the system:
$$ \begin{bmatrix} 3 & 0 \\ 0 & 2\varepsilon^2 \end{bmatrix} \begin{bmatrix} \mu_1 \\ \mu'_1 \end{bmatrix} = \begin{bmatrix} 0 \\ 2 \end{bmatrix}. $$
Therefore
$$ \sigma_1^{(3.5)}(x,y,\varepsilon) = \exp\bigl(-\varepsilon\sqrt{x^2+y^2}\bigr)\bigl(1 + \varepsilon\sqrt{x^2+y^2}\bigr)\,(x+y), \qquad (x,y) \in \mathbb R^2. $$
As the last example, consider the problem of reconstructing a continuously differentiable function $\varphi(x)$, $x \in \mathbb R$, which satisfies the constraint
$$ \frac{d\varphi}{dx}(0) = 1 \tag{19} $$
and is closest to the function $z(x) = 2x$, $x \in \mathbb R$. We approximate it by constructing a normal spline $\sigma_1^{(2)}$ in $H^2_\varepsilon(\mathbb R)$:
$$ \sigma_1^{(2)} = \operatorname{argmin}\bigl\{ \|\varphi - z\|^2_{H^2_\varepsilon} : (19), \ \forall\varphi \in H^2_\varepsilon(\mathbb R) \bigr\}, \qquad \sigma_1^{(2)} = z + \mu'_1 h'_1 = 2x + \mu'_1 h'_1. $$
Performing calculations analogous to the previous ones, we receive:
$$ \sigma_1^{(2)}(x,\varepsilon) = 2x - x\exp(-\varepsilon|x|), \qquad x \in \mathbb R. $$
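A quick numerical check of the last formula (not in the original post): the central difference of $\sigma_1^{(2)}$ at $x = 0$ reproduces the constraint (19).

```python
import numpy as np

eps, h = 1.0, 1e-6
sigma = lambda x: 2 * x - x * np.exp(-eps * np.abs(x))
print((sigma(h) - sigma(-h)) / (2 * h))   # ~= 1.0, i.e. d(sigma)/dx(0) = 1
```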
Tuesday, January 29, 2019
Hermite-Birkhoff Interpolation of Scattered Data
Suppose $P_1 = \{p_i\}_{i=1}^{M_1}$ and $P_2 = \{p'_i\}_{i=1}^{M_2}$, $p_i, p'_i \in \mathbb R^n$, are sets of data locations lying in a domain $\Omega \subset \mathbb R^n$.
Let the functional $f_i$ be the value of a function $\varphi$ at a node $p_i \in P_1$, and the functional $f'_i$ be the value of the directional derivative (the slope) of that function at a node $p'_i \in P_2$.
Assume the values of the functionals $f_i$ and $f'_i$ at the nodes belonging to $P_1$ and $P_2$ are known:
$$ (f_i,\varphi) = \varphi(p_i) = u_i, \quad p_i \in P_1, \ 1 \le i \le M_1, \tag{1} $$
$$ (f'_i,\varphi) = \frac{\partial\varphi}{\partial e_i}(p'_i) = u'_i, \quad p'_i \in P_2, \ 1 \le i \le M_2, \tag{2} $$
where $\frac{\partial\varphi}{\partial e_i}(p'_i) = \sum_{k=1}^{n} \frac{\partial\varphi}{\partial x_k}(p'_i)\cos\theta_{ik}$, and $\cos\theta_{ik}$, $1 \le k \le n$, are the direction cosines of a unit vector $e_i$.
Consider the problem of reconstructing the function $\varphi$. Let the function to be approximated belong to the Hilbert space $H^s_\varepsilon(\mathbb R^n)$, where $s > r + n/2$, $r = 1, 2, \dots$. In that case the space $H^s_\varepsilon$ is continuously embedded in the Hölder space $C^r_b(\mathbb R^n)$, so $\varphi \in C^r(\mathbb R^n)$, and the values of the function $\varphi$ and of its partial derivatives at fixed points are linear continuous functionals on $H^s_\varepsilon$.
We assume that $P_1$ does not contain coincident nodes and that no two directional derivatives defined at the same node of $P_2$ have linearly dependent directions; a node from $P_1$ may coincide with a node from $P_2$. Under these restrictions the functionals $f_i, f'_i$ are linearly independent.
In accordance with the Riesz representation theorem [1], the linear continuous functionals $f_i, f'_i$ can be represented in the form of inner products of some elements $h_i, h'_i \in H^s_\varepsilon$ with $\varphi$, for any $\varphi \in H^s_\varepsilon$:
$$ (f_i,\varphi) = \langle h_i,\varphi\rangle_{H^{s_r}_\varepsilon}, \qquad (f'_i,\varphi) = \langle h'_i,\varphi\rangle_{H^{s_r}_\varepsilon}, \qquad \forall\varphi \in H^s_\varepsilon. \tag{3} $$
The notation $\langle\cdot,\cdot\rangle_{H^{s_r}_\varepsilon}$ in (3) denotes the inner product in the space $H^{s_r}_\varepsilon$ (see the previous post for details). It is easy to see that under the above assumptions the elements $h_i, h'_i$ are linearly independent.
Let $V_{s_r}(\eta,x,\varepsilon)$, $s_r = r + n/2 + 1/2$, $r = 1, 2, \dots$, be the reproducing kernel of the Hilbert space $H^{s_r}_\varepsilon(\mathbb R^n)$ (see the previous post). The kernel $V_{s_r}(\eta,x,\varepsilon)$, as a function of $\eta \in \mathbb R^n$, belongs to $C^r(\mathbb R^n)$. In accordance with the reproducing property of the reproducing kernel, for any $\varphi \in H^{s_r}_\varepsilon(\mathbb R^n)$ and $x \in \mathbb R^n$ the identity
$$ \varphi(x) = \langle V_{s_r}(\cdot,x,\varepsilon), \varphi\rangle_{H^{s_r}_\varepsilon} \tag{4} $$
holds. This means that $V_{s_r}(\cdot,x)$ is the Riesz representer of the evaluation functional $\delta_x$ defined as the value of a function $\varphi \in H^{s_r}_\varepsilon$ at a given point $x \in \mathbb R^n$. As $V_{s_r}$ is a continuously differentiable radial basis function, we can differentiate the identity (4) and receive the identity
$$ \frac{\partial\varphi(x)}{\partial x_i} = \left\langle \frac{\partial V_{s_r}(\cdot,x,\varepsilon)}{\partial x_i}, \varphi \right\rangle_{H^{s_r}_\varepsilon}, \tag{5} $$
which holds for any $\varphi \in H^{s_r}_\varepsilon$ and $x \in \mathbb R^n$. It means that the function $\frac{\partial V_{s_r}(\cdot,x)}{\partial x_i}$ represents the point-wise functional defined as the value of $\frac{\partial\varphi(\cdot)}{\partial x_i}$ at the point $x$.
Thereby
$$ h_i(\eta,\varepsilon) = V_{s_r}(\eta,p_i,\varepsilon), \quad p_i \in P_1, \ 1 \le i \le M_1, \tag{6} $$
$$ h'_i(\eta,\varepsilon) = \frac{\partial V_{s_r}(\eta,p'_i,\varepsilon)}{\partial e_i}, \quad p'_i \in P_2, \ 1 \le i \le M_2, \quad \eta \in \mathbb R^n, \tag{7} $$
and the system of constraints (1), (2) can be presented as a system of linear equations in the Hilbert space $H^{s_r}_\varepsilon$:
$$ \langle h_i,\varphi\rangle_{H^{s_r}_\varepsilon} = u_i, \quad 1 \le i \le M_1, \tag{8} $$
$$ \langle h'_i,\varphi\rangle_{H^{s_r}_\varepsilon} = u'_i, \quad 1 \le i \le M_2, \qquad h_i, h'_i, \varphi \in H^{s_r}_\varepsilon(\mathbb R^n). \tag{9} $$
This system of constraints defines a convex closed set in the Hilbert space, as an intersection of a finite number of hyperplanes (this set is not empty because of the linear independence of the elements $h_i, h'_i$).
The problem of reconstructing a function $\varphi$ satisfying the system of constraints (8), (9) is underdetermined. We reformulate it as the problem of finding the solution of the system (8), (9) that has minimal norm:
$$ \sigma_1 = \operatorname{argmin}\bigl\{ \|\varphi\|^2_{H^{s_r}_\varepsilon} : (8), (9), \ \forall\varphi \in H^{s_r}_\varepsilon(\mathbb R^n) \bigr\}; \tag{10} $$
such a solution exists and is unique [1, 6]. The element $\sigma_1$ is an interpolating normal spline.
In accordance with the generalized Lagrange method ([5], [6]), the solution of problem (10) can be written as follows:
$$ \sigma_1 = \sum_{j=1}^{M_1} \mu_j h_j + \sum_{j=1}^{M_2} \mu'_j h'_j, \tag{11} $$
here the coefficients $\mu_i, \mu'_i$ are defined by the system
$$ \sum_{j=1}^{M_1} g_{ij}\mu_j + \sum_{j=1}^{M_2} g'_{ij}\mu'_j = u_i, \quad 1 \le i \le M_1, \tag{12} $$
$$ \sum_{j=1}^{M_1} g'_{ji}\mu_j + \sum_{j=1}^{M_2} g''_{ij}\mu'_j = u'_i, \quad 1 \le i \le M_2, \tag{13} $$
where $g_{ij}, g'_{ij}, g''_{ij}$ are the coefficients of the symmetric positive definite Gram matrix of the set of linearly independent elements $h_i, h'_i \in H^{s_r}_\varepsilon(\mathbb R^n)$:
$$ g_{ij} = \langle h_i,h_j\rangle_{H^{s_r}_\varepsilon} = (f_i,h_j), \quad 1 \le i \le M_1, \ 1 \le j \le M_1, $$
$$ g'_{ij} = \langle h_i,h'_j\rangle_{H^{s_r}_\varepsilon} = (f_i,h'_j), \quad 1 \le i \le M_1, \ 1 \le j \le M_2, $$
$$ g''_{ij} = \langle h'_i,h'_j\rangle_{H^{s_r}_\varepsilon} = (f'_i,h'_j), \quad 1 \le i \le M_2, \ 1 \le j \le M_2. $$
Taking into account (3), (6), (7), we receive:
$$ g_{ij} = V_{s_r}(p_i,p_j,\varepsilon), \quad p_i,p_j \in P_1, \ 1 \le i,j \le M_1, $$
$$ g'_{ij} = \frac{\partial V_{s_r}(p_i,p'_j,\varepsilon)}{\partial e_j} = \sum_{k=1}^{n} \frac{\partial V_{s_r}(p_i,p'_j,\varepsilon)}{\partial x_k}\cos\theta_{jk}, \quad p_i \in P_1, \ p'_j \in P_2, \tag{18} $$
$$ g''_{ij} = \frac{\partial^2 V_{s_r}(p'_i,p'_j,\varepsilon)}{\partial e_i\,\partial e_j} = \sum_{l=1}^{n}\sum_{k=1}^{n} \frac{\partial^2 V_{s_r}(p'_i,p'_j,\varepsilon)}{\partial\eta_l\,\partial x_k}\cos\theta_{il}\cos\theta_{jk}, \quad p'_i,p'_j \in P_2. \tag{19} $$
Thus problem (10) is reduced to solving the system of linear equations (12), (13) with the Gram matrix defined by (18), (19), and constructing the normal spline (11).
Assume the function to be approximated, $\varphi$, belongs to the Hilbert space $H^{s_1}_\varepsilon(\mathbb R^n)$, $s_1 = 1 + n/2 + 1/2$ (see the previous post). It is a continuously differentiable function, $\varphi \in C^1(\mathbb R^n)$. Write down the expressions for $h_j, h'_j, g_{ij}, g'_{ij}, g''_{ij}$ in the space $H^{s_1}_\varepsilon(\mathbb R^n)$:
$$ h_j(\eta,\varepsilon) = \exp(-\varepsilon|\eta-p_j|)\,(1+\varepsilon|\eta-p_j|), \quad p_j \in P_1, \ 1 \le j \le M_1, $$
$$ h'_j(\eta,\varepsilon) = \varepsilon^2\exp(-\varepsilon|\eta-p'_j|)\sum_{k=1}^{n}(\eta_k - p'_{jk})\cos\theta_{jk}, \quad p'_j \in P_2, \ 1 \le j \le M_2, \quad \eta \in \mathbb R^n; $$
then
$$ g_{ij} = \exp(-\varepsilon|p_i-p_j|)\,(1+\varepsilon|p_i-p_j|), \quad p_i,p_j \in P_1, \ 1 \le i,j \le M_1, $$
$$ g'_{ij} = \varepsilon^2\exp(-\varepsilon|p_i-p'_j|)\sum_{k=1}^{n}(p_{ik} - p'_{jk})\cos\theta_{jk}, \quad p_i \in P_1, \ p'_j \in P_2, $$
$$ g''_{ij} = \sum_{l=1}^{n}\sum_{k=1}^{n} \frac{\partial^2 V_{s_1}(p'_i,p'_j,\varepsilon)}{\partial\eta_l\,\partial x_k}\cos\theta_{il}\cos\theta_{jk}, \quad p'_i,p'_j \in P_2, \ i \ne j, $$
where
$$ \frac{\partial^2 V_{s_1}(p'_i,p'_j,\varepsilon)}{\partial\eta_k\,\partial x_k} = \varepsilon^2\exp(-\varepsilon|p'_i-p'_j|)\left(1 - \frac{\varepsilon\,(p'_{ik}-p'_{jk})^2}{|p'_i-p'_j|}\right), $$
$$ \frac{\partial^2 V_{s_1}(p'_i,p'_j,\varepsilon)}{\partial\eta_l\,\partial x_k} = -\varepsilon^3\exp(-\varepsilon|p'_i-p'_j|)\,\frac{(p'_{ik}-p'_{jk})(p'_{il}-p'_{jl})}{|p'_i-p'_j|}, \quad l \ne k, $$
and
$$ g''_{ii} = \varepsilon^2\sum_{l=1}^{n}\cos^2\theta_{il} = \varepsilon^2, \quad 1 \le i \le M_2. $$
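These formulas translate directly into code. Below is a hedged numpy sketch (the function name and argument conventions are placeholders) that assembles the full Gram matrix for the $H^{s_1}_\varepsilon$ kernel from value nodes `P1` and directional-derivative nodes `P2` with unit directions `E2`:

```python
import numpy as np

def gram_s1(P1, P2, E2, eps):
    """Gram matrix for V(x,y) = exp(-eps r)(1 + eps r): value constraints
    at the rows of P1, directional-derivative constraints at the rows of P2
    along the unit vectors in E2 (one row per node)."""
    M1, M2, n = len(P1), len(P2), P1.shape[1]
    G = np.zeros((M1 + M2, M1 + M2))
    for i in range(M1):
        for j in range(M1):                      # g_ij
            r = np.linalg.norm(P1[i] - P1[j])
            G[i, j] = np.exp(-eps * r) * (1 + eps * r)
    for i in range(M1):
        for j in range(M2):                      # g'_ij
            d = P1[i] - P2[j]
            g = eps**2 * np.exp(-eps * np.linalg.norm(d)) * (d @ E2[j])
            G[i, M1 + j] = G[M1 + j, i] = g
    for i in range(M2):
        for j in range(M2):                      # g''_ij
            if i == j:
                G[M1 + i, M1 + j] = eps**2
                continue
            d = P2[i] - P2[j]
            r = np.linalg.norm(d)
            # matrix of mixed second derivatives d2V/(deta_l dx_k)
            D2V = eps**2 * np.exp(-eps * r) * (np.eye(n) - eps * np.outer(d, d) / r)
            G[M1 + i, M1 + j] = E2[i] @ D2V @ E2[j]
    return G
```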
Consider the case when the function to be approximated, $\varphi$, may be treated as an element of the Hilbert space $H^{s_2}_\varepsilon(\mathbb R^n)$, $s_2 = 2 + n/2 + 1/2$. In this case it is a twice continuously differentiable function, $\varphi \in C^2(\mathbb R^n)$. Let's write down the expressions for $h_j, h'_j$ in the space $H^{s_2}_\varepsilon(\mathbb R^n)$:
$$ h_j(\eta,\varepsilon) = \exp(-\varepsilon|\eta-p_j|)\bigl(3 + 3\varepsilon|\eta-p_j| + \varepsilon^2|\eta-p_j|^2\bigr), \quad p_j \in P_1, \ 1 \le j \le M_1, $$
$$ h'_j(\eta,\varepsilon) = \varepsilon^2\exp(-\varepsilon|\eta-p'_j|)\,(1+\varepsilon|\eta-p'_j|)\sum_{k=1}^{n}(\eta_k - p'_{jk})\cos\theta_{jk}, \quad p'_j \in P_2, \ 1 \le j \le M_2, \quad \eta \in \mathbb R^n, $$
and the corresponding Gram matrix elements $g_{ij}, g'_{ij}, g''_{ij}$ are as follows:
$$ g_{ij} = \exp(-\varepsilon|p_i-p_j|)\bigl(3 + 3\varepsilon|p_i-p_j| + \varepsilon^2|p_i-p_j|^2\bigr), \quad p_i,p_j \in P_1, \ 1 \le i \le M_1, $$
$$ g'_{ij} = \varepsilon^2\exp(-\varepsilon|p_i-p'_j|)\,(1+\varepsilon|p_i-p'_j|)\sum_{k=1}^{n}(p_{ik} - p'_{jk})\cos\theta_{jk}, \quad p_i \in P_1, \ p'_j \in P_2, $$
$$ g''_{ij} = \sum_{l=1}^{n}\sum_{k=1}^{n} \frac{\partial^2 V_{s_2}(p'_i,p'_j,\varepsilon)}{\partial\eta_l\,\partial x_k}\cos\theta_{il}\cos\theta_{jk}, \quad p'_i,p'_j \in P_2, \ i \ne j, $$
where
$$ \frac{\partial^2 V_{s_2}(p'_i,p'_j,\varepsilon)}{\partial\eta_k\,\partial x_k} = \varepsilon^2\exp(-\varepsilon|p'_i-p'_j|)\bigl(1 + \varepsilon|p'_i-p'_j| - \varepsilon^2(p'_{ik}-p'_{jk})^2\bigr), $$
$$ \frac{\partial^2 V_{s_2}(p'_i,p'_j,\varepsilon)}{\partial\eta_l\,\partial x_k} = -\varepsilon^4\exp(-\varepsilon|p'_i-p'_j|)\,(p'_{ik}-p'_{jk})(p'_{il}-p'_{jl}), \quad l \ne k, $$
and
$$ g''_{ii} = \varepsilon^2\sum_{l=1}^{n}\cos^2\theta_{il} = \varepsilon^2, \quad 1 \le i \le M_2. $$
If the original task does not contain constraints involving the derivatives of $\varphi$ ($P_2$ is an empty set, so there are no constraints (2) in (1)–(2)), we may construct an interpolating normal spline assuming $\varphi$ is an element of the Hilbert space $H^{s_0}_\varepsilon(\mathbb R^n)$, $s_0 = n/2 + 1/2$. In this case it is a continuous function, $\varphi \in C(\mathbb R^n)$, and the expressions for $h_i, g_{ij}$ are defined by:
$$ h_i(\eta,\varepsilon) = \exp(-\varepsilon|\eta-p_i|), \quad \eta \in \mathbb R^n, \ p_i \in P_1, \ 1 \le i \le M_1, $$
$$ g_{ij} = \exp(-\varepsilon|p_j-p_i|), \quad p_i,p_j \in P_1, \ 1 \le i \le M_1, \ 1 \le j \le M_1. $$
When the value of the parameter $\varepsilon$ is small, this spline is similar to a multivariate generalization of the one-dimensional linear spline.
We now consider the choice of values for the parameters $s$ and $\varepsilon$. The parameter $s$ defines the smoothness of the spline: if $s > n/2$ then the spline belongs to $C^{[s-n/2]}$, where $[\cdot]$ denotes the integer part of a number. Moreover, discontinuities of derivatives of higher order are observed only at the nodes $p_i, p'_i$; at all other points of $\mathbb R^n$ the spline is infinitely differentiable. With increasing $s$ the condition number of the Gram matrix increases, and the computational properties of the problem of approximating $\varphi$ worsen. Thereby it makes sense to choose a reasonably small value for the parameter $s$.
A normal spline is similar to Duchon's $D^m$-spline [3] when the value of $\varepsilon$ is small. The approximating properties of the spline improve with smaller values of $\varepsilon$; however, with decreasing $\varepsilon$ the condition number of the Gram matrix increases. Therefore, when choosing the value of $\varepsilon$, a compromise is needed: in practice, it is necessary to choose a value of $\varepsilon$ for which the condition number of the Gram matrix remains small enough. Numerical procedures for estimating a matrix condition number are well known (e.g. [4]).
Also, as a rule, it is useful to preprocess the source data of the problem by scaling the domain $\Omega \subset \mathbb R^n$, in which the interpolation nodes $p_i, p'_i$ are located, into a unit hypercube.
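The trade-off in $\varepsilon$ can be observed directly (a small illustration, not from the original text): the condition number of the Gram matrix of the $H^{s_0}_\varepsilon$ kernel grows as $\varepsilon$ decreases.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20)
for eps in (10.0, 1.0, 0.1, 0.01):
    G = np.exp(-eps * np.abs(x[:, None] - x[None, :]))   # g_ij for the s0 kernel
    print(f"eps = {eps:5}:  cond(G) = {np.linalg.cond(G):.3e}")
```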
Often the accurate values of $u_i, u'_i$ in (1), (2) are not known. Assuming that uniform error bounds $\delta_i, \delta'_i$ are given, we can formulate the problem of approximating $\varphi$ by constructing a uniform smoothing normal spline $\sigma_2$:
$$ \sigma_2 = \operatorname{argmin}\bigl\{ \|\varphi\|^2_{H^{s_r}_\varepsilon} : (35), (36), \ \forall\varphi \in H^{s_r}_\varepsilon(\mathbb R^n) \bigr\}, \tag{34} $$
$$ |\varphi(p_i) - u_i| \le \delta_i, \quad p_i \in P_1, \ 1 \le i \le M_1, \tag{35} $$
$$ \left|\frac{\partial\varphi}{\partial e_i}(p'_i) - u'_i\right| \le \delta'_i, \quad p'_i \in P_2, \ 1 \le i \le M_2. \tag{36} $$
In the case when a mean squared error $\delta^2$ is known, an approximating mse-smoothing normal spline $\sigma_3$ can be constructed as the solution of the following variational problem:
$$ \sigma_3 = \operatorname{argmin}\bigl\{ \|\varphi\|^2_{H^{s_r}_\varepsilon} : (38), \ \forall\varphi \in H^{s_r}_\varepsilon(\mathbb R^n) \bigr\}, \tag{37} $$
$$ \sum_{p_i\in P_1} \bigl(\varphi(p_i) - u_i\bigr)^2 + \sum_{p'_i\in P_2} \left(\frac{\partial\varphi}{\partial e_i}(p'_i) - u'_i\right)^2 \le \delta^2. \tag{38} $$
Constraints (35), (36) and (38) define convex closed sets in the Hilbert space $H^{s_r}_\varepsilon(\mathbb R^n)$, $s_r = r + n/2 + 1/2$, $r = 1, 2, \dots$. Under the assumptions made earlier these sets are not empty; thus problems (34) and (37) are problems of finding the projection of the zero element of the Hilbert space onto a nonempty closed convex set. The solution of such a problem exists and is unique.
Algorithms for solving problems (34) and (37) will be described in the next posts.
Earlier, the problem of constructing multivariate normal splines was treated in [2], [7], [8] and presented at [9] and [10].

References
[1] A. Balakrishnan, Applied Functional Analysis, New York, Springer-Verlag, 1976.
[2] V. Gorbunov, I. Kohanovsky, K. Makedonsky, Normal splines in reconstruction of multi-dimensional dependencies. Papers of WSEAS International Conference on Applied Mathematics, Numerical Analysis Symposium, Corfu, 2004. (http://www.wseas.us/e-library/conferences/corfu2004/papers/488-312.pdf)
[3] J. Duchon, Splines minimizing rotation-invariant semi-norms in Sobolev spaces, Lect. Notes in Math., Vol. 571, Springer, Berlin, 1977
[4] W.Hager, Condition Estimates, SIAM Journal on Scientific and Statistical Computing, Vol.5, N.2, 1984
[5] A. Ioffe, V. Tikhomirov, Theory of extremal problems, North-Holland, Amsterdam, 1979.
[6] P.-J. Laurent, Approximation et optimisation, Paris, 1972.
[7] I. Kohanovsky, Normal Splines in Computing Tomography, Avtometriya, 1995, N 2. (https://www.iae.nsk.su/images/stories/5_Autometria/5_Archives/1995/2/84-89.pdf)
[8] I. Kohanovsky, Data approximation using multidimensional normal splines. Unpublished manuscript. 1996.
[9] I. Kohanovsky, Multidimensional Normal Splines and Problem of Physical Field Approximation, International Conference on Fourier Analysis and its Applications, Kuwait, 1998.
[10] I. Kohanovsky, Normal splines in fractional order Sobolev spaces and some of its applications, The Third Siberian Congress on Applied and Industrial Mathematics (INPRIM-98), Novosibirsk, 1998.