Wuhan Univ. J. Nat. Sci.
Volume 29, Number 5, October 2024
Page(s) 403-411
DOI: https://doi.org/10.1051/wujns/2024295403
Published online: 20 November 2024

© Wuhan University 2024

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

0 Introduction

Unconstrained optimization methods are widely used in the fields of nonlinear dynamic systems and engineering computation to obtain numerical solutions of optimal control problems [1,2]. In this paper, we consider the following unconstrained optimization problem:

$\min f(x), \quad x \in \mathbb{R}^n$,   (1)

where $f: \mathbb{R}^n \to \mathbb{R}$ is a sufficiently smooth function, possibly of high dimension. Nonlinear conjugate gradient (CG) methods are highly useful for solving this kind of problem because they avoid storing matrices [3,4]. In general, the iterates of a CG method are generated by

$x_{k+1} = x_k + \alpha_k d_k$,   (2)

where $\alpha_k$ is the step size along the search direction $d_k$, and $d_k$ is defined by

$d_k = \begin{cases} -g_k, & k = 1, \\ -g_k + \beta_k d_{k-1}, & k \ge 2, \end{cases}$   (3)

where $g_k$ denotes the gradient $\nabla f(x_k)$ and $\beta_k$ is a scalar; different choices of $\beta_k$ yield different CG methods. Some well-known formulas for $\beta_k$ are given by

$\beta_k^{HS} = \dfrac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}$   (Hestenes-Stiefel (HS) method [5]),

$\beta_k^{FR} = \dfrac{\|g_k\|^2}{\|g_{k-1}\|^2}$   (Fletcher-Reeves (FR) method [6]),

$\beta_k^{PRP} = \dfrac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}$   (Polak-Ribière-Polyak (PRP) method [7]),

$\beta_k^{CD} = -\dfrac{\|g_k\|^2}{d_{k-1}^T g_{k-1}}$   (Conjugate Descent (CD) method [8]),

$\beta_k^{LS} = \dfrac{g_k^T y_{k-1}}{-d_{k-1}^T g_{k-1}}$   (Liu-Storey (LS) method [9]),

and

$\beta_k^{DY} = \dfrac{\|g_k\|^2}{d_{k-1}^T y_{k-1}}$   (Dai-Yuan (DY) method [10]),

where $\|\cdot\|$ is the Euclidean norm, "T" stands for the transpose, and $y_{k-1} = g_k - g_{k-1}$. In general, the DY and FR methods based on inexact line search have good convergence properties, but their computational efficiency is not as good as that of the PRP method. In order to establish methods with both good numerical performance and good convergence properties, many scholars have proposed improved conjugate gradient methods in recent years [11-14]. They showed that a method is globally convergent if the following strong Wolfe line search conditions on $\alpha_k$ are satisfied:

$\begin{cases} f(x_k + \alpha_k d_k) - f(x_k) \le \rho \alpha_k g_k^T d_k, \\ |g(x_k + \alpha_k d_k)^T d_k| \le -\sigma g_k^T d_k, \end{cases}$   (4)

where $0 < \rho < \sigma < 1$.
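For illustration, conditions (4) are straightforward to check numerically. The following MATLAB sketch is our own illustrative helper (the name strong_wolfe and the handles f and gradf are assumptions, not code from any cited reference); it tests whether a candidate step size satisfies both inequalities in (4):

```matlab
function ok = strong_wolfe(f, gradf, x, d, alpha, rho, sigma)
% Illustrative check of the strong Wolfe conditions (4).
% f and gradf are function handles; x and d are column vectors;
% rho and sigma satisfy 0 < rho < sigma < 1.
    g   = gradf(x);                                          % g_k
    ok1 = f(x + alpha*d) - f(x) <= rho * alpha * (g' * d);   % sufficient decrease
    gn  = gradf(x + alpha*d);                                % gradient at trial point
    ok2 = abs(gn' * d) <= -sigma * (g' * d);                 % curvature condition
    ok  = ok1 && ok2;
end
```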

The step size in a conjugate gradient method is usually chosen by a strong Wolfe line search; however, the line search usually brings a computational burden, especially when solving large-scale nonlinear unconstrained optimization problems. To overcome this problem, Sun and Zhang [3] introduced five CG methods without line search, in which the line search step is replaced by the simple step-size formula

$\alpha_k = -\delta \dfrac{g_k^T d_k}{\|d_k\|_{Q_k}^2}$,   (5)

where $\|d_k\|_{Q_k}^2 = d_k^T Q_k d_k$, $\delta \in (0, v_{\min}/\mu)$ is chosen such that $\delta\mu/v_{\min} < 1$, $\mu$ is the Lipschitz constant defined in Assumption 1 below, and $\{Q_k\}$ is a sequence of positive definite matrices for which there exist positive constants $v_{\min} > 0$ and $v_{\max} > 0$ such that $v_{\min} d^T d \le d^T Q_k d \le v_{\max} d^T d$ for all $d \in \mathbb{R}^n$.
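As a concrete illustration, once $d_k$ is available, (5) is a single explicit expression; no iterative search is involved. A minimal MATLAB sketch with toy values and the simplest admissible choice $Q_k = I$ (so that $v_{\min} = v_{\max} = 1$):

```matlab
% Illustrative evaluation of the step-length formula (5) with Qk = I.
g = [1; -2];  d = -g;       % a toy gradient and descent direction
delta = 1;                  % delta must satisfy delta * mu / vmin < 1
Qk = eye(numel(g));         % Qk = I gives vmin = vmax = 1
alpha = -delta * (g' * d) / (d' * Qk * d)   % here alpha = 1
```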

Chen and Sun [4] extended this technique to a two-parameter family of CG methods, which can be deemed a generalization of the HS and LS methods without line search. Yu [15] and Narushima [16] applied this technique to memory gradient methods and obtained global convergence, respectively. Yin and Chen [17] showed that three-term CG methods without line search are globally convergent. For other modified CG methods without line search, the reader may see Refs. [18-20].

As a supplement to these results, in this paper we continue this line of research and show that the proposed CG method without line search, in which the line search step is replaced by the fixed step-length formula (5), is globally convergent for the following new $\beta_k$:

$\beta_k = \dfrac{\|g_k\|^2}{\omega_k \|g_{k-1}\|^2 + \mu_k d_{k-1}^T y_{k-1}}$,   (6)

where $\omega_k \ge 0$ and $\mu_k \ge 0$ are not both zero. This motivation mainly comes from Refs. [3,4]. Our main aim is to develop a new method with better properties while keeping its simple structure. It should be noted that formula (6) includes some important special sub-classes. For example, when $\omega_k = 0$ and $\mu_k = 1$, the proposed CG method can be deemed the DY method without line search; when $\omega_k = 1$ and $\mu_k = 0$, it can be deemed the FR method without line search. Hence, the proposed CG method (6) without line search can be deemed a generalization of the DY and FR methods.
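For illustration, (6) is a one-line computation per iteration. The following MATLAB sketch (the function name and argument names are ours) also records the two special cases:

```matlab
function beta = beta_new(g, gprev, dprev, omega, mu)
% Illustrative evaluation of formula (6). With omega = 0, mu = 1 it
% reduces to a DY-type beta; with omega = 1, mu = 0 to an FR-type beta.
    y = g - gprev;                                        % y_{k-1}
    beta = norm(g)^2 / (omega * norm(gprev)^2 + mu * (dprev' * y));
end
```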

The rest of this paper is organized as follows. In Section 1, we first give some preliminary results on the CG method without line search and discuss the sufficient descent property of our method; we then prove the global convergence of the proposed method without line search and present the algorithm frame. Numerical experiments are given for illustrative purposes in Section 2. Conclusions appear in Section 3.

1 Analysis of Global Convergence and Algorithm Frame

1.1 Analysis of Global Convergence

We adopt the following assumptions on the objective function, which are commonly used in the literature.

Assumption 1 [3]   The objective function $f$ is $LC^1$ in a neighborhood $\Omega$ of the level set $L := \{x \in \mathbb{R}^n \mid f(x) \le f(x_1)\}$ and $L$ is bounded. Here, by $LC^1$ we mean that the gradient $\nabla f(x)$ is Lipschitz continuous with modulus $\mu$, i.e., there exists $\mu > 0$ such that

$\|\nabla f(x_{k+1}) - \nabla f(x_k)\| \le \mu \|x_{k+1} - x_k\|$

for any $x_{k+1}, x_k \in \Omega$.

Assumption 2 [3]   The function $f$ is $LC^1$ and strongly convex on $\Omega$. In other words, there exists $\lambda > 0$ such that

$[\nabla f(x_{k+1}) - \nabla f(x_k)]^T (x_{k+1} - x_k) \ge \lambda \|x_{k+1} - x_k\|^2 \quad \text{for any } x_{k+1}, x_k \in \Omega$.

Lemma 1   Suppose that $\{x_k\}$ is given by (2), (3) and (5). Then

$g_{k+1}^T d_k = \rho_k g_k^T d_k$   (7)

holds for all k, where

$\rho_k = 1 - \delta \phi_k \dfrac{\|d_k\|^2}{\|d_k\|_{Q_k}^2}$   (8)

and

$\phi_k = \begin{cases} 0, & \text{for } \alpha_k = 0, \\ \dfrac{(g_{k+1} - g_k)^T (x_{k+1} - x_k)}{\|x_{k+1} - x_k\|^2}, & \text{for } \alpha_k \ne 0. \end{cases}$   (9)

Proof   See Lemma 1 in Ref. [3].

Lemma 2   Suppose that Assumption 1 holds and that $\{x_k\}$ is given by (2), (3) and (5). Then

$\liminf_{k \to \infty} \|g_k\| \ne 0 \ \text{ implies } \ \sum_{d_k \ne 0} \dfrac{\|g_k\|^4}{\|d_k\|^2} < \infty$.   (10)

Proof   See Lemma 5 in Ref. [3].

Lemma 3   Suppose that Assumption 2 holds and that $\beta_k$ is given by (6). Then $g_k^T d_k \le -\|g_k\|^2$.

Proof   We prove Lemma 3 by induction. For $k = 1$, it is easy to obtain $g_1^T d_1 = -\|g_1\|^2 \le -\|g_1\|^2$. Suppose that $g_{k-1}^T d_{k-1} \le -\|g_{k-1}\|^2$ holds; we now prove that $g_k^T d_k \le -\|g_k\|^2$ holds.

By (3), (6) and Lemma 1, we have

$g_k^T d_k = g_k^T(-g_k + \beta_k d_{k-1}) = g_k^T\left(-g_k + \dfrac{\|g_k\|^2}{\omega_k \|g_{k-1}\|^2 + \mu_k d_{k-1}^T y_{k-1}} d_{k-1}\right) = -\|g_k\|^2 + \dfrac{\|g_k\|^2}{\omega_k \|g_{k-1}\|^2 + \mu_k d_{k-1}^T y_{k-1}} g_k^T d_{k-1} = -\|g_k\|^2 + \dfrac{\|g_k\|^2 \rho_{k-1} g_{k-1}^T d_{k-1}}{\omega_k \|g_{k-1}\|^2 + \mu_k d_{k-1}^T y_{k-1}}$.

From Corollary 3 in Ref. [3], we know $0 < \rho_{k-1} \le 1$. From the induction hypothesis, we have $\rho_{k-1} g_{k-1}^T d_{k-1} \le -\rho_{k-1} \|g_{k-1}\|^2 \le 0$. Furthermore, one has

$\mu_k d_{k-1}^T y_{k-1} = \mu_k d_{k-1}^T (g_k - g_{k-1}) = \mu_k (d_{k-1}^T g_k - d_{k-1}^T g_{k-1}) = \mu_k (\rho_{k-1} g_{k-1}^T d_{k-1} - g_{k-1}^T d_{k-1}) = \mu_k (\rho_{k-1} - 1) g_{k-1}^T d_{k-1} \ge 0$,

which is due to $\mu_k \ge 0$, $\rho_{k-1} - 1 \le 0$ and $g_{k-1}^T d_{k-1} \le 0$. From the above analysis, we have

$\dfrac{\|g_k\|^2 \rho_{k-1} g_{k-1}^T d_{k-1}}{\omega_k \|g_{k-1}\|^2 + \mu_k d_{k-1}^T y_{k-1}} \le 0$,

since $\omega_k$ and $\mu_k$ are not both zero. Thus, we obtain $g_k^T d_k = -\|g_k\|^2 + \dfrac{\|g_k\|^2 \rho_{k-1} g_{k-1}^T d_{k-1}}{\omega_k \|g_{k-1}\|^2 + \mu_k d_{k-1}^T y_{k-1}} \le -\|g_k\|^2$. This finishes our proof.

If there exists a constant $c > 0$ such that $g_k^T d_k \le -c\|g_k\|^2$ for all $k \ge 1$, we say that the search direction $d_k$ of the method satisfies the sufficient descent condition. Lemma 3 means that the search direction $d_k$ is a sufficient descent direction, which is often required in the convergence analysis of CG methods.

Theorem 1   Suppose that Assumptions 1 and 2 hold and that $\beta_k$ is given by (6). Then

$\|d_k\|^2 \le \dfrac{\|g_k\|^4 \|d_{k-1}\|^2}{\omega_k \|g_{k-1}\|^4} + H \|g_k\|^2$,

where $H$ is a positive constant.

Proof   By (3), we have

$d_k = -g_k + \beta_k d_{k-1}$;

squaring both sides, we obtain

$\|d_k\|^2 = \|-g_k + \beta_k d_{k-1}\|^2$.

By Lemma 1 and substituting expression (6) into the above formula, we have

$\|d_k\|^2 = \left\| -g_k + \dfrac{\|g_k\|^2}{\omega_k \|g_{k-1}\|^2 + \mu_k d_{k-1}^T y_{k-1}} d_{k-1} \right\|^2 = \|g_k\|^2 - \dfrac{2\|g_k\|^2}{\omega_k \|g_{k-1}\|^2 + \mu_k d_{k-1}^T y_{k-1}} g_k^T d_{k-1} + \dfrac{\|g_k\|^4 \|d_{k-1}\|^2}{[\omega_k \|g_{k-1}\|^2 + \mu_k d_{k-1}^T y_{k-1}]^2}$

$= \|g_k\|^2 - \dfrac{2\rho_{k-1} g_{k-1}^T d_{k-1}}{\omega_k \|g_{k-1}\|^2 + \mu_k d_{k-1}^T y_{k-1}} \|g_k\|^2 + \dfrac{\|g_k\|^4 \|d_{k-1}\|^2}{[\omega_k \|g_{k-1}\|^2 + \mu_k d_{k-1}^T y_{k-1}]^2}$

$\le \|g_k\|^2 \left| 1 + \dfrac{2\rho_{k-1} g_{k-1}^T d_{k-1}}{\omega_k \|g_{k-1}\|^2 + \mu_k d_{k-1}^T y_{k-1}} \right| + \dfrac{\|g_k\|^4 \|d_{k-1}\|^2}{[\omega_k \|g_{k-1}\|^2 + \mu_k d_{k-1}^T y_{k-1}]^2}$

$\le \|g_k\|^2 \left| 1 + \dfrac{2\rho_{k-1} g_{k-1}^T d_{k-1}}{\omega_k \|g_{k-1}\|^2} \right| + \dfrac{\|g_k\|^4 \|d_{k-1}\|^2}{[\omega_k \|g_{k-1}\|^2]^2}$,

which is due to $\mu_k d_{k-1}^T y_{k-1} \ge 0$, deduced in Lemma 3. By $g_{k-1}^T d_{k-1} \le -\|g_{k-1}\|^2$ in Lemma 3, we have

$\dfrac{2\rho_{k-1} g_{k-1}^T d_{k-1}}{\omega_k \|g_{k-1}\|^2} \le \dfrac{-2\rho_{k-1} \|g_{k-1}\|^2}{\omega_k \|g_{k-1}\|^2} = -\dfrac{2\rho_{k-1}}{\omega_k}$.

Now, by $1 - \delta\mu/v_{\min} \le \rho_{k-1}$, which is deduced in Ref. [3], and the two inequalities above, we obtain

$\|d_k\|^2 \le \left| 1 - \dfrac{2\rho_{k-1}}{\omega_k} \right| \|g_k\|^2 + \dfrac{\|g_k\|^4 \|d_{k-1}\|^2}{[\omega_k \|g_{k-1}\|^2]^2} \le \left| 1 - \dfrac{2(1 - \delta\mu/v_{\min})}{\omega_k} \right| \|g_k\|^2 + \dfrac{\|g_k\|^4 \|d_{k-1}\|^2}{\omega_k^2 \|g_{k-1}\|^4}$.

Letting $H = \left| 1 - \dfrac{2(1 - \delta\mu/v_{\min})}{\omega_k} \right|$, we obtain

$\|d_k\|^2 \le H\|g_k\|^2 + \dfrac{\|g_k\|^4 \|d_{k-1}\|^2}{\omega_k \|g_{k-1}\|^4}$, i.e., $\|d_k\|^2 \le \dfrac{\|g_k\|^4 \|d_{k-1}\|^2}{\omega_k \|g_{k-1}\|^4} + H\|g_k\|^2$.

The proof is completed. Note that $1 - \delta\mu/v_{\min} \le \rho_{k-1}$ holds for all $k$ whenever Assumption 1 is valid.

Theorem 2   Suppose that Assumptions 1 and 2 hold and that the sequence $\{x_k\}$ is given by (2), (3), (5) and (6). Then $\liminf_{k \to \infty} \|g_k\| = 0$.

Proof   Suppose, by contradiction, that the stated conclusion is not true. Then, in view of $\|g_k\| > 0$, there exists a constant $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k$, and by Theorem 1 we have $\|d_k\|^2 \le \dfrac{\|g_k\|^4 \|d_{k-1}\|^2}{\omega_k \|g_{k-1}\|^4} + H\varepsilon^2$.

Thus, we have a recursive inequality which leads to

$\|d_k\|^2 \le \dfrac{\|g_k\|^4 \|d_{k-1}\|^2}{\omega_k \|g_{k-1}\|^4} + H\varepsilon^2 \le \dfrac{\|d_{k-2}\|^2}{\omega_k \|g_{k-2}\|^4} + 2H\varepsilon^2 \le \cdots \le \dfrac{\|d_1\|^2}{\omega_2 \|g_1\|^4} + (k-1)H\varepsilon^2$.   (11)

Thus, inequality (11) means that

$\dfrac{\|g_k\|^4}{\|d_k\|^2} \ge \dfrac{1}{ab + (k-1)c}$,   (12)

where $a = \dfrac{1}{\omega_2}$, $b = \dfrac{\|d_1\|^2}{\|g_1\|^4}$ and $c = H\varepsilon^2$ are constants. Since the harmonic-type series $\sum_k \frac{1}{ab + (k-1)c}$ diverges, (12) implies $\sum_{d_k \ne 0} \dfrac{\|g_k\|^4}{\|d_k\|^2} = +\infty$, which contradicts Lemma 2. Hence we obtain $\liminf_{k \to \infty} \|g_k\| = 0$.

Theorem 2 shows that our proposed CG method without line search possesses the global convergence property.

1.2 Algorithm Frame

Based on the discussion above, we can now describe the algorithm frame for solving the unconstrained optimization problem (1) as follows:

Step 0   Given an initial point $x_0 \in \mathbb{R}^n$ and constants $\varepsilon_0 > 0$, $\rho \in (0,1)$, $\sigma \in (0,1)$, $\omega_k \ge 0$, $\mu_k \ge 0$, let $k := 0$.

Step 1   If the stopping criterion $\|g_k\| < \varepsilon_0$ is satisfied, stop; otherwise, go to Step 2.

Step 2   Compute $\beta_k$ by (6) and the search direction $d_k$ by (3).

Step 3   Compute the step size $\alpha_k$ by formula (5) and let $x_{k+1} = x_k + \alpha_k d_k$.

Step 4   Let $k := k + 1$ and go to Step 1.

From the above algorithm, we note that the step-length $\alpha_k$ depends on the Lipschitz constant $\mu$, which in general is not known in advance. It should be noted that if $f(x)$ is a twice continuously differentiable strictly convex function, the Lipschitz constant can be replaced by a positive constant [15]. So, we choose $\delta = 1$.
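The whole scheme fits in a few lines. The following MATLAB sketch of Steps 0-4 is illustrative only, assuming $Q_k = I$ and $\delta = 1$; the function name and arguments are ours and, as noted in Section 2, $Q_k = I$ may behave poorly in practice, so this sketch shows the structure rather than the tuned implementation:

```matlab
function [x, k] = omfsl_sketch(gradf, x, omega, mu, eps0, maxit)
% Illustrative sketch of the algorithm frame with Qk = I, delta = 1.
% gradf is a user-supplied gradient handle; x is the starting point.
    delta = 1;
    g = gradf(x);
    d = -g;                                    % first direction, see (3)
    k = 0;
    while norm(g) >= eps0 && k < maxit         % stopping test (Step 1)
        alpha = -delta * (g' * d) / (d' * d);  % step size (5) with Qk = I
        x = x + alpha * d;                     % new iterate, see (2)
        gnew = gradf(x);
        y = gnew - g;                          % y_k = g_{k+1} - g_k
        beta = norm(gnew)^2 / (omega * norm(g)^2 + mu * (d' * y));  % (6)
        d = -gnew + beta * d;                  % next direction, see (3)
        g = gnew;
        k = k + 1;
    end
end
```

For instance, with gradf = @(x) A*x for a symmetric positive definite matrix A, the call omfsl_sketch(gradf, x0, 0.7, 0.3, 1e-6, 1000) minimizes the quadratic $\frac{1}{2}x^T A x$.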

2 Numerical Experiments

In this section, in order to show the performance of the given method, we test our proposed method with the fixed step-length (5) (denoted by OMFSL), method (6) with the strong Wolfe line search (4) (denoted by SSWLS), and some other algorithms without line search, such as the DY and FR methods [3] and Chen and Sun's method [4], on 24 test problems from Ref. [21]. We use the same convergence criteria as in Ref. [21].

The parameters are chosen as $\rho = 0.04$, $\sigma = 0.5$, $\omega_k = 0.7$ and $\mu_k = 0.3$. All codes are written in MATLAB 7.5 and run on a Lenovo PC with a 1.90 GHz CPU, 2.43 GB of RAM, and the Windows XP operating system. In our numerical experiments, the no-line-search method with $Q_k = I$ (the identity matrix) shows bad convergence behavior. Thus, we consider BFGS updates for $Q_k$ in formula (5).
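The classical BFGS update of $Q_k$ takes the following form, using $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$; this MATLAB sketch is illustrative, and details such as safeguards and scaling in the experimental code may differ:

```matlab
function Q = bfgs_update(Q, s, y)
% Illustrative classical BFGS update of the matrix Qk used in (5).
% The update is skipped unless y' * s > 0, which preserves the
% positive definiteness of Q.
    if y' * s > 1e-12
        Qs = Q * s;
        Q  = Q - (Qs * Qs') / (s' * Qs) + (y * y') / (y' * s);
    end
end
```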

The test results are shown in Tables 1 and 2. The results for each problem and each method are reported in the form $k/NG/NF/f(\bar{x})/T_{cpu}$, where $k$, NG, NF, $f(\bar{x})$ and $T_{cpu}$ denote the number of iterations at termination, the total number of gradient evaluations, the total number of function evaluations, the function value at termination, and the CPU time in seconds, respectively; $n$ indicates the dimension of the test problem.

From Tables 1 and 2, it is easy to see that OMFSL performs better than the other considered methods. Meanwhile, we present the Dolan and Moré [22] performance profiles for OMFSL and the other considered methods. Note that the performance ratio $p(\tau)$ is the probability that a solver solves a tested problem within a factor $\tau$ of the smallest cost. Figures 1-5 present the performance of OMFSL with the fixed step-length (5) and the other considered methods.
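As an illustration of how such profiles are computed, the following MATLAB sketch (an illustrative helper, not our experimental code) builds a Dolan-Moré profile from a cost matrix $T$ whose entry $T(p,s)$ is the cost of solver $s$ on problem $p$:

```matlab
function [tau, rho] = perf_profile(T)
% Illustrative Dolan-More performance profile. T(p,s) is the cost
% (e.g. Tcpu) of solver s on problem p, with Inf marking failures.
% rho(i,s) is the fraction of problems solver s solves within a
% factor tau(i) of the best solver on each problem.
    [np, ns] = size(T);
    R = T ./ min(T, [], 2);                 % performance ratios r_{p,s}
    tau = linspace(1, max(R(isfinite(R))), 100);
    rho = zeros(numel(tau), ns);
    for i = 1:numel(tau)
        rho(i, :) = sum(R <= tau(i), 1) / np;
    end
end
```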

Fig. 1   Performance profile on the absolute errors of $f(\bar{x})$ versus $f(x^*)$

Fig. 2   Performance profile on the number of iterations

Fig. 3   Performance profile on NG

Fig. 4   Performance profile on NF

Fig. 5   Performance profile on $T_{cpu}$

As we can see from Fig. 1, OMFSL is superior to all other considered methods in terms of the absolute error of $f(\bar{x})$ versus $f(x^*)$ (the function value at the optimal solution). Figures 2-5 show that OMFSL outperforms all other considered methods in the number of iterations, the number of gradient evaluations (NG), the number of function evaluations (NF) and the CPU time ($T_{cpu}$). In conclusion, OMFSL without line search is very competitive on the test problems and is an alternative for solving nonlinear unconstrained optimization problems.

Table 1

Numerical results for the tested problems

Table 2

Results of $f(\bar{x})$ and $T_{cpu}$ for the tested problems

3 Conclusion

In this paper, we have combined the two-parameter conjugate gradient method defined by formula (6) with Sun-Zhang's step-length formula (5) and have proved its global convergence under appropriate assumptions. The step-length formula (5) might be practical in cases where the line search is expensive or hard. We allow certain flexibility in selecting the sequence $\{Q_k\}$ in practical computation, and other updates for $Q_k$ can also be considered. Our proofs require that the function is at least $LC^1$ and strongly convex and that the level set is bounded. Reducing the requirements on the objective function and finding a completely constant step-length might be topics of further research.

References

1. Zhang L M, Gao H T, Chen Z Q, et al. Multi-objective global optimal parafoil homing trajectory optimization via Gauss pseudospectral method[J]. Nonlinear Dynamics, 2013, 72(1): 1-8.
2. Jiang X Z, Jian J B. A sufficient descent Dai-Yuan type nonlinear conjugate gradient method for unconstrained optimization problems[J]. Nonlinear Dynamics, 2013, 72(1): 101-112.
3. Sun J, Zhang J P. Global convergence of conjugate gradient methods without line search[J]. Annals of Operations Research, 2001, 103(1): 161-173.
4. Chen X D, Sun J. Global convergence of a two-parameter family of conjugate gradient methods without line search[J]. Journal of Computational and Applied Mathematics, 2002, 146(1): 37-45.
5. Hestenes M R, Stiefel E. Methods of conjugate gradients for solving linear systems[J]. Journal of Research of the National Bureau of Standards, 1952, 49(6): 409-436.
6. Fletcher R, Reeves C M. Function minimization by conjugate gradients[J]. The Computer Journal, 1964, 7(2): 149-154.
7. Polyak B T. The conjugate gradient method in extremal problems[J]. USSR Computational Mathematics and Mathematical Physics, 1969, 9(4): 94-112.
8. Fletcher R. Practical Methods of Optimization, Vol. 1: Unconstrained Optimization[M]. New York: John Wiley & Sons, 1987.
9. Liu Y, Storey C. Efficient generalized conjugate gradient algorithms[J]. Journal of Optimization Theory and Applications, 1991, 69(1): 129-137.
10. Dai Y H, Yuan Y X. A nonlinear conjugate gradient method with a strong global convergence property[J]. SIAM Journal on Optimization, 1999, 10(1): 177-182.
11. Deepho J, Abubakar A B, Malik M, et al. Solving unconstrained optimization problems via hybrid CD-DY conjugate gradient methods with applications[J]. Journal of Computational and Applied Mathematics, 2022, 405: 113823.
12. Abubakar A B, Kumam P, Malik M, et al. A hybrid conjugate gradient based approach for solving unconstrained optimization and motion control problems[J]. Mathematics and Computers in Simulation, 2022, 201: 640-657.
13. Goncalves M L N, Lima F S, Prudente L F. A study of Liu-Storey conjugate gradient methods for vector optimization[J]. Applied Mathematics and Computation, 2022, 425: 127099.
14. Chen C L, Luo L L, Han C H, et al. Global convergence of an extended descent algorithm without line search for unconstrained optimization[J]. Journal of Applied Mathematics and Physics, 2018, 6(1): 130-137.
15. Yu Z S. Global convergence of a memory gradient method without line search[J]. Journal of Applied Mathematics and Computing, 2008, 26(1): 545-553.
16. Narushima Y. A memory gradient method without line search for unconstrained optimization[J]. SUT Journal of Mathematics, 2006, 42(2): 191-206.
17. Yin L, Chen X D. Global convergence of two kinds of three-term conjugate gradient methods without line search[J]. Asia-Pacific Journal of Operational Research, 2013, 30(1): 1-10.
18. Li X, Chen X D. Global convergence of shortest-residual family of conjugate gradient methods without line search[J]. Asia-Pacific Journal of Operational Research, 2005, 22(4): 529-538.
19. Du S Q, Chen Y Y. Global convergence of a modified spectral FR conjugate gradient method[J]. Applied Mathematics and Computation, 2008, 202(2): 766-770.
20. Zhu H, Chen X D. Global convergence of a special case of the Dai-Yuan family without line search[J]. Asia-Pacific Journal of Operational Research, 2008, 25(3): 411-420.
21. Zhu T F, Yan Z Z, Peng X Y. A modified nonlinear conjugate gradient method for engineering computation[J]. Mathematical Problems in Engineering, 2017, 2017(1): 1425857.
22. Dolan E D, Moré J J. Benchmarking optimization software with performance profiles[J]. Mathematical Programming, 2002, 91(2): 201-213.

