General Root-Finding Methods
Given an interval [a, b] on which f is continuous and f(a)·f(b) < 0, the intermediate value theorem guarantees that f has a root inside the interval.
- Bisection: repeatedly halve the interval so that its endpoints close in on the zero. The next interval is [a, c] or [c, b], where c = (a + b)/2.
- Secant method (linear interpolation): the basic idea is to approximate the slope of the tangent to the target function by the slope of a chord, and to take the x-intercept of that secant line as the next approximation to the root. Given two points (a, f(a)) and (b, f(b)), the secant line is

$$y = \frac{f(b) - f(a)}{b - a}(x - b) + f(b)$$

Setting y = 0, the resulting value of x is the next iterate:

$$c = b - f(b)\,\frac{b - a}{f(b) - f(a)}$$
- Inverse quadratic interpolation: an evolution of the secant method. Three points determine a quadratic, and the point where that quadratic crosses the x-axis becomes the next iterate. However, a quadratic in x may not intersect the x-axis at all, so the roles are reversed and y is taken as the independent variable. Given three points (x_{i−2}, f(x_{i−2})), (x_{i−1}, f(x_{i−1})), (x_i, f(x_i)), the quadratic through them is

$$x = \frac{(y - f(x_{i-1}))(y - f(x_i))}{(f(x_{i-2}) - f(x_{i-1}))(f(x_{i-2}) - f(x_i))}\,x_{i-2} + \frac{(y - f(x_{i-2}))(y - f(x_i))}{(f(x_{i-1}) - f(x_{i-2}))(f(x_{i-1}) - f(x_i))}\,x_{i-1} + \frac{(y - f(x_{i-2}))(y - f(x_{i-1}))}{(f(x_i) - f(x_{i-2}))(f(x_i) - f(x_{i-1}))}\,x_i$$

Setting y = 0 yields

$$x_{i+1} = \frac{f(x_{i-1})\,f(x_i)}{(f(x_{i-2}) - f(x_{i-1}))(f(x_{i-2}) - f(x_i))}\,x_{i-2} + \frac{f(x_{i-2})\,f(x_i)}{(f(x_{i-1}) - f(x_{i-2}))(f(x_{i-1}) - f(x_i))}\,x_{i-1} + \frac{f(x_{i-2})\,f(x_{i-1})}{(f(x_i) - f(x_{i-2}))(f(x_i) - f(x_{i-1}))}\,x_i$$
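The three update rules above can be sketched in Python (the function names are illustrative, not from any particular library):

```python
def bisection_step(a, b):
    """Midpoint of the current bracket: c = (a + b) / 2."""
    return (a + b) / 2

def secant_step(a, fa, b, fb):
    """x-intercept of the line through (a, f(a)) and (b, f(b))."""
    return b - fb * (b - a) / (fb - fa)

def inverse_quadratic_step(x0, f0, x1, f1, x2, f2):
    """Lagrange interpolation of x as a function of y, evaluated at y = 0."""
    return (x0 * f1 * f2 / ((f0 - f1) * (f0 - f2))
            + x1 * f0 * f2 / ((f1 - f0) * (f1 - f2))
            + x2 * f0 * f1 / ((f2 - f0) * (f2 - f1)))
```

For example, for f(x) = x² − 2 on the bracket [1, 2], a bisection step gives 1.5, while a secant step through (1, −1) and (2, 2) gives 4/3.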
Brent's Method
Initialize an interval (a₀, b₀) such that f(a₀)·f(b₀) < 0; here b_k denotes the root estimate produced in iteration k. If |f(a₀)| < |f(b₀)|, swap the two values (the point whose function value is smaller in absolute value is regarded as closer to the true root).
Each iteration maintains four points:
- b_k: the current root estimate;
- a_k: the contrapoint, i.e. a point satisfying f(a_k)·f(b_k) < 0 and |f(b_k)| ≤ |f(a_k)|;
- b_{k−1}: the root estimate from the previous iteration; for the first iteration set b_{k−1} = a₀;
- b_{k−2}: the root estimate from two iterations ago (it needs no initialization: the first iteration's tests never consult it, and it is assigned at the end of each iteration).
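The initialization step can be sketched as follows (`brent_init` is an illustrative helper name, not part of any library):

```python
def brent_init(f, a0, b0):
    """Set up the bookkeeping points for Brent's method (illustrative)."""
    fa, fb = f(a0), f(b0)
    assert fa * fb < 0, "f must change sign on the initial interval"
    if abs(fa) < abs(fb):                     # the endpoint with the smaller
        a0, b0, fa, fb = b0, a0, fb, fa       # |f| becomes the estimate b
    return a0, b0, a0                         # a_k, b_k, and b_{k-1} = a0
```

For f(x) = x² − 2, either ordering of the endpoints 1 and 2 yields a_0 = 2 and b_0 = 1, since |f(1)| < |f(2)|.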
Method Selection
The following four inequalities are used, where δ is a numerical tolerance and s is the tentative value produced by interpolation:

$$|\delta| < |b_k - b_{k-1}| \quad (1)$$
$$|\delta| < |b_{k-1} - b_{k-2}| \quad (2)$$
$$|s - b_k| < \tfrac{1}{2}\,|b_k - b_{k-1}| \quad (3)$$
$$|s - b_k| < \tfrac{1}{2}\,|b_{k-1} - b_{k-2}| \quad (4)$$
The current iteration uses bisection if any one of the following five conditions holds; otherwise it uses interpolation:
- the previous iteration used bisection and (1) is false;
- the previous iteration used bisection and (3) is false;
- the previous iteration used interpolation and (2) is false;
- the previous iteration used interpolation and (4) is false;
- the tentative value s computed by interpolation does not lie between (3a_k + b_k)/4 and b_k.

The interpolation method is chosen as follows: if the three points are pairwise distinct, inverse quadratic interpolation is used; otherwise, linear (secant) interpolation.
The tentative value s of this iteration becomes one endpoint of the new interval; the other endpoint is chosen from a_k and b_k. The two resulting endpoints become a_{k+1} and b_{k+1}, labelled so that f(a_{k+1})·f(b_{k+1}) < 0 and |f(a_{k+1})| ≥ |f(b_{k+1})|.
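The selection logic above can be sketched end to end in Python. This is a simplified, illustrative rendition: all tolerances are collapsed into a single δ, and the careful floating-point safeguards of the real method are omitted. Here `c` plays the role of b_{k−1} and `d` of b_{k−2}:

```python
def brent(f, a, b, delta=1e-12, max_iter=100):
    """Simplified Brent's method following the rules above (illustrative)."""
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "f must change sign on [a, b]"
    if abs(fa) < abs(fb):                      # make b the better estimate
        a, b, fa, fb = b, a, fb, fa
    c, fc = a, fa                              # b_{k-1} starts as a0
    d = c                                      # b_{k-2}; unused on first pass
    used_bisection = True
    for _ in range(max_iter):
        if fb == 0 or abs(b - a) < delta:
            return b
        if fa != fc and fb != fc:              # three distinct values: IQI
            s = (a * fb * fc / ((fa - fb) * (fa - fc))
                 + b * fa * fc / ((fb - fa) * (fb - fc))
                 + c * fa * fb / ((fc - fa) * (fc - fb)))
        else:                                  # otherwise a secant step
            s = b - fb * (b - a) / (fb - fa)
        # The five conditions that force a bisection step:
        lo, hi = sorted(((3 * a + b) / 4, b))
        if (not (lo < s < hi)
                or (used_bisection and not delta < abs(b - c))
                or (used_bisection and not abs(s - b) < abs(b - c) / 2)
                or (not used_bisection and not delta < abs(c - d))
                or (not used_bisection and not abs(s - b) < abs(c - d) / 2)):
            s = (a + b) / 2
            used_bisection = True
        else:
            used_bisection = False
        fs = f(s)
        d, c, fc = c, b, fb                    # shift the history points
        if fa * fs < 0:                        # keep the sign change bracketed
            b, fb = s, fs
        else:
            a, fa = s, fs
        if abs(fa) < abs(fb):                  # keep b as the better estimate
            a, b, fa, fb = b, a, fb, fa
    return b
```

For instance, `brent(lambda x: x*x - 2, 1.0, 2.0)` converges to √2.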
Brent's method is a root-finding algorithm which combines root bracketing, bisection, and inverse quadratic interpolation. It is sometimes known as the van Wijngaarden–Dekker–Brent method. Brent's method is implemented in the Wolfram Language as the undocumented option Method → Brent in FindRoot[eqn, {x, x0, x1}].
Brent's method uses a Lagrange interpolating polynomial of degree 2. Brent (1973) claims that this method will always converge as long as the values of the function are computable within a given region containing a root. Given three points x1, x2, and x3, Brent's method fits x as a quadratic function of y, then uses the interpolation formula
$$x = \frac{[y - f(x_1)][y - f(x_2)]}{[f(x_3) - f(x_1)][f(x_3) - f(x_2)]}\,x_3 + \frac{[y - f(x_2)][y - f(x_3)]}{[f(x_1) - f(x_2)][f(x_1) - f(x_3)]}\,x_1 + \frac{[y - f(x_3)][y - f(x_1)]}{[f(x_2) - f(x_3)][f(x_2) - f(x_1)]}\,x_2$$
Subsequent root estimates are obtained by setting y=0, giving
$$x = x_2 + \frac{P}{Q}$$
where
$$P = S\left[T(R - T)(x_3 - x_2) - (1 - R)(x_2 - x_1)\right]$$
$$Q = (T - 1)(R - 1)(S - 1)$$
with
$$R \equiv \frac{f(x_2)}{f(x_3)}, \qquad S \equiv \frac{f(x_2)}{f(x_1)}, \qquad T \equiv \frac{f(x_1)}{f(x_3)}$$
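The P/Q form is algebraically the same step as setting y = 0 in the Lagrange formula; the following illustrative sketch checks the two numerically (function names are ours, not from any library):

```python
def iqi_lagrange(x1, f1, x2, f2, x3, f3):
    """Set y = 0 in the Lagrange interpolation of x as a function of y."""
    return (x3 * f1 * f2 / ((f3 - f1) * (f3 - f2))
            + x1 * f2 * f3 / ((f1 - f2) * (f1 - f3))
            + x2 * f3 * f1 / ((f2 - f3) * (f2 - f1)))

def iqi_pq(x1, f1, x2, f2, x3, f3):
    """The same step written as x2 + P/Q with R, S, T as defined above."""
    r, s, t = f2 / f3, f2 / f1, f1 / f3
    p = s * (t * (r - t) * (x3 - x2) - (1 - r) * (x2 - x1))
    q = (t - 1) * (r - 1) * (s - 1)
    return x2 + p / q
```

With the sample points (1, −1), (1.5, 0.25), (2, 2) drawn from f(x) = x² − 2, both forms return the same estimate near √2. The P/Q form is preferred in implementations because it expresses the step as a small correction to x₂, which is numerically gentler than summing three large terms.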
Brent's Method
Brent's method for approximately solving f(x)=0, where f:R→R, is a "hybrid" method that combines aspects of the bisection and secant methods with some additional features that make it completely robust and usually very efficient. Like bisection, it is an "enclosure" method that begins with an initial interval across which f changes sign and, as the iterations proceed, determines a sequence of nested intervals that share this property and decrease in length. Convergence of the iterates is guaranteed, even in floating-point arithmetic. If f is continuous on the initial interval, then each of the decreasing intervals determined by the method contains a solution, and the limit of the iterates is a solution. Like the bisection and secant methods, the method requires only one evaluation of f at each iteration; in particular, f′ is not required.
The following provides a rough outline of how the method works. The full details of the method are complicated and can be found in R. P. Brent, Algorithms for Minimization without Derivatives, Prentice-Hall, 1973. The method builds upon an earlier method of T. J. Dekker and is the basis of MATLAB's fzero routine.
At each iteration, Brent's method first tries a step of the secant method or something better. If this step is unsatisfactory, which usually means too long, too short, or too close to an endpoint of the current interval, then the step reverts to a bisection step. There is also a feature that occasionally forces a bisection step to guard against too little progress for too many iterations. In the details of the method, a great deal of attention has been paid to considerations of floating-point arithmetic (overflow and underflow, accuracy of computed expressions, etc.).
An overview of the operation of the method is as follows:
- The method begins with
- a stopping tolerance δ>0,
- points a and b such that f(a)f(b)<0.
If necessary, a and b are exchanged so that ∣f(b)∣≤∣f(a)∣; thus b is regarded as the better approximate solution. A third point c is initialized by setting c=a.
- At each iteration, the method maintains a, b, and c such that b ≠ c and
(i) f(b)f(c)<0, so that a solution lies between b and c if f is continuous;
(ii) ∣f(b)∣≤∣f(c)∣, so that b can be regarded as the current approximate solution;
(iii) either a is distinct from b and c, or a=c and is the immediate past value of b.
Each iteration proceeds as follows:
- If ∣b−c∣≤δ, then the method returns b as the approximate solution.
- Otherwise, the method determines a trial point b^ as follows:
(i) If a = c, then b^ is determined by linear (secant) interpolation:
$$\hat{b} = \frac{a\,f(b) - b\,f(a)}{f(b) - f(a)}$$
(ii) Otherwise, a,b, and c are distinct, and b^ is determined using inverse quadratic interpolation:
- Determine α,β, and γ such that p(y)=αy2+βy+γ satisfies p(f(a))=a, p(f(b))=b, and p(f(c))=c.
- Set b^=γ.
- If necessary, b^ is adjusted or replaced with the bisection point. (The rules are complicated.)
- Once b^ has been finalized, a,b,c, and b^ are used to determine new values of a,b, and c.
(The rules are complicated.)
Remark: In part (ii) of step 2, the coefficients α,β, and (especially) γ are easily determined using standard methods at the cost of a few arithmetic operations. (Of course, there needs to be a safeguard against the unlikely event that f(a),f(b), and f(c) are not distinct.) Note that γ is just p(0), so if f really were the inverse of a quadratic, i.e., f−1(y)=p(y)=αy2+βy+γ for all y, then b^=γ would satisfy f(b^)=f(p(0))=f(f−1(0))=0. Thus inverse quadratic interpolation provides a low-cost approximate zero of f that should be more accurate than that obtained by linear (secant) interpolation. Note that if direct quadratic interpolation were used instead of inverse quadratic interpolation, i.e., if we found p(x)=αx2+βx+γ such that p(a)=f(a),p(b)=f(b), and p(c)=f(c), then it would be necessary to find a b^ such that p(b^)=0 using the quadratic formula, which involves a square root. By using inverse quadratic interpolation, Brent's method avoids this square root.
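As the remark notes, b^ = p(0) can be evaluated directly from the interpolation conditions without solving for α and β; a minimal sketch (our own helper name, not Brent's actual implementation):

```python
def inverse_quadratic_trial(a, b, c, f):
    """Fit p(y) with p(f(a)) = a, p(f(b)) = b, p(f(c)) = c; return p(0)."""
    fa, fb, fc = f(a), f(b), f(c)
    # Safeguard: equal function values would make the fit degenerate.
    if fa == fb or fb == fc or fa == fc:
        raise ValueError("f(a), f(b), f(c) must be distinct")
    # p(0) from the Lagrange form of the inverse interpolant.
    return (a * fb * fc / ((fa - fb) * (fa - fc))
            + b * fa * fc / ((fb - fa) * (fb - fc))
            + c * fa * fb / ((fc - fa) * (fc - fb)))
```

As predicted by the remark, if f really is the inverse of a quadratic, p(0) lands exactly on the zero: for f(x) = √(x + 1) − 1 (whose inverse is y² + 2y), the trial point from any three distinct samples is the root x = 0.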