
Minimization

#1 - doche (Guest) - January 30, 2004, 04:25
Minimization

Hello, I'm searching for a method that can find the global minimum of a multivariable function f(x,y,z). Because of the way the feedback is obtained, I only have access to values of grad(f) (there is no way to calculate f itself). How can I do this?

#2 - Ananda Himansu (Guest) - February 3, 2004, 17:29
Re: Minimization

If you cannot evaluate the cost function "f" itself, then you cannot cheaply employ random searches of the solution space, or search methods such as genetic algorithms that must be fed the value of f at random locations. Since you have access only to values of "grad(f)", for efficiency reasons you are restricted to optimization methods that rely on gradient information to descend into the valleys. Pick a classical gradient-based algorithm, such as a Newton-based or quasi-Newton method (e.g., BFGS), Powell's method, etc.

Pick a suitable starting point r0 = (x0, y0, z0). Arbitrarily label the value of f at r0 as f0 = f(r0) = 0. Evaluate grad(f) at r0. Step through the minimization (optimization) procedure, but anywhere in the algorithm that you come across a needed value of f, get a second- or third-order (if the Hessian is involved) estimate of f by extrapolation using previous points and the gradients at those points. You can approximate the Hessian by numerically differentiating the gradient grad(f), of which the Hessian is the Jacobian. Eventually the algorithm should terminate at a location where grad(f) is approximately zero (a minimum). This procedure will locate the minimum in the search space, i.e., rmin = (xmin, ymin, zmin), but of course it will not tell you the absolute value of fmin = f(rmin), since you do not have access to that. It only tells you the relative or gauge value, i.e., fmin - f0.
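For illustration, here is a rough sketch of that bookkeeping in Python with NumPy (the names grad_f, descend and hessian_from_grad are invented for this example, and a plain fixed-step gradient descent stands in for whatever classical algorithm you actually pick):

import numpy as np

def hessian_from_grad(grad_f, r, eps=1e-5):
    # Approximate the Hessian by central-differencing the gradient grad_f.
    n = len(r)
    H = np.empty((n, n))
    for j in range(n):
        dr = np.zeros(n)
        dr[j] = eps
        H[:, j] = (grad_f(r + dr) - grad_f(r - dr)) / (2.0 * eps)
    return 0.5 * (H + H.T)  # symmetrize the numerical estimate

def descend(grad_f, r0, step=1e-2, tol=1e-8, max_iter=100000):
    # Fixed-step gradient descent that also tracks the gauge value
    # f(r) - f(r0), with f(r0) arbitrarily labelled zero.
    r = np.asarray(r0, dtype=float)
    f_gauge = 0.0
    g = grad_f(r)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        r_new = r - step * g
        g_new = grad_f(r_new)
        # second-order (trapezoidal) estimate of the change in f over the step
        f_gauge += 0.5 * np.dot(g + g_new, r_new - r)
        r, g = r_new, g_new
    return r, f_gauge

For example, with grad_f = lambda r: 2.0 * (r - 1.0) (i.e., f(r) = |r - 1|^2 plus an unknown constant), descend(grad_f, [3.0, 0.0, 2.0]) returns a point near (1, 1, 1) and a gauge value near -6, which is f(rmin) - f(r0).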

If you suspect that the minimum found by the algorithm is only local and not global, you will need to restart the algorithm from a different initial point, say r0n = (x0n, y0n, z0n). Integrate grad(f), starting from f(r0) = 0, up to the point r0n (using a second- or third-order integration rule based on values of grad(f)) to find a value of f(r0n) that is consistent with f(r0) = 0. Use an integration contour (path) that is parallel to one of the coordinate axes on each of its three legs (from (x0,y0,z0) to (x0n,y0,z0) to (x0n,y0n,z0) to (x0n,y0n,z0n)), or integrate parametrically along the straight line from (x0,y0,z0) to (x0n,y0n,z0n). Then begin the second run of the minimization algorithm at r0n. Proceed similarly if you need a third or further runs.
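A minimal sketch of the straight-line variant of that integration (again Python/NumPy; path_integral_of_grad and grad_f are illustrative names, and the composite trapezoidal rule stands in for whichever second- or third-order rule you prefer):

import numpy as np

def path_integral_of_grad(grad_f, r_a, r_b, n_steps=200):
    # Estimate f(r_b) - f(r_a) as the line integral of grad(f) . dr
    # along the straight segment from r_a to r_b.
    r_a = np.asarray(r_a, dtype=float)
    r_b = np.asarray(r_b, dtype=float)
    ts = np.linspace(0.0, 1.0, n_steps + 1)
    points = r_a + np.outer(ts, r_b - r_a)      # sample points on the segment
    integrand = np.array([np.dot(grad_f(p), r_b - r_a) for p in points])
    return np.trapz(integrand, ts)              # composite trapezoidal rule

# consistent starting value for the restarted run:
# f_at_r0n = path_integral_of_grad(grad_f, r0, r0n)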

Of course, you can use this procedure of integrating grad(f) to find a consistent gauge value of f for every single required function evaluation, even in methods such as genetic algorithms or simulated annealing, but the cost would probably be prohibitive unless you are working in a really small search space.

Of course, to do all this, you will probably be unable to use canned routines. You should take a routine from, say, the book Numerical Recipes (Press et al.) and modify the source code to estimate values of f as I described. Good luck.

Ananda Himansu

#3 - Ananda Himansu (Guest) - February 4, 2004, 01:57
Clarification Re: Minimization

Perhaps I failed to mention the simple mathematical concept behind the bookkeeping that I suggested in the previous post.

Let the cost function that you are trying to minimize be denoted by g(r). Pick an arbitrary point r0=(x0,y0,z0) in your search domain. Then we can alternatively represent g as

g(r) = g(r0) + Definite Path Integral of grad(g) from r0 to r

Your cost evaluation function does not return g but rather grad(g). So g(r0) is unknown. Define a gauge cost function f by

f(r) = Definite Path Integral of grad(g) from r0 to r

From this definition, note that if g is single-valued, then f is (in theory) independent of the path of integration. Also, f(r0) = 0, grad(f) = grad(g), and hessian(f) = hessian(g). The definition of f is theoretically equivalent to saying f(r) = g(r) - g(r0), but algorithmically the path-integral definition of f must be used, since the value of g(r) is unknown. A key fact that follows from the definition of f is that if the minimum gmin of g occurs at rmin = (xmin, ymin, zmin), i.e., gmin = g(rmin), then the minimum fmin of f also occurs at rmin, i.e., fmin = f(rmin). Also, fmin = gmin - g(r0).

Apply a classical gradient-based minimization algorithm to f, taking into account that, because of the integral definition, the value of f must be extrapolated from nearby point(s) using the values of grad(g). The end result of your minimization effort will be the values of rmin and fmin ( = f(rmin) = g(rmin) - g(r0), since f(r0) = 0).
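To make that concrete, here is a rough sketch using SciPy's BFGS driver (the wrapper minimize_from_gradient and its trapezoidal bookkeeping are only an illustration of the idea, not a canned routine; the accumulated gauge value inherits the integration error of each step, so a careful implementation would integrate grad(g) more accurately along each line-search segment):

import numpy as np
from scipy.optimize import minimize

def minimize_from_gradient(grad_g, r0):
    # Minimize the gauge function f(r) = g(r) - g(r0) with BFGS.
    # f is never evaluated directly; each requested value is estimated by
    # a trapezoidal integral of grad_g from the previously queried point.
    state = {"r": np.asarray(r0, dtype=float), "f": 0.0}

    def f_gauge(r):
        r = np.asarray(r, dtype=float)
        df = 0.5 * np.dot(grad_g(state["r"]) + grad_g(r), r - state["r"])
        state["r"] = r
        state["f"] += df
        return state["f"]

    return minimize(f_gauge, r0, jac=grad_g, method="BFGS")

# result = minimize_from_gradient(lambda r: 2.0 * (r - 1.0), [3.0, 0.0, 2.0])
# result.x is near (1, 1, 1); result.fun is the gauge value f(rmin) = g(rmin) - g(r0).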

Ananda Himansu

#4 - doche (Guest) - February 4, 2004, 05:50
Re: Clarification

Thank you very much for your response. The gauge function should be very useful.

#5 - Ananda Himansu (Guest) - February 4, 2004, 11:23
Re: Clarification

You are quite welcome. To clarify further, the path integral definition of g is obtained as the antiderivative of the differential relation

dg = grad(g) * dr

where the * represents a vector dot product, which may not have been clear from the way I wrote it previously.

With a classical gradient-based optimization algorithm, the construction of the gauge function f is only required in order to know how low your minimum is relative to your starting point, i.e., to know gmin - g(r0). If you did not need to know this, you could eliminate all references to f itself and merely find the point rmin as the endpoint of your descent path, without constructing f.

If your search space is small enough to do a systematic exhaustive search on a mesh with a spacing small enough to satisfy your accuracy requirement, then you will need to construct the gauge cost f, in order to efficiently compare the values and find the minimum achieved on the selected mesh. In this case, grad(g) is used only to construct f, instead of being used to select a descent direction.
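A sketch of that construction on a rectilinear mesh (Python/NumPy; gauge_on_mesh and grad_g are illustrative names, the gauge is accumulated with trapezoidal sums along grid lines sweeping in x, then y, then z from the corner node r0 = (xs[0], ys[0], zs[0]), and gradients are re-evaluated rather than cached, to keep the sketch short):

import numpy as np

def gauge_on_mesh(grad_g, xs, ys, zs):
    # Tabulate the gauge cost f = g - g(r0) at every mesh node and
    # return the node with the lowest gauge value.
    nx, ny, nz = len(xs), len(ys), len(zs)
    f = np.zeros((nx, ny, nz))
    g = lambda i, j, k: grad_g(np.array([xs[i], ys[j], zs[k]]))

    for i in range(1, nx):      # sweep in x along the edge y = ys[0], z = zs[0]
        f[i, 0, 0] = f[i-1, 0, 0] + 0.5 * (g(i-1, 0, 0)[0] + g(i, 0, 0)[0]) * (xs[i] - xs[i-1])
    for i in range(nx):         # sweep in y across the face z = zs[0]
        for j in range(1, ny):
            f[i, j, 0] = f[i, j-1, 0] + 0.5 * (g(i, j-1, 0)[1] + g(i, j, 0)[1]) * (ys[j] - ys[j-1])
    for i in range(nx):         # sweep in z through the volume
        for j in range(ny):
            for k in range(1, nz):
                f[i, j, k] = f[i, j, k-1] + 0.5 * (g(i, j, k-1)[2] + g(i, j, k)[2]) * (zs[k] - zs[k-1])

    imin = np.unravel_index(np.argmin(f), f.shape)
    return np.array([xs[imin[0]], ys[imin[1]], zs[imin[2]]]), f[imin]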

#6 - Ananda Himansu (Guest) - February 5, 2004, 01:28
Re: Minimization

This may be more information than you are looking for, but ... my friend and officemate Farhad reminded me of a different approach to your minimization problem. This approach is not as general as the one I previously described, but worth mentioning since it will involve less programming on your part, and may also require less computation, because no numerical path integral is needed.

If you can be certain that your cost function g(r) is twice differentiable, and that it achieves its minimum not on the boundary of your search domain but at a point in the interior, then sufficient conditions for an interior point rmin to be the location of a minimum (local or global) are that grad(g) = 0 and that hessian(g) is positive definite at rmin.

You can define a new objective function

h(r) = grad(g)*grad(g)

where the * represents a vector dot product.

You can use a canned standard minimization algorithm, without needing to modify it, to locate a minimum of the function h(r) at some point rmin, and then numerically check two things. First, that h(rmin) = 0 to within the error of your minimization algorithm (this verifies that grad(g) = 0 there). Second, that hessian(g) is positive definite (check some books for suitable tests; for small symmetric matrices, a suitable test is that all leading principal minors are positive). This eliminates the cases where rmin is a maximum or a saddle point. If you have not located a true minimum of g, or if the minimum is only local and not global, you can restart your search at a different initial point to try to find a different minimum of h. Note that it is not advisable to look for rmin with multidimensional root-finding algorithms applied to h(r) = 0, because h never changes sign: h(r) >= 0 everywhere, so it only touches the level h = 0 at rmin rather than crossing it.

Of course, this approach is not as robust as the previous one. Another restriction is that unless you can be sure that h(r) is differentiable (hessian(g) must be well defined everywhere), you must choose a minimization algorithm that relies only on values of h and not on grad(h). You are thus restricted to derivative-free methods such as the Nelder-Mead downhill simplex algorithm or Powell's direction-set method.
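For illustration, here is a short sketch of this second approach with SciPy (find_stationary_point is an invented name; the positive-definiteness check uses the eigenvalues of a finite-difference Hessian, which for a symmetric matrix is equivalent to the principal-minor test mentioned above):

import numpy as np
from scipy.optimize import minimize

def find_stationary_point(grad_g, r0, eps=1e-5, tol=1e-10):
    # Minimize h(r) = grad_g(r) . grad_g(r) without using grad(h).
    h = lambda r: np.dot(grad_g(r), grad_g(r))
    result = minimize(h, r0, method="Nelder-Mead",
                      options={"xatol": 1e-8, "fatol": 1e-12})
    rmin = result.x

    # Check 1: grad(g) is approximately zero at the candidate point.
    is_stationary = result.fun < tol

    # Check 2: hessian(g), approximated by central-differencing grad_g,
    # is positive definite (all eigenvalues strictly positive).
    n = len(rmin)
    H = np.empty((n, n))
    for j in range(n):
        dr = np.zeros(n)
        dr[j] = eps
        H[:, j] = (grad_g(rmin + dr) - grad_g(rmin - dr)) / (2.0 * eps)
    H = 0.5 * (H + H.T)
    is_minimum = is_stationary and bool(np.all(np.linalg.eigvalsh(H) > 0.0))

    return rmin, is_minimum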

After you have found the proper rmin, you can use the path integration of grad(g) described in the previous approach to estimate how low your minimum is relative to its value at some other point in the domain, i.e., g(rmin)-g(r0), if you desire to know this.