Optimising under arbitrarily many constraint equations

By dkl9, written 2024-218, revised 2024-218 (0 revisions)


Say we have a multivariate function to optimise, like f = x² + y² + z², under some constraints, like g1 = x² + y² - z and g2 = y + z - 1, both to equal zero.

The common method is that of Lagrange multipliers.

  1. Add a variable λ for each constraint function — here, we'll use λ1 and λ2.
  2. Declare the set of equations ∇f = λ1∇g1 + λ2∇g2.
  3. Bring in the equations g1 = 0 and g2 = 0 (etc, if there are more constraints).
  4. Solve for λ1, λ2 and, more importantly, the inputs x, y, z; a code sketch of the whole procedure follows this list.
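
Here is that sketch: a minimal Python rendition of steps 1 through 4, assuming SymPy is available, using the f, g1 and g2 from above. The variable names are my own.

  from sympy import symbols, solve

  x, y, z, lam1, lam2 = symbols('x y z lambda1 lambda2')

  f = x**2 + y**2 + z**2
  g1 = x**2 + y**2 - z
  g2 = y + z - 1

  # Step 2: each component of grad f, minus the matching component of
  # lambda1*grad g1 + lambda2*grad g2, must vanish.
  grad_eqs = [f.diff(v) - lam1*g1.diff(v) - lam2*g2.diff(v) for v in (x, y, z)]

  # Steps 3 and 4: adjoin the constraints and solve for everything at once.
  print(solve(grad_eqs + [g1, g2], [x, y, z, lam1, lam2], dict=True))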

Lagrange multipliers annoy me, insofar as they introduce extra variables. There is another way — arguably more direct, if perhaps more tedious in calculation and less often taught. I found it alone, tho surely someone else did first — probably Euler.

§ Lagrange, anyway

For the sake of a standard answer to check against, let's use Lagrange multipliers.

The gradient of x² + y² + z² is [2x, 2y, 2z]. Likewise, ∇(x² + y² - z) = [2x, 2y, -1], and ∇(y + z - 1) = [0, 1, 1]. So step 2 gives these equations:

  2x = 2λ1x
  2y = 2λ1y + λ2
  2z = -λ1 + λ2

From the first of these, it readily follows that λ1 = 1 or x = 0.

If λ1 = 1, then λ2 = 0, and z = -1/2. By the second constraint, y + z - 1 = 0, find that y = 3/2. By the first constraint, x² + y² - z = 0, find that x² = -11/4, which is a contradiction for real inputs.

If x = 0, then, by the first constraint, z = y², and, by the second constraint, y² + y - 1 = 0, so y = (-1 ± sqrt(5))/2 and z = (3 ∓ sqrt(5))/2.
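
As a sanity check on the two real candidates, one can substitute them back into both constraints and into f. A self-contained SymPy sketch, with names of my own:

  from sympy import sqrt, simplify, symbols

  x, y, z = symbols('x y z')
  f = x**2 + y**2 + z**2
  g1 = x**2 + y**2 - z
  g2 = y + z - 1

  for s in (1, -1):
      point = {x: 0, y: (-1 + s*sqrt(5))/2, z: (3 - s*sqrt(5))/2}
      # Both constraints should simplify to zero at each candidate;
      # the last value printed is f at that point.
      print(simplify(g1.subs(point)), simplify(g2.subs(point)),
            simplify(f.subs(point)))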

§ Determinants

With one constraint, the method of Lagrange multipliers reduces to ∇f = λ∇g. ∇f and ∇g are vectors, which differ by a scalar factor iff they point in the same (or directly opposite) directions iff (for three dimensions) the cross product ∇f × ∇g = 0 iff (for two dimensions) the two-by-two determinant |∇f ∇g| = 0.
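
For a concrete instance of the one-constraint case, here is a SymPy sketch with an f and g of my own choosing (not the running example): the nearest point to the origin on a plane.

  from sympy import symbols, Matrix, solve

  x, y, z = symbols('x y z')
  f = x**2 + y**2 + z**2      # squared distance from the origin
  g = x + 2*y + 2*z - 3       # a single (hypothetical) plane constraint

  def grad(h):
      return Matrix([h.diff(v) for v in (x, y, z)])

  # grad f is parallel to grad g exactly when their cross product vanishes.
  cross = grad(f).cross(grad(g))
  # Expect x = 1/3, y = z = 2/3: the point on the plane closest to the origin.
  print(solve(list(cross) + [g], [x, y, z], dict=True))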

With two constraints, the method asks when ∇f = λ∇g + μ∇h. That would mean ∇f is a linear combination of ∇g and ∇h, which it is iff ∇f, ∇g, and ∇h are all coplanar iff (for three dimensions) the three-by-three determinant |∇f ∇g ∇h| = 0.

As it happens, the cross product is a wolf that can wear determinant's clothing. Just fill one column with basis vectors: ∇f × ∇g = |∇f ∇g [ê1 ê2 ê3]|.
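
That identity is easy to check symbolically: treat the basis vectors as ordinary symbols, expand the determinant, and read off the components of the cross product as their coefficients. A SymPy sketch, with symbol names of my own:

  from sympy import symbols, Matrix, expand

  a1, a2, a3, b1, b2, b3, e1, e2, e3 = symbols('a1 a2 a3 b1 b2 b3 e1 e2 e3')

  # Columns: a stand-in for grad f, a stand-in for grad g, then the basis vectors.
  det = expand(Matrix([[a1, b1, e1],
                       [a2, b2, e2],
                       [a3, b3, e3]]).det())

  # The coefficient of each basis symbol is the matching component of a × b.
  print([det.coeff(e) for e in (e1, e2, e3)])
  print(list(Matrix([a1, a2, a3]).cross(Matrix([b1, b2, b3]))))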

Likewise, with zero constraints, the "method of Lagrange multipliers" — really, the first-derivative test — asks when ∇f = 0. Fill a three-by-three matrix with two columns of basis vectors: [∇f [ê1 ê2 ê3] [ê1 ê2 ê3]]. Suppose the basis vectors multiply like the cross product, as in geometric algebra. Then the determinant, rather than the usual 0 for a matrix with two equal columns, turns out to equal that ordinary column vector ∇f (up to a scalar constant).

In every scenario so far — and I claim this holds for higher dimensions and more constraints — the core equations to optimise under constraints are the actual constraint equations, along with a single determinant set equal to zero. The matrix has its columns filled with the gradient of the function to optimise, each constraint gradient, and copies of the basis vectors, in order, to make it square.

§ Example

Fill a matrix with the gradients computed above. We'll take its determinant.

  ∇f    ∇g1    ∇g2
  2x    2x     0
  2y    2y     1
  2z    -1     1

The determinant, when simplified, is 2x(1 + 2z). The equations to consider are just

  2x(1 + 2z) = 0
  x² + y² - z = 0
  y + z - 1 = 0

The first tells us that x = 0 or z = -1/2. If x = 0, z = y², so y² + y - 1 = 0, so y = (-1 ± sqrt(5)) / 2, and z = (3 ∓ sqrt(5))/2. If z = -1/2, then y = 3/2 and x is imaginary. These are the same results as above; the method works, using only the variables given in the problem.
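
The determinant method is just as mechanical to hand to a computer as the multiplier method was, and needs no extra variables. A minimal sketch, again assuming SymPy and using my own names:

  from sympy import symbols, Matrix, solve, factor

  x, y, z = symbols('x y z')
  f = x**2 + y**2 + z**2
  g1 = x**2 + y**2 - z
  g2 = y + z - 1

  def grad(h):
      return Matrix([h.diff(v) for v in (x, y, z)])

  # Square matrix of gradients: no basis-vector padding needed here, since
  # one objective plus two constraints already fill three columns.
  det = factor(Matrix.hstack(grad(f), grad(g1), grad(g2)).det())
  print(det)                                   # 2*x*(2*z + 1)
  print(solve([det, g1, g2], [x, y, z], dict=True))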