Multivariable Calculus, Locally Invertible

The Derivative of the Inverse

Let f be locally invertible near p, with f(p) = q. In other words, there is some g(y) such that g(f(x)) = x for x in a small neighborhood about p. (Here x and y are n dimensional vectors.) If g is differentiable, differentiate g(f(x)) = x using the chain rule. Thus g′(y)*f′(x) = 1, where y = f(x), 1 is the identity matrix, and * is matrix multiplication. In other words, the jacobian of the inverse function at q is the inverse matrix of the jacobian of f at p.
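As a sanity check on this formula, here is a minimal numerical sketch. The map f (polar to rectangular coordinates), its local inverse g, the point p, and the helper numerical_jacobian are illustrative choices, not part of the argument above; the jacobians are only approximated, by central differences.

```python
import numpy as np

def numerical_jacobian(func, point, h=1e-6):
    """Approximate the jacobian of func at point by central differences."""
    point = np.asarray(point, dtype=float)
    n = point.size
    columns = []
    for i in range(n):
        step = np.zeros(n)
        step[i] = h
        columns.append((func(point + step) - func(point - step)) / (2 * h))
    return np.column_stack(columns)

def f(v):
    """Polar to rectangular coordinates (an illustrative choice of f)."""
    r, theta = v
    return np.array([r * np.cos(theta), r * np.sin(theta)])

def g(w):
    """Local inverse of f, valid near points with r > 0 and -pi < theta < pi."""
    x, y = w
    return np.array([np.hypot(x, y), np.arctan2(y, x)])

p = np.array([2.0, 0.5])
q = f(p)

# Jacobian of g at q times jacobian of f at p should be close to the identity.
product = numerical_jacobian(g, q) @ numerical_jacobian(f, p)
print(np.round(product, 6))
```

The printed matrix should agree with the 2×2 identity up to the error of the finite difference approximation.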

But is g differentiable? Move coordinates to the origin for convenience. Thus f(0) = 0, and the jacobian is still J. Let K be the inverse matrix of J. We are interested in bounding g(y)-K*y over |y|. We know we can bound f(x)-J*x over |x|, since f is differentiable at 0. Restrict this quotient to ε by keeping x in a small neighborhood about 0, then restrict y to the image of this neighborhood. Now g(y)-K*y over |y| becomes x-K*f(x) over |f(x)|. Replace f(x) with J*x+|x|×e(x), where e is the error term whose length is bounded by ε. Since K*J is the identity matrix, the numerator becomes -|x|×K*e(x), where * indicates multiplication of the error vector e through the matrix K. The denominator is the length of J*x+|x|×e(x).

How is the length of x affected by its transform through J? This requires the theory of quadratic forms, an advanced topic in linear algebra which is outside the scope of this page. Suffice to say, there is a minimum and maximum scaling factor, namely the square roots of the minimum and maximum eigen values of the symmetric matrix formed by multiplying the transpose of J by J. Let t be the minimum scaling factor, which is positive because J is nonsingular. Thus the length of x is scaled by at least t. Make sure ε is much smaller than t, so that the entire denominator is at least (t-ε)×|x|, which looks like t×|x|. Similarly, let s be the maximum scaling factor of K, hence the numerator is no more than s×ε×|x| distance from the origin. Now the entire fraction is bounded by s×ε/(t-ε), roughly (s/t)×ε. Do this for arbitrarily small ε, and g satisfies the definition of differentiable, with jacobian K.
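The chain of estimates can be written out in one line. This is only a sketch, under the assumptions already made in the text: y = f(x), f(x) = Jx + |x| e(x) with |e(x)| ≤ ε, J scales lengths by at least t, K scales lengths by at most s, and ε < t.

```latex
\frac{|g(y) - Ky|}{|y|}
  = \frac{|x - K f(x)|}{|f(x)|}
  = \frac{|x|\,|K e(x)|}{\bigl|\,Jx + |x|\,e(x)\bigr|}
  \le \frac{s\,\varepsilon\,|x|}{(t - \varepsilon)\,|x|}
  = \frac{s\,\varepsilon}{t - \varepsilon}
```

The right side tends to 0 with ε, and is close to (s/t)×ε once ε is much smaller than t.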

Criteria for Locally Invertible

Fine, but how do we know f is locally invertible in the first place? A nonsingular jacobian at a single point is not sufficient. We provided a counterexample in one variable, hence there is no assured inverse function in n variables.
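A standard one-variable example of this failure, which may or may not be the one referred to above, is f(x) = x + 2x²sin(1/x) with f(0) = 0. Its derivative at 0 is 1, yet the derivative is negative at points arbitrarily close to 0, so f is not monotonic, and hence not invertible, on any neighborhood of 0. The sketch below evaluates the derivative at sample points shrinking towards 0; the name fprime and the choice of sample points are illustrative.

```python
import math

def fprime(x):
    """Derivative of f(x) = x + 2*x**2*sin(1/x) at x != 0 (f'(0) itself is 1)."""
    return 1 + 4 * x * math.sin(1 / x) - 2 * math.cos(1 / x)

samples = []
for k in (1, 10, 100, 1000):
    x_down = 1 / (2 * math.pi * k)        # cos(1/x) = 1 here, slope near -1
    x_up = 1 / ((2 * k + 1) * math.pi)    # cos(1/x) = -1 here, slope near 3
    samples.append((fprime(x_down), fprime(x_up)))
    print(k, x_down, fprime(x_down), x_up, fprime(x_up))
```

The slope keeps switching sign as the sample points approach 0, which is exactly what a continuous derivative at 0 would rule out.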

However, if f is differentiable about 0, and the derivative is continuous at 0, so that J remains nonsingular near 0, then yes, f is locally invertible.

Here is the idea. If f(p) = f(0), draw a straight line from 0 to p, whose image is a loop in the range. As u moves along the path from 0 to p, its image v = f(u) starts at 0 and traces a path that returns to 0. Let v leave 0 traveling perpendicular to a hyperplane w. Eventually v has to return to w, since it comes back to 0. The distance to w is a continuous function on [0,1], and it attains its maximum. At that point the path has turned 90°. The image point v left 0 heading out of the plane w, and as it turned back towards w, it had to travel parallel to w, at least for an instant. The direction vector, tangent to the path, has changed by 90°. We will use continuity to constrain the direction vector, so this cannot happen. In other words, the function is too smooth for v to turn around in such a short time.

First we need to find a neighborhood in which the direction vector is well defined. The jacobian J need not be symmetric (mixed partials being equal makes a matrix of second derivatives symmetric, not a jacobian), but the transpose of J times J is symmetric. A theorem in advanced linear algebra says a symmetric matrix has orthogonal eigen vectors. Since J is nonsingular, the eigen values of this product are all positive. It follows, via the singular value decomposition, that there are orthonormal frames of reference in the domain and in the range in which J merely scales the axes. Rotate the frames of reference in the domain and the range, and perform any reflections, so that J becomes diagonal with positive entries. The length of a vector, when fed through J, is now multiplied by something in between b and c, the smallest and largest entries on the diagonal of J. Thus J is a bounded linear operator, and since b is positive, J never shrinks a nonzero vector to 0.

Since the jacobian is continuous at 0, its entries stay close to those of J near 0, and so do its minimum and maximum scaling factors. The minimum scaling factor can be held close to b, and the maximum scaling factor can be held close to c. For some neighborhood about 0, each vector, multiplied through the jacobian, has its length scaled by something between b-ε and c+ε. Choose ε less than b, and the directional derivative is never 0. Let u travel in any direction, and as long as it is close to 0, v is traveling in a certain direction. It does not pause, even for an instant. In fact its speed, relative to the speed of u, is somewhere between b-ε and c+ε.
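The scaling bounds can be checked numerically. In this sketch the matrix J is an arbitrary nonsingular example, not taken from the text, and b and c are computed as its smallest and largest singular values with numpy.linalg.svd; when J is diagonal with positive entries these are just the smallest and largest diagonal entries.

```python
import numpy as np

# Hypothetical nonsingular jacobian, chosen only for illustration.
J = np.array([[2.0, 1.0],
              [0.5, 1.5]])

singular_values = np.linalg.svd(J, compute_uv=False)
b, c = singular_values.min(), singular_values.max()

# Feed many random vectors through J and record how their lengths are scaled.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(1000, 2))
images = vectors @ J.T                      # each row is J times a vector
ratios = np.linalg.norm(images, axis=1) / np.linalg.norm(vectors, axis=1)

print(b, ratios.min(), ratios.max(), c)
```

Every observed ratio should fall between b and c, and b is strictly positive because J is nonsingular.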

As u and v move together, take the velocity of v and divide by speed, giving a unit vector that indicates direction. The direction of travel for u defines the direction of travel for v. This is a map from unit vectors to unit vectors; a function from the sphere to the sphere.

The direction of v is a mathematical function of J and the direction of u, which is continuous in all variables. Move x just a bit, or let u travel in a slightly different direction, and v changes its direction just a bit.

Cross the unit sphere with a tiny closed ball and find a compact product space. Continuous on this compact space becomes uniformly continuous. Some δ assures us that the direction of any path in the range will not change by more than 15°. Restrict the neighborhood about 0 to something smaller than δ. Wherever we are in this neighborhood, close to the origin, if u travels in a fixed direction, v travels in a certain direction, plus or minus a few degrees. If u moves in a straight line, v cannot turn 90°. That proves f is 1-1.
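To make the angle bound concrete, here is a rough numerical sketch. The map f(x, y) = (x + y², y + x²), the radius delta = 0.05, and the random sampling are all assumptions made for illustration; sampling only suggests the bound, it does not prove it.

```python
import numpy as np

def jacobian(p):
    """Jacobian of the illustrative map f(x, y) = (x + y**2, y + x**2)."""
    x, y = p
    return np.array([[1.0, 2.0 * y],
                     [2.0 * x, 1.0]])

def direction(p, d):
    """Unit direction of v = f(u) as u passes through p heading along d."""
    velocity = jacobian(p) @ d
    return velocity / np.linalg.norm(velocity)

def random_point(rng, radius):
    """A random point in a square that fits inside the ball of the given radius."""
    return radius * rng.uniform(-1, 1, size=2) / np.sqrt(2)

delta = 0.05
rng = np.random.default_rng(1)
max_angle = 0.0
for _ in range(2000):
    d = rng.normal(size=2)
    d /= np.linalg.norm(d)                  # fixed direction of travel for u
    p1 = random_point(rng, delta)
    p2 = random_point(rng, delta)
    cosine = np.clip(direction(p1, d) @ direction(p2, d), -1.0, 1.0)
    max_angle = max(max_angle, float(np.degrees(np.arccos(cosine))))

print(max_angle)
```

For this map and this radius the largest observed change in direction stays well under 15°, so an image path cannot turn 90° while u moves in a straight line inside the ball.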

Proving f is surjective is just as painful. Take the radius δ of our neighborhood about 0 and multiply it by b/2. This defines a radius about 0 in the range. Let y be any point in this neighborhood. Let u take steps towards an unknown destination, as v takes steps towards y. Let's see how to take that first step.

Send y through J inverse, giving a vector x₁ in the domain. Let u move from 0 to x₁, while v moves from 0 to y₁ = f(x₁). If f were linear, we would be done. The image v would wind up at y, and f(x₁) would equal y. Of course f isn't linear, but it's pretty close. The velocity of v, in speed or direction, has not changed much from its initial value. Thus v winds up pretty close to y, closer than it was before.

Starting at x₁,y₁, build a new path to y. Now u stops at x₂ and v stops at y₂, and y₂ is closer to y than y₁. Repeat this process, as v homes in on y. If you do everything right, you can approach y geometrically. A sequence in the domain maps to a sequence in the range, whose limit is y. The steps in the domain shrink geometrically, so the sequence is Cauchy, with a limit of x. By continuity, f(x) = y.
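The stepping process can be sketched in a few lines. This is one simple reading of it, not necessarily the exact construction intended: every step reuses K, the inverse jacobian at the origin, to correct the current gap between f(x) and y. The map f, the target point, and the name approach are illustrative assumptions.

```python
import numpy as np

def f(p):
    """Illustrative map with f(0) = 0 whose jacobian at 0 is the identity."""
    x, y = p
    return np.array([x + y**2, y + x**2])

J = np.eye(2)            # jacobian of f at the origin
K = np.linalg.inv(J)     # its inverse

def approach(target, steps=30):
    """Step towards a preimage of target, sending each remaining gap through K."""
    x = np.zeros(2)
    gaps = []
    for _ in range(steps):
        x = x + K @ (target - f(x))
        gaps.append(float(np.linalg.norm(target - f(x))))
    return x, gaps

target = np.array([0.05, -0.03])
x, gaps = approach(target)
print(x, gaps[:4])
```

For a target this close to the origin the recorded gaps shrink by roughly a constant factor per step, which is the geometric approach described above.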

For some neighborhood about 0, f is 1-1 and onto, and invertible.

I realize I skipped past a lot of details, but a rigorous δ ε proof would go on for several pages, and I think you'd lose interest. I probably wouldn't get all the details right anyways. On an intuitive level, I think we know what is going on. Close to the origin, f keeps the velocity of v pretty close to constant, as long as u moves in the same direction, and that's enough to keep v from turning around on itself, as it homes in on any point in the range.

Bicontinuous Function

We focused on the origin, but that was a matter of convenience. Given any open set o and any point p in o, the image of o contains a neighborhood of f(p). Thus the image of o can be covered in open sets, and is open. If the jacobian of f is continuous and nonsingular throughout a region, f is locally bicontinuous on that region. Open sets in the domain correspond to open sets in the range. And if f is also injective, it implements a homeomorphism between the region and its image.