At some point in our mid-teens, we breathed and ate the arithmetic-geometric mean inequality:

$$\frac{a_1 + a_2 + \dots + a_n}{n} \geq \sqrt[n]{a_1 a_2 \cdots a_n}$$

for any $n$ and any non-negative integers $a_1, a_2, \dots, a_n$.

No doubt we pitied the poor souls who needed to have it stated. No other inequality could compare to it. In fact, we (meaning I) knew of nearly no other inequality: we saw the triangle inequality as common sense, knew it had an odd-looking higher-dimensional cousin called “Minkowski’s inequality”, and had seen Cauchy’s inequality once, but we’d never seen any of them applied in real life. (Admittedly, the triangle inequality gets used every time we walk straight to a place instead of taking the sides, but real life does not count as “real life” for the purposes of this paragraph.)

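Then we did not hear anything about inequalities for a little while – and then Cauchy((-Bunyakovsky)-Schwarz) became ubiquitous. For non-mathematicians: Cauchy’s inequality states that

$$|a_1 b_1 + a_2 b_2 + \dots + a_n b_n| \leq \sqrt{|a_1|^2 + \dots + |a_n|^2} \cdot \sqrt{|b_1|^2 + \dots + |b_n|^2}$$

for any real or complex numbers $a_1, \dots, a_n, b_1, \dots, b_n$, or, verbally, “the dot product of two vectors is at most the product of their lengths”. A special case of it, namely,

$$|a_1 + a_2 + \dots + a_n|^2 \leq n \left(|a_1|^2 + |a_2|^2 + \dots + |a_n|^2\right),$$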

is the Kalashnikov of analytic number theory: it can be used with ease by even a poorly trained student, with an impressively high kill ratio. In experienced hands, it is often one of several tools used to assault successfully an apparently impregnable target.

As soon as we learned some basic functional analysis, Cauchy and Minkowski fell into place, together with Hölder, of which Cauchy is a special case: Minkowski turned out to have stated that the expressions $\left(|a_1|^p + \dots + |a_n|^p\right)^{1/p}$ are bona fide norms even for $p \neq 2$ (provided $p \geq 1$), and Hölder shows that the dual of such a norm is another one such norm. What is meant by “dual” here is an easily explainable notion with a great deal of depth –

– but we are getting farther away from our topic: what on earth happened to the arithmetic-geometric mean inequality? Only very, very rarely does it appear in actual mathematics. (To avoid a fuzzy region, I’ll count only occurrences with $n > 2$ as genuine.) Poor inequality. What happened?

Some takes on the issue:

- All of the other inequalities mentioned here have natural interpretations in terms of norms.
- Was the arithmetic-geometric mean inequality ever considered important in mathematics, or is it just a mirage created by maths competitions? In the former case, recur – when was it thought important and why, and when did this stop being the case? In the latter case – how did this meme enter teen maths competition culture, and why has it persisted?
- As a friend pointed out at lunchtime, the arithmetic-geometric mean inequality is equivalent to the statement that the exponential function is convex. In this guise, the inequality lives on.

Other possibilities? Should we put this to a vote?

I can only comment from the math competition point of view (while I concentrate on teaching number theory in the Finnish math olympiad training, I have also taught inequalities several times). I think the reason the arithmetic-geometric inequality lives on in math competitions is that it’s pretty and extremely easy to understand (although also very weak: often, if you use it and then still need one more inequality, you do not end up with the bound you were after). Also, it’s a very good example of the use of the Jensen inequality (which, in my opinion, is one of the best inequalities to be taught in the math olympiad training, as so many inequalities can be reduced to it).

Also, with the arithmetic-geometric inequality you know where you stand, and what the result should look like. If you can guess when the equality holds, then you can try introducing some weights (i.e., chopping the terms into pieces so that the pieces are equal when the equality holds), etc.
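A toy illustration of this chopping trick (the problem is chosen here for illustration only and does not come from the thread): to minimise $x^2 + 2/x$ over $x > 0$, one guesses that the minimum is at $x = 1$, where $x^2 = 1$ and $2/x = 2$. Splitting $2/x$ into two equal pieces makes all three terms equal at $x = 1$, and the case $n = 3$ of the inequality then gives the bound directly:

```latex
x^2 + \frac{2}{x}
  = x^2 + \frac{1}{x} + \frac{1}{x}
  \geq 3 \sqrt[3]{x^2 \cdot \frac{1}{x} \cdot \frac{1}{x}}
  = 3,
\qquad \text{with equality exactly when } x^2 = \frac{1}{x}, \text{ i.e. } x = 1.
```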

So what I guess I’m trying to say is that the arithmetic-geometric inequality is educationally a very good tool. However, I’m not so sure that it’s actually so popular in IMO problems anymore (except maybe with n=3).

Anne-Marie: the arithmetic-geometric mean inequality is really easy to get used to, but I wonder whether most teenagers who use it have seen the explanation that really makes it seem obvious (namely, convexity of exp). Of course, it sounds as if you were giving them that explanation (see: Jensen), but I don’t think everybody does.

I also think that most of the good things you say about the inequality from a pedagogical standpoint would also apply to Cauchy.

I always teach them Jensen, because I think that it is the best weapon for inequalities in math competitions. I had assumed that everybody teaches it, but of course, I really don’t know.

Probably many of those things would apply to Cauchy, but arithmetic-geometric is probably a bit easier to use at first: you only have one set of variables, for instance, so you just try to substitute something nice for the a_i’s, and you don’t have to figure out how the b_i’s would change the situation.

Interesting. One can consider the function f(x) = ( [a_1^x + … + a_n^x] / n )^{1/x} for non-negative a_i’s. This function is MONOTONE in x. Hence you get all your favourite inequalities, noting that f(-\infty) = min a_i, f(+\infty) = max a_i, …, f(-1) is the harmonic mean, f(0) is the geometric mean (in the limit!), f(1) is the arithmetic mean, etc.

Did you know that?
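For readers who would like to see the monotonicity numerically, here is a minimal Python sketch. The function name `power_mean` and the sample values are assumptions made for illustration; the sketch restricts itself to strictly positive a_i so that negative exponents are well defined, and it treats x = 0 and x = ±∞ as the limiting cases mentioned above.

```python
import math


def power_mean(values, x):
    """f(x) = ((a_1^x + ... + a_n^x) / n)^(1/x) for strictly positive a_i."""
    if any(a <= 0 for a in values):
        raise ValueError("this sketch assumes strictly positive values")
    n = len(values)
    if x == 0:
        # Limiting case x -> 0: the geometric mean.
        return math.exp(sum(math.log(a) for a in values) / n)
    if x == math.inf:
        # Limiting case x -> +infinity: the maximum.
        return max(values)
    if x == -math.inf:
        # Limiting case x -> -infinity: the minimum.
        return min(values)
    return (sum(a ** x for a in values) / n) ** (1.0 / x)


# Sample data (arbitrary, chosen only for illustration).
values = [1.0, 2.0, 4.0, 8.0]
exponents = [-math.inf, -2, -1, 0, 1, 2, math.inf]
means = [power_mean(values, x) for x in exponents]

for x, m in zip(exponents, means):
    print(f"f({x}) = {m:.6f}")
```

The printed values increase from min a_i to max a_i, with the harmonic, geometric and arithmetic means appearing in that order at x = -1, 0, 1. A numerical check on one data set is of course not a proof.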

Stas: I knew the l_x norm was a monotone function of x on spaces of measure 1 (such as this one), but rarely think of x<1 in this context (since the l_x norm isn't really a norm then – Minkowski's inequality stops being valid). If I ever knew what you said, I had forgotten!

The AM-GM inequality is equivalent to the convexity of the logarithm function and this is used, for instance, in measure theory in showing that the maximum entropy for a sequence of independent trials is attained for the uniform distribution.

You are right, Keivan: the AM-GM inequality is equivalent both to the fact that the logarithm function is convex-down and (as said before) to the fact that the exponential function is convex(-up). The fact that entropy is maximised by equidistribution is a very nice application of this.
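The entropy statement can be read off from Jensen's inequality for the logarithm: for a probability vector (p_1, …, p_n) with all p_i > 0, the sum of p_i log(1/p_i) is at most log of the sum of p_i · (1/p_i), which is log n. The following Python sketch checks this numerically; the helper `random_distribution`, the choice n = 6 and the number of trials are assumptions made for illustration only.

```python
import math
import random


def entropy(p):
    """Shannon entropy -sum p_i log p_i (natural log); zero terms contribute 0."""
    return -sum(q * math.log(q) for q in p if q > 0)


def random_distribution(n, rng):
    """A random probability vector of length n (hypothetical helper for this sketch)."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


n = 6
rng = random.Random(0)  # fixed seed so that the run is reproducible
uniform = [1.0 / n] * n

# Largest entropy seen among many random distributions on n points.
largest = max(entropy(random_distribution(n, rng)) for _ in range(10_000))

print(f"log n              = {math.log(n):.6f}")
print(f"entropy(uniform)   = {entropy(uniform):.6f}")
print(f"largest random one = {largest:.6f}")
```

The uniform distribution attains log n, and none of the sampled distributions exceeds it.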

Hölder’s inequality is usually proved using Young’s inequality, which in turn is a consequence of the strict concavity of the logarithm. So perhaps the AM-GM inequality is a sort of invisible hand in real life mathematics.
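For completeness, here is a sketch of that step, assuming the usual form of Young's inequality for $a, b > 0$ and exponents $p, q > 1$ with $1/p + 1/q = 1$. Concavity of the logarithm, applied with weights $1/p$ and $1/q$ to the points $a^p$ and $b^q$, gives:

```latex
\log\left(\frac{a^p}{p} + \frac{b^q}{q}\right)
  \geq \frac{1}{p} \log\left(a^p\right) + \frac{1}{q} \log\left(b^q\right)
  = \log a + \log b
  = \log(ab),
\qquad \text{hence} \qquad
ab \leq \frac{a^p}{p} + \frac{b^q}{q}.
```

The last step uses only that the logarithm is increasing.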

Good point. At the same time, I would rather see the concavity (“convex-downness”) of the logarithm as lying at the root of all of this – the AM-GM inequality is some sort of sideshoot. The concavity of the logarithm is obvious from a graph; this can’t be said of the AM-GM inequality.

Is there any relation between what is meant here by Young’s inequality (namely, $ab \leq \frac{a^p}{p} + \frac{b^q}{q}$, where $\frac{1}{p} + \frac{1}{q} = 1$) and Young’s inequality in functional analysis, or is it just a case of two inequalities’ being due to the same person (or not even)?

Harald, you may find it slightly hard to determine which Young they were actually due to;

see

http://www-history.mcs.st-andrews.ac.uk/history/Biographies/Chisholm_Young.html

Dear Thomas –

Drats! But are both results at least due to the same equivalence class of cardinality 2?

Hi Harald,

Yes, according to my memory and a quick Google search. I noticed that the paper in which the harmonic-analysis inequality appeared uses “I” rather than the mathematical “we”. Do you know when this custom developed?

Thomas

I am not sure. I recall Brandt (of somewhat obscure quadratic-forms fame) used “I” (or rather “ich”) in ways that stuck out particularly: paraphrasing slightly, “I have done this, whereas they did not, he wouldn’t, and you never could.” Quite confusingly, (E.) Landau would use “ich” but refer to himself in the third person as “Verfasser”. (3. Recent results; 3.1. Hilbert 3.2. Hadamard 3.3. Verfasser.) I think Littlewood states at some point in his Miscellany that he wondered for some time in his youth who that fine mathematician, Mr. Verfasser, could be and where he might be working.

(Note: “Verfasser” means “author”; some prefer it to “Autor” due to Germanic-purism issues.)

Remarks of obscure importance on adjectives:

1) AM-GM is equivalent to the concavity of the logarithm. That you see the latter easily by drawing the graph is merely a psychological advantage. So to call it a “sideshoot” seems a bit unfair. I stick to “invisible hand”!

2) Brandt and his matrices are pretty well known to anybody who has dealt with supersingular elliptic curves and isogenies between them (cf. Gross’s “Heights and special values of L-series”, Mestre’s “La méthode des graphes” and a bunch of papers on cryptography). Is it fair to qualify his fame as “obscure”?

While the arithmetic-geometric mean inequality does follow from the convexity of the exponential, I believe that it is very important to point out to undergraduate students that it can be proved in a completely elementary way. I think it is disturbing when students try to “nuke” problems instead of trying to understand them from an elementary point of view.

I would see convexity as a very intuitive concept that can and should be introduced early. The convexity of the exponential is something that can be immediately and literally seen. One can then deduce the arithmetic-geometric mean inequality (if at all needed) from this – this can be done in a few lines, or perhaps left as an exercise.
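One way the few lines might go, as a sketch: assume all $a_i > 0$ (if some $a_i = 0$, the geometric mean is zero and there is nothing to prove) and write $a_i = e^{t_i}$. Convexity of the exponential, in the form of Jensen's inequality with equal weights $1/n$, then gives:

```latex
\sqrt[n]{a_1 a_2 \cdots a_n}
  = \exp\left(\frac{t_1 + t_2 + \dots + t_n}{n}\right)
  \leq \frac{e^{t_1} + e^{t_2} + \dots + e^{t_n}}{n}
  = \frac{a_1 + a_2 + \dots + a_n}{n}.
```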

The arithmetic-geometric mean inequality simply says that if one has a rectangular box, then its volume is no greater than that of the cubical box whose side length is the average of the lengths of the sides of the rectangular box. Said this way, the inequality is both intuitively obvious and easy to remember. Also its role in mathematics, as a sort of primitive version of the isoperimetric inequality, is evident. In fact, a standard proof of the isoperimetric inequality goes via the Brunn-Minkowski inequality, which is just what one obtains from the arithmetic-geometric mean inequality by approximating general convex regions by rectangular boxes (such a proof can be found, for example, in Federer’s book on geometric measure theory).

Since the isoperimetric inequality is essentially equivalent to the Sobolev inequality, one can say that both are rooted in the arithmetic-geometric mean inequality, which is to say a big chunk of differential geometry and a big chunk of elliptic PDE theory are rooted in this inequality. One could see this all simply as a manifestation of the concavity of the logarithm, but, at least to the geometrically minded, it makes more sense to look at it the other way around, since the statement about boxes which began this comment is more basic than is the concavity of the logarithm.

This is very interesting – and I certainly would agree that the isoperimetric inequality essentially rests on this. At the same time, I am not sure I agree that the fact that a box of equal sides maximises volume is obvious – especially in dimensions higher than 2. It is a fact that is very easy to remember, yes, but its apparent obviousness to some may just arise from the fact that we are used to it. The concavity of the logarithm seems to me to be something that can be seen in a more immediate sense.

You could say that the optimality of the square (dim=2), at the very least, can also be seen from a graph. However, the concavity of the logarithm is both something that can be seen readily and proven in an absolutely mechanical way, whereas I do not think one can say quite the same for the isoperimetric inequality, except in the case of boxes.

That said, thank you for the link to the Sobolev inequality. Can you give more details (e.g., how the isoperimetric inequality is equivalent to the Sobolev inequality; I thought several inequalities went under the latter heading)?

Late joining the discussion, but…a pleasing but obscure area of nonlinear programming was Geometric Programming, which in its original form is the minimization over strictly positive variables of a “posynomial” function subject to posynomial constraints g(x)=1) by a “proper posynomial” constraint. Harmonic Programming also required a sequence of optimizations which converged to a stationary point.

GP started in the late ’60s/early ’70s. I wrote my dissertation on a financial application in ’77. Better nonlinear programming methods appeared, and interest in GP flagged by the 1980s. I notice a resurrection since 2000, as so often happens with these kinds of methods. A recent tutorial is at http://www.stanford.edu/~boyd/papers/pdf/gp_tutorial.pdf.