(Short story for those who have been following the matter in the last couple of weeks: Babai’s fix is both correct and elegant – and I spent last week going through other things to make sure that the overall procedure was correct as well. I gave a Bourbaki talk on all of this last Saturday; here is the video.)

Bourbaki talks, and the articles that accompany them, are expository. As many readers know, they are given by invitation; speakers are designated by N. Bourbaki (who does not exist) and assigned to work on a recent paper, or series of papers, by other people. The topic is typically close to one of the speaker’s specialties, but not quite within it.

A Bourbaki article has a target audience broader than that of specialists, though of course it is still aimed at people in maths and allied fields. It is often a step in the process by which a new proof is elucidated, polished, and in general assimilated into the general body of knowledge. I have tried to do my best for Babai’s remarkable work.

I also hope this will help lead to further improvements in the area, now that the correctness and precise strength of the result are clearer.

]]>

From Wikipedia:

“The **Séminaire Nicolas Bourbaki** (**Bourbaki Seminar**) is a series of seminars (in fact public lectures with printed notes distributed) that has been held in Paris since 1948. It is one of the major institutions of contemporary mathematics, and a barometer of mathematical achievement, fashion, and reputation. It is named after Nicolas Bourbaki, a group of French and other mathematicians of variable membership.”

As you can see from the link, a Bourbaki talk is always given by a speaker about other people’s work. It is accompanied by an expository paper explaining difficult recent material.

It is an honor and a pleasure of me to give a Bourbaki talk on Babai’s major breakthrough. This has been an exciting story.

I understand that my talk will be streamed live: live video. The notes will be made available online in the very near future. I hope they will make it easier for all others to follow the proof themselves.

]]>

Preparing a Bourbaki talk implies (a) going through somebody else’s work in great detail, and (b) preparing an expository paper on the subject. My exposition of Babai’s work is ready and will be publicly available in the next few days. It is a complete walkthrough of the proof, and should allow others to verify, as I did, that the modified version is correct.

Here is an excerpt from my introduction (in French in the original):

“Thm 1.1 (Babai) .- The string isomorphism problem can be solved in time exp(exp(O(sqrt(log n log log n)))) for strings of length n.

It is clear that the bound here is sub-exponential, but not quasipolynomial. In November 2015, Babai announced a solution in quasipolynomial time, with an explicit algorithm. The process of preparing this expository work confirmed that the algorithm was correct, or easily repairable, but it also made me realize that the time analysis was incorrect. The version announced here is correct.

Thm. 1.2 (Babai).- The graph isomorphism problem can be solved in time exp(exp(O(sqrt(log n log log n)))), where n is the number of vertices.

Our main references will be [Babai’s 2015 preprint and his extended abstract at STOC 2016]. We will attempt to examine the proof in as much detail as an expository work of this format allows, in part to help eliminate any doubt that may remain on the current form of the result. The error lay in a part of the proof that can be isolated and corrected, and might be later improved on its own (“Split or Johnson”). The rest of the work – rich in innovative ideas – is still valid.”

See also Laci Babai’s own announcement on the subject:

http://people.cs.uchicago.edu/~laci/update.html

**Update (Jan 14): ** As many of you know, Babai posted last Monday that he had fixed the problem. This is just to tell you that (a) his fix is correct (and elegant), (b) I have spent the last five days checking other things in the proof and making some bits explicit. I can now state with assurance that the proof is correct: Babai is now giving an algorithm that works in quasipolynomial time. See the video of my Bourbaki talk at https://www.youtube.com/watch?v=7NR975OM2G8&list=PL9kd4mpdvWcCN64K5VhaYFH_gBa-WlIr3&index=1

I will put up the corresponding expository article soon.

]]>

Summer was full of activities and travel. After giving a course at the summer school on analytic number theory at IHES, I went to Korea for most of August (ANTS and of course the ICM). Then I went back to Paris and got the keys to my new office at IMJ (Paris VI/VII), where I will be from now on as a CNRS *Directeur de recherche*. The office is in the Paris VII campus, close to the Bibliothèque Nationale and, most importantly, the Cinémathèque.

… and then I left for Saint Petersburg, to spend a trimester there as a Lamé Chair. During that time, I was offered a Humboldt Professorship (at Göttingen) , which I am now considering.

I have been back in Paris since late December, after relatively brief trips to Perú and the Czech Republic. I also managed to lose the keys to my office in the interval. It took a very long time to get another set of keys.

On a different note – my survey paper on growth in groups got accepted; it is about to appear. I also spent a great deal of time rewriting my proof of the Ternary Goldbach Conjecture – it is now in an essentially self-contained monograph. Besides adding expository material, I simplified some parts of the proof – notably the part on parabolic cylinder functions.

I have some partial drafts of planned blog posts from the last few months. Let me include here a brief set of impressions from my Korea trip. I did this in French, just to keep my chops up.

Ma première conférence en Corée a eu lieu – comme il est habituel dans certains pays – dans un hôtel, quelque peu isolé du reste de la ville de Gyeongju. En route de la gare, il était possible de voir un autre visage de la Corée que celui d’un pays hautement industrialisé; au fait, des parties de Gyeongju ressemblent à certaines villes de province d’un pays en développement, avec une intense activité commerciale conduite dans des petits locaux d’aspect plutôt pauvre. L’hôtel, au bord d’un lac, était luxueux, au moins dans ce que concernait ses espaces communs; il portait le nom d’un des conglomérats qui dominent la vie économique du pays. Après peut-être une demi-heure, j’ai eu envie de partir et voir un peu du monde hors ses murs. Pour être précis: je suis parti avec l’intention de trouver du 청국장 – c’est-à-dire, le fameux tofu puant coréen.

Je disposait de peut-être dix mots de coréen. Heureusement, le chauffeur du taxi commandé par l’hôtel n’était pas seulement un homme de bonne volonté, mais aussi très expressif, au point que j’arrivais à croire que je comprénnais une petite partie de ce qu’il me disait. Il m’a laissé à l’entrée d’un petit restaurant traditionnel qui n’avait pas du tofu puant. Là, quelqu’un m’a dirigé vers ce qui s’est avéré être un atelier semi-urbain où, à juger par un odeur très prometteur, du tofu puant était fabriqué. Malheureusement, l’atelier était fermé.

J’ai retrouvé le taxi, lequel reprenait sa route vers la ville. Après assez d’efforts, le chauffeur a trouvé un restaurant populaire pas loin de l’hôtel; après avoir vérifié qu’au moins un plat à base de tofu puant était servi, il est parti en refusant d’accepter qu’une partie de ce que le taximètre indiquait. En très peu de temps, je me suis trouvé face à un festin destiné qu’à moi, consistant en un ensemble de petits plats, y inclus du poisson salé, des grandes feuilles ressemblant à la menthe (mais en beaucoup mieux), et, bien sûr, du 청국장.

Ce dernier ne puait pas du tout; il avait plutôt un bouquet qui ressemblait à ceux de certain fromages, a savoir ceux dont on dit qu’ils puent.

Something that truly excited me about Korea was the possibility to use my newly-found knowlege of the Korean script. Indeed, it really helped me to get around, even without any actual knowledge of Korean. I had studied the Korean script twice: once, when Don Zagier taught me the basic principles in twenty minutes after dinner at the British Mathematical Colloquium; the second time, when I skimmed the wikipedia page on it on the morning of the day when I flew to Korea.

As you might have gathered, it is, so to speak, very logical – in fact, it was designed specifically for the Korean language, by a small committee of rather talented 15th-century people. The name it was given once by some (아침글, “one-morning writing”) was presumably meant to be pejorative, but it does reflect how quickly one can learn it. Or, to put it in a much more complex script, 故智者不終朝而會，愚者可浹旬而學 (or so I read).

A mystery remains: why is it that, in Korea, good local food is cheap, as are alcoholic drinks, but coffee is expensive? Or is this just the price that people pay to be seen in a Westernized coffee shop – that is, can one get less expensive coffee elsewhere?

]]>

Let me first announce the following conference, which will take place in St Petersburg at the end of November:

I hope it will turn out rather well. Please tell us if you want to attend – though, as we have limited funds, you would essentially have to fund your trip through your own grant (or your advisor’s, if you are a student, but we *might* be able to help with lodging in that case.) Things may be easier if you are in Russia. All are welcome!

]]>

an article (by Martin Andler, from Versailles) on the invited speakers at this year’s ICM, and, in particular, on geographical shifts in their career.

Besides offering some tables, the article briefly enumerates a few perspectives on the movement of mathematicians between countries, without really endeavouring to resolve the tensions between them. My aim here will be to give a summary of the situation and what I see as a second step in the analysis of these issues, in part with the hope that readers of this blog will discuss them further.

In case you have not looked at the tables yet: the three countries in which the most speakers were born are France (30 speakers), the Soviet Union (27) and the United States (26); the next countries in the list, far behind, are Germany and the UK (12 each). This should be completely unsurprising to anybody within the mathematical community. Then we have Italy and China, each at 9, and Hungary at 8. (South America – indeed all of Latin America – has a grand total of 7, including 3 from Argentina and 2 from Brazil.)

The birth-to-PhD and PhD-to-work charts show that the US acts as a very strong attractor for prospective doctoral students (58.5 speakers born elsewhere got their doctoral degrees in the US, and no speakers born in the US got their PhDs elsewhere; here, fractional numbers indicate shared positions/studentships and the like), but not as a workplace (28.5 speakers left after getting their PhD in the States, and 21.5 moved to the States after getting their PhD elsewhere). France works almost as a closed system at the educational level (only 3 speakers born in France got their PhDs elsewhere, and 3.5 did the inverse move) and as a very mild attractor as far as jobs are concerned (8.5 foreign PhDs moved to France, including 4 from the US, and 4 French PhDs moved out of France). As for the Soviet Union – 23.5 out of 27 speakers born there live *outside* the successor states (13 of them in the US), 10 moved out already to get their PhDs (6 of them in the US), and nobody not born in the Soviet Union currently works in the successor states. Only 3 of the 16 speakers born in Eastern-Europe-minus-USSR got their PhD there, and only one of them works in Eastern Europe now (a Hungarian in Hungary). Again, the overall picture agrees roughly with what conventional wisdom would have expected.

As for Latin America – 7 speakers were born there, and 3.5 left for the US to their PhDs, with the others staying in their home countries (Brazil and Argentina); 2 work in the US, four work in their home countries (again, Brazil and Argentina) and one works in France (myself). Two speakers born outside South America now live there; both are former Soviet citizens working in Brazil. Of course, the numbers are so low that one should take care not to see patterns that are not really there. (The same goes for Africa – there are two speakers from there, one working in Africa and one in the States.)

The article states that there are four different ways to view geographical shifts. In its words, these are (a) individual freedom, (b) the progress of science, (c) competition (as a way to advance science), (d) brain drain.

These are not really fully parallel to each other. For instance, “individual freedom” is not really a way to evaluate what goes on, or even a way to decide policy goals; rather, it would seem to be a principle that limits what policy tools we are willing to consider. At the same time, Andler states under this heading that “there are other compelling reasons to want to leave one’s country, e.g., miserable economic conditions or completely inadequate working conditions”. This belongs under its own (central) heading – namely, the conditions that make an individual able to work as a mathematician.

The second perspective (“the progress of science”) is said to be the one that what matters is the advancement of mathematics – and that it is our collective duty to ensure that mathematicians can develop themselves and work, and our individual duty to devote our lives to advance mathematics. This I would say to be uncontroversial, as far at least as mathematicians are concerned; not only is it something many could subscribe as a guiding philosophy – it is also an entirely reasonable way to frame the entire discussion. Other parties may have other goals in mind (national prestige, say, or, in the case of funding agencies, some more or less arbitrarily set formal goal), but most of the motives we would actually consider, including completely altruistic ones, fit nicely within this framework.

(Note that the article quotes Weil on dharma here. This is an example of something unfortunate: the clearest statement of a position is made by one of its more extreme proponents, and that, of course, has the effect of making the position seem a little less tenable.)

The third perspective – namely, “competition” – states that only by competing for the best faculty and students will universities have an incentive to keep or increase their level and give to faculty and students the working conditions they need to do mathematics. All of this is true, though one thing is not addressed – namely, that it is doubtful that, at the top of the pay scale in a few countries (US, say), further increases in salaries are really improving the ability of mathematicians to do mathematics, as opposed to simply serving as a tool for universities to compete (and a factor by which a few universities have a large, in-built advantage). It is also the case that salaries and especially working conditions can be set more by tradition than by anything else: for example, in the French system, salaries are essentially uniform regardless of location (thereby making faculty at some top institutions *less* well paid, in real terms, than in the provinces) – whereas, in the US system, which arguably has the largest financial basis of any, positions with truly light teaching loads are very rare, and positions with no teaching loads are essentially non-existent (in comparison, France’s CNRS opens up every year several positions with no teaching for life). Of course, part of the issue here is how to convince the body hosting the researcher that having the best students, or the best researchers, is really the priority; this is evident for us, but the financial source may have other goals in mind.

Lastly, we come to the “brain drain” heading. This is stated in the following terms: countries from where people emigrate lose the investment they made in their education, and they also lose the potential for further development; wealthier countries benefit – and also neglect making necessary investments in their own primary and secondary education; “it is much cheaper to import partially or fully trained young people”.

We have to look at this issue in the light of the data above.

(a) There is one very clear case of massive migration of people with PhDs from one place, namely, the former Soviet Union; this has to do with the implosion of an entire country. (We also see that many speakers left the rest of Eastern Europe already *before* the PhD stage.) Other than that, what we see is that large numbers of future speakers from outside the US did their PhDs there, but that the net flow to the US after the PhD stage was actually negative.

(b) As far as Latin America (say) is concerned, the issue is not a large net outflow (there turns out to be barely any) as low overall numbers. The same is true of other developing areas. It is striking that there are no speakers from India, given its mathematical tradition. (As for East Asia, it is difficult to reach meaningful conclusions, given that the Congress is in that geographical area this time around.)

Let us make our focus a little more precise. The article mentions some arguments for and against migration; as it states, they sometimes do not apply well to mathematics, whether they are under the ‘for’ heading (it is hard to see how (to use a paraphrase) “migrants sending money back home” is relevant here – though there is an analogue, namely, those cases where somebody from X manages to obtain substantial political power in the academic community in country Y, and uses it to procure funds to develop mathematics in X) or the ‘against’ heading (is having top mathematicians work full-time in a country really something that will improve significantly the teaching of students who do *not* intend to become research mathematicians? – the article seems to assert this).

As for costs saved by the USA (say) on education – figures per capita can be misleading here. Figures such as “$142,000 total average expenditure per student in primary/secondary education” are obtained much like similar figures on how much a prisoner costs: the total cost of a system – much of it consisting of fixed costs – is divided by the number of students or prisoners, as the case may be. What would be relevant here is not so much the marginal cost of educating an additional student, but the cost of having a better primary and secondary education system (or the cost of programs to supplement basic education). As far as the investment that the country from which the emigrate can be said to lose – obviously, what the state invests per student is often much less in developing countries, and not all students are supported by a public education system; rather, we could speak of what a country loses (to proceed with the same sort of logic) by not investing on a working postgraduate education system.

Still, these figures can be conducive to the right picture – namely, the academic system in the United States rests to a large extent on people who got their basic and undergraduate education elsewhere. The tables in the article should be enough make that clear. An awareness of this reality could, and should, contribute to create a common sense of responsibility. (On the French side, say, it should also contribute to create a sense of possibilities.)

Let us then restate the main issue within a clearly defined framework. There are young people with a great deal of talent and interest in mathematics in every part of the world. How do we ensure that they can develop their talent fully, and put it in practice to the best of their ability?

“We” here means anybody in the world who has an interest in the development of mathematics, or who considers wasted talent a pity and a waste. The way that we are phrasing the question sets certain perspectives deliberately outside its focus – namely, those based on national prestige, or “return on investment”. At the same time, lest the focus be thought of as narrow, let us emphasize that the question should not be thought of as concerning only an individual in the short run.

Consider, within this perspective, a system whereby talent is nurtured effectively in all countries, developed further in a few, and then put to work wherever it might be the case. Such a system would be a fair solution if the chances given to all students, regardless of origin, were equal or nearly equal; it would be a feasible solution if, and only if, it were sustainable. It may not be the best solution, let alone the only conceivable solution. However, it would be, for us, within the set of admissible solutions, provided that these two crucial “if”s are satisfied.

A few brief notes to supplement the above. We are talking about research mathematicians here, and not, say, about physicians and secondary school teachers, whose retention is a different issue altogether. (This is not, of course, to say that the issues raised by Andler on “adequate working conditions” and “miserable economic conditions” would not apply there.) We may even specify “leading research mathematicians”; this is, after all, what the database on ICM speakers is about.

Second – while the focus above may not be exactly the same as that of, say, government agencies that fund or could fund mathematics, this does not mean that the goals are all that different. Even from the viewpoint of “national prestige”, almost all would now agree that it is better for a country to produce very good football players (say) that work abroad, rather than not to produce them at all. It is also the case (for mathematicians or football players) that a country’s education system may be credited to the extent that it actually contributed to a professional’s formation; people do notice this – and thus it may make little sense, from the viewpoint of prestige, to see an exit from a country’s system at the bachelor’s or doctoral level as a greater loss than an exit that happens earlier.

Lastly, since the discussion may centre on mathematics in developing countries, let us give some examples from middle-income and high-income countries to clarify the framework of the discussion. An exodus of the proportions of the one that happened around the collapse of the Soviet Union is clearly something that gives rise to a non-sustainable situation. (Many would call it an effect of a non-sustainable situation as well, particularly given academic salaries in Russia in the early 90s.) The level of the country’s system for producing young mathematicians must clearly suffer as a result of such a shock ( – and as a result of the same drastic shortfalls that gave rise to it, some would add).

A somewhat different example is that given by the case of Germany. Here, again, the tables confirm what we already thought we knew: in net terms, Germany loses people after they get their doctorates. This is so in spite of senior academic salaries that compare favorably with those in large parts of Europe. The conventional guess – which is probably correct – is that this is due to a structural problem: Germany has nothing like tenure-track or associate professorships, or *postes de maître de conférences*; there are temporary “collaborators” and then there are full professorships. This is a problem that will not concern us here, at least as in so far as PhDs from Germany seem to be able to find jobs elsewhere. At the same time, it is a kind of problem that would legitimately concern some people in Germany, in that the system would be able to retain more people, and attract some, if it were structured differently. The same goes for any other country in a similar situation.

At this point – with a definition of the problem and its scope – we are at the beginning of a meaningful discussion. I thought briefly about the possibility of sketching the situation in a South American country (say). I may still do so soon. However, if you have read so far (congratulations!), you probably agree that this is a good point at which to declare the discussion open, and to hear what people have to say about (a) the situation in their own home countries, or in countries they are acquainted with; (b) how we could become better at recognizing and developing mathematical talent, at a global level; (c) the same, on placing mathematicians; or rather, how, given current trends in geographical shifts after or before the PhD level, we can still find viable ways to go much further on (b), even when this is far from completely apparent, and even when this goes against an overly simplistic take on “brain drain”.

]]>

Here is the website.

We are still putting together funding, but we have already managed to ensure quite a bit, and so we will be able to fund quite a few graduate students and young researchers (and perhaps some that are not quite so young). Our aim is to cover the expenses of admitted applicants in South America fully, and to cover the local expenses of people from elsewhere as well.

I’m not calling it a “summer” or “winter” school, since climate at a moderate-high altitude (3400 meters over sea level) relatively close to the equatorial line simply does not follow that categorization. “Dry-season school” would have sounded a little odd, even though it would have been accurate.

(Implications: sunny, cool at night and in the early morning, no mud, no rain, and unfortunately, no mushrooms either.)

As I said: people of all genders are encouraged to apply – or, in Spanish, tod@s están invitad@s a postular, which is a rather nice way to put it, since it explicitly includes cyborgs.

I hope the speakers will find they have been fairly depicted by their photograph:

This depicts the use of the Monte Carlo method to approximate an area. Many thanks to Martín Chambi!

]]>

At the same time, even though I had many reasons for staying away for blogging, it is a bit of a pity that I did not have time to blog precisely at a time when there started to be many more potential readers for it. Of course, some of them (you) are probably still around.

So far, I have written here mostly on mathematics, cinema and my travels. I would also like to touch a bit more frequently on a few serious subjects, not strictly mathematical, though sometimes related to mathematics and academia.

There is an idea going around my head since I read T. Rothman’s Genius and Biographers: The Fictionalization of Evariste Galois. This well-received essay from 1982 has had a definite influence on how Galois is seen nowadays; see a summary by a popular science writer, or the references in the introduction to the third edition of I. Stewart’s Galois-theory textbook. It has, to a great extent, replaced the account in E. T. Bell’s rather dated Men of Mathematics (1937). (E. T. Bell’s book is a collection of short biographies that inspired generations of mathematicians, while being famously imprecise and slanted, to say the least.)

It’s easy to correct *Men of Mathematics* on just about anything; at some point, before its general lack of accuracy was known, it may also have been necessary and worthwhile. What is, then, genuinely bothersome – or simply wrong-headed – about Rothman’s article?

First, it comes across as an effort, not just to defictionalize Galois, but to deromanticize him. The two concepts are not identical. Romanticization, strictly speaking, consists in the projection of a sensibility or the incorporation into a narrative, rather than in the practice of playing fast and loose with the facts. More importantly, there is no *projection* here, in the sense of imposing a sensibility alien to the subject. Galois lived in a romantic age; to understand his behavior, we must accept that hero worship, the search of sacrifice and martyrdom, the simultaneous identification with *the people* on the part of progressive sectors of bourgeois youth – combined with claims on a previously aristocratic concept of *honour* – were all concepts that pervaded the climate, rather than parts of a later grid. Rothman states under the heading “Harsher words” that “[Infeld] intends to make Galois a hero of the people”; this is a rather odd indictment, given that Galois was a member of an illegal organization called *La societé des amis du peuple*. We can discuss to what extent Infeld’s (socialist) conception of *le peuple* differed from that of Galois’s friends, or mention, as others have done, that (both in Galois’s and in Infeld’s time!) many of those self-identifying as *amis du peuple* were of middle-class extraction; still, what cannot be nullified is Galois’s *self*-inscription into a developing collective narrative.

Is it healthy for a biographical article to be written in a way that is so out of tune with its subject’s sensibility? This is not necessarily a disqualification; what is odd is to view accounts closer to an emic perspective as imposing an alien narrative. An etic perspective may differ sharply from an emic perspective; an etic account can draw attention to this fact without invalidating itself — and, conversely, it should not confuse the presence of a particularly large difference with objectivity.

To give an extreme analogue: an atheist may of course write a biography of Joan of Arc — and a historical essay on her should certainly be written differently than whatever account of her purported miracles was used at the Vatican for her canonization. Still, if a biographer charged another with “intending to make Joan of Arc into a religious figure”, something would be seriously amiss. We would also nowadays be careful not to rush to pathologize whatever seems to us unusual in some of her narrated experiences, given that, say, to state that one had visions was seen as acceptable and in consonance with the sensibility of a sector of society at the time. We would try to contextualize matters, even though we can probably agree that it is easier to make a case for a pathology in her case than in that of Galois, who had no visions and scaled no walls. Calling either self-destructive (not Rothman’s term) is to both say a triviality and to miss the point; it was neither’s intention to maximize his or her chances at survival.

There is another point to make, one about tone. Perhaps conscious of his subject’s identification with the sort of narrative that he dislikes, Rothman comes across as an adversary not just of E. T. Bell or Infeld, but of Galois; at times, his text comes across as a speech for the defense of an accused establishment. Since Rothman makes some remarks (in “Harsher words”) on the motivations of previous biographers, it seems fair to place his own habits of thought and language within a certain tradition still alive in contemporary academia. What we are dealing with is a discourse often used to defend academic hierarchy; if the speaker is fortunate, he will defend a hierarchy elsewhere, at a different time, and affecting people other than himself. Thus goes Rothman:

But Galois’s troubles were not yet over. A few days later, he failed his examination to l’Ecole Polytechnique for the second and final time. Legend has it that Galois, who worked almost entirely in his head and who was poor at presenting his ideas verbally, became so enraged at the stupidity of his examiner that he hurled an eraser at him. Bell records this as a fact but according to the little-known study of Joseph Bertrand the tradition is false. Bertrand, who appears to have detailed information about the event, records that Galois, while expounding on the properties of logarithmic series, refused to prove his statements to the examiner M. Dinet and, in response to Dinet’s questions, replied merely that the answer was completely obvious.

So was the result.[my highlight]

Rothman does not mention a version mentioned by Stewart (*Galois theory*, “Historical introduction”, 1973):

A variant asserts that Dinet asked Galois to outline the theory of “arithmetical logarithms.” Galois informed him, no doubt with characteristic bluntness, that there were no

arithmetical logarithms. Dinet failed him.

(Stewart then goes into an interesting digression, stating that it is possible that Dinet might have been referring to the index modulo . This seems unlikely at first sight: the term “discrete logarithm” for what Gauss himself calls the index sounds like much later nomenclature — and, since *Disquisitiones Arithmeticae* was still relatively recent and Dinet has passed to posterity mostly for failing Galois, it does not seem plausible that Dinet would have been thinking of this. I will gladly stand corrected on this, however; where can one find what was expected from a candidate to admission at École Polytechnique at the time?)

At any rate, the meaning is clear. Rothman (quite reasonably) rejects Bell’s claims of eraser-hurling (which, according to Stewart, go back to Dupuy); then, he gives a version in which there is no misbehavior on Galois’s part, but simply some impatience at an imprecise remark. It is not extraordinary that a second-rate individual would have perceived Galois’s attitude as petulant. What was perceived as extraordinary even by Galois’s contemporaries is that such an individual would have then taken this as a sufficient reason to fail the candidate.

Rothman very nearly comes across as taking the opposite view. “So was the result” does not just make it seem as if the outcome should have been expected (by a seventeen-year-old candidate); it is a kind of statement that, by taking a response on the part of a individual up in the hierarchy as if he were a force of nature, manages to condemn the person down in the hierarchy, while pretending not to pass judgement. This kind of shorthand should be familiar enough to all readers; we are Rothman’s contemporaries. (Rothman quotes Galois himself as stating “Hierarchy is a means for the inferior”; he seems to have little time for such sentiments.) Rothman later says that “[he] do[es] not wish to suggest Galois should have been failed.”; if he had not already come close to suggesting as much, such a disclaimer would not have been necessary.

There is an entire theme to be developed here: neither Rothman nor romantic “historians” are indulging in anachronism – rather, Rothman would seem to sympathize with the hierarchical sensibility that *condemned* Galois, and that still exists in some weakened and modified form to this day. (An actual continuity here may be a point for debate; it may be simply a case of one hierarchy’s sympathy for another, with which it identifies.) What would have been impossible in the early 19th century is something else, namely, Rothman’s amateur psychologizing. This and the defense of academic hierarchy are related, however, in that the sort of superficial and conventional “psychology” in which Rothman’s essay engages is precisely the sort that is used nowadays (or, in many places, a generation or two ago) to defend a hierarchy, while implying that only an individual can be unreasonable.

Rothman’s main imputation is — no surprises here — is that Galois had “developed not a little paranoia”. At some point, academic paranoia will be understood by all to mean something rather different from the common kind – that is, it will be a set phrase imposed by force of habit. In the meantime, however, paranoia is still a clinical diagnosis, and, unless it is meant as an insult (much like, say, the originally medical term “idiot”), it has to be supported when used to describe a scholar much as in any non-academic context.

Some of the evidence adduced is decidedly odd. A shot was fired from a guard’s garret into a cell that Galois shared with several other prisoners. This was interpreted by Infeld as an attempt on Galois’s life. Rothman says he has “tried to present this episode in as neutral a tone as possible”; apparently, the way to do this is to seem to go to some length to attempt to defend the decision to throw Galois into a dungeon (“evidently because he had insulted the superintendent”) together with the man who was actually shot – something that was considered by other political prisoners to be unusual and completely out of line. At any rate, we are given no evidence that Galois believed that the shot had been aimed at *him*; the mere belief that a shot fired into a prison cell may have been intentional is enough.

The other evidence is that Galois took rejection letters badly. There is also another little matter – at least one of his manuscripts got lost after submission to the Academy. (Whether a second memoire got lost or neglected by Cauchy is something that seems unproved in either direction; Rothman cites R. Taton’s case against this – and also gives references that suggest that the suspicion of intention on Cauchy’s part was solely Infeld’s, and not Galois’s.)

Rothman’s essay ends in a sardonic note:

The underlying assumption is apparent: Galois was persecuted because he was a genius and all scientists, to a greater or lesser degree, understand that genius is not tolerated by mediocrity. A genius must be recognized as such even when standing drunk at a banquet table with a dagger in his hand. […] This is a presumption of the highest arrogance.

In fact, some of the material there and elsewhere gives a picture of a young man who was generally known to be, at the least, very talented; word of this had got around in academic circles, and also beyond that – he was “our little scholar” to other prisoners. The likely reasons for his sad and brief life and career can be multiple – but we cannot say that he had somehow managed to make his talent unrecognizable.

In the end, a popular revolutionary, a romantic hero and a difficult young person are not three different characters, nor even three distinct, incompatible views of the same person. Neither do these categories match poorly with the view of Galois as a richly gifted mathematician frustrated by pedantry on the part of the incompetent and fumbling on the part of those who were usually more than competent. Of course one may argue that Galois was ill-equiped to deal with such a situation; almost all adolescents would have been, even those whose fathers had not been pushed to suicide by local Jesuits. The way that Galois responded – namely, by a sharpening of his conflict with authority as such – would have been within the bounds of what is normal in any era; more to the point, it was precisely what made sense in a young man of already formed republican convictions in an atmosphere of repression and stifled revolution – at, moreover, a time that exalted struggle and sentiment as much as it rewarded conformism. We can and should attempt to undo romantic legends, when they are legends; however, to deromanticize and depoliticize Galois is to misunderstand him.

Rothman’s essay was brought back to my mind by an essay by M. Duchin (marked as juvenilia on her webpage). There, Rothman is paraphrased as having shown that E. T. Bell changed the chronology of events; it is also stated that Galois’s father’s suicide helps to explains the result of his (oral) examination at the École Polytechnique immediately thereafter. In all fairness, Rothman states that Bell’s main source does not make clear the chronology of events; moreover, Rothman can be interpreted to mean that Galois was in a particularly irritable mood (something that, in his view, makes claims of “the examiner’s stupidity” less valid), rather than to insinuate that Galois did badly in some objective sense.

In general, I found Duchin’s essay thought-provoking, and I certainly share her strong suspicion of the concept of “genius” itself. Still, I was unconvinced by her contention that there is something particularly male about genius-worship. I had also thought that there had been a transition at some point between the early nineteenth century, when one could speak of the genius *of* somebody (originally something close to a not entirely beneficent daemon), to popular usage in the twentieth century, when it became extremely common to say that somebody was himself (or sometimes herself) a genius. This arguably crucial shift is left unexplored.

Since I am travelling, here are the obligatory tourist photographs.

]]>

The following are just brief impressions; they should not be taken for serious criticism. I found every spectacle I attended very worthwhile.

The characters in Le mot « progrès » dans la bouche de ma mère sonnait terriblement faux (Matei Visniec) come across as (deliberate) caricatures, except for the Mother and the Father, who wear the masks of a ghost and a skull, and perhaps the Son, who is, for the most part, dead. The play takes place in the former Yugoslavia. While I generally do not think along such lines, I could not help wondering what somebody from that place would think about such a play – dramatically potent, but sometimes arguably verging on the exploitative, and, perhaps slightly relevant in this context, written by a non-Yugoslav author and played by non-Yugoslav actors.

Le tigre bleu de l’Euphrate (Laurent Gaudé, adapted to the stage by Gilles Chavassieux) is a beautifully written monologue; the actor playing the dying Alexander the Great and the percussionist both performed admirably. The only lukewarm thing I can say is that, if you know your history, then you already know not just the plot (including Bessus’s betrayal of Darius) but the general tone (a young man in search of the unattainable, fatally frustrated by the voice of common sense on the verge of crossing the Indus). By the way, ladies: one of the friends with whom I coincided at the festival also remarked that the actor is rather handsome. (I thought him of about the right age to be a credible thirty-three-year-old Alexander.)

Yvonne, princesse de Bourgogne was played as, well, what I expect by now from Gombrowicz. Some day I will accept that his cruel variants on the absurd and the grotesque are really not my thing. However, I must say that the *mise-en-scene* was remarkable (particularly the costumes, inspired by Nô theatre, as I read somewhere).

Now for the play in the In festival. (All of the above was in the Off, which was just getting underway.) This is the third Festival in which I am a spectator, and I can now confirm a pattern: at least in the evenings, a play in the *In* is something (a) performed by an absolutely top-notch company, (b) going deeply into the territory of existential despair. Here (b) may be a specialisation resulting from having large, stark courtyards as the scene. A few years ago, I saw La vie de Galilée and La mort de Danton, played in the Cour du Lycée Saint-Joseph by the Théâtre national de la Bretagne. Both lent themselves very well to this treatment (something not à priori obvious in Brecht’s case). Last year’s La mouette was wilfully *à rebours*. This time, Angélica Liddell‘s Todo el cielo sobre la tierra (El síndrome de Wendy) was again a natural fit.

I would not have thought beforehand that Utoya, Peter Pan, ballroom dancing in China (cue in a live orchestra on a raised stage) and certain other things would be a wise mixture, but it all came across as sincere, touching and rather effective. I suppose it makes some sense to conjecture that Wendy Darling would have grown into a teacher/camp counsellor with strong ephebophilic tendencies.

One thing, though – I thought “if the author were Spanish, then Wendy’s monologues would be not at all impressive” – it’s not just that their anti-conventionalism is, in context, rather conventional, but that this sort of rhetoric is, in my perception, very common – and I’d add: facile – in Spanish literature of the last generation or so. Looking at the program notes, I saw that Liddell is, in fact, one of those confusing españoles y latinoamericanos con nombres no hispanicos. Siempre cayéndonos por sorpresa!

(It turns out this is a stage name. I vaguely suspected that.)

Well, I am in Sevilla now. By the way, do people want more Spanish in this blog?

]]>

Of course, I’m expected to write a summary of the main ideas here – not just emphasizing the new points, as I do when giving a talk to an audience of specialists, but giving an overall picture of the proof and all its parts, old and new. Let me do that.

I know that the audience here is very mixed – if you don’t have the background to follow a paragraph, just keep on reading.

**History**

Leonhard Euler – one of the greatest mathematicians of the eighteenth century, and indeed of all time – and his close friend, the amateur and polymath Christian Goldbach, kept a regular and copious correspondence. Goldbach made a guess about prime numbers, and Euler quickly reduced it to the following conjecture, which, he said, Goldbach had already stated to him: every positive integer can be written as the sum of at most three prime numbers.

We would now say “every positive integer greater than 5”, since we no longer think of 1 as a prime number. Moreover, the conjecture is nowadays split into two: the *weak*, or ternary, Goldbach conjecture states that every odd integer greater than 5 can be written as the sum of three primes; the *strong*, or binary, Goldbach conjecture states that every even integer greater than 2 can be written as the sum of two primes. As their names indicate, the strong conjecture implies the weak one (easily: subtract 3 from your odd number , then express as the sum of two primes).

See Dickson, *History of the theory of numbers*, Vol I., Ch. XVIII, for the early history of the subject. In summary — Waring seems to have come up with the weak conjecture again in the late eighteenth century; the nineteenth century saw some computational work (checking the conjecture for small integers – by hand!) but little real progress.

The strong conjecture remains out of reach. A few weeks ago — my preprint appeared on May 13, 2013 — I proved the weak Goldbach conjecture.

The proof builds on the progress made in the early 20th century by Hardy, Littlewood and Vinogradov. Vinogradov proved (1937) that the conjecture is true for all odd numbers larger than some constant . (Hardy and Littlewood had shown the same under the assumption of the Generalized Riemann Hypothesis; more on that later.) Since Vinogradov’s day, the constant has been specified and gradually improved, but the best (i.e., smallest) available value for was (Liu-Wang), which was much too large. Even would be too large: since is larger than the estimated number of subatomic particles in the universe times the number of seconds since the Big Bang, there wouldn’t be any hope of checking every case of by computer (even if the computer were really highly parallel, and you had really high cosmic priority)!

I brought down to (and could bring it farther down if needed). D. Platt and I had checked the conjecture for all odd numbers up to by computer (and could have gone farther), so that was the end of the story.

What goes into the proof? Let us first step back and look at the general framework of the *circle method*, introduced by Hardy and Littlewood.

**The circle method: Fourier analysis on the integers**

Fourier analysis is something we do every time we tune to a radio station: there is a signal, and we decompose it into the contributions from different frequencies. In mathematical terms – we are given a function (i.e., a function on a single real variable; in the case of a radio, the variable is *time*) and we define the *Fourier transform* by , where we write for . Then, as we learn in any Fourier analysis course, , provided that decays rapidly enough and is otherwise well-behaved. (This is the “Fourier inversion formula”.)

In other words, has been decomposed as a sum of (complex) exponential functions, with the (complex) exponential function present with “strength” . (This is equivalent to a decomposition into “sine waves” and , since .) To go back to the example of a radio: is large when is close to the frequency of some radio station, and small otherwise. (What your radio receives is a superposition of what all stations transmit; your radio receiver’s business is precisely to figure out the contribution of frequencies around a given .)

We can do the same if is a function from the integers to . In fact, things are now simpler — we get to define by a sum rather than an integral: . A funny thing here is that doesn’t change at all if we add , or any other integer , to . This is so because, for an integer,

(Thanks again, Euler.) Thus, we may restrict to the interval — or, more abstractly, we can think of as living in the quotient . Topologically, is a circle; this is just the same as saying that, since it doesn’t matter whether we add or substract to our frequency, we might as well have the little frequency marker in our radio go around a circle marked with numbers from up to , rather than have it slide back and forth (a segment of) the real line (as in the actual radio on my table). This is where the phrase *circle method* comes from.

The decomposition of now looks as follows: provided that decays rapidly enough.

Why do we care? The Fourier transform is immediately useful if we are dealing with additive problems, such as the Goldbach conjectures. The reason behind this is that the transform of a convolution equals a product of transforms: . Recall that the *(additive) convolution* of is defined by .

We can see right away from this that can be non-zero only if can be written as for some , such that and are non-zero. Similarly, can be non-zero only if can be written as for some such that , and are all non-zero. This suggests that, to study the ternary Goldbach problem, we define , , so that they take non-zero values only at the primes.

Hardy and Littlewood defined for composite (or zero or negative), and for prime (where is a parameter to be fixed later). Here the factor is there to provide “fast decay”, so that everything converges; as we will see later, Hardy and Littlewood’s choice of (rather than some other function of fast decay) is actually very clever, though not quite the best possible. The term is there for technical reasons (basically, it turns out that it makes sense to weigh a prime by because roughly one out of every integers of size about is a prime).

We see that if and only if can be written as the sum of three primes. Our task is then to show that (i.e., ) is non-zero for every larger than a constant. Since the transform of a convolution equals a product of transforms,

Our task is thus to show that the integral is non-zero.

As it happens, is particularly large when is close to a rational with small denominator; it is as if there were really radio stations transmitting at the (small-denominator) frequencies marked in the “tuning” drawing above – when the dial is close to one of them, there is a loud, clear , and when we are away from all of them, we can hear only a low hum. This suggests the following strategy: estimate for all within small arcs around the rationals with small denominators (the *major arcs* — so called because they make a major contribution, in spite of being small); bound for outside the major arcs (everything outside the major arcs is called *minor arcs*); then show that the contribution of the minor arcs to the integral is smaller in absolute value than the contribution of the major arcs, thereby forcing the integral to be non-zero.

It is this general strategy that gets called the *circle method*. Hardy and Littlewood introduced it to deal with a wide variety of additive problems; for example, this was also part of their approach to Waring’s problem (yes, same Waring), on integers that are sums of th powers of integers. This was taken over by Vinogradov, who was the first to give good, unconditional bounds on for in the minor arcs (something considered very remarkable at the time). The circle method is also my general strategy: what I have done is to give much better estimates on the major and minor arcs than were previously available, for , and chosen with great care.

(Incidentally: while we can start to treat the binary, or strong, Goldbach conjecture with the circle method, we soon hit a wall: the “noise” from the minor arcs overwhelms the contribution from the major arcs. This is well explained in this post by Tao.)

**Dirichlet L-functions and their zeros**

Before we can start working on the major arcs, we need to discuss L-functions. First, there is the zeta function , first studied for complex by Riemann, after whom it is now named. This is given by when the real part of is greater than . For , the series diverges, but the function can be defined (uniquely) by analytic continuation (and this can be done explicitly by, e.g., Euler-Maclaurin, as in Davenport, Multiplicative Number Theory, 2nd. ed, p. 32), with a pole at .

Analogously, there are Dirichlet L-functions, defined by for , and by analytic continuation for . Here is any Dirichlet character; for every

given , is a function of . A Dirichlet character (of *modulus* ) is just a function of period (i.e. for all ), with the additional properties that it be multiplicative ( for all , ) and that whenever and are not coprime. (The sophisticated way to put it is that it is a character of lifted to .) Dirichlet characters and Dirichlet L-functions were introduced by, um, Dirichlet, in order to study primes in arithmetic progressions.

A zero of a function is just an such that . A non-trivial zero of , or of , is a zero of , or of , such that . (The other zeros are called trivial because it is easy to tell where they are (namely, at negative integers and sometimes at ).) The Riemann hypothesis states that all non-trivial zeros of the Riemann zeta function “lie in the critical line”, meaning that they satisfy . The Generalized Riemann hypothesis for Dirichlet L-functions states that, for every Dirichlet character , every non-trivial zero of satisfies .

Since both the Riemann Hypothesis (RH) and the Generalized Riemann Hypothesis (GRH) remain unproven, any result proven using them will be *conditional*; we want to prove unconditional results. What can indeed be proven, and used, are partial results in the direction of GRH. Such results are of two kinds:

– Zero-free regions. Ever since the late nineteenth century (de la Vallée-Poussin) we have known that there are hour-glass shaped regions (more precisely, of the shape , where is a constant and where we write ) outside which non-trivial zeroes cannot lie;

– Finite verifications of GRH. It is possible to (ask the computer to) prove small, finite chunks of GRH, in the sense of verifying that all non-trivial zeros of a given function with imaginary part less than some constant lie on the critical line .

Most work up to date follows the first alternative. I chose the latter, and this had consequences on the precise way in which I defined the major and minor arcs: I got very precise results on the major arcs, but I had to define them to be few and very narrow,

or else the method wouldn’t work. This meant that the minor arc methods had to be particularly potent, since a particularly large part of the circle was left for them to deal with.

Let us look more closely at how one can deal with major arcs using partial results towards GRH, and, in particular, finite verifications of GRH.

**Major-arc estimates**

Recall that we want to estimate sums of the type , where is something like (say) for equal to a prime, and otherwise. Let us modify this just a little – we will actually estimate , where is the *von Mangoldt function*: if is either a prime power , , and otherwise. (The use of rather than is just a bow to tradition, as is the use of the letter (for “sum”); however, the use of rather than just plain does actually simplify matters when you deal with so-called explicit formulas, which we will see in a minute.) Here is some function of fast decay; it can be , as in Hardy and Littlewood’s work, or (as in my work) something else. (It could even be just the “brutal truncation” , defined to be when and otherwise; that would be fine for the minor arcs, but, as we will see, it is a bad idea as far as the major arcs are concerned.)

Assume is on a major arc, meaning that we can write for some ( small) and some (with small). We can express as a linear combination (that is, a sum of multiples) of terms of the form , where and runs over Dirichlet characters of modulus .

Why are these sums nicer than the plainer sums ? The argument has become , whereas before it was . Here is small — smaller than a constant, in our setup. In other words, will go around the circle a bounded number of times as goes from up to a constant times (by which time has become small). This makes the sum much easier to estimate.

It is a standard fact that we can express by an explicit formula (yes, the phrase has a technical meaning, just like *Jugendtraum*):

Here the term between brackets appears only for . In the sum, goes over all non-trivial zeros of , and is the Mellin transform of . We win if we manage to show that the sum over is small.

The point is this – if we check GRH up to imaginary part , then we know that all with satisfy , and thus . In other words, is then very small (compared to ). However, for any whose imaginary part has absolute value greater than , we know nothing about its real part, other than . (All right, we *could* use a zero-free region, but known zero free regions are notoriously weak for large – meaning they tell us little in practice.) Hence, our only chance is to make sure that is small when .

This, mind you, has to be true both for tiny (including: ) and for not so tiny ( between and a constant). If you toy around with the method of stationary phase, we get that behaves like for tiny (here is the Mellin transform of ) and like for not so tiny (where ). Thus, we are in a classical dilemma, often called the uncertainty principle because it is the mathematical fact underlying the physical principle of the same name: you cannot have a function that decreases extremely rapidly and whose Fourier transform (or, in this case, its Mellin transform) also decays extremely rapidly.

What does “extremely rapidly” mean? It means “faster than any exponential “. Thus, Hardy and Littlewood’s choice seems essentially optimal.

Not so fast! What we *can* do is choose so that decreases exponentially (with a constant a bit worse than before), but decreases faster than exponentially. This is a rather good idea, as it is (and not so much ) that risks being fairly small.

One choice obeying this description is the Gaussian . The Mellin transform turns out to be a parabolic cylinder function, with imaginary values for one of the parameters. Parabolic cylinder functions seem to be much-loved and much-studied by applied people — but mostly for real values of the said parameter. There are some asymptotic expansions of in the literature for general parameters (notably by F. W. J. Olver), but none that were explicit enough for my purposes. Thus, I had to go and provide fully explicit estimates myself, using the saddle-point method. This took me a while, but the results should be of general applicability – hi there, engineers – and the Gaussian smoothing will hopefully become a little more popular in explicit work in number theory.

Ah, by the way, these estimates on parabolic cylinder functions allow us to take not just , but also, more generally, , here is any band-limited function, meaning, in this context, any function whose Mellin transform restricted to the axis has compact support. We will want to optimize the choice of — more on that later.

**The minor arcs**

How do you bound when is *not* close to any rational of small denominator? That this is at all possible was Vinogradov’s great achievement. Progress since then has been gradual. My own take on things is the subject of my minor arcs paper. Let me just discuss a few of the ideas behind my improvements.

Vinogradov’s proof was greatly simplified in the 70s by Vaughan, who introduced the identity that now bears his name. Basically, Vaughan’s identity is a gambit: it gives you a great deal of flexibility, but at a cost — here, a cost of two logs, rather than, say, two pawns. The problem is that, if we are to reach our goal, we cannot afford to waste logs. The only way is to recover these logs, finding cancellation in the different sums that arise from Vaughan’s identity. This I had to do, mind you, without using -functions, since I could no longer assume that was small.

Here is another theme of this part of the proof. Every has an approximation the fact that is in the minor arcs just tells us that is not small. In fact, we are looking for bounds that decrease with ; the bound I obtain is proportional to . What is the effect of ?

Something I realized early on was that, if is not tiny, it can actually be used to our advantage. One reason is that there are terms of the form , and the Fourier transforms of smooth functions decay as the argument grows. There are other issues, however. Something we can use is the following: by basic results on Diophantine approximation, every has very good approximations by rationals of non-huge denominator. If is not tiny, then the approximation is good, but not very good; hence, there must be another, better approximation with non-huge (meaning: considerably smaller than ). We can go back and forth between the approximations and , depending on which one is more useful in context. This turns out to be better than using a single approximation , however good it may be.

Another way in which large gets used is to scatter the inputs to a large sieve. The large sieve can be seen as an approximate form of the Plancherel identity, recast as an inequality: whereas the Plancherel identity tells us that the -norm of a Fourier transform of a function defined on the integers (for instance; groups other than the integers are also valid) equals the -norm of the function itself, the large sieve tells us that the total of for a well-spaced sample of points is bounded by (a multiple of) . Now, in our case, the points are multiples of our angle . If , the spacing of the points is , which is nice — but we may have to apply the large sieve many times, since we have to apply it afresh for each chunk of points. However, if and is not tiny, we can go around the circle many times, and rely on rather than on to give us the spacing. Yes, the spacing may be smaller, but the effect of this is more than compensated by the fact that we have to invoke the large sieve far fewer times (perhaps only once). What is more, this scattering can be combined with a more traditional kind of scattering (Montgomery’s lemma; see Montgomery’s “Topics in multiplicative number theory”, or else the exposition in Iwaniec-Kowalski, section 7.4) so as to take advantage of the fact that we are dealing with sums on the primes.

**Putting it all together**

I’ve been telling you what goes into bounding for within the minor arcs , but what we really want to do is bound the integral . One easy – and traditional – way to do this is to just use the trivial inequality . Unfortunately, this wastes a factor of .

Since our bounds for , , are given in terms of , it makes sense to combine them with estimates for integrals of the type , where is a union of arcs around rationals with denominator greater than a constant but less than . How do you estimate such integrals? It turns out that this is very closely related to a question on the large sieve: what bounds can you get for samples , , where is of moderate size?

There was an answer in the literature (based on Montgomery’s lemma; the link with the circle method was already noticed by Heath-Brown) but it was sub-optimal by a factor of at least (or in fact more). There was a newer estimate for the large sieve due to Ramaré, but it had not been made fully explicit. I had to work that out, and then adapted the new large-sieve result to estimate the integral over above. As was expected, the spurious factor of (or really a bit more) disappeared.

It remains to look at the main term vs. the error term. It turns out that we have some room to choose what the main term will be, since it depends on the smoothings we choose. The main term is proportional to , where and are the two smoothings we choose to work with, is the odd number we want to express as the sum of three primes, and is again a parameter of our choosing. For comparison, the error term is proportional to . Thus, we have an optimization problem (“maximize the size of the double integral divided by “). It is best to choose symmetric or close to symmetric (), making sure, moreover, that for . This is not too hard to achieve while keeping of the form , where is band-limited.

What about ? The solution to the optimization problem tells us that it should be of small support, or at least concentrated near the origin. Other than that – there is, so to speak, a political problem: , unlike , gets used both in the major and the minor arcs; the major arcs really want it to be of the form or , whereas the minor arcs would prefer it to be something simple, like or like (as in Tao’s five-prime paper or my own minor-arcs paper).

The solution is simple: define , where is a large constant, and . For and pretty much arbitrary, if you know how to compute (or estimate) for some , and you also know how to estimate for the other , then you know how to estimate for all : just write it out and you will see!

(Perhaps you will also see that it helps if and are very small near .)

The moral is that different problems, and different parts of the same problem, require different smoothings. At least in the context of exponential sums, there turns out to be a simple trick for combining them, as we have just seen.

**Some final remarks on computing**

An analytic proof usually gives a proof valid for all larger than a constant . The reason is simple: say that we want to show that a quantity is positive. Typically, at the end of a great deal of analytic work, you will have proven that the quantity is of the form , where the absolute value of the error term is at most (for example; this is obviously simplified). This certainly shows that the quantity is positive — provided that . The task, then, is to sharpen the proof to such an extent that the constant becomes small enough that all cases can be checked by hand (meaning either literally your hand or a computer). This is what my work was largely about; checking the conjecture up to (and in fact up to ) was a bit of a side task – as we are about to see, it wasn’t even the main computational effort involved.

First, let me say a few more words about analytic results. There are results of the type “the statement is true for all larger than a constant , but this proof can tell you nothing about , other than that it exists”. This is called an *ineffective* estimate; many proofs of Vinogradov’s result in textbooks are of this kind. (The reason behind this is the possibility of so-called Siegel zeros.) A result can also say “the statement is true for all , and you should in principle be able to determine some value of by using ideas from the proof, but the author would much rather go drink coffee”. This is an effective, non-explicit statement; Vinogradov’s definitive version of his own proof was of this sort (as are many other results in mathematics, including some of my own past results). If you do give an explicit value of , then the result is called, well, *explicit*. Then comes the fourth stage: making reasonable, i.e., low enough that the case can be checked by hand. It was clear from the beginning that, in the case of the ternary Goldbach conjecture, “reasonable” meant roughly , even though nobody had actually checked it that far.

I said before that D. Platt and I had checked the conjecture for all odd numbers up to . Here is how we proceeded. It was already known (thanks to a major computational effort by Oliveira e Silva, Herzog and Pardi) that the binary Goldbach conjecture is true up to — that is, every even number up to is the sum of two primes. Given that, all we had to do was to construct a “prime ladder”, that is, a list of primes from up to such that the difference between any two consecutive primes in the list is at most . Thus, if anybody gives you an odd integer up to , you know that there is a prime in the list such that is positive and at most . By assumption, we can write for some primes , , and so .

Constructing such a ladder does not take *that* much time. (In fact, getting a ladder up to is probably something you can do on *your* old personal computer in the basement, over a few weeks — though storing it is another matter.) It’s all integer arithmetic, and we use deterministic primality checking (which is fast for primes of special form), so there is really no room for concern.

The major computation consists of verifying that, for every -function of conductor up to about (or twice that for even), all zeroes of the -function with imaginary part bounded by lie on the critical line. This was entirely Platt’s work; my sole contribution was to run around asking for computer time at different places (see the acknowledgements section of the major arcs paper). In fact, he went up to conductor (or twice that for even); he had already gone up to conductor in his thesis. The verification took, in total, about core-hours (i.e., the total number of processor cores used times the number of hours they ran equals ; nowadays, a top-of-the-line processor — such as the ones in the MesoPSL machine — typically has eight cores). In the end, as I said, I used only (or twice that for even), so the number of hours actually needed was more like ; since I could have made do with about , you could say that, in retrospect, only about core-hours were needed. The computers and I were digging on opposite ends of the mountain, and we met down the middle. The fact that the computers ran for longer than needed is hardly something to be regretted: the computation is of general use, and so it’s so much the better that it not be too tightly fitted to my needs; moreover, with proofs of this length, you want to “build like a Roman”, i.e., overcalculate in case you (not the computer!) have made a small mistake somewhere. (Why did you think those walls were so thick?)

Checking zeros of -functions computationally is something as old as Riemann (who did it by hand); it is also one of the things that were tried on electronic computers already in their early days (Turing had a paper on that). One of the main issues to be careful about arises whenever one manipulates real numbers: honestly speaking, a computer cannot store ; moreover, while a computer can handle rationals, it is really most comfortable handling just those rationals whose denominators are powers of two. Thus, you cannot really say: “computer, give me the sine of that number” and expect a precise result. What you should do, if you really want to *prove* something (as is the case here!), is to say: “computer, I am giving you an interval ; give me an interval , preferrably very short, such that “. This is called interval arithmetic; it is really the easiest way to do floating-point computations rigorously.

Now, processors don’t do this natively, and if you do this purely on software, you can slow things down by a factor of . Fortunately, there are ways of doing this halfways on hardware, halfways on software. Platt has his own library, but there are others online (e.g. PROFIL/BIAS).

(Oh, by the way, don’t use the function on an Intel processor if you want the result to be correct up to the last bit. Just what were they thinking? Use the *crlibm* library instead.)

Lastly, there were several rather minor computations that I did myself; you’ll find them mentioned in my papers. A typical computation was a rigorous version of a “proof by graph” (the maximum of a function is clearly less than because gnuplot told me so). You’ll find algorithms for this in any textbook on “validated computing” – basically, it’s enough to combine the bisection method with interval arithmetic.

Finally, let me point out that there is an elementary inequality in the minor arcs paper (namely, (4.24), in the proof of Lemma 4.2) that got proven in part by a human (me) and in part by a quantifier-elimination program. In other words, there are now computer programs out there (in this case, QEPCAD) that can actually prove useful things! Now, I have no doubt that the same inequality can be proven purely by the use of human beings, but it is nice to know that our computer friends can (pretend to) do something other than munch on numbers…

]]>