Where Have the Great Programmers Gone?

We need good software more than ever. It seems likely that the quality of software has a lot to do with the quality of the programmers who write it. Programming ability is distributed unequally in the extreme. Hence it makes sense to recognize the existence of great as opposed to merely good programmers.

The twin theses of this essay are that the current educational environment militates against the discovery and nurturing of programming talent, in spite of the crucial importance of great programmers, and that, even if a manager can get hold of a rare great programmer, it is not clear how she or he can be used. In this essay I contrast the current environment with the one in which the first great programmers were reared and consider what can be done.

The Unfairness Factor

Ability in any given direction is unequally distributed. But some of these inequalities are more pronounced than others. Let us introduce a numerical factor for inequality. As we cling to the notion that all are born equal, inequality in the endowment of something desirable is regarded as unfair. For this reason, I will call this numerical factor UF, the Unfairness Factor.

Consider the UF for running ability, defined on the basis of, say, the maximum speed attainable on foot by a sample of twenty-year-olds. Let us then define the UF as the ratio between the greatest and the median value. What will this ratio be, approximately?

The fastest are around ten metres per second. Some, in a sufficiently large sample, are sufficiently handicapped that they cannot move on foot at all; these will be assigned a maximum speed of zero. My guess is that the median for a reasonably large random sample will be five metres per second, which would put the UF for running at 2.
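The arithmetic behind this estimate can be sketched in a few lines of Python. The sample of speeds below is invented purely for illustration, and the function name unfairness_factor is my own label for the ratio defined above; the sketch assumes the median is greater than zero.

```python
from statistics import median


def unfairness_factor(scores):
    """Ratio of the greatest value to the median value of a sample.

    Assumes the median is positive, as it is for the running example.
    """
    return max(scores) / median(scores)


# Hypothetical maximum running speeds in metres per second. The 0.0 entry
# stands for someone who cannot move on foot at all.
speeds = [0.0, 3.5, 4.5, 5.0, 5.5, 7.0, 10.0]

uf_running = unfairness_factor(speeds)
print(uf_running)  # 10.0 / 5.0 = 2.0
```

Note that the person with speed zero lowers the median only slightly; a ratio based on the median is insensitive to a few extreme values at the bottom of the sample.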

Let us now consider something more interesting, for example musical ability. How to assign a numerical value? We could consider a pass/fail test consisting of playing pieces of music to the satisfaction of a panel of expert referees. Musical ability would then be expressed numerically as the inverse of the number of days of preparation that a person needs to be able to pass the test. Say the test is such that someone with a high degree of musical ability would need a few days. Then others would need hundreds of days, while many of us would never pass the test, even when practicing assiduously all our lives. This puts the UF at a high number, or even at infinity if the median comes out at 0, as it well may.
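A small Python sketch shows how such a score behaves. The numbers of preparation days are made up for illustration, None is my own convention for "never passes the test", and the function names are hypothetical.

```python
import math
from statistics import median


def ability_score(days_needed):
    """Inverse of the days of preparation; None means the test is never passed."""
    return 0.0 if days_needed is None else 1.0 / days_needed


def unfairness_factor(scores):
    """Greatest score divided by the median score; infinite if the median is 0."""
    top, mid = max(scores), median(scores)
    if mid == 0:
        # Nobody scoring above zero would leave the ratio undefined.
        return math.inf if top > 0 else math.nan
    return top / mid


# Hypothetical sample in which most people never pass the test.
days_mostly_failing = [4, 200, 400, None, None, None, None]
uf_music = unfairness_factor([ability_score(d) for d in days_mostly_failing])

# Hypothetical sample in which everyone passes eventually.
days_all_passing = [4, 100, 200, 400, 800]
uf_finite = unfairness_factor([ability_score(d) for d in days_all_passing])

print(uf_music)   # inf, because the median score is 0
print(uf_finite)  # roughly 50: a few days against a median of 200 days
```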

We could define a UF for programming ability in a similar way. For example, the test could be the writing of a significant and well-defined piece of software: a text editor, a C compiler, or a TCP/IP stack. The person taking the test would work on the assigned project in isolation, much like the contestants for the Prix de Rome in architecture a century ago. But unlike in that contest, there would be a simple pass/fail criterion for the product and the score would be the inverse of the time taken.

The UFs for programming and musical ability would be very much higher than that for running. And the UF would still seriously underestimate inequality: there is great inequality even among the musicians and programmers who score very highly on the test. The highest scoring musician might still be a facile hack rather than a true artist like Abbado or Arrau.

If we stay within the quantitatively measurable, that is, if we do not try to single out the great ones, then we may well find a factor of ten in productivity (as reported by F. P. Brooks in his book The Mythical Man-Month). Consider a project of 50 programmers. In the current situation these are average programmers. Even though there might be potentially good ones among them, the setting will reduce their productivity to that of the average. These 50 can be replaced by five good programmers. Moreover, once your team is that much smaller you save the considerable communication overhead inherent in a 50-programmer team.
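To get a feeling for the size of that overhead, one can count the pairs of team members who may need to talk to each other. The pairwise count n(n-1)/2 is the rough model Brooks uses in The Mythical Man-Month; applying it to the two team sizes above is my own illustration, and the function name is hypothetical.

```python
def communication_paths(team_size):
    """Number of distinct pairs in a team: n * (n - 1) / 2."""
    return team_size * (team_size - 1) // 2


large_team = communication_paths(50)
small_team = communication_paths(5)

print(large_team)  # 1225 pairs among 50 programmers
print(small_team)  # 10 pairs among 5 programmers
```

Under this model the number of programmers shrinks by a factor of ten, but the number of potential communication paths shrinks by a factor of more than a hundred.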

What to do?

Given the considerable economic significance of radically better software and the enormous variation in programming ability, it is clear that we need to ensure that (1) such great programmers as may be found get the chance to do their thing, and (2) others get the chance to develop their potential in this direction.

The essay "Great Hackers" by Paul Graham gives some hints about how to make the most of great programmers. Every one of these hints is negated by standard practice in software development.

For about 25 years there has been a standard career for programmers: computer science or engineering in a university followed by employment in an organization with management that is imbued with the doctrines of software engineering. A youngster with programming talent is bored to tears by the programming courses offered at university. As explained by Graham, the kind of programmer that we are considering here cannot and will not work in the standard employment environment.

This situation has two effects. Those who potentially have these awesome powers are deflected away from programming. If some of them do develop these powers, it will typically be by accident. And they will only manage to become useful if they have considerable entrepreneurial abilities as well. In that case two or three of them start a company, do their thing, and get acquired by companies like Microsoft, Yahoo, and Google, which were themselves started in this way. But a large proportion of great programmers do not have sufficient entrepreneurial abilities. As their way of working does not fit in managed software development, they often languish in "day jobs". They come to life at night in work on open-source software, an unintended gift to the community ["Hackers and Painters", an essay by Paul Graham].

The book "Programmers at Work" by Susan Lammers contains interviews conducted in the 1980s with notable programmers. How did these people get to be the kind of programmer they were? If university had anything to do with it, it was typically not because of its teaching of programming. More likely it was because university afforded relief from the constraints that come with full-time employment as a programmer or because it provided a programming assignment that turned out to be educational.

One of the programmers interviewed by Lammers is Charles Simónyi. As a high school student he became night watchman of an Ural II, one of the half dozen computers in existence at that time in Hungary. The machine required programs to be entered in octal by means of switches. Before he left high school he wrote a compiler for a Fortran-like language, which he was able to sell. In this period he read the listing of the GIER compiler for Algol, one of the first for this language.

This crucial stage of Simónyi's education was determined by two things: personal tuition by the engineer in charge of the Ural II, and the use of the Ural II as a personal computer. The engineer was a "genius", an accolade not lightly bestowed by someone like Simónyi. The Ural II was not only his personal computer, but also a very educational one.

More widely known ["Hackers" by Steven Levy] is the story of the MIT undergraduates who were among the world's best programmers without the benefit of any educational efforts apart from their own. For the TX-0 in the late fifties and the DEC PDP-1 in the early sixties, much of the system software was provided by a bunch of twenty-year-old model train hobbyists who discovered that these machines had logic more interesting than that of their train installation. A striking example was the exploit of the clandestine rewiring of the PDP-1 to rectify a particularly annoying shortcoming in its instruction set.

It might seem that we now have an ideal situation for the nurturing of programming talent in the combination of many teenagers, each with their own computer, attending schools with low academic demands. The reason it fails is that the use of these computers is not educational in the way this was the case for the TX-0, the PDP-1, and the Ural II. As a measure to protect under-age computer users, we should insist that they get only second-hand laptops and install Linux on them themselves. In this way we can weakly emulate some of the great opportunities that existed in the past.

Simónyi and the MIT hackers were the beneficiaries of an extremely rare and lucky confluence of circumstances. But even the average person inducted into programming in the early sixties was better off than a contemporary first-year student exposed to the introductory course honed to perfection by Educators.

What was different in 1960? At the time, programmers as such did not exist; university courses in programming even less. All a manager could do was to hire a bunch of promising oddballs and hand them a sheaf of two dozen sheets of paper covered with pale mauve mimeographed printing. Three of these sheets would be filled with a listing of the machine instructions and their assembler mnemonics; the remaining sheets would contain a description of the libraries and of the operating system (if present). Having this as your only "tool" was intimidating, which caused a large proportion of new hires to drop out. To those remaining, this documentation did not confer any illusion of power: it was clear that everything about programming still needed to be learned.

In this environment two favourable factors existed. The first was that one of the prerequisites for being a promising oddball was a nontrivial college degree. Examples were philosophy (J. A. Robinson), English literature (Mark Halpern), classics (C. A. R. Hoare), and physics (E. W. Dijkstra). Such a degree ensured, if not exceptional intelligence, at least not a debilitating lack of it. The second was that being thrown into the deep end in this way was educative, something that cannot be said of the typical first-year programming text, dumbed down in the way that only Educators have the secret of.

What to do? One interesting initiative came to fruition in the 2000-2001 academic year when the ArsDigita University existed. It taught a one-year course in computer science to graduates selected for a minimum of exposure to computer science or programming. In each of the ten months of the course, one book was studied. The first was "Structure and Interpretation of Computer Programs" by H. A. Abelson and G. J. Sussman, computer scientists and programmers rather than Educators.

In spite of the attractions of the ArsDigita initiative, we need not give up on university computer science programs. The main stumbling block is that the programming courses are a deterrent to those who picked up their contents on their own when they were fifteen. By the simple expedient of arranging for direct entry to third-year computer science courses on the basis of passing a suitable examination in programming, much talent can be saved for programming.