13 September 2018 @ 05:51 pm
The Grosch Centenary: Revisiting Grosch's Law.  
Herbert Reuben John Grosch

September 13, 2018 marks the Grosch centenary: today would have been the hundredth birthday of Herbert Reuben John Grosch (1918-2010).

Grosch was a colourful figure in scientific and corporate computing. He began as an astronomer in the 1930s, worked in computing into the 1980s, and held senior positions such as president of the Association for Computing Machinery from 1976 to 1978. Some referred to him as a gadfly of the industry: he was free with candid and caustic comments about industry developments, and he also had strong opinions about the history of the field and historians.

I knew Grosch briefly near the end of his life, when he moved to Toronto. I had begun working on my thesis on his colleague Wallace J. Eckert, and he was always more than willing to talk with me and share his knowledge of Eckert and the computer field more generally. I have talked about Grosch before on the IT History Society blog. In October, at a conference in St. Louis, I will be presenting a talk on his most famous contribution, Grosch's Law, which related the speed of a computer to the economy (cheapness) of its operation. In this blog post I want to give a summary of some of the information in that talk, focusing on Grosch's own formulation and interpretation of the law.

The law was announced in print in 1953 as follows:

I believe that there is a fundamental rule, which I modestly call Grosch's law, giving added economy only as the square root of the increase in speed - that is, to do a calculation ten times as cheaply you must do it one hundred times as fast. (Grosch 1953, 310)

So this post is about Grosch's law. I have noticed that the accounts of Grosch's law I run across tend to be partial, and even confused, about what the law was or meant, so I will try to touch on some of its different meanings and interpretations and clarify them along the way.

Grosch trained as an astronomer at the University of Michigan, completing a PhD in 1942. After working at various locations during World War II, he ended the war at the Thomas J. Watson Scientific Computing Laboratory at Columbia University in New York. He continued to work there for five more years, helping various scientific researchers make use of machine computation, and it was there in 1950 that he formulated his eponymous law.

By 1950 Grosch had ambitions to move up in IBM by finding science and engineering clients who would use IBM's new lines of electronic calculators in their research. In particular, he was angling to head a Washington, D.C. service bureau for IBM, renting out the new Card Program Calculator II (or CPC II), a combination of IBM accounting machines that included their new electronic calculator, the IBM 605 (an upgrade from the 604 of the original CPC).

As he worked on securing this position, he began to think about how pricing would work and be justified. As he recounts in his autobiography:

One afternoon, sitting comfortably next to my trusty Marchant, I began to lay out some ballpark figures on machine costs. I had the rental prices of the CPC I and the 604, and a pretty good idea of what the rental of the CPC II was going to be. I had the rentals of the 602A, the 602, the 601 for comparison. Making very rough estimates indeed, I converted what I knew of costs of the SSEC, the ASCC at Harvard, and the ENIAC, to monthly figures comparable to IBM rentals. Going further afield, I added unfinished machines like BINAC and the MIT WHIRLWIND, and SEAC in Washington. I had heard little rumors about our Defense Calculator [...] and I put in a number based on $10,000 a month rental[.] (Grosch 1991, 131)

He wanted to compare these costs to a measure of performance. As someone doing celestial mechanics calculations in the late 1930s, Grosch had become aware of the work of L. J. Comrie, who used British punched card machines to perform various calculations. At the Watson lab Grosch had met and talked with Comrie about scientific computation, and this, along with his own experience, led to the following chain of reasoning.

What these gadgets really had to offer was speed; people were still cheap in 1950, and Comrie had told me the economical way for his girls [human computers] to do a multiplication was on a Brunsviga, punching the result on a card to get back into Hollerith [punched card] mode (as compared to low utilization on a 601, he explained). Speed for the kind of work I did depended on multiply time; [...] So I used multiply speed as my measure of performance. (Grosch 1991, 131-132)

Note that here it is the practice of pre-computer scientific computation that is a key guide to Grosch's attempt to understand the economics and logistics of what becomes a statement about computers. The question of which part of the computation serves as the benchmark of speed or performance is key, and here Grosch is relying on his own experience and Comrie's in this tradition of computation.

Grosch continues:

I plotted this all up on a casual piece of log paper, cost versus speed. It looked sort of linear. I extended the baseline: added estimates for desk calculator operation, electric and hand-cranked, and for logarithmic work, and even for Crelle's tables and pencil-and-paper. It still looked linear. I was about to draw a line on the sheet and go back to other work, when I noticed that the probable slope was about one-half. I drew a line with precisely that slope, and it fitted the dozen or so points reasonably well. "Ah," said I, "economy is as the square root of the speed"! Grosch's Law was born. (Grosch 1991, 132)
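The square-root relation can be made concrete with a little arithmetic. The following is only a minimal sketch of the relation as Grosch states it, not anything he wrote himself; the function names economy_gain, speedup_needed and loglog_slope are hypothetical and introduced purely for illustration.

```python
import math


def economy_gain(speedup: float) -> float:
    """Factor by which a calculation gets cheaper for a given speedup,
    assuming economy grows as the square root of speed."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return math.sqrt(speedup)


def speedup_needed(economy: float) -> float:
    """Inverse relation: the speedup required for a given gain in economy."""
    if economy <= 0:
        raise ValueError("economy must be positive")
    return economy ** 2


def loglog_slope(speedup: float) -> float:
    """Slope of the economy-versus-speed line on log-log paper,
    measured from the origin (speedup 1, economy 1)."""
    return math.log(economy_gain(speedup)) / math.log(speedup)


# Grosch's own example: ten times as cheap needs one hundred times as fast.
print(speedup_needed(10.0))
print(economy_gain(100.0))
# The slope of one-half that Grosch noticed on his log paper.
print(loglog_slope(100.0))
```

Under this relation, quadrupling the speed only halves the cost of a calculation, since economy_gain(4.0) is 2.0.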

Note that the cost being considered is not the cost of a computer so much as the total cost of computation, including the wages of the humans involved. Grosch went on to the Washington service bureau, selling rentals of the CPC II, and then to be fired from IBM for the first time. Presumably, while selling CPC II rentals to scientific and engineering firms, he used Grosch's law as part of his pitch: the new machines might have an expensive sticker price, but the price of each individual multiplication would be greatly improved. So Grosch's law was in some respects the classic sales pitch: buy the expensive machine and save so much money. Grosch talked up his discovery at various conferences and industry events over the course of the 1950s.

In 1953 Grosch was working at General Electric, managing computations on an IBM 701 they were renting, and he published the paper "High Speed Arithmetic: The Digital Computer as a Research Tool" in the Journal of the Optical Society of America. Grosch's law appears in print for the first time tucked in at the end of this paper. Grosch spends most of the paper discussing ways that a digital computer can be used in scientific research, from lens design to the machine indexing, abstracting and organization of scientific papers and information. Grosch had a broad and ambitious view of what computers were capable of achieving. In the introduction to the paper he refers to the rapid development of computer technology, noting:

J. W. Forrester, the director of the Digital Computer Laboratory of the Massachusetts Institute of Technology, has said that if we combine the speed, reliability, memory capacity, and cost of typical calculating machines into a suitable index or figure of merit this index has increased multiplicatively by a factor of ten each year since 1945 and shows no sign of slackening. (Grosch 1953, 306)

Note how this observation is rendered in a mathematical form not unlike Grosch's own law or other famous laws of the computer industry like Moore's law. This suggests that Grosch's observation and later ones draw on an older tradition, or tendency, of making these sorts of simple but quantitative observations.

Grosch had a particular explanation for his law, comparing the computer user to the square-rigger sailor who clambers over and manipulates the rigging of a tall merchant ship of yore. As he put it in 1975:

in some obscure human-related way it reflected the professional user's application of the square-rigger motto: one hand for yourself and one for the ship. Given a burst of new power, the programmer would let the boss have some and keep the rest to play with[.] (Grosch 1975)

Grosch's law tends to be paraphrased as saying that there is an economy of scale in computing: the faster, more powerful computer yields a lower unit cost for each calculation. The thing to notice here is that Grosch is explaining why some of the enlarged performance of a faster computer fails to translate into greater efficiency, that is, into a larger economy of scale. One might imagine that a computer twice as fast would also be twice or more as complex and so offer no economy of scale. Instead, Grosch's assumption is that by, say, quadrupling speed one should achieve not merely a halving of cost but something closer to a quartering of it. After all, a machine that uses the same resources but does twice as much work in the same time would be twice as fast and twice as cheap. Clearly, to be a viable technology, the electronic computer, with its massive speed increase over earlier techniques, had to offer some economies of scale. If the ENIAC (the first large electronic computing machine) had worked but the cost of each multiplication had been the same as for a human operating a desk calculator, then the ENIAC would have been a machine that ate money at the rate of over $100,000 an hour. Yet exactly what that economy should be hardly seems obvious to me, though apparently it did to Grosch.

Grosch's explanation, that the increased performance of a computer's hardware would be used up by programmers and other people in the system serving their own ends, suggests that people expanded the work the computer did in its task to fill the new capacity beyond what was strictly necessary. So this explanation is a variation on Parkinson's law that "work expands to fill the time available", or in this case part of the computer's capacity. Parkinson first published his law in a satirical Economist article of 1955 and, like Grosch, named the law after himself. This suggests that, in Grosch's mind at least, Grosch's law was more akin to this sort of qualitative, wry observation on human nature than to a precise mathematical engineering relation.

Grosch also admitted that there was an element of self-fulfilling prophecy to his law: people looking to price new computer equipment simply generated the price by applying Grosch's law. Others often suggested this as an explanation for the law. Yet over the course of his career Grosch insisted that his law held good, that the faster, bigger computing operation yielded the greater economy. He emphasized that it was the cost of the whole system that mattered, and this reflects his attitude to other aspects of managing and using computers.

Grosch was an advocate of the closed computing shop, that is, a computing installation run by a cadre of dedicated computer operators, programmers and so on, who were the ones working directly on the machines on behalf of clients such as scientists, engineers and business users. This contrasted with open shops, where the clients programmed the computers and ran the machines themselves. Grosch felt that open shop users tended to favour programming methods that were easy to learn but inefficient, and to engage in fanciful projects of limited utility. So Grosch's understanding of his law reflects not merely his sense of how computing machines worked but also how computing was organized in terms of people.

In terms of how Grosch's law was first received and remembered by the community of computer users, it seems at first to have made little impact on the published record. I have found only one reference to it from the 1950s outside Grosch's 1953 paper, although it is an early one. In the Proceedings of the 1954 Joint Computer Congress in Philadelphia, Pennsylvania, the keynote speech of C. W. Adams, "Small computers in a Big World", includes the following statement on the advantage of using a large central computer facility:

This can be seen from a fairly obvious empirical relationship (which we might call Grosch's Law) to the effect that the amount of computation a machine can produce is roughly proportional to the square of the cost. (Adams 1955, 3)

So here Grosch's law is styled an empirical observation, remembered as an observable commonplace of the industry; indeed, Grosch's law is often called an "empirical law" in the literature. Note that Grosch was the program chair of this conference and in attendance at the keynote, so Adams may have referenced him more in recognition of his presence than anything else. Adams admitted that the actual balance of considerations for a small or large computer remained an obscure matter of dispute during these early days of the industry. Adams has stated the rule in the reverse form to Grosch's original version: instead of stating how much efficiency one will get for a given amount of speed, he states the law in terms of how much speed will be accrued for a given cost. This also makes the economy of scale more obvious, since speed increases as the square of the total cost (instead of the unit cost of a computer operation), and it seems to remove any need to talk about why more efficiency is not achieved. This is actually how Grosch's law is almost always stated, although Grosch himself always gave his original square root formulation.
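That the two statements are the same rule read in opposite directions can be checked in a few lines. This is only a sketch under assumed notation: the symbols S (speed, in operations per unit time), E (economy, in operations per dollar), C (total cost per unit time) and the constant k are mine, and appear in neither Grosch nor Adams.

```latex
% S, E, C and k are illustrative symbols, not notation from the sources.
\begin{align*}
E &= k\sqrt{S}
  && \text{(Grosch: economy as the square root of speed)} \\
C &= \frac{S}{E} = \frac{\sqrt{S}}{k}
  && \text{(total cost per unit time is speed divided by economy)} \\
S &= k^{2} C^{2}
  && \text{(Adams: computation as the square of the cost)}
\end{align*}
```

On this reading, a machine one hundred times as fast is ten times as economical per operation and costs ten times as much to run, which is Grosch's own example restated in Adams's terms.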

In the 1960s discussion of Grosch's law becomes more common and it is treated as a well-known fact. I find well over a dozen references to Grosch in the 60s, starting with a 1960 conference proceedings. A major concern becomes whether the law is valid: is there really an economy of scale for larger computers? In 1962 C. W. Adams is back to write on the question in a piece whose title declares "Grosch's Law Repealed." Adams seems somewhat of an outlier in the 1960s in his skepticism of Grosch's Law. The most extensive study of the cost of computing against performance was given by Kenneth E. Knight. Knight's 1963 Carnegie Tech PhD thesis includes a careful analysis of the question, and he republished some key results in Datamation in 1966 as "Changes in Computer Performance: a historic view." Knight constructed a careful equation to classify and compare the performance factors of computers and declared "Grosch's law upheld", a response to Adams. A similarly positive analysis of some cases was taken up by Martin B. Solomon Jr., who looked first at the validity of Grosch's law in the IBM 360 line of computers and later at the economies of scale for personnel at larger computer installations, and concluded that Grosch's law was basically an accurate reflection of the relationship between price and performance.

Knight and Solomon would be heavily cited by later authors on the costs of computing, such as William Sharpe in his 1969 Rand Corporation study, The Economics of Computers. Although these sorts of studies remained prominent, doubts seem to have persisted about how robust and valid Grosch's law was. It is also interesting that these studies presented themselves, and are remembered, as proving Grosch's law rather than as replacing it with a more rigorous set of measures and relationships between performance and cost. This reinforces the sense of the 60s as a time when large mainframe computers were king and were understood to offer the best performance.

Beyond general arguments about economies of scale, Grosch's law is remembered in certain interesting specific contexts. Perhaps the most notable of these is as a key motivation for the computer or information utility, that is, the notion that came to prominence in the mid-60s that computing services would best be provided by connecting businesses and homes to large centralized computing facilities, via phone or cable, akin to a large electrical utility plant. Time sharing (multiple users on the same computer) and telecomputing services (accessing a computer via telephone or cable) were being developed at this time, and the computer utility was a particularly ambitious extrapolation of that trend. Advocates of such schemes would often invoke the economies of scale of Grosch's law as a clear demonstration of their superiority. Although not all accounts of these computer visions mention Grosch's law, they are almost always motivated by the idea that such utilities would provide economical computing services. The point is that Grosch's law is remembered in this novel context as a key idea with novel implications. As the 1970s wore on, this idea proved impractical and failed to materialize, but it contained ideas that anticipate later developments in networking and computing.

Grosch's law is also invoked in more modest contexts. One of these is multi-processing, or what is later called parallel processing. It was a recurring debate in the 60s whether a machine with multiple CPUs working together could achieve significant performance improvements (either in general performance situations or by delegating different processors to specialized tasks like managing input/output). In principle Grosch's law says nothing about how the computer achieves its speed, but as the 60s wore on more researchers came to identify the computer with its CPU and the computer's speed with the CPU's. So in 1972 one researcher can begin a paper:

Professor Korn's recent note (in the August 1972 issue) revives the old multiprocessing vs. uniprocessing debate that was probably launched when Herb Grosch first formulated his "law" CPU power is equal to the square of its price, so for the price of two CPUs one should be able to get a single CPU four times as fast as either of the two. (Serlin 1972, 201)

So now not only are the fundamental price and speed of a computer equated with those of a CPU, so that the computer becomes the CPU, but it is anachronistically suggested that Grosch's law was always about the CPU. Knowing how Grosch formulated the law in terms of the entire process of computation, one can see the disconnect between this later interpretation and the original formulation. However, it illustrates how Grosch's law becomes a malleable frame through which to organize one's view of the computer.
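Serlin's arithmetic is easy to check under his own CPU-centred reading. The sketch below assumes his paraphrase that power equals the square of price; the function name cpu_power, the constant k and the unit price are hypothetical, chosen only to make the comparison concrete.

```python
def cpu_power(price: float, k: float = 1.0) -> float:
    """CPU power under Serlin's paraphrase: power = k * price**2.
    The constant k is an assumed scale factor, not a measured value."""
    if price < 0:
        raise ValueError("price must be non-negative")
    return k * price ** 2


unit_price = 1.0  # hypothetical price of one small CPU

one_small = cpu_power(unit_price)       # a single small CPU
two_small = 2 * one_small               # two small CPUs, ideal scaling
one_large = cpu_power(2 * unit_price)   # one CPU at twice the price

# The large CPU is four times as fast as either small one...
print(one_large / one_small)
# ...and twice as fast as the pair, even if the pair cooperated perfectly.
print(one_large / two_small)
```

The comparison assumes the two small CPUs lose nothing to coordination, so it is the most favourable case for multiprocessing under this reading.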

The last example of a different remembrance of Grosch's law I want to touch on is how it was used to explain the end of the mainframe era and the rise of the microprocessor and the personal computer. Already in 1975 Grosch wrote "Grosch's Law Revisited" in Computerworld, a response to claims that the rise of the minicomputer heralded the end of Grosch's law. Grosch responded by saying that while minis offered the illusion of cheapness, the full cost of the system would out and vindicate Grosch's law. (Grosch 1975) At about the same time microprocessors were coming to prominence, and experts were suggesting that the new mass-produced microprocessor would change the economics of computing. (Upton 1976) Soon there was a series of analyses of the economics of performance reminiscent of the efforts of Knight and Solomon in the 60s. The most prominent of these was the 1985 analysis of Phillip Ein-Dor, "Grosch's Law Re-revisited: CPU Power and the Cost of Computation." He divided computers into five classes (mainframes, small mainframes, supercomputers, minicomputers and microcomputers) and suggested that Grosch's law held within each class but not between classes. Finally, in 1987 Haim Mendelson concluded that the distinction between economies in different classes that Ein-Dor had detected was simply overfitting of noise in the data. (Mendelson 1987) After this Grosch's law comes to be seen more as reflecting a historical truth than current computer developments, although Grosch maintained his view that, if one considered whole systems, faster, larger, more centralized computing operations were more efficient. Again Grosch's Law is seen as a frame through which a major change, the rise of the microcomputer, might be remembered and understood.

While some of this activity resembles the attempts of the 1960s to measure economies of scale, the attempt to understand larger shifts in the industry is also a factor. It also suggests how Grosch's law can be seen as characterizing the earlier era of big machines, with its violation represented by the rise of the new small machines. There was always some uncertainty about the validity of Grosch's Law, and arguably the relationships that people like Knight demonstrated were not quite Grosch's law. But people still referred to and remembered such economies of scale as Grosch's law; again, it formed part of their conception of what computers were and how they tended to behave.

Grosch's law has also played an interesting role in the history of computing. In my talk in October I will discuss how two histories of the computer, Paul Ceruzzi's A History of Modern Computing and Martin Campbell-Kelly and William Aspray's Computer: A History of the Information Machine, use Grosch's law in telling the history of the computer, and how that reflects the changing view of the law.

Mirrored on my LinkedIn blog.