Paul Graham: Essays (November 25, 2024)
The Hundred-Year Language

This article examines the evolutionary trajectory of programming languages. The author predicts how languages will develop over the next hundred years, arguing that languages, like species, will form evolutionary trees and ultimately converge toward small, clean, easy-to-use cores. The article also considers the consequences of ever faster computers, concluding that future language design will emphasize convenience over speed, and illustrates the point with strings and lists. In the author's view, future language design should focus on the core elements, reduce redundancy, and trade some efficiency for greater convenience, keeping pace with growing computing power.

🤔 **The evolutionary tree of programming languages:** The author argues that programming languages, like species, form evolutionary trees. Some languages become evolutionary dead ends, such as Cobol and Java, while others become the main branches.

🚀 **The impact of faster computers:** The author predicts that computers will become vastly faster, which will require programming languages to cover a much wider range of efficiencies; some languages now considered slow may become more popular.

💡 **The importance of the core:** The author argues that a language's fundamental operators are crucial; like axioms in mathematics, they determine the language's long-term survival. Language design should keep the core small and clean and reduce redundancy.

⏳ **Trading efficiency for convenience:** The author argues that future language design should favor convenience even at some cost in efficiency. In practice this means fewer data structures that exist only to make programs run faster.

💻 **The nature of programs:** The author believes that in a hundred years people will still write programs to tell computers what to do, even though some tasks that require programs today may no longer do so. A program is fundamentally a formal description of a problem.

April 2003

(This essay is derived from a keynote talk at PyCon 2003.)

It's hard to predict what life will be like in a hundred years. There are only a few things we can say with certainty. We know that everyone will drive flying cars, that zoning laws will be relaxed to allow buildings hundreds of stories tall, that it will be dark most of the time, and that women will all be trained in the martial arts. Here I want to zoom in on one detail of this picture. What kind of programming language will they use to write the software controlling those flying cars?

This is worth thinking about not so much because we'll actually get to use these languages as because, if we're lucky, we'll use languages on the path from this point to that.

I think that, like species, languages will form evolutionary trees, with dead-ends branching off all over. We can see this happening already. Cobol, for all its sometime popularity, does not seem to have any intellectual descendants. It is an evolutionary dead-end-- a Neanderthal language.

I predict a similar fate for Java. People sometimes send me mail saying, "How can you say that Java won't turn out to be a successful language? It's already a successful language." And I admit that it is, if you measure success by shelf space taken up by books on it (particularly individual books on it), or by the number of undergrads who believe they have to learn it to get a job. When I say Java won't turn out to be a successful language, I mean something more specific: that Java will turn out to be an evolutionary dead-end, like Cobol.

This is just a guess. I may be wrong. My point here is not to dis Java, but to raise the issue of evolutionary trees and get people asking, where on the tree is language X? The reason to ask this question isn't just so that our ghosts can say, in a hundred years, I told you so. It's because staying close to the main branches is a useful heuristic for finding languages that will be good to program in now.

At any given time, you're probably happiest on the main branches of an evolutionary tree. Even when there were still plenty of Neanderthals, it must have sucked to be one. The Cro-Magnons would have been constantly coming over and beating you up and stealing your food.

The reason I want to know what languages will be like in a hundred years is so that I know what branch of the tree to bet on now.

The evolution of languages differs from the evolution of species because branches can converge. The Fortran branch, for example, seems to be merging with the descendants of Algol. In theory this is possible for species too, but it's not likely to have happened to any bigger than a cell. Convergence is more likely for languages partly because the space of possibilities is smaller, and partly because mutations are not random. Language designers deliberately incorporate ideas from other languages.

It's especially useful for language designers to think about where the evolution of programming languages is likely to lead, because they can steer accordingly. In that case, "stay on a main branch" becomes more than a way to choose a good language. It becomes a heuristic for making the right decisions about language design.

Any programming language can be divided into two parts: some set of fundamental operators that play the role of axioms, and the rest of the language, which could in principle be written in terms of these fundamental operators. I think the fundamental operators are the most important factor in a language's long term survival. The rest you can change. It's like the rule that in buying a house you should consider location first of all. Everything else you can fix later, but you can't fix the location.
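
To make that split between axioms and everything else concrete, here is a minimal sketch in Python (my own illustration, with invented names, not anything the essay prescribes): a few primitive operators play the role of axioms, and a small list library is then written entirely in terms of them.

```python
# A toy model of "axioms plus everything else": a handful of primitive
# operators (cons/car/cdr/nil, modeled here with Python tuples), with the
# rest of a small list library defined purely in terms of them.

nil = None

def cons(x, xs): return (x, xs)
def car(pair):   return pair[0]
def cdr(pair):   return pair[1]

# Everything below is "the rest of the language": derivable, hence changeable.
def length(xs):
    return 0 if xs is nil else 1 + length(cdr(xs))

def mapcar(f, xs):
    return nil if xs is nil else cons(f(car(xs)), mapcar(f, cdr(xs)))

xs = cons(1, cons(2, cons(3, nil)))
print(length(xs))                      # 3
print(mapcar(lambda n: n * n, xs))     # (1, (4, (9, None)))
```
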
I think it's important not just that the axioms be well chosen, but that there be few of them. Mathematicians have always felt this way about axioms-- the fewer, the better-- and I think they're onto something.

At the very least, it has to be a useful exercise to look closely at the core of a language to see if there are any axioms that could be weeded out. I've found in my long career as a slob that cruft breeds cruft, and I've seen this happen in software as well as under beds and in the corners of rooms.

I have a hunch that the main branches of the evolutionary tree pass through the languages that have the smallest, cleanest cores. The more of a language you can write in itself, the better.

Of course, I'm making a big assumption in even asking what programming languages will be like in a hundred years. Will we even be writing programs in a hundred years? Won't we just tell computers what we want them to do?

There hasn't been a lot of progress in that department so far. My guess is that a hundred years from now people will still tell computers what to do using programs we would recognize as such. There may be tasks that we solve now by writing programs and which in a hundred years you won't have to write programs to solve, but I think there will still be a good deal of programming of the type that we do today.

It may seem presumptuous to think anyone can predict what any technology will look like in a hundred years. But remember that we already have almost fifty years of history behind us. Looking forward a hundred years is a graspable idea when we consider how slowly languages have evolved in the past fifty.

Languages evolve slowly because they're not really technologies. Languages are notation. A program is a formal description of the problem you want a computer to solve for you. So the rate of evolution in programming languages is more like the rate of evolution in mathematical notation than, say, transportation or communications. Mathematical notation does evolve, but not with the giant leaps you see in technology.

Whatever computers are made of in a hundred years, it seems safe to predict they will be much faster than they are now. If Moore's Law continues to put out, they will be 74 quintillion (73,786,976,294,838,206,464) times faster. That's kind of hard to imagine. And indeed, the most likely prediction in the speed department may be that Moore's Law will stop working. Anything that is supposed to double every eighteen months seems likely to run up against some kind of fundamental limit eventually. But I have no trouble believing that computers will be very much faster. Even if they only end up being a paltry million times faster, that should change the ground rules for programming languages substantially. Among other things, there will be more room for what would now be considered slow languages, meaning languages that don't yield very efficient code.
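
For what it's worth, the arithmetic behind that figure is easy to reproduce, assuming (as the essay does) one doubling every eighteen months for a hundred years:

```python
# One doubling every eighteen months, sustained for a hundred years, is
# 100*12/18 doublings; 2**66 is the 73,786,976,294,838,206,464
# ("74 quintillion") quoted in the text.
doublings = 100 * 12 / 18
print(doublings)   # 66.66...
print(2 ** 66)     # 73786976294838206464
```
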
And yet some applications will still demand speed. Some of the problems we want to solve with computers are created by computers; for example, the rate at which you have to process video images depends on the rate at which another computer can generate them. And there is another class of problems which inherently have an unlimited capacity to soak up cycles: image rendering, cryptography, simulations.

If some applications can be increasingly inefficient while others continue to demand all the speed the hardware can deliver, faster computers will mean that languages have to cover an ever wider range of efficiencies. We've seen this happening already. Current implementations of some popular new languages are shockingly wasteful by the standards of previous decades.

This isn't just something that happens with programming languages. It's a general historical trend. As technologies improve, each generation can do things that the previous generation would have considered wasteful. People thirty years ago would be astonished at how casually we make long distance phone calls. People a hundred years ago would be even more astonished that a package would one day travel from Boston to New York via Memphis.

I can already tell you what's going to happen to all those extra cycles that faster hardware is going to give us in the next hundred years. They're nearly all going to be wasted.

I learned to program when computer power was scarce. I can remember taking all the spaces out of my Basic programs so they would fit into the memory of a 4K TRS-80. The thought of all this stupendously inefficient software burning up cycles doing the same thing over and over seems kind of gross to me. But I think my intuitions here are wrong. I'm like someone who grew up poor, and can't bear to spend money even for something important, like going to the doctor.

Some kinds of waste really are disgusting. SUVs, for example, would arguably be gross even if they ran on a fuel which would never run out and generated no pollution. SUVs are gross because they're the solution to a gross problem. (How to make minivans look more masculine.)

But not all waste is bad. Now that we have the infrastructure to support it, counting the minutes of your long-distance calls starts to seem niggling. If you have the resources, it's more elegant to think of all phone calls as one kind of thing, no matter where the other person is.

There's good waste, and bad waste. I'm interested in good waste-- the kind where, by spending more, we can get simpler designs. How will we take advantage of the opportunities to waste cycles that we'll get from new, faster hardware?

The desire for speed is so deeply engrained in us, with our puny computers, that it will take a conscious effort to overcome it. In language design, we should be consciously seeking out situations where we can trade efficiency for even the smallest increase in convenience.

Most data structures exist because of speed. For example, many languages today have both strings and lists. Semantically, strings are more or less a subset of lists in which the elements are characters. So why do you need a separate data type? You don't, really. Strings only exist for efficiency. But it's lame to clutter up the semantics of the language with hacks to make programs run faster. Having strings in a language seems to be a case of premature optimization.

If we think of the core of a language as a set of axioms, surely it's gross to have additional axioms that add no expressive power, simply for the sake of efficiency. Efficiency is important, but I don't think that's the right way to get it.

The right way to solve that problem, I think, is to separate the meaning of a program from the implementation details. Instead of having both lists and strings, have just lists, with some way to give the compiler optimization advice that will allow it to lay out strings as contiguous bytes if necessary.
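
Here is a small Python sketch of that idea, with names invented purely for illustration: represent a string as a plain list of characters, and the usual string operations become ordinary list operations, while a recursive function handles the kind of pattern that is awkward to express as a regular expression.

```python
# A sketch of "strings are just lists of characters": generic list
# operations and plain recursive functions do the work of a string type.

def chars(s):                         # "abc" -> ['a', 'b', 'c']
    return list(s)

def upcase(cs):                       # per-element map, as on any list
    return [c.upper() for c in cs]

def balanced(cs, depth=0):            # a recursive check that is awkward
    if depth < 0:                     # to express as a regular expression
        return False
    if not cs:
        return depth == 0
    head, rest = cs[0], cs[1:]
    if head == "(":
        return balanced(rest, depth + 1)
    if head == ")":
        return balanced(rest, depth - 1)
    return balanced(rest, depth)

print(upcase(chars("hundred")) + chars(" years"))
print(balanced(chars("(a(b)c)")))     # True
print(balanced(chars("(a(b)c")))      # False
```
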
Since speed doesn't matter in most of a program, you won't ordinarily need to bother with this sort of micromanagement. This will be more and more true as computers get faster.

Saying less about implementation should also make programs more flexible. Specifications change while a program is being written, and this is not only inevitable, but desirable.

The word "essay" comes from the French verb "essayer", which means "to try". An essay, in the original sense, is something you write to try to figure something out. This happens in software too. I think some of the best programs were essays, in the sense that the authors didn't know when they started exactly what they were trying to write.

Lisp hackers already know about the value of being flexible with data structures. We tend to write the first version of a program so that it does everything with lists. These initial versions can be so shockingly inefficient that it takes a conscious effort not to think about what they're doing, just as, for me at least, eating a steak requires a conscious effort not to think where it came from.

What programmers in a hundred years will be looking for, most of all, is a language where you can throw together an unbelievably inefficient version 1 of a program with the least possible effort. At least, that's how we'd describe it in present-day terms. What they'll say is that they want a language that's easy to program in.

Inefficient software isn't gross. What's gross is a language that makes programmers do needless work. Wasting programmer time is the true inefficiency, not wasting machine time. This will become ever more clear as computers get faster.

I think getting rid of strings is already something we could bear to think about. We did it in Arc, and it seems to be a win; some operations that would be awkward to describe as regular expressions can be described easily as recursive functions.

How far will this flattening of data structures go? I can think of possibilities that shock even me, with my conscientiously broadened mind. Will we get rid of arrays, for example? After all, they're just a subset of hash tables where the keys are vectors of integers. Will we replace hash tables themselves with lists?

There are more shocking prospects even than that. The Lisp that McCarthy described in 1960, for example, didn't have numbers. Logically, you don't need to have a separate notion of numbers, because you can represent them as lists: the integer n could be represented as a list of n elements. You can do math this way. It's just unbearably inefficient.
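
To see what doing math this way might look like, here is a minimal Python sketch (my own illustration; McCarthy's paper of course defines nothing resembling these function names):

```python
# McCarthy-style numbers-as-lists: the integer n is a list of n elements,
# and arithmetic is just list manipulation. Unbearably inefficient, as the
# text says, but semantically it works.

def num(n):                 # the list representation of n
    return [()] * n

def value(xs):              # read a number back out, for display
    return len(xs)

def add(xs, ys):            # addition is concatenation
    return xs + ys

def mul(xs, ys):            # multiplication is repeated addition
    out = []
    for _ in xs:
        out = add(out, ys)
    return out

print(value(add(num(2), num(3))))   # 5
print(value(mul(num(4), num(6))))   # 24
```
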
No one actually proposed implementing numbers as lists in practice. In fact, McCarthy's 1960 paper was not, at the time, intended to be implemented at all. It was a theoretical exercise, an attempt to create a more elegant alternative to the Turing Machine. When someone did, unexpectedly, take this paper and translate it into a working Lisp interpreter, numbers certainly weren't represented as lists; they were represented in binary, as in every other language.

Could a programming language go so far as to get rid of numbers as a fundamental data type? I ask this not so much as a serious question as a way to play chicken with the future. It's like the hypothetical case of an irresistible force meeting an immovable object-- here, an unimaginably inefficient implementation meeting unimaginably great resources. I don't see why not. The future is pretty long. If there's something we can do to decrease the number of axioms in the core language, that would seem to be the side to bet on as t approaches infinity. If the idea still seems unbearable in a hundred years, maybe it won't in a thousand.

Just to be clear about this, I'm not proposing that all numerical calculations would actually be carried out using lists. I'm proposing that the core language, prior to any additional notations about implementation, be defined this way. In practice any program that wanted to do any amount of math would probably represent numbers in binary, but this would be an optimization, not part of the core language semantics.

Another way to burn up cycles is to have many layers of software between the application and the hardware. This too is a trend we see happening already: many recent languages are compiled into byte code. Bill Woods once told me that, as a rule of thumb, each layer of interpretation costs a factor of 10 in speed. This extra cost buys you flexibility.

The very first version of Arc was an extreme case of this sort of multi-level slowness, with corresponding benefits. It was a classic "metacircular" interpreter written on top of Common Lisp, with a definite family resemblance to the eval function defined in McCarthy's original Lisp paper. The whole thing was only a couple hundred lines of code, so it was very easy to understand and change. The Common Lisp we used, CLisp, itself runs on top of a byte code interpreter. So here we had two levels of interpretation, one of them (the top one) shockingly inefficient, and the language was usable. Barely usable, I admit, but usable.
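
As a rough present-day illustration of that rule of thumb (a sketch of my own, not a measurement of Arc or CLisp), compare running a calculation directly in Python with running it through a tiny tree-walking interpreter written in Python, which stacks one more layer of interpretation on top of Python's own bytecode interpreter:

```python
# The same arithmetic computed directly and via a toy interpreter. The
# exact ratio varies by machine and workload; this only illustrates that
# an extra layer of interpretation costs roughly an order of magnitude.
import time

def interp(expr):
    """Evaluate nested ('add'|'mul', a, b) tuples or plain numbers."""
    if isinstance(expr, (int, float)):
        return expr
    op, a, b = expr
    if op == "add":
        return interp(a) + interp(b)
    if op == "mul":
        return interp(a) * interp(b)
    raise ValueError(op)

expr = ("add", ("mul", 3, 4), ("mul", 5, 6))   # 3*4 + 5*6
a, b, c, d = 3, 4, 5, 6

N = 200_000
t0 = time.perf_counter()
for _ in range(N):
    a * b + c * d
t1 = time.perf_counter()
for _ in range(N):
    interp(expr)
t2 = time.perf_counter()
print("direct:      %.3fs" % (t1 - t0))
print("interpreted: %.3fs" % (t2 - t1))
```
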
Writing software as multiple layers is a powerful technique even within applications. Bottom-up programming means writing a program as a series of layers, each of which serves as a language for the one above. This approach tends to yield smaller, more flexible programs. It's also the best route to that holy grail, reusability. A language is by definition reusable. The more of your application you can push down into a language for writing that type of application, the more of your software will be reusable.

Somehow the idea of reusability got attached to object-oriented programming in the 1980s, and no amount of evidence to the contrary seems to be able to shake it free. But although some object-oriented software is reusable, what makes it reusable is its bottom-upness, not its object-orientedness. Consider libraries: they're reusable because they're language, whether they're written in an object-oriented style or not.

I don't predict the demise of object-oriented programming, by the way. Though I don't think it has much to offer good programmers, except in certain specialized domains, it is irresistible to large organizations. Object-oriented programming offers a sustainable way to write spaghetti code. It lets you accrete programs as a series of patches. Large organizations always tend to develop software this way, and I expect this to be as true in a hundred years as it is today.

As long as we're talking about the future, we had better talk about parallel computation, because that's where this idea seems to live. That is, no matter when you're talking, parallel computation seems to be something that is going to happen in the future.

Will the future ever catch up with it? People have been talking about parallel computation as something imminent for at least 20 years, and it hasn't affected programming practice much so far. Or hasn't it? Already chip designers have to think about it, and so must people trying to write systems software on multi-cpu computers.

The real question is, how far up the ladder of abstraction will parallelism go? In a hundred years will it affect even application programmers? Or will it be something that compiler writers think about, but which is usually invisible in the source code of applications?

One thing that does seem likely is that most opportunities for parallelism will be wasted. This is a special case of my more general prediction that most of the extra computer power we're given will go to waste. I expect that, as with the stupendous speed of the underlying hardware, parallelism will be something that is available if you ask for it explicitly, but ordinarily not used. This implies that the kind of parallelism we have in a hundred years will not, except in special applications, be massive parallelism. I expect for ordinary programmers it will be more like being able to fork off processes that all end up running in parallel.

And this will, like asking for specific implementations of data structures, be something that you do fairly late in the life of a program, when you try to optimize it. Version 1s will ordinarily ignore any advantages to be got from parallel computation, just as they will ignore advantages to be got from specific representations of data.

Except in special kinds of applications, parallelism won't pervade the programs that are written in a hundred years. It would be premature optimization if it did.
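
One present-day analogue of that kind of explicit, coarse-grained parallelism (a sketch under the assumption that Python's multiprocessing module is a fair stand-in) might look like the following, and it is exactly the sort of thing you would bolt on late, while optimizing:

```python
# Parallelism you ask for explicitly: fork off worker processes for a
# cycle-hungry task. The task here is just a stand-in for real work.
from multiprocessing import Pool

def render_frame(i):
    return sum(x * x for x in range(i * 100_000))

if __name__ == "__main__":
    with Pool() as pool:                       # processes forked explicitly
        frames = pool.map(render_frame, range(8))
    print(frames[:3])
```
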
How many programming languages will there be in a hundred years? There seem to be a huge number of new programming languages lately. Part of the reason is that faster hardware has allowed programmers to make different tradeoffs between speed and convenience, depending on the application. If this is a real trend, the hardware we'll have in a hundred years should only increase it.

And yet there may be only a few widely-used languages in a hundred years. Part of the reason I say this is optimism: it seems that, if you did a really good job, you could make a language that was ideal for writing a slow version 1, and yet with the right optimization advice to the compiler, would also yield very fast code when necessary. So, since I'm optimistic, I'm going to predict that despite the huge gap they'll have between acceptable and maximal efficiency, programmers in a hundred years will have languages that can span most of it.

As this gap widens, profilers will become increasingly important. Little attention is paid to profiling now. Many people still seem to believe that the way to get fast applications is to write compilers that generate fast code. As the gap between acceptable and maximal performance widens, it will become increasingly clear that the way to get fast applications is to have a good guide from one to the other.

When I say there may only be a few languages, I'm not including domain-specific "little languages". I think such embedded languages are a great idea, and I expect them to proliferate. But I expect them to be written as thin enough skins that users can see the general-purpose language underneath.

Who will design the languages of the future? One of the most exciting trends in the last ten years has been the rise of open-source languages like Perl, Python, and Ruby. Language design is being taken over by hackers. The results so far are messy, but encouraging. There are some stunningly novel ideas in Perl, for example. Many are stunningly bad, but that's always true of ambitious efforts. At its current rate of mutation, God knows what Perl might evolve into in a hundred years.

It's not true that those who can't do, teach (some of the best hackers I know are professors), but it is true that there are a lot of things that those who teach can't do. Research imposes constraining caste restrictions. In any academic field there are topics that are ok to work on and others that aren't. Unfortunately the distinction between acceptable and forbidden topics is usually based on how intellectual the work sounds when described in research papers, rather than how important it is for getting good results. The extreme case is probably literature; people studying literature rarely say anything that would be of the slightest use to those producing it.

Though the situation is better in the sciences, the overlap between the kind of work you're allowed to do and the kind of work that yields good languages is distressingly small. (Olin Shivers has grumbled eloquently about this.) For example, types seem to be an inexhaustible source of research papers, despite the fact that static typing seems to preclude true macros-- without which, in my opinion, no language is worth using.

The trend is not merely toward languages being developed as open-source projects rather than "research", but toward languages being designed by the application programmers who need to use them, rather than by compiler writers. This seems a good trend and I expect it to continue.

Unlike physics in a hundred years, which is almost necessarily impossible to predict, I think it may be possible in principle to design a language now that would appeal to users in a hundred years.

One way to design a language is to just write down the program you'd like to be able to write, regardless of whether there is a compiler that can translate it or hardware that can run it. When you do this you can assume unlimited resources. It seems like we ought to be able to imagine unlimited resources as well today as in a hundred years.

What program would one like to write? Whatever is least work. Except not quite: whatever would be least work if your ideas about programming weren't already influenced by the languages you're currently used to. Such influence can be so pervasive that it takes a great effort to overcome it. You'd think it would be obvious to creatures as lazy as us how to express a program with the least effort. In fact, our ideas about what's possible tend to be so limited by whatever language we think in that easier formulations of programs seem very surprising. They're something you have to discover, not something you naturally sink into.

One helpful trick here is to use the length of the program as an approximation for how much work it is to write. Not the length in characters, of course, but the length in distinct syntactic elements-- basically, the size of the parse tree. It may not be quite true that the shortest program is the least work to write, but it's close enough that you're better off aiming for the solid target of brevity than the fuzzy, nearby one of least work. Then the algorithm for language design becomes: look at a program and ask, is there any way to write this that's shorter?
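
One crude way to make that metric operational today (my own sketch; the essay does not prescribe any particular tool) is to count parse-tree nodes and compare two formulations of the same program:

```python
# Count the nodes in a program's parse tree as a rough measure of "length
# in distinct syntactic elements". Python's ast module stands in here for
# whatever notion of parse tree the hundred-year language would use.
import ast

def parse_tree_size(source: str) -> int:
    return sum(1 for _ in ast.walk(ast.parse(source)))

loopy = """
total = 0
for x in xs:
    if x % 2 == 0:
        total += x
"""
compact = "total = sum(x for x in xs if x % 2 == 0)"

print(parse_tree_size(loopy), parse_tree_size(compact))
```
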
In practice, writing programs in an imaginary hundred-year language will work to varying degrees depending on how close you are to the core. Sort routines you can write now. But it would be hard to predict now what kinds of libraries might be needed in a hundred years. Presumably many libraries will be for domains that don't even exist yet. If SETI@home works, for example, we'll need libraries for communicating with aliens. Unless of course they are sufficiently advanced that they already communicate in XML.

At the other extreme, I think you might be able to design the core language today. In fact, some might argue that it was already mostly designed in 1958.

If the hundred year language were available today, would we want to program in it? One way to answer this question is to look back. If present-day programming languages had been available in 1960, would anyone have wanted to use them?

In some ways, the answer is no. Languages today assume infrastructure that didn't exist in 1960. For example, a language in which indentation is significant, like Python, would not work very well on printer terminals. But putting such problems aside-- assuming, for example, that programs were all just written on paper-- would programmers of the 1960s have liked writing programs in the languages we use now?

I think so. Some of the less imaginative ones, who had artifacts of early languages built into their ideas of what a program was, might have had trouble. (How can you manipulate data without doing pointer arithmetic? How can you implement flow charts without gotos?) But I think the smartest programmers would have had no trouble making the most of present-day languages, if they'd had them.

If we had the hundred-year language now, it would at least make a great pseudocode. What about using it to write software? Since the hundred-year language will need to generate fast code for some applications, presumably it could generate code efficient enough to run acceptably well on our hardware. We might have to give more optimization advice than users in a hundred years, but it still might be a net win.

Now we have two ideas that, if you combine them, suggest interesting possibilities: (1) the hundred-year language could, in principle, be designed today, and (2) such a language, if it existed, might be good to program in today. When you see these ideas laid out like that, it's hard not to think, why not try writing the hundred-year language now?

When you're working on language design, I think it is good to have such a target and to keep it consciously in mind. When you learn to drive, one of the principles they teach you is to align the car not by lining up the hood with the stripes painted on the road, but by aiming at some point in the distance. Even if all you care about is what happens in the next ten feet, this is the right answer. I think we can and should do the same thing with programming languages.

Notes

I believe Lisp Machine Lisp was the first language to embody the principle that declarations (except those of dynamic variables) were merely optimization advice, and would not change the meaning of a correct program. Common Lisp seems to have been the first to state this explicitly.

Thanks to Trevor Blackwell, Robert Morris, and Dan Giffin for reading drafts of this, and to Guido van Rossum, Jeremy Hylton, and the rest of the Python crew for inviting me to speak at PyCon.


Related tags

Programming languages, language evolution, computer science, program design, future trends