Programming is Hard, Let's Go Scripting...
by Larry Wall
|
Pages: 1, 2, 3
PHP
We've also seen the rise of PHP, which takes the worse-is-better approach to dazzling new depths, as it were. By and large PHP seems to be making the same progression of mistakes as early Perl did, only slower. The one thing it does better is packaging. And when I say packaging, I don't mean namespaces.
JavaScript
Then there's JavaScript, a nice clean design. It has some issues, but in the long run JavaScript might actually turn out to be a decent platform for running Perl 6 on. Pugs already has part of a backend for JavaScript, though sadly that has suffered some bitrot in the last year. I think when the new JavaScript engines come out we'll probably see renewed interest in a JavaScript backend.
Monad/PowerShell
I've looked a bit at Microsoft's Monad, and I'm pleased to note that it has object pipes like Perl 6. I just hope they don't patent it.
Lua, AppleScript
There are other scripting languages in wide use. Sadly, I must confess I never looked closely at Lua or AppleScript, probably because I'm not a game designer with a Mac.
Actually, I suspect it runs deeper than that, which brings us up to the present time.
The Present
When I look at the present situation, what I see is the various scripting communities behaving a lot like neighboring tribes in the jungle, sometimes trading, sometimes warring, but by and large just keeping out of each other's way in complacent isolation.
I tend to take an anthropological view of these things. Many of you here are Perl programmers, but some of you come from other programming tribes. And depending on your tribal history, you might think of a string as a pointer to a byte array if you're a C programmer, or as a list if you're a functional programmer, or as an object if you're a Java programmer. I view a string as a Text, with a capital T.
Text
I read that word from a postmodern perspective. Of course, the term Postmodern is itself context-sensitive. Some folks think Postmodernism means little more than the Empowerment of the Vulgar. Some folks think the same about Perl.
But I take Postmodernism to mean that a Text, whether spoken or written, is an act of communication requiring intelligence on both ends, and sometimes in the middle too. I don't want to talk to a stupid computer language. I want my computer language to understand the strings I type.
Perl is a postmodern language, and a lot of conservative folks feel like Postmodernism is a rather liberal notion. So it's rather ironic that my views on Postmodernism were primarily informed by studying linguistics and translation as taught by missionaries, specifically, the Wycliffe Bible Translators. One of the things they hammered home is that there's really no such thing as a primitive human language. By which they mean essentially that all human languages are Turing complete.
When you go out to so-called primitive tribes and analyze their languages, you find that structurally they're just about as complex as any other human language. Basically, you can say pretty much anything in any human language, if you work at it long enough. Human languages are Turing complete, as it were.
Human languages therefore differ not so much in what you can say but in what you must say. In English, you are forced to differentiate singular from plural. In Japanese, you don't have to distinguish singular from plural, but you do have to pick a specific level of politeness, taking into account not only your degree of respect for the person you're talking to, but also your degree of respect for the person or thing you're talking about.
So languages differ in what you're forced to say. Obviously, if your language forces you to say something, you can't be concise in that particular dimension using your language. Which brings us back to scripting.
How many ways are there for different scripting languages to be concise?
How many recipes for borscht are there in Russia?
Language designers have many degrees of freedom. I'd like to point out just a few of them.
early binding / late binding
Binding in this context is about exactly when you decide which routine you're going to call for a given routine name. In the early days of computing, most binding was done fairly early for efficiency reasons, either at compile time, or at the latest, at link time. You still tend to see this approach in statically typed languages. With languages like Smalltalk, however, we began to see a different trend, and these days most scripting languages are trending towards later binding. That's because scripting languages are trying to be dwimmy (Do What I Mean), and the dwimmiest decision is usually a late decision because you then have more available semantic and even pragmatic context to work with. Otherwise you have to predict the future, which is hard.
So scripting languages naturally tend to move toward an object-oriented point of view, where the binding doesn't happen 'til method dispatch time. You can still see the scars of conflict in languages like C++ and Java though. C++ makes the default method type non-virtual, so you have to say virtual explicitly to get late binding. Java has the notion of final classes, which force calls to the class to be bound at compile time, essentially. I think both of those approaches are big mistakes. Perl 6 will make different mistakes. In Perl 6 all methods are virtual by default, and only the application as a whole can tell the optimizer to finalize classes, presumably only after you know how all the classes are going to be used by all the other modules in the program.
single dispatch / multiple dispatch
In a sense, multiple dispatch is a way to delay binding even longer. You not only have to delay binding 'til you know the type of the object, but you also have to know the types of all rest of the arguments before you can pick a routine to call. Python and Ruby always do single dispatch, while Dylan does multiple dispatch. Here is one dimension in which Perl 6 forces the caller to be explicit for clarity. I think it's an important distinction for the programmer to bear in mind, because single dispatch and multiple dispatch are philosophically very different ideas, based on different metaphors.
With single-dispatch languages, you are basically sending a message to an object, and the object decides what to do with that message. With multiple dispatch languages, however, there is no privileged object. All the objects involved in the call have equal weight. So one way to look at multiple dispatch is that the objects are completely passive. But if the objects aren't deciding how to bind, who is?
Well, it's sort of a democratic thing. All the routines of a given name get together and hold a political conference. (Well, not really, but this is how the metaphor works.) Each of the routines is a delegate to the convention. All the potential candidates put their names in the hat. Then all the routines vote on who the best candidate is, and the next best, and the next best after that. And eventually the routines themselves decide what the best routine to call is.
So basically, multiple dispatch is like democracy. It's the worst way to do late binding, except for all the others.
But I really do think that's true, and likely to become truer as time goes on. I'm spending a lot of time on this multiple dispatch issue because I think programming in the large is mutating away from the command-and-control model implicit in single dispatch. I think the field of computation as a whole is moving more toward the kinds of decisions that are better made by swarms of insects or schools of fish, where no single individual is in control, but the swarm as a whole has emergent behaviors that are somehow much smarter than any of the individual components.
eager evaluation / lazy evaluation
Most languages evaluate eagerly, including Perl 5. Some languages evaluate all expressions as lazily as possible. Haskell is a good example of that. It doesn't compute anything until it is forced to. This has the advantage that you can do lots of cool things with infinite lists without running out of memory. Well, at least until someone asks the program to calculate the whole list. Then you're pretty much hosed in any language, unless you have a real Turing machine.
So anyway, in Perl 6 we're experimenting with a mixture of eager and lazy. Interestingly, the distinction maps very nicely onto Perl 5's concept of scalar context vs. list context. So in Perl 6, scalar context is eager and list context is lazy. By default, of course. You can always force a scalar to be lazy or a list to be eager if you like. But you can say things like for 1..Inf as long as your loop exits some other way a little bit before you run into infinity.
eager typology / lazy typology
Usually known as static vs. dynamic, but again there are various positions for the adjustment knob. I rather like the gradual typing approach for a number of reasons. Efficiency is one reason. People usually think of strong typing as a reason, but the main reason to put types into Perl 6 turns out not to be strong typing, but rather multiple dispatch. Remember our political convention metaphor? When the various candidates put their names in the hat, what distinguishes them? Well, each candidate has a political platform. The planks in those political platforms are the types of arguments they want to respond to. We all know politicians are only good at responding to the types of arguments they want to have...
There's another way in which Perl 6 is slightly more lazy than Perl 5. We still have the notion of contexts, but exactly when the contexts are decided has changed. In Perl 5, the compiler usually knows at compile time which arguments will be in scalar context, and which arguments will be in list context. But Perl 6 delays that decision until method binding time, which is conceptually at run time, not at compile time. This might seem like an odd thing to you, but it actually fixes a great number of things that are suboptimal in the design of Perl 5. Prototypes, for instance. And the need for explicit references. And other annoying little things like that, many of which end up as frequently asked questions.
limited structures / rich structures
Awk, Lua, and PHP all limit their composite structures to associative arrays. That has both pluses and minuses, but the fact that awk did it that way is one of the reasons that Perl does it differently, and differentiates ordered arrays from unordered hashes. I just think about them differently, and I think a lot of other people do too.
symbolic / wordy
Arguably APL is also a kind of scripting language, largely symbolic. At the other extreme we have languages that eschew punctuation in favor of words, such as AppleScript and COBOL, and to a lesser extent all the Algolish languages that use words to indicate blocks where the C-derived languages use curlies. I prefer a balanced approach here, where symbols and identifiers are each doing what theyre best at. I like it when most of the actual words are those chosen by the programmer to represent the problem at hand. I don't like to see words used for mere syntax. Such syntactic functors merely obscure the real words. That's one thing I learned when I switched from Pascal to C. Braces for blocks. It's just right visually.
Actually, there are languages that do it even worse than COBOL. I remember one Pascal variant that required your keywords to be capitalized so that they would stand out. No, no, no, no, no! You don't want your functors to stand out. It's shouting the wrong words: IF! foo THEN! bar ELSE! baz END! END! END! END!
Anyway, in Perl 6 we're raising the standard for where we use punctuation, and where we don't. We're getting rid of some of our punctuation that isn't really pulling its weight, such as parentheses around conditional expressions, and most of the punctuational variables. And we're making all the remaining punctuation work harder. Each symbol has to justify its existence according to Huffman coding.
Oddly, there's one spot where we're introducing new punctuation. After your sigil you can add a twigil, or secondary sigil. Just as a sigil tells you the basic structure of an object, a twigil tells you that a particular variable has a weird scope. This is basically an idea stolen from Ruby, which uses sigils to indicate weird scoping. But by hiding our twigils after our sigils, we get the best of both worlds, plus an extensible twigil system for weird scopes we haven't thought of yet.
We think about extensibility a lot. We think about languages we don't know how to think about yet. But leaving spaces in the grammar for new languages is kind of like reserving some of our land for national parks and national forests. Or like an archaeologist not digging up half the archaeological site because we know our descendants will have even better analytical tools than we have.
Really designing a language for the future involves a great deal of humility. As with science, you have to assume that, over the long term, a great deal of what you think is true will turn out not to be quite the case. On the other hand, if you don't make your best guess now, you're not really doing science either. In retrospect, we know APL had too many strange symbols. But we wouldn't be as sure about that if APL hadn't tried it first.

