The State of the Onion 5

Jul 25, 2001 by Simon Cozens

Larry’s State of the Onion presentation this year was, as every year, completely different from previous years. As he noted, previous talks have been about chemistry, theology, biology, and music; this year, for once, Larry actually talked about Perl.

And this year the format was rather different. Based on the success of the lightning talks at previous Perl conferences, Larry decided to adopt this for his presentation - he gave us thirty-three lightning talks of fifty-five seconds each, with his daughter Heidi ruthlessly manning the bell. So ruthless, in fact, that Larry had to encourage us to laugh quickly to avoid cutting into his talking time.

Perl 6 and Apocalypses

Larry Wall during his State of the Onion speech at TPC 5. Larry's time keeper for the lightening rounds was Heidi Alayne Wall. For more on TPC 5 and the O'Reilly Open Source Convention, visit O'Reilly Network's conference coverage page.

Photo courtesty D. Story/J. Blanchard/O'Reilly Network

Larry talked about both Perl 5 and Perl 6. He noted that many talented people were putting dedication and love into Perl 5, and Perl 5 is doing great. So much for Perl 5.

Perl 6 was obviously the major focus of Larry’s talk, with each lightning talk laying out the major points of an Apocalypse. As before, Apocalypses mirrored chapters of the Camel book, Programming Perl.

Larry hinted that he’d already recanted part of the second Apocalypse; the third Apocalypse was due to come out, but Larry got sick. On the other hand, this gave him a chance to go slowly, and to do it right.

So his lightning talks really began with apocalypse three, about operators. Larry noted that there had been a number of very specific proposals, but he wanted to concentrate on generalities. He also said that he was wavering on the idea of user-defined operators, and particularly Unicode operators. -> will become . - start accepting that now. This will mean that the concatenation operator will have to change, and this will probably become ~.

The bell tolled, and so Larry had to move onto talk four - control structures. To loud applause, he announced that Perl 6 would include a switch statement; to some bemusement, however, he let on that it would be called “given” - case statements would be called “when”. Another notable renaming: “local” will become “temp”.

Larry reiterated the need for optional type and property declarations; this isn’t the same as typing, but it’s a way of specifying metadata about a variable or subroutine. As you supply Perl with more metadata about variables, it will gain efficiency in terms of storage and manipulation. As the bell went, Larry was just explaining how 90% of code wouldn’t get that much faster but…

Regular Expressions, Formats, Packages and Modules

Larry’s been thinking a lot about direct assignment to variables from within a regular expressions, but has decided this isn’t the real problem - the real problem is that people want to build up structures of anonymous hashes or arrays from a regular expression.

There will be set operations on characters classes: for instance, you’ll be able to specify a match of all letters without the vowels. Actually, this will probably be in Perl 5 as well. There was also time to tell us that he’d like the /x modifier to be on by default.

The next talk was about subroutines. Prototypes will be extended into a complete type signature system; the sub keyword would be discarded on closures - in fact, any curly braces which are not immediately recognisable will be assumed to be a closure. Parameters to closures will be “self-identifying”, such as $^a and $^b.

Larry said that formats would be a module, so he didn’t have to say anything more about it.

The reference talk opened to the statement that pseudo-hashes must die; there was loud applause. Larry reiterated that dereferences will be assumed when a scalar is followed by braces - that is to say, what is now $foo->[$a] will be reduced to $foo[$a] in Perl 6. This will require prohibiting whitespace before hash subscripts.

In terms of data types, there will be compact arrays; pseudo-hashes will be replaced by opaque objects with named parameters. The => operator will now create a pair type; the range operators will create a slightly different type of pair, which gets expanded out to a range on demand.

There will be a distinction between classes and modules. Within a class or a module, there will be subpackages: there will be no more need to type out Myclass::SubclassA::SubclassB; just like Unix has relative pathnames and directories, Perl will have modules that can be specified as relative namespaces.

On the theme of modules, module names would be extended to include metadata on the version and the author’s name. The default use statement would allow the version and author’s name to be wildcards. There will also be virtual interface modules, and better automation of documentation based on module metadata.

Objects should be easy to declare and have accessible metadata. When you put an attribute onto the module, it will appear as a variable inside of the class; outside of the class, it can be accessed as a method. There will also be optional multimethods, and syntax like Class.bless($ref) to bless a reference into a class.

Addressing Concerns

Larry mentioned that overloading was a headache; you hate it in languages that have it, and you hate it when languages don’t have it. Sometimes there are gratuitous abuses of overloading, such as C++’s left-shift operator to add more arguments to a function. The overloading system will be a lot cleaner, by specifying operators as special method names. The vtable system will be leveraged to provide overloading on objects. There will also be overloading hooks in printf so that new formats can be defined.

Larry said that many of the proposals about tied variables missed the point; what is tied is the variable, and not the value. Tying needs to be naturally scoped to the variable’s lifetime, and tying needs to be declared in advance for efficiency.

He also begged for compassion towards programmers when dealing with Unicode; we don’t want to force people to have to deal with Unicode if they don’t want to, and equally we don’t want to leave Unicode people out in the cold. Strings need to be completely polymorphic, with internal routines able to specify what type of strings they’re able to cope with. Normalization will normally be done at the filehandle level, and the type system must remember whether or not some data has been normalized.

The IPC talk asked for “no pain” installation of new protocols; there will be easy mapping of high-level structured data, such as XML-RPC or SOAP, from the network onto Perl’s internal data structures. Perl 6 will continue to have safe signals. IPv6 will be supported.

The thread model will basically be ithreads, the new model in 5.6.0; variables may be shared by declaration, but will not be shared by default. The Pthread model will mean “share everything”. Modules ought to be thread safe by default - they should declare their thread-safety or otherwise in their metadata.

Perl will have a parser of its own, written in Perl. This will allow us to bytecompile the parser, and also have the parser modifiable. It’ll also help us port eval to Perl running on small virtual machines. Lexical analysis will remain as a one-pass process, and subroutines will be compiled immediately on parsing.

The command line interface will not dramatically change; only one RFC concerned it. Larry stressed Perl’s role as a glue language, and that it must cooperate with its environment.

Larry also had a shameful confession; he wrote the Perl debugger, but hardly ever uses it - he’s more a print-statement kinda guy. Hence, he’s happy to delegate the writing of the debugger to other people. He did, however, point out the heavy dependency a debugger has on the debugging facilities of the platform it runs on.

He’s also punting on the internals, leaving the details of that to Dan Sugalski, who’s talking tomorrow. However, he said that the internals will be much more modular; they will comprise a software CPU, and regular expressions will compile down to normal Perl opcodes. It will implement a variety of garbage collection models, and will use vtables to despatch operations on values.

What about CPAN?

CPAN got too big to download, and there’s a problem that ISPs never install enough of it for what their users want. Bundles are a partial solution to this, but of late there has been more interest in the development of an SDK for Perl.

Tainting will be implemented by the new property method, and sandboxing will be achieved by using separate interpreter threads.

Perl 6 will attempt to remove some common goofs: Perl 5 has already stopped taking up huge quantities of memory when you say foreach $i (1 .. 1_000_000_000) { ... };, but Perl 6 will also apply the same optimization to @big = 1 .. 1_000_000_000;. There’ll also be support for embedding your tests in POD.

Speaking of POD, =begin and =end will work for commenting because now =end will go back to the previous state rather than going to POD. Larry wants multiple POD streams, noting that the DATA filehandle is essentially just another POD stream.

Perl culture has been mostly self-correcting; the Perl 6 announcement drove people to work together more strongly and fix up some of the flaws of the community. However, Larry encouraged everyone to do their part; newbie friendliness is one thing, but it’s important to hold yourself to higher standards than those to which you hold yourself.

The cleanup of special variables will be an exercise in balancing cleanup with convenience. For instance $_ will obviously be staying; $(, on the other hand, will go. This allows us to free up $( ... ) for interpolating expressions. Bareword filehandles will go, and error status variables may be merged.

Built-in functions. For functions with terribly long return lists, such as stat, Perl will return modules; some subset of the proposed array operators (merge, partition, flatten, etc.) will be included in the core. The logical return values from functions such as system will be made more sensible; select will be removed.

The standard Perl library will be almost entirely removed. The point of this is to force ISPs to believe that Perl is essentially useless on its own, and hence they will need to install additional modlues. Pragmatic modules will have more freedom to warp Perl’s syntax and semantics, and will also provide real optimization hints.

Larry suggested that the standard module library could be downloaded basically on demand; there will be a few modules which support basic built-in functionality and their documentation. Perl 6 should deal with I18N and L10N issues, and also support the sort of exception handling that you would expect from languages of its type.

“There are 35 seconds left. Any questions?”

So that’s the rudiments of the design of Perl 6. Over the coming months, we’ll see the rest of the Apocalypses, backed up by Exegeses from Damian, and hopefully, eventually, code from Dan and the rest of the team. We’ll bring you our analysis of how the Perl 6 language is shaping up when we all get home from the conference!

Tags

community