Slangs Today

Language tweaks and DSLs in Perl 6

Change in language is inevitable, and often good. If you spend a lot of time analyzing literature, for example, the terms you use change to better represent your ideas. Similarly, if you have to spend a long time carefully doing finances, mathematical language makes the details of calculations clearer than if they were written in English. Perl 6 takes a similar approach: it lets you code in a way that makes more sense for what you’re doing, and then switch back to vanilla Perl 6. These switches, referred to as “slangs”, can be either into a modified version of Perl 6 or into a different language altogether. While the general interface for making and interacting with slangs is still being developed, slangs can already be created today by those adventurous enough.

v5

2013 saw the creation of the first Perl 6 slang: v5, an ambitious project by Tobias Leich, a.k.a. FROGGS, that has been the major catalyst for slang development. It implements switching from Perl version 6 to version 5 and back, as Synopsis 1 has long called for. It works something like this:

use v6;
# ...some Perl 6 code…
sub concat1 (Str $s1, Str $s2) { $s1 ~ $s2 }
{
    use v5;
    # ...some Perl 5 code…
    sub concat2 {
        my ($s1, $s2) = @_;
        $s1 . $s2;
    }
    {
        use v6;
        # ...more Perl 6 code…
        say concat1 "Hel", "lo ";
        say concat2 "Wor", "ld!";
    }
}

Originally, v5 had to be implemented in NQP, the low-level subset of Perl 6 that Rakudo’s parser is implemented in. Nowadays, slangs can be written in Perl 6, as v5 is now, although there are some difficulties that must be overcome when crossing between the NQP/Perl 6 object spaces.

Like the Perl 6 parser itself, slangs are implemented using the principles of Perl 6 grammars. See /language/grammars for more details.

&EXPORT and the language braid

The first thing we’re going to do, however, is set up a slang. It’s much easier to figure out a slang when it can be played with and tested while it’s being written, rather than attempting to write it all at once. The first component of writing a slang is creating the &EXPORT function. Currently, this is the only point where a slang can hook itself into the main parser.

The &EXPORT function is important because when a library is imported, with use My::Module;, any terms that might change the state of the parser have to be imported before the parser tries to parse any more of the original file. Because of this, if My::Module defines it, &EXPORT is called from within Perl6::Grammar to specify any additional objects for My::Module to export. Being called from the grammar gives us the ability to also manipulate special variables inside the parser, including the ones that control what’s known as "the language braid," the set of mini-languages that make up the Perl 6 parser.

sub EXPORT {
    # The alias in %*LANG of the current language,
    # in this case $*MAIN eq “MAIN”.
    say $*MAIN;

    # Accessing the current language within the language braid
    say %*LANG{$*MAIN}.name;

    # Accessing the Perl 6 actions class
    say %*LANG<MAIN-actions>.name;

    # Return an empty hash, to specify that
    # we’re not exporting anything extra
    {}
}

To get the &EXPORT subroutine to be run, put this in a file called, e.g., MySlang.pm, and execute perl6 -I. -e 'use MySlang;' on the command line.

From here, there are two paths. One path creates a whole new language, like v5, and the other just augments the Perl 6 parser or another slang. If you are traveling the former path, your language is most likely going to have a grammar, to parse things, and an actions class, to create the meaning for what was parsed. If your language’s grammar has been defined as grammar MyLang::Grammar and the corresponding action class as class MyLang::Actions, these can be hooked into the parser with an &EXPORT sub like so:

grammar MyLang::Grammar { ... }
class   MyLang::Actions { ... }
sub EXPORT {
    # Let the parser know it’s a slang
    %*LANG<MyLang>         := MyLang::Grammar;
    %*LANG<MyLang-actions> := MyLang::Actions;
    # Tell the parser to switch to “MyLang”
    $*MAIN := "MyLang";
    {}
}

Now, if this is saved in MyLang.pm, you could start writing in your language by just writing use MyLang; ... . Note the use of := instead of =. Because $*MAIN and %*LANG are from NQP land, binding has to be used instead of assignment.

On the other hand, if you want to augment Perl 6 (say, to add your own quote operators or some special macros), you would use roles that are mixed into the original grammar and/or actions:

# (helper function)
sub slangify ($role, :$into = 'MAIN') {
    nqp::bindkey(%*LANG, $into, %*LANG{$into}.HOW.mixin(%*LANG{$into}, $role));
}
role MyLang::FancyQuotes       { ... }
role MyLang::FancyQuoteActions { ... }
sub EXPORT {
    slangify MyLang::FancyQuotes;
    slangify MyLang::FancyQuoteActions, :into<MAIN-actions>;
    {}
}

Slang Grammars

From this point, it’s a matter of writing the grammars and actions, keeping in mind how easy it is to add to a grammar category: write the category name followed by :sym<…>. For example, to add a with keyword to the same category that parses while, for, if, etc., you can add a rule statement_control:sym<with> to the role you're going to mix in to Perl6::Grammar. In fact, the best way to figure out how to do things when making a slang is to search for similar situations in the official grammar, or in the grammar actually used by Rakudo itself. The advantage of the latter is that it is closer to what will actually have to be written for a slang.
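The category mechanism can be tried out on a small scale without touching the real parser. The following is only a sketch under stated assumptions: MiniLang, ExtendedLang and WithKeyword are hypothetical names made up for illustration, and the grammar is a heavily simplified stand-in for the statement_control category, not a copy of anything in Perl6::Grammar.

```perl6
# A tiny stand-in for a grammar with a statement_control category
# (hypothetical; the real Perl6::Grammar is far larger).
grammar MiniLang {
    token TOP { <statement_control> }
    proto token statement_control {*}
    token statement_control:sym<if>    { <sym> }
    token statement_control:sym<while> { <sym> }
}

# The role only names the category and gives the new candidate a :sym.
role WithKeyword {
    token statement_control:sym<with> { <sym> }
}

# Composing the role into a subclass stands in for mixing it
# into the real grammar from &EXPORT.
grammar ExtendedLang is MiniLang does WithKeyword { }

say so MiniLang.parse('with');
say so ExtendedLang.parse('with');
```

Only the extended grammar should accept with, since the new candidate joins the existing category purely by virtue of its name.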

For example, to add fancy quotes to the language, you could write:

role MyLang::FancyQuotes {
    token quote:sym<“ ”> {
        \“ ~ \” <nibble=-[”]>*
    }
}

Another possibility is adding to one of the sub-languages. For example, it's possible to add to the regex language.[1]

my role Callsame::Grammar {
    token assertion:sym<*> {
        <sym>
    }
}

This parses "*" in the category assertion, which means it'll be parsed when writing a regex like / foo <*> bar /.

Slang Actions

Just parsing, however, isn't going to do much that's very useful. The next step is writing an actions class.

First, a distinction has to be made between parse- or compile-time, and run-time.

A file that uses your slang is parsed up to that use statement. Then, the file that contains your slang is parsed normally, and then executed. Perl 6 then returns to the original file and parses the rest of it with the parser as you have modified it. Then, and only then, is the run-time executed, making variables from the first program available, performing computations, creating objects, calling functions, and so on. You have no direct access to the run-time of the calling program.
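The same split between the two phases can be observed in any Perl 6 program, with no slang involved. As a small illustration (not part of the slang machinery itself), a BEGIN phaser runs while the file is still being compiled, before any ordinary statement executes:

```perl6
say "2. run time: the whole file has been parsed and is now executing";
BEGIN say "1. compile time: the parser has only just read this line";

# A BEGIN block can hand a value computed at compile time to the run time.
my $compiled-at = BEGIN { now };
my $started-at  = now;
say "compiled before running: ", $compiled-at <= $started-at;
```

The line marked 1 is printed first even though it appears second in the file. A slang's &EXPORT sub runs in that earlier phase of the calling program.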

How, then, can you write an action class to do anything even remotely interesting with the program?

What actions classes do is create an AST, or abstract syntax tree. This is where the flat, one-dimensional program is turned into a structure that can be compiled for a backend such as the JVM or MoarVM. Rakudo uses a specific set of classes, called QAST::Nodes, to create these trees. So before even writing an actions class, import them by adding this to the top of your slang file:

use QAST:from<NQP>;

Documentation exists in the NQP repository.

Actions classes have methods with the same names as the rules of the grammar they accompany. An action method is invoked whenever the grammar rule of the same name matches successfully, and is passed the resulting Match object. If the method attaches a value with .make, that value is what will be returned later when .ast or .made is called on that Match object. To continue from the last example:

my role Callsame::Actions {
    method assertion:sym<*> (Mu $/) {
        $/.make: ...
    }
}

Keep in mind that $/, which contains the Match object, is still from NQP land, so special care must be taken with it, and it must be marked as being of type Mu, since NQP classes exist outside of the normal Perl 6 type hierarchy. In actuality, it's not going to be of type Match, but of type NQPMatch.
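The interplay of make and .made is easier to see in an ordinary grammar, away from the parser internals. The following is a minimal sketch; Sum and SumActions are hypothetical names invented for this illustration and have nothing to do with Rakudo's own grammar.

```perl6
grammar Sum {
    token TOP { <num> '+' <num> }
    token num { \d+ }
}

class SumActions {
    # Attach the numeric value of the matched digits to the match.
    method num($/) { make +$/ }
    # Combine the values that were attached to the two sub-matches.
    method TOP($/) { make $<num>[0].made + $<num>[1].made }
}

my $match = Sum.parse('2+3', :actions(SumActions));
say $match.made;    # 5
```

Here $/ is a regular Perl 6 Match, so no Mu annotation is needed. In a slang, the value passed to make would be a QAST node rather than a number.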

Cargo-culting an Actions Class

Actually figuring out how to balance all these different factors and work with the guts of Rakudo is generally the most involved part of writing a slang. Because of this, if you want to write an actions class, your best bet is to find similar things in the Rakudo and NQP sources, and copy what you can.

For the previous example, we still have to figure out what the AST is going to look like. In this case, we're going to define <*> in a regex as calling the same regex or method, but one level up in the hierarchy. So if you override a regex with a mixin, the original regex can still be matched against by the new one.

Looking at the code for other methods of the assertion category, it looks like to interpolate correctly we need to make a QAST::Regex object that contains a QAST::NodeList, and has the attributes :rxtype<subrule>, :subtype<method>, :node($/).[2]

my role Callsame::Actions {
    method assertion:sym<*> (Mu $/) {
        $/.make: QAST::Regex.new: QAST::NodeList.new(
            ...
        ), :rxtype<subrule>, :subtype<method>, :node($/);
    }
}

There are other methods of the assertion category that take a value from somewhere else and interpolate it into the regex, like assertion:sym<{ }>. The basic format of this can be cargo-culted too:

my role Callsame::Actions {
    method assertion:sym<*> (Mu $/) {
        $/.make: QAST::Regex.new: QAST::NodeList.new(
            QAST::SVal.new( :value('INTERPOLATE') ),
            ...
            QAST::IVal.new( :value(%*RX<i> ?? 1 !! 0) ),
            QAST::IVal.new( :value($*SEQ ?? 1 !! 0) ),
        ), :rxtype<subrule>, :subtype<method>, :node($/);
    }
}

We want to interpolate the same thing that callsame executes. While a higher-level interface would be nice, it looks like Rakudo's core setting defines it using something called nqp::p6finddispatcher('callsame').

Based on the QAST docs it looks like we can call the low-level nqp::some_fn functions using QAST::Op.new( :op<some_fn> ), and create string values with QAST::SVal.new( :value<some_string> ), so we can create a QAST structure representing nqp::p6finddispatcher('callsame') like this:

QAST::Op.new( :op<p6finddispatcher>, QAST::SVal.new( :value<callsame> ))

After we call p6finddispatcher, we need to figure out how to get the candidate before the current one. Investigative work leads to Metamodel::Dispatchers, which indicates that we need to do something like

nqp::p6finddispatcher('callsame').candidates[1]

Except that since @!candidates is from NQP, we can't just call [1] on it; according to the nqp::ops documentation we have to use nqp::atpos(..., 1).

Assembling these things, we get:

QAST::Op.new( :op<atpos>,
    QAST::Op.new( :op<callmethod>, :name<candidates>,
        QAST::Op.new( :op<p6finddispatcher>, QAST::SVal.new( :value<callsame> ))
    ),
    QAST::IVal.new( :value(1) )
)

Putting it all together:

my role Callsame::Actions {
    method assertion:sym<*> (Mu $/) {
        $/.make: QAST::Regex.new: QAST::NodeList.new(
            QAST::SVal.new( :value<INTERPOLATE> ),
            QAST::Op.new( :op<atpos>,
                QAST::Op.new( :op<callmethod>, :name<candidates>,
                    QAST::Op.new( :op<p6finddispatcher>, QAST::SVal.new( :value<callsame> ))
                ),
                QAST::IVal.new( :value(1) )
            ),
            QAST::IVal.new( :value(%*RX<i> ?? 1 !! 0) ),
            QAST::IVal.new( :value($*SEQ ?? 1 !! 0) ),
        ), :rxtype<subrule>, :subtype<method>, :node($/);
    }
}