Tunes HLL Meta-Level Architecture

The language, in order to be the adaptible medium that the Tunes project requires to be a properly unified system, has to have an open generic architecture that can encompass both the most complicated formal systems and the least complicated simple evaluation and encoding schemes.

Overview

The intent is to generalize the canonical Lisp read-eval-print loop and apply Tunes principles as uniformly across the resulting stages as possible. Here are the proposed new stages:

Parse
Expand
Search
Dispatch
Rewrite
Response (optional)

The concept to keep in mind about these phases is that they are ignorant of what is compiled and what occurs at run-time. That is, it works in the sense that it is subjectable to Tunes partial evaluation in various ways. So this is an abstract chain of phases.

Furthermore, these phases can be somewhat overlapping in the case of movement of binding of various parts of a whole language. Also a single language environment built with from this system may define several grammars or dispatch mechanims, and in each case, these may be defined to have static or dynamic binding independently. So this architecture is not really one that encompasses everything at once, but it defines re-usable components for a language to combine per-feature, and as a result these phases may overlap between compilation time and evaluation time (but not per expression). Finally, Tunes promises that partial evaluation will be applicable wherever possible, and that this can further re-define the staging within the allowances of the particular resulting language's semantics.

Comparison with REPL

Overall, the major shift is that the phases have been divided into further parts. Also important is that reading and evaluating are divided by a run-time boundary instead of Lisp's boundary where the output of the read phase enters evaluation which is then divided between compile- and run-times. So the major phases are now "input", "run-time", and "output" where input and output are interfaces to other contexts, such as text entry or some encoding for transport.

Parse

Reading is generalized into a phase which consists of two parts: reading/parsing and expansion. The general principle of this phase is that it concerns the extrapolation of the input encoding into run-time feature calls, so in essence the actions within take a certain meta-level to the code itself. The details follow:

The Read operation itself turns into Parse. Within Lisp, Read takes in a stream of character objects and dispatches on them according to a flat table. Other tables may be substituted, particularly by the reader-macros indexed by this table, but this is not usually done, and a general architecture for it is not commonly known or looked into. The Tunes substitution for this idea employs the generalized concept of Dispatch noted below (but applied to parse-level inputs) to give this a more expandable architecture. The overall intent here is to provide a framework at least as general as that for syntax in Maude, which means as much as context-free grammar parsing, with control over precedence and fixity of specifications. In summary, applying dispatch together with Maude's (meta-)parsing library architecture is what makes Parse so unique and powerful.
Expansion encompasses all code-transformation techniques that are transparent to the semantics of the program specification itself. It begins as a generalization of Lisp's macro-expansion phase, but is intended to provide a more general system. For example, it should provide for syntax-tree annotation features that should collaborate with the implementation to encode information that is otherwise lost in the expansion. This would allow verification performed at the result of one expansion site to be applied where possible to other such sites without having to fully replicate the verification.

Eval

Evaluation is separated and generalized. The separation leads to three phases: Search, Dispatch, and Rewrite.

Search encompasses the control of evaluation order all the way up to strategies for logical search. Essentially the viewpoint is that terms are computed lazily according to a strategy, and that this strategy has to satisfy the logical requirements of the dispatching order specification. However, within that satisfaction, there can be variance, which this segment is a hook for.
Dispatch encompasses a wide variety of techniques that all amount to taking each application expression, consisting of a function name and its arguments, and determining according to the environment and its function-definition semantics, and identifying the right function-definition or combinations thereof to apply. The more general cases of pattern-matching and unification amount to the use of this concept with side-effects within that phase, primarily in the identification of term parts.
Rewrite consists of the replacement of parts of the environment with the effects (the results) of the function application, including side-effects and the modularity aspects of the environment. Each of these may be fully-general or very specific and straight-forward.

Print

Printing is conceptually the same as in Lisp, except that there is an intent again to generalize beyond an interactive listener (see REPL in the normal sense. We say that this phase is optional now, since it encompasses part of the Migration subproject's functions of determining how to communicate object identity; we term this transitioned form of printing response. Of course, by invoking the Migration concept, we are suggesting that listening might happen in a distributed manner, and also that the semantic context might have a translation bridge to gap. Linda seems a sufficient example to illustrate this.

Polymorphism and Specialization

At each level, there are varying degrees of expressiveness, partially-ordered. That means that some means of expression can quite easily express others, but there's no strictly upward direction for this relation. In general, this relationship should be captured explicitly within the system, so that a partial-evaluator may use contextual information to determine how to structure the phases to adapt them to each task.

Parsing

This is really intended to cover the spectrum of complexities of grammars and their application. The Chomsky hierarchy defines 4 major categories of these grammars, and defines what kind of program is required to comprehend them. Maude's meta-parser handles up to and including Context-Free Grammars, and of course a full programming language is required in general, with no real system for predicting which language from the language examples themselves.

Hundreds of language types have been enumerated, so the main task here is merely to catalog and provide the major convenient types of syntax parsers and the information that relates their specializations.

Ideally, a system for embedding expressions from one language to a less-expressive language should be available, though each method per pair may have to be made manually. It is nevertheless useful to have them fit into a single functionality abstraction.

Expanding

Code expansion expressiveness depends on the level of information provided by the previous Parse phase, as detailed by these following methods of expansion or semantics-preserving transformation:

Syntactic Macros: These consist of code-transformations which work at a level of agnosticism with respect to the types of the parts of the expression not associated with the bare shape of the expression. The prime example of this concept is the Lisp macro system, operating on (quasi-)quoted terms as lists. This is no coincidence, as Lisp's syntax is designed to show all expressions uniformly, which in combination with late-bound semantics, leaves the expansion system with only shape information about the term.
Semantic Macros: This encompasses expansion when the Parse phase provides distinct information about the types of the arguments. This kind of system has a different philosophy about the purpose of its syntax than Lisp. In so doing, it provides the expansion system with more information to work with, but at the same time, these systems are even less widely-known than Lisp's.

In some sense, it seems that this phase could successfully contain non-trivial Dispatch, depending on the topology of the type relations of the Parse phase's output. Also, Rewrite is obviously being employed here, in most systems in a non-side-effecting sense (macros replace themselves usually). Whether more general Rewrite would be beneficial and safe if allowed has not been explored.

Searching

The amount of evaluation that can be done on the input argument terms (the extent of eager evaluation) is limited by the particular dispatch semantics chosen for the context. Search parameters only really become non-trivial when the dispatch/rewrite combination is lazy in some way; if some information is not relevant immediately. Otherwise, an evaluation order is implied or a lazy evaluation policy is prescribed.

The following outline the aspects of this phase which are relevant and separate concepts:

Null-search: No argument terms are checked (evaluated) at all before dispatching. This can only precede a trivial "null-dispatch" semantics, since no evaluation means no type information.
Partial Type Information: In keeping with the Tunes technical embrace of the philosophy that computational objects' types are most generally just predicates of various kinds, there is also room among pattern-matching semantics to infer partial information about type before committing to full evaluation of the object. For example, determining that an object must be a list before determining its attributes qualifies to match against a list lazily, and it does not require special support for lists as types. Keep in mind of course that attributes contribute to the predicates that are true of the object, and so constitute some type information about it in the Tunes sense.
Direction: Arguments are evaluated according to a directed heuristic. Examples in common use follow the simple argument conglomeration types which are linearly-ordered collections (to match the source code which has linear form), so evaluating forwards and backwards are the commonly though-off possibilities.
Structural Recursion Ordering: When pattern-matching is the context's semantics, some evaluation of attributes is required in order to match structure, and this can affect the semantics and performance when the function-checking order is not specified. So for this purpose, this extra level of specification of strategy helps determine this as well as preventing unnecessary evaluation in cases where matching can occur based on incomplete evaluation. Two common examples of this type of strategy are depth-first and breadth-first searching for both tree and graph topologies, which could be further specialized in certain cases.

The freedom specified for the search phase within an environment or context limits the amount of polymorphism and overloading that is logically allowable for the same context; conversely, the amount of freedom desired in overloading limits the range of search allowed, since the search results must conform to the dispatcher's requirements for an appropriate answer. (Editor's Note: Provide some specific examples about that.)

Dispatching

As described in the Glossary entry for dispatch, there are several algorithms and environment semantics associated with multiple bindings of a function name, and their proper choice. There is a general partial ordering among them, and in more realistic cases, a multiplicity of combinations of these semantics, so that the general case for specialization will be quite complex to automate.

Nevertheless, we will outline here some common reductions in expressiveness among those major methods:

Normalizing or normalizing dispatch reduces to pattern-matching dispatch when the pattern rules only use templates on one side of the equation, and only bare structure, value, and type are taken into account.
Pattern-matching dispatch reduces to multimethod dispatch with prototypes if the structure matched is a delegation structure.
Pattern-matching dispatch reduces to multimethod dispatch on types or classes if the structure matched is a delegation structure and value is not matched.
Multimethod dispatch reduces to single dispatch when only the first argument's type information is employed.
Prototype-specific dispatch reduces to class or type-based dispatch if all of the matching is performed on objects whose behavior is shared (or at least reified).
Multiple-inheritance dispatch reduces to single-inheritance dispatch when inheritance is from just one item or when the protocol is only used through one chain of inheritance.
Method-combining dispatch reduces to single-choice dispatch when no combinations apply.
Optional-argument signature matching reduces to single-signature dispatch when no optional arguments are employed.
Single-inheritance dispatch reduces to lexically-overridden definitions when no resends or super/inner sends occur.
Lexical-override dispatch reduces to global dispatch (null-dispatch) when there are no lexical overrides.

Each of these has their own research areas in terms of algorithms, data structures and storage, and their efficiency. The use of each is commonly capable of expressing some class of dispatch, and the choice is determined by what kind is possible or occurs in practice. This means that the optimization can be chosen because a domain isn't allowed to exceed some semantics or because it has been observed and verified not to do same.

To summarize, these methods are distinguished from each other not by implementation, but by the nature of how names are organized. In a certain sense, this is about the shape of the namespaces' organization.

Rewriting

The object-attribute relation mechanism is the fundamental concept underlying this phase. However, the core semantic primitive of abstraction of terms (lambda abstraction) has to fit into this category, and the category of rewrite is separate from dispatch. So, the semantics of an abstraction must include the search, dispatch, and rewrite semantics in order to be migrated successfully, but the basic specification means will be identical within the environments' differing dispatch semantics.

A rewrite rule, then, is a pair of quoted terms in the most general when its associated dispatch is unification. The nature of the quotation will be considerably significant to this definition. The quoted terms are essentially objects describing the "Situation" of objects required in order to perform a successful match.

As an example of the range of such terms' nature, for each type of dispatch, we will associate the structure shape needed. In each example, a common base of a conglomeration of objects to pass as a single input is implicit. We do not specify the shape of the passed configuration: Lisp's argument lists or other languages' tuples for argument passing are too specific a structure. Positional or keyword passing may be specified separately. The core point here is that the arguments and inputs are matchable in some way. With that in mind, here are the examples, in increasing orders of complexity:

For anything less expressive than pattern-matching dispatch, the conglomeration plus annotations that cover traditional type information is all that is necessary for the input term. For the output term, another conglomeration is specified with possible type annotations, but more importantly, there will be annotations covering the computational links between the output terms and the input terms, which amounts to abstract source code. The annotation system between the input and output system are considered the essential semantics of the rewrite term, and are considered the explicit abstraction. The implicit part which combines to form the whole term is how the dispatch system of the context in which the rewrite term was defined complete the semantics. Those are essential to consider when migrating the rule to another context with less-expressive dispatch.
For pattern-matching level of dispatch, some of the annotations will inter-relate the objects involved, and must be matched in the input conglomeration.
For unification-based dispatch, both input and output conglomerations may have the same kind of annotations that pattern-matching needs for its input.

In any case, rewrite reduces to the simplest idea of the object-attribute annotations discussed in the HLL General Concepts outline, since the conglomeration reduces to a single object. In that case, the selector for the attribute is the function's name (a key to a binding, in a sense).

Responding

Since our means of specifying rewrites in its most general form uses configurations as outputs, with annotations, there is no longer a trivial answer to the question of how to respond to the listener with a representation of the answer that can be read/parsed back in. For this reason, the migration concept is employed.

The simplest case is common, where the director of the interactive session is in the same scope and interacts with the same environment, so there is no difference in semantics to adapt. However, even in this case, there is the question of what to represent and what to leave implicit, so migration still applies (even if trivial null-migration is applied).

Examples

See the HLL Examples area.

Conclusion

This architecture seems to represent a regular organization scheme reflected in a significant spectrum of the programming languages and systems that are considered useful to Tunes in its sense of the word. This architecture answers the question of whether this usefulness can be reflected in practice, while also allowing the HLL design to be a flexible framework instead of a "One True Language".