Automatic Interface Generation

For text, graphical, audio, and other UIs, we must build automatic (but programmable) interface generators for structures. That is, when I define a structure, the system should automatically generate the behavior of a structured editor for it. These generators would be more than, but in some ways a generalization of, visual wrappers or presentation types for a given type. Such wrapper systems operate on a generic type and use some reflective access to specialize themselves dynamically for each sub-type instance that they encounter. The full scope of this concept would couple the wrapper concept with partial evaluation.
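As a minimal sketch of the reflective wrapper idea, the following Python fragment derives a textual "editor form" from a structure definition alone. The types Point and Circle and the function describe_editor are hypothetical names introduced for illustration; a real generator would also handle input, other output media, and partial evaluation, none of which is shown here.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class Point:
    x: int
    y: int


@dataclass
class Circle:
    center: Point
    radius: float


def describe_editor(value, indent=0):
    """Return the lines of a text form for a dataclass instance.

    The layout is derived purely by reflection on the structure:
    nested structures become indented sub-forms, leaves become fields.
    """
    pad = "  " * indent
    lines = []
    for field in fields(value):
        item = getattr(value, field.name)
        if is_dataclass(item):
            lines.append(f"{pad}{field.name}:")
            lines.extend(describe_editor(item, indent + 1))
        else:
            type_name = getattr(field.type, "__name__", str(field.type))
            lines.append(f"{pad}{field.name} [{type_name}] = {item!r}")
    return lines


form = describe_editor(Circle(Point(1, 2), 3.0))
print("\n".join(form))
```

Nothing in describe_editor mentions Point or Circle: defining a new structure is enough to obtain a form for it, which is the property the paragraph above asks for.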

Automatic interface generation is an application of the HLL Choice axiom. In this system, information is specified incrementally, with the linguistic context drawing inferences from this information as it is given, and determining identity or contradiction when it can. The user-interface analogue of this will often include interactions with defaults, whether those are manually specified or specified as some mixture of program and preference. The distinction is semantically important: information that is presumed, because a requested object was only partially specified and must still be properly represented, is different from the specification information itself. Furthermore, the user should have default access to a semantic layer where the distinction between a semantic query and an attribute access is nullified.

For instance, if you want a scrollbar, ask for a scrollbar, but don't draw the rectangle yourself; let the system do it. You don't care whether the scrollbar looks like Athena, Motif, Windows 3.x, Win95, MacOS, another widget set, or any combination thereof. Some users might care, so don't waste your time working against them.
Also, you might not even want to be so precise as to ask for a scrollbar. After all, what you want is for the (large) text to be interfaced. Blind users, as well as many other people, might want a voice-based interface that has its own "scrolling" mechanism for interfacing a large text. Why prevent your fine program from being used without a screen? Is a screen essential to it? Perhaps you do have good ideas for interfacing your program specifically to a screen; but then, Tunes allows you to specify these ideas apart from your program itself, by (partially) specifying the epsilon-filling tactics for it.

It might appear unobvious at first how to formalize interfaces in such a way that a computer could automatically reason about them, verify human usability constraints, and automatically generate interfaces that satisfy those constraints. Yet this unobviousness is largely due to people being wrongly used to programming interfaces in ad-hoc imperative ways, instead of declarative ways. With a bit of imagination, the constraints on what makes a good interface, either in the details or in the overall design, can all be formalized. And this formalization can incrementally bring the system from an originally all-imperative interface to higher and higher levels of declarativity.

For instance, here is how we can formalize, in a very general way, the fact that an interface that allows browsing through a whole document is better than an interface that can only dump the whole document sequentially:

Channel

A channel is a structure with which an object can send data to another; it need not be a stream, and the data need not be raw binary.

Display

A display function for type T on output O consists of a function f from T to O, together with a reconstruction function g from O to T, such that someone observing the output of f applied to an object of type T can reconstruct the object by means of g.

Example: the sequence of characters "123896" is an integer being displayed; you can reconstruct the meaning of the integer from this string output, knowing also what context is assumed for that channel at the moment.
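The definition above can be sketched in Python as a pair (f, g) together with a check that g recovers what f displayed. The names Display, show, reconstruct and is_faithful_on are illustrative assumptions, not part of any existing system, and the check only tests a finite sample of objects rather than proving the property for all of T.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")
O = TypeVar("O")


@dataclass(frozen=True)
class Display(Generic[T, O]):
    show: Callable[[T], O]         # the function f from T to O
    reconstruct: Callable[[O], T]  # the function g from O to T

    def is_faithful_on(self, samples):
        """True if g(f(x)) == x for every sampled object x."""
        return all(self.reconstruct(self.show(x)) == x for x in samples)


# Decimal notation: the integer can be recovered from its digits.
decimal = Display(show=str, reconstruct=int)

# Showing only the last digit loses information, so no g can work in general.
last_digit = Display(show=lambda n: str(n % 10), reconstruct=int)

print(decimal.show(123896))                # the string "123896"
print(decimal.is_faithful_on([0, 7, 123896]))
print(last_digit.is_faithful_on([3, 42]))
```

Here the "context assumed for that channel" is baked into the choice of int as g; a different context (say, hexadecimal) would pair the same output type with a different reconstruction function.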

Example: outputting a 1000-line text on a tty without scrolling is displaying the text. Because of the time needed to read things, one may consider that an acceptable text display should pause before scrolling forward, but this requirement is not logically apparent, nor is it part of the normally assumed means of presenting text; the assumption depends on and belongs with the output device itself, though it must affect upstream program logic.

Interactions

For any input I and output O connected to a same external object, the couple (I,O) is named an I/O terminal.

I could be from: {keyboard, keyboard & mouse, a few buttons, braille input, voice analyzer} and O from: {text array display, graphic display, loud-speakers, braille output, voice synthesizer}.

A reversible interaction on an I/O terminal is a function from I to O with internal state S, such that for any two states s1,s2, there exist input sequences t1 and t2 to go from state s1 to s2 and from state s2 to state s1 respectively.

(Note: an I/O-interface is not a particular O-display.)

An interface function for type T on terminal I/O is a display function from type T to reversibly interacting objects on terminal I/O.

Example: piping a 1000-line text into /usr/bin/less that allows scrolling up and down is interfacing the text; piping it to /bin/more that only scrolls forward isn't.
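The reversibility condition can be checked mechanically when the state space is finite: every state must be reachable from every other state. The sketch below assumes a deliberately simplified pager whose state is just the current page number; the names reachable, is_reversible, pager_step and LAST_PAGE are hypothetical, and the two key sets only model the behaviours described in the example above, not the real programs.

```python
from collections import deque


def reachable(start, inputs, step):
    """Set of states reachable from start by some sequence of inputs."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for key in inputs:
            nxt = step(state, key)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_reversible(states, inputs, step):
    """True if, for any s1 and s2, some input sequence leads from s1 to s2."""
    states = set(states)
    return all(reachable(s, inputs, step) >= states for s in states)


LAST_PAGE = 4  # a five-page document, pages numbered 0..4


def pager_step(page, key):
    if key == "down":
        return min(page + 1, LAST_PAGE)
    if key == "up":
        return max(page - 1, 0)
    return page


pages = range(LAST_PAGE + 1)
scroll_both_ways = is_reversible(pages, ["down", "up"], pager_step)
scroll_forward_only = is_reversible(pages, ["down"], pager_step)
print(scroll_both_ways, scroll_forward_only)
```

With both keys every page can be reached from every other page, so the interaction is reversible; with only "down", page 0 can never be reached again once left, so the forward-only pager displays the text without interfacing it.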

Example: broadcasting a song on radio is displaying the song, but not interfacing it; putting it on a tape that one can rewind is interfacing it.

You can add criteria on the above objects for the computability/acceptability/complexity/whatever of displays and interfaces.

With informational and energetic models of the above objects, the relative utility of interfaces could be evaluated. A probability distribution over the expected objects in T would make it possible to evaluate the information contained in the output. The complexity of the output-decoding function g and of the interaction-driving functions ()->s and (s1,s2)->t1, relative to the admitted cultural background, would make it possible to evaluate the ease of use of the interface. Space cost and time delay for all the human and computer functions involved could also be taken into account, and so on.
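As one possible reading of the informational part of this evaluation, the sketch below computes the Shannon entropy of the output produced by a display function, given a probability distribution over the objects of T. The names entropy_bits and output_distribution, and the four-object prior, are assumptions made for illustration; decoding complexity, interaction cost, and time or space costs are not modelled.

```python
import math
from collections import defaultdict


def entropy_bits(dist):
    """Shannon entropy, in bits, of a {value: probability} distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def output_distribution(dist, show):
    """Distribution of the displayed output when objects follow dist."""
    out = defaultdict(float)
    for obj, p in dist.items():
        out[show(obj)] += p
    return dict(out)


# Hypothetical prior: four equally likely integers.
prior = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}

# A display from which the object can be reconstructed.
full_bits = entropy_bits(output_distribution(prior, str))

# A lossy output that only reveals parity.
parity_bits = entropy_bits(
    output_distribution(prior, lambda n: "even" if n % 2 == 0 else "odd")
)

print(full_bits, parity_bits)
```

Under this prior the reconstructible display conveys the full 2 bits of the object, while the parity output conveys only 1 bit, which gives a simple numeric handle for comparing candidate outputs.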

Not that all of these should actually be systematically evaluated (this might make a good master's thesis for a research student), but this shows that computers can express a lot more than people would generally have them do. As for the large indetermination in the above utility-related concepts, of course it shouldn't be neglected. Epsilon constructs are precisely there to help handle such indetermination.

Another research topic on interface generation is using knowledge-based (relevance-prioritized, constraint-based) programming techniques for the interface generator's "AI".

Categories of Information

The IG should maintain several different loose kinds of knowledge:

The IG should first of all have a model of the information to make available, that is, the structure of the object to interface. This structure dictates what tactics may be used to communicate its contents to the user; some given constructors have known tactics, whereas other constructors will inherit generic tactics from the meta-constructors from which they were obtained, and yet other constructors will be eliminated in favor of the lower-level constructors with which they are expressed.
Then it should have a model of the information the user already has, that is, his expectations about the object, including its type and general semantic "context", cultural conventions about how to interface objects of such type in such context, and the relative relevance of various aspects of the object, so as to know which aspects to insist on in case the object has too many aspects for them all to be displayed at once, and which partial aspects to display simultaneously and relate to each other. This information can direct the IG as to what information it should display with priority, so as to actually inform the user; it also directs the level of redundancy with which information has to be displayed so that the user can safely be considered to have understood it.
Then, it should know about its means of input and output communication. Inputs include keyboard, mouse, touchpad or trackball, touch-screen, joystick, mono or stereo sound input, odometer, light detector, temperature detector, etc. Outputs include sound channels, text screens, bitmap screens, odor generators, LEDs, articulated robot arms, cooking machines, tunable thermostats, ad nauseam. Inputs and outputs are generally grouped together in ways inherited from the physical capabilities of humans and their cultural conventions (for instance, we may loosely relate sound output with voice control, text screen output with keyboard control, bitmap screen output with pointing devices, etc.); these groupings may be broken, though the human cultural tradeoffs of doing so should be taken into account.
Then it should have a model of their relative capabilities and resource usage. The capabilities of devices mean the bandwidth and precision with which they allow information of a certain "kind" to be "displayed", and their suitability for a certain "kind of use": for instance, a voice reader might be very good at sequentially reading a large text - particularly if annotated by human and/or AI with pronunciation and intonation information - but it is not as good as a text display for "random access" in this same text; color text and bitmap display may help put different emphases in the text. Resource usage includes machine usage as well as human brain resources. If enough machine resources are available, they might be used to simultaneously and redundantly interface a "same" document with several "orthogonal" devices (that use independent brain resources).
And of course, the IG should know about the user's preferences, his habits, his tastes, his mood, so as to adapt the interface.

All the available knowledge constrains the way objects are to be interfaced. This usually leaves some freedom of choice to the system, which is why a set of low-priority arbitrary disambiguating rules will be added. The system of all rules may then be self-contradictory, at which point rules will be removed in increasing priority order (lowest priority first) until a complete, consistent decision is found.
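The relaxation procedure just described can be sketched as follows, under the simplifying assumptions that a decision is a choice among a finite list of candidate interfaces and that each rule is a predicate with an integer priority. The names Rule, decide, the candidate labels and the example rules are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int                    # higher means more important
    accepts: Callable[[str], bool]   # does this candidate satisfy the rule?


def decide(candidates, rules):
    """Drop the lowest-priority rules until some candidate satisfies the rest.

    Returns the chosen candidate and the names of the rules that were dropped.
    """
    active = sorted(rules, key=lambda r: r.priority)  # lowest priority first
    dropped = []
    while True:
        viable = [c for c in candidates if all(r.accepts(c) for r in active)]
        if viable:
            return viable[0], dropped
        if not active:
            raise ValueError("no candidate interface available")
        dropped.append(active.pop(0).name)


candidates = ["speech", "text", "bitmap"]
rules = [
    Rule("user has no screen", 10, lambda c: c == "speech"),
    Rule("prefer random access", 5, lambda c: c in {"text", "bitmap"}),
    Rule("arbitrary tie-break", 1, lambda c: c == "bitmap"),
]

choice, dropped = decide(candidates, rules)
print(choice, dropped)
```

In this example the three rules contradict each other, so the tie-break and then the random-access preference are given up, leaving the speech interface demanded by the highest-priority rule. Taking the first viable candidate when several remain is itself an arbitrary choice made for this sketch.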

The knowledge has to be dynamically maintained, as the objects being interfaced may be modified, and as the very act of interfacing modifies the model the IG has of the user (which must be confirmed or refuted by the user's feedback).

This whole knowledge model may be complicated, and may use up a lot of resources; however, it is only a general model of how things may happen. In low-resource systems, many rules may be wired in and compiled in, human-related knowledge may not be maintained and may be replaced with conservative assumptions, etc. In other words, the model will be partially evaluated away as necessary to fit the device, and will only include the assumptions that matter.

Also, the complexity of this knowledge makes it unmanageable for an unaided human programmer. Most of the priorities will hence have to be dynamically deduced from a long, persistent tradition of computer-human interaction and meta-interaction (i.e. interaction with proficient users who give information about interaction).