A language for Moon

I've been writing in C for some years and have decided that it would be a good idea to switch to a better language now.

The language itself will be purely functional (no side effects) and will probably have lazy evaluation.

There will be some mechanism for multiple return values. Maybe I will solve this by making construction and splitting of records very easy, and by allowing only one parameter and one result per function.
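
A rough sketch of the record idea, written in Python because Moon has no syntax yet. The names DivArgs, DivResult and divide are invented for illustration. The function takes exactly one record and returns exactly one record; multiple values only appear where records are constructed or split.

```python
from typing import NamedTuple


class DivArgs(NamedTuple):
    """The single parameter: a record bundling both inputs."""
    dividend: int
    divisor: int


class DivResult(NamedTuple):
    """The single result: a record bundling both outputs."""
    quotient: int
    remainder: int


def divide(args: DivArgs) -> DivResult:
    # Split the parameter record into its fields.
    dividend, divisor = args
    # Construct the result record from two values.
    return DivResult(dividend // divisor, dividend % divisor)


# Construction at the call site, splitting at the use site.
quotient, remainder = divide(DivArgs(17, 5))
```

If construction and splitting are this cheap to write, a separate multiple-return mechanism may not be needed at all.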

Syntax

At first glance, syntax doesn't seem to be a big problem as long as it can be parsed with no more than moderate effort. But program sources are parsed not only by the compiler but also by tools that analyze programs, for example to create cross-reference listings or a tags file that tells an editor the source lines of function definitions. Even searching for all occurrences of a variable name in order to rename it is quite a common task.

Therefore I'll try not to store program source as a sequence of ASCII characters, but instead always use the program's syntax tree. The actual encoding will use my data format specification ideas.
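
To illustrate why a tree is easier for tools than text, here is a minimal sketch in Python. The node types Var, Num and Call are assumptions made up for this example and say nothing about the real encoding. Renaming the variable x works on Var nodes only, so the function name max is left alone, whereas a plain text search for "x" would also hit it.

```python
from dataclasses import dataclass
from typing import Iterator, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Call:
    function: str
    args: tuple  # tuple of Expr nodes


Expr = Union[Var, Num, Call]


def occurrences(expr: Expr, name: str) -> Iterator[Var]:
    """Yield every use of the variable `name` in the tree."""
    if isinstance(expr, Var):
        if expr.name == name:
            yield expr
    elif isinstance(expr, Call):
        for arg in expr.args:
            yield from occurrences(arg, name)


def rename(expr: Expr, old: str, new: str) -> Expr:
    """Return a new tree with the variable `old` renamed to `new`."""
    if isinstance(expr, Var):
        return Var(new) if expr.name == old else expr
    if isinstance(expr, Call):
        return Call(expr.function, tuple(rename(a, old, new) for a in expr.args))
    return expr  # numbers contain no names


# The expression max(x, mul(x, 2)) as a tree.
tree = Call("max", (Var("x"), Call("mul", (Var("x"), Num(2)))))
renamed = rename(tree, "x", "total")
```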

Of course Moon will need tools that deal with structured data, just as Unix has tools that deal with ASCII files.

Modules and interfaces

As the source will be treated as a syntax tree anyway, there's no point in using files to define modules. Instead the whole program should be treated as a directed graph, each use of a name pointing to its definition.

Interfaces then contain dummy names that point to the actual definitions and are pointed at by the actual locations of use.

Modules are the components of the graph that remain when the dummy names in interfaces are cut away.
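
A small sketch of this definition in Python. It assumes the program graph is given as a set of (user, definition) pairs and that the dummy names are known; all names in the example are invented. Links that touch a dummy name are ignored, and the connected components of what is left are the modules.

```python
def modules(links, interface_names):
    """Group names into modules.

    `links` is a set of (user, definition) pairs.
    `interface_names` is the set of dummy names to cut away.
    """
    # Undirected adjacency over the links that do not touch a dummy name.
    neighbours = {}
    for user, definition in links:
        for node in (user, definition):
            if node not in interface_names:
                neighbours.setdefault(node, set())
        if user in interface_names or definition in interface_names:
            continue
        neighbours[user].add(definition)
        neighbours[definition].add(user)

    # Collect connected components with a simple depth-first search.
    seen = set()
    result = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        result.append(frozenset(component))
    return result


links = {
    ("main", "parse"),
    ("parse", "next_token"),
    ("main", "IEmit"),   # a use points at the dummy name
    ("IEmit", "emit"),   # the dummy name points at the actual definition
    ("emit", "write_byte"),
}
found = modules(links, {"IEmit"})
```

With IEmit treated as a dummy name the graph falls into two modules. Passing an empty set of dummy names instead yields a single module, since IEmit then counts as an ordinary name with direct links.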

Thus, a module can be split by breaking up links and putting interface dummy names in between. When no more direct links exist between different parts of the old module, those parts will be the new modules.

Conversely, modules can be combined by dropping the dummy names that connect them and making the links direct.

Note that a link that is internal to a module might still use an interface. This is necessary for gradually introducing and removing interfaces. A tool might be written to look for such unnecessary use of interfaces.
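
Such a tool could look roughly like the following Python sketch. It assumes that each dummy name points to exactly one definition and that a mapping module_of from names to modules has already been computed; both the mapping and the example names are hypothetical.

```python
def internal_interface_uses(links, interface_names, module_of):
    """Return (user, dummy name) pairs where the user and the
    definition behind the dummy name are in the same module."""
    # dummy name -> the actual definition it points to
    target = {
        user: definition
        for user, definition in links
        if user in interface_names
    }
    flagged = []
    for user, definition in sorted(links):
        if user in interface_names:
            continue  # this is the dummy's own link, not a use
        if definition in target:
            if module_of[user] == module_of[target[definition]]:
                flagged.append((user, definition))
    return flagged


links = {
    ("main", "parse"),    # direct link inside the module
    ("main", "IParse"),   # the same target reached through an interface
    ("IParse", "parse"),
    ("main", "IEmit"),    # a real cross-module use
    ("IEmit", "emit"),
}
module_of = {"main": "front", "parse": "front", "emit": "back"}
unnecessary = internal_interface_uses(links, {"IParse", "IEmit"}, module_of)
```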

Further, new versions of names can be created in order to change an interface and migrate modules towards the new interface.

Data types

One aspect that has often made things unnecessarily complicated for me in C is the type system, specifically the integer types. My language will support arbitrarily large integers and integer subtypes with ranges specified by the programmer. The latter will be optimized to machine words by the compiler if the range is small enough.
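
How a compiler might pick a machine word for a range can be sketched as follows in Python. The function name and the list of available widths are assumptions for illustration; a real compiler would know its target's word sizes. Ranges that fit no word fall back to an arbitrarily large integer, signalled here by None.

```python
def machine_word_bits(low, high, widths=(8, 16, 32, 64)):
    """Return (bits, signed) for the smallest word that can hold every
    value in low..high, or None if the range needs a big integer."""
    if low > high:
        raise ValueError("empty range")
    for bits in widths:
        # Prefer an unsigned word when the range has no negative values.
        if low >= 0 and high < 2 ** bits:
            return bits, False
        # Otherwise try a two's complement word of the same width.
        if low >= -(2 ** (bits - 1)) and high < 2 ** (bits - 1):
            return bits, True
    return None
```

For example, the range 0..255 fits an unsigned 8-bit word, while -1..255 needs a signed 16-bit word because neither 8-bit form covers both ends.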

A high-quality compiler could also put bigger data structures into registers, such as unions of enumeration types.

The type system will probably be connected to my data format specification project.



Page created: Jul 19, 1997 - last update: Nov 26, 2002 - version 2.2
Jörg Czeranski (Impressum)