Designing and Implementing a Domain-Specific Language

“Like everything metaphysical the harmony between thought and reality is to be found in the grammar of the language.”—Ludwig Wittgenstein

In Star Trek V: The Final Frontier, Scotty reminds a junior engineer to use “the right tool, for the right job”. This sage advice is applicable to computer users as well as starship engineers. GNU/Linux users have a particularly impressive assortment of tools at their disposal, many of which feature unique syntaxes that facilitate concise expression of complex operations.

Good tools will reflect the specific needs of the individual problems they are designed to solve. Consider the highly effective text processing utilities awk and sed. With simple commands, users can perform efficient search and replace operations on streams or filter complex data. How much C code would it take to do the same things? Even with a concise general-purpose language like Python, the tasks still will require more typing than the equivalent command-line tools do. Utilities like awk and sed are effective because they interface well with other command-line utilities, and they leverage the power of domain-specific languages (DSLs), syntaxes specialized for a particular group of related tasks.

Despite the vast number of powerful applications developed for GNU/Linux, the right tool isn't always available for any given job. What should a resourceful user do when options are limited? In most cases, it is possible to combine existing tools, possibly creating a new tool in the process. Sometimes, a new tool needs to be made from scratch with a general-purpose programming language. Developers can add value to a new tool and increase its productive potential by implementing a domain-specific language for it.

Development time is an investment, and many programmers endeavor to maximize the return on that investment by writing reusable code libraries. Tools developed with a specialized code library generally expose only a limited subset of the library's features. Developers can provide more extensive access to library functionality by constructing a domain-specific language that can act as an interface. A well-built DSL allows users to employ an intuitive and self-documenting syntax to construct a multitude of highly specialized tools rapidly.

Pitfalls

Implementation of a DSL can be tricky business. Code that parses and validates specialized syntax is difficult to write and maintain, especially if the DSL supports sophisticated control structures. Tools written with DSLs are notoriously difficult to debug, and there will be no IDE available for your new language unless you make one.

One of the most compelling arguments against corporate use of DSLs is the so-called tower of Babel affect. When a number of developers all construct their own individual DSLs, the sheer number of disparate syntaxes can create a tremendous amount of confusion.

When developers perpetually increase the scope of a DSL's target domain, they risk under-specialization. When the target domain grows to unmanageable proportion, the DSL will transmogrify into a personal Perl implementation, and it will cease to fulfill the needs adequately of the individual tasks associated with the actual domain.

Implementation

Meta-programming is the art of writing code that generates or manipulates code. It is the basis for language implementation, and there many ways to do it. Meta-programming is either static or dynamic, depending on the type system of the implementation language. Static meta-programming typically is done with a preprocessor, and dynamic meta-programming typically is done with macros that are evaluated at runtime.

A number of excellent open-source language development platforms are available for GNU/Linux. One of the most impressive static meta-programming utilities is Camlp4, an extensible preprocessor for Inria's Ocaml programming language. Camlp4 facilitates rapid development of efficient, type-safe DSLs. Of the available dynamic meta-programming platforms, the best is Logix, an extremely versatile language design system implemented for and with Python.

Looking at Logix

LiveLogix is a consulting and development firm with big plans and innovative ideas. Logix, available under the GPL, is their first major release and the vanguard of their LiveLogix Application Platform, an assortment of versatile and dynamic development tools currently in the early stages of development. Inspired by the dynamicism of Python, the syntactic grace of Haskell and the mutability of Lisp, Logix is a unique fusion of features and flexibility.

Logix developers do not build complete formal grammars, they incrementally define the individual operators that make up a language. It is then possible to combine these operators to form expressions, which the Logix processor can parse and convert into Python byte-code. Logix DSLs optionally can leverage powerful Python language features like control structures, object orientation and list processing. Seamless Python integration and access to the tremendous number of useful libraries and modules available to Python further increase the power and value of Logix.

Logix developers build their programs with either the standard or base Logix dialects. The syntax of the base dialect is like normal Python syntax with a few additional features for language extension. The standard dialect includes a wide variety of excellent syntactic enhancements and unique features.

Experienced Python developers quickly adapt to standard Logix idioms. The documentation contains an excellent introduction for Python programmers that fully explores the syntactic divergences. Many of the substantial differences relate to Logix's special treatment of expressions. All statements return values, so it is possible to write code like this:

x = if 10 * 2 == 20: "yes it is!" else: "no"

A function call is written as a series of expressions:

min 2 6

In the standard dialect, parentheses distinguish individual expressions just as they do in algebra. Parentheses are not a part of the actual call nomenclature. The standard Logix expression:

min 2 6 (min 10 15)

______________________

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix