In the beginning, there were no programming languages, just machine code. Automatic programming came to save us all by forging rings of power to rule the machines: programming languages. They diversified, inspired new ones and came to form a rich ecosystem. Among them appeared a highly useful kind of computer language: DSLs.

DSLs

Domain-Specific (computer) Languages have been around for quite a long time. The academic description of this kind of language states that they are centred on, and used for, a specific application domain. DSLs are small and concise, which means they guide their users in the process of describing actions, entities or relations within their domain of application. More importantly, they are made to fit their purpose.

If I am doing statistics research, why should I worry about memory management?!

It isn’t surprising that a whole kingdom of these mini-languages has evolved and taken over machine learning tools, scientific programming, hardware design and management, modeling…

Wasn’t this a blog on Scala? Leave the history for academics!

Yeah, this might sound like stories from the past. However, these are times when we all seem obsessed with data acquisition, storage and analysis, and this data is usually complex to manage because of its variability. We are forced to deal with dozens of different ways of managing data, and many of them involve DSLs: SQL, HiveQL, CQL, bash, grep, awk, R… C’mon! How can I even finish the list? And let’s not even think about what is to come.

What if a tool gave us the power to create simple, guided and short languages to perform domain-specific tasks? What if Scala were a kind of DSL where the D stands for the Domain of Creating new DSLs?!

When its creators named Scala, they were not just thinking of its capabilities for code reuse and its potential use in horizontal, concurrent environments; they also had the extensibility of the language in mind. Some of the features in that direction are:

Infix notation: objectA.method(objectB) can be written as objectA method objectB
Lack of operators: There are no operators as separate entities, just methods. Associativity is determined by the last character of the method name, whereas precedence is determined by its first character. Any method name NOT ending with `:` associates from left to right; a name ending with `:` associates from right to left and is invoked on its right operand:
obj1 + obj2 is the same as writing obj1.+(obj2) whereas obj1 +: obj2 is the same as writing obj2.+:(obj1).
Similarly, operator precedence is given by a priority list of method name first characters, e.g. `*` gets a higher priority than `+`, so obj1 + obj2 * obj3 is always interpreted as obj1 + (obj2 * obj3). The list, from lowest to highest precedence, is as follows:
- Any letter, no matter the case.
- |
- ^
- &
- Symbols = and !
- Symbols < and >
- :
- Arithmetic operations + and -
- Arithmetic operations *, / and %
- Any other special character.
Advanced object oriented features: object, trait, …
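
The notation rules above can be sketched with a small example. `Vec` and `NotationDemo` are hypothetical names made up for this illustration; the only assumption is that symbolic method names follow the associativity and precedence rules just listed.

```scala
// Hypothetical 2D vector, used only to illustrate the notation rules.
final case class Vec(x: Int, y: Int) {
  // Plain methods with symbolic names: there are no built-in operators here.
  def +(other: Vec): Vec = Vec(x + other.x, y + other.y)
  def *(other: Vec): Vec = Vec(x * other.x, y * other.y)

  // The name ends with ':' so it is right-associative and it is invoked
  // on the RIGHT operand: `2 *: v` means `v.*:(2)`.
  def *:(factor: Int): Vec = Vec(x * factor, y * factor)
}

object NotationDemo {
  val a = Vec(1, 2)
  val b = Vec(3, 4)
  val c = Vec(5, 6)

  val infix      = a + b      // same as a.+(b)
  val precedence = a + b * c  // parsed as a + (b * c), not (a + b) * c
  val rightAssoc = 2 *: a     // same as a.*:(2)
}
```

Here `precedence` evaluates to `Vec(16, 26)`, whereas `(a + b) * c` would give `Vec(20, 36)`.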

These features can be combined to model and build internal DSLs within the Scala programming language.

Scala DSLs 101

Infix notation is the main feature to create our own embedded languages.

Consider the following trait:

trait MovingRobot {
  def moveForward(): MovingRobot = {
    println("Robot moved one position forward")
    this
  }
  def moveBackward(): MovingRobot = {
    println("Robot moved one position backward")
    this
  }
}

It can be mixed into an object declaration as:

object robot extends MovingRobot

Its methods can be called using traditional dot notation:

robot.moveForward().moveBackward()

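But infix notation gives us a more natural way to talk to this simple bot. In Scala 2, a call with no dot and no arguments is a postfix operator, which needs the scala.language.postfixOps import; the semicolons keep the parser from reading the next line as an argument:

import scala.language.postfixOps

robot moveForward() moveBackward;
robot moveForward;
robot moveBackward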

This is the simplest of all possible DSLs.

State transitions

Yes, simple, but rather imperative and useless. Commands do not change the system state beyond the side effect behind println.

At this point, there are two options to model the effects of the DSL instructions:

The mutable approach: Somewhat easier for Scala newcomers coming from imperative languages, but far more bug-prone. It is rather similar to the approach followed by so many builders in Java. Take Java’s StringBuilder:

The builder state is the string that is being composed. Methods such as append(double d) return a reference to the instance of the builder whose state has been altered by the very same method. Hence, the same reference is always returned, since it is the same StringBuilder instance that is mutating call after call. Sounds familiar?!
The immutable one (or the path of wisdom): Do not change anything; return a new state with the attributes derived from the previous state and your action. From now on, this post will only cover this approach.

The beauty of the second solution is that each action returns a new state object with a one-to-one relation to the system state. That is, the code entities are a perfect reflection of all the changes. Moreover, the state is immutable by definition.
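
The contrast between both approaches can be sketched as follows. `BuilderDemo` is a hypothetical name used for illustration; java.lang.StringBuilder and String are the real JDK classes, the former standing for the mutable approach and the latter for the immutable one.

```scala
object BuilderDemo {
  // Mutable approach: the builder alters its own state and returns itself.
  val builder  = new java.lang.StringBuilder("pi is ")
  val returned = builder.append(3.14)
  val sameInstance = returned eq builder // reference equality

  // Immutable approach: the original value never changes;
  // each operation yields a brand new value.
  val before = "pi is "
  val after  = before + 3.14
}
```

After the call to append, `builder` and `returned` are the very same object and both hold the appended text, while `before` remains untouched next to `after`.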

state /steɪt/ n., adj., v., stat•ed, stat•ing.
n.

the condition of a person or thing with respect to circumstances or experiences; the way something is [countable; usually singular]: the state of one’s health.

the condition of substances with respect to structure, form, etc. [countable]: Water in a gaseous state is steam.

(www.wordreference.com)

Discussing why immutability leads to far less buggy systems is out of the scope of this post; hundreds of explanations can be found by googling «immutability reduces bugs». Even Java’s creators decided it was better, at least for their strings.

Having each transition return a whole new state reduces its responsibility to just one: generating a new state, which simplifies the DSL design. No changes in the state are to be expected beyond explicit transition calls.

The nitty-gritties of immutable state transitions

Following the utterly complex example of our one-dimensional robot API (at this point you must have realized that the previous Scalera Challenge included a beautiful DSL), it can be altered to follow the functional approach described above:

// All states extend `RobotState`
trait RobotState {
  def position: Int
}

// Transitions which can be mixed with any state for which they
// make sense.

trait MovementTransitions {
  self: RobotState =>

  def moveForward(nSteps: Int = 1): RobotState with MovementTransitions

  def moveBackward(nSteps: Int = 1): RobotState with MovementTransitions

}

// States
// In this example, states only differ in the robot position so they all
// are represented by the same case class.
case class Robot(position: Int) extends RobotState with MovementTransitions {

  def moveForward(nSteps: Int = 1) =
    Robot(position + nSteps)

  def moveBackward(nSteps: Int = 1) =
    Robot(position - nSteps)

}

// Initial state
val robot = new Robot(0)

And its use:

robot moveForward(10) moveBackward() position

The code above is an oversimplification, but it shows the basic tricks behind Scala DSLs, namely the use of infix notation, families of states, and transitions only usable within state definitions.
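
One consequence of tying transitions to states is worth a sketch: when states are distinct types, the compiler rejects transitions that make no sense in a given state. The example below is a hypothetical extension of the robot, not part of the API above; the names `PoweredRobot`, `SwitchedOff`, `SwitchedOn` and `PowerDemo` are invented for illustration.

```scala
// Hypothetical robot that must be switched on before it can move.
sealed trait PoweredRobot {
  def position: Int
}

// While off, the only transition available is switching on.
final case class SwitchedOff(position: Int) extends PoweredRobot {
  def switchOn(): SwitchedOn = SwitchedOn(position)
}

// Movement transitions exist only in the switched-on state.
final case class SwitchedOn(position: Int) extends PoweredRobot {
  def moveForward(nSteps: Int = 1): SwitchedOn  = SwitchedOn(position + nSteps)
  def moveBackward(nSteps: Int = 1): SwitchedOn = SwitchedOn(position - nSteps)
  def switchOff(): SwitchedOff = SwitchedOff(position)
}

object PowerDemo {
  val parked = SwitchedOff(0)
  val result = parked.switchOn().moveForward(10).moveBackward().switchOff()
  // parked.moveForward() would not compile: SwitchedOff has no such transition.
}
```

Since every transition returns a new value, `parked` is still `SwitchedOff(0)` after `result` has been computed.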

A bit of theory: Really? State Machines?

Is the state machine model actually needed to implement DSLs? Yes, if you like avoiding shooting yourself in the foot.


Immutable state machines are easy to understand, maintain and expand.

On the other hand, DSLs are languages: formal languages whose grammars have a place in Noam Chomsky’s hierarchy, most commonly regular grammars and context-free grammars.

Which theoretical machine is able to recognize/process languages with regular grammars? A finite state automaton.
Context-free languages, in turn, can be processed by pushdown automata, which (ALERT! Oversimplification ahead) can be regarded as finite automata enjoying the perk of having their own stack on which to place and read symbols.
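
Both kinds of machine can be sketched with immutable transitions. The code below is a minimal illustration under assumed names (`Parity`, `Even`, `Odd`, `Automata`), not a general automaton library: the first recognizer accepts a regular language (binary strings with an even number of 1s), the second a context-free one (balanced parentheses), where the state carries a stack.

```scala
// Finite state automaton: two states, one transition per input symbol.
sealed trait Parity {
  def next(symbol: Char): Parity
}
case object Even extends Parity {
  def next(symbol: Char): Parity = if (symbol == '1') Odd else Even
}
case object Odd extends Parity {
  def next(symbol: Char): Parity = if (symbol == '1') Even else Odd
}

object Automata {
  // Accepts the input when the final state is the accepting one (Even).
  def evenOnes(input: String): Boolean =
    input.foldLeft(Even: Parity)((state, symbol) => state.next(symbol)) == Even

  // Pushdown-style recognizer: the state is an optional stack,
  // where None means the input has already been rejected.
  def balanced(input: String): Boolean =
    input.foldLeft(Option(List.empty[Char])) {
      case (Some(stack), '(')     => Some('(' :: stack) // push
      case (Some(_ :: rest), ')') => Some(rest)         // pop
      case _                      => None               // pop on empty stack, unknown symbol or earlier rejection
    }.contains(Nil) // accepted only if the stack ends up empty
}
```

In both cases each symbol produces a new state value and nothing is mutated, exactly as in the robot transitions.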

The transition model described above seems to be tailor-made for implementing this kind of machine. The question answers itself: why should DSL developers devote their efforts to buggy and flimsy solutions when such a solid model is available?