Jim Lawless' Blog


A DSL in JavaScript

Originally published on: Wed, 20 May 2009 02:24:36 +0000

Domain-Specific Languages (DSL's) have gotten quite a bit of press in the last few years. The idea is that DSL's should provide a means to control a piece of software or larger system. Some believe that DSL's should be restricted functionally so that they can only act as controlling agents. Others believe that DSL's should provide general programming facilities.

I believe that the answer really depends on what one's goals are in using the DSL. If one of the goals is to limit access to things that the DSL programmer should not be using, then a restricted DSL is likely better for that purpose.

I do believe that a DSL should always be Turing-complete, but I don't think all DSL's should have complete I/O function bindings.

There are a myriad of approaches to building DSL's ... some people build them in XML using a tag syntax and evaluate them in Java or C# or whatever by using an XML parser to traverse the script.

Some use compiler development tools such as ANTLR to transform the DSL into something more easily interpreted or perhaps source-code for another programming language.

The language Groovy has internal facilities for devising JSON-like builder structures intended for DSL development.

A simple approach to DSL implementation using dynamic languages such as Perl, Ruby, Python, or the Lisp family (among many others) is to call that language's eval() function to dynamically evaluate a given snippet of code. This, of course, limits the syntax to that of the hosting language and is unlikely to allow customized syntax.

Calling eval() would then allow the DSL script author to leverage any facility that can legally be invoked by eval(). So, if one uses eval() in Perl to expose a DSL whose purpose is to monitor web site up-time, the author could send e-mails or write to databases or anything that Perl is capable of within the DSL.
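To make that exposure concrete, here is a minimal sketch in JavaScript, the language used later in this post. The names hostSecret and runSnippet are made up for illustration; the only point is that code handed to a direct eval() can read whatever the surrounding code can read.

```javascript
// Hypothetical host state that a script author was never meant to see.
var hostSecret = "database password";

// A naive "DSL" runner: evaluates the snippet exactly as written.
function runSnippet(snippet) {
   return eval(snippet);
}

// Intended use: simple arithmetic.
var sum = runSnippet("1 + 2");

// Unintended use: the snippet reaches straight into the host's variables.
var leaked = runSnippet("hostSecret");
```

Here sum is 3 and leaked holds the host's string, which is exactly the kind of access a restricted DSL would want to rule out.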

I decided to explore the concept of using eval() to implement a DSL, but I wanted to be able to limit access to various system resources. I chose to use JavaScript as a target language, but this approach can be used with numerous dynamic languages. Let's refer to this mini-language as ProtoDSL.

My goals for ProtoDSL

  1. The ProtoDSL code block will be transformed into a legal block of JavaScript and will then be evaluated via eval().
  2. No direct access to the creation or usage of objects will be allowed; authors will be able to call functions and such but can neither create nor use objects directly.
  3. JavaScript keywords should be honored with the exception of new. In this example, others were omitted for brevity.
  4. Non-keyword identifiers in the ProtoDSL source will be prefixed with the string "dsl_" so that naming collisions do not arise.
  5. Built-in ProtoDSL functions can be provided by defining any function with a "dsl_" prefix. In my example, I defined a function dsl_print() in the JavaScript code that can be called in ProtoDSL by invoking print(). Using this metaphor, we can limit the access to the JavaScript host's I/O facilities.
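Goals 4 and 5 can be sketched without any transformation step at all, by hand-writing the code that the transformer is expected to produce. In this sketch dsl_twice and the output array are hypothetical stand-ins (the demo page writes to a text area instead); the idea is that a script can only name things the host has defined with the "dsl_" prefix.

```javascript
var output = [];          // stands in for the output text area

// Built-ins the host chooses to expose; both carry the "dsl_" prefix.
function dsl_print(x) {
   output.push(String(x));
}
function dsl_twice(n) {
   return n * 2;
}

// What a script author would write:   print(twice(21));
// What the host would evaluate after prefixing every identifier:
var transformed = "dsl_print(dsl_twice(21));";
eval(transformed);
```

After the eval() call, output contains the single string "42". A host function without the prefix, such as alert(), would be looked up as dsl_alert and would not be found.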

In the numerous DSL samples I've seen, many authors like to use OOP in their DSL, leveraging the dot operator to chain a series of invocations into one big line. In an act of sheer heresy, I've gone the other direction; I'm limiting the ProtoDSL user to a syntax more akin to AWK as opposed to the chained-object syntax.

Let's take a peek at the full code and then let's run the example.

dsl.htm


<!--
   License: MIT / X11
   Copyright (c) 2009 by James K. Lawless
   jimbo@radiks.net http://www.radiks.net/~jimbo
   http://www.mailsend-online.com
  
   Permission is hereby granted, free of charge, to any person
   obtaining a copy of this software and associated documentation
   files (the "Software"), to deal in the Software without
   restriction, including without limitation the rights to use,
   copy, modify, merge, publish, distribute, sublicense, and/or sell
   copies of the Software, and to permit persons to whom the
   Software is furnished to do so, subject to the following
   conditions:
  
   The above copyright notice and this permission notice shall be
   included in all copies or substantial portions of the Software.
  
   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
   OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
   HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
   WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
   FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
   OTHER DEALINGS IN THE SOFTWARE.
 -->
<html><head><title>DSL In JavaScript demo</title></head>
<body>
<script type="text/javascript">
keyword={
   "for" : "1" ,
   "do" : "1",
   "while" : "1",
   "if" : "1",
   "else" : "1",
   "var" : "1",
   "function" : "1"
   // "switch","try","catch" and some others have
   // been omitted for brevity
};

function transform_dsl(s) {
   var ar; // token array
   var ecode; // newly genned code
   var i,t,c;
      // a simple lexer regex
   ar=s.match(/[\"][^\"]*[\"]|[^a-zA-Z0-9\_\$]|[a-zA-Z0-9\_\$]+/g);
   ecode="";
   for(i=0;i<ar.length;i++) {
         // get the next token
      t=ar[i];
         // c will hold the first character
      c=t.substring(0,1);
         // no dots allowed
      if(c==".") {
         ecode+=" dot ";
         continue;
      }
         // is it an identifier and not a keyword? If so, prefix it with "dsl_"
      if((c=="$")||(c=="_")||((c>="a")&&(c<="z"))||((c>="A")&&(c<="Z"))) {
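            // own-property test, so inherited names such as
            // "constructor" are not mistaken for keywords
         if(!Object.prototype.hasOwnProperty.call(keyword,t)) {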
            ecode+="dsl_"+t;
            continue;
         }
      }
      ecode+=""+t;
   }
  return(ecode);
}

function eval_string(x) {
   try {
      eval(x);
   }
   catch(e) {
      alert(e.message);
   }
}
function dsl_print(x) {
   myform.outputtext.value+=x+"\r\n";
}
</script>
<form name="myform">
<textarea name="dsltext" rows="6" cols="40">
for(i=0;i<10;i++) {
   print(i + " test.");
}
</textarea>
<p>
<textarea name="outputtext" rows="15" cols="40">

</textarea>
<p>
<input type="button" value="Show Transformed Code"
  onClick="alert(transform_dsl(myform.dsltext.value));">
  &nbsp;
  
<input type="button" value="Transform and Eval!"
  onClick="eval_string(transform_dsl(myform.dsltext.value));">
</form>
</body></html>

I first defined a map of keywords called keyword. This map is incomplete; you might wish to add other keywords, but I suspect that you may implement this overall concept in something other than JavaScript anyhow.

Next, I defined a function called transform_dsl() whose purpose is to transform a snippet of ProtoDSL text into a snippet of JavaScript text, ready for eval()'ing. The transform_dsl() function leverages a regular expression as a simple lexical analyzer / scanner. This may need more fine-tuning as I try to write more complicated ProtoDSL scripts.

The code in the function then splits the ProtoDSL into an array of tokens based on the regex and begins the transformation.
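To see what the lexer regex actually produces, the sketch below applies the same pattern (with its redundant backslash escapes removed) to one line of the sample script. The variable names are only for illustration.

```javascript
// Same lexer pattern as in transform_dsl(): a double-quoted string,
// or any single non-identifier character, or a run of identifier characters.
var lexer = /"[^"]*"|[^a-zA-Z0-9_$]|[a-zA-Z0-9_$]+/g;

var tokens = 'print(i + " test.");'.match(lexer);
// tokens is:
// [ 'print', '(', 'i', ' ', '+', ' ', '" test."', ')', ';' ]
```

Whitespace comes through as tokens of its own, which is why the generated code keeps the layout of the original script. The string literal is a single token, so nothing inside it is ever prefixed.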

If the token is an identifier, the code checks whether the keyword map has an entry for it. If so, the token is a keyword and is left untouched. Otherwise, the string "dsl_" is added to the beginning of the token.
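The rule can be isolated in a small helper. rewriteToken is a hypothetical name, not part of dsl.htm, and the keyword map here is abbreviated. The sketch tests for keywords with hasOwnProperty, on the assumption that names every JavaScript object inherits, such as constructor, should be treated like any other identifier.

```javascript
var keyword = { "for": "1", "if": "1", "while": "1" };  // abbreviated map

// Returns the token as it should appear in the generated JavaScript.
function rewriteToken(t) {
   var startsLikeIdentifier = /^[a-zA-Z_$]/.test(t);
   // hasOwnProperty ignores names inherited from Object.prototype,
   // such as "constructor" or "toString".
   var isKeyword = Object.prototype.hasOwnProperty.call(keyword, t);
   if (startsLikeIdentifier && !isKeyword) {
      return "dsl_" + t;
   }
   return t;
}

var results = [
   rewriteToken("for"),          // keyword: left alone
   rewriteToken("print"),        // identifier: prefixed
   rewriteToken("10"),           // number: left alone
   rewriteToken("constructor")   // inherited name: still prefixed
];
```

The last case is the one worth checking in any variant of this idea: a plain lookup like keyword["constructor"] is not null, because every object inherits that property.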

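All non-identifier tokens with the exception of the dot are passed through as-is. A dot is replaced with the word dot, so member access such as a.b no longer parses. The new keyword is not in the keyword map, so it is treated as an ordinary identifier and becomes dsl_new.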
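Neither rewrite produces an explicit "not allowed" message; both simply yield text that the JavaScript parser rejects. A quick way to see this, using hand-written strings in the shape the transformer emits (the helper name parses is made up for this sketch):

```javascript
// Returns true if the engine accepts the text as a function body.
// new Function() only parses here; the body is never called.
function parses(code) {
   try {
      new Function(code);
      return true;
   } catch (e) {
      return false;
   }
}

var plainNew = parses("d = new Date();");              // ordinary JavaScript
var prefixed = parses("dsl_d = dsl_new dsl_Date();");  // after prefixing
var dotted   = parses("dsl_s dot dsl_length;");        // s.length after rewriting
```

plainNew is true, while prefixed and dotted are both false: two identifiers in a row are not a valid expression.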

So, the ProtoDSL code snippet below:


for(i=0;i<10;i++) {
   print(i + " test.");
}

becomes:


for(dsl_i=0;dsl_i<10;dsl_i++) {
   dsl_print(dsl_i + " test.");
}

Note that the transformed code is legal JavaScript.

Let's try this using a browser.

http://www.mailsend-online.com/wp/dsl.htm

If you click the Show Transformed Code button, you should then see an alert() message with the code after transformation.

If you click the Transform and Eval! button, the code should be transformed and then evaluated, which will cause the following to appear in the lower text area:

0 test.
1 test.
2 test.
3 test.
4 test.
5 test.
6 test.
7 test.
8 test.
9 test.

So, we have a limited DSL that allows the author to compute just about anything they want and display the results in the lower text area.
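The same round trip can be tried without a browser. What follows is a condensed restatement of the idea, not the code from dsl.htm: transformSketch and the lines array are names invented here, and the loop declares its counter with var so that the generated code also runs under strict mode.

```javascript
var keyword = { "for": 1, "do": 1, "while": 1, "if": 1, "else": 1, "var": 1, "function": 1 };
var lines = [];                       // replaces the output text area

function dsl_print(x) {
   lines.push(String(x));
}

// Condensed version of the transformation: prefix non-keyword identifiers,
// replace dots, pass everything else through.
function transformSketch(src) {
   var tokens = src.match(/"[^"]*"|[^a-zA-Z0-9_$]|[a-zA-Z0-9_$]+/g) || [];
   var out = "";
   for (var i = 0; i < tokens.length; i++) {
      var t = tokens[i];
      if (t === ".") {
         out += " dot ";
      } else if (/^[a-zA-Z_$]/.test(t) && !Object.prototype.hasOwnProperty.call(keyword, t)) {
         out += "dsl_" + t;
      } else {
         out += t;
      }
   }
   return out;
}

var source = 'for(var i=0;i<3;i++) {\n   print(i + " test.");\n}';
var generated = transformSketch(source);
eval(generated);
```

After this runs, generated begins with for(var dsl_i=0; and lines holds "0 test.", "1 test." and "2 test.".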

A rather large caveat: because we've sidestepped the formal compilation steps (true lexical analysis, parsing, and then semantic analysis), it's difficult to convey meaningful information to the author should the ProtoDSL system encounter a syntax error in the DSL text.

Go back to the web page above and change the for( text so that it is syntactically incorrect; let's add an extra "r" making it forr(.


forr(i=0;i<10;i++) {
   print(i + " test.");
}

When the system transforms this code, it becomes:


dsl_forr(dsl_i=0;dsl_i<10;dsl_i++) {
   dsl_print(dsl_i + " test.");
}

Note that since we did not use the for keyword properly, the identifier forr was transformed into dsl_forr. There's no such keyword, so our evaluation function eval_string() catches the exception in the try/catch block and displays an error from the JavaScript engine. In Firefox 2.x, the message reads:

missing ) after argument list

...because the system thinks we're trying to call a function forr() and have interjected some semicolons in there.
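The failure can be reproduced directly. The sketch below mirrors eval_string() from the listing but returns the caught error instead of calling alert(); evalAndReport is a name chosen for this example.

```javascript
// Like eval_string() in dsl.htm, but returns the error instead of alerting.
function evalAndReport(code) {
   try {
      eval(code);
      return null;
   } catch (e) {
      return e;
   }
}

var broken = 'dsl_forr(dsl_i=0;dsl_i<10;dsl_i++) {\n   dsl_print(dsl_i + " test.");\n}';
var err = evalAndReport(broken);
// err is a SyntaxError. Its message text differs between engines, and
// it is about the generated JavaScript, not about the misspelled "for".
```

Nothing in the error object points back to the word forr or to the line the author typed, which is the gap a real parser would fill.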

I think that the ProtoDSL concept has merit, but I believe we should leverage a true set of compiler tools (such as ANTLR) to handle the syntax transformation process so that meaningful error messages can be conveyed to the author. If we go that route, we could also then customize the syntax more directly to suit the problem domain.

Unless otherwise noted, all code and text entries are Copyright ©2009 by James K. Lawless.
