Jim Lawless' Blog


A DSL in JavaScript

Originally published on: Wed, 20 May 2009 02:24:36 +0000

Domain-Specific Languages (DSL's) have gotten quite a bit of press in the last few years. The idea is that DSL's should provide a means to control a piece of software or larger system. Some believe that DSL's should be restricted functionally so that they can only act as controlling agents. Others believe that DSL's should provide general programming facilities.

I believe that the answer really depends on what one's goals are in using the DSL. If one of the goals is to limit access to things that the DSL programmer should not be using, then a restricted DSL is likely better for that purpose.

I do believe that a DSL should always be Turing-complete, but I don't think all DSL's should have complete I/O function bindings.

There are a myriad of approaches to building DSL's ... some people build them in XML using a tag syntax and evaluate them in Java or C# or whatever by using an XML parser to traverse the script.

Some use compiler development tools such as ANTLR to transform the DSL into something more easily interpreted or perhaps source-code for another programming language.

The language Groovy has internal facilities for devising JSON-like builder structures intended for DSL development.

A simple approach to DSL implementation using dynamic languages such as Perl, Ruby, Python, or the Lisp family (among many others) is to simply call that language's eval() function to dynamically evaluate/execute a given snippet of code. This, of course, limits the syntax to that of the hosting language and would not likely allow customized syntax.

Calling eval() would then allow the DSL script author to leverage any facility that can legally be invoked by eval(). So, if one uses eval() in Perl to expose a DSL whose purpose is to monitor web site up-time, the author could send e-mails or write to databases or anything that Perl is capable of within the DSL.
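The same holds for JavaScript. As a minimal sketch (the variable names secret and leaked are made up for illustration), a string handed straight to eval() can reach anything visible in the calling scope:

```javascript
// Hypothetical host data that a script author was never meant to touch.
var secret = "host-only value";

// An unrestricted "DSL" snippet: eval() runs it in the caller's scope,
// so it can read (or modify) host variables and call any method on them.
var leaked = eval("secret.toUpperCase()");

console.log(leaked);
```

This is the kind of access that the identifier-prefixing scheme described below is meant to cut off.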

I decided to explore the concept of using eval() to implement a DSL, but I wanted to be able to limit access to various system resources. I chose to use JavaScript as a target language, but this approach can be used with numerous dynamic languages. Let's refer to this mini-language as ProtoDSL.

My goals for ProtoDSL

  1. The ProtoDSL code block will be transformed into a legal block of JavaScript and will then be evaluated via eval().
  2. No direct access to the creation or usage of objects will be allowed; authors will be able to call functions and such but can neither create nor use objects directly.
  3. JavaScript keywords should be honored with the exception of new. In this example, others were omitted for brevity.
  4. Non-keyword identifiers in the ProtoDSL source will be prefixed with the string "dsl_" so that naming collisions do not arise.
  5. Built-in ProtoDSL functions can be provided by defining any function with a "dsl_" prefix. In my example, I defined a function dsl_print() in the JavaScript code that can be called in ProtoDSL by invoking print(). Using this metaphor, we can limit the access to the JavaScript host's I/O facilities.
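Goal 5 might be sketched as follows. This is only an illustration: dsl_sqrt() and the output array are hypothetical stand-ins (the array replaces the text area so the snippet runs outside a browser), and the string passed to eval() is written out by hand rather than produced by the transformer.

```javascript
// Hypothetical host-side built-ins. Only names carrying the "dsl_" prefix
// are reachable from transformed ProtoDSL code.
var output = [];            // stands in for the output text area

function dsl_print(x) {     // callable from ProtoDSL as print(x)
   output.push(String(x));
}

function dsl_sqrt(x) {      // callable from ProtoDSL as sqrt(x)
   return Math.sqrt(x);
}

// The ProtoDSL line  print(sqrt(16));  after every identifier is prefixed:
eval("dsl_print(dsl_sqrt(16));");

console.log(output);
```

The host decides exactly which facilities exist by choosing which dsl_ functions to define; a ProtoDSL script that names anything else ends up referring to an undefined dsl_ identifier.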

In the numerous DSL samples I've seen, many authors like to use OOP in their DSL, leveraging the dot operator to chain a series of invocations into one big line. In an act of sheer heresy, I've gone the other direction; I'm limiting the ProtoDSL user to a syntax more akin to AWK as opposed to the chained-object syntax.

Let's take a peek at the full code and then let's run the example.

dsl.htm


<!--
   License: MIT / X11
   Copyright (c) 2009 by James K. Lawless
   jimbo@radiks.net http://www.radiks.net/~jimbo
   http://www.mailsend-online.com
  
   Permission is hereby granted, free of charge, to any person
   obtaining a copy of this software and associated documentation
   files (the "Software"), to deal in the Software without
   restriction, including without limitation the rights to use,
   copy, modify, merge, publish, distribute, sublicense, and/or sell
   copies of the Software, and to permit persons to whom the
   Software is furnished to do so, subject to the following
   conditions:
  
   The above copyright notice and this permission notice shall be
   included in all copies or substantial portions of the Software.
  
   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
   OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
   HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
   WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
   FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
   OTHER DEALINGS IN THE SOFTWARE.
 -->
<html><head><title>DSL In JavaScript demo</title></head>
<body>
<script type="text/javascript">
var keyword={
   "for" : "1" ,
   "do" : "1",
   "while" : "1",
   "if" : "1",
   "else" : "1",
   "var" : "1",
   "function" : "1"
   // "switch","try","catch" and some others have
   // been omitted for brevity
};

function transform_dsl(s) {
   var ar; // token array
   var ecode; // newly genned code
   var i,t,c;
      // a simple lexer regex
   ar=s.match(/[\"][^\"]*[\"]|[^a-zA-Z0-9\_\$]|[a-zA-Z0-9\_\$]+/g);
   ecode="";
   for(i=0;i<ar.length;i++) {
         // get the next token
      t=ar[i];
         // c will hold the first character
      c=t.substring(0,1);
         // no dots allowed
      if(c==".") {
         ecode+=" dot ";
         continue;
      }
         // is it an identifier and not a keyword? If so, prefix it with "dsl_"
      if((c=="$")||(c=="_")||((c>="a")&&(c<="z"))||((c>="A")&&(c<="Z"))) {
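            // own-property check, so inherited names such as "constructor"
            // are not mistaken for keywords
         if(!keyword.hasOwnProperty(t)) {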
            ecode+="dsl_"+t;
            continue;
         }
      }
      ecode+=""+t;
   }
  return(ecode);
}

function eval_string(x) {
   try {
      eval(x);
   }
   catch(e) {
      alert(e.message);
   }
}
function dsl_print(x) {
   myform.outputtext.value+=x+"\r\n";
}
</script>
<form name="myform">
<textarea name="dsltext" rows="6" cols="40">
for(i=0;i<10;i++) {
   print(i + " test.");
}
</textarea>
<p>
<textarea name="outputtext" rows="15" cols="40">

</textarea>
<p>
<input type="button" value="Show Transformed Code"
  onClick="alert(transform_dsl(myform.dsltext.value));">
  &nbsp;
  
<input type="button" value="Transform and Eval!"
  onClick="eval_string(transform_dsl(myform.dsltext.value));">
</form>
</body></html>

I first define a map of keywords called keyword. This map is incomplete ... you might wish to add other keywords, but I suspect that you may implement this overall concept in something other than JavaScript anyhow.

Next, I define a function called transform_dsl() whose purpose is to transform a snippet of ProtoDSL text into a snippet of JavaScript text, ready for eval()'ing. The transform_dsl() function leverages a regular expression as a simple lexical analyzer / scanner. This may need more fine-tuning as I try to write more complicated ProtoDSL scripts.

The code in the function then splits the ProtoDSL into an array of tokens based on the regex and begins the transformation.
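To see what the lexer regex produces, here is a small sketch that applies the same expression to one line of the sample script. The variable names lexer and tokens are only for illustration.

```javascript
// The lexer regex from transform_dsl(), tried in order at each position:
//   1. a double-quoted string literal
//   2. any single character that cannot appear in an identifier
//   3. a run of identifier characters (letters, digits, _ and $)
var lexer = /[\"][^\"]*[\"]|[^a-zA-Z0-9\_\$]|[a-zA-Z0-9\_\$]+/g;

var tokens = 'print(i + " test.");'.match(lexer);

console.log(tokens);
// Nine tokens: print ( i <space> + <space> " test." ) ;
```

Note that whitespace survives as its own single-character tokens, which is why the transformed code keeps the original layout, and that the dot inside the string literal stays part of the string token.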

If the token is an identifier, the code checks whether the keyword map has an entry for it. If so, the token is left untouched. Otherwise, the string "dsl_" is added to the beginning of the token.

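All non-identifier tokens are passed through as-is, with the exception of the dot, which is replaced by the word dot so that any attempt at member access no longer evaluates. The new keyword is deliberately left out of the keyword map, so it is not allowed in ProtoDSL and becomes dsl_new.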
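The effect on object creation and member access can be checked outside the browser. What follows is a condensed restatement of transform_dsl(), not the listing itself; it assumes the first-character test covers both lowercase and uppercase letters and that keyword lookup ignores inherited properties. The names transformed and rejected are only for illustration.

```javascript
var keyword = { "for": 1, "do": 1, "while": 1, "if": 1, "else": 1, "var": 1, "function": 1 };

function transform_dsl(s) {
   var ar = s.match(/[\"][^\"]*[\"]|[^a-zA-Z0-9\_\$]|[a-zA-Z0-9\_\$]+/g) || [];
   var ecode = "";
   for (var i = 0; i < ar.length; i++) {
      var t = ar[i];
      var c = t.substring(0, 1);            // first character of the token
      if (c == ".") {                       // no dots allowed
         ecode += " dot ";
         continue;
      }
      var startsIdentifier = (c == "$") || (c == "_") ||
         (c >= "a" && c <= "z") || (c >= "A" && c <= "Z");
      if (startsIdentifier && !Object.prototype.hasOwnProperty.call(keyword, t)) {
         ecode += "dsl_" + t;               // non-keyword identifier
         continue;
      }
      ecode += t;                           // keywords, numbers, strings, punctuation
   }
   return ecode;
}

var transformed = transform_dsl("x = new Date().getTime();");
console.log(transformed);   // dsl_x = dsl_new dsl_Date() dot dsl_getTime();

// The result is not legal JavaScript, so eval() refuses it.
var rejected = false;
try { eval(transformed); } catch (e) { rejected = true; }
console.log(rejected);      // true
```

One side effect of this scheme is that a numeric literal such as 1.5 is also split at the dot, so ProtoDSL as written cannot express decimal literals.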

So, the ProtoDSL code snippet below:


for(i=0;i<10;i++) {
   print(i + " test.");
}

becomes:


for(dsl_i=0;dsl_i<10;dsl_i++) {
   dsl_print(dsl_i + " test.");
}

Note that the transformed code is legal JavaScript.

Let's try this using a browser.

http://www.mailsend-online.com/wp/dsl.htm

If you click the Show Transformed Code button, you should then see an alert() message with the code after transformation.

If you click the Transform and Eval! button, the code should transform and then be evaluated, which will cause the following to appear in the lower text area:

0 test.
1 test.
2 test.
3 test.
4 test.
5 test.
6 test.
7 test.
8 test.
9 test.

So, we have a limited DSL that allows the author to compute just about anything they want and display the results in the lower text area.

A rather large caveat is that since we've sidestepped formal compilation procedures ... true lexical analysis, parsing, and then semantic analysis, it's difficult to convey meaningful information to the author should the ProtoDSL system encounter a syntax error in the DSL text.

Go back to the web page above and change the for( text so that it is syntactically incorrect; let's add an extra "r" making it forr(.


forr(i=0;i<10;i++) {
   print(i + " test.");
}

When the system transforms this code, it becomes:


dsl_forr(dsl_i=0;dsl_i<10;dsl_i++) {
   dsl_print(dsl_i + " test.");
}

Note that since forr is not in the keyword map, it was treated as an ordinary identifier and transformed into dsl_forr. The result is no longer a valid for statement, so eval() throws an exception, which our evaluation function eval_string() catches in the try/catch block before displaying an error from the JavaScript engine. In Firefox 2.x, the message reads:

missing ) after argument list

...because the engine thinks we're trying to call a function dsl_forr() and have interjected some semicolons into its argument list.
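The failure can be reproduced outside the browser with a variant of eval_string(). This is a sketch only: returning the error text instead of calling alert() is an assumption made so the snippet runs in any JavaScript engine, and the variable name report is made up.

```javascript
// Variant of eval_string() that returns the engine's complaint
// (or null on success) rather than showing it with alert().
function eval_string(x) {
   try {
      eval(x);
      return null;
   } catch (e) {
      return e.name + ": " + e.message;
   }
}

// The transformed text of the misspelled loop.
var report = eval_string(
   'dsl_forr(dsl_i=0;dsl_i<10;dsl_i++) {\n' +
   '   dsl_print(dsl_i + " test.");\n' +
   '}'
);

console.log(report);   // a SyntaxError; the wording varies by engine
```

Whatever the wording, the message describes the transformed JavaScript rather than the ProtoDSL the author typed, which is the heart of the problem.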

I think that the ProtoDSL concept has merit, but I believe we should leverage a true set of compiler tools (such as ANTLR) to handle the syntax transformation process so that meaningful error messages can be conveyed to the author. If we go that route, we could also then customize the syntax more directly to suit the problem domain.

Unless otherwise noted, all code and text entries are Copyright ©2009 by James K. Lawless.



