
Build Your Own Scripting Language for Java

An introduction to JSR 223

Summary

The upcoming J2SE 6.0 release will include an implementation of JSR 223, Scripting for the
Java Platform. The JSR is about programming languages and their integration with Java. This
article will demonstrate the power and potential of JSR 223 through the implementation of a
simple Boolean language. Throughout the example, you will see how to program to the Scripting
API (javax.script.*), how to package and deploy a language implementation in accordance with
the Script Engine Discovery Mechanism, and how to make your script engine compilable as well
as invocable the JSR 223 way.

What Is JSR 223 and Why Should We Care?

Don’t worry if you have no prior experience constructing a programming language of your
own. This article is not about programming languages but about a contract between programming
languages and Java so that the languages are not isolated islands. The contract we are talking
about here is JSR 223.

Before JSR 223 (and its predecessor BSF; see the Resources section for more information),
there existed many languages that communicated with Java already. There were languages that
took textual code as input from a Java program and returned the evaluation result back. There
were also languages that could keep references to objects in a Java program, invoke methods on
those objects, or create new instances of a Java class. The problem was that each language
communicated with Java in its own way. As a Java developer, every time you wanted to use a
script engine in your Java program, you had to learn the script engine’s proprietary programming
interface.

To solve this problem, JSR 223 defines a contract that all script engines conforming to the
specification must honor. The contract consists of a set of Java interfaces and classes, as well as
a mechanism for packaging and deploying a script engine. As a Java developer, when you work
with script engines conforming to JSR 223, you’ll always program to the same set of interfaces
defined by JSR 223. The script engine specific details are well encapsulated and you’ll never
need to concern yourself with them.

JSR 223 helps not only consumers but also producers of script engines. If you have designed
and implemented a programming language, you can reach out to a broader audience and make
your software friendlier to use by wrapping it with a layer that implements JSR 223 interfaces.

Before we look at the JSR 223 interfaces and our implementations of them, I’d like to point out
that although the name of the JSR and the title of this article both contain the word “Scripting”,
JSR 223 places no limits on the kinds of languages that can be integrated with Java. You can take
any language you fancy and wrap it with a layer that conforms to the contract laid out in JSR 223.
The language can be object-oriented, functional or in any other programming paradigm. It can be
strongly typed, weakly typed, or not typed at all. In fact, before writing this article, I had
implemented a JSR 223 wrapper for Scheme, a weakly typed, functional programming language, and
put it up on http://sourceforge.net/projects/model4lang. For this article, however, we are going
to look at a much simpler language that I designed specifically for this article so that we can
stay focused on JSR 223 without being overwhelmed by the details of a complex language.

BoolScript Engine

The figure below shows all the parties in our example and how they relate to each other. This
article’s example defines a very simple language, which I affectionately named BoolScript. We
will refer to the program that compiles and executes BoolScript code as the BoolScript engine.
Besides compiling and executing BoolScript code, the BoolScript engine also implements the
contract defined in JSR 223, which is what qualifies it as a JSR 223 script engine. As depicted in
the figure, all the code of the BoolScript engine is packaged into a single jar file called
boolscript.jar.

Throughout this article, when we say JSR 223, we mean the specification itself. We will refer to a
realization of the specification as a JSR 223 framework. The JSR 223 framework we used in this
article is the one included in J2SE 6.0 beta. Our example also consists of a Java program that
uses the BoolScript engine. The Java program hosts the BoolScript engine and its code is in
BoolScriptHostApp.java. In the figure, notice that a host Java program always interacts with a
script engine indirectly via a JSR 223 framework.

Figure 1, Overview of the BoolScript example

To run the example, all you need is J2SE 6.0 beta and this article’s binaries. The exact version of
J2SE 6.0 I used for developing the example is build 77. You can download it from
http://download.java.net/jdk6. The J2SE 6.0 beta available at
http://java.sun.com/javase/6/download.jsp should also work.

The article’s code example comes in several files. Here is a rundown of what each file is about.

• BoolScriptEngine-Source.zip contains the source code of the BoolScript engine.
• BoolScriptHostExample-Source.zip contains the source code of the host Java program.
• BoolScriptHostExample.zip contains the binaries of the BoolScript engine and the host
Java program.

To run the example, unzip BoolScriptHostExample.zip to a folder of your choice and run the host
Java program (BoolScriptHostApp.class). The zip file contains three jar files, and all three must
be on the Java classpath when you run the host Java program. You can find a sample command line
in run.bat, which is also included in BoolScriptHostExample.zip. After running the example, you
will see output like this:

Mozilla Rhino
Bool Script Engine
answer of boolean expression is: false
answer of boolean expression is: true
answer of boolean expression is: false

BoolScript Language

Before we delve into the details of JSR 223, let’s quickly go over the BoolScript language.
BoolScript is so simple that all you can do with it is evaluate Boolean expressions. Here’s what
code written in BoolScript looks like:

(True | False) & True


(True & x) | y

As you can see, BoolScript supports two operators: & (logical AND) and | (logical OR). Besides
operators, it supports three kinds of operands: True, False and variables whose values are either
True or False. That’s all there is to BoolScript.
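
To make the grammar concrete, here is a minimal sketch of an evaluator for expressions like the two above. It is not the article's BoolScript engine: the class name MiniBoolEval is hypothetical, variables come from a plain Map rather than a script context, and it assumes that & binds tighter than | (as in Java), which the language description above does not specify.

```java
import java.util.Map;

/** Hypothetical recursive-descent evaluator for BoolScript-like expressions. */
public class MiniBoolEval {
    private final String src;
    private final Map<String, Boolean> vars;
    private int pos = 0;

    private MiniBoolEval(String src, Map<String, Boolean> vars) {
        this.src = src;
        this.vars = vars;
    }

    /** Evaluates one expression; variable values are looked up in vars. */
    public static boolean eval(String src, Map<String, Boolean> vars) {
        MiniBoolEval parser = new MiniBoolEval(src, vars);
        boolean result = parser.orExpr();
        parser.skipSpaces();
        if (parser.pos != src.length()) {
            throw new IllegalArgumentException("Unexpected input at " + parser.pos);
        }
        return result;
    }

    // orExpr := andExpr ('|' andExpr)*
    private boolean orExpr() {
        boolean value = andExpr();
        while (peek() == '|') {
            pos++;
            value = andExpr() | value; // non-short-circuit: the right side is always parsed
        }
        return value;
    }

    // andExpr := atom ('&' atom)*   (assumption: & binds tighter than |)
    private boolean andExpr() {
        boolean value = atom();
        while (peek() == '&') {
            pos++;
            value = atom() & value;
        }
        return value;
    }

    // atom := 'True' | 'False' | variable | '(' orExpr ')'
    private boolean atom() {
        if (peek() == '(') {
            pos++;
            boolean value = orExpr();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ) at " + pos);
            }
            pos++;
            return value;
        }
        int start = pos;
        while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) {
            pos++;
        }
        String word = src.substring(start, pos);
        if (word.isEmpty()) {
            throw new IllegalArgumentException("Expected operand at " + pos);
        }
        if (word.equals("True")) return true;
        if (word.equals("False")) return false;
        Boolean value = vars.get(word);
        if (value == null) {
            throw new IllegalArgumentException("Variable " + word + " not set.");
        }
        return value;
    }

    // Skips whitespace and returns the next character, or 0 at end of input.
    private char peek() {
        skipSpaces();
        return pos < src.length() ? src.charAt(pos) : 0;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }
}
```

For example, MiniBoolEval.eval("(True & x) | y", Map.of("x", false, "y", true)) evaluates to true, while the same expression with both variables set to false evaluates to false.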

Script Engine Discovery Mechanism

To see what a JSR 223 framework does in between a host Java program and a script engine, let’s
assume that you want to use a script engine in your Java program. Here are the steps you’ll
typically perform. First, you’ll need to create an instance of the script engine. Second, you’ll need
to pass textual code to the engine and have the engine evaluate it. Alternatively, you might want
the engine to compile the code and save the compiled code for later execution. Let’s go through
the steps one by one, bearing in mind that whatever we do, we can only use the script engine
through the JSR 223 framework.

To create an instance of a script engine, you first create an instance of
javax.script.ScriptEngineManager and then use it to look up a script engine. You can look up a
script engine by its name, by its MIME types or by its file extensions. If we store BoolScript code
in *.bool files, then the file extension in our case would be “bool”. The code below looks up the
BoolScript engine by file extension.

ScriptEngineManager engineMgr = new ScriptEngineManager();
ScriptEngine bsEngine = engineMgr.getEngineByExtension("bool");
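
The lookup methods return null when no engine matches, so a host program usually wants a check before using the result. The sketch below shows one way to do that and to list whatever engines a manager has discovered. The class name EngineLookup and its helper methods are hypothetical, and the engine name "BoolScript" and the MIME type "text/x-boolscript" are made-up values for illustration; only the "bool" extension comes from the article.

```java
import java.util.ArrayList;
import java.util.List;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

public class EngineLookup {

    /** Returns the engine registered for the extension, or fails with a clear message. */
    public static ScriptEngine requireByExtension(ScriptEngineManager mgr, String ext) {
        ScriptEngine engine = mgr.getEngineByExtension(ext);
        if (engine == null) { // the manager returns null rather than throwing
            throw new IllegalStateException("No script engine found for extension: " + ext);
        }
        return engine;
    }

    /** Describes every engine factory the manager discovered. */
    public static List<String> describeEngines(ScriptEngineManager mgr) {
        List<String> lines = new ArrayList<>();
        for (ScriptEngineFactory factory : mgr.getEngineFactories()) {
            lines.add(factory.getEngineName()
                    + " names=" + factory.getNames()
                    + " extensions=" + factory.getExtensions()
                    + " mimeTypes=" + factory.getMimeTypes());
        }
        return lines;
    }

    public static void main(String[] args) {
        ScriptEngineManager mgr = new ScriptEngineManager();
        describeEngines(mgr).forEach(System.out::println);

        // The three lookup styles; each prints null when nothing matches.
        System.out.println(mgr.getEngineByName("BoolScript"));            // hypothetical name
        System.out.println(mgr.getEngineByMimeType("text/x-boolscript")); // hypothetical MIME type
        System.out.println(mgr.getEngineByExtension("bool"));
    }
}
```

Which engines the listing prints depends entirely on what is on the classpath of the JVM that runs it.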

But where do we specify the name, mime types and file extensions of our script engine? We
specify them in BoolScriptEngineFactory. The class implements the methods getExtensions(),
getMimeTypes() and getNames() of the javax.script.ScriptEngineFactory interface. And it is in
those methods that we declare the name, mime types and file extensions of the BoolScript
engine. The code for the getExtensions() method in BoolScriptEngineFactory looks like this:

public List<String> getExtensions()
{
    // File extensions that identify BoolScript source files
    ArrayList<String> extList = new ArrayList<String>();
    extList.add("bool");
    return extList;
}

You might wonder why we bother using ScriptEngineManager to create an instance of
BoolScriptEngine instead of just creating it ourselves like this:

ScriptEngine bsEngine = new BoolScriptEngine();

Well, you can certainly do that. In fact, I did that a few times for quick testing while
I developed the example code. Creating a script engine directly might be okay for testing a script
engine, but in real usage it violates the principle that a client Java program should
always interact with a script engine indirectly via a JSR 223 framework. It defeats JSR 223’s
purpose of information hiding. JSR 223 achieves information hiding by using the Factory Method
design pattern to decouple script engine creation from a host Java program. Another problem with
instantiating a script engine directly is that it bypasses any initialization that
ScriptEngineManager might perform on a newly created script engine instance. Is there any such
initialization? Read on.

Given the string “bool”, how does ScriptEngineManager find BoolScriptEngine and create an
instance of it? The answer is what JSR 223 calls the Script Engine Discovery Mechanism. Here’s
how it works. At the end of this discussion, you will also see what initialization
ScriptEngineManager performs on a script engine and why.

According to the Script Engine Discovery Mechanism, a script engine provider needs to package
all the classes that implement a script engine plus one extra file in a jar file. The extra file must
have the name javax.script.ScriptEngineFactory. The jar file must have the folder
META-INF/services and the file javax.script.ScriptEngineFactory must reside in that folder. If you
take a look at the contents of boolscript.jar, you will see this file and folder structure.

The file META-INF/services/javax.script.ScriptEngineFactory must contain the fully qualified
names of the classes that implement ScriptEngineFactory in the script engine. In our example, we
have only one such class and the file META-INF/services/javax.script.ScriptEngineFactory looks
like this:

net.sf.model4lang.boolscript.engine.BoolScriptEngineFactory
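
Provider files under META-INF/services are the same convention that java.util.ServiceLoader reads, so a short sketch with ServiceLoader can show which factories such files expose on a given classpath. Treat it as an illustration of the lookup only: the class name FactoryDiscovery is hypothetical, and whether a particular ScriptEngineManager implementation uses ServiceLoader internally is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;
import javax.script.ScriptEngineFactory;

public class FactoryDiscovery {

    /**
     * Lists the class names of all ScriptEngineFactory providers declared in
     * META-INF/services/javax.script.ScriptEngineFactory files that are
     * visible to the current thread's context class loader.
     */
    public static List<String> providerClassNames() {
        List<String> names = new ArrayList<>();
        for (ScriptEngineFactory factory : ServiceLoader.load(ScriptEngineFactory.class)) {
            names.add(factory.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // With boolscript.jar on the classpath, the factory class named in
        // its provider file would be expected to appear in this list.
        providerClassNames().forEach(System.out::println);
    }
}
```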

After a script engine provider packages a script engine in a jar file and releases it, users
of the script engine install it by putting the jar file in the Java classpath. The figure
below shows the events that take place when a host Java program asks the JSR 223 framework
to discover a script engine.

Figure 2, How a host Java program discovers a script engine

When asked to find a particular script engine by name, mime types or file extensions, a
ScriptEngineManager will go over the list of ScriptEngineFactory classes (i.e., classes that
implement the ScriptEngineFactory interface) that it finds in the classpath. If it finds a match, it will
create an instance of the engine factory and use the engine factory to create an instance of the
script engine. A script engine factory does the job of creating a script engine in its
getScriptEngine() method. It is the script engine provider’s responsibility to implement the
method. If you take a look at BoolScriptEngineFactory, you’ll see that our implementation for
getScriptEngine() looks like this:

public ScriptEngine getScriptEngine()
{
    return new BoolScriptEngine();
}
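
To show how a factory and an engine fit together end to end, here is a minimal sketch of both, under stated assumptions. TinyEngine and TinyFactory are hypothetical stand-ins, not the article's BoolScriptEngine and BoolScriptEngineFactory: a script here is a single operand (True, False or a variable name). The factory is registered programmatically with registerEngineExtension() so that the example runs without a jar file; a packaged engine would rely on the META-INF/services file instead.

```java
import java.io.Reader;
import java.util.List;
import javax.script.AbstractScriptEngine;
import javax.script.Bindings;
import javax.script.ScriptContext;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
import javax.script.SimpleBindings;

public class TinyFactoryDemo {

    /** Hypothetical engine: a script is one operand, True, False or a variable name. */
    static class TinyEngine extends AbstractScriptEngine {
        private final ScriptEngineFactory factory;

        TinyEngine(ScriptEngineFactory factory) { this.factory = factory; }

        @Override public Object eval(String script, ScriptContext ctx) throws ScriptException {
            String word = script.trim();
            if (word.equals("True")) return Boolean.TRUE;
            if (word.equals("False")) return Boolean.FALSE;
            Object value = ctx.getAttribute(word); // engine scope first, then global scope
            if (value == null) throw new ScriptException("Variable " + word + " not set.");
            return value;
        }
        @Override public Object eval(Reader reader, ScriptContext ctx) throws ScriptException {
            throw new ScriptException("Reader input is not supported in this sketch");
        }
        @Override public Bindings createBindings() { return new SimpleBindings(); }
        @Override public ScriptEngineFactory getFactory() { return factory; }
    }

    /** Hypothetical factory; all names and versions are made up for the example. */
    static class TinyFactory implements ScriptEngineFactory {
        @Override public String getEngineName() { return "Tiny Engine"; }
        @Override public String getEngineVersion() { return "0.1"; }
        @Override public List<String> getExtensions() { return List.of("tiny"); }
        @Override public List<String> getMimeTypes() { return List.of("text/x-tiny"); }
        @Override public List<String> getNames() { return List.of("tiny"); }
        @Override public String getLanguageName() { return "Tiny"; }
        @Override public String getLanguageVersion() { return "0.1"; }
        @Override public Object getParameter(String key) {
            if (ScriptEngine.NAME.equals(key)) return "tiny";
            if (ScriptEngine.ENGINE.equals(key)) return getEngineName();
            if (ScriptEngine.LANGUAGE.equals(key)) return getLanguageName();
            return null;
        }
        // The three code-generation helpers are not needed for this sketch.
        @Override public String getMethodCallSyntax(String obj, String m, String... args) {
            throw new UnsupportedOperationException();
        }
        @Override public String getOutputStatement(String toDisplay) {
            throw new UnsupportedOperationException();
        }
        @Override public String getProgram(String... statements) {
            throw new UnsupportedOperationException();
        }
        @Override public ScriptEngine getScriptEngine() { return new TinyEngine(this); }
    }

    /** Looks the engine up by extension through the manager and evaluates one script. */
    public static Object run(String script) throws ScriptException {
        ScriptEngineManager mgr = new ScriptEngineManager();
        mgr.registerEngineExtension("tiny", new TinyFactory()); // stands in for the provider file
        ScriptEngine engine = mgr.getEngineByExtension("tiny");
        engine.put("x", Boolean.TRUE);
        return engine.eval(script);
    }
}
```

The host side in run() never names TinyEngine; it sees only ScriptEngineManager and ScriptEngine, which is the decoupling the factory is there to provide.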

The method is very simple. It just creates an instance of our script engine and returns it to
ScriptEngineManager (or whoever the caller is). What’s interesting is that after
ScriptEngineManager gets the script engine instance, and before it returns the instance to the
client Java program, it initializes the engine instance by calling the engine’s setBindings()
method. This brings us to one of the core concepts of JSR 223: Java Bindings. After we explain the
concepts and constructs of Bindings, Scope and Context, you will know what the setBindings() call
does to a script engine.

Bindings, Scope and Context

Remember the BoolScript language allows you to write code like this:

(True & x) | y

But it doesn’t have any language construct for you to assign values to the variables x and y. I
could have designed the language to accept code like this:

x = True
y = False
(True & x) | y

But I purposely left out the assignment operator “=” and required that BoolScript code must
execute in a context where the values of the variables are defined. This means that when a host
Java program passes textual code to the BoolScript engine for evaluation, it also needs to pass a
context to the script engine or at least tell the script engine which context to use.

You can think of a context as a bag that contains data you want to pass back and forth between a
host Java program and a script engine. The construct that JSR 223 defined to model the concept
of context is the interface javax.script.ScriptContext. A bag would be messy if we put a lot of
things in it without some type of organization. So to be neat and tidy, a script context (i.e., an
instance of ScriptContext) partitions data it holds into scopes. The construct that JSR 223 defined
to model the concept of scope is the interface javax.script.Bindings. Here’s a pictorial view of a
context, its scopes and data stored therein.

There are several important things to notice in the figure above:

1. A script engine contains a script context.
2. A script engine manager (i.e., an instance of ScriptEngineManager) can be used to create
multiple script engines.
3. A script engine manager contains a scope called Global Scope but it does not contain a
context.
4. Each scope is basically just a collection of name-value pairs. The figure above shows
that one of the scopes contains a slot whose name is x and a slot whose name is y. And
remember that a scope is an instance of javax.script.Bindings.
5. The context in a script engine contains a Global Scope, an Engine Scope and zero or
more other scopes.
6. A script engine can be used to evaluate multiple scripts (i.e., separate code snippets
written in the script language).

We explained why there are scopes in a context. But what are the Global Scope and Engine
Scope in the figure? A Global Scope is a scope shared by multiple script engines. If you want
some piece of data to be accessible across multiple script engines, a Global Scope is the place to
put it. Note that a Global Scope is not global to all script engines. It’s only global to the
script engines created by the script engine manager in which the Global Scope resides. Now you
know why there is a Global Scope in a script engine manager as depicted in the figure above.
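
The relationship between a manager's Global Scope and the contexts that share it can be sketched without any script engine at all. In the example below, two SimpleScriptContext instances stand in for the contexts of two engines created by the same manager; the class name ScopeDemo and the helper contextSharing() are hypothetical.

```java
import javax.script.Bindings;
import javax.script.ScriptContext;
import javax.script.ScriptEngineManager;
import javax.script.SimpleScriptContext;

public class ScopeDemo {

    /** Builds a context whose Global Scope is the given Bindings instance. */
    static ScriptContext contextSharing(Bindings globalScope) {
        ScriptContext ctx = new SimpleScriptContext(); // starts with an empty engine scope
        ctx.setBindings(globalScope, ScriptContext.GLOBAL_SCOPE);
        return ctx;
    }

    public static String demo() {
        ScriptEngineManager mgr = new ScriptEngineManager();
        mgr.put("x", Boolean.TRUE); // stored in the manager's Global Scope

        // Two contexts stand in for the contexts of two engines created by mgr.
        ScriptContext ctxA = contextSharing(mgr.getBindings());
        ScriptContext ctxB = contextSharing(mgr.getBindings());

        // A value in the engine scope is private to one context and
        // shadows the value of the same name in the Global Scope.
        ctxA.setAttribute("x", Boolean.FALSE, ScriptContext.ENGINE_SCOPE);

        return ctxA.getAttribute("x") + " " + ctxB.getAttribute("x") + " "
                + (ctxA.getAttributesScope("x") == ScriptContext.ENGINE_SCOPE) + " "
                + (ctxB.getAttributesScope("x") == ScriptContext.GLOBAL_SCOPE);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "false true true true"
    }
}
```

The unscoped getAttribute() call searches the engine scope before the Global Scope, which is why context A sees its own value of x while context B still sees the shared one.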

An Engine Scope is a scope shared by multiple scripts. If you want some piece of data to be
accessible across multiple scripts, an Engine Scope is the place to put it. For example, suppose
we have two scripts like this:

(True & x) | y //script A

(True & x) //script B

If we want to share the same value for x across the two scripts, we can put that value in the
Engine Scope held by the script engine that we will use to evaluate the two scripts. And suppose
we want the value of y to be visible only to script A. To do that, we can create a separate scope,
put the value of y in it, and make sure we use that scope only when evaluating script A.

As an example, the main method of BoolScriptHostApp has the following code for evaluating (x &
y):

//bsEngine is an instance of ScriptEngine
bsEngine.put("x", BoolTermEvaluator.tTrue);
bsEngine.put("y", BoolTermEvaluator.tTrue);
bsEngine.eval("x & y\n\n");

The code puts the values of both x and y in the engine scope. Then it calls the eval() method on
the engine to evaluate the BoolScript code. If you look at the ScriptEngine interface, you’ll see
that the eval() method is overloaded with different parameters. If we call eval() with just a string,
as we did in the code snippet above, the script engine will evaluate the code in its own context. If
we don’t want to evaluate the code in the script engine’s context, then we have to supply the
context we’d like to use when we call eval().
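
The difference between the eval() overloads is easiest to see side by side. The sketch below uses a hypothetical stand-in engine, built on AbstractScriptEngine, in which a script is just a variable name that evaluates to that variable's value; the class name EvalOverloads is also made up. It is meant only to show which scope each overload reads from.

```java
import java.io.Reader;
import javax.script.AbstractScriptEngine;
import javax.script.Bindings;
import javax.script.ScriptContext;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptException;
import javax.script.SimpleBindings;
import javax.script.SimpleScriptContext;

public class EvalOverloads {

    /** Hypothetical engine: the script is a variable name and evaluates to its value. */
    static ScriptEngine lookupEngine() {
        return new AbstractScriptEngine() {
            @Override public Object eval(String script, ScriptContext ctx) throws ScriptException {
                Object value = ctx.getAttribute(script.trim());
                if (value == null) {
                    throw new ScriptException("Variable " + script.trim() + " not set.");
                }
                return value;
            }
            @Override public Object eval(Reader reader, ScriptContext ctx) throws ScriptException {
                throw new ScriptException("Reader input is not supported in this sketch");
            }
            @Override public Bindings createBindings() { return new SimpleBindings(); }
            @Override public ScriptEngineFactory getFactory() { return null; } // no factory here
        };
    }

    public static String demo() throws ScriptException {
        ScriptEngine engine = lookupEngine();
        engine.put("x", Boolean.TRUE);            // goes into the engine's own Engine Scope

        Object a = engine.eval("x");              // evaluated in the engine's context

        Bindings scriptScope = engine.createBindings();
        scriptScope.put("x", Boolean.FALSE);
        Object b = engine.eval("x", scriptScope); // scriptScope acts as Engine Scope for this call

        ScriptContext ctx = new SimpleScriptContext();
        ctx.setAttribute("x", Boolean.FALSE, ScriptContext.ENGINE_SCOPE);
        Object c = engine.eval("x", ctx);         // evaluated in a separate context

        // The engine's own Engine Scope is untouched by the last two calls.
        return a + " " + b + " " + c + " " + engine.get("x");
    }

    public static void main(String[] args) throws ScriptException {
        System.out.println(demo()); // prints "true false false true"
    }
}
```

Passing a Bindings instance to eval() is one way to give a single script, such as script A above, values that other scripts evaluated by the same engine do not see.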

Our implementation of the eval() method delegates the job of evaluating BoolScript code all the
way down the method invocation chain until the following method in BoolTermEvaluator is called.

public static BoolTerm evaluate(BoolTerm term, ScriptContext context)
{
    ...
    else if (term instanceof Var)
    {
        // A variable: look up its value by name in the engine scope
        Var var = (Var) term;
        Bindings bindings =
            context.getBindings(ScriptContext.ENGINE_SCOPE);
        if (!bindings.containsKey(var.getName()))
            throw new IllegalArgumentException("Variable " +
                var.getName() + " not set.");

        Boolean varValue = (Boolean) bindings.get(var.getName());
        // equals() rather than ==, so that any Boolean holding true
        // matches, not only the Boolean.TRUE instance
        if (Boolean.TRUE.equals(varValue))
            return BoolTermEvaluator.tTrue;
        else
            return BoolTermEvaluator.tFalse;
    }
    ...
}

The method evaluates BoolScript code by evaluating terms that are True, False or variables.
When it sees that a term is a variable as shown in the code excerpt above, it gets a reference to
the engine scope by calling getBindings() on the context that’s passed to it as a parameter.
Because there might be more than one scope in a context, we indicate that we want to get the
engine scope by passing the constant ScriptContext.ENGINE_SCOPE to getBindings(). After we
get the engine scope, we look up the variable’s value by the variable’s name in the engine scope.
If we cannot find a value for the variable, we throw an exception. Otherwise, we have successfully
evaluated the variable and we return its value.

Now we are finally ready to explain why a script engine manager initializes a script engine by
calling the engine’s setBindings() method. When a script engine manager calls an engine’s
setBindings() method, it passes its Global Scope as a parameter to the method. The engine’s
implementation of the setBindings() method is expected to store the Global Scope in the engine’s
script context.

Before we leave this section, let’s take a look at a few classes in the Scripting API. We said that a
ScriptEngineManager contains an instance of Bindings that represents a Global Scope. If you
look at the javax.script.ScriptEngineManager class, you’ll see that there is a method getBindings()
for getting the Bindings and a method setBindings() for setting the Bindings in a
ScriptEngineManager.

Similarly, a ScriptEngine contains an instance of ScriptContext. If you look at the
javax.script.ScriptEngine interface, you’ll see a method getContext() and a method setContext()
for getting and setting the script context in a ScriptEngine.

Because ScriptEngineManager exposes both getBindings() and setBindings(), there’s nothing to
prevent you from sharing a Global Scope among several script engine managers. To do that, you
just need to call getBindings() on one script engine manager to get its Global Scope and then
call setBindings() with that Global Scope on the other script engine managers.

If you look at our example script engine class BoolScriptEngine, you won’t see it keeping a
reference to an instance of ScriptContext explicitly. That is because BoolScriptEngine inherits
from AbstractScriptEngine and AbstractScriptEngine already has an instance of ScriptContext as
its member. If you ever need to implement a script engine from scratch without inheriting from a
class such as AbstractScriptEngine, you will need to keep an instance of ScriptContext in your
script engine and implement the getContext() and setContext() methods accordingly.

Compilable and Invocable

By now, we have implemented the minimum for our BoolScript engine to qualify as a JSR 223
script engine. Every time a Java client program wants to use our script engine, it passes the
BoolScript code as a string. Internally the script engine has a parser that parses the string into a
tree of objects commonly called an abstract syntax tree. And then it passes the tree to the
BoolTermEvaluator.evaluate() method we saw earlier. This whole process of evaluating
BoolScript code is called interpretation as opposed to compilation. And the BoolScript engine in
this role is called an interpreter as opposed to a compiler. To be a compiler, the BoolScript engine
needs to transform the textual BoolScript code into an intermediate form so that it won’t have to
parse the code into an abstract syntax tree every time it wants to evaluate it. And that is the goal
of this section.

Java programs are compiled into an intermediate form called Java bytecode and stored in .class
files. At runtime, .class files are loaded by class loaders and the bytecode is executed by the Java
Virtual Machine. Instead of defining our own intermediate form and implementing our own virtual
machine, we’ll simply stand on the shoulders of Java by compiling BoolScript code into Java
bytecode.

The construct JSR 223 defined to model the concept of compilation is javax.script.Compilable
and that’s the interface BoolScriptEngine needs to implement. The following code in
BoolScriptHostApp shows how to use a compilable script engine to compile and execute script
code.

List<Boolean> boolAnswers = null;
//bsEngine is an instance of ScriptEngine
Compilable compiler = (Compilable) bsEngine;
CompiledScript compiledScript = compiler.compile("x & y\n\n");
Bindings bindings = new SimpleBindings();
bindings.put("x", Boolean.TRUE);
bindings.put("y", Boolean.TRUE);
boolAnswers = (List<Boolean>) compiledScript.eval(bindings);
printAnswers(boolAnswers);

//invoke() as named in the J2SE 6.0 beta; the final Java SE 6 API calls it invokeFunction()
Invocable invocable = (Invocable) bsEngine;
boolAnswers = (List<Boolean>) invocable.invoke("eval",
    Boolean.TRUE, Boolean.FALSE);
printAnswers(boolAnswers);

In the code above, bsEngine is an instance of ScriptEngine that we know also implements the
Compilable interface. We cast it to Compilable and call its compile() method to
compile the code “x & y”. Internally, the compile() method transforms “x & y” into the following
Java code:

package net.sf.model4lang.boolscript.generated;
import java.util.*;
import java.lang.reflect.*;

class TempBoolClass {
public static List<Boolean> eval(boolean x, boolean y)
{
List<Boolean> resultList = new ArrayList<Boolean>();
boolean result = false;
result = x & y;
resultList.add(new Boolean(result));
return resultList;
}
}

The transformation converts BoolScript code into a Java method inside a Java class. The class
name and method name are hard coded. Each variable in BoolScript code becomes a parameter
in the Java method.

Transforming BoolScript code to Java code is just half of the story. The other half is compiling the
generated Java code into bytecode. I chose to compile the generated Java code in memory using
the JSR 199 Java Compiler API, another new feature in J2SE 6.0. Details of the Java Compiler API
are beyond the scope of this article and might be a good topic for another article.

The Compilable interface dictates that the compile() method must return an instance of
CompiledScript. The class CompiledScript is the construct JSR 223 defined to model the result of
a compilation. No matter how we compile our script code, after all is said and done, we need to
package the compilation result as an instance of CompiledScript. In the example code, we
defined a class BoolCompiledScript and derived it from CompiledScript to store the compiled
BoolScript code.
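
As an illustration of the Compilable contract, here is a minimal sketch of an engine whose compile() method does the parsing once and returns a CompiledScript that can be executed repeatedly. It is not the article's BoolCompiledScript: the hypothetical AndEngine accepts only scripts of the form "a & b", and its "compiled" form is simply the pair of parsed variable names rather than Java bytecode.

```java
import java.io.Reader;
import javax.script.AbstractScriptEngine;
import javax.script.Bindings;
import javax.script.Compilable;
import javax.script.CompiledScript;
import javax.script.ScriptContext;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptException;
import javax.script.SimpleBindings;

public class CompileOnceDemo {

    /** Hypothetical engine for scripts of the form "a & b": two variable names joined by &. */
    static class AndEngine extends AbstractScriptEngine implements Compilable {

        @Override public CompiledScript compile(String script) throws ScriptException {
            String[] names = script.trim().split("\\s*&\\s*"); // parsing happens once, here
            if (names.length != 2) throw new ScriptException("Expected: <var> & <var>");
            final String left = names[0];
            final String right = names[1];
            return new CompiledScript() {
                @Override public Object eval(ScriptContext ctx) throws ScriptException {
                    return lookup(ctx, left) & lookup(ctx, right); // no parsing at execution time
                }
                @Override public ScriptEngine getEngine() { return AndEngine.this; }
            };
        }
        @Override public CompiledScript compile(Reader reader) throws ScriptException {
            throw new ScriptException("Reader input is not supported in this sketch");
        }
        @Override public Object eval(String script, ScriptContext ctx) throws ScriptException {
            return compile(script).eval(ctx);
        }
        @Override public Object eval(Reader reader, ScriptContext ctx) throws ScriptException {
            throw new ScriptException("Reader input is not supported in this sketch");
        }
        @Override public Bindings createBindings() { return new SimpleBindings(); }
        @Override public ScriptEngineFactory getFactory() { return null; } // no factory here

        private static boolean lookup(ScriptContext ctx, String name) throws ScriptException {
            Object value = ctx.getAttribute(name);
            if (!(value instanceof Boolean)) {
                throw new ScriptException("Variable " + name + " not set.");
            }
            return (Boolean) value;
        }
    }

    /** Compiles once, then executes the compiled script with two different sets of values. */
    public static String demo() throws ScriptException {
        Compilable compiler = new AndEngine();
        CompiledScript compiled = compiler.compile("x & y");
        StringBuilder out = new StringBuilder();
        for (boolean y : new boolean[] {true, false}) {
            Bindings bindings = new SimpleBindings();
            bindings.put("x", Boolean.TRUE);
            bindings.put("y", y);
            out.append(compiled.eval(bindings)).append(' ');
        }
        return out.toString().trim();
    }

    public static void main(String[] args) throws ScriptException {
        System.out.println(demo()); // prints "true false"
    }
}
```

Only the two abstract methods of CompiledScript, eval(ScriptContext) and getEngine(), need to be implemented; the eval(Bindings) overload used in demo() is inherited.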

Once the script code is compiled, the client Java program can repeatedly execute the compiled
code by calling the eval() method on the CompiledScript instance that represents the compilation
result. In our case, as shown in the code excerpt from BoolScriptHostApp listed above, when we
call the eval() method on the CompiledScript instance, we pass in a Bindings instance that
contains the values for variables x and y.

The eval() method of CompiledScript is not the only way to execute compiled script code. If the
script engine implements the Invocable interface, we can call the invoke() method of the
Invocable interface to execute compiled script code too. In our simple example, there might not
seem to be any difference between using CompiledScript and Invocable for script code execution.
In practice, however, users of a script engine will use CompiledScript to execute a whole script
file and Invocable to execute individual functions (methods in Java terms) in a script. The
difference is easy to see if we look at the invoke() method of Invocable. Unlike the eval() method
of CompiledScript, which takes an optional script context as a parameter, the invoke() method
takes as a parameter the name of the particular function you’d like to invoke in the compiled
script.

In the code excerpt from BoolScriptHostApp above, bsEngine is an instance of ScriptEngine that
we know also implements the Invocable interface. We cast it to an instance of Invocable and call
its invoke() method. Invoking a compiled script function is much like invoking a Java method
using Java Reflection. You have to tell the invoke() method the name of the function you want to
invoke and you also need to supply the invoke() method with the parameters required by the
function. We know that in our generated Java code, the method name is hard coded as “eval”. So
we pass the string “eval” as the first parameter to invoke(). We also know that the eval() method
takes two Boolean values as its input parameters. So we pass two Boolean values to invoke() as
well.
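
The comparison with Java Reflection can be made concrete. The sketch below calls a static eval(boolean, boolean) method by name, much as an Invocable implementation might do against the generated class. The nested TempBoolClass is a stand-in modeled on the generated code shown earlier, and the wrapper class ReflectiveInvoke is hypothetical. Note also that in the released javax.script API the corresponding Invocable method is named invokeFunction(String name, Object... args) rather than invoke().

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class ReflectiveInvoke {

    /** Stand-in for the generated class; its shape follows the listing shown earlier. */
    static class TempBoolClass {
        public static List<Boolean> eval(boolean x, boolean y) {
            List<Boolean> resultList = new ArrayList<>();
            resultList.add(x & y);
            return resultList;
        }
    }

    /** Looks up a static method by name and calls it with the given boolean arguments. */
    @SuppressWarnings("unchecked")
    public static List<Boolean> invoke(String functionName, boolean x, boolean y)
            throws ReflectiveOperationException {
        Method method =
                TempBoolClass.class.getMethod(functionName, boolean.class, boolean.class);
        return (List<Boolean>) method.invoke(null, x, y); // null receiver: the method is static
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(invoke("eval", true, false)); // prints "[false]"
    }
}
```

As with invoke() on Invocable, the caller supplies the function name as a string plus the arguments the function expects; a wrong name surfaces at run time (here as NoSuchMethodException) rather than at compile time.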

Conclusion

In this article, we’ve covered several major areas of JSR 223 such as the Script Engine Discovery
Mechanism, Java Bindings, Compilable and Invocable. One part of JSR 223 not mentioned in this
article is Web Scripting. If we implement Web Scripting in the BoolScript engine, then clients of
our script engine will be able to use it to generate web content in a Servlet container. Developing
a language compiler or interpreter is a huge undertaking, let alone integrating it with Java.
Depending on the complexity of the language you want to design, developing the compiler
or interpreter can still be a daunting task. Thanks to JSR 223, however, integrating your
language with Java has never been easier.

Resources

Bean Scripting Framework (BSF): http://jakarta.apache.org/bsf/

A JSR 223 script engine for Scheme: http://sourceforge.net/projects/model4lang

GoF’s Design Patterns book, ISBN 0201633612, Addison-Wesley, 1995

JSR 223 Specifications: http://www.jcp.org/en/jsr/detail?id=223
