JNode Command and Syntax APIs

This page is an overview of the JNode APIs that are involved in the new syntax mechanisms. For more nitty-gritty details, please refer to the relevant javadocs.
Note:

  1. These APIs still change a bit from time to time. (But if your code is in the JNode code base, you won't need to deal with these changes.)
  2. The javadocs on the JNode website currently do not include the "shell" APIs.
    You can generate the javadocs in a JNode build sandbox by running "./build.sh javadoc".

  3. If the javadocs are inadequate, please let us know via a JNode "bug" request.

Java package structure

The following classes mostly reside in the "org.jnode.shell.syntax" package. The exceptions are "Command" and "AbstractCommand" which live in "org.jnode.shell". (Similarly named classes in the "org.jnode.shell.help" and "org.jnode.shell.help.args" packages are part of the old-style syntax support.)

Command
The JNode command shell (or more accurately, the command invokers) understand two entry points for launching classes as "commands". The first entry point is the "public static void main(String[])" entry point used by classic Java command line applications. When a command class has (just) a "main" method, the shell will launch it by calling the method, passing the command arguments. What happens next is up to the command class:

  • A non-JNode application will typically deal with the command arguments itself, or using some third party class like "gnu.getopt.GetOpt".
  • A JNode-aware application can also use the old-style syntax method directly, by calling a "Help.Info" object's "parse(String[])" method on the argument strings.

The preferred entry point for a JNode command class is the "Command.execute(CommandLine, InputStream, PrintStream, PrintStream)" method. On the face of it, this entry point offers a number of advantages over the "main" entry point:

  • The "execute" method provides command's IO streams explicitly, rather than relying on the "System.{in,out,err}" statics. (Those statics are problematic, unless you are using proclets or isolates.)
  • The "execute" method gives the application access to more information gleaned from the command line; e.g. the command name (alias) supplied by the user.

Unless you are using the "default" command invoker, a command class with an "execute" entry point will be invoked via that entry point, even it it also has a "main" entry point. What happens next is up to the command class:

  • The "execute" method may fetch the user's argument strings from the CommandLine object and do its own argument analysis.
  • If the command class is designed to use old-style syntax mechanisms, the "execute" method will typically call the "parse(String[])" method and proceed as described above.
  • If the command class is designed to use new-style syntax mechanisms, argument analysis will already have been done. This can only happen if the command class extends the AbstractCommand class; see below.

AbstractCommand

The AbstractCommand class is a base class for JNode-aware command classes. For command classes that do their own argument processing, or that use the old-stle syntax mechanisms, use of this class is optional. For commands that want to use the new-style syntax mechanisms, the command class must be a direct or indirect subclass of AbstractCommand.

The AbstractCommand class provides helper methods useful to all command class.

  • The "exit(int)" method can be called from the command thread terminate command execution with an return code. This is roughly equivalent to a classic Java application calling "System.exit(int)".
  • The "getInput()", "getOutput()", "getError()" and "getIO(int)" methods return "CommandIO" instances that can be used to get a command's "standard io" streams as
    Java Input/OutputStream or Reader/Writer objects.

The "getCommandLine" method returns a CommandLine instance that holds the command's command name and unparsed arguments.

But more importantly, the AbstractCommand class provides infrastructure that is key to the new-style syntax mechanism. Specifically, the AbstractCommand maintains an ArgumentBundle for each command instance. The ArgumentBundle is created when either of the following happens:

  1. The child class constructor chains the AbstractCommand(String) constructor. In this case an (initially) empty ArgumentBundle is created.
  2. The child class constructor calls the "registerArgument(Argument ...)" method. In this case, an ArgumentBundle is created (if necessary) and the arguments are added to it.

If it was created, the ArgumentBundle is populated with argument values before the "execute" method is called. The existence of an ArgumentBundle determines whether the shell uses old-style or new-style syntax, for command execution and completion. (Don't try to mix the two mechanisms: it is liable to lead to inconsistent command behavior.)

Finally, the AbstractCommand class provides an "execute(String[])" method. This is intended to provide a bridge between the "main" and "execute" entry points for situations where a JNode-aware command class has to be executed via the former entry point. The "main" method should be implemented as follows:

    public static void main(String[] args) throws Exception {
        new XxxClass().execute(args);
    }

CommandIO and its implementation classes

The CommandIO interfaces and its implementation classes allow commands to obtain "standard io" streams without knowing whether the underlying data streams are byte or character oriented. This API also manages the creation of 'print' wrappers.

Argument and sub-classes

The Argument classes play a central place in the new syntax mechanism. As we have seen above, the a command class creates Argument instances to act as value holders for its formal arguments, and adds them to its ArgumentBundle. When the argument parser is invoked, traverses the command syntax and binds values to the Arguments in the bundle. When the command's "execute" entry point is called, the it can access the values bound to the Arguments.

The most important methods in the Argument API are as follows:

  • The "accept(Token)" method is called by the parser when it has a candidate token for the Argument. If the supplied Token is acceptable, the Argument uses "addValue(...)" to add the Token to its collection. If it is not acceptable, "SyntaxErrorException" is thrown.
  • The "doAccept(Token)" abstract method is called by "accept" after it has done the multiplicity checks. It is required to either return a non-null value, or throw an exception; typically SyntaxErrorException.
  • In completion mode, the parser calls the "complete(...)" method to get Argument specific completions for a partial argument. The "complete" method is supplied a CompletionInfo object, and should use it to record any completions.
  • The "isSet()", "getValue()" and "getValues()" methods are called by a command class to obtain the value of values bouond to an Argument.

The constructors for the descendent classes of Argument provide the following common parameters:

  • The "label" parameter provides a name for the Attribute that is used to bind the Argument to Syntax elements. It must be unique in the context of the command's ArgumentBundle.
  • The "flags" parameter specify the Argument's multiplicity; i.e how many values are allowed or required for the Argument. The allowed flag values are defined in the Argument class. A well-formed "flags" parameter consists of OPTIONAL or MANDATORY "or-ed" with SINGLE or MULTIPLE.
  • The "description" parameter gives a default description for the Argument that can be used in "help" messages.

The descendent classes of Argument correspond to different kinds of argument. For example:

  • StringArgument accepts any String value,
  • IntegerArgument accepts and (in some cases) completes an Integer value,
  • FileArgument accepts a pathname argument and completes it against paths for existing objects in the file system, and
  • DeviceArgument accepts a device name and completes it against the registered device names.

There are two abstract sub-classes of Argument:

  • EnumArgument accepts values for a given Java enum.
  • MappedArgument accepts values based on a String to value mapping supplied as a Java Map.

Please refer to the javadoc for an up-to-date list of the Argument classes.

Syntax and sub-classes

As we have seen above, Argument instances are used to specify the command class'es argument requirements. These Arguments correspond to nodes in one or more syntaxes for the command. These syntaxes are represented in memory by the Syntax classes.

A typical command class does not see Syntax objects. They are typically created by loading XML (as specified here), and are used by various components of the shell. As such, the APIs need not concern the application developer.

ArgumentBundle

This class is largely internal, and a JNode application programmer doesn't need to access it directly. Its purpose is to act as the container for the new-style Argument instances that belong to a command class instance.

MuSyntax and sub-classes

The MuSyntax class and its subclasses represent the BNF-like syntax graphs that the command argument parser actually operate on. These graphs are created by the "prepare" method of new-style Syntax objects, in two stages. The first stage is to build a tree of MuSyntax objects, using symbolic references to represent cycles. The second stage is to traverse the tree, replacing the symbolic references with their referents.

There are currently 6 kinds of MuSyntax node:

  • MuSymbol - this denotes a symbol (keyword) in the syntax. When a MuSymbol is match, no argument capture takes place.
  • MuArgument - this denotes a placeholder for an Argument in the syntax. When a MuArgument is encountered, the corresponding Argument's "accept" method is called to see if the current token is acceptable. If it is, the token is bound to the Argument; otherwise the parser starts backtracking.
  • MuPreset - this is a variation on a MuArgument in which a "preset" token is passed to the Argument. Unlike MuArgument and MuSymbol, a MuPreset does not cause the parser to advance to the next token.
  • MuSequence - this denotes that a list of child MuSyntax nodes must be matches in a given sequence.
  • MuAlternation - this denotes that a list of child MuSyntax nodes must be tried one at a time in a given order.
  • MuBackReference - this denotes a reference to an ancestor node in the MuSyntax tree. These nodes are replaced with their referents before parsing takes place.

MuParser

The MuParser class does the real work of command line parsing. The "parse" method takes input parameters that provide a MuSyntax graph, a TokenSource and some control parameters.

The parser maintains three stacks:

  • The "syntaxStack" holds the current "productions" waiting to be matched against the token stream.
  • The "choicePointStack" holds "choicePoint" objects that represent alternates that the parser hasn't tried yet. The choicepoints also record the state of the "syntaxStack" when the alternation was encountered, and top of the "argsModified" stack.
  • The "argsModified" stack keeps track of the Arguments that need to be "unbound" when the parser backtracks.

In normal parsing mode, the "parse" method matches tokens until either the parse is complete, or an error occurs. The parse is complete if the parser reaches the end of the token stream and discovers that the syntax stack is also empty. The "parse" method then returns, leaving the Arguments bound to the relevant source tokens. The error case occurs when a MuSyntax does not match the current token, or the parser reaches the end of the TokenSource when there are still unmached MuSyntaxes on the syntax stack. In this case, the parser backtracks to the last "choicepoint" and then resumes parsing with the next alternative. If no choicepoints are left, the parse fails.

In completion mode, the "parse" method behaves differently when it encounters the end of the TokenSource. The first thing it does is to attempt to capture a completion; e.g. by calling the current Argument's "complete(...)" method. Then itstarts backtracking to find more completions. As a result, a completion parse may do a lot more work than a normal parse.

The astute reader may be wondering what happens if the "MuParser.parse" method is applied to a pathological MuSyntax; e.g. one which loops for ever, or that requires exponential backtracking. The answer is that the "parse" method has a "stepLimit" parameter that places an upper limit on the number of main loop iterations that the parser will perform. This indirectly addresses the issue of space usage as well, though we could probably improve on this. (In theory, we could analyse the MuSyntax for common pathologies, but this would degrade parser performance for non-pathological MuSyntaxes. Besides, we are not (currently) allowing applications to supply MuSyntax graphs directly, so all we really need to do is ensure that the Syntax classes generate well-behaved MuSyntax graphs.)