org.modeshape.common.text
Class TokenStream

java.lang.Object
  extended by org.modeshape.common.text.TokenStream
Direct Known Subclasses:
DdlTokenStream

@NotThreadSafe
public class TokenStream
extends Object

A foundation for basic parsers that tokenizes input content and allows parsers to easily access and use those tokens. A TokenStream object literally represents the stream of TokenStream.Token objects that each represent a word, symbol, comment or other lexically-relevant piece of information. This simple framework makes it very easy to create a parser that walks through (or "consumes") the tokens in the order they appear and does something useful with that content (usually creating another representation of the content, such as some domain-specific Abstract Syntax Tree or object model).

The parts

This simple framework consists of a couple of pieces that fit together to do the whole job of parsing input content.

The TokenStream.Tokenizer is responsible for consuming the character-level input content and constructing TokenStream.Token objects for the different words, symbols, or other meaningful elements contained in the content. Each Token object is a simple object that records the character(s) that make up the token's value, but it does this in a very lightweight and efficient way by pointing to the original character stream. Each token can be assigned a parser-specific integral token type that may make it easier to quickly figure out later in the process what kind of information each token represents. The general idea is to keep the Tokenizer logic very simple, and very often Tokenizers will merely look for the different kinds of characters (e.g., symbols, letters, digits, etc.) as well as things like quoted strings and comments. Note that Tokenizers are never called directly by the parser; they are always given to the TokenStream, which then calls the Tokenizer at the appropriate time.
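
The idea of a token that points into the original content instead of copying it can be sketched roughly as follows. This is only an illustration under stated assumptions, not the ModeShape implementation: the class name SpanToken, its fields, and its methods are all hypothetical.

```java
// Hypothetical stand-in for a lightweight token; not the ModeShape class.
final class SpanToken {
    private final String content; // shared reference to the full input
    private final int startIndex; // inclusive
    private final int endIndex;   // exclusive
    private final int type;       // parser-specific token type

    SpanToken(String content, int startIndex, int endIndex, int type) {
        this.content = content;
        this.startIndex = startIndex;
        this.endIndex = endIndex;
        this.type = type;
    }

    int type() { return type; }

    int length() { return endIndex - startIndex; }

    // The value is materialized only when somebody asks for it.
    String value() { return content.substring(startIndex, endIndex); }

    // Compare against an expected value without creating a substring.
    boolean matches(String expected) {
        return expected.length() == length()
            && content.regionMatches(startIndex, expected, 0, expected.length());
    }
}
```

Because every token shares the same content reference, creating a token costs a few integers rather than a copy of its characters.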

The TokenStream is supplied the input content, a Tokenizer implementation, and a few options. Its job is to prepare the content for processing, call the Tokenizer implementation to create the series of Token objects, and then provide an interface for walking through and consuming the tokens. This interface makes it possible to discover the value and type of the current token, and to consume the current token and move to the next token. The interface has also been designed to keep the code that works with the tokens as readable as possible.

The final component in this framework is the Parser. The parser is really any class that takes as input the content to be parsed and that outputs some meaningful information. The parser will do this by defining the Tokenizer, constructing a TokenStream object, and then using the TokenStream to walk through the sequence of Tokens and produce some meaningful representation of the content. Parsers can create instances of some object model, or they can create a domain-specific Abstract Syntax Tree representation.

The benefit of breaking the responsibility along these lines is that the TokenStream implementation is able to encapsulate quite a bit of very tedious and very useful functionality, while still allowing a lot of flexibility as to what makes up the different tokens. It also makes the parser very easy to write and read (and thus maintain), without placing very many restrictions on how that logic is to be defined. Plus, because the TokenStream takes responsibility for tracking the positions of every token (including line and column numbers), it can automatically produce meaningful errors.
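
To illustrate how position tracking leads to meaningful errors, here is a minimal sketch that derives a line and column from a character index and uses them in a message. The LineColumn class and its methods are hypothetical and are not part of the framework; the real Position type is not shown in this documentation.

```java
// Hypothetical helper: derive 1-based line and column for a character index.
final class LineColumn {
    final int line;
    final int column;

    LineColumn(int line, int column) {
        this.line = line;
        this.column = column;
    }

    static LineColumn of(String content, int index) {
        int line = 1;
        int column = 1;
        for (int i = 0; i < index && i < content.length(); i++) {
            char c = content.charAt(i);
            if (c == '\n') {
                line++;
                column = 1;
            } else if (c == '\r') {
                // Treat "\r\n" as one line break; it is counted at the '\n'.
                if (i + 1 < content.length() && content.charAt(i + 1) == '\n') continue;
                line++;
                column = 1;
            } else {
                column++;
            }
        }
        return new LineColumn(line, column);
    }

    // Build an error message that tells the user where the problem is.
    String describe(String message) {
        return message + " at line " + line + ", column " + column;
    }
}
```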

Consuming tokens

A parser works with the tokens on the TokenStream using a variety of methods:

  • The matchesAnyOf(String, String...) methods look at the current token and check whether the token's value matches at least one of the values supplied as method parameters. The method then returns whether there was a match, but does not advance the current token pointer. Similarly, the matchesAnyOf(int, int...) method checks the token's type rather than the value.
With these methods, it's very easy to create a parser that looks at the current token to decide what to do, then consumes that token, and repeats this process.
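
The difference between looking at a token and consuming it can be sketched with a tiny stand-in that works on pre-split values. MiniTokenStream is hypothetical and only mimics the contract described in this documentation; in particular it throws IllegalStateException where the real class reports a ParsingException.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in over pre-split tokens; not the ModeShape TokenStream.
final class MiniTokenStream {
    private final List<String> tokens;
    private int current = 0;

    MiniTokenStream(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    boolean hasNext() { return current < tokens.size(); }

    // Looks at the current token; never advances.
    boolean matches(String expected) {
        return hasNext() && tokens.get(current).equals(expected);
    }

    // Looks at the current token; never advances.
    boolean matchesAnyOf(String first, String... others) {
        if (matches(first)) return true;
        for (String other : others) {
            if (matches(other)) return true;
        }
        return false;
    }

    // Advances only when the current token matches.
    boolean canConsume(String expected) {
        if (!matches(expected)) return false;
        current++;
        return true;
    }

    // Returns the current value and always advances.
    String consume() {
        if (!hasNext()) throw new IllegalStateException("No more content");
        return tokens.get(current++);
    }

    // Advances on a match; fails loudly otherwise.
    void consume(String expected) {
        if (!canConsume(expected)) {
            throw new IllegalStateException("Expected " + expected);
        }
    }
}
```

A parser typically uses the matching methods to choose a branch, canConsume(...) for optional elements, and consume(...) for elements the grammar requires.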

    Example parser

    Here is an example of a very simple parser that parses very simple and limited SQL SELECT and DELETE statements, such as SELECT * FROM Customers or SELECT Name, StreetAddress AS Address, City, Zip FROM Customers or DELETE FROM Customers WHERE Zip=12345:

     public class SampleSqlSelectParser {
         public List<Statement> parse( String ddl ) {
             TokenStream tokens = new TokenStream(ddl, new SqlTokenizer(), false);
             List<Statement> statements = new LinkedList<Statement>();
             tokens.start();
             while (tokens.hasNext()) {
                 if (tokens.matches("SELECT")) {
                     statements.add(parseSelect(tokens));
                 } else {
                     statements.add(parseDelete(tokens));
                 }
             }
             return statements;
         }
     
         protected Select parseSelect( TokenStream tokens ) throws ParsingException {
             tokens.consume("SELECT");
             List<Column> columns = parseColumns(tokens);
             tokens.consume("FROM");
             String tableName = tokens.consume();
             return new Select(tableName, columns);
         }
     
         protected List<Column> parseColumns( TokenStream tokens ) throws ParsingException {
             List<Column> columns = new LinkedList<Column>();
             if (tokens.matches('*')) {
                 tokens.consume(); // leave the columns empty to signal wildcard
             } else {
                 // Read names for as long as each one is followed by a ','
                 do {
                     String columnName = tokens.consume();
                     if (tokens.canConsume("AS")) {
                         String columnAlias = tokens.consume();
                         columns.add(new Column(columnName, columnAlias));
                     } else {
                         columns.add(new Column(columnName, null));
                     }
                 } while (tokens.canConsume(','));
             }
             return columns;
         }
     
         protected Delete parseDelete( TokenStream tokens ) throws ParsingException {
             tokens.consume("DELETE", "FROM");
             String tableName = tokens.consume();
             tokens.consume("WHERE");
             String lhs = tokens.consume();
             tokens.consume('=');
             String rhs = tokens.consume();
             return new Delete(tableName, new Criteria(lhs, rhs));
         }
      }
      public abstract class Statement { ... }
      public class Select extends Statement { ... }
      public class Delete extends Statement { ... }
      public class Column { ... }
     
    This example shows an idiomatic way of writing a parser that is stateless and thread-safe. The parse(...) method takes the input as a parameter, and returns the domain-specific representation that resulted from the parsing. All other methods are utility methods that simply encapsulate common logic or make the code more readable.

    In the example, the parse(...) method first creates a TokenStream object (using a Tokenizer implementation that is not shown), and then loops as long as there are more tokens to read. As it loops, if the next token is "SELECT", the parser calls the parseSelect(...) method, which immediately consumes a "SELECT" token, the names of the columns separated by commas (or a '*' if all columns are to be selected), a "FROM" token, and the name of the table being queried. The parseSelect(...) method returns a Select object, which is then added to the list of statements in the parse(...) method. The parser handles the "DELETE" statements in a similar manner.

    Case sensitivity

    Very often grammars do not require the case of keywords to match. This can make parsing a challenge, because every combination of upper and lower case has to be handled. The TokenStream framework provides a very simple solution that requires no more effort than providing a boolean parameter to the constructor.

    When a false value is provided for the caseSensitive parameter, the TokenStream performs all matching operations as if each token's value were in uppercase only. This means that the arguments supplied to the matches(...), canConsume(...), and consume(...) methods should be upper-cased. Note that the actual value of each token retains the case in which it appears in the input.

    Of course, when the TokenStream is created with a true value for the caseSensitive parameter, the matching is performed using the actual value as it appears in the input content.
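
One way to picture case-insensitive matching is a token that matches against an upper-cased copy of the input while reporting its value from the original. The sketch below is an assumption about how such matching could work, loosely modeled on the inputString and inputUppercased fields listed in the Field Summary; the class CaseInsensitiveSpan is hypothetical.

```java
import java.util.Locale;

// Hypothetical sketch of case-insensitive matching; not the ModeShape code.
final class CaseInsensitiveSpan {
    private final String input;           // original content
    private final String inputUppercased; // same content, upper-cased once
    private final int start;              // inclusive
    private final int end;                // exclusive

    CaseInsensitiveSpan(String input, int start, int end) {
        this.input = input;
        // Assumes upper-casing keeps the length unchanged (true for ASCII),
        // so the same indexes are valid in both strings.
        this.inputUppercased = input.toUpperCase(Locale.ROOT);
        this.start = start;
        this.end = end;
    }

    // Matching looks only at the upper-cased copy, so callers pass upper case.
    boolean matches(String expectedUppercase) {
        return inputUppercased.substring(start, end).equals(expectedUppercase);
    }

    // The reported value keeps the case used in the input.
    String value() {
        return input.substring(start, end);
    }
}
```

This also shows why lower-case arguments never match in case-insensitive mode: the comparison is made against upper-cased content only.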

    Whitespace

    Many grammars are independent of line breaks or whitespace, allowing a lot of flexibility when writing the content. The TokenStream framework makes it very easy to ignore line breaks and whitespace. To do so, the Tokenizer implementation must simply not include the line break character sequences and whitespace in the token ranges. Since none of the tokens contain whitespace, the parser never has to deal with them.

    Of course, many parsers will require that some whitespace be included. For example, whitespace within a quoted string may be needed by the parser. In this case, the Tokenizer should simply include the whitespace characters in the tokens.

    Writing a Tokenizer

    Each parser will likely have its own TokenStream.Tokenizer implementation that contains the parser-specific logic about how to break the content into token objects. Generally, the easiest way to do this is to simply iterate through the character sequence passed into the tokenize(...) method, and use a switch statement to decide what to do.

    Here is the code for a very basic Tokenizer implementation that ignores whitespace, line breaks and Java-style (multi-line and end-of-line) comments, while constructing single tokens for each quoted string.

      public class BasicTokenizer implements Tokenizer {
          public void tokenize( CharacterStream input,
                                Tokens tokens ) throws ParsingException {
              while (input.hasNext()) {
                  char c = input.next();
                  switch (c) {
                      case ' ':
                      case '\t':
                      case '\n':
                      case '\r':
                          // Just skip these whitespace characters ...
                          break;
                      case '-':
                      case '(':
                      case ')':
                      case '{':
                      case '}':
                      case '*':
                      case ',':
                      case ';':
                      case '+':
                      case '%':
                      case '?':
                      case '$':
                      case '[':
                      case ']':
                      case '!':
                      case '<':
                      case '>':
                      case '|':
                      case '=':
                      case ':':
                          tokens.addToken(input.index(), input.index() + 1, SYMBOL);
                          break;
                      case '.':
                          tokens.addToken(input.index(), input.index() + 1, DECIMAL);
                          break;
                      case '\"':
                          int startIndex = input.index();
                          Position startingPosition = input.position();
                          boolean foundClosingQuote = false;
                          while (input.hasNext()) {
                              c = input.next();
                              if (c == '\\' && input.isNext('"')) {
                                  c = input.next(); // consume the " character since it is escaped
                              } else if (c == '"') {
                                  foundClosingQuote = true;
                                  break;
                              }
                          }
                          if (!foundClosingQuote) {
                              throw new ParsingException(startingPosition, "No matching closing double quote found");
                          }
                          int endIndex = input.index() + 1; // beyond last character read
                          tokens.addToken(startIndex, endIndex, DOUBLE_QUOTED_STRING);
                          break;
                      case '\'':
                          startIndex = input.index();
                          startingPosition = input.position();
                          foundClosingQuote = false;
                          while (input.hasNext()) {
                              c = input.next();
                              if (c == '\\' && input.isNext('\'')) {
                                  c = input.next(); // consume the ' character since it is escaped
                              } else if (c == '\'') {
                                  foundClosingQuote = true;
                                  break;
                              }
                          }
                          if (!foundClosingQuote) {
                              throw new ParsingException(startingPosition, "No matching closing single quote found");
                          }
                          endIndex = input.index() + 1; // beyond last character read
                          tokens.addToken(startIndex, endIndex, SINGLE_QUOTED_STRING);
                          break;
                      case '/':
                          startIndex = input.index();
                          if (input.isNext('/')) {
                              // End-of-line comment ...
                              boolean foundLineTerminator = false;
                              while (input.hasNext()) {
                                  c = input.next();
                                  if (c == '\n' || c == '\r') {
                                      foundLineTerminator = true;
                                      break;
                                  }
                              }
                              endIndex = input.index(); // the token won't include the '\n' or '\r' character(s)
                              if (!foundLineTerminator) ++endIndex; // must point beyond last char
                              if (c == '\r' && input.isNext('\n')) input.next();
                              if (useComments) {
                                  tokens.addToken(startIndex, endIndex, COMMENT);
                              }
                          } else if (input.isNext('*')) {
                              // Multi-line comment ...
                              while (input.hasNext() && !input.isNext('*', '/')) {
                                  c = input.next();
                              }
                              if (input.hasNext()) input.next(); // consume the '*'
                              if (input.hasNext()) input.next(); // consume the '/'
                              if (useComments) {
                                  endIndex = input.index() + 1; // the token will include the '/' and '*' characters
                                  tokens.addToken(startIndex, endIndex, COMMENT);
                              }
                          } else {
                              // just a regular slash ...
                              tokens.addToken(startIndex, startIndex + 1, SYMBOL);
                          }
                          break;
                      default:
                          startIndex = input.index();
                          // Read until another whitespace/symbol/decimal/slash is found
                          while (input.hasNext() && !(input.isNextWhitespace() || input.isNextAnyOf("/.-(){}*,;+%?$[]!<>|=:"))) {
                              c = input.next();
                          }
                          endIndex = input.index() + 1; // beyond last character that was included
                          tokens.addToken(startIndex, endIndex, WORD);
                  }
              }
          }
      }
     
    A Tokenizer with exactly this behavior can be obtained from the basicTokenizer(boolean) method. So although this very basic implementation is not meant to be used in all situations, it may be useful in some.


    Nested Class Summary
    static class TokenStream.BasicTokenizer
              A basic TokenStream.Tokenizer implementation that ignores whitespace but includes tokens for individual symbols, the period ('.'), single-quoted strings, double-quoted strings, whitespace-delimited words, and optionally comments.
    protected  class TokenStream.CaseInsensitiveToken
               
     class TokenStream.CaseInsensitiveTokenFactory
               
    protected  class TokenStream.CaseSensitiveToken
              An immutable TokenStream.Token that implements matching using case-sensitive logic.
     class TokenStream.CaseSensitiveTokenFactory
               
    static class TokenStream.CharacterArrayStream
              An implementation of TokenStream.CharacterStream that works with a single character array.
    static interface TokenStream.CharacterStream
              Interface used by a TokenStream.Tokenizer to iterate through the characters in the content input to the TokenStream.
    static interface TokenStream.Token
              The interface defining a token, which references the characters in the actual input character stream.
    protected  class TokenStream.TokenFactory
               
    static interface TokenStream.Tokenizer
              Interface for a Tokenizer component responsible for processing the characters in a TokenStream.CharacterStream and constructing the appropriate TokenStream.Token objects.
    static interface TokenStream.Tokens
              A factory for Token objects, used by a TokenStream.Tokenizer to create tokens in the correct order.
     
    Field Summary
    static int ANY_TYPE
              A constant that can be used with the matches(int), matches(int, int...), consume(int), and canConsume(int) methods to signal that any token type is allowed to be matched.
    static String ANY_VALUE
              A constant that can be used with the matches(String), matches(String, String...), consume(String), consume(String, String...), canConsume(String) and canConsume(String, String...) methods to signal that any value is allowed to be matched.
    protected  String inputString
               
    protected  String inputUppercased
               
     
    Constructor Summary
    TokenStream(String content, TokenStream.Tokenizer tokenizer, boolean caseSensitive)
               
     
    Method Summary
    static TokenStream.BasicTokenizer basicTokenizer(boolean includeComments)
              Obtain a basic TokenStream.Tokenizer implementation that ignores whitespace but includes tokens for individual symbols, the period ('.'), single-quoted strings, double-quoted strings, whitespace-delimited words, and optionally comments.
     boolean canConsume(char expected)
              Attempt to consume this current token if it matches the expected value, and return whether this method was indeed able to consume the token.
     boolean canConsume(int expectedType)
              Attempt to consume this current token if it matches the expected token type, and return whether this method was indeed able to consume the token.
     boolean canConsume(Iterable<String> nextTokens)
              Attempt to consume this current token and the next tokens if and only if they match the expected values, and return whether this method was indeed able to consume all of the supplied tokens.
     boolean canConsume(String expected)
              Attempt to consume this current token if it matches the expected value, and return whether this method was indeed able to consume the token.
     boolean canConsume(String[] nextTokens)
              Attempt to consume this current token and the next tokens if and only if they match the expected values, and return whether this method was indeed able to consume all of the supplied tokens.
     boolean canConsume(String currentExpected, String... expectedForNextTokens)
              Attempt to consume this current token and the next tokens if and only if they match the expected values, and return whether this method was indeed able to consume all of the supplied tokens.
     boolean canConsumeAnyOf(int[] typeOptions)
              Attempt to consume the next token if it matches one of the supplied types.
     boolean canConsumeAnyOf(int firstTypeOption, int... additionalTypeOptions)
              Attempt to consume the next token if it matches one of the supplied types.
     boolean canConsumeAnyOf(Iterable<String> options)
              Attempt to consume the next token if it matches one of the supplied values.
     boolean canConsumeAnyOf(String[] options)
              Attempt to consume the next token if it matches one of the supplied values.
     boolean canConsumeAnyOf(String firstOption, String... additionalOptions)
              Attempt to consume the next token if it matches one of the supplied values.
     String consume()
              Return the value of this token and move to the next token.
     void consume(char expected)
              Attempt to consume this current token as long as it matches the expected character, or throw an exception if the token does not match.
     void consume(int expectedType)
              Attempt to consume this current token as long as it matches the expected token type, or throw an exception if the token does not match.
     void consume(Iterable<String> nextTokens)
              Attempt to consume this current token and the next tokens as long as they match the expected values, or throw an exception if the tokens do not match.
     void consume(String expected)
              Attempt to consume this current token as long as it matches the expected value, or throw an exception if the token does not match.
     void consume(String[] nextTokens)
              Attempt to consume this current token and the next tokens as long as they match the expected values, or throw an exception if the tokens do not match.
     void consume(String expected, String... expectedForNextTokens)
              Attempt to consume this current token and the next tokens as long as they match the expected values, or throw an exception if the tokens do not match.
     boolean consumeBoolean()
              Convert the value of this token to a boolean, return it, and move to the next token.
     int consumeInteger()
              Convert the value of this token to an integer, return it, and move to the next token.
     long consumeLong()
              Convert the value of this token to a long, return it, and move to the next token.
     String getContentBetween(Position starting, Position end)
              Gets the content string starting at the first position (inclusive) and continuing up to the end position (exclusive).
     boolean hasNext()
              Determine if this stream has another token to be consumed.
    protected  List<TokenStream.Token> initializeTokens(List<TokenStream.Token> tokens)
              Method to allow subclasses to preprocess the set of tokens and return the correct tokens to use.
     boolean matches(char expected)
              Determine if the current token matches the expected value.
     boolean matches(int expectedType)
              Determine if the current token matches the expected token type.
     boolean matches(int[] typesForNextTokens)
              Determine if the next few tokens have the supplied types.
     boolean matches(int currentExpectedType, int... expectedTypeForNextTokens)
              Determine if the next few tokens have the supplied types.
     boolean matches(Iterable<String> nextTokens)
              Determine if the next few tokens match the expected values.
     boolean matches(String expected)
              Determine if the current token matches the expected value.
     boolean matches(String[] nextTokens)
              Determine if the next few tokens match the expected values.
     boolean matches(String currentExpected, String... expectedForNextTokens)
              Determine if the next few tokens match the expected values.
     boolean matchesAnyOf(int[] typeOptions)
              Determine if the next token has one of the supplied types.
     boolean matchesAnyOf(int firstTypeOption, int... additionalTypeOptions)
              Determine if the next token has one of the supplied types.
     boolean matchesAnyOf(Iterable<String> options)
              Determine if the next token matches one of the supplied values.
     boolean matchesAnyOf(String[] options)
              Determine if the next token matches one of the supplied values.
     boolean matchesAnyOf(String firstOption, String... additionalOptions)
              Determine if the next token matches one of the supplied values.
     Position nextPosition()
              Get the position of the next (or current) token.
     Position previousPosition()
              Get the position of the previous token.
     void rewind()
              Method to allow tokens to be re-used from the start without re-tokenizing content.
     TokenStream start()
              Begin the token stream, including (if required) the tokenization of the input content.
    protected  void throwNoMoreContent()
               
     String toString()
              
     
    Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
     

    Field Detail

    ANY_VALUE

    public static final String ANY_VALUE
    A constant that can be used with the matches(String), matches(String, String...), consume(String), consume(String, String...), canConsume(String) and canConsume(String, String...) methods to signal that any value is allowed to be matched.

    Note that this exact instance must be used; an equivalent string will not work.

    See Also:
    Constant Field Values
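
The requirement that the exact ANY_VALUE instance be used suggests that the wildcard is recognized by reference rather than by content. A minimal sketch of that technique follows; the Wildcard class is hypothetical, and the string "any value" is a placeholder because the real constant's content is not stated here.

```java
// Hypothetical sketch of a wildcard sentinel compared by identity.
final class Wildcard {
    // new String(...) guarantees an instance distinct from any string literal.
    static final String ANY_VALUE = new String("any value");

    static boolean matches(String tokenValue, String expected) {
        // Reference comparison on purpose: only the sentinel instance is a wildcard.
        if (expected == ANY_VALUE) return true;
        return tokenValue.equals(expected);
    }
}
```

With this approach a token whose text happens to equal the sentinel's content is still matched literally, which is why an equivalent string does not act as a wildcard.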

    ANY_TYPE

    public static final int ANY_TYPE
    A constant that can be used with the matches(int), matches(int, int...), consume(int), and canConsume(int) methods to signal that any token type is allowed to be matched.

    See Also:
    Constant Field Values

    inputString

    protected final String inputString

    inputUppercased

    protected final String inputUppercased
    Constructor Detail

    TokenStream

    public TokenStream(String content,
                       TokenStream.Tokenizer tokenizer,
                       boolean caseSensitive)
    Method Detail

    start

    public TokenStream start()
                      throws ParsingException
    Begin the token stream, including (if required) the tokenization of the input content.

    Returns:
    this object for easy method chaining; never null
    Throws:
    ParsingException - if an error occurs during tokenization of the content

    initializeTokens

    protected List<TokenStream.Token> initializeTokens(List<TokenStream.Token> tokens)
    Method to allow subclasses to preprocess the set of tokens and return the correct tokens to use. The default behavior is to simply return the supplied tokens.

    Parameters:
    tokens - the tokens to preprocess
    Returns:
    list of tokens.

    rewind

    public void rewind()
    Method to allow tokens to be re-used from the start without re-tokenizing content.
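
The relationship between start() and rewind() can be sketched with a stand-in that tokenizes once and then replays the cached tokens. RewindableWords is hypothetical, splits on whitespace only, and exposes a counter purely to show that rewinding does not tokenize again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: tokenize once, then replay the cached tokens.
final class RewindableWords {
    private final String content;
    private List<String> tokens; // filled by start()
    private int current;
    int tokenizeCount = 0;       // exposed only to illustrate the point

    RewindableWords(String content) { this.content = content; }

    RewindableWords start() {
        if (tokens == null) {
            tokens = new ArrayList<>();
            for (String word : content.trim().split("\\s+")) {
                if (!word.isEmpty()) tokens.add(word);
            }
            tokenizeCount++;
        }
        current = 0;
        return this; // allows new RewindableWords(...).start()
    }

    boolean hasNext() {
        if (tokens == null) throw new IllegalStateException("start() was not called");
        return current < tokens.size();
    }

    String consume() {
        if (!hasNext()) throw new IllegalStateException("No more content");
        return tokens.get(current++);
    }

    // Move back to the first token without tokenizing again.
    void rewind() { current = 0; }
}
```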


    previousPosition

    public Position previousPosition()
    Get the position of the previous token.

    Returns:
    the previous token's position; never null
    Throws:
    IllegalStateException - if this method was called before the stream was started
    NoSuchElementException - if there is no previous token

    nextPosition

    public Position nextPosition()
    Get the position of the next (or current) token.

    Returns:
    the current token's position; never null
    Throws:
    IllegalStateException - if this method was called before the stream was started
    NoSuchElementException - if there is no next (or current) token

    consumeInteger

    public int consumeInteger()
                       throws ParsingException,
                              IllegalStateException
    Convert the value of this token to an integer, return it, and move to the next token.

    Returns:
    the current token's value, converted to an integer
    Throws:
    ParsingException - if there is no such token to consume, or if the token cannot be converted to an integer
    IllegalStateException - if this method was called before the stream was started
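
As a rough sketch of the conversion step, the token's text can be parsed and any failure rethrown with the token's location attached. The helper below is hypothetical: it takes the line and column as plain integers and throws IllegalArgumentException as a stand-in for the ParsingException documented above.

```java
// Hypothetical sketch; the exception type and message format are stand-ins.
final class IntegerTokens {
    static int toInteger(String tokenValue, int line, int column) {
        try {
            return Integer.parseInt(tokenValue);
        } catch (NumberFormatException e) {
            // Report where the bad token was found, not just what it was.
            throw new IllegalArgumentException(
                "Expected an integer but found '" + tokenValue
                + "' at line " + line + ", column " + column, e);
        }
    }
}
```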

    consumeLong

    public long consumeLong()
                     throws ParsingException,
                            IllegalStateException
    Convert the value of this token to a long, return it, and move to the next token.

    Returns:
    the current token's value, converted to a long
    Throws:
    ParsingException - if there is no such token to consume, or if the token cannot be converted to a long
    IllegalStateException - if this method was called before the stream was started

    consumeBoolean

    public boolean consumeBoolean()
                           throws ParsingException,
                                  IllegalStateException
    Convert the value of this token to a boolean, return it, and move to the next token.

    Returns:
    the current token's value, converted to a boolean
    Throws:
    ParsingException - if there is no such token to consume, or if the token cannot be converted to a boolean
    IllegalStateException - if this method was called before the stream was started

    consume

    public String consume()
                   throws ParsingException,
                          IllegalStateException
    Return the value of this token and move to the next token.

    Returns:
    the value of the current token
    Throws:
    ParsingException - if there is no such token to consume
    IllegalStateException - if this method was called before the stream was started

    throwNoMoreContent

    protected void throwNoMoreContent()
                               throws ParsingException
    Throws:
    ParsingException

    consume

    public void consume(String expected)
                 throws ParsingException,
                        IllegalStateException
    Attempt to consume this current token as long as it matches the expected value, or throw an exception if the token does not match.

    The ANY_VALUE constant can be used in the expected values as a wildcard.

    Parameters:
    expected - the expected value of the current token
    Throws:
    ParsingException - if the current token doesn't match the supplied value
    IllegalStateException - if this method was called before the stream was started

    consume

    public void consume(char expected)
                 throws ParsingException,
                        IllegalStateException
    Attempt to consume this current token as long as it matches the expected character, or throw an exception if the token does not match.

    Parameters:
    expected - the expected character of the current token
    Throws:
    ParsingException - if the current token doesn't match the supplied value
    IllegalStateException - if this method was called before the stream was started

    consume

    public void consume(int expectedType)
                 throws ParsingException,
                        IllegalStateException
    Attempt to consume this current token as long as it matches the expected token type, or throw an exception if the token does not match.

    The ANY_TYPE constant can be used in the expected values as a wildcard.

    Parameters:
    expectedType - the expected token type of the current token
    Throws:
    ParsingException - if the current token doesn't match the supplied value
    IllegalStateException - if this method was called before the stream was started

    consume

    public void consume(String expected,
                        String... expectedForNextTokens)
                 throws ParsingException,
                        IllegalStateException
    Attempt to consume this current token and the next tokens as long as they match the expected values, or throw an exception if any token does not match.

    The ANY_VALUE constant can be used in the expected values as a wildcard.

    Parameters:
    expected - the expected value of the current token
    expectedForNextTokens - the expected values for the following tokens
    Throws:
    ParsingException - if the current token doesn't match the supplied value
    IllegalStateException - if this method was called before the stream was started
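
    The sequence-consuming behavior described above can be illustrated with a minimal sketch. The ConsumeSketch class, its SketchParsingException, its position() helper, and its "<any>" wildcard marker are all hypothetical stand-ins written for this example; they are not the TokenStream implementation. In this sketch, tokens consumed before a mismatch stay consumed, which is an assumption and not something this documentation states about the real class.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a token stream; this is NOT the ModeShape implementation.
class ConsumeSketch {
    // Wildcard marker assumed for this sketch only.
    static final String ANY_VALUE = "<any>";

    // Stand-in for the ParsingException named in the documentation.
    static class SketchParsingException extends RuntimeException {
        SketchParsingException(String message) {
            super(message);
        }
    }

    private final List<String> tokens;
    private int index = 0;

    ConsumeSketch(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    // Consume the current token if it equals 'expected' or 'expected' is the wildcard.
    void consume(String expected) {
        if (index >= tokens.size()) {
            throw new SketchParsingException("No more content; expected '" + expected + "'");
        }
        String actual = tokens.get(index);
        if (!ANY_VALUE.equals(expected) && !actual.equals(expected)) {
            throw new SketchParsingException("Expected '" + expected + "' but found '" + actual + "'");
        }
        index++;
    }

    // Consume the current token, then each following token, in order.
    void consume(String expected, String... expectedForNextTokens) {
        consume(expected);
        for (String next : expectedForNextTokens) {
            consume(next);
        }
    }

    // Number of tokens consumed so far.
    int position() {
        return index;
    }

    public static void main(String[] args) {
        ConsumeSketch stream = new ConsumeSketch("CREATE", "TABLE", "customers", "(");
        // The wildcard accepts whatever table name appears third.
        stream.consume("CREATE", "TABLE", ANY_VALUE);
        System.out.println(stream.position()); // prints 3
    }
}
```

    A parser would typically use this form when a fixed keyword sequence is mandatory at the current point, so that a mismatch is reported as an error and not silently skipped.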

    consume

    public void consume(String[] nextTokens)
                 throws ParsingException,
                        IllegalStateException
    Attempt to consume this current token and the next tokens as long as they match the expected values, or throw an exception if any token does not match.

    The ANY_VALUE constant can be used in the expected values as a wildcard.

    Parameters:
    nextTokens - the expected values for the next tokens
    Throws:
    ParsingException - if the current token doesn't match the supplied value
    IllegalStateException - if this method was called before the stream was started

    consume

    public void consume(Iterable<String> nextTokens)
                 throws ParsingException,
                        IllegalStateException
    Attempt to consume this current token and the next tokens as long as they match the expected values, or throw an exception if any token does not match.

    The ANY_VALUE constant can be used in the expected values as a wildcard.

    Parameters:
    nextTokens - the expected values for the next tokens
    Throws:
    ParsingException - if the current token doesn't match the supplied value
    IllegalStateException - if this method was called before the stream was started

    canConsume

    public boolean canConsume(String expected)
                       throws IllegalStateException
    Attempt to consume this current token if it matches the expected value, and return whether this method was indeed able to consume the token.

    The ANY_VALUE constant can be used in the expected value as a wildcard.

    Parameters:
    expected - the expected value of the current token
    Returns:
    true if the current token did match and was consumed, or false if the current token did not match and therefore was not consumed
    Throws:
    IllegalStateException - if this method was called before the stream was started

    canConsume

    public boolean canConsume(char expected)
                       throws IllegalStateException
    Attempt to consume this current token if it matches the expected value, and return whether this method was indeed able to consume the token.

    Parameters:
    expected - the expected value of the current token
    Returns:
    true if the current token did match and was consumed, or false if the current token did not match and therefore was not consumed
    Throws:
    IllegalStateException - if this method was called before the stream was started

    canConsume

    public boolean canConsume(int expectedType)
                       throws IllegalStateException
    Attempt to consume this current token if it matches the expected token type, and return whether this method was indeed able to consume the token.

    The ANY_TYPE constant can be used in the expected type as a wildcard.

    Parameters:
    expectedType - the expected token type of the current token
    Returns:
    true if the current token did match and was consumed, or false if the current token did not match and therefore was not consumed
    Throws:
    IllegalStateException - if this method was called before the stream was started

    canConsume

    public boolean canConsume(String currentExpected,
                              String... expectedForNextTokens)
                       throws IllegalStateException
    Attempt to consume this current token and the next tokens if and only if they match the expected values, and return whether this method was indeed able to consume all of the supplied tokens.

    This is not the same as calling canConsume(String) for each of the supplied arguments, since this method ensures that all of the supplied values can be consumed.

    This method is equivalent to calling the following:

     
     if (tokens.matches(currentExpected, expectedForNextTokens)) {
         tokens.consume(currentExpected, expectedForNextTokens);
     }
     
     

    The ANY_VALUE constant can be used in the expected values as a wildcard.

    Parameters:
    currentExpected - the expected value of the current token
    expectedForNextTokens - the expected values for the following tokens
    Returns:
    true if all of the tokens did match and were consumed, or false if any token did not match and therefore none were consumed
    Throws:
    IllegalStateException - if this method was called before the stream was started
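
    The difference between this all-or-nothing form and calling canConsume(String) once per value can be seen in a minimal sketch. CanConsumeSketch and its current() helper are hypothetical stand-ins written for this example, not the TokenStream implementation, and the sketch leaves out wildcard handling for brevity.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a token stream; this is NOT the ModeShape implementation.
class CanConsumeSketch {
    private final List<String> tokens;
    private int index = 0;

    CanConsumeSketch(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    // Look ahead without moving: true only if every expected value lines up.
    boolean matches(String... expected) {
        if (index + expected.length > tokens.size()) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (!tokens.get(index + i).equals(expected[i])) {
                return false;
            }
        }
        return true;
    }

    // All-or-nothing: advance past every expected token, or do not move at all.
    boolean canConsume(String... expected) {
        if (!matches(expected)) {
            return false;
        }
        index += expected.length;
        return true;
    }

    // The token that would be consumed next, or null at the end.
    String current() {
        return index < tokens.size() ? tokens.get(index) : null;
    }

    public static void main(String[] args) {
        CanConsumeSketch allAtOnce = new CanConsumeSketch("NOT", "IN", "(");
        System.out.println(allAtOnce.canConsume("NOT", "NULL")); // prints false
        System.out.println(allAtOnce.current());                  // prints NOT

        // One value at a time: "NOT" is consumed before the mismatch on "NULL" is seen.
        CanConsumeSketch oneByOne = new CanConsumeSketch("NOT", "IN", "(");
        boolean both = oneByOne.canConsume("NOT") && oneByOne.canConsume("NULL");
        System.out.println(both);               // prints false
        System.out.println(oneByOne.current()); // prints IN
    }
}
```

    In the second case the stream has already moved past "NOT" even though the overall test failed, which is the situation the multi-value form avoids.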

    canConsume

    public boolean canConsume(String[] nextTokens)
                       throws IllegalStateException
    Attempt to consume this current token and the next tokens if and only if they match the expected values, and return whether this method was indeed able to consume all of the supplied tokens.

    This is not the same as calling canConsume(String) for each of the supplied arguments, since this method ensures that all of the supplied values can be consumed.

    This method is equivalent to calling the following:

     
     if (tokens.matches(nextTokens)) {
         tokens.consume(nextTokens);
     }
     
     

    The ANY_VALUE constant can be used in the expected values as a wildcard.

    Parameters:
    nextTokens - the expected values of the next tokens
    Returns:
    true if all of the tokens did match and were consumed, or false if any token did not match and therefore none were consumed
    Throws:
    IllegalStateException - if this method was called before the stream was started

    canConsume

    public boolean canConsume(Iterable<String> nextTokens)
                       throws IllegalStateException
    Attempt to consume this current token and the next tokens if and only if they match the expected values, and return whether this method was indeed able to consume all of the supplied tokens.

    This is not the same as calling canConsume(String) for each of the supplied arguments, since this method ensures that all of the supplied values can be consumed.

    This method is equivalent to calling the following:

     
     if (tokens.matches(nextTokens)) {
         tokens.consume(nextTokens);
     }
     
     

    The ANY_VALUE constant can be used in the expected values as a wildcard.

    Parameters:
    nextTokens - the expected values of the next tokens
    Returns:
    true if all of the tokens did match and were consumed, or false if any token did not match and therefore none were consumed
    Throws:
    IllegalStateException - if this method was called before the stream was started

    canConsumeAnyOf

    public boolean canConsumeAnyOf(String firstOption,
                                   String... additionalOptions)
                            throws IllegalStateException
    Attempt to consume the next token if it matches one of the supplied values.

    Parameters:
    firstOption - the first option for the value of the current token
    additionalOptions - the additional options for the value of the current token
    Returns:
    true if the current token's value did match one of the supplied options, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    canConsumeAnyOf

    public boolean canConsumeAnyOf(String[] options)
                            throws IllegalStateException
    Attempt to consume the next token if it matches one of the supplied values.

    Parameters:
    options - the options for the value of the current token
    Returns:
    true if the current token's value did match one of the supplied options, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    canConsumeAnyOf

    public boolean canConsumeAnyOf(Iterable<String> options)
                            throws IllegalStateException
    Attempt to consume the next token if it matches one of the supplied values.

    Parameters:
    options - the options for the value of the current token
    Returns:
    true if the current token's value did match one of the supplied options, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    canConsumeAnyOf

    public boolean canConsumeAnyOf(int firstTypeOption,
                                   int... additionalTypeOptions)
                            throws IllegalStateException
    Attempt to consume the next token if it matches one of the supplied types.

    Parameters:
    firstTypeOption - the first option for the type of the current token
    additionalTypeOptions - the additional options for the type of the current token
    Returns:
    true if the current token's type matched one of the supplied options, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    canConsumeAnyOf

    public boolean canConsumeAnyOf(int[] typeOptions)
                            throws IllegalStateException
    Attempt to consume the next token if it matches one of the supplied types.

    Parameters:
    typeOptions - the options for the type of the current token
    Returns:
    true if the current token's type matched one of the supplied options, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started
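
    As an illustration of the type-based canConsumeAnyOf variants, here is a minimal sketch. AnyOfSketch, its WORD, SYMBOL and QUOTED type codes, and its current() helper are hypothetical and exist only for this example; real token types are assigned by whichever Tokenizer the parser supplies.

```java
// Hypothetical stand-in for a typed token stream; this is NOT the ModeShape implementation.
class AnyOfSketch {
    // Token type codes assumed for this sketch only.
    static final int WORD = 1;
    static final int SYMBOL = 2;
    static final int QUOTED = 3;

    private final String[] values;
    private final int[] types;
    private int index = 0;

    AnyOfSketch(String[] values, int[] types) {
        if (values.length != types.length) {
            throw new IllegalArgumentException("values and types must have the same length");
        }
        this.values = values;
        this.types = types;
    }

    // True if the current token's type equals any one of the options; does not move.
    boolean matchesAnyOf(int firstTypeOption, int... additionalTypeOptions) {
        if (index >= types.length) {
            return false;
        }
        int currentType = types[index];
        if (currentType == firstTypeOption) {
            return true;
        }
        for (int option : additionalTypeOptions) {
            if (currentType == option) {
                return true;
            }
        }
        return false;
    }

    // Consume exactly one token, and only if its type is among the options.
    boolean canConsumeAnyOf(int firstTypeOption, int... additionalTypeOptions) {
        if (!matchesAnyOf(firstTypeOption, additionalTypeOptions)) {
            return false;
        }
        index++;
        return true;
    }

    // The value of the token that would be consumed next, or null at the end.
    String current() {
        return index < values.length ? values[index] : null;
    }

    public static void main(String[] args) {
        AnyOfSketch stream = new AnyOfSketch(
            new String[] {"name", "=", "'Bob'"},
            new int[] {WORD, SYMBOL, QUOTED});
        System.out.println(stream.canConsumeAnyOf(SYMBOL, QUOTED)); // prints false
        System.out.println(stream.canConsumeAnyOf(WORD, QUOTED));   // prints true
        System.out.println(stream.current());                        // prints =
    }
}
```

    Note that at most one token is consumed per call, however many options are supplied.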

    matches

    public boolean matches(String expected)
                    throws IllegalStateException
    Determine if the current token matches the expected value.

    The ANY_VALUE constant can be used as a wildcard.

    Parameters:
    expected - the expected value of the current token
    Returns:
    true if the current token did match, or false if the current token did not match
    Throws:
    IllegalStateException - if this method was called before the stream was started

    matches

    public boolean matches(char expected)
                    throws IllegalStateException
    Determine if the current token matches the expected value.

    Parameters:
    expected - the expected value of the current token
    Returns:
    true if the current token did match, or false if the current token did not match
    Throws:
    IllegalStateException - if this method was called before the stream was started

    matches

    public boolean matches(int expectedType)
                    throws IllegalStateException
    Determine if the current token matches the expected token type.

    Parameters:
    expectedType - the expected token type of the current token
    Returns:
    true if the current token did match, or false if the current token did not match
    Throws:
    IllegalStateException - if this method was called before the stream was started

    matches

    public boolean matches(String currentExpected,
                           String... expectedForNextTokens)
                    throws IllegalStateException
    Determine if the next few tokens match the expected values.

    The ANY_VALUE constant can be used in the expected values as a wildcard.

    Parameters:
    currentExpected - the expected value of the current token
    expectedForNextTokens - the expected values for the following tokens
    Returns:
    true if the tokens did match, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    matches

    public boolean matches(String[] nextTokens)
                    throws IllegalStateException
    Determine if the next few tokens match the expected values.

    The ANY_VALUE constant can be used in the expected values as a wildcard.

    Parameters:
    nextTokens - the expected values of the next tokens
    Returns:
    true if the tokens did match, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    matches

    public boolean matches(Iterable<String> nextTokens)
                    throws IllegalStateException
    Determine if the next few tokens match the expected values.

    The ANY_VALUE constant can be used in the expected values as a wildcard.

    Parameters:
    nextTokens - the expected values of the next tokens
    Returns:
    true if the tokens did match, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    matches

    public boolean matches(int currentExpectedType,
                           int... expectedTypeForNextTokens)
                    throws IllegalStateException
    Determine if the next few tokens have the supplied types.

    The ANY_TYPE constant can be used in the expected values as a wildcard.

    Parameters:
    currentExpectedType - the expected type of the current token
    expectedTypeForNextTokens - the expected types for the following tokens
    Returns:
    true if the tokens did match, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    matches

    public boolean matches(int[] typesForNextTokens)
                    throws IllegalStateException
    Determine if the next few tokens have the supplied types.

    The ANY_TYPE constant can be used in the expected values as a wildcard.

    Parameters:
    typesForNextTokens - the expected type for each of the next tokens
    Returns:
    true if the tokens did match, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    matchesAnyOf

    public boolean matchesAnyOf(String firstOption,
                                String... additionalOptions)
                         throws IllegalStateException
    Determine if the next token matches one of the supplied values.

    Parameters:
    firstOption - the first option for the value of the current token
    additionalOptions - the additional options for the value of the current token
    Returns:
    true if the current token's value did match one of the supplied options, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    matchesAnyOf

    public boolean matchesAnyOf(String[] options)
                         throws IllegalStateException
    Determine if the next token matches one of the supplied values.

    Parameters:
    options - the options for the value of the current token
    Returns:
    true if the current token's value did match one of the supplied options, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    matchesAnyOf

    public boolean matchesAnyOf(Iterable<String> options)
                         throws IllegalStateException
    Determine if the next token matches one of the supplied values.

    Parameters:
    options - the options for the value of the current token
    Returns:
    true if the current token's value did match one of the supplied options, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    matchesAnyOf

    public boolean matchesAnyOf(int firstTypeOption,
                                int... additionalTypeOptions)
                         throws IllegalStateException
    Determine if the next token has one of the supplied types.

    Parameters:
    firstTypeOption - the first option for the type of the current token
    additionalTypeOptions - the additional options for the type of the current token
    Returns:
    true if the current token's type matched one of the supplied options, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    matchesAnyOf

    public boolean matchesAnyOf(int[] typeOptions)
                         throws IllegalStateException
    Determine if the next token has one of the supplied types.

    Parameters:
    typeOptions - the options for the type of the current token
    Returns:
    true if the current token's type matched one of the supplied options, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    hasNext

    public boolean hasNext()
    Determine if this stream has another token to be consumed.

    Returns:
    true if there is another token ready for consumption, or false otherwise
    Throws:
    IllegalStateException - if this method was called before the stream was started

    toString

    public String toString()

    Overrides:
    toString in class Object
    See Also:
    Object.toString()

    getContentBetween

    public String getContentBetween(Position starting,
                                    Position end)
    Gets the content string starting at the first position (inclusive) and continuing up to the end position (exclusive).

    Parameters:
    starting - the position marking the beginning of the desired content string.
    end - the position located directly after the returned content string; can be null, which means end of content
    Returns:
    the content string; never null
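
    The inclusive-start, exclusive-end convention and the meaning of a null end can be sketched as follows. For simplicity this example assumes a position is just a character index into the content; ContentBetweenSketch and contentBetween are hypothetical names, and the real method takes Position objects.

```java
// Hypothetical illustration of the range convention; this is NOT the ModeShape implementation.
class ContentBetweenSketch {
    // 'startIndex' is inclusive, 'endIndex' is exclusive; a null end means "to the end of content".
    static String contentBetween(String content, int startIndex, Integer endIndex) {
        int end = (endIndex == null) ? content.length() : endIndex;
        return content.substring(startIndex, end);
    }

    public static void main(String[] args) {
        String content = "SELECT 1;DROP VIEW v;";
        int secondStatement = content.indexOf("DROP");
        // The character at the end position is not part of the result.
        System.out.println(contentBetween(content, 0, secondStatement));    // prints SELECT 1;
        // A null end returns everything from the start position onward.
        System.out.println(contentBetween(content, secondStatement, null)); // prints DROP VIEW v;
    }
}
```

    Because the end is exclusive, the start position of one statement can be passed unchanged as the end position of the previous one.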

    basicTokenizer

    public static TokenStream.BasicTokenizer basicTokenizer(boolean includeComments)
    Obtain a basic TokenStream.Tokenizer implementation that ignores whitespace but includes tokens for individual symbols, the period ('.'), single-quoted strings, double-quoted strings, whitespace-delimited words, and optionally comments.

    Note that the resulting Tokenizer may not be appropriate in many situations, but is provided merely as a convenience for those situations that happen to be able to use it.

    Parameters:
    includeComments - true if the comments should be retained and be included in the token stream, or false if comments should be stripped and not included in the token stream
    Returns:
    the tokenizer; never null


    Copyright © 2008-2010 JBoss, a division of Red Hat. All Rights Reserved.