Class TokenStream.BasicTokenizer

java.lang.Object
  extended by org.modeshape.common.text.TokenStream.BasicTokenizer
All Implemented Interfaces: TokenStream.Tokenizer
Enclosing class: TokenStream

public static class TokenStream.BasicTokenizer
extends Object
implements TokenStream.Tokenizer

A basic TokenStream.Tokenizer implementation that ignores whitespace but includes tokens for individual symbols, the period ('.'), single-quoted strings, double-quoted strings, whitespace-delimited words, and optionally comments.

Note that this tokenizer may not be appropriate in many situations; it is provided merely as a convenience for cases where its tokenization rules happen to fit.
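To make the rules above concrete, the following standalone sketch splits input into word, symbol, and decimal tokens while skipping whitespace. It is not the ModeShape implementation: the class MiniTokenizerSketch, its Token type, and the numeric type codes are hypothetical names chosen for illustration, and quoted strings and comments are left out for brevity (so a quote character simply ends up inside a word here). The symbol set is the one listed for the SYMBOL field.

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone illustration only; not the ModeShape implementation. */
final class MiniTokenizerSketch {
    // Hypothetical type codes; the real constant values are not given on this page.
    static final int WORD = 1;
    static final int SYMBOL = 2;
    static final int DECIMAL = 4;

    // Symbol characters as listed in the SYMBOL field description.
    static final String SYMBOLS = "-(){}*,;+%?$[]!<>|=:";

    /** A token's text together with its type code. */
    static final class Token {
        final String text;
        final int type;

        Token(String text, int type) {
            this.text = text;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + ":" + text;
        }
    }

    /** Splits the input into tokens, dropping whitespace. */
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace separates tokens but produces none
            } else if (c == '.') {
                tokens.add(new Token(".", DECIMAL));
                i++;
            } else if (SYMBOLS.indexOf(c) >= 0) {
                tokens.add(new Token(String.valueOf(c), SYMBOL));
                i++;
            } else {
                // A word runs until the next whitespace, period, or symbol.
                int start = i;
                while (i < input.length() && !endsWord(input.charAt(i))) {
                    i++;
                }
                tokens.add(new Token(input.substring(start, i), WORD));
            }
        }
        return tokens;
    }

    static boolean endsWord(char c) {
        return Character.isWhitespace(c) || c == '.' || SYMBOLS.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a.b = c1;"));
    }
}
```

In this sketch the period gets its own type rather than being folded into the symbol set, mirroring the separate DECIMAL field described on this page.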

Field Summary
static int COMMENT
          The token type for tokens that consist of all the characters between "/*" and "*/" or between "//" and the next line terminator (e.g., '\n', '\r' or "\r\n").
static int DECIMAL
          The token type for tokens that consist of an individual '.' character.
static int DOUBLE_QUOTED_STRING
          The token type for tokens that consist of all the characters within double-quotes.
static int SINGLE_QUOTED_STRING
          The token type for tokens that consist of all the characters within single-quotes.
static int SYMBOL
          The token type for tokens that consist of an individual "symbol" character.
static int WORD
          The token type for tokens that represent an unquoted string containing a character sequence made up of non-whitespace and non-symbol characters.
Constructor Summary
protected TokenStream.BasicTokenizer(boolean useComments)
Method Summary
 void tokenize(TokenStream.CharacterStream input, TokenStream.Tokens tokens)
          Process the supplied characters and construct the appropriate TokenStream.Token objects.
Methods inherited from class java.lang.Object
Field Detail


public static final int WORD
The token type for tokens that represent an unquoted string containing a character sequence made up of non-whitespace and non-symbol characters.

public static final int SYMBOL
The token type for tokens that consist of an individual "symbol" character. The set of characters includes: -(){}*,;+%?$[]!<>|=:

public static final int DECIMAL
The token type for tokens that consist of an individual '.' character.

public static final int SINGLE_QUOTED_STRING
The token type for tokens that consist of all the characters within single-quotes. Single quote characters are included if they are preceded (escaped) by a '\' character.

public static final int DOUBLE_QUOTED_STRING
The token type for tokens that consist of all the characters within double-quotes. Double quote characters are included if they are preceded (escaped) by a '\' character.
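The escaping rule for both quoted-string types can be sketched as a scan for the closing quote that skips any quote character preceded by a backslash. This is a standalone illustration under stated assumptions, not the ModeShape code: QuotedStringSketch and findClosingQuote are hypothetical names, and IllegalArgumentException stands in for the ParsingException that tokenize is documented to throw when a quote is not closed.

```java
/** Standalone illustration only; not the ModeShape implementation. */
final class QuotedStringSketch {

    /**
     * Returns the index of the quote that closes the quoted string opened at openIndex.
     * The character at openIndex (' or ") decides which quote character is expected.
     * A quote preceded by a backslash is treated as content, not as a terminator.
     */
    static int findClosingQuote(String input, int openIndex) {
        char quote = input.charAt(openIndex);
        int i = openIndex + 1;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length() && input.charAt(i + 1) == quote) {
                i += 2; // escaped quote: keep both characters as content
            } else if (c == quote) {
                return i;
            } else {
                i++;
            }
        }
        // Stand-in for ParsingException in this sketch.
        throw new IllegalArgumentException("No closing quote for the quote at index " + openIndex);
    }

    public static void main(String[] args) {
        String s = "'it\\'s' rest";
        int close = findClosingQuote(s, 0);
        System.out.println(s.substring(1, close)); // prints it\'s
    }
}
```

The sketch returns only the position of the closing quote, leaving open whether the delimiters themselves belong to the token text.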

public static final int COMMENT
The token type for tokens that consist of all the characters between "/*" and "*/" or between "//" and the next line terminator (e.g., '\n', '\r' or "\r\n").
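The two comment forms can be recognized with a small helper that reports where a comment starting at a given position ends. The sketch below is an assumption-laden illustration rather than the ModeShape code: CommentSketch and commentEnd are hypothetical names, the returned span includes the delimiters, and rejecting an unterminated block comment is a choice made here, since this page does not say how BasicTokenizer handles that case.

```java
/** Standalone illustration only; not the ModeShape implementation. */
final class CommentSketch {

    /**
     * Returns the exclusive end index of the comment beginning at start,
     * or -1 if no comment begins there.
     */
    static int commentEnd(String input, int start) {
        if (input.startsWith("/*", start)) {
            int close = input.indexOf("*/", start + 2);
            if (close < 0) {
                // Choice made for this sketch only.
                throw new IllegalArgumentException("Unterminated block comment at index " + start);
            }
            return close + 2; // include the closing delimiter
        }
        if (input.startsWith("//", start)) {
            int i = start + 2;
            // Stop at '\n' or '\r'; this also covers "\r\n" because '\r' comes first.
            while (i < input.length() && input.charAt(i) != '\n' && input.charAt(i) != '\r') {
                i++;
            }
            return i; // the line terminator is not part of the comment
        }
        return -1;
    }

    public static void main(String[] args) {
        String s = "a /* x */ b";
        System.out.println(s.substring(2, commentEnd(s, 2)));
    }
}
```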

Constructor Detail


protected TokenStream.BasicTokenizer(boolean useComments)
Method Detail


public void tokenize(TokenStream.CharacterStream input,
                     TokenStream.Tokens tokens)
              throws ParsingException
Process the supplied characters and construct the appropriate TokenStream.Token objects.

Specified by:
tokenize in interface TokenStream.Tokenizer
Parameters:
input - the character input stream; never null
tokens - the factory for TokenStream.Token objects, which records the order in which the tokens are created
Throws:
ParsingException - if there is an error while processing the character stream (e.g., a quote is not closed)
