public static class TokenStream.BasicTokenizer extends Object implements TokenStream.Tokenizer
A TokenStream.Tokenizer implementation that ignores whitespace but includes tokens for individual symbols, the period ('.'), single-quoted strings, double-quoted strings, whitespace-delimited words, and optionally comments.
Note that this Tokenizer may not be appropriate in many situations; it is provided merely as a convenience for those cases that can use it.
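The token rules described on this page can be illustrated with a small stand-alone sketch. It is not the ModeShape implementation and does not use the TokenStream API: the class `BasicTokenizerSketch`, its `tokenize(String)` helper, and the `"TYPE:text"` output format are hypothetical names chosen for illustration. The sketch assumes that a period always ends a word, that a quoted-string token holds only the characters between the quotes, and that an unclosed quote is reported with an `IllegalArgumentException` rather than a ParsingException. Comment handling is left out, so '/' is not treated specially here.

```java
import java.util.ArrayList;
import java.util.List;

public class BasicTokenizerSketch {
    // Symbol characters as listed in the SYMBOL field description.
    static final String SYMBOLS = "-(){}*,;+%?$[]!<>|=:";

    /** Returns tokens as "TYPE:text" strings; hypothetical helper, not ModeShape API. */
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace separates tokens but produces none
            } else if (c == '.') {
                tokens.add("DECIMAL:.");
                i++;
            } else if (SYMBOLS.indexOf(c) >= 0) {
                tokens.add("SYMBOL:" + c);
                i++;
            } else if (c == '\'' || c == '"') {
                int end = i + 1;
                // Scan to the matching quote, stepping over any backslash-escaped character.
                while (end < input.length() && input.charAt(end) != c) {
                    end += (input.charAt(end) == '\\') ? 2 : 1;
                }
                if (end >= input.length()) {
                    // Assumption: stands in for the ParsingException of the real tokenizer.
                    throw new IllegalArgumentException("Quote opened at index " + i + " is not closed");
                }
                String type = (c == '\'') ? "SINGLE_QUOTED_STRING" : "DOUBLE_QUOTED_STRING";
                // Assumption: the token text excludes the surrounding quotes.
                tokens.add(type + ":" + input.substring(i + 1, end));
                i = end + 1;
            } else {
                int end = i;
                while (end < input.length() && !endsWord(input.charAt(end))) {
                    end++;
                }
                tokens.add("WORD:" + input.substring(i, end));
                i = end;
            }
        }
        return tokens;
    }

    // Assumption: whitespace, a period, or any symbol character ends a word.
    private static boolean endsWord(char c) {
        return Character.isWhitespace(c) || c == '.' || SYMBOLS.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("name = 'O\\'Brien' ;"));
    }
}
```

In this sketch an escaped quote stays inside the string token together with its backslash, which matches the statement that quote characters are included when preceded by a '\' character.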
Modifier and Type | Field and Description |
---|---|
static int | COMMENT: The token type for tokens that consist of all the characters between "/*" and "*/" or between "//" and the next line terminator (e.g., '\n', '\r' or "\r\n"). |
static int | DECIMAL: The token type for tokens that consist of an individual '.' character. |
static int | DOUBLE_QUOTED_STRING: The token type for tokens that consist of all the characters within double-quotes. |
static int | SINGLE_QUOTED_STRING: The token type for tokens that consist of all the characters within single-quotes. |
static int | SYMBOL: The token type for tokens that consist of an individual "symbol" character. |
static int | WORD: The token type for tokens that represent an unquoted string containing a character sequence made up of non-whitespace and non-symbol characters. |
Modifier | Constructor and Description |
---|---|
protected | BasicTokenizer(boolean useComments) |
Modifier and Type | Method and Description |
---|---|
void | tokenize(TokenStream.CharacterStream input, TokenStream.Tokens tokens): Process the supplied characters and construct the appropriate TokenStream.Token objects. |
public static final int WORD
The token type for tokens that represent an unquoted string containing a character sequence made up of non-whitespace and non-symbol characters.
public static final int SYMBOL
The token type for tokens that consist of an individual "symbol" character. The set of characters includes: -(){}*,;+%?$[]!<>|=:
public static final int DECIMAL
The token type for tokens that consist of an individual '.' character.
public static final int SINGLE_QUOTED_STRING
The token type for tokens that consist of all the characters within single-quotes. Single quote characters are included if they are preceded (escaped) by a '\' character.
public static final int DOUBLE_QUOTED_STRING
The token type for tokens that consist of all the characters within double-quotes. Double quote characters are included if they are preceded (escaped) by a '\' character.
public static final int COMMENT
The token type for tokens that consist of all the characters between "/*" and "*/" or between "//" and the next line terminator (e.g., '\n', '\r' or "\r\n").
public void tokenize(TokenStream.CharacterStream input, TokenStream.Tokens tokens) throws ParsingException
Description copied from interface: TokenStream.Tokenizer
Process the supplied characters and construct the appropriate TokenStream.Token objects.
Specified by: tokenize in interface TokenStream.Tokenizer
Parameters:
input - the character input stream; never null
tokens - the factory for TokenStream.Token objects, which records the order in which the tokens are created
Throws:
ParsingException - if there is an error while processing the character stream (e.g., a quote is not closed)
Copyright © 2008–2016 JBoss, a division of Red Hat. All rights reserved.