org.modeshape.sequencer.ddl
Class DdlTokenStream

java.lang.Object
  extended by org.modeshape.common.text.TokenStream
      extended by org.modeshape.sequencer.ddl.DdlTokenStream

public class DdlTokenStream
extends TokenStream

A TokenStream implementation designed around requirements for tokenizing and parsing DDL statements.

Because DDL is complex, TokenStream had to be extended so that the basic tokenizer could be overridden to tokenize in-line comments prefixed with "--". In addition, because there is no default DDL command (or statement) terminator, an overridable method was added to TokenStream that lets a subclass post-process the initial tokens: re-type them, remove them, or perform any other operation that simplifies parsing.

In this class, both reserved words (or key words) and statement start phrases can be registered before the TokenStream's start() method is called. Any resulting tokens that match the registered string values are re-typed to identify them as key words (DdlTokenizer.KEYWORD) or statement start phrases (DdlTokenizer.STATEMENT_KEY).
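The re-typing step described above can be pictured with a small, self-contained sketch. It does not use the ModeShape classes: RetypeSketch, Tok, and the Type enum are hypothetical stand-ins for the token and the DdlTokenizer.KEYWORD and DdlTokenizer.STATEMENT_KEY types, and the choice to mark every word of a matched phrase is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in, not the ModeShape implementation.
class RetypeSketch {
    // Stand-ins for the generic word type, DdlTokenizer.KEYWORD and DdlTokenizer.STATEMENT_KEY.
    enum Type { WORD, KEYWORD, STATEMENT_KEY }

    static final class Tok {
        final String value;
        Type type = Type.WORD;
        Tok(String value) { this.value = value; }
        @Override public String toString() { return value + ":" + type; }
    }

    // True if the words starting at 'pos' match the phrase, ignoring case.
    static boolean matchesAt(List<Tok> tokens, int pos, String[] phrase) {
        if (pos + phrase.length > tokens.size()) return false;
        for (int j = 0; j < phrase.length; j++) {
            if (!phrase[j].equalsIgnoreCase(tokens.get(pos + j).value)) return false;
        }
        return true;
    }

    // Re-types already tokenized words against the registered values.
    static List<Tok> retype(List<String> words, Set<String> keyWords, List<String[]> startPhrases) {
        List<Tok> tokens = new ArrayList<>();
        for (String w : words) tokens.add(new Tok(w));
        // First pass: registered key words (assumed to be stored upper-cased).
        for (Tok t : tokens) {
            if (keyWords.contains(t.value.toUpperCase(Locale.ROOT))) t.type = Type.KEYWORD;
        }
        // Second pass: statement start phrases take precedence over key words.
        for (int i = 0; i < tokens.size(); i++) {
            for (String[] phrase : startPhrases) {
                if (matchesAt(tokens, i, phrase)) {
                    for (int j = 0; j < phrase.length; j++) {
                        tokens.get(i + j).type = Type.STATEMENT_KEY;
                    }
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> keyWords = new HashSet<>(Arrays.asList("CREATE", "TABLE", "INTEGER"));
        List<String[]> phrases = Collections.singletonList(new String[] {"CREATE", "TABLE"});
        List<String> words = Arrays.asList("create", "table", "t", "(", "id", "integer", ")");
        System.out.println(retype(words, keyWords, phrases));
    }
}
```

In the sketch, registration has to happen before re-typing runs, which mirrors the requirement that key words and phrases be registered before start() is called.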


Nested Class Summary
static class DdlTokenStream.DdlTokenizer
           
 
Nested classes/interfaces inherited from class org.modeshape.common.text.TokenStream
TokenStream.BasicTokenizer, TokenStream.CaseInsensitiveToken, TokenStream.CaseInsensitiveTokenFactory, TokenStream.CaseSensitiveToken, TokenStream.CaseSensitiveTokenFactory, TokenStream.CharacterArrayStream, TokenStream.CharacterStream, TokenStream.Token, TokenStream.TokenFactory, TokenStream.Tokenizer, TokenStream.Tokens
 
Field Summary
protected  Set<String> registeredKeyWords
           
protected  List<String[]> registeredStatementStartPhrases
           
 
Fields inherited from class org.modeshape.common.text.TokenStream
ANY_TYPE, ANY_VALUE, inputString, inputUppercased
 
Constructor Summary
DdlTokenStream(String content, TokenStream.Tokenizer tokenizer, boolean caseSensitive)
           
 
Method Summary
static DdlTokenStream.DdlTokenizer ddlTokenizer(boolean includeComments)
          Obtain a DDL DdlTokenStream.DdlTokenizer implementation that ignores whitespace but includes tokens for individual symbols, the period ('.'), single-quoted strings, double-quoted strings, whitespace-delimited words, and optionally comments.
 String getMarkedContent()
          Returns the string content for characters bounded by the previous marked position and the position of the currentToken (inclusive).
protected  List<TokenStream.Token> initializeTokens(List<TokenStream.Token> tokens)
          Method to allow subclasses to preprocess the set of tokens and return the correct tokens to use.
protected  boolean isKeyWord(String word)
           
 boolean isNextKeyWord()
          Method to determine if the next token is of type DdlTokenizer.KEYWORD.
 boolean isNextStatementStart()
          Method to determine if next tokens match a registered statement start phrase.
 void mark()
          Marks the current position (line and column number) of the currentToken.
 void registerKeyWord(String keyWord)
          Register a single key word.
 void registerKeyWords(List<String> keyWords)
          Register a List of key words.
 void registerKeyWords(String[] keyWords)
          Register an array of key words.
 void registerStatementStartPhrase(String[] phrase)
          Register a phrase representing the start of a DDL statement. Examples would be: {"CREATE", "TABLE"} and {"CREATE", "OR", "REPLACE", "VIEW"}. See DdlConstants for the default SQL-92 representations.
 void registerStatementStartPhrase(String[][] phrases)
           
 
Methods inherited from class org.modeshape.common.text.TokenStream
basicTokenizer, canConsume, canConsume, canConsume, canConsume, canConsume, canConsume, canConsumeAnyOf, canConsumeAnyOf, canConsumeAnyOf, canConsumeAnyOf, canConsumeAnyOf, consume, consume, consume, consume, consume, consume, consume, consumeBoolean, consumeInteger, consumeLong, getContentBetween, hasNext, matches, matches, matches, matches, matches, matches, matches, matches, matchesAnyOf, matchesAnyOf, matchesAnyOf, matchesAnyOf, matchesAnyOf, nextPosition, previousPosition, rewind, start, throwNoMoreContent, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

registeredStatementStartPhrases

protected List<String[]> registeredStatementStartPhrases

registeredKeyWords

protected Set<String> registeredKeyWords
Constructor Detail

DdlTokenStream

public DdlTokenStream(String content,
                      TokenStream.Tokenizer tokenizer,
                      boolean caseSensitive)
Parameters:
content -
tokenizer -
caseSensitive -
Method Detail

initializeTokens

protected List<TokenStream.Token> initializeTokens(List<TokenStream.Token> tokens)
Method to allow subclasses to preprocess the set of tokens and return the correct tokens to use. The default behavior is to simply return the supplied tokens.

Overrides:
initializeTokens in class TokenStream
Returns:
list of tokens.
See Also:
TokenStream.initializeTokens(java.util.List)

registerStatementStartPhrase

public void registerStatementStartPhrase(String[] phrase)
Register a phrase representing the start of a DDL statement.

Examples would be: {"CREATE", "TABLE"} and {"CREATE", "OR", "REPLACE", "VIEW"}.

See DdlConstants for the default SQL-92 representations.

Parameters:
phrase -

registerStatementStartPhrase

public void registerStatementStartPhrase(String[][] phrases)

registerKeyWord

public void registerKeyWord(String keyWord)
Register a single key word.

Parameters:
keyWord -

registerKeyWords

public void registerKeyWords(List<String> keyWords)
Register a List of key words.

Parameters:
keyWords -

registerKeyWords

public void registerKeyWords(String[] keyWords)
Register an array of key words.

Parameters:
keyWords -

isKeyWord

protected boolean isKeyWord(String word)
Parameters:
word -
Returns:
true if the word is a registered key word

isNextKeyWord

public boolean isNextKeyWord()
Method to determine if the next token is of type DdlTokenizer.KEYWORD.

Returns:
true if the next token is a key word

isNextStatementStart

public boolean isNextStatementStart()
Method to determine if next tokens match a registered statement start phrase.

Returns:
true if next tokens match a registered statement start phrase
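Because DDL has no default statement terminator, a parser can use a check like this one to decide where one statement ends and the next begins. The following is a minimal sketch of that idea using plain strings rather than the ModeShape token classes; StatementSplitSketch and its methods are hypothetical names, and case-insensitive matching is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, not the ModeShape implementation.
class StatementSplitSketch {
    // True if the words beginning at 'pos' match any registered start phrase.
    static boolean isStatementStart(List<String> words, int pos, List<String[]> phrases) {
        for (String[] phrase : phrases) {
            if (pos + phrase.length > words.size()) continue;
            boolean match = true;
            for (int j = 0; j < phrase.length; j++) {
                if (!phrase[j].equalsIgnoreCase(words.get(pos + j))) {
                    match = false;
                    break;
                }
            }
            if (match) return true;
        }
        return false;
    }

    // Groups words into statements, starting a new one at each start phrase.
    static List<String> split(List<String> words, List<String[]> phrases) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < words.size(); i++) {
            if (current.length() > 0 && isStatementStart(words, i, phrases)) {
                statements.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) current.append(' ');
            current.append(words.get(i));
        }
        if (current.length() > 0) statements.add(current.toString());
        return statements;
    }

    public static void main(String[] args) {
        List<String[]> phrases = Arrays.asList(
            new String[] {"CREATE", "TABLE"},
            new String[] {"CREATE", "OR", "REPLACE", "VIEW"});
        List<String> words = Arrays.asList(
            "CREATE", "TABLE", "a", "CREATE", "OR", "REPLACE", "VIEW", "v");
        for (String statement : split(words, phrases)) {
            System.out.println(statement);
        }
    }
}
```

A real parser would also have to handle start phrases that legitimately appear inside another statement; the sketch ignores that case.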

mark

public void mark()
Marks the current position (line and column number) of the currentToken.


getMarkedContent

public String getMarkedContent()
Returns the string content for characters bounded by the previous marked position and the position of the currentToken (inclusive). The method also marks the new position at the currentToken.

Returns:
the string content for characters bounded by the previous marked position and the position of the currentToken (inclusive).
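The interplay of mark() and getMarkedContent() can be illustrated with a simplified, self-contained model. MarkSketch is a hypothetical class that tracks character offsets of whitespace-delimited tokens; it is not the ModeShape implementation. Exactly how the current token's own text is treated is not spelled out above, so this sketch assumes the returned content stops at the start of the current token and is trimmed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not the ModeShape implementation.
class MarkSketch {
    private final String content;
    private final List<Integer> tokenStarts = new ArrayList<>();
    private int current = 0;     // index of the current (not yet consumed) token
    private int markOffset = 0;  // character offset of the last mark

    MarkSketch(String content) {
        this.content = content;
        // Record the offset at which each whitespace-delimited token begins.
        for (int i = 0; i < content.length(); i++) {
            boolean ws = Character.isWhitespace(content.charAt(i));
            boolean prevWs = i == 0 || Character.isWhitespace(content.charAt(i - 1));
            if (!ws && prevWs) tokenStarts.add(i);
        }
    }

    // Advances past the current token.
    void consume() { current++; }

    // Offset of the current token, or the end of the content if none remain.
    private int currentOffset() {
        return current < tokenStarts.size() ? tokenStarts.get(current) : content.length();
    }

    // Remembers the position of the current token.
    void mark() { markOffset = currentOffset(); }

    // Returns the text since the last mark, then re-marks at the current token.
    String getMarkedContent() {
        int end = currentOffset();
        String result = content.substring(markOffset, end).trim();
        markOffset = end;
        return result;
    }

    public static void main(String[] args) {
        MarkSketch stream = new MarkSketch("CREATE TABLE a (id INT) CREATE TABLE b (id INT)");
        stream.mark();
        for (int i = 0; i < 5; i++) stream.consume();
        System.out.println(stream.getMarkedContent());
        for (int i = 0; i < 5; i++) stream.consume();
        System.out.println(stream.getMarkedContent());
    }
}
```

Because getMarkedContent() re-marks on every call, successive calls return consecutive, non-overlapping slices of the input, which is convenient for recovering the source text of each parsed statement.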

ddlTokenizer

public static DdlTokenStream.DdlTokenizer ddlTokenizer(boolean includeComments)
Obtain a DDL DdlTokenStream.DdlTokenizer implementation that ignores whitespace but includes tokens for individual symbols, the period ('.'), single-quoted strings, double-quoted strings, whitespace-delimited words, and optionally comments.

Note that the resulting Tokenizer may not be appropriate in many situations, but is provided merely as a convenience for those situations that happen to be able to use it.

Parameters:
includeComments - true if the comments should be retained and be included in the token stream, or false if comments should be stripped and not included in the token stream
Returns:
the tokenizer; never null
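To make the effect of includeComments concrete, here is a minimal, self-contained sketch of a tokenizer that recognizes "--" in-line comments. CommentTokenizerSketch is a hypothetical class that returns plain strings; its rules for words, quotes, and symbols are simplified assumptions and may differ from those of DdlTokenStream.DdlTokenizer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not the ModeShape implementation.
class CommentTokenizerSketch {
    static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    static List<String> tokenize(String input, boolean includeComments) {
        List<String> tokens = new ArrayList<>();
        int n = input.length();
        int i = 0;
        while (i < n) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace never produces a token
            } else if (c == '-' && i + 1 < n && input.charAt(i + 1) == '-') {
                // In-line comment: runs to the end of the line.
                int end = i;
                while (end < n && input.charAt(end) != '\n' && input.charAt(end) != '\r') end++;
                if (includeComments) tokens.add(input.substring(i, end));
                i = end;
            } else if (c == '\'' || c == '"') {
                // Quoted string: runs to the matching quote (no escape handling).
                int end = i + 1;
                while (end < n && input.charAt(end) != c) end++;
                end = Math.min(end + 1, n); // include the closing quote if present
                tokens.add(input.substring(i, end));
                i = end;
            } else if (isWordChar(c)) {
                int end = i;
                while (end < n && isWordChar(input.charAt(end))) end++;
                tokens.add(input.substring(i, end));
                i = end;
            } else {
                tokens.add(String.valueOf(c)); // any other character is a symbol
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String ddl = "CREATE TABLE t (id INT); -- a note\nDROP TABLE t";
        System.out.println(tokenize(ddl, false));
        System.out.println(tokenize(ddl, true));
    }
}
```

Note that the quoted-string branch is reached before any "--" inside the quotes is examined, so a "--" sequence within a string literal is not mistaken for a comment.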


Copyright © 2008-2010 JBoss, a division of Red Hat. All Rights Reserved.