ModeShape Distribution 3.0.0.Beta4

org.modeshape.jcr.api.text
Class TextExtractor

java.lang.Object
  extended by org.modeshape.jcr.api.text.TextExtractor
Direct Known Subclasses:
TikaTextExtractor

public abstract class TextExtractor
extends Object

An abstraction for components that are able to extract text content from an input stream.


Nested Class Summary
protected static interface TextExtractor.BinaryOperation<T>
          Interface which can be used by subclasses to process the input stream of a binary property.
static interface TextExtractor.Context
          Interface which provides additional information to text extractors during the extraction operation.
static interface TextExtractor.Output
          The interface passed to a TextExtractor to which the extractor should record all text content.
 
Constructor Summary
TextExtractor()
           
 
Method Summary
abstract  void extractFrom(Binary binary, TextExtractor.Output output, TextExtractor.Context context)
          Extract text from the given Binary, using the given output to record the results.
protected  Logger getLogger()
           
 String getName()
           
protected <T> T processStream(Binary binary, TextExtractor.BinaryOperation<T> operation)
          Allows subclasses to process the stream of a binary property value in a "safe" fashion, making sure the stream is closed at the end of the operation.
 void setLogger(Logger logger)
           
 void setName(String name)
           
abstract  boolean supportsMimeType(String mimeType)
          Determine if this extractor is capable of processing content with the supplied MIME type.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextExtractor

public TextExtractor()
Method Detail

supportsMimeType

public abstract boolean supportsMimeType(String mimeType)
Determine if this extractor is capable of processing content with the supplied MIME type.

Parameters:
mimeType - the MIME type; never null
Returns:
true if this extractor can process content with the supplied MIME type, or false otherwise.
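
A minimal sketch of how a subclass might implement this check, assuming a fixed set of supported types. The class name and the MIME types listed are illustrative assumptions, not part of ModeShape; a real extractor would override supportsMimeType on its TextExtractor subclass.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class MimeTypeSupportSketch {
    // Hypothetical set of MIME types for this sketch; not taken from ModeShape.
    private static final Set<String> SUPPORTED =
            new HashSet<>(Arrays.asList("text/plain", "text/csv"));

    // Follows the documented contract: mimeType is never null.
    // The comparison is case-insensitive; MIME parameters are not handled here.
    public static boolean supportsMimeType(String mimeType) {
        return SUPPORTED.contains(mimeType.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(supportsMimeType("text/plain"));
        System.out.println(supportsMimeType("application/pdf"));
    }
}
```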

extractFrom

public abstract void extractFrom(Binary binary,
                                 TextExtractor.Output output,
                                 TextExtractor.Context context)
                          throws Exception
Extract text from the given Binary, using the given output to record the results.

Parameters:
binary - the binary value that can be used in the extraction process; never null
output - the output to which the extracted text should be recorded; never null
context - the context for the extraction operation; never null
Throws:
Exception - if there is a problem during the extraction process
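
The following sketch illustrates the general shape of an extraction: read the content and record the text found to an output. It is self-contained and does not use the ModeShape types; the nested Output interface and its recordText method are hypothetical stand-ins for TextExtractor.Output, and a plain InputStream stands in for the Binary value.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class PlainTextExtractionSketch {
    /** Hypothetical stand-in for TextExtractor.Output; the method name is assumed. */
    public interface Output {
        void recordText(String text);
    }

    /** Reads the stream as UTF-8 and records each non-blank line to the output. */
    public static void extractFrom(InputStream stream, Output output) throws Exception {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    output.recordText(line);
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> recorded = new ArrayList<>();
        InputStream in = new ByteArrayInputStream(
                "first line\n\nsecond line\n".getBytes(StandardCharsets.UTF_8));
        extractFrom(in, recorded::add);
        System.out.println(recorded);
    }
}
```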

processStream

protected final <T> T processStream(Binary binary,
                                    TextExtractor.BinaryOperation<T> operation)
                         throws Exception
Allows subclasses to process the stream of a binary property value in a "safe" fashion, making sure the stream is closed at the end of the operation.

Type Parameters:
T - the return type of the binary operation
Parameters:
binary - a Binary that is expected to contain a non-null binary value.
operation - a TextExtractor.BinaryOperation which should work with the stream
Returns:
whatever type of result the stream operation returns
Throws:
Exception - if there is an error processing the stream
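
The "safe" pattern described above can be sketched as a try/finally that closes the stream whether or not the operation succeeds. This is an illustration of the idea, not the ModeShape implementation: StreamOperation and its execute method are hypothetical stand-ins for TextExtractor.BinaryOperation, and TrackingStream exists only to make the close observable.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class SafeStreamSketch {
    /** Hypothetical stand-in for TextExtractor.BinaryOperation<T>; the method name is assumed. */
    public interface StreamOperation<T> {
        T execute(InputStream stream) throws Exception;
    }

    /** Runs the operation and closes the stream even if the operation throws. */
    public static <T> T processStream(InputStream stream, StreamOperation<T> operation)
            throws Exception {
        try {
            return operation.execute(stream);
        } finally {
            stream.close();
        }
    }

    /** Demonstration stream that records whether close() was called. */
    public static class TrackingStream extends ByteArrayInputStream {
        public boolean closed = false;

        public TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    public static void main(String[] args) throws Exception {
        TrackingStream in = new TrackingStream("hello".getBytes(StandardCharsets.UTF_8));
        // Count the bytes in the stream; the caller never closes it explicitly.
        Integer count = processStream(in, s -> {
            int n = 0;
            while (s.read() != -1) {
                n++;
            }
            return n;
        });
        System.out.println(count + " bytes read, closed=" + in.closed);
    }
}
```

Keeping the close in one place means individual extractors cannot forget it, which is presumably why the method is declared final.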

setLogger

public final void setLogger(Logger logger)

getLogger

protected final Logger getLogger()

getName

public String getName()

setName

public void setName(String name)

Copyright © 2008-2012 JBoss, a division of Red Hat. All Rights Reserved.