Interface TextExtractor

All Known Implementing Classes:
TeiidVdbTextExtractor, TextExtractors, TikaTextExtractor

public interface TextExtractor

An abstraction for components that are able to extract text content from an input stream.

Method Summary
 void extractFrom(InputStream stream, TextExtractorOutput output, TextExtractorContext context)
          Extract the text found in the supplied stream, placing the extracted text into the supplied output.
 boolean supportsMimeType(String mimeType)
          Determine if this extractor is capable of processing content with the supplied MIME type.

Method Detail


boolean supportsMimeType(String mimeType)
Determine if this extractor is capable of processing content with the supplied MIME type.

Parameters:
mimeType - the MIME type; never null
Returns:
true if this extractor can process content with the supplied MIME type, or false otherwise
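
As a minimal sketch of how an implementation might answer this question, the standalone example below checks a MIME type against a fixed set. The class name MimeTypeSupport, the contents of the set, and the normalization step (lower-casing and dropping parameters such as "; charset=UTF-8") are assumptions made for illustration; they are not taken from ModeShape.

```java
import java.util.Locale;
import java.util.Set;

public class MimeTypeSupport {
    // Hypothetical set of MIME types that this illustrative extractor handles.
    private static final Set<String> SUPPORTED = Set.of("text/plain", "text/csv");

    // Same shape as TextExtractor.supportsMimeType(String); mimeType is never null.
    public static boolean supportsMimeType(String mimeType) {
        // Assumption: parameters after ';' and letter case are ignored when matching.
        int semicolon = mimeType.indexOf(';');
        String base = semicolon >= 0 ? mimeType.substring(0, semicolon) : mimeType;
        return SUPPORTED.contains(base.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(supportsMimeType("text/plain"));
        System.out.println(supportsMimeType("Text/Plain; charset=UTF-8"));
        System.out.println(supportsMimeType("application/pdf"));
    }
}
```

Whether to normalize the argument at all is a design choice: a caller that always passes a bare, lower-case type would not need it.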


void extractFrom(InputStream stream,
                 TextExtractorOutput output,
                 TextExtractorContext context)
                 throws IOException
Extract the text found in the supplied stream, placing the extracted text into the supplied output.

ModeShape's SequencingService decides which sequencers to execute by monitoring changes to one or more workspaces. Changes in those workspaces are aggregated and used to determine which sequencers should be called. If the sequencer implements this interface, this method is called with the property that is to be sequenced, along with the interface used to register the output. The framework takes care of the rest.

Parameters:
stream - the stream containing the data from which text is to be extracted; never null
output - the output that receives the extracted text; never null
context - the context for the extraction operation; never null
Throws:
IOException - if there is a problem reading the stream
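
The contract above can be illustrated with a small standalone sketch of a plain-text extractor and a caller. The nested interfaces are simplified stand-ins declared only so the example compiles by itself. In particular, the single recordText(String) callback on the output, the empty context, the UTF-8 decoding, and the helper method extract(...) are assumptions for illustration and not the framework's actual classes or behavior.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class PlainTextExtractorSketch {
    // Simplified stand-ins for the ModeShape types (assumed shapes, see lead-in).
    public interface TextExtractorOutput { void recordText(String text); }
    public interface TextExtractorContext { }
    public interface TextExtractor {
        boolean supportsMimeType(String mimeType);
        void extractFrom(InputStream stream, TextExtractorOutput output,
                         TextExtractorContext context) throws IOException;
    }

    // Hypothetical extractor that treats the whole stream as UTF-8 text.
    public static class PlainTextExtractor implements TextExtractor {
        @Override
        public boolean supportsMimeType(String mimeType) {
            return "text/plain".equals(mimeType);
        }

        @Override
        public void extractFrom(InputStream stream, TextExtractorOutput output,
                                TextExtractorContext context) throws IOException {
            // Assumption: the content is UTF-8 encoded; read failures propagate as IOException.
            String text = new String(stream.readAllBytes(), StandardCharsets.UTF_8);
            output.recordText(text);
        }
    }

    // Hypothetical caller: checks the MIME type first, then collects whatever is recorded.
    public static String extract(TextExtractor extractor, String mimeType, byte[] data)
            throws IOException {
        if (!extractor.supportsMimeType(mimeType)) {
            return null; // this extractor does not handle the content
        }
        StringBuilder collected = new StringBuilder();
        try (InputStream stream = new ByteArrayInputStream(data)) {
            extractor.extractFrom(stream, text -> collected.append(text),
                                  new TextExtractorContext() { });
        }
        return collected.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello, extractor".getBytes(StandardCharsets.UTF_8);
        System.out.println(extract(new PlainTextExtractor(), "text/plain", data));
    }
}
```

The caller consults supportsMimeType before calling extractFrom, and all three arguments are non-null, matching the parameter descriptions above.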

