org.jboss.dna.sequencer.msoffice
Class MSOfficeMetadataSequencer
java.lang.Object
org.jboss.dna.sequencer.msoffice.MSOfficeMetadataSequencer
- All Implemented Interfaces:
- StreamSequencer
public class MSOfficeMetadataSequencer
- extends Object
- implements StreamSequencer
A sequencer that processes the content of an MS Office document, extracts the metadata for the file, and then writes that
metadata to the repository.
This sequencer produces data that corresponds to the following structure:
- msoffice:metadata node of type msoffice:metadata
- msoffice:title optional string property for the title of the document
- msoffice:subject optional string property for the subject of the document
- msoffice:author optional string property for the author of the document
- msoffice:keywords optional string property for the document keywords
- msoffice:comment optional string property for the document comment
- msoffice:template optional string property for the template from which this document originates
- msoffice:last_saved_by optional string property for the person that last saved this document
- msoffice:revision optional string property for this document revision
- msoffice:total_editing_time optional long property for the total time this document has been edited
- msoffice:last_printed optional date property for the date of last printing this document
- msoffice:created date property for the date of creation of the document
- msoffice:saved date property for the date of last save of this document
- msoffice:pages long property for the number of pages of this document
- msoffice:words long property for the number of words in this document
- msoffice:characters long property for the number of characters in this document
- msoffice:creating_application string property for the application used to create this document
- msoffice:thumbnail optional binary property for the thumbnail of this document
- msoffice:full_contents optional string property holding the text contents of an Excel file
- msoffice:sheet_name optional string property for the name of a sheet in an Excel file (multiple)
- msoffice:slide node of type msoffice:pptslide
- msoffice:title optional String property for the title of a slide
- msoffice:notes optional String property for the notes of a slide
- msoffice:text optional String property for the text of a slide
- msoffice:thumbnail optional binary property for the thumbnail of a slide (PNG image)
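The document-level part of the structure above can be summarized as a small lookup table. The following is only an illustrative sketch: the class `MsOfficeMetadataShape`, its `Kind` enum, and its `Prop` type are hypothetical and are not part of JBoss DNA. The property names, value kinds, and "optional" markers are copied from the listing; a `false` flag means only that the listing does not mark the property as optional. The Excel and slide entries are left out because their nesting is not stated here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only; none of these types belong to the JBoss DNA API.
final class MsOfficeMetadataShape {

    /** Value kinds named in the structure listing. */
    enum Kind { STRING, LONG, DATE, BINARY }

    /** One property of the msoffice:metadata node as described in the listing. */
    static final class Prop {
        final Kind kind;
        final boolean optional; // true only when the listing says "optional"

        Prop(Kind kind, boolean optional) {
            this.kind = kind;
            this.optional = optional;
        }
    }

    /** Returns the document-level properties in the order the listing gives them. */
    static Map<String, Prop> metadataProperties() {
        Map<String, Prop> p = new LinkedHashMap<>();
        p.put("msoffice:title", new Prop(Kind.STRING, true));
        p.put("msoffice:subject", new Prop(Kind.STRING, true));
        p.put("msoffice:author", new Prop(Kind.STRING, true));
        p.put("msoffice:keywords", new Prop(Kind.STRING, true));
        p.put("msoffice:comment", new Prop(Kind.STRING, true));
        p.put("msoffice:template", new Prop(Kind.STRING, true));
        p.put("msoffice:last_saved_by", new Prop(Kind.STRING, true));
        p.put("msoffice:revision", new Prop(Kind.STRING, true));
        p.put("msoffice:total_editing_time", new Prop(Kind.LONG, true));
        p.put("msoffice:last_printed", new Prop(Kind.DATE, true));
        p.put("msoffice:created", new Prop(Kind.DATE, false));
        p.put("msoffice:saved", new Prop(Kind.DATE, false));
        p.put("msoffice:pages", new Prop(Kind.LONG, false));
        p.put("msoffice:words", new Prop(Kind.LONG, false));
        p.put("msoffice:characters", new Prop(Kind.LONG, false));
        p.put("msoffice:creating_application", new Prop(Kind.STRING, false));
        p.put("msoffice:thumbnail", new Prop(Kind.BINARY, true));
        return p;
    }

    public static void main(String[] args) {
        // Print one line per property: name, kind, and whether it is marked optional.
        metadataProperties().forEach((name, prop) ->
            System.out.println(name + " " + prop.kind + (prop.optional ? " (optional)" : "")));
    }
}
```

A table like this can be handy when checking sequencer output in a test, but the authoritative definition remains the node type registered with the repository.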
- Author:
- Michael Trezzi, John Verhaeg
METADATA_NODE
public static final String METADATA_NODE
- See Also:
- Constant Field Values
MSOFFICE_PRIMARY_TYPE
public static final String MSOFFICE_PRIMARY_TYPE
- See Also:
- Constant Field Values
MSOFFICE_TITLE
public static final String MSOFFICE_TITLE
- See Also:
- Constant Field Values
MSOFFICE_SUBJECT
public static final String MSOFFICE_SUBJECT
- See Also:
- Constant Field Values
MSOFFICE_AUTHOR
public static final String MSOFFICE_AUTHOR
- See Also:
- Constant Field Values
MSOFFICE_KEYWORDS
public static final String MSOFFICE_KEYWORDS
- See Also:
- Constant Field Values
MSOFFICE_COMMENT
public static final String MSOFFICE_COMMENT
- See Also:
- Constant Field Values
MSOFFICE_TEMPLATE
public static final String MSOFFICE_TEMPLATE
- See Also:
- Constant Field Values
MSOFFICE_LAST_SAVED_BY
public static final String MSOFFICE_LAST_SAVED_BY
- See Also:
- Constant Field Values
MSOFFICE_REVISION
public static final String MSOFFICE_REVISION
- See Also:
- Constant Field Values
MSOFFICE_TOTAL_EDITING_TIME
public static final String MSOFFICE_TOTAL_EDITING_TIME
- See Also:
- Constant Field Values
MSOFFICE_LAST_PRINTED
public static final String MSOFFICE_LAST_PRINTED
- See Also:
- Constant Field Values
MSOFFICE_CREATED
public static final String MSOFFICE_CREATED
- See Also:
- Constant Field Values
MSOFFICE_SAVED
public static final String MSOFFICE_SAVED
- See Also:
- Constant Field Values
MSOFFICE_PAGES
public static final String MSOFFICE_PAGES
- See Also:
- Constant Field Values
MSOFFICE_WORDS
public static final String MSOFFICE_WORDS
- See Also:
- Constant Field Values
MSOFFICE_CHARACTERS
public static final String MSOFFICE_CHARACTERS
- See Also:
- Constant Field Values
MSOFFICE_CREATING_APPLICATION
public static final String MSOFFICE_CREATING_APPLICATION
- See Also:
- Constant Field Values
MSOFFICE_THUMBNAIL
public static final String MSOFFICE_THUMBNAIL
- See Also:
- Constant Field Values
POWERPOINT_SLIDE_NODE
public static final String POWERPOINT_SLIDE_NODE
- See Also:
- Constant Field Values
SLIDE_TITLE
public static final String SLIDE_TITLE
- See Also:
- Constant Field Values
SLIDE_TEXT
public static final String SLIDE_TEXT
- See Also:
- Constant Field Values
SLIDE_NOTES
public static final String SLIDE_NOTES
- See Also:
- Constant Field Values
SLIDE_THUMBNAIL
public static final String SLIDE_THUMBNAIL
- See Also:
- Constant Field Values
EXCEL_FULL_CONTENT
public static final String EXCEL_FULL_CONTENT
- See Also:
- Constant Field Values
EXCEL_SHEET_NAME
public static final String EXCEL_SHEET_NAME
- See Also:
- Constant Field Values
WORD_HEADING_NODE
public static final String WORD_HEADING_NODE
- See Also:
- Constant Field Values
WORD_HEADING_NAME
public static final String WORD_HEADING_NAME
- See Also:
- Constant Field Values
WORD_HEADING_LEVEL
public static final String WORD_HEADING_LEVEL
- See Also:
- Constant Field Values
MSOfficeMetadataSequencer
public MSOfficeMetadataSequencer()
sequence
public void sequence(InputStream stream,
SequencerOutput output,
StreamSequencerContext context)
- Sequence the data found in the supplied stream, placing the output information into the supplied output.
JBoss DNA's SequencingService determines which sequencers to execute by monitoring changes to one or more
workspaces. Changes in those workspaces are aggregated and used to determine which sequencers should
be called. If the sequencer implements this interface, then this method is called with the property that is to be sequenced
along with the interface used to register the output. The framework takes care of the rest.
- Specified by:
sequence in interface StreamSequencer
- Parameters:
stream - the stream with the data to be sequenced; never null
output - the output from the sequencing operation; never null
context - the context for the sequencing operation; never null
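The contract described above (read the supplied stream, then register results on the supplied output) can be sketched without the DNA libraries. Everything in the following example is a stand-in written for illustration: `Output`, `MapOutput`, and the property name `example:byteCount` are hypothetical, the context parameter is omitted, and the real `SequencerOutput` interface may have a different shape. It is not how MSOfficeMetadataSequencer itself is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the "read stream, register output" pattern using stand-in types.
final class SequenceSketch {

    /** Hypothetical stand-in for the output parameter; not the DNA SequencerOutput. */
    interface Output {
        void setProperty(String nodePath, String propertyName, Object value);
    }

    /** Collects registered properties in memory, keyed by "nodePath/propertyName". */
    static final class MapOutput implements Output {
        final Map<String, Object> values = new LinkedHashMap<>();

        @Override
        public void setProperty(String nodePath, String propertyName, Object value) {
            values.put(nodePath + "/" + propertyName, value);
        }
    }

    /** Reads the whole stream and records its byte count under a made-up property. */
    static void sequence(InputStream stream, Output output) {
        long count = 0;
        byte[] buffer = new byte[4096];
        try {
            int n;
            while ((n = stream.read(buffer)) != -1) {
                count += n;
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers of this sketch need no throws clause.
            throw new UncheckedIOException(e);
        }
        output.setProperty("msoffice:metadata", "example:byteCount", count);
    }

    public static void main(String[] args) {
        MapOutput out = new MapOutput();
        sequence(new ByteArrayInputStream(new byte[] {1, 2, 3}), out);
        System.out.println(out.values);
    }
}
```

In the real framework the caller is the SequencingService rather than a `main` method, and the sequencer never writes to the repository directly; it only reports values through the output object it is given.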
Copyright © 2008-Present JBoss a division of Red Hat. All Rights Reserved.