org.apache.clerezza.uima.metadatagenerator.mediatype
Interface MediaTypeTextExtractor

All Known Implementing Classes:
PlainTextExtractor, TikaTextExtractor

public interface MediaTypeTextExtractor

A MediaTypeTextExtractor extracts text from content of one or more specified MediaTypes.
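
As a rough sketch of what an implementation might look like, the hypothetical extractor below handles only text/plain. The MediaType and UnsupportedMediaTypeException types declared here are simplified local stand-ins so that the example compiles on its own; in real code they would be javax.ws.rs.core.MediaType and the Clerezza exception class. SketchPlainTextExtractor is an invented name and is not the shipped PlainTextExtractor. Decoding as UTF-8 and treating a NUL byte as a sign of binary content are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Stand-in for javax.ws.rs.core.MediaType (assumption, keeps the sketch self-contained).
final class MediaType {
    final String type;
    final String subtype;

    MediaType(String type, String subtype) {
        this.type = type;
        this.subtype = subtype;
    }
}

// Stand-in for the Clerezza exception of the same name (assumption).
class UnsupportedMediaTypeException extends Exception {
    UnsupportedMediaTypeException(String message) {
        super(message);
    }
}

// Mirrors the two methods documented on this page.
interface MediaTypeTextExtractor {
    boolean supports(MediaType mediaType);

    String extract(byte[] bytes) throws UnsupportedMediaTypeException;
}

// Hypothetical implementation; not the library's PlainTextExtractor.
class SketchPlainTextExtractor implements MediaTypeTextExtractor {

    @Override
    public boolean supports(MediaType mediaType) {
        return mediaType != null
                && "text".equalsIgnoreCase(mediaType.type)
                && "plain".equalsIgnoreCase(mediaType.subtype);
    }

    @Override
    public String extract(byte[] bytes) throws UnsupportedMediaTypeException {
        // extract receives no MediaType, so the type has to be inferred from
        // the bytes. Assumption: a NUL byte means the input is binary.
        for (byte b : bytes) {
            if (b == 0) {
                throw new UnsupportedMediaTypeException("input does not look like plain text");
            }
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Because extract takes only the raw bytes, an implementation has to decide for itself whether the content matches a type it supports; the NUL-byte check above is just one simple heuristic.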


Method Summary
 String extract(byte[] bytes)
          Extract the text from the provided input if its Media Type is supported.
 boolean supports(javax.ws.rs.core.MediaType mediaType)
          Check if the provided MediaType is supported by this extractor.
 

Method Detail

supports

boolean supports(javax.ws.rs.core.MediaType mediaType)
Check if the provided MediaType is supported by this extractor.

Parameters:
mediaType - the MediaType to check.
Returns:
true if the provided MediaType is supported, false otherwise.

extract

String extract(byte[] bytes)
               throws UnsupportedMediaTypeException
Extract the text from the provided input if its Media Type is supported.

Parameters:
bytes - a byte array containing the input.
Returns:
a String with the extracted text.
Throws:
UnsupportedMediaTypeException - if the implicit Media Type of the input is not supported.
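
From the caller's side, the two methods could be combined as in the sketch below: supports is checked against a declared media type first, and UnsupportedMediaTypeException is still handled because extract judges the bytes themselves. All types are local stand-ins for the real javax.ws.rs.core.MediaType and Clerezza classes, and StrictUtf8Extractor and ExtractionHelper are hypothetical names that do not exist in the library.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Optional;

// Local stand-ins (assumptions) for the real MediaType, exception and interface.
final class MediaType {
    final String type;
    final String subtype;

    MediaType(String type, String subtype) {
        this.type = type;
        this.subtype = subtype;
    }
}

class UnsupportedMediaTypeException extends Exception {
    UnsupportedMediaTypeException(String message, Throwable cause) {
        super(message, cause);
    }
}

interface MediaTypeTextExtractor {
    boolean supports(MediaType mediaType);

    String extract(byte[] bytes) throws UnsupportedMediaTypeException;
}

// Hypothetical extractor: claims text/plain and rejects bytes that are not valid UTF-8.
class StrictUtf8Extractor implements MediaTypeTextExtractor {
    @Override
    public boolean supports(MediaType mediaType) {
        return "text".equals(mediaType.type) && "plain".equals(mediaType.subtype);
    }

    @Override
    public String extract(byte[] bytes) throws UnsupportedMediaTypeException {
        try {
            // A fresh decoder reports malformed input instead of replacing it.
            return StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            throw new UnsupportedMediaTypeException("input is not valid UTF-8 text", e);
        }
    }
}

// Hypothetical helper showing the supports-then-extract calling pattern.
class ExtractionHelper {
    static Optional<String> extractText(List<MediaTypeTextExtractor> extractors,
                                        MediaType declared, byte[] bytes) {
        for (MediaTypeTextExtractor extractor : extractors) {
            if (!extractor.supports(declared)) {
                continue; // this extractor does not claim the declared type
            }
            try {
                return Optional.of(extractor.extract(bytes));
            } catch (UnsupportedMediaTypeException e) {
                // Declared type and actual content disagree; try the next extractor.
            }
        }
        return Optional.empty();
    }
}
```

Returning an empty Optional when nothing matches is a design choice of this sketch; a caller could equally rethrow the last exception.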


Copyright © 2012 The Apache Software Foundation. All Rights Reserved.