org.apache.stanbol.enhancer.engines.opennlp.token.impl
Class OpenNlpTokenizerEngine

java.lang.Object
  extended by org.apache.stanbol.enhancer.servicesapi.impl.AbstractEnhancementEngine<RuntimeException,RuntimeException>
      extended by org.apache.stanbol.enhancer.engines.opennlp.token.impl.OpenNlpTokenizerEngine
All Implemented Interfaces:
org.apache.stanbol.enhancer.servicesapi.EnhancementEngine, org.apache.stanbol.enhancer.servicesapi.ServiceProperties

@Service
@Properties(value={@Property(name="stanbol.enhancer.engine.name",value="opennlp-token"),@Property(name="org.apache.stanbol.enhancer.token.languages",value="*",cardinality=2147483647),@Property(name="service.ranking",intValue=-100)})
public class OpenNlpTokenizerEngine
extends org.apache.stanbol.enhancer.servicesapi.impl.AbstractEnhancementEngine<RuntimeException,RuntimeException>
implements org.apache.stanbol.enhancer.servicesapi.ServiceProperties

A German-language POS tagger. Requires that the content item has a text/plain part and a language id of "de". Adds a POSContentPart to the content item that can be used for further processing by other modules.

Author:
Sebastian Schaffert

Field Summary
static String CONFIG_LANGUAGES
          Language configuration.
 
Fields inherited from interface org.apache.stanbol.enhancer.servicesapi.ServiceProperties
ENHANCEMENT_ENGINE_ORDERING, ORDERING_CONTENT_EXTRACTION, ORDERING_DEFAULT, ORDERING_EXTRACTION_ENHANCEMENT, ORDERING_NLP_CHUNK, ORDERING_NLP_LANGAUGE_DETECTION, ORDERING_NLP_LEMMATIZE, ORDERING_NLP_POS, ORDERING_NLP_SENTENCE_DETECTION, ORDERING_NLP_TOKENIZING, ORDERING_POST_PROCESSING, ORDERING_PRE_PROCESSING
 
Fields inherited from interface org.apache.stanbol.enhancer.servicesapi.EnhancementEngine
CANNOT_ENHANCE, ENHANCE_ASYNC, ENHANCE_SYNCHRONOUS, PROPERTY_NAME
 
Constructor Summary
OpenNlpTokenizerEngine()
           
 
Method Summary
protected  void activate(org.osgi.service.component.ComponentContext ce)
          Activate and read the properties.
 int canEnhance(org.apache.stanbol.enhancer.servicesapi.ContentItem ci)
          Indicate if this engine can enhance supplied ContentItem, and if it suggests enhancing it synchronously or asynchronously.
 void computeEnhancements(org.apache.stanbol.enhancer.servicesapi.ContentItem ci)
          Compute enhancements for supplied ContentItem.
protected  void deactivate(org.osgi.service.component.ComponentContext context)
           
 Map<String,Object> getServiceProperties()
           
 
Methods inherited from class org.apache.stanbol.enhancer.servicesapi.impl.AbstractEnhancementEngine
getName, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

CONFIG_LANGUAGES

public static final String CONFIG_LANGUAGES
Language configuration. Takes a list of ISO language codes for the supported languages. The currently supported languages are those given as the default value.

See Also:
Constant Field Values
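
The following is a minimal sketch of how a language list containing the "*" wildcard (the default value declared in the class annotations) could be evaluated. It is not the engine's actual implementation: the class LanguageFilterSketch and its method isSupported are hypothetical names introduced only for illustration, and the real configuration parser may support additional syntax.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, NOT part of Stanbol: shows one way a configured
// language list with a "*" wildcard could be checked.
class LanguageFilterSketch {
    private final Set<String> languages;

    LanguageFilterSketch(String... configured) {
        // Store the configured ISO codes (or "*") for fast lookup.
        this.languages = new HashSet<>(Arrays.asList(configured));
    }

    // Returns true if the language is listed explicitly or "*" is configured.
    boolean isSupported(String language) {
        if (language == null || language.isEmpty()) {
            return false; // no language detected, nothing to match
        }
        return languages.contains("*") || languages.contains(language);
    }

    public static void main(String[] args) {
        LanguageFilterSketch all = new LanguageFilterSketch("*");
        LanguageFilterSketch some = new LanguageFilterSketch("de", "en");
        System.out.println(all.isSupported("fr"));
        System.out.println(some.isSupported("fr"));
        System.out.println(some.isSupported("de"));
    }
}
```

In this sketch a missing or empty language never matches, even when "*" is configured; whether the real engine behaves the same way is an assumption.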
Constructor Detail

OpenNlpTokenizerEngine

public OpenNlpTokenizerEngine()
Method Detail

canEnhance

public int canEnhance(org.apache.stanbol.enhancer.servicesapi.ContentItem ci)
               throws org.apache.stanbol.enhancer.servicesapi.EngineException
Indicate if this engine can enhance the supplied ContentItem, and if it suggests enhancing it synchronously or asynchronously. The EnhancementJobManager can force sync/async mode if desired; the value returned by the engine is only a suggestion.

Returns ENHANCE_ASYNC if there is a text/plain content part and a tagger is available for the language identified for the content item; returns CANNOT_ENHANCE otherwise.

Specified by:
canEnhance in interface org.apache.stanbol.enhancer.servicesapi.EnhancementEngine
Throws:
org.apache.stanbol.enhancer.servicesapi.EngineException - if the introspecting process of the content item fails
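
The decision rule described above can be sketched as follows. This is an illustration only, not the engine's code: CanEnhanceSketch is a hypothetical class, the two constants are local stand-ins for EnhancementEngine.CANNOT_ENHANCE and EnhancementEngine.ENHANCE_ASYNC with numeric values assumed for the example, and the content item is reduced to a boolean and a language string.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the documented canEnhance decision rule.
class CanEnhanceSketch {
    // Local stand-ins for the EnhancementEngine constants (values assumed).
    static final int CANNOT_ENHANCE = 0;
    static final int ENHANCE_ASYNC = 2;

    // hasPlainText: whether the item has a text/plain part
    // language: language identified for the item, or null if unknown
    // available: languages for which a model was initialised
    static int canEnhance(boolean hasPlainText, String language, Set<String> available) {
        if (hasPlainText && language != null && available.contains(language)) {
            return ENHANCE_ASYNC;
        }
        return CANNOT_ENHANCE;
    }

    public static void main(String[] args) {
        Set<String> available = new HashSet<>(Arrays.asList("de", "en"));
        System.out.println(canEnhance(true, "de", available));
        System.out.println(canEnhance(true, "fr", available));
        System.out.println(canEnhance(false, "de", available));
    }
}
```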

computeEnhancements

public void computeEnhancements(org.apache.stanbol.enhancer.servicesapi.ContentItem ci)
                         throws org.apache.stanbol.enhancer.servicesapi.EngineException
Compute enhancements for supplied ContentItem. The results of the process are expected to be stored in the metadata of the content item.

The client (usually an EnhancementJobManager) should take care of persistent storage of the enhanced ContentItem.

This method creates a new POSContentPart using org.apache.stanbol.enhancer.engines.pos.api.POSTaggerHelper#createContentPart from a text/plain part and stores it as a new part in the content item. The metadata is not changed.

Specified by:
computeEnhancements in interface org.apache.stanbol.enhancer.servicesapi.EnhancementEngine
Throws:
org.apache.stanbol.enhancer.servicesapi.EngineException - if the underlying process failed to work as expected
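
A client is expected to consult canEnhance before calling computeEnhancements. The sketch below shows that calling pattern under simplifying assumptions: the Engine interface is a hypothetical, reduced stand-in for EnhancementEngine (it takes a plain String instead of a ContentItem and declares no EngineException), and the constant value is assumed for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the caller side: check first, then enhance.
class EngineCallSketch {
    // Local stand-in for EnhancementEngine.CANNOT_ENHANCE (value assumed).
    static final int CANNOT_ENHANCE = 0;

    // Simplified stand-in for the EnhancementEngine contract.
    interface Engine {
        int canEnhance(String content);
        void computeEnhancements(String content);
    }

    // Runs the engine only if it reports that it can handle the content.
    // Returns true if computeEnhancements was invoked.
    static boolean enhanceIfPossible(Engine engine, String content) {
        if (engine.canEnhance(content) == CANNOT_ENHANCE) {
            return false;
        }
        engine.computeEnhancements(content);
        return true;
    }

    public static void main(String[] args) {
        final List<String> processed = new ArrayList<>();
        Engine engine = new Engine() {
            public int canEnhance(String c) { return c.isEmpty() ? CANNOT_ENHANCE : 2; }
            public void computeEnhancements(String c) { processed.add(c); }
        };
        System.out.println(enhanceIfPossible(engine, "some text"));
        System.out.println(enhanceIfPossible(engine, ""));
        System.out.println(processed.size());
    }
}
```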

getServiceProperties

public Map<String,Object> getServiceProperties()
Specified by:
getServiceProperties in interface org.apache.stanbol.enhancer.servicesapi.ServiceProperties

activate

@Activate
protected void activate(org.osgi.service.component.ComponentContext ce)
                 throws org.osgi.service.cm.ConfigurationException
Activate and read the properties. Configures and initialises a POSTagger for each language configured in CONFIG_LANGUAGES.

Overrides:
activate in class org.apache.stanbol.enhancer.servicesapi.impl.AbstractEnhancementEngine<RuntimeException,RuntimeException>
Parameters:
ce - the ComponentContext
Throws:
org.osgi.service.cm.ConfigurationException

deactivate

@Deactivate
protected void deactivate(org.osgi.service.component.ComponentContext context)
Overrides:
deactivate in class org.apache.stanbol.enhancer.servicesapi.impl.AbstractEnhancementEngine<RuntimeException,RuntimeException>


Copyright © 2012-2013 The Apache Software Foundation. All Rights Reserved.