org.apache.stanbol.commons.opennlp
Class KeywordTokenizer

java.lang.Object
  extended by org.apache.stanbol.commons.opennlp.KeywordTokenizer
All Implemented Interfaces:
opennlp.tools.tokenize.Tokenizer

public class KeywordTokenizer
extends Object
implements opennlp.tools.tokenize.Tokenizer

Performs tokenization using the whitespace character class. Creates separate tokens for punctuation at the end of words. Intended for extracting alphanumeric keywords from text.
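
The behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the Stanbol implementation. The class name `KeywordTokenizerSketch`, the use of `Character.isWhitespace`, and the set of characters treated as trailing punctuation (`.,;:!?`) are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in; NOT the Stanbol KeywordTokenizer source. */
public class KeywordTokenizerSketch {

    // Assumption: which characters count as trailing punctuation.
    private static final String TRAILING_PUNCTUATION = ".,;:!?";

    /** Splits on whitespace and detaches one trailing punctuation character per word. */
    public static String[] tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        int start = -1; // start offset of the current word, -1 while between words
        for (int i = 0; i <= s.length(); i++) {
            // Treat the end of the input like a whitespace character.
            boolean boundary = i == s.length() || Character.isWhitespace(s.charAt(i));
            if (!boundary) {
                if (start < 0) {
                    start = i;
                }
                continue;
            }
            if (start >= 0) {
                int last = i - 1;
                // Only split when the word is longer than the punctuation itself.
                boolean split = last > start
                        && TRAILING_PUNCTUATION.indexOf(s.charAt(last)) >= 0;
                if (split) {
                    tokens.add(s.substring(start, last)); // the word
                    tokens.add(s.substring(last, i));     // the punctuation token
                } else {
                    tokens.add(s.substring(start, i));
                }
                start = -1;
            }
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String token : tokenize("Apache Stanbol extracts keywords, quickly.")) {
            System.out.println(token);
        }
    }
}
```

Under these assumptions, the input `Apache Stanbol extracts keywords, quickly.` yields seven tokens: `Apache`, `Stanbol`, `extracts`, `keywords`, `,`, `quickly`, and `.`. Punctuation inside a word is left untouched because only the final character of each whitespace-delimited word is inspected.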

Author:
Rupert Westenthaler

Field Summary
static KeywordTokenizer INSTANCE
           
 
Method Summary
 String[] tokenize(String s)
           
 opennlp.tools.util.Span[] tokenizePos(String s)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

INSTANCE

public static final KeywordTokenizer INSTANCE

Method Detail

tokenize

public String[] tokenize(String s)
Specified by:
tokenize in interface opennlp.tools.tokenize.Tokenizer

tokenizePos

public opennlp.tools.util.Span[] tokenizePos(String s)
Specified by:
tokenizePos in interface opennlp.tools.tokenize.Tokenizer
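
An OpenNLP `Span` carries a start offset and an exclusive end offset into the input string, so each token string can be recovered by slicing the input with those offsets. The sketch below shows that relationship. To run without OpenNLP on the classpath it uses `int[]{start, end}` pairs in place of `Span`; the class `SpanOffsetsSketch` and its methods are hypothetical, and the sketch splits on whitespace only.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: int[]{start, end} pairs stand in for opennlp.tools.util.Span. */
public class SpanOffsetsSketch {

    /** Returns {start, end} offsets (end exclusive) of each whitespace-delimited token. */
    public static int[][] whitespaceSpans(String s) {
        List<int[]> spans = new ArrayList<>();
        int start = -1; // -1 while between tokens
        for (int i = 0; i <= s.length(); i++) {
            boolean boundary = i == s.length() || Character.isWhitespace(s.charAt(i));
            if (!boundary && start < 0) {
                start = i;                       // a token begins here
            } else if (boundary && start >= 0) {
                spans.add(new int[] {start, i}); // the token ended just before i
                start = -1;
            }
        }
        return spans.toArray(new int[0][]);
    }

    /** Recovers the token strings by slicing the input with the offsets. */
    public static String[] coveredText(String s, int[][] spans) {
        String[] tokens = new String[spans.length];
        for (int i = 0; i < spans.length; i++) {
            tokens[i] = s.substring(spans[i][0], spans[i][1]);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "keyword search";
        for (int[] span : whitespaceSpans(text)) {
            System.out.println(span[0] + ".." + span[1] + " "
                    + text.substring(span[0], span[1]));
        }
    }
}
```

Keeping offsets rather than strings lets a caller map each token back to its exact position in the original text, for example to highlight matched keywords.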


Copyright © 2010-2013 The Apache Software Foundation. All Rights Reserved.