org.apache.stanbol.commons.opennlp
Class KeywordTokenizer
java.lang.Object
org.apache.stanbol.commons.opennlp.KeywordTokenizer
- All Implemented Interfaces:
- opennlp.tools.tokenize.Tokenizer
public class KeywordTokenizer
- extends Object
- implements opennlp.tools.tokenize.Tokenizer
Tokenizes text by splitting on whitespace characters. Punctuation at the
end of a word is emitted as a separate token.
Intended for extracting alphanumeric
keywords from text.
- Author:
- Rupert Westenthaler
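The behavior described in the class summary can be sketched in plain Java. This is an illustrative stand-in, not the Stanbol source: the class name `KeywordTokenizerSketch` is hypothetical, and it assumes that "punctuation" means any non-alphanumeric character trailing a word and that each such character becomes its own token. The real class may draw these boundaries differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the documented behavior; NOT the Stanbol implementation.
final class KeywordTokenizerSketch {

    /** Splits on whitespace and separates trailing non-alphanumeric characters. */
    static List<String> tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        int n = s.length();
        int i = 0;
        while (i < n) {
            // Skip the whitespace between words.
            while (i < n && Character.isWhitespace(s.charAt(i))) {
                i++;
            }
            if (i >= n) {
                break;
            }
            int start = i;
            // Consume one run of non-whitespace characters.
            while (i < n && !Character.isWhitespace(s.charAt(i))) {
                i++;
            }
            int end = i;
            // Walk back over trailing non-alphanumeric characters.
            int wordEnd = end;
            while (wordEnd > start && !Character.isLetterOrDigit(s.charAt(wordEnd - 1))) {
                wordEnd--;
            }
            if (wordEnd > start) {
                tokens.add(s.substring(start, wordEnd));
            }
            // Assumption: every trailing punctuation character is its own token.
            for (int p = wordEnd; p < end; p++) {
                tokens.add(String.valueOf(s.charAt(p)));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Tokens: Apache | Stanbol | , | built | on | OpenNLP | .
        System.out.println(tokenize("Apache Stanbol, built on OpenNLP."));
    }
}
```

With the actual library on the classpath, the equivalent call would be `KeywordTokenizer.INSTANCE.tokenize(text)`, using the field and method listed below.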
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
INSTANCE
public static final KeywordTokenizer INSTANCE
tokenize
public String[] tokenize(String s)
- Specified by:
tokenize in interface opennlp.tools.tokenize.Tokenizer
tokenizePos
public opennlp.tools.util.Span[] tokenizePos(String s)
- Specified by:
tokenizePos in interface opennlp.tools.tokenize.Tokenizer
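The two methods above return the same tokens in different forms: `tokenize` returns the token strings, while `tokenizePos` returns their character offsets. In OpenNLP, a `Span` has an inclusive start and an exclusive end, so a token's text is `s.substring(start, end)`. The sketch below illustrates that relationship with a hypothetical `Offsets` class standing in for `opennlp.tools.util.Span`. It splits on whitespace only and leaves out the punctuation handling. Whether `KeywordTokenizer` itself derives `tokenize` from `tokenizePos` this way is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; Offsets stands in for opennlp.tools.util.Span.
final class SpanSketch {

    /** Character range of one token: start inclusive, end exclusive. */
    static final class Offsets {
        final int start;
        final int end;

        Offsets(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Returns the offsets of each whitespace-delimited token. */
    static List<Offsets> tokenizePos(String s) {
        List<Offsets> spans = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a token"
        for (int i = 0; i < s.length(); i++) {
            boolean ws = Character.isWhitespace(s.charAt(i));
            if (!ws && start < 0) {
                start = i;                        // a token begins here
            } else if (ws && start >= 0) {
                spans.add(new Offsets(start, i)); // the token ended before i
                start = -1;
            }
        }
        if (start >= 0) {
            spans.add(new Offsets(start, s.length())); // token runs to end of input
        }
        return spans;
    }

    /** Derives the token strings from the offsets. */
    static String[] tokenize(String s) {
        List<Offsets> spans = tokenizePos(s);
        String[] tokens = new String[spans.size()];
        for (int i = 0; i < tokens.length; i++) {
            Offsets o = spans.get(i);
            tokens[i] = s.substring(o.start, o.end);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "foo  bar";
        for (Offsets o : tokenizePos(text)) {
            System.out.println(o.start + ".." + o.end + " " + text.substring(o.start, o.end));
        }
    }
}
```

Offsets are useful when tokens must be mapped back to the source text, for example to highlight a keyword in place.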
Copyright © 2010-2013 The Apache Software Foundation. All Rights Reserved.