public class TextAnalyzer extends Object
| Modifier and Type | Class and Description |
|---|---|
| class | TextAnalyzer.AnalysedText Deprecated. Replaced by STANBOL-733 (Stanbol NLP processing module). |
| static class | TextAnalyzer.TextAnalyzerConfig Deprecated. Replaced by STANBOL-733 (Stanbol NLP processing module). |
| Constructor and Description |
|---|
| TextAnalyzer(OpenNLP openNLP, String language) Deprecated. Creates a TextAnalyzer based on the OpenNLP and the given language and the default configuration. |
| TextAnalyzer(OpenNLP openNLP, String language, TextAnalyzer.TextAnalyzerConfig config) Deprecated. Creates a TextAnalyzer based on the OpenNLP and the given language. |
| Modifier and Type | Method and Description |
|---|---|
| Iterator<TextAnalyzer.AnalysedText> | analyse(String text) Deprecated. Analyses sentence by sentence when Iterator.next() is called on the returned Iterator. |
| TextAnalyzer.AnalysedText | analyseSentence(String sentence) Deprecated. Analyses the passed text in a single chunk. |
| protected opennlp.tools.chunker.ChunkerME | getChunker() Deprecated. |
| TextAnalyzer.TextAnalyzerConfig | getConfig() Deprecated. |
| String | getLanguage() Deprecated. |
| OpenNLP | getOpenNLP() Deprecated. |
| protected opennlp.tools.postag.POSTaggerME | getPosTagger() Deprecated. |
| protected PosTypeChunker | getPosTypeChunker() Deprecated. |
| protected opennlp.tools.sentdetect.SentenceDetector | getSentenceDetector() Deprecated. |
| opennlp.tools.tokenize.Tokenizer | getTokenizer() Deprecated. Getter for the Tokenizer of a given language. |
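
The lazy behaviour described for analyse(String text) can be illustrated with a small stand-alone sketch. It does not use the Stanbol or OpenNLP classes: LazySentenceSketch is a hypothetical name, java.text.BreakIterator stands in for the OpenNLP sentence detector, and plain Strings stand in for AnalysedText. Each sentence is located only when hasNext() or next() is called, and a null detector is assumed to mean "no sentence detector available", in which case the whole text is returned as a single element.

```java
import java.text.BreakIterator;
import java.util.Collections;
import java.util.Iterator;
import java.util.Locale;
import java.util.NoSuchElementException;

/** Hypothetical stand-in, not the Stanbol class: shows lazy per-sentence iteration. */
final class LazySentenceSketch {

    /**
     * Returns an Iterator that locates each sentence only when it is requested.
     * A null detector (assumption: no sentence detector available) yields the whole text once.
     */
    static Iterator<String> analyse(final String text, final BreakIterator detector) {
        if (detector == null) {
            return Collections.singletonList(text).iterator();
        }
        detector.setText(text);
        return new Iterator<String>() {
            private int start = detector.first();
            private String pending; // look-ahead buffer filled by hasNext()

            @Override
            public boolean hasNext() {
                while (pending == null) {
                    int end = detector.next();
                    if (end == BreakIterator.DONE) {
                        return false;
                    }
                    String candidate = text.substring(start, end).trim();
                    start = end;
                    if (!candidate.isEmpty()) {
                        pending = candidate;
                    }
                }
                return true;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String sentence = pending;
                pending = null;
                return sentence;
            }
        };
    }

    public static void main(String[] args) {
        BreakIterator detector = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        Iterator<String> sentences = analyse("First sentence. Second sentence.", detector);
        while (sentences.hasNext()) {
            System.out.println(sentences.next());
        }
    }
}
```

Because nothing is processed before next() is called, a caller that stops iterating early never pays for the remaining sentences.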
public TextAnalyzer(OpenNLP openNLP, String language)
Creates a TextAnalyzer based on the OpenNLP and the given language and the default configuration.
If null is passed as language, then a minimal configuration that tokenizes the text using the SimpleTokenizer is used.
Parameters:
openNLP - The OpenNLP configuration to be used to analyze the text
language - the language, or null if not known.
public TextAnalyzer(OpenNLP openNLP, String language, TextAnalyzer.TextAnalyzerConfig config)
Creates a TextAnalyzer based on the OpenNLP and the given language.
If null is passed as language, then a minimal configuration that tokenizes the text using the SimpleTokenizer is used.
Parameters:
openNLP - The OpenNLP configuration to be used to analyze the text
language - the language, or null if not known.
protected final opennlp.tools.postag.POSTaggerME getPosTagger()
public final opennlp.tools.tokenize.Tokenizer getTokenizer()
Getter for the Tokenizer of a given language.
Parameters:
language - the language
protected final opennlp.tools.chunker.ChunkerME getChunker()
protected final PosTypeChunker getPosTypeChunker()
protected final opennlp.tools.sentdetect.SentenceDetector getSentenceDetector()
public final OpenNLP getOpenNLP()
public final TextAnalyzer.TextAnalyzerConfig getConfig()
public final String getLanguage()
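The constructors state that a null language selects a minimal configuration that only tokenizes the text with the SimpleTokenizer. The sketch below illustrates that kind of fallback without depending on Stanbol or OpenNLP. FallbackTokenizerSketch, simpleTokenize and the tokenizers map are hypothetical names, and simpleTokenize is only an approximation of a character-class tokenizer, not the OpenNLP implementation. Falling back when no tokenizer is registered for a non-null language is an assumption of this sketch, not something this page documents.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in, not the Stanbol or OpenNLP API. */
final class FallbackTokenizerSketch {

    /** Splits text wherever the character class (letter, digit, other) changes; whitespace is dropped. */
    static List<String> simpleTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int previousClass = -1;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            int charClass = Character.isWhitespace(c) ? 0
                    : Character.isLetter(c) ? 1
                    : Character.isDigit(c) ? 2 : 3;
            // A change of character class closes the token collected so far.
            if (charClass != previousClass && current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            if (charClass != 0) {
                current.append(c);
            }
            previousClass = charClass;
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    /**
     * Uses the tokenizer registered for the language. A null language, or
     * (assumption) a language without a registered tokenizer, falls back to simpleTokenize.
     */
    static List<String> tokenize(String language, String text,
                                 Map<String, Function<String, List<String>>> tokenizers) {
        Function<String, List<String>> tokenizer =
                language == null ? null : tokenizers.get(language);
        return tokenizer != null ? tokenizer.apply(text) : simpleTokenize(text);
    }

    public static void main(String[] args) {
        Map<String, Function<String, List<String>>> none = Collections.emptyMap();
        System.out.println(tokenize(null, "Hello, world 42!", none));
    }
}
```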
public TextAnalyzer.AnalysedText analyseSentence(String sentence)
Analyses the passed text in a single chunk.
Parameters:
sentence - the sentence (text) to analyse
public Iterator<TextAnalyzer.AnalysedText> analyse(String text)
Analyses sentence by sentence when Iterator.next() is called on the returned Iterator. Changes to the configuration of this class will have an effect on the analysis results of this iterator.
If no sentence detector is available, the whole text is parsed at once.
Parameters:
text - The text to analyse
Returns:
an Iterator that analyses the text sentence by sentence on calls to Iterator.next().
Copyright © 2010-2014 The Apache Software Foundation. All Rights Reserved.