public class PosTypeChunker extends Object
Chunker that uses the POS tags to build chunks.
It does not implement the Chunker interface because implementing
methods other than Chunker.chunkAsSpans(String[], String[])
is not feasible. Defaults are based on the Penn Treebank tag set.
TODO: Chunker interface
| Constructor and Description |
|---|
| PosTypeChunker(Set<String> buildPosTypes, Set<String> followPosTypes, double minPosProb) Deprecated. Initialise a new PosTypeChunker for the passed POS tag collections. |
| Modifier and Type | Method and Description |
|---|---|
| opennlp.tools.util.Span[] | chunkAsSpans(String[] tokens, String[] tags) Deprecated. Build the chunks based on the passed tokens and POS tags. |
| opennlp.tools.util.Span[] | chunkAsSpans(String[] tokens, String[][] tags, double[][] props) Deprecated. Build the chunks based on the passed tokens and the one or more alternative POS tags detected for the tokens. |
| Set<String> | getChunkPosTypes() Deprecated. The set of POS types used to create Chunks. |
| Set<String> | getFollowedPosTypes() Deprecated. The set of POS types followed to extend Chunks. |
| static PosTypeChunker | getInstance(String lang, double minPosTagProbaility) Deprecated. Creates an instance for the given language based on the configuration within the PosTagsCollectionEnum. |
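To make the roles of the two tag sets concrete, the following is a minimal, self-contained sketch of POS-type based chunking. It is not the PosTypeChunker implementation: the class name PosChunkSketch, the chunk method, the int[] span representation and the example tag sets are all hypothetical, and the real class may apply further rules (for example around tag probabilities or chunk length). The sketch only assumes what the descriptions state: a build type starts a chunk, followed types extend it, and build types always count as followed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Simplified, hypothetical illustration of POS-type based chunking. */
final class PosChunkSketch {

    /** Returns chunks as {start, end} token index pairs, end exclusive. */
    static List<int[]> chunk(String[] tags, Set<String> buildPosTypes, Set<String> followPosTypes) {
        if (buildPosTypes == null || buildPosTypes.isEmpty()) {
            throw new IllegalArgumentException("buildPosTypes MUST NOT be null nor empty");
        }
        // Build types are always followed, so followPosTypes may be null or empty.
        Set<String> followed = new HashSet<>(buildPosTypes);
        if (followPosTypes != null) {
            followed.addAll(followPosTypes);
        }
        List<int[]> spans = new ArrayList<>();
        int i = 0;
        while (i < tags.length) {
            if (!buildPosTypes.contains(tags[i])) {
                i++; // only a build type may start a chunk
                continue;
            }
            int end = i + 1;
            // Extend the chunk while the next tag is a followed type.
            while (end < tags.length && followed.contains(tags[end])) {
                end++;
            }
            spans.add(new int[] {i, end});
            i = end;
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] tokens = {"The", "Apache", "Software", "Foundation", "hosts", "projects"};
        String[] tags = {"DT", "NNP", "NNP", "NNP", "VBZ", "NNS"};
        // Illustrative Penn Treebank tags, not the library defaults.
        Set<String> build = Set.of("NN", "NNS", "NNP", "NNPS");
        Set<String> follow = Set.of("JJ");
        for (int[] span : chunk(tags, build, follow)) {
            System.out.println(String.join(" ", Arrays.copyOfRange(tokens, span[0], span[1])));
        }
    }
}
```

With these inputs the sketch prints "Apache Software Foundation" and "projects". Passing null for followPosTypes gives the same result here, because the noun tags are already followed as build types.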
public PosTypeChunker(Set<String> buildPosTypes, Set<String> followPosTypes, double minPosProb)
Initialise a new PosTypeChunker for the passed POS tag collections. See also PosTagsCollectionEnum.
Note that buildPosTypes are added to the followed ones. Therefore followPosTypes may or may not include some or all buildPosTypes.
Parameters:
buildPosTypes - the POS types that trigger a new Chunk (MUST NOT be null nor empty).
followPosTypes - additional POS types followed to extend Chunks (MAY BE null or empty).

public static PosTypeChunker getInstance(String lang, double minPosTagProbaility)
Creates an instance for the given language based on the configuration within the PosTagsCollectionEnum.
Parameters:
lang - the language.
minPosTagProbaility - the minimum probability a POS tag must have to be processed. POS tags with a lower probability are ignored and assumed to be matching.
Returns:
null if no configuration for the passed language is present in the PosTagsCollectionEnum.

public final Set<String> getFollowedPosTypes()
The set of POS types followed to extend Chunks. This includes the getChunkPosTypes() values.

public final Set<String> getChunkPosTypes()
The set of POS types used to create Chunks.
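The minimum-probability rule described for getInstance(String, double) can be read as a simple predicate. The sketch below is one literal interpretation, not the library's code: MinPosProbSketch and isMatch are hypothetical names, the strict "less than" comparison is an assumption, and how several alternative tags for one token are combined is not covered because the description does not specify it.

```java
import java.util.Set;

/** Hypothetical reading of the minimum POS tag probability rule. */
final class MinPosProbSketch {

    /**
     * A tag below the minimum probability is ignored and assumed to match;
     * otherwise it matches only if it is contained in posTypes.
     */
    static boolean isMatch(String tag, double prob, Set<String> posTypes, double minPosProb) {
        if (prob < minPosProb) {
            return true; // too uncertain: the tag is ignored and treated as matching
        }
        return posTypes.contains(tag);
    }

    public static void main(String[] args) {
        Set<String> nouns = Set.of("NN", "NNS", "NNP", "NNPS");
        System.out.println(isMatch("NN", 0.9, nouns, 0.5)); // true: confident noun tag
        System.out.println(isMatch("VB", 0.9, nouns, 0.5)); // false: confident non-noun tag
        System.out.println(isMatch("VB", 0.2, nouns, 0.5)); // true: below the threshold
    }
}
```

Under this reading, a higher threshold makes the chunker more permissive, since more tags fall below it and are assumed to match.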

public opennlp.tools.util.Span[] chunkAsSpans(String[] tokens, String[] tags)
Build the chunks based on the passed tokens and POS tags. This method is the equivalent of Chunker.chunkAsSpans(String[], String[]).
Parameters:
tokens - the tokens
tags - the POS tags for the tokens

public opennlp.tools.util.Span[] chunkAsSpans(String[] tokens, String[][] tags, double[][] props)
Build the chunks based on the passed tokens and the one or more alternative POS tags detected for the tokens.
Parameters:
tokens - the tokens
tags - the POS tags for the tokens (1D: tokens; 2D: POS tags)

Copyright © 2010-2014 The Apache Software Foundation. All Rights Reserved.