public enum PosTagsCollectionEnum extends Enum<PosTagsCollectionEnum>
| Enum Constant and Description |
|---|
DA_FOLLOW
Deprecated.
POS types that are followed to extend chunks for Danish based on the PAROLE Tagset as
described by this paper
TODO: Someone who speaks Danish should check this List
NOTES:
"U" (unknown) is also included, because most of the examples in the
training data for OpenNLP seem to be good candidates for following
"XA" is included because the examples include units of
"XP" stands for punctuation and such
|
DA_NOUN
Deprecated.
POS types representing Nouns for Danish based on the PAROLE Tagset as
described by this paper
TODO: Someone who speaks Danish should check this List
NOTES:
"XX" and "XR" are also included, because the examples in the
training data for OpenNLP seem to be good candidates for processing
"AC" is included because it refers to numbers
|
DA_VERB
Deprecated.
POS types representing Verbs for Danish based on the PAROLE Tagset as
described by this paper
TODO: Someone who speaks Danish should check this List
|
DE_FOLLOW
Deprecated.
POS types that typically need to be followed to build
TextAnalyzer.AnalysedText.Chunks over
Nouns (e.g. "University_NN of_IN Otago_NNP" or "Geneva_NNP ,_, Ohio_NNP").
|
DE_NOUN
Deprecated.
Noun related POS types for German based on the
STTS Tag Set
|
DE_VERB
Deprecated.
Verb related POS types for German based on the
STTS Tag Set
|
EN_FOLLOW
Deprecated.
POS types that typically need to be followed to build
TextAnalyzer.AnalysedText.Chunks over
Nouns (e.g. "University_NN of_IN Otago_NNP" or "Geneva_NNP ,_, Ohio_NNP").
|
EN_NOUN
Deprecated.
Noun related POS types for English based on the
Penn Treebank tag set.
|
EN_VERB
Deprecated.
Verb related POS types for English based on the
Penn Treebank tag set
|
ES_FOLLOW
Deprecated.
POS types that typically need to be followed to build
TextAnalyzer.AnalysedText.Chunks over
Nouns (e.g. "University_NN of_IN Otago_NNP" or "Geneva_NNP ,_, Ohio_NNP").
|
ES_NOUN
Deprecated.
Noun related POS types for the Spanish language.
|
ES_VERB
Deprecated.
Verb related POS types for the Spanish language.
|
NL_FOLLOW
Deprecated.
POS types followed to build Chunks based on the WOTAN tagset for Dutch
(as used with Mbt).
|
NL_NOUN
Deprecated.
POS types for Nouns based on the WOTAN tagset for Dutch (as used with
Mbt).
|
NL_VERB
Deprecated.
POS types for Verbs based on the WOTAN tagset for Dutch (as used with
Mbt).
|
PT_FOLLOW
Deprecated.
POS types followed to build Chunks based on the
PALAVRAS tag set
for Portuguese.
|
PT_NOUN
Deprecated.
POS types for Nouns based on the
PALAVRAS tag set
for Portuguese.
|
PT_VERB
Deprecated.
POS types for Verbs based on the
PALAVRAS tag set
for Portuguese.
|
SV_FOLLOW
Deprecated.
POS types followed to build Chunks based on the TODO
NOTES: this includes prepositions, parts of idioms, infinitive markers,
as well as all kinds of punctuation
|
SV_NOUN
Deprecated.
POS types for Nouns for Swedish language based on
Lexical categories in MAMBA
NOTE:
This includes all typical noun categories as defined by MAMBA
Unclassifiable part-of-speech and
Numerical "RO"
EN is excluded
|
SV_VERB
Deprecated.
POS types for Verbs of the Swedish language based on the
Lexical categories in MAMBA
|
| Modifier and Type | Method and Description |
|---|---|
String |
getLanguage()
Deprecated.
|
static Set<String> |
getPosTagCollection(String lang,
PosTypeCollectionType type)
Deprecated.
Getter for the POS (Part-of-Speech) tag collection for the given language
and type
|
Set<String> |
getTags()
Deprecated.
Getter for the set of POS tags
|
PosTypeCollectionType |
getType()
Deprecated.
|
static PosTagsCollectionEnum |
valueOf(String name)
Deprecated.
Returns the enum constant of this type with the specified name.
|
static PosTagsCollectionEnum[] |
values()
Deprecated.
Returns an array containing the constants of this enum type, in
the order they are declared.
|
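The lookup behaviour described in the method summary (one tag set per language and type, with null returned when nothing is configured) can be sketched as follows. This is only an illustration under stated assumptions: `TagCollection` and `CollectionType` are hypothetical stand-ins for `PosTagsCollectionEnum` and `PosTypeCollectionType`, the `NOUN`/`VERB`/`FOLLOW` constant names are inferred from the enum constant suffixes, and the tag sets are small Penn Treebank subsets rather than the real configuration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class PosTagLookupSketch {

    /** Hypothetical stand-in for PosTypeCollectionType (constant names are assumed). */
    enum CollectionType { NOUN, VERB, FOLLOW }

    /** Hypothetical stand-in for PosTagsCollectionEnum with illustrative tag subsets. */
    enum TagCollection {
        EN_NOUN("en", CollectionType.NOUN, "NN", "NNS", "NNP", "NNPS"),
        EN_VERB("en", CollectionType.VERB, "VB", "VBD", "VBG", "VBN", "VBP", "VBZ"),
        EN_FOLLOW("en", CollectionType.FOLLOW, "IN", ",");

        private final String language;
        private final CollectionType type;
        private final Set<String> tags;

        TagCollection(String language, CollectionType type, String... tags) {
            this.language = language;
            this.type = type;
            // Expose the tags as a read-only set
            this.tags = Collections.unmodifiableSet(new HashSet<>(Arrays.asList(tags)));
        }

        String getLanguage() { return language; }
        CollectionType getType() { return type; }
        Set<String> getTags() { return tags; }

        /** Returns the tag set for the language and type, or null if none is configured. */
        static Set<String> getPosTagCollection(String lang, CollectionType type) {
            for (TagCollection c : values()) {
                if (c.language.equals(lang) && c.type == type) {
                    return c.tags;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Set<String> nouns = TagCollection.getPosTagCollection("en", CollectionType.NOUN);
        System.out.println("en nouns configured: " + (nouns != null));

        // Callers need a null check for languages without a configuration
        Set<String> missing = TagCollection.getPosTagCollection("fr", CollectionType.NOUN);
        System.out.println(missing == null ? "no configuration for fr" : missing.toString());
    }
}
```

Because a missing configuration is signalled with null rather than an empty set, callers have to check the result before using it.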
public static final PosTagsCollectionEnum EN_NOUN
NOTE: the "``" tag is also added as a noun, because it cannot be found in the official tag set and is sometimes used to tag nouns.
public static final PosTagsCollectionEnum EN_VERB
public static final PosTagsCollectionEnum EN_FOLLOW
POS types that typically need to be followed to build
TextAnalyzer.AnalysedText.Chunks over
Nouns (e.g. "University_NN of_IN Otago_NNP" or "Geneva_NNP ,_, Ohio_NNP").
For English, based on the
Penn Treebank tag set.
public static final PosTagsCollectionEnum DE_NOUN
public static final PosTagsCollectionEnum DE_VERB
public static final PosTagsCollectionEnum DE_FOLLOW
POS types that typically need to be followed to build
TextAnalyzer.AnalysedText.Chunks over
Nouns (e.g. "University_NN of_IN Otago_NNP" or "Geneva_NNP ,_, Ohio_NNP").
For German, based on the
STTS Tag Set.
public static final PosTagsCollectionEnum DA_NOUN
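TODO: Someone who speaks Danish should check this list.
NOTES: "XX" and "XR" are also included, because the examples in the training data for OpenNLP seem to be good candidates for processing. "AC" is included because it refers to numbers.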
public static final PosTagsCollectionEnum DA_VERB
TODO: Someone who speaks Danish should check this List
public static final PosTagsCollectionEnum DA_FOLLOW
TODO: Someone who speaks Danish should check this List
NOTES:
public static final PosTagsCollectionEnum PT_NOUN
TODO: Someone who speaks this language should check this List
NOTES: Currently this includes nouns, proper nouns and numbers. In addition, "vp" was added. "vp" is not part of the POS tag set documentation, but there is a single occurrence in the training set, so the POS tagger sometimes tags words with it.
public static final PosTagsCollectionEnum PT_VERB
TODO: Someone who speaks this language should check this List
public static final PosTagsCollectionEnum PT_FOLLOW
TODO: Someone who speaks this language should check this List
NOTES: Currently this includes punctuation and prepositions.
public static final PosTagsCollectionEnum NL_NOUN
TODO: Someone who speaks this language should check this List
NOTES: This currently includes Nouns, Numbers and "others".
public static final PosTagsCollectionEnum NL_VERB
The tagger does not distinguish between the different forms of verbs. Therefore it is enough to include "V".
public static final PosTagsCollectionEnum NL_FOLLOW
NOTES: This includes only prepositions and punctuation.
public static final PosTagsCollectionEnum SV_NOUN
public static final PosTagsCollectionEnum SV_VERB
public static final PosTagsCollectionEnum SV_FOLLOW
NOTES: this includes prepositions, parts of idioms, infinitive markers, as well as all kinds of punctuation
public static final PosTagsCollectionEnum ES_NOUN
public static final PosTagsCollectionEnum ES_VERB
public static final PosTagsCollectionEnum ES_FOLLOW
POS types that typically need to be followed to build
TextAnalyzer.AnalysedText.Chunks over
Nouns (e.g. "University_NN of_IN Otago_NNP" or "Geneva_NNP ,_, Ohio_NNP").
I was not able to find the list, so POS tag results were used to
create this configuration. For now "SP" and all "F*" tokens referring to '.', ';', ...
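How the NOUN and FOLLOW collections could work together when building chunks, as in the "University_NN of_IN Otago_NNP" example, can be sketched as below. This is one plausible reading and not the actual TextAnalyzer implementation: the class `NounChunkSketch`, the method `buildChunks`, and the two tag sets (small Penn Treebank subsets) are assumptions made for illustration. A chunk starts at a noun, is extended over nouns and followed tags, and is trimmed so that it ends on a noun.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NounChunkSketch {

    // Illustrative subsets only; NOT the real contents of EN_NOUN / EN_FOLLOW
    static final Set<String> NOUN_TAGS = new HashSet<>(Arrays.asList("NN", "NNS", "NNP", "NNPS"));
    static final Set<String> FOLLOW_TAGS = new HashSet<>(Arrays.asList("IN", ","));

    /** Builds noun chunks from whitespace separated tokens written as "word_TAG". */
    static List<String> buildChunks(String taggedText) {
        List<String> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>(); // words of the chunk being built
        int lastNoun = 0;                         // word count up to and including the last noun
        for (String token : taggedText.trim().split("\\s+")) {
            int sep = token.lastIndexOf('_');
            String word = sep < 0 ? token : token.substring(0, sep);
            String tag = sep < 0 ? "" : token.substring(sep + 1);
            if (NOUN_TAGS.contains(tag)) {
                current.add(word);
                lastNoun = current.size();
            } else if (FOLLOW_TAGS.contains(tag) && !current.isEmpty()) {
                // A followed tag keeps an already started chunk open
                current.add(word);
            } else {
                // Any other tag closes the chunk
                flush(chunks, current, lastNoun);
                current.clear();
                lastNoun = 0;
            }
        }
        flush(chunks, current, lastNoun);
        return chunks;
    }

    /** Emits the current chunk, dropping followed words after the last noun. */
    private static void flush(List<String> chunks, List<String> current, int lastNoun) {
        if (lastNoun > 0) {
            chunks.add(String.join(" ", current.subList(0, lastNoun)));
        }
    }

    public static void main(String[] args) {
        System.out.println(buildChunks("University_NN of_IN Otago_NNP")); // prints [University of Otago]
        System.out.println(buildChunks("Geneva_NNP ,_, Ohio_NNP"));       // prints [Geneva , Ohio]
    }
}
```

Without the FOLLOW tags, both examples would split into two single-noun chunks, which is the case the FOLLOW collections are meant to avoid.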
public static PosTagsCollectionEnum[] values()
for (PosTagsCollectionEnum c : PosTagsCollectionEnum.values()) System.out.println(c);
public static PosTagsCollectionEnum valueOf(String name)
name - the name of the enum constant to be returned.
IllegalArgumentException - if this enum type has no constant with the specified name
NullPointerException - if the argument is null
public final Set<String> getTags()
public final String getLanguage()
public final PosTypeCollectionType getType()
public static Set<String> getPosTagCollection(String lang, PosTypeCollectionType type)
lang - the language
type - the type
Returns null if no configuration for the
passed parameters is available.
Copyright © 2010-2014 The Apache Software Foundation. All Rights Reserved.