org.apache.stanbol.commons.opennlp
Enum PosTagsCollectionEnum

java.lang.Object
  extended by java.lang.Enum<PosTagsCollectionEnum>
      extended by org.apache.stanbol.commons.opennlp.PosTagsCollectionEnum
All Implemented Interfaces:
Serializable, Comparable<PosTagsCollectionEnum>

Deprecated. Replaced by STANBOL-733 (Stanbol NLP processing module)

public enum PosTagsCollectionEnum
extends Enum<PosTagsCollectionEnum>

Enumeration with pre-configured sets of POS (part-of-speech) tags for finding nouns, verbs, and so on in different languages.

Author:
Rupert Westenthaler

Enum Constant Summary
DA_FOLLOW
          Deprecated. POS types that are followed to extend chunks for Danish, based on the PAROLE Tagset as described by this paper
DA_NOUN
          Deprecated. POS types representing Nouns for Danish based on the PAROLE Tagset as described by this paper
DA_VERB
          Deprecated. POS types representing Verbs for Danish based on the PAROLE Tagset as described by this paper
DE_FOLLOW
          Deprecated. POS types one typically needs to follow to build TextAnalyzer.AnalysedText.Chunks over nouns, for German based on the STTS Tag Set.
DE_NOUN
          Deprecated. Noun related POS types for German based on the STTS Tag Set
DE_VERB
          Deprecated. Verb related POS types for German based on the STTS Tag Set
EN_FOLLOW
          Deprecated. POS types one typically needs to follow to build TextAnalyzer.AnalysedText.Chunks over nouns, for English based on the Penn Treebank tag set.
EN_NOUN
          Deprecated. Noun-related POS types for English based on the Penn Treebank tag set.
EN_VERB
          Deprecated. Verb related POS types for English based on the Penn Treebank tag set
ES_FOLLOW
          Deprecated. POS types one typically needs to follow to build TextAnalyzer.AnalysedText.Chunks over nouns, for Spanish.
ES_NOUN
          Deprecated. Noun-related POS types for Spanish.
ES_VERB
          Deprecated. Verb related POS types for Spanish language.
NL_FOLLOW
          Deprecated. POS types followed to build Chunks based on the WOTAN tagset for Dutch (as used with Mbt).
NL_NOUN
          Deprecated. POS types for Nouns based on the WOTAN tagset for Dutch (as used with Mbt).
NL_VERB
          Deprecated. POS types for Verbs based on the WOTAN tagset for Dutch (as used with Mbt).
PT_FOLLOW
          Deprecated. POS types followed to build Chunks based on the PALAVRAS tag set for Portuguese.
PT_NOUN
          Deprecated. POS types for Nouns based on the PALAVRAS tag set for Portuguese.
PT_VERB
          Deprecated. POS types for Verbs based on the PALAVRAS tag set for Portuguese.
SV_FOLLOW
          Deprecated. POS types followed to build Chunks based on the TODO
SV_NOUN
          Deprecated. POS types for nouns of the Swedish language, based on the lexical categories in MAMBA. NOTE: This includes all typical noun categories as defined by MAMBA, Unclassifiable part-of-speech, and Numerical; "RO" EN is excluded.
SV_VERB
          Deprecated. POS types for Verbs of the Swedish language based on the Lexical categories in MAMBA
 
Method Summary
 String getLanguage()
          Deprecated.  
static Set<String> getPosTagCollection(String lang, PosTypeCollectionType type)
          Deprecated. Getter for the POS (Part-of-Speech) tag collection for the given language and type
 Set<String> getTags()
          Deprecated. Getter for the set of POS tags
 PosTypeCollectionType getType()
          Deprecated.  
static PosTagsCollectionEnum valueOf(String name)
          Deprecated. Returns the enum constant of this type with the specified name.
static PosTagsCollectionEnum[] values()
          Deprecated. Returns an array containing the constants of this enum type, in the order they are declared.
 
Methods inherited from class java.lang.Enum
clone, compareTo, equals, finalize, getDeclaringClass, hashCode, name, ordinal, toString, valueOf
 
Methods inherited from class java.lang.Object
getClass, notify, notifyAll, wait, wait, wait
 

Enum Constant Detail

EN_NOUN

public static final PosTagsCollectionEnum EN_NOUN
Deprecated. 
Noun-related POS types for English based on the Penn Treebank tag set.

NOTE: the "``" tag is also treated as a noun. It is not part of the official tag set, but it is sometimes used to tag nouns.


EN_VERB

public static final PosTagsCollectionEnum EN_VERB
Deprecated. 
Verb related POS types for English based on the Penn Treebank tag set


EN_FOLLOW

public static final PosTagsCollectionEnum EN_FOLLOW
Deprecated. 
POS types one typically needs to follow to build TextAnalyzer.AnalysedText.Chunks over nouns (e.g. "University_NN of_IN Otago_NNP" or "Geneva_NNP ,_, Ohio_NNP"). For English, based on the Penn Treebank tag set.
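The role of a "follow" set can be sketched as below. This is only an illustration, not the TextAnalyzer implementation: the class name FollowChunkSketch is hypothetical, and the NOUN and FOLLOW sets are assumed stand-ins for EN_NOUN.getTags() and EN_FOLLOW.getTags(), whose actual contents may differ. A chunk starts at a noun, is extended across follow tags, and is trimmed so that it always ends with a noun.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class FollowChunkSketch {
    // Assumed stand-ins for EN_NOUN.getTags() and EN_FOLLOW.getTags();
    // the real sets are defined by the enum and may differ.
    static final Set<String> NOUN = Set.of("NN", "NNS", "NNP", "NNPS");
    static final Set<String> FOLLOW = Set.of("IN", ",");

    /** Groups well-formed "token_TAG" items into chunks that start and end with a noun. */
    public static List<String> chunk(String tagged) {
        List<String> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>(); // tokens of the open chunk
        int lastNoun = 0;                          // length of current up to its last noun
        for (String item : tagged.split(" ")) {
            int sep = item.lastIndexOf('_');       // the tag follows the last underscore
            String token = item.substring(0, sep);
            String tag = item.substring(sep + 1);
            if (NOUN.contains(tag)) {
                current.add(token);
                lastNoun = current.size();
            } else if (!current.isEmpty() && FOLLOW.contains(tag)) {
                current.add(token);                // keep the chunk open across this token
            } else {
                flush(chunks, current, lastNoun);  // any other tag closes the chunk
                current.clear();
                lastNoun = 0;
            }
        }
        flush(chunks, current, lastNoun);
        return chunks;
    }

    // Emits the open chunk, dropping trailing follow tokens after the last noun.
    private static void flush(List<String> chunks, List<String> current, int lastNoun) {
        if (lastNoun > 0) {
            chunks.add(String.join(" ", current.subList(0, lastNoun)));
        }
    }

    public static void main(String[] args) {
        System.out.println(chunk("University_NN of_IN Otago_NNP")); // prints [University of Otago]
        System.out.println(chunk("Geneva_NNP ,_, Ohio_NNP"));       // prints [Geneva , Ohio]
    }
}
```

Without the follow set, both examples would split into two single-noun chunks.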


DE_NOUN

public static final PosTagsCollectionEnum DE_NOUN
Deprecated. 
Noun related POS types for German based on the STTS Tag Set


DE_VERB

public static final PosTagsCollectionEnum DE_VERB
Deprecated. 
Verb related POS types for German based on the STTS Tag Set


DE_FOLLOW

public static final PosTagsCollectionEnum DE_FOLLOW
Deprecated. 
POS types one typically needs to follow to build TextAnalyzer.AnalysedText.Chunks over nouns (e.g. "University_NN of_IN Otago_NNP" or "Geneva_NNP ,_, Ohio_NNP"). For German, based on the STTS Tag Set.


DA_NOUN

public static final PosTagsCollectionEnum DA_NOUN
Deprecated. 
POS types representing Nouns for Danish based on the PAROLE Tagset as described by this paper

TODO: Someone who speaks Danish should check this list.


DA_VERB

public static final PosTagsCollectionEnum DA_VERB
Deprecated. 
POS types representing Verbs for Danish based on the PAROLE Tagset as described by this paper

TODO: Someone who speaks Danish should check this List


DA_FOLLOW

public static final PosTagsCollectionEnum DA_FOLLOW
Deprecated. 
POS types that are followed to extend chunks for Danish, based on the PAROLE Tagset as described by this paper

TODO: Someone who speaks Danish should check this List



PT_NOUN

public static final PosTagsCollectionEnum PT_NOUN
Deprecated. 
POS types for Nouns based on the PALAVRAS tag set for Portuguese.

TODO: Someone who speaks this language should check this List

NOTES: Currently this includes nouns, proper nouns and numbers. In addition, "vp" was added. "vp" is not part of the POS tag set documentation, but it occurs once in the training set, so the POS tagger sometimes tags words with it.


PT_VERB

public static final PosTagsCollectionEnum PT_VERB
Deprecated. 
POS types for Verbs based on the PALAVRAS tag set for Portuguese.

TODO: Someone who speaks this language should check this List


PT_FOLLOW

public static final PosTagsCollectionEnum PT_FOLLOW
Deprecated. 
POS types followed to build Chunks based on the PALAVRAS tag set for Portuguese.

TODO: Someone who speaks this language should check this List

NOTES: Currently this includes punctuation and prepositions.


NL_NOUN

public static final PosTagsCollectionEnum NL_NOUN
Deprecated. 
POS types for Nouns based on the WOTAN tagset for Dutch (as used with Mbt).

TODO: Someone who speaks this language should check this list

NOTES: This currently includes nouns, numbers and "others".


NL_VERB

public static final PosTagsCollectionEnum NL_VERB
Deprecated. 
POS types for Verbs based on the WOTAN tagset for Dutch (as used with Mbt).

The tagger does not distinguish between the different forms of verbs. It is therefore enough to include "V".


NL_FOLLOW

public static final PosTagsCollectionEnum NL_FOLLOW
Deprecated. 
POS types followed to build Chunks based on the WOTAN tagset for Dutch (as used with Mbt).

NOTES: This includes only prepositions and punctuation.


SV_NOUN

public static final PosTagsCollectionEnum SV_NOUN
Deprecated. 
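POS types for nouns of the Swedish language, based on the lexical categories in MAMBA. NOTE: This includes all typical noun categories as defined by MAMBA, Unclassifiable part-of-speech, and Numerical; "RO" EN is excluded.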


SV_VERB

public static final PosTagsCollectionEnum SV_VERB
Deprecated. 
POS types for Verbs of the Swedish language based on the Lexical categories in MAMBA


SV_FOLLOW

public static final PosTagsCollectionEnum SV_FOLLOW
Deprecated. 
POS types followed to build Chunks based on the TODO

NOTES: This includes prepositions, parts of idioms, and the infinitive marker, as well as all kinds of punctuation.


ES_NOUN

public static final PosTagsCollectionEnum ES_NOUN
Deprecated. 
Noun-related POS types for Spanish. The description of the tag set is available at http://www.lsi.upc.edu/~nlp/SVMTool/parole.html


ES_VERB

public static final PosTagsCollectionEnum ES_VERB
Deprecated. 
Verb-related POS types for Spanish. I was not able to find the list, so POS tagging results were used to create this configuration.


ES_FOLLOW

public static final PosTagsCollectionEnum ES_FOLLOW
Deprecated. 
POS types one typically needs to follow to build TextAnalyzer.AnalysedText.Chunks over nouns (e.g. "University_NN of_IN Otago_NNP" or "Geneva_NNP ,_, Ohio_NNP"). I was not able to find the list, so POS tagging results were used to create this configuration.

For now this includes "SP" and all "F*" tags, which refer to '.', ';', ...

Method Detail

values

public static PosTagsCollectionEnum[] values()
Deprecated. 
Returns an array containing the constants of this enum type, in the order they are declared. This method may be used to iterate over the constants as follows:
for (PosTagsCollectionEnum c : PosTagsCollectionEnum.values())
    System.out.println(c);

Returns:
an array containing the constants of this enum type, in the order they are declared

valueOf

public static PosTagsCollectionEnum valueOf(String name)
Deprecated. 
Returns the enum constant of this type with the specified name. The string must match exactly an identifier used to declare an enum constant in this type. (Extraneous whitespace characters are not permitted.)

Parameters:
name - the name of the enum constant to be returned.
Returns:
the enum constant with the specified name
Throws:
IllegalArgumentException - if this enum type has no constant with the specified name
NullPointerException - if the argument is null
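
Because valueOf throws for unknown names and for null, callers that build the constant name at runtime may want to guard the call. The sketch below is self-contained and uses a hypothetical stand-in enum, PosTags, with only a few of the constants listed above; the lookup helper and the assumption that names follow the LANGUAGE_TYPE pattern are illustrative and not part of the Stanbol API.

```java
import java.util.Locale;
import java.util.Optional;

public class ValueOfSketch {
    // Hypothetical stand-in for PosTagsCollectionEnum with a subset of its constants.
    public enum PosTags { EN_NOUN, EN_VERB, EN_FOLLOW, DE_NOUN }

    /** Resolves e.g. ("en", "noun") to EN_NOUN, or an empty Optional if there is no such constant. */
    public static Optional<PosTags> lookup(String lang, String type) {
        if (lang == null || type == null) {
            return Optional.empty(); // avoid calling methods on null arguments
        }
        // valueOf requires an exact match, so trim and upper-case first.
        String name = lang.trim().toUpperCase(Locale.ROOT)
                + "_" + type.trim().toUpperCase(Locale.ROOT);
        try {
            return Optional.of(PosTags.valueOf(name));
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // no constant with that name
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("en", "noun").isPresent()); // prints true
        System.out.println(lookup("fr", "noun").isPresent()); // prints false
    }
}
```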

getTags

public final Set<String> getTags()
Deprecated. 
Getter for the set of POS tags

Returns:
the tags

getLanguage

public final String getLanguage()
Deprecated. 
Returns:
the language

getType

public final PosTypeCollectionType getType()
Deprecated. 
Returns:
the type

getPosTagCollection

public static Set<String> getPosTagCollection(String lang,
                                              PosTypeCollectionType type)
Deprecated. 
Getter for the POS (Part-of-Speech) tag collection for the given language and type

Parameters:
lang - the language
type - the type
Returns:
the collection, or null if no configuration is available for the passed parameters.
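
Since the method returns null for an unconfigured language or type, callers need a null check before using the result. The following self-contained sketch mirrors the documented signature with hypothetical stand-ins: the Type enum (its constants NOUN, VERB and FOLLOW are an assumption about PosTypeCollectionType), the PosTags enum, and the isNoun helper. The tag values are standard Penn Treebank tags chosen for illustration and are not necessarily the sets configured in Stanbol.

```java
import java.util.Set;

public class PosLookupSketch {
    // Assumed shape of PosTypeCollectionType; the real constants may differ.
    public enum Type { NOUN, VERB, FOLLOW }

    // Hypothetical stand-in for PosTagsCollectionEnum with illustrative tag sets.
    public enum PosTags {
        EN_NOUN("en", Type.NOUN, "NN", "NNS", "NNP", "NNPS"),
        EN_VERB("en", Type.VERB, "VB", "VBD", "VBG", "VBN", "VBP", "VBZ");

        private final String language;
        private final Type type;
        private final Set<String> tags;

        PosTags(String language, Type type, String... tags) {
            this.language = language;
            this.type = type;
            this.tags = Set.of(tags); // immutable set
        }

        /** Returns the tag set for the language and type, or null if none is configured. */
        public static Set<String> getPosTagCollection(String lang, Type type) {
            for (PosTags c : values()) {
                if (c.language.equals(lang) && c.type == type) {
                    return c.tags;
                }
            }
            return null;
        }
    }

    /** Null-safe check: false when no noun configuration exists for the language. */
    public static boolean isNoun(String lang, String tag) {
        Set<String> nouns = PosTags.getPosTagCollection(lang, Type.NOUN);
        return nouns != null && nouns.contains(tag);
    }

    public static void main(String[] args) {
        System.out.println(isNoun("en", "NNP")); // prints true
        System.out.println(isNoun("xx", "NNP")); // prints false
    }
}
```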


Copyright © 2010-2013 The Apache Software Foundation. All Rights Reserved.