This noun tagger uses UMBEL reference concepts to tag an input text or Web document. It applies the OBIE (Ontology-Based Information Extraction) method, driven by the UMBEL reference concept ontology. By "noun" we mean that tagging only occurs for the words (tokens) that are considered singular or plural nouns in the sentence(s) of the input text. The nouns are matched to either the preferred labels or the alternative labels of the reference concepts, with the match basis denoted by color. The simple tagger merely makes string matches between the nouns and the possible UMBEL reference concepts.

This tagger matches the plain labels of the reference concepts against the nouns of the input text. No manipulations are performed on the reference concept labels or on the input text unless you enable the stemmer. Also, the tagger performs NO disambiguation when multiple concepts are tagged for a given keyword: every matching concept is reported.
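
As a rough illustration of plain label matching without disambiguation, here is a minimal Python sketch. It is not the tagger's actual implementation: the concept identifiers, the labels, and the function names `build_label_index` and `tag_nouns` are all hypothetical, and the nouns are assumed to have been extracted already by a separate part-of-speech step.

```python
from collections import defaultdict

# Hypothetical reference concepts (not real UMBEL data):
# (concept id, preferred label, alternative labels).
CONCEPTS = [
    ("Dog", "dog", ["canine", "hound"]),
    ("Canine", "canine", []),
    ("Cat", "cat", ["feline"]),
]

def build_label_index(concepts):
    """Map each plain label to every (concept, label type) that carries it."""
    index = defaultdict(list)
    for concept_id, preferred, alternatives in concepts:
        index[preferred].append((concept_id, "preferred"))
        for alt in alternatives:
            index[alt].append((concept_id, "alternative"))
    return index

def tag_nouns(nouns, index):
    """Exact string match of each noun against the labels.

    No normalization and no disambiguation: a noun that matches
    several concepts keeps all of them.
    """
    return {noun: list(index[noun]) for noun in nouns if noun in index}

# Nouns are assumed to come from an upstream part-of-speech tagger.
index = build_label_index(CONCEPTS)
tags = tag_nouns(["canine", "cat", "tree"], index)
```

In this sketch "canine" is tagged with both hypothetical concepts, once through an alternative label and once through a preferred label, while "tree" has no match and is left untagged.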

If the stemmer is enabled, the input text is stemmed and the matches are made against an index in which all the preferred and alternative labels have been stemmed as well. Once the matches occur, the tagger recomposes the text so that the unstemmed versions of the input text and of the tagged reference concepts are presented to the user.
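
The stemmed variant can be sketched as follows. This is an illustration only: `toy_stem` is a deliberately naive stand-in for whatever stemmer the tagger really uses, and the labels, concept identifiers, and function names are hypothetical. The point is that matching happens on stems while the original (unstemmed) noun and label are kept for presentation.

```python
from collections import defaultdict

def toy_stem(word):
    """Naive stand-in stemmer (assumption): lower-case, strip simple plurals."""
    word = word.lower()
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word

# Hypothetical labels: (concept id, label type, original label).
LABELS = [
    ("City", "preferred", "city"),
    ("City", "alternative", "towns"),
    ("Dog", "preferred", "dog"),
]

def build_stemmed_index(labels):
    """Index the labels by their stem, keeping the original label alongside."""
    index = defaultdict(list)
    for concept_id, label_type, label in labels:
        index[toy_stem(label)].append((concept_id, label_type, label))
    return index

def tag_stemmed(nouns, index):
    """Match on stems, but report the unstemmed noun and unstemmed label."""
    results = []
    for noun in nouns:
        for concept_id, label_type, label in index.get(toy_stem(noun), []):
            results.append((noun, concept_id, label_type, label))
    return results

stemmed_index = build_stemmed_index(LABELS)
stemmed_tags = tag_stemmed(["Cities", "town", "Dogs", "glass"], stemmed_index)
```

Here "Cities" matches the label "city" and "town" matches the label "towns" because each pair shares a stem, yet the results carry the words exactly as they appeared in the input and in the label list.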

The results are presented in two sections depending on whether the preferred or alternative label(s) were matched. Multiple matches, either by concept or label type, are coded by color. Source words with matches and multiple source occurrences are ranked first; thereafter, all source words are presented alphabetically.
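
One possible reading of this presentation order is sketched below, assuming that matched words occurring more than once in the source come first (most frequent first) and that the remaining words follow alphabetically. The function name `present` and the sample data are hypothetical, and the tool's real ordering and color coding may differ.

```python
from collections import Counter

def present(matches, nouns):
    """Split matches into preferred/alternative sections and order each one.

    matches: list of (noun, concept id, label type) tuples.
    nouns:   every noun occurrence found in the source text.
    """
    counts = Counter(nouns)
    sections = {"preferred": [], "alternative": []}
    for noun, concept_id, label_type in matches:
        sections[label_type].append((noun, concept_id))

    def rank(entry):
        noun = entry[0]
        # Repeated words get a negative key so they sort ahead of the rest;
        # ties and single-occurrence words fall back to alphabetical order.
        frequency_key = -counts[noun] if counts[noun] > 1 else 0
        return (frequency_key, noun)

    return {name: sorted(rows, key=rank) for name, rows in sections.items()}

source_nouns = ["dog", "cat", "dog", "apple", "hound"]
found = [
    ("cat", "Cat", "preferred"),
    ("dog", "Dog", "preferred"),
    ("apple", "Apple", "preferred"),
    ("hound", "Dog", "alternative"),
]
report = present(found, source_nouns)
```

With this sample input, "dog" leads the preferred-label section because it occurs twice, followed by "apple" and "cat" in alphabetical order, and "hound" appears alone in the alternative-label section.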

This tool is intended for those who want to focus on UMBEL and do not need more complicated matches. The output of the tagger can be used as-is, but it is intended to be the initial input to more sophisticated reference concept matching and disambiguation methods.

