UMBEL Concept Plain Tagger

Enter your text or documents in the area below.

Supported Web formats are: HTML, XML and derivative formats, Microsoft Office documents, OpenDocument Format (ODF), PDF, EPub, plain text, RSS feeds, Atom feeds, image metadata and video metadata.

Paste the source text you want to tag in the text box.
Paste the URL of the document you want to tag in the URL box.

This plain tagger uses UMBEL reference concepts to tag an input text or a Web document. It applies the OBIE (Ontology-Based Information Extraction) method, driven by the UMBEL reference concept ontology. By "plain" we mean that the words (tokens) of the input text are matched to either the preferred labels or the alternative labels of the reference concepts, with the match basis denoted by color. The plain tagger simply makes string matches against the candidate UMBEL reference concepts.

This tagger uses the plain labels of the reference concepts as matches against the input text. No manipulation (such as stemming) is performed on either the reference concept labels or the input text. Also, the tagger performs NO disambiguation when multiple concepts are tagged for a given keyword.
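
The matching described above can be illustrated with a minimal Python sketch. It is not the tagger's actual implementation: the concept identifiers (the "ex:" names), the labels, the tokenization on word characters, and the function names are all hypothetical, and multi-word labels are not handled. It only shows exact, unmodified string matching against preferred and alternative labels, with every candidate concept kept when a word is ambiguous.

```python
import re
from collections import defaultdict

# Hypothetical miniature concept set; real UMBEL reference concepts differ.
CONCEPTS = {
    "ex:FinancialInstitution": {"pref": "bank", "alt": ["lender"]},
    "ex:RiverBank": {"pref": "riverbank", "alt": ["bank"]},
}


def build_label_index(concepts):
    """Map each label string to the (concept, label type) pairs that carry it."""
    index = defaultdict(list)
    for concept, labels in concepts.items():
        index[labels["pref"]].append((concept, "pref"))
        for alt in labels["alt"]:
            index[alt].append((concept, "alt"))
    return index


def plain_tag(text, concepts):
    """Tag tokens by exact string match: no stemming, no case folding,
    and no disambiguation (all matching concepts are kept)."""
    index = build_label_index(concepts)
    matches = defaultdict(list)
    for token in re.findall(r"\w+", text):  # assumed tokenization
        for hit in index.get(token, []):
            if hit not in matches[token]:
                matches[token].append(hit)
    return dict(matches)


result = plain_tag("The bank by the river bank; Banks lend.", CONCEPTS)
print(result)
```

In this sketch "bank" is tagged with both hypothetical concepts, once through a preferred label and once through an alternative label, while "Banks" is not tagged at all because neither the text nor the labels are stemmed or case-folded.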

The results are presented in two sections depending on whether the preferred or alternative label(s) were matched. Multiple matches, either by concept or label type, are coded by color. Source words with matches and multiple source occurrences are ranked first; thereafter, all source words are presented alphabetically.
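
One possible reading of this ordering can be sketched in Python. The function name and the tie-breaking rule (more frequent words first, then alphabetical) are assumptions made for illustration, not a description of the tool's actual code.

```python
from collections import Counter


def order_source_words(tokens, matched):
    """Order matched source words: words occurring more than once come first
    (assumed: most frequent first), then the remaining words alphabetically."""
    counts = Counter(tokens)
    repeated = sorted(
        (w for w in matched if counts[w] > 1),
        key=lambda w: (-counts[w], w),
    )
    single = sorted(w for w in matched if counts[w] <= 1)
    return repeated + single


tokens = ["bank", "river", "bank", "mammal", "river", "bank", "animal"]
matched = {"bank", "river", "mammal", "animal"}
ordered = order_source_words(tokens, matched)
print(ordered)
```

Under this reading, the same ordering would be applied separately within the preferred-label section and the alternative-label section.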

This tool is intended for those who want to focus on UMBEL and do not need more complicated matches. The output of the tagger can be used as-is, but it is intended to be the initial input to more sophisticated reference concept matching and disambiguation methods.

You may also directly access the plain tagger web service endpoint.


Copyright © 2008-2018. Structured Dynamics LLC. All content available via Creative Commons Attribution 3.0