For those who don't know it (which usually means people not active in the NLP/semantics space), WordNet is a free dictionary created at Princeton, based on a theory of how the brain organises words. Words are connected by numerous semantic links, and nouns and verbs are arranged in an ontology that runs from abstract ideas at the top to specific ideas at the bottom.
Although it was not really intended as an NLP resource, it has become useful in semantic applications precisely because of this hierarchy of concepts: if it is known that "a beach house can catch fire", an AI program with WordNet can effortlessly produce the generalisation that "houses can catch fire".
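To make the generalisation step concrete, here is a minimal sketch in Python. It does not use WordNet itself: the `HYPERNYMS` table is a tiny hand-written stand-in for the "is-a" links in the hierarchy, and the function names `hypernym_chain` and `generalise` are made up for this illustration.

```python
# Toy "is-a" links, hand-written for illustration only (not real WordNet data).
HYPERNYMS = {
    "beach house": "house",
    "house": "building",
    "building": "structure",
    "structure": "artifact",
    "artifact": "entity",
}


def hypernym_chain(concept):
    """Return the concepts above `concept` in the hierarchy, nearest first."""
    chain = []
    while concept in HYPERNYMS:
        concept = HYPERNYMS[concept]
        chain.append(concept)
    return chain


def generalise(subject, predicate, levels=1):
    """Propose generalisations by swapping the subject for its hypernyms.

    `levels` limits how far up the hierarchy we are willing to climb.
    """
    return [f"{parent} {predicate}" for parent in hypernym_chain(subject)[:levels]]


print(generalise("beach house", "can catch fire"))
# ['house can catch fire']
```

Climbing one level gives the generalisation from the example above. Climbing further produces increasingly shaky candidates (eventually "entity can catch fire"), so a real application would presumably need some way to decide where to stop.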
Recently, the same group at Princeton created a "semantically annotated gloss corpus" (SACG), in which each word in a definition is linked to the exact sense in which it is used. This has long been considered a useful exercise, as semantically tagged data is in woefully short supply; moreover, such a cross-linked semantic dictionary should in theory aid semantic disambiguation tasks.
Both WordNet 3.0 and the SACG are now available through a web-based interface that combines WordNet lookups and definitions with the full range of semantic links and the SACG, so that the part of speech and semantic definition of each word in a definition (or gloss) are shown.