What does the machine do?
Semantic fields, networks or taxonomies can be used to store lexical knowledge (knowledge that we all take for granted) about the semantic relationships between words. Meijs (1992) describes this knowledge as attempted answers to two questions: Which words `belong' together?
How can we characterize the groupings that emerge?
n. "a road vehicle with usually four wheels which is driven by a motor."
The LDOCE MRD (but not the printed version) contains `subject-field codes' which indicate the semantic field to which the senses of a lexical item belong: zoology, botany, sports, religion, etc. Most main subject fields also contain subfields; for example, `substance' has the subfields `liquid' and `gas'.
The relationships between `subject-field codes' may themselves be represented as a semantic network; one such network shows the top-level ISA relationships between nouns in LDOCE, derived from a study of their meaning definitions.
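To make the idea concrete, the following Python sketch shows one way field and subfield information, together with ISA links, might be held and queried. The codes and links shown are invented for illustration and are not the actual LDOCE inventory.

    # A minimal sketch: subject-field codes and ISA links held as plain
    # dictionaries, with a transitive upward lookup. Invented examples only.

    # Hypothetical subject fields with subfields, in the spirit of LDOCE codes.
    SUBFIELDS = {
        "substance": ["liquid", "gas"],
        "science":   ["zoology", "botany"],
    }

    # Hypothetical ISA links from subfields to their top-level fields.
    ISA = {
        "liquid":  "substance",
        "gas":     "substance",
        "zoology": "science",
        "botany":  "science",
    }

    def isa_chain(code):
        """Follow ISA links upward from a (sub)field code to the top level."""
        chain = [code]
        while chain[-1] in ISA:
            chain.append(ISA[chain[-1]])
        return chain

    if __name__ == "__main__":
        print(isa_chain("liquid"))   # ['liquid', 'substance']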
Semantic disambiguation involves identifying the correct sense or meaning of each word in a particular sentence, using the context in which it occurs as a guide. MRDs list several senses for most words, and very common words have many more; for example, LDOCE gives 13 meaning definitions for `bank'. The task is complicated by the number of possible sense combinations for the words in a sentence, which gives a large search space. For example:
The man drove the car into the bank
(word senses: 8, 10, 3, 9)
Number of possible sense combinations = 8 * 10 * 3 * 9 = 2160
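The search space is simply the Cartesian product of the individual sense inventories, as the short sketch below illustrates using the counts from the example above.

    from itertools import product

    # Sense counts for the four content words in the example sentence above.
    sense_counts = [8, 10, 3, 9]

    # Each candidate interpretation picks one sense per content word, so the
    # search space is the Cartesian product of the individual sense inventories.
    combinations = list(product(*(range(n) for n in sense_counts)))
    print(len(combinations))   # 2160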
Note that no attempt is made to disambiguate function words, since they would greatly increase the ambiguity (for example, `the' has 21 sense meanings and `for' has 35 in LDOCE) without contributing any obvious semantic information to the disambiguation algorithm. (Their sense distinctions are very abstract and do not represent finer-grained subject-field distinctions.)
Two approaches to automatic semantic disambiguation using MRDs have used (i) semantic field codes, and (ii) sense meaning definitions.
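As a rough illustration of approach (ii), the sketch below scores each sense of a target word by the overlap between its meaning definition and the words of the surrounding sentence, in the spirit of Lesk-style definition matching. The two-sense entry for `bank' is invented for illustration, not taken from LDOCE.

    # A minimal sketch of definition-overlap disambiguation: choose the sense
    # whose definition shares the most words with the sentence context.

    TOY_SENSES = {
        "bank": [
            "land along the side of a river or lake",
            "a place where money is kept and paid out on demand",
        ],
    }

    def disambiguate(word, context_words):
        """Return the index of the sense whose definition best overlaps the context."""
        context = set(w.lower() for w in context_words)
        scores = []
        for definition in TOY_SENSES[word]:
            overlap = context & set(definition.lower().split())
            scores.append(len(overlap))
        return scores.index(max(scores))

    if __name__ == "__main__":
        sentence = "the man rowed the boat to the river bank".split()
        print(disambiguate("bank", sentence))   # 0 (the river-bank sense)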
Additional definitions and concepts:
Automatic syntax analysis, or parsing, is one of the oldest and most developed subfields of CL. A parser, as the corresponding analysing system is called, is in the CL sense a program that assigns structural descriptions to input sentences (Hellwig 1989, 348). Parsing methods and parser implementations can be classified, among other criteria, according to parsing strategy (serial or "depth-first" vs. parallel or "breadth-first"; "reductive" or "bottom-up" vs. "expansive" or "top-down"), underlying grammar type ("Chomsky hierarchy") and formalism, parsing technique, and parser architecture (e.g. King, ed., 1983; Hellwig 1989).
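By way of illustration, the following sketch is a serial ("depth-first"), top-down ("expansive") parser for a toy context-free grammar, returning structural descriptions as nested tuples; the grammar and lexicon are invented for this example.

    # A minimal top-down, depth-first parser for a toy grammar (illustration only).

    GRAMMAR = {
        "S":  [["NP", "VP"]],
        "NP": [["Det", "N"]],
        "VP": [["V", "NP"]],
    }
    LEXICON = {"Det": {"the"}, "N": {"man", "car"}, "V": {"drove"}}

    def parse(symbol, words, pos):
        """Yield (tree, next_position) pairs for `symbol` expanded at `pos`."""
        if symbol in LEXICON:                        # pre-terminal: match one word
            if pos < len(words) and words[pos] in LEXICON[symbol]:
                yield (symbol, words[pos]), pos + 1
            return
        for production in GRAMMAR[symbol]:           # try each rule in turn (top-down)
            states = [([], pos)]
            for child in production:
                states = [(kids + [t], q)
                          for kids, p in states
                          for t, q in parse(child, words, p)]
            for kids, p in states:
                yield (symbol, kids), p

    def parse_sentence(sentence):
        words = sentence.split()
        return [tree for tree, p in parse("S", words, 0) if p == len(words)]

    if __name__ == "__main__":
        print(parse_sentence("the man drove the car"))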
A program that merely checks sentence structures in order to decide, for instance, whether they are "well-formed" is a recogniser. A program that assigns descriptive "tags" to parts of utterances, especially words and parts of sentences or phrases, is a tagger, and the corresponding operation is known as tagging. Tags, as a rule, represent grammatical categories ("word class" labels), but sometimes also citation forms of words, such as those used in dictionaries as lemma signs, or various other grammatical and semantic information, depending on the application. Different kinds of tagging are correspondingly distinguished, such as morphological, syntactic or semantic tagging. Automatic word-class tagging together with the determination of citation forms was formerly also termed lemmatisation.
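A simple dictionary-lookup tagger might be sketched as follows; the lexicon, tagset and citation forms are invented for illustration.

    # A minimal sketch of word-class tagging with citation forms (lemmas)
    # assigned from a small hand-made lexicon. Invented tagset and entries.

    LEXICON = {
        "the":   ("DET", "the"),
        "man":   ("NOUN", "man"),
        "drove": ("VERB", "drive"),
        "cars":  ("NOUN", "car"),
    }

    def tag(sentence):
        """Return (word, tag, lemma) triples; unknown words get the tag 'UNK'."""
        return [(w, *LEXICON.get(w, ("UNK", w))) for w in sentence.lower().split()]

    if __name__ == "__main__":
        for word, word_class, lemma in tag("The man drove cars"):
            print(f"{word}\t{word_class}\t{lemma}")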