NLP tasks:
- Classification taks (e.g. spam detection)
- Sequence tasks (e.g. text generation)
- Meaning tasks
A lexical database:
- Nodes are synsets
- Correspond to abstract concepts
- Ployhierarchical structure
- A polyhierarchical structure is one that allows multiple parents.
According to WordNet which is a large lexical database of English, a synset or synonym set is defined as a set of one or more synonyms that are interchangeable in some context without changing the truth value of the proposition in which they are embedded.
Using WordNet, we can programmatically:
- Identify hyponyms (child terms) and hypernyms (parent terms)
- Measure semantic similarity
The process of classifying words into their parts of speech and labelling them accordingly is known as parts-of-speech tagging, POS-tagging, or simply tagging. Parts-of-speech are also known as word classes or lexical categories.
- POS-tagger processes a sequence of words, and attaches a part of speech tag to each word.
- The collection of tags used for a particular task is known as a tagset.
'NaturalLanguageProcessing > Concept' 카테고리의 다른 글
(w08) Vector semantics and embeddings (0) | 2024.05.27 |
---|---|
(w06) N-gram Language Models (0) | 2024.05.14 |
(w04) Regular expression (0) | 2024.04.30 |
(w03) Text processing fundamentals (0) | 2024.04.24 |
(w02) NLP evaluation -basic (0) | 2024.04.17 |