Natural Language Processing (NLP) is informed by a number of perspectives, with several disciplines contributing to it:
- Computer/data science
- Theoretical foundation of computation and practical techniques for implementation
- Information science
- Analysis, classification, manipulation, retrieval and dissemination of information
- Computational Linguistics
- Use of computational techniques to study linguistic phenomena
- Cognitive science
- Study of human information processing (perception, language, reasoning, etc.)
NLP adopts multiple paradigms:
- Symbolic approaches
- Rule-based, hand coded (by linguists/subject matter experts)
- Knowledge-intensive
- Statistical approaches
- Distributional & neural approaches, supervised or unsupervised
- Data-intensive
NLP applications:
- Text categorisation
- Media monitoring
- Classify incoming news stories
- Search engines
- Classify query intent, e.g. search for 'LOG313'
- Spam detection
- Machine translation
- Fully automatic, e.g. Google translate
- Semi-automated
- Helping human translators
- Text summarisation
: to manage information overload, we need to abstract text down to its most important elements, i.e. summarise it
- Single-document vs. multi-document
- Search results
- Word processing
- Research/analysis tools
- Dialog systems
- Chatbots
- Smart speakers
- Smartphone assistants
- Call handling systems
- Travel
- Hospitality
- Banking
- Sentiment analysis
: identify and extract subjective information
- Several sub-tasks:
- Identify polarity
e.g. of movie reviews: positive, negative, or neutral
- Identify emotional states
e.g. angry, sad, happy, etc.
- Subjectivity/objectivity identification
e.g. distinguishing "fact" from opinion
- Feature/aspect-based sentiment analysis
: differentiate between specific features or aspects of entities
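As an illustration of polarity identification, a minimal lexicon-based sketch follows; the word lists are invented for the example rather than taken from any standard sentiment lexicon:

```python
# Minimal lexicon-based polarity scorer (illustrative word lists only)
POSITIVE = {"good", "great", "excellent", "enjoyable", "brilliant"}
NEGATIVE = {"bad", "boring", "awful", "terrible", "dull"}

def polarity(text: str) -> str:
    tokens = text.lower().split()
    # Count positive hits minus negative hits
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("A brilliant and enjoyable film"))  # positive
print(polarity("Boring plot and awful acting"))    # negative
```

Real systems go far beyond this, handling negation ("not good"), intensity, and context, typically with supervised or neural models.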
- Text mining
- Analogy with Data Mining
- Discover or infer new knowledge from unstructured text resources
- A<->B and B<->C
- Infer A<->C?
e.g. a link between migraine headaches and magnesium deficiency (a minimal sketch of this inference appears below)
- Applications in life sciences, media/publishing, counter terrorism and competitive intelligence
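A minimal sketch of this ABC inference pattern (Swanson-style literature-based discovery); the co-occurrence pairs below are invented for illustration:

```python
from itertools import combinations

# Term co-occurrence links as extracted from documents (illustrative data)
links = {("migraine", "serotonin"), ("serotonin", "magnesium"),
         ("migraine", "stress")}

def neighbours(term):
    """All terms directly linked to `term`."""
    return {b for a, b in links if a == term} | {a for a, b in links if b == term}

# ABC inference: A and C share a bridge term B but are not directly linked,
# so A<->C becomes a candidate hypothesis for a human expert to verify.
terms = {t for pair in links for t in pair}
for a, c in combinations(sorted(terms), 2):
    if (a, c) not in links and (c, a) not in links:
        bridges = neighbours(a) & neighbours(c)
        if bridges:
            print(f"candidate link: {a} <-> {c} via {bridges}")
```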
- Question answering
- Going beyond the document retrieval paradigm
: provide specific answers to specific questions
- Natural language generation
- Speech recognition & synthesis
…and lots more
History of NLP
- Foundational Insights: 1940s and 1950s
- Two foundational paradigms:
1. The automaton, which is the essential information processing unit
2. Probabilistic or information-theoretic models
- The automaton arose out of Turing’s (1936) model of algorithmic computation
- Chomsky (1956) considered finite state machines as a way to characterise a grammar
: he was one of the first people to use these ideas
- Shannon (1948) borrowed the concept of entropy from thermodynamics
: Entropy is a measure of uncertainty: the higher the entropy, the greater the uncertainty
- As a way of measuring the information content of a language
- Shannon measured the entropy of English using probabilistic techniques
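A minimal sketch of a unigram (single-character) entropy estimate in this spirit; Shannon's actual estimates of English used much richer n-gram statistics and human prediction experiments:

```python
import math
from collections import Counter

def unigram_entropy(text: str) -> float:
    """H = -sum_x p(x) * log2 p(x), in bits per character."""
    counts = Counter(text)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Evenly mixed symbols give high entropy; repetitive text approaches 0 bits.
print(unigram_entropy("abab"))   # 1.0 bit/char (two equally likely symbols)
print(unigram_entropy("aaaa"))   # 0 bits/char (a single symbol carries no information)
print(unigram_entropy("the cat sat on the mat"))
```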
- Two camps: 1960s and 1970s
- Speech and language processing split into two paradigms:
1. Symbolic:
- Chomsky and others on parsing algorithms
- Artificial intelligence (1956) work on reasoning and logic
- Early natural language understanding (NLU) systems:
- Single-domain pattern matching
- Keyword search
- Heuristics for reasoning
2. Statistical (stochastic)
- Mosteller and Wallace (1964) applied Bayesian methods to the problem of authorship attribution of The Federalist Papers
- Early NLP systems
: ELIZA and SHRDLU were highly influential early NLP systems
- ELIZA
- Wiezenbaum 1966
- Pattern matching (ELIZA used elementary keyword spotting techniques)
- First chatbot
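A minimal sketch of ELIZA-style keyword spotting and response templates; the rules below are invented for illustration and are far simpler than Weizenbaum's DOCTOR script:

```python
import re

# ELIZA-style rules: a keyword pattern and a response template.
# These rules are illustrative, not Weizenbaum's originals.
RULES = [
    (re.compile(r"\bI am (.+)", re.I), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.I), "How long have you felt {0}?"),
    (re.compile(r"\bmy (\w+)", re.I), "Tell me more about your {0}."),
]

def respond(utterance: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            # Reuse the matched fragment in the canned response
            return template.format(*match.groups())
    return "Please go on."  # default when no keyword matches

print(respond("I am unhappy about my job"))  # Why do you say you are unhappy about my job?
print(respond("I feel tired"))               # How long have you felt tired?
```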
- SHRDLU
- Winograd 1972
- Natural language understanding
- Comprehensive grammar of English
SHRDLU simulated a robot embedded in an imaginary "blocks world" of toy objects; the user could interact with this world by asking questions and giving commands.
- Further developments in the 1960s
- First text corpora (corpora is plural of corpus)
- The Brown corpus: a one-million-word collection of samples from 500 written texts from different genres (newspaper, novels, non-fiction, academic, etc.), assembled at Brown University in 1963-64 (Kučera and Francis, 1967; Francis, 1979; Francis and Kučera, 1982), and William S. Y. Wang’s 1967 DOC (Dictionary on Computer)
- Empiricism: 1980s and 1990s
: The rise of the WWW emphasised the need for language-based information retrieval and information extraction.
- The return of two classes of models that had lost popularity:
1. Finite-state models:
- Finite-state morphology by Kaplan and Kay (1981) and models of syntax by Church (1980)
2. Probabilistic and data-driven approaches:
- From speech recognition to part-of-speech tagging, parsing and semantics
- Model evaluation
- Quantitative metrics, comparison of performance with previous published research
- Regular competitive evaluation exercises such as the Message Understanding Conferences (MUC)
- The rise of machine learning: 2000s
: Large amounts of spoken and written language data became available, including annotated collections
e.g. Penn Treebank (Marcus et al. 1993)
- Traditional NLP problems, such as parsing and semantic analysis, became problems for supervised learning
- Unsupervised statistical approaches began to receive renewed attention
- Statistical approaches to machine translation (Brown et al., 1990; Och and Ney, 2003) and topic modelling (Blei et al., 2003) demonstrated that effective applications could be constructed from systems trained on unannotated data
- Cost and difficulty of producing annotated corpora became a limiting factor for supervised approaches
- Ascendance of deep learning: 2010s onwards
- Deep learning methods have become pervasive in NLP and AI in general
- Advances in technology such as GPUs developed for gaming
- Plummeting costs of memory
- Wide availability of software platforms
- Classic ML methods require analysts to select features based on domain knowledge
- Deep learning introduced automated feature engineering: features are generated by the learning system itself
- Collobert et al. (2011) applied convolutional neural nets (CNNs) to POS tagging, chunking, named entity tagging and language modelling
- CNNs unable to handle long-distance contextual information
- Recurrent neural networks (RNNs) process items as a sequence with a "memory" of previous inputs
: This makes them well suited to sequence labelling tasks (a minimal sketch of an RNN cell appears after the list below).
- Applicable to many tasks such as:
- Word-level: named entity recognition, language modelling
- Sentence-level: sentiment analysis, selecting responses to messages
- Language generation for machine translation, image captioning, etc.
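A minimal numpy sketch of a vanilla RNN cell; the dimensions and random weights are illustrative stand-ins for what a trained network would learn:

```python
import numpy as np

rng = np.random.default_rng(0)

# A single vanilla RNN cell: the hidden state h carries a "memory"
# of everything seen so far in the sequence.
d_in, d_hid = 4, 8
W_xh = rng.normal(scale=0.1, size=(d_in, d_hid))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(d_hid, d_hid))  # hidden -> hidden (the recurrence)
b_h = np.zeros(d_hid)

def rnn_forward(inputs):
    h = np.zeros(d_hid)                # initial memory is empty
    states = []
    for x in inputs:                   # one step per sequence item
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)
        states.append(h)               # per-step states support sequence labelling
    return states                      # states[-1] summarises the whole sequence

sequence = [rng.normal(size=d_in) for _ in range(5)]  # e.g. 5 word embeddings
states = rnn_forward(sequence)
print(len(states), states[-1].shape)   # 5 (8,)
```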
- RNNs are supplemented with long short-term memory (LSTM) or gated recurrent unit (GRU) cells to mitigate the 'vanishing gradient problem' and improve training performance.