NLP tasks can be categorised by problem type:

  • Classification
    • Sentiment classification
    • News categorisation
  • Regression
    • Essay scoring
  • Sequence labelling
    • Part of speech tagging, named entity recognition

How do we evaluate the models? Here is an example for a classification problem:

  • Imagine we are building a spam classifier
    : Predict whether email messages will be filtered or not
    • Input = feature matrix (email message)
    • Output = target vector (yes/no)
  • Model could be Naive Bayes, k-nearest neighbour, etc.
  • This is a binary classification problem

In this case, the goal is to predict 'spam' or 'not spam' for email messages. As before:

  1. Choose a class of model
  2. Set model hyperparameters
  3. Configure the data (X and y)
  4. Fit the model to the data
  5. Apply model to new (unseen) data
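The five steps above can be sketched in plain Python. This is a minimal, hand-rolled multinomial Naive Bayes on a hypothetical toy dataset (the email snippets and labels are invented for illustration, not real data):

```python
import math
from collections import Counter, defaultdict

# Step 3: configure the data (X = email messages, y = target labels)
X = ["win money now", "free money offer",
     "meeting at noon", "project meeting notes"]
y = ["spam", "spam", "ham", "ham"]

def fit_naive_bayes(docs, labels, alpha=1.0):
    """Steps 1, 2, and 4: multinomial Naive Bayes with add-alpha smoothing
    (alpha is the model hyperparameter)."""
    word_counts = defaultdict(Counter)   # per-class word frequencies
    class_counts = Counter(labels)       # used for class priors
    vocab = set()
    for doc, label in zip(docs, labels):
        words = doc.lower().split()
        word_counts[label].update(words)
        vocab.update(words)
    return word_counts, class_counts, vocab, alpha

def predict(model, doc):
    """Step 5: pick the class with the highest log posterior."""
    word_counts, class_counts, vocab, alpha = model
    n_docs = sum(class_counts.values())
    best_label, best_lp = None, -math.inf
    for c in class_counts:
        lp = math.log(class_counts[c] / n_docs)      # log prior
        total = sum(word_counts[c].values())
        for w in doc.lower().split():
            # smoothed log likelihood of each word given the class
            lp += math.log((word_counts[c][w] + alpha)
                           / (total + alpha * len(vocab)))
        if lp > best_lp:
            best_label, best_lp = c, lp
    return best_label

model = fit_naive_bayes(X, y)
print(predict(model, "win free money"))   # spam
```

In practice a library such as scikit-learn would replace the hand-written model, but the same five steps (choose model, set hyperparameters, configure X and y, fit, predict) still apply.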

To measure performance, we should consider several factors, including

  • Metric(s)
    • These are quantitative measures that assess how well a model performs. A common metric is accuracy, which is calculated as the number of correct predictions divided by the total number of predictions (n).
  • Balance of the dataset
    • This refers to the distribution of classes within your data. An imbalanced dataset can skew the performance metrics, so it's important to consider this factor as well (for an imbalanced dataset, we can achieve high accuracy simply by always predicting the majority class).
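The majority-class problem is easy to demonstrate numerically. With a hypothetical dataset of 95 negatives and 5 positives, a "model" that always predicts the majority class scores 95% accuracy while being useless:

```python
# Hypothetical imbalanced dataset: 95 negatives, 5 positives
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100          # always predict the majority class

correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)
print(accuracy)             # 0.95, yet every positive case is missed
```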

Another example for a classification problem:

  • Imagine you work in a hospital
    : Predict whether a CT scan shows tumour or not
    • Tumours are rare events, so the classes are imbalanced
      : The cost of missing a tumour is much higher than a 'false alarm'
  • Accuracy is not a good metric

In this case, the confusion matrix can be used to compare the predicted values with the actual values (ground truth):

                        Predicted   Actual
  True Positive (TP)    Positive    Positive
  False Positive (FP)   Positive    Negative
  False Negative (FN)   Negative    Positive
  True Negative (TN)    Negative    Negative


  Confusion Matrix      Actual Positive   Actual Negative
  Predicted Positive    TP                FP
  Predicted Negative    FN                TN


  • Accuracy = (TP + TN) / (TP + TN + FP + FN)
  • Recall = TP / (TP + FN)
    : Recall is the proportion of actual positive values that are predicted positive
  • Precision = TP / (TP + FP)
    : Precision is the proportion of predicted positive values that are actually positive
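The three formulas above can be checked on a small hypothetical example. Continuing the tumour scenario (labels invented for illustration), note how recall exposes the missed tumours even though accuracy looks acceptable:

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN against the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn

# Hypothetical scan labels: 1 = tumour, 0 = no tumour
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy  = (tp + tn) / (tp + tn + fp + fn)   # 5/8 = 0.625
recall    = tp / (tp + fn)                    # 1/3: two of three tumours missed
precision = tp / (tp + fp)                    # 1/2
print(accuracy, recall, precision)
```

Here accuracy (0.625) hides the fact that recall is only 1/3, which is exactly why accuracy is a poor metric when missing a positive is costly.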
