NLP tasks can be categorised by problem type:
- Classification
  - Sentiment classification
  - News categorisation
- Regression
  - Essay scoring
- Sequence labelling
  - Part-of-speech tagging, named entity recognition
How do we evaluate such models? Here is an example of a classification problem:
- Imagine we are building a spam classifier
  - Predict whether email messages will be filtered or not
- Input = feature matrix (email messages)
- Output = target vector (yes/no)
- Model could be Naive Bayes, k-nearest neighbour, etc.
- This is a binary classification problem
In this case, the goal is to predict 'spam' or 'not spam' for email messages. As before:
- Choose a class of model
- Set model hyperparameters
- Configure the data (X and y)
- Fit the model to the data
- Apply model to new (unseen) data
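A minimal sketch of these five steps, assuming scikit-learn and a tiny made-up set of email messages (the texts, labels, and the choice of `MultinomialNB` are illustrative, not part of the original notes):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up toy data: email messages and their spam labels
emails = ["win a free prize now", "meeting at 3pm tomorrow",
          "claim your free reward today", "project report attached"]
labels = ["spam", "not spam", "spam", "not spam"]  # target vector y

# 1-2. Choose a class of model and set its hyperparameters
model = MultinomialNB(alpha=1.0)

# 3. Configure the data: turn raw text into a feature matrix X
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)

# 4. Fit the model to the data
model.fit(X, labels)

# 5. Apply the model to new (unseen) data
new_email = ["free prize waiting for you"]
print(model.predict(vectorizer.transform(new_email)))  # e.g. ['spam']
```

Naive Bayes is only one of the options listed above; k-nearest neighbour or any other classifier with a `fit`/`predict` interface would slot into the same workflow.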
To measure performance, we should consider several factors, including:
- Metric(s)
  - These are quantitative measures that assess how well a model performs.
  - A common metric is accuracy, which is calculated as the number of correct predictions divided by the total number of predictions (n).
- Balance of the dataset
  - This refers to the distribution of classes within your data.
  - An imbalanced dataset can skew the performance metrics, so it's important to consider this factor as well: on an unbalanced dataset, we can achieve high accuracy simply by always predicting the majority class (see the sketch below).
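A quick numeric sketch of that last point (the 95/5 split is an assumed example, not from the notes): a classifier that always predicts the majority class already scores 95% accuracy while missing every positive case.

```python
from sklearn.metrics import accuracy_score

# Assumed example: 95 negative cases and 5 positive cases (imbalanced)
y_true = [0] * 95 + [1] * 5

# Trivial baseline: always predict the majority class (0)
y_pred = [0] * 100

print(accuracy_score(y_true, y_pred))  # 0.95, yet no positive case is ever found
```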
Another example of a classification problem:
- Imagine you work in a hospital
  - Predict whether a CT scan shows a tumour or not
  - The cost of missing a tumour is much higher than a 'false alarm'
- Tumours are rare events, so the classes are unbalanced
- Accuracy is not a good metric
In this case, a confusion matrix can be used to compare the predicted values with the actual values (ground truth):
| Outcome | Predicted | Actual |
|---|---|---|
| True Positive (TP) | Positive | Positive |
| False Positive (FP) | Positive | Negative |
| False Negative (FN) | Negative | Positive |
| True Negative (TN) | Negative | Negative |

| Confusion Matrix | Actual Positive | Actual Negative |
|---|---|---|
| Predicted Positive | TP | FP |
| Predicted Negative | FN | TN |
- Accuracy = (TP + TN) / (TP + TN + FP + FN)
- Recall = TP / (TP + FN)
  - Recall is the proportion of actual positive values that are predicted positive
- Precision = TP / (TP + FP)
  - Precision is the proportion of predicted positive values that are actually positive
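These metrics can be checked directly against a confusion matrix; below is a short sketch using scikit-learn with made-up labels and predictions (1 = tumour, 0 = no tumour):

```python
from sklearn.metrics import confusion_matrix, accuracy_score, recall_score, precision_score

# Made-up ground truth and predictions (1 = tumour, 0 = no tumour)
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1, 0, 0]

# For binary labels, confusion_matrix(...).ravel() returns TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print("Accuracy :", (tp + tn) / (tp + tn + fp + fn))  # same as accuracy_score(y_true, y_pred)
print("Recall   :", tp / (tp + fn))                   # same as recall_score(y_true, y_pred)
print("Precision:", tp / (tp + fp))                   # same as precision_score(y_true, y_pred)
```

In the tumour scenario above, recall is the metric to watch, since a false negative (a missed tumour) is far more costly than a false positive.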