Loss functions (Incorrect predictions penalty)

1 - About

Loss functions define how incorrect predictions are penalized. The optimization problems associated with various linear classifiers are defined as minimizing the loss over the training points (sometimes along with a regularization term).

They can also be used to evaluate the quality of models.

3 - Type

3.1 - Regression

  • Squared loss = <math>(y-\hat{y})^2</math>
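As a quick sketch, the squared loss formula above translates directly into code (the function name is illustrative, not from a library):

```python
def squared_loss(y, y_hat):
    """Squared loss: penalizes errors quadratically, so large errors dominate."""
    return (y - y_hat) ** 2

# A prediction of 2.5 for a true value of 3.0 incurs a loss of 0.25
print(squared_loss(3.0, 2.5))  # 0.25
```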

3.2 - Classification

3.2.1 - 0-1

0-1 loss: Penalty is 0 for correct prediction, and 1 otherwise
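The 0-1 penalty can be sketched as a simple comparison (a minimal illustration, not a library function):

```python
def zero_one_loss(y, y_hat):
    """0-1 loss: 0 for a correct prediction, 1 otherwise."""
    return 0 if y == y_hat else 1

print(zero_one_loss(1, 1))  # 0 (correct prediction)
print(zero_one_loss(1, 0))  # 1 (incorrect prediction)
```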

As the 0-1 loss is not convex, the standard approach is to transform the categorical labels into numerical ones (see Statistics - Dummy (Coding|Variable) - One-hot-encoding (OHE)) and to use a regression loss.
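A minimal sketch of such a one-hot encoding, assuming a hypothetical helper and a made-up category list:

```python
def one_hot(label, categories):
    """Encode a categorical label as a one-hot numerical vector."""
    return [1.0 if c == label else 0.0 for c in categories]

categories = ['cat', 'dog', 'bird']  # hypothetical label set
print(one_hot('dog', categories))    # [0.0, 1.0, 0.0]
```

Each resulting component can then be fitted with a regression loss such as the squared loss.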

3.2.2 - Log

Log loss is defined as: <MATH> \begin{align} \ell_{log}(p, y) = \begin{cases} -\log (p) & \text{if } y = 1\\ -\log(1-p) & \text{if } y = 0 \end{cases} \end{align} </MATH> where

  • <math>p</math> is a probability between 0 and 1. <note tip>A base probability for a binary event is just the mean of the training targets</note><note tip>It can then be compared to the output of a probabilistic model such as logistic regression</note>
  • <math>y</math> is a label of either 0 or 1.

Log loss is a standard evaluation criterion when predicting rare events, such as in click-through rate prediction.
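The baseline mentioned above (using the mean of the training targets as the predicted probability) can be sketched as follows; the labels here are made up for illustration:

```python
from math import log

def log_loss(p, y):
    """Log loss for a predicted probability p and a binary label y."""
    return -log(p) if y == 1 else -log(1 - p)

labels = [0, 0, 1, 0, 1]            # hypothetical training targets
base_p = sum(labels) / len(labels)  # base probability = mean of the labels (0.4)

# Average log loss of the baseline over the training set
avg_loss = sum(log_loss(base_p, y) for y in labels) / len(labels)
print(round(avg_loss, 4))  # 0.673
```

A trained probabilistic model should achieve a lower average log loss than this baseline.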


from math import log

def computeLogLoss(p, y):
    """Calculates the value of log loss for a given probability and label.

    Args:
        p (float): A probability between 0 and 1.
        y (int): A label. Takes on the values 0 and 1.

    Returns:
        float: The log loss value.
    """
    # Clamp p away from 0 and 1 to avoid log(0)
    epsilon = 10e-12
    if p == 0:
        p = epsilon
    elif p == 1:
        p = p - epsilon
    if y == 1:
        return -log(p)
    else:
        return -log(1 - p)

