Data Mining - Rare Event



An event is only ever rare relative to the population being studied: the same event can be rare in one population and common in another.

In high dimensions, all cases are edge cases

Sam Ross

The rate of an event is related to the probability that the event occurs in some small subinterval (of time, space, or another dimension): the smaller the subinterval, the smaller that probability.
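The link between a rate and the probability over a small subinterval can be made concrete under one common modelling assumption, a Poisson process, which this page does not itself name. In that model, the probability of at least one event in an interval of length dt is 1 - exp(-rate * dt), which is approximately rate * dt when dt is small. The sketch below is illustrative only; the function name and the rate value are hypothetical.

```python
import math


def prob_at_least_one_event(rate: float, dt: float) -> float:
    """Probability of one or more events in an interval of length dt,
    assuming events follow a Poisson process with the given rate."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return -math.expm1(-rate * dt)


if __name__ == "__main__":
    rate = 2.0  # hypothetical: 2 events per unit of time
    for dt in (1.0, 0.1, 0.01, 0.001):
        exact = prob_at_least_one_event(rate, dt)
        approx = rate * dt  # first-order approximation, good for small dt
        print(f"dt={dt:<6} exact={exact:.6f} rate*dt={approx:.6f}")
```

As dt shrinks, the exact probability and the product rate * dt converge, which is the sense in which a rate describes probability per small subinterval.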


Example of a rare-event problem:

  • click-through rate prediction
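In click-through data, clicks are typically a small fraction of impressions, so the raw ratio is unstable when few impressions have been observed. A minimal sketch with made-up counts follows. It compares the empirical rate with an estimate smoothed by pseudo-counts. The smoothing and its prior values are assumptions chosen for illustration, not a method this page prescribes, and the function names are hypothetical.

```python
def empirical_ctr(clicks: int, impressions: int) -> float:
    """Raw click-through rate: clicks divided by impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


def smoothed_ctr(
    clicks: int,
    impressions: int,
    prior_clicks: float = 1.0,         # assumed pseudo-count of clicks
    prior_impressions: float = 100.0,  # assumed pseudo-count of impressions
) -> float:
    """Click-through rate with additive (pseudo-count) smoothing."""
    return (clicks + prior_clicks) / (impressions + prior_impressions)


if __name__ == "__main__":
    # Hypothetical counts: (clicks, impressions)
    for clicks, impressions in [(0, 10), (3, 1000), (250, 100000)]:
        raw = empirical_ctr(clicks, impressions)
        smooth = smoothed_ctr(clicks, impressions)
        print(f"{clicks}/{impressions}: raw={raw:.5f} smoothed={smooth:.5f}")
```

With zero clicks in ten impressions the raw rate is exactly zero, while the smoothed estimate stays positive. With many impressions the two estimates nearly coincide.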
