# What is the Cosine Similarity or Cosine Distance? (Measure of Angle)

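The cosine similarity measures the angle between two vectors, independent of their magnitude: you divide the dot product by the product of the magnitudes of the two vectors. Strictly speaking it is a similarity, not a distance. The cosine distance is commonly defined as 1 minus the cosine similarity.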

## Formula

By equating the algebraic and the geometric definitions of the dot product (see Linear Algebra - (Dot|Scalar|Inner) Product of two vectors), we get the cosine similarity, which is the normalized dot product of two vectors: $$similarity = \cos \theta = \frac{a \cdot b}{\|a\| \, \|b\|} = \frac{ \sum_i a_i b_i }{ \sqrt{\sum_i a_i^2} \sqrt{\sum_i b_i^2} }$$
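The formula can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `cosine_similarity` and `cosine_distance` are hypothetical, the vectors are assumed to be equal-length lists of numbers, and the distance uses the common `1 - similarity` convention.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Numerator: the dot product, sum of a_i * b_i
    dot = sum(x * y for x, y in zip(a, b))
    # Denominator: the product of the two Euclidean norms
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle is undefined when one of the vectors has zero length
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Assumed convention: distance = 1 - similarity."""
    return 1.0 - cosine_similarity(a, b)
```

Because of the normalization, scaling a vector does not change the result: `[1, 2, 3]` and `[2, 4, 6]` point in the same direction, so their similarity is 1 (up to floating-point rounding), while perpendicular vectors such as `[1, 0]` and `[0, 1]` give 0.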

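For instance, when the two vectors hold the token counts of two documents:

• If the angle is small (the documents share many tokens), the cosine is large (close to 1).
• If the angle is large (the documents share few tokens), the cosine is small (close to 0, since token counts are never negative).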
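The token intuition can be tried out with a small sketch. It assumes a deliberately naive model: each text is lowercased, split on whitespace, and turned into a vector of token counts. The helper name `token_cosine` is hypothetical, and returning 0.0 for an empty text is a choice made here for convenience.

```python
import math
from collections import Counter


def token_cosine(text_a, text_b):
    """Cosine similarity of two texts represented as token-count vectors."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Only tokens present in both texts contribute to the dot product
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        # Assumption: an empty text is treated as similar to nothing
        return 0.0
    return dot / (norm_a * norm_b)


close = token_cosine("the cat sat on the mat", "the cat lay on the rug")
far = token_cosine("the cat sat on the mat", "stock markets fell today")
print(close > far)  # the pair sharing more tokens scores higher
```

Two texts with no token in common have a dot product of 0, so their cosine is exactly 0. The more tokens they share, the closer the value gets to 1.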
