D3 - Histogram

About

How to build a histogram (frequency distribution) data visualisation in D3.

Usage

Histogram function building

Instantiation

var histogram = d3
  .histogram()

The properties of a histogram function

Accessor

A callback function that returns, for each element of the raw data (the data passed to the histogram function), the value to bin. Its parameters are:

  • the element
  • the index
  • the raw data

Example: histogram.value specifies how to get the value to bin from each element of the data (i.e. a value accessor).

var histogram = d3
  .histogram()
  .value(function(d) {
    return d.price;
  });
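
To make the three accessor parameters concrete, here is a minimal sketch in plain JavaScript, without D3. The helper extractValues and the sample products array are hypothetical and only illustrate how an accessor could be applied to each element; they are not part of the D3 API.

```javascript
// Hypothetical helper (not part of D3): applies a value accessor to every element.
function extractValues(data, accessor) {
  var values = [];
  for (var i = 0; i < data.length; i++) {
    // The accessor receives the element, its index, and the raw data.
    values.push(accessor(data[i], i, data));
  }
  return values;
}

// Hypothetical sample data with a price property, as in the example above.
var products = [{ name: "a", price: 3 }, { name: "b", price: 5 }];
var prices = extractValues(products, function(d) {
  return d.price;
});
console.log(prices); // [3, 5]
```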

Thresholds

Thresholds (also known as breaks) specify how values are divided into bins.

They are defined via:

  • an array of values [x0, x1, …]
  • a generator function to generate an array of values
  • or a number of bins

The generated histogram will have thresholds.length + 1 bins.

histogram.thresholds([count])
histogram.thresholds([thresholds]) 

The default threshold generator uses Sturges' formula.
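
As a rough illustration of Sturges' formula, the bin count for n values is ceil(log2(n)) + 1. The function name sturgesBinCount below is hypothetical. This is only a sketch of the formula itself: D3 may adjust the resulting thresholds to round values, so the actual number of bins can differ.

```javascript
// Hypothetical helper: Sturges' formula, k = ceil(log2(n)) + 1.
function sturgesBinCount(n) {
  return Math.ceil(Math.log2(n)) + 1;
}

console.log(sturgesBinCount(9));  // 5
console.log(sturgesBinCount(24)); // 6
```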

Domain

  • histogram.domain - specify the interval [min, max] of observable values. Values outside this interval are ignored when the bins are generated.

Any threshold values outside the domain are ignored.

Call

var bins = histogram(data);

Return values

  • bins is an array of bins, where each bin is an array containing:
    • the associated elements from the input data. (the length of the bin is the number of elements in that bin)
    • and two attributes:
      • x0 - the lower bound of the bin (inclusive).
      • x1 - the upper bound of the bin (exclusive, except for the last bin).

The first bin.x0 is always equal to the minimum domain value, and the last bin.x1 is always equal to the maximum domain value.
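
The structure described above can be sketched in plain JavaScript. The function binValues is a hypothetical, simplified stand-in written for illustration; it is not D3's implementation and may differ from it in edge cases (for example, a threshold exactly equal to a domain bound).

```javascript
// Hypothetical helper (not part of D3): bins numbers as described above.
function binValues(values, domain, thresholds) {
  var min = domain[0], max = domain[1];
  // Keep only thresholds strictly inside the domain, in ascending order.
  var inner = thresholds
    .filter(function(t) { return t > min && t < max; })
    .sort(function(a, b) { return a - b; });
  // Bin edges: domain minimum, inner thresholds, domain maximum.
  var edges = [min].concat(inner, [max]);
  var bins = [];
  for (var i = 0; i < edges.length - 1; i++) {
    var bin = [];          // each bin is an array of the values it holds
    bin.x0 = edges[i];     // lower bound (inclusive)
    bin.x1 = edges[i + 1]; // upper bound (exclusive, except for the last bin)
    bins.push(bin);
  }
  values.forEach(function(v) {
    if (v < min || v > max) return; // values outside the domain are ignored
    // Advance to the first bin whose upper bound is above v;
    // the last bin also accepts v equal to the domain maximum.
    var k = 0;
    while (k < bins.length - 1 && v >= bins[k].x1) k++;
    bins[k].push(v);
  });
  return bins;
}

var sketchBins = binValues([1,2,3,4,5,6,7,8,9], [1, 9], [0, 6]);
sketchBins.forEach(function(bin) {
  console.log(bin.x0, bin.x1, bin.length);
});
// prints "1 6 5" then "6 9 4"
```

The threshold 0 falls outside the domain [1, 9] and is dropped, so one remaining threshold yields two bins.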

Example

Usage

Basic

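var data = [1,2,3,4,5,6,7,8,9];

// The threshold 0 is below the data extent [1, 9], so it is ignored
var histogram = d3.histogram().thresholds([0,6]);
var bins = histogram(data);
// Expected: two bins, [1, 6) holding 1 to 5 and [6, 9] holding 6 to 9
console.log(bins);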

With an axis

With a scale

var data = [1,2,3,4,5,6,7,8,9];

var x = d3.scaleLinear()
  .domain([0, 10])
  .range([0, 300]);
  
var histogram = d3.histogram()
    .domain(x.domain())
    .thresholds(x.ticks(4));

var bins = histogram(data);
console.log(bins)
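
For intuition about deriving thresholds from a domain, evenly spaced thresholds can be sketched without D3. The helper uniformThresholds is hypothetical. Note that this is a simplification: the ticks method of a D3 scale picks round "nice" values, so its output can differ from this even spacing.

```javascript
// Hypothetical helper (not D3's ticks algorithm): evenly spaced thresholds
// that split the domain into binCount bins of equal width.
function uniformThresholds(domain, binCount) {
  var min = domain[0], max = domain[1];
  var step = (max - min) / binCount;
  var thresholds = [];
  // Only interior breaks are returned; the domain bounds close the outer bins.
  for (var i = 1; i < binCount; i++) {
    thresholds.push(min + i * step);
  }
  return thresholds;
}

console.log(uniformThresholds([0, 10], 5)); // [2, 4, 6, 8]
```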

Complete graphic example

  • Raw Data
var data = [1,2,3,8,7,4,9,8,7,3,4,5,2,1,9,7,8,4,0,2,3,8,7,6];
var min = d3.min(data);
var max = d3.max(data);
var domain = [min,max];
  • Graph data
var margin = { top: 30, right: 30, bottom: 30, left: 50 },
  width = 460 - margin.left - margin.right,
  height = 400 - margin.top - margin.bottom;

// The number of bins 
var Nbin = 10;
  • Because the histogram takes its bin thresholds from the ticks of the x axis, build the x scale first
var x = d3
  .scaleLinear()
  .domain(domain) 
  .range([0, width]); 
  • Build the histogram function and get the bins
var histogram = d3
  .histogram()
  .domain(x.domain()) // use the domain of the x scale
  .thresholds(x.ticks(Nbin)); // use the x ticks as thresholds (about Nbin bins)

// Apply this function to the data to get the bins
var bins = histogram(data);
  • Build the top element of the graphic
// Add the svg element to the body and set the dimensions and margins of the graph
var svg = d3
  .select("body")
  .append("svg")
  .attr("width", width + margin.left + margin.right)
  .attr("height", height + margin.top + margin.bottom)
  .append("g")
  .attr("transform", "translate(" + margin.left + "," + margin.top + ")");
  • Add the x axis
svg
  .append("g")
  .attr("transform", "translate(0," + height + ")")
  .call(d3.axisBottom(x));
  • Add the y axis. The domain goes from 0 to the largest number of elements in a bin (the length attribute of each bin array)
var y = d3
  .scaleLinear()
  .range([height, 0])
  .domain([
    0,
    d3.max(bins, function(d) {
      return d.length;
    })
  ]); 

svg.append("g").call(d3.axisLeft(y));
  • Append the bar rectangles to the svg element
svg
  .selectAll("rect")
  .data(bins)
  .enter()
  .append("rect")
  .attr("x", 1)
  .attr("transform", function(d) {
    return "translate(" + x(d.x0) + "," + y(d.length) + ")";
  })
  .attr("width", function(d) {
    return x(d.x1) - x(d.x0) - 1;
  })
  .attr("height", function(d) {
    return height - y(d.length);
  })
  .style("fill", "#69b3a2");
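
The bar geometry used for the rectangles can be checked by hand with a small sketch in plain JavaScript. The functions linearScale and barGeometry are hypothetical stand-ins (linearScale mimics only the basic mapping of a linear scale), the count property replaces the bin array length, and the y domain maximum of 4 is an assumed illustrative value.

```javascript
// Hypothetical stand-in for a linear scale: maps a domain interval onto a range interval.
function linearScale(domain, range) {
  return function(v) {
    var t = (v - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  };
}

// Geometry of one bar, mirroring the attributes set on each rect.
function barGeometry(bin, x, y, height) {
  return {
    x: x(bin.x0) + 1,                 // translate by x(x0), plus the 1px x offset
    y: y(bin.count),                  // top of the bar (SVG y grows downward)
    width: x(bin.x1) - x(bin.x0) - 1, // leave a 1px gap between bars
    height: height - y(bin.count)     // distance from the bar top down to the x axis
  };
}

var sketchWidth = 380, sketchHeight = 340; // 460 - 50 - 30 and 400 - 30 - 30
var sx = linearScale([0, 9], [0, sketchWidth]);
var sy = linearScale([0, 4], [sketchHeight, 0]); // assumed maximum bin count of 4
var bar = barGeometry({ x0: 2, x1: 3, count: 3 }, sx, sy, sketchHeight);
console.log(bar.y, bar.height); // 85 255
```

Because the y range is inverted ([height, 0]), a larger count gives a smaller y value, which is why the bar height is computed as height - y(count).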
