Number - Pseudo-random Numbers

1 - About

Pseudo-random numbers are a sequence of numbers that looks random but is fully determined by an initial value called the seed. Because a truly random sequence cannot be predicted, this is called pseudo-randomness: if you know the seed and the algorithm, you can predict the output.

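A good pseudo-random sequence should have the following properties:

  • The sequence should not repeat itself while it is being used (a generator has a finite state, so it eventually cycles, but its period should be far longer than the number of values drawn)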
  • The numbers should be spread evenly across the numeric domain (for instance between 0 and 10)

3 - Examples of bad random sequences

Type                          Sequence example
Uniform (constant) sequence   3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
Repeated sequence             0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
Too many low numbers          1 3 2 5 3 9 1 2 4 2 5 1 1 2 8 1 5 2 3 4
Too many even numbers         2 8 4 6 0 9 8 2 4 8 6 4 2 2 5 1 4 8 6 2
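
One way to put a number on how uneven such a sequence is: compare the count of each digit with the count expected from a uniform distribution, using Pearson's chi-square statistic. The sketch below is only an illustration. The function name `chiSquareUniform` is hypothetical, and the four sequences are copied from the table above.

```javascript
// Pearson chi-square statistic of a sequence of integers in [0, k)
// against a uniform expectation (every value equally likely).
function chiSquareUniform(sequence, k) {
  const counts = new Array(k).fill(0);
  for (const value of sequence) {
    counts[value] += 1;
  }
  const expected = sequence.length / k;
  let statistic = 0;
  for (const observed of counts) {
    statistic += (observed - expected) ** 2 / expected;
  }
  return statistic;
}

// The sequences of the table (digits between 0 and 9, so k = 10)
const badSequences = {
  "Uniform (constant) sequence": [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
  "Repeated sequence": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
  "Too many low numbers": [1, 3, 2, 5, 3, 9, 1, 2, 4, 2, 5, 1, 1, 2, 8, 1, 5, 2, 3, 4],
  "Too many even numbers": [2, 8, 4, 6, 0, 9, 8, 2, 4, 8, 6, 4, 2, 2, 5, 1, 4, 8, 6, 2]
};

for (const [name, sequence] of Object.entries(badSequences)) {
  console.log(name + ": chi-square = " + chiSquareUniform(sequence, 10));
}
```

The constant sequence gets a very large statistic (180), while the repeated sequence gets 0 because every digit appears exactly twice. A frequency test alone therefore does not detect repetition: it checks the second property (even spread) but not the first.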

4 - Generator

This sequence is produced by a generator that requires a seed value. For the same seed value, you always get the same sequence of pseudo-random numbers.
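
The role of the seed can be illustrated with a very small generator. The sketch below uses a linear congruential generator with the constants published in Numerical Recipes (multiplier 1664525, increment 1013904223, modulus 2^32). It is only an illustration: the names `createGenerator` and `take` are hypothetical, and this is not the algorithm behind `Math.random`, which does not accept a seed.

```javascript
// Returns a function that yields the next pseudo-random number in [0, 1).
// The whole sequence is determined by the seed.
function createGenerator(seed) {
  let state = seed >>> 0; // keep the state as an unsigned 32-bit integer
  return function next() {
    // state = (a * state + c) mod 2^32
    state = (Math.imul(1664525, state) + 1013904223) >>> 0;
    return state / 4294967296; // divide by 2^32 to scale into [0, 1)
  };
}

// Collects the first n values of a generator
function take(generator, n) {
  const values = [];
  for (let i = 0; i < n; i++) {
    values.push(generator());
  }
  return values;
}

const firstRun = take(createGenerator(42), 5);
const secondRun = take(createGenerator(42), 5);
const otherSeed = take(createGenerator(43), 5);

console.log(firstRun);  // same values as secondRun
console.log(secondRun);
console.log(otherSeed); // a different sequence
```

Two generators created with the same seed produce identical sequences, which is what makes a simulation reproducible.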

Random number generators implemented in software are pseudo-random number generators because the output of a deterministic program cannot be truly random. They are complex because they are deterministic programs that must give the illusion of being non-deterministic.

A random generator may be considered high-quality for simulation while being considered unacceptable for cryptography, where the output must also remain unpredictable to someone who has observed earlier values.
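
In JavaScript, this distinction shows up as two different APIs: `Math.random` for simulation-style use and the Web Crypto `getRandomValues` method for security-sensitive use. The sketch below is a minimal illustration. The helper name `secureRandom` is hypothetical, and the fallback to `require("crypto").webcrypto` is an assumption for Node.js versions that do not expose a global `crypto` object.

```javascript
// Use the global Web Crypto object when it exists (browsers, recent Node.js),
// otherwise fall back to the Node.js "crypto" module.
const cryptoApi = globalThis.crypto ?? require("crypto").webcrypto;

// Returns a number in [0, 1) built from 32 cryptographically strong random bits
function secureRandom() {
  const buffer = new Uint32Array(1);
  cryptoApi.getRandomValues(buffer);
  return buffer[0] / 4294967296; // divide by 2^32
}

console.log("Math.random:  " + Math.random()); // fine for simulations and graphics
console.log("secureRandom: " + secureRandom()); // for tokens, keys, and similar uses
```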

4.1 - With Distribution

A frequent problem in statistical simulations (the Monte Carlo method) is the generation of pseudo-random numbers that are distributed in a given way. Most algorithms are based on a pseudorandom number generator that produces numbers X that are uniformly distributed in the interval [0,1]. These random variates X are then transformed via some algorithm to create a new random variate having the required probability distribution.
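
One common transformation is inverse transform sampling: apply the inverse of the target cumulative distribution function to a uniform variate. The sketch below does this for the exponential distribution, whose inverse is -ln(1 - U) / rate. The function name `sampleExponential` and the choice of the exponential distribution are assumptions made for illustration only.

```javascript
// Draws one value from an exponential distribution with the given rate
// by transforming a uniform variate U in [0, 1).
// "uniform" is the source of uniform numbers (Math.random by default).
function sampleExponential(rate, uniform = Math.random) {
  const u = uniform();
  // 1 - u lies in (0, 1], so the logarithm is always finite
  return -Math.log(1 - u) / rate;
}

// Usage: the sample mean should be close to the theoretical mean 1 / rate
const rate = 2;
const sampleSize = 100000;
let sum = 0;
for (let i = 0; i < sampleSize; i++) {
  sum += sampleExponential(rate);
}
const sampleMean = sum / sampleSize;
console.log("Sample mean = " + sampleMean + ", theoretical mean = " + 1 / rate);
```

Passing the uniform source as a parameter keeps the transformation independent of the generator, so a seeded generator can be plugged in when reproducible results are needed.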

5 - Demo

The graphics below were generated with JavaScript, using the D3 library.

  • A helper function to draw a histogram

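// Draws a histogram into the element whose id is params.selector.
// params.data: the raw values (used for the x domain)
// params.bins: the bins computed by d3.histogram()
function histogram_graphic(params) {
  
  var selector = params.selector;
  var data = params.data;
  var bins = params.bins;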
  
  // data
  var min = d3.min(data);
  var max = d3.max(data);

  // Graphics data
  var margin = { top: 30, right: 30, bottom: 30, left: 50 },
    width = 460 - margin.left - margin.right,
    height = 400 - margin.top - margin.bottom;
  

  // X axis: linear scale that maps the data range [min, max] to the drawing width
  var x = d3
    .scaleLinear()
    .domain([min, max])
    .range([0, width]);


  // Append the svg element to the target element,
  // then set its dimensions and shift the drawing area by the margins
  var svg = d3
    .select("#"+selector)
    .append("svg")
    .attr("width", width + margin.left + margin.right)
    .attr("height", height + margin.top + margin.bottom)
    .append("g")
    .attr("transform", "translate(" + margin.left + "," + margin.top + ")");

  // Add the x axis
  svg
    .append("g")
    .attr("transform", "translate(0," + height + ")")
    .call(d3.axisBottom(x));

  // Y axis: scale and draw:
  var y = d3
    .scaleLinear()
    .range([height, 0])
    .domain([
      0,
      d3.max(bins, function(d) {
        return d.length;
      })
    ]);

  svg.append("g").call(d3.axisLeft(y));

  // Append one rectangle per bin
  svg
    .selectAll("rect")
    .data(bins)
    .enter()
    .append("rect")
    .attr("x", 1)
    .attr("transform", function(d) {
      return "translate(" + x(d.x0) + "," + y(d.length) + ")";
    })
    .attr("width", function(d) {
      return x(d.x1) - x(d.x0) - 1;
    })
    .attr("height", function(d) {
      return height - y(d.length);
    })
    .style("fill", "#69b3a2");
}

  • Creating the population data: uniformly distributed pseudo-random integers

var population_n = 10000;
var population_max = 100;
var population_data = [];

// Each value is an integer drawn uniformly between 0 and population_max - 1
for (var i = 0; i < population_n; i++) {
  var random_value = Math.floor(Math.random() * population_max);
  population_data.push(random_value);
}

  • Building and getting the bins

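// One threshold per integer value, so that each value gets its own bin.
// The upper bound (population_max) is deliberately not a threshold:
// d3.histogram would likely keep it and add an always-empty bin [100, 100].
var thresholds = [];
for (var i = 0; i < population_max; i++) {
   thresholds.push(i);
}

var histogram = d3
    .histogram()
    .domain([0, population_max]) // the value range covered by the bins
    .thresholds(thresholds); // the bin boundaries
var bins = histogram(population_data);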


histogram_graphic({ selector: "population", data: population_data, bins: bins});

  • Computing the error of each bin (its count minus the mean count) and drawing the error histogram

var lengths = bins.map(function (d) { return d.length; });
var lengths_mean = d3.mean(lengths);
console.log("Mean bin count = " + lengths_mean);
var errors = bins
    // Drop outlier bins, for instance an empty bin [100, 100] that can appear
    // when a threshold is equal to the upper bound of the domain
    .filter(function(d) { return Math.abs(d.length - lengths_mean) < 50; })
    .map(function(d) { return d.length - lengths_mean; });
var errors_min = d3.min(errors);
var errors_max = d3.max(errors);

var errors_bins = d3.histogram()
    .domain([errors_min, errors_max]) // the value range covered by the bins
    .thresholds(30) // about 30 bins (D3 treats a count as approximate)
    (errors);

histogram_graphic({ selector: "error", data: errors, bins: errors_bins });

  • The HTML page

<h1>The Population Distribution</h1>
<p>A population was generated with pseudo-random data and has a uniform shape</p>
<p>Number of points (N) = 10000, pseudo-random integer range = [0, 99], number of bins = 100</p>
<div id="population"></div>
<h1>The Distribution of the Error of Each Bin Count Against the Mean Bin Count</h1>
<p>The error distribution is approximately normal (as expected from the Central Limit Theorem, since each bin count is the sum of many independent draws)</p>
<div id="error"></div>
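
The spread of the error histogram can also be checked numerically. With N = 10000 draws and 100 equally likely values, each bin count follows a binomial distribution with p = 1/100, so its mean is N * p = 100 and its standard deviation is sqrt(N * p * (1 - p)), about 9.95. The sketch below compares this value with a fresh simulation. It does not use D3, and the names `binCounts`, `observedSd` and `expectedSd` are hypothetical.

```javascript
// Counts how many of n uniform integer draws in [0, k) fall on each value
function binCounts(n, k) {
  const counts = new Array(k).fill(0);
  for (let i = 0; i < n; i++) {
    counts[Math.floor(Math.random() * k)] += 1;
  }
  return counts;
}

const n = 10000;
const k = 100;
const counts = binCounts(n, k);
const meanCount = n / k; // every draw lands in exactly one bin

// Standard deviation of the bin counts around the mean count
const variance = counts.reduce(function (sum, c) {
  return sum + (c - meanCount) ** 2;
}, 0) / k;
const observedSd = Math.sqrt(variance);

// Standard deviation predicted by the binomial distribution
const p = 1 / k;
const expectedSd = Math.sqrt(n * p * (1 - p));

console.log("Observed standard deviation = " + observedSd.toFixed(2));
console.log("Expected standard deviation = " + expectedSd.toFixed(2));
```

The observed value changes from run to run because `Math.random` is not seeded, but it should stay close to the expected one.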

