(Stream|Pipe|Message Queue|Event Processing)

1 - About

From an abstract point of view, a stream is a sequence of infinite cardinality (size) whose elements are delivered at unknown time intervals.

A finite sequence is called a list.
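
The difference between a stream and a list can be sketched in JavaScript, assuming a generator is an acceptable stand-in for a stream. The names `naturals` and `take` are illustrative and do not come from any library.

```javascript
// An unbounded sequence: the generator never finishes on its own.
function* naturals() {
  let n = 0;
  while (true) {
    yield n++;
  }
}

// Cut a finite prefix (a list) out of a possibly infinite sequence.
function take(iterable, count) {
  const result = [];
  if (count <= 0) return result;
  for (const item of iterable) {
    result.push(item);
    if (result.length >= count) break;
  }
  return result;
}

const firstFive = take(naturals(), 5);
console.log(firstFive); // [ 0, 1, 2, 3, 4 ]
```

The stream itself has no length; only the prefix returned by `take` does.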

Streams:

  • are inputs and outputs of operations
  • may also be buffers (Samza) or not (Java)

Operations:

  • are functional-style operations on streams of elements, such as map-reduce transformations on collections.

Collections are primarily concerned with the efficient management of, and access to, their elements. By contrast, streams do not provide a means to directly access or manipulate their elements, and are instead concerned with declaratively describing their source and the computational operations which will be performed in aggregate on that source.

To perform a computation, stream operations are composed into a stream pipeline. A stream pipeline can be viewed as a query on the stream source.
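
A minimal sketch of such a pipeline, assuming lazy generator functions play the role of stream operations. The helpers `map`, `filter` and `reduce` below are hypothetical and written for this example only.

```javascript
// Lazy building blocks: each one wraps an iterable and yields on demand.
function* map(source, fn) {
  for (const item of source) yield fn(item);
}

function* filter(source, predicate) {
  for (const item of source) {
    if (predicate(item)) yield item;
  }
}

// Terminal operation: consumes the pipeline and produces a single result.
function reduce(source, fn, initial) {
  let acc = initial;
  for (const item of source) acc = fn(acc, item);
  return acc;
}

const source = [1, 2, 3, 4, 5, 6];

// "Sum of the squares of the even elements", described declaratively.
const pipeline = map(filter(source, (x) => x % 2 === 0), (x) => x * x);

// Nothing has been computed yet; the terminal operation triggers the work.
const total = reduce(pipeline, (sum, x) => sum + x, 0);
console.log(total); // 56
```

The pipeline reads like a query on the source: it states what to compute, and the elements are only pulled through when the terminal operation runs.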

Because the processing of a stream never ends either, streams are associated with real-time processing.

A stream-processing algorithm cannot make any assumption based on the size of its input.
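
As an illustration, a mean can be maintained incrementally instead of dividing a total by a length. This is a sketch under the assumption that the stream is exposed as a JavaScript iterable; `runningMean` is a made-up name.

```javascript
// Running mean that is updated one element at a time.
// It never asks the source for its length.
function* runningMean(source) {
  let count = 0;
  let mean = 0;
  for (const value of source) {
    count += 1;
    mean += (value - mean) / count; // incremental update
    yield mean;
  }
}

const readings = [10, 20, 30, 40];
const means = [...runningMean(readings)];
console.log(means); // [ 10, 15, 20, 25 ]
```

A result is available after every element, so the computation stays meaningful even if the source never ends.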

A system that manages streams is called a messaging system.

Stream processing lets us model systems that have state without ever using assignment or mutable data.
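
One possible reading of this idea, sketched in JavaScript: an account balance is not a variable that gets overwritten, but a stream derived from the stream of withdrawals. The function `balances` is hypothetical; every binding in it is a `const`, although the underlying iterator is still consumed as it is read.

```javascript
// The balance over time is itself a stream,
// derived from the stream of withdrawal amounts.
function* balances(balance, amounts) {
  const { value, done } = amounts.next();
  if (done) return;
  const next = balance - value; // a new value, not a reassignment
  yield next;
  yield* balances(next, amounts);
}

const withdrawals = [10, 20, 30][Symbol.iterator]();
const history = [...balances(100, withdrawals)];
console.log(history); // [ 90, 70, 40 ]
```

The state at any moment is simply the latest element of the output stream. For very long streams the recursion would need to be replaced by a loop; it is kept here to mirror the functional formulation.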

The data structures involved in a stream application are then:

  • a message
  • and a queue to store messages.

An application that handles these messages is called a messaging system.
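
These two structures can be sketched as follows. The field names (`key`, `payload`, `timestamp`) and the `MessageQueue` class are assumptions made for the example, not the format of any particular messaging system.

```javascript
// A message: an immutable record with a key, a payload and a timestamp.
function createMessage(key, payload, timestamp) {
  return Object.freeze({ key, payload, timestamp });
}

// A first-in, first-out queue that stores messages until they are consumed.
class MessageQueue {
  #messages = [];

  enqueue(message) {
    this.#messages.push(message);
  }

  // Returns the oldest message, or undefined when the queue is empty.
  dequeue() {
    return this.#messages.shift();
  }

  get size() {
    return this.#messages.length;
  }
}

const queue = new MessageQueue();
queue.enqueue(createMessage("user-1", "page_view", 1000));
queue.enqueue(createMessage("user-2", "click", 1005));

const first = queue.dequeue();
console.log(first.payload, queue.size); // page_view 1
```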

In a stream architecture, stream processing uses the observer pattern:

  • Something happened (A new element in the stream such as an Event),
  • Subscribe to it (Streams)
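
The two steps above can be sketched with a small observer implementation. The `Stream` class and its `subscribe` and `publish` methods are hypothetical names chosen for this example.

```javascript
class Stream {
  #subscribers = new Set();

  // Register interest in future elements; returns a function to unsubscribe.
  subscribe(callback) {
    this.#subscribers.add(callback);
    return () => this.#subscribers.delete(callback);
  }

  // Something happened: notify every subscriber.
  publish(element) {
    for (const callback of this.#subscribers) callback(element);
  }
}

const activity = new Stream();
const seen = [];

const unsubscribe = activity.subscribe((event) => seen.push(event.type));

activity.publish({ type: "click" });
activity.publish({ type: "scroll" });
unsubscribe();
activity.publish({ type: "click" }); // no longer observed

console.log(seen); // [ 'click', 'scroll' ]
```

The producer does not know who consumes its elements; it only pushes them to whoever subscribed.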

A table is a stream of data manipulations viewed over an infinite window.
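
To illustrate, a table can be rebuilt by replaying every change from the beginning of the stream. The change format (`op`, `key`, `value`) is an assumption made for this sketch.

```javascript
// Apply one data-manipulation event to a table and return the new table.
function applyChange(table, change) {
  const next = new Map(table);
  if (change.op === "delete") {
    next.delete(change.key);
  } else {
    next.set(change.key, change.value); // insert and update are both upserts
  }
  return next;
}

const changes = [
  { op: "insert", key: "alice", value: 1 },
  { op: "insert", key: "bob", value: 5 },
  { op: "update", key: "alice", value: 2 },
  { op: "delete", key: "bob" },
];

// The window covers every change seen so far.
const table = changes.reduce(applyChange, new Map());
console.log([...table]); // [ [ 'alice', 2 ] ]
```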

The world is concurrent. Things in the world don’t share data. Things communicate with messages. Things fail.

Event sourcing describes a process as a sequence of events.
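
A minimal sketch of this idea, using an order as the process. The event names and the `evolve` function are invented for the example.

```javascript
// The process is stored as the events that happened, in order.
const events = [
  { type: "OrderPlaced", orderId: 42 },
  { type: "OrderShipped", orderId: 42 },
  { type: "OrderDelivered", orderId: 42 },
];

// Compute the next state from the previous state and one event.
function evolve(state, event) {
  switch (event.type) {
    case "OrderPlaced":
      return { orderId: event.orderId, status: "placed" };
    case "OrderShipped":
      return { ...state, status: "shipped" };
    case "OrderDelivered":
      return { ...state, status: "delivered" };
    default:
      return state; // unknown events leave the state untouched
  }
}

// The current state is a fold over all events.
const current = events.reduce(evolve, null);
// Any past state can be recovered by replaying a prefix.
const afterTwo = events.slice(0, 2).reduce(evolve, null);

console.log(current.status, afterTwo.status); // delivered shipped
```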

3 - Definition

A stream is an infinite sequence of elements delivered at unknown time intervals.


<div id="graph"></div>


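// Return a random number in [0, 1).
function rand() {
  return Math.random();
}

// Draw the initial trace with three random points.
Plotly.newPlot('graph', [{
  y: [1, 2, 3].map(rand),
  mode: 'lines',
  line: {color: '#80CAF6'}
}]);

var cnt = 0;

// Every 300 ms, append one new random point to trace 0,
// simulating elements arriving on a stream.
var interval = setInterval(function() {

  Plotly.extendTraces('graph', {
    y: [[rand()]]
  }, [0]);

  // Stop after 100 updates.
  if (++cnt === 100) clearInterval(interval);
}, 300);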


4 - Streaming concepts

5 - Architecture

A messaging technology needs to have the following characteristics:

  • Replayable
  • Persistent
  • Capable of high performance at large scale
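
The first two characteristics can be sketched with an append-only log that readers consume by offset. This is an in-memory illustration only: the `Log` class is hypothetical, and a real system would write the entries to durable storage.

```javascript
class Log {
  #entries = [];

  // Appending never modifies earlier entries; returns the new entry's offset.
  append(message) {
    this.#entries.push(message);
    return this.#entries.length - 1;
  }

  // Reading does not remove anything, so any reader can replay from any offset.
  *readFrom(offset = 0) {
    for (let i = offset; i < this.#entries.length; i++) {
      yield { offset: i, message: this.#entries[i] };
    }
  }
}

const log = new Log();
log.append("a");
log.append("b");
log.append("c");

const fullReplay = [...log.readFrom(0)].map((entry) => entry.message);
const lateReader = [...log.readFrom(1)].map((entry) => entry.message);

console.log(fullReplay, lateReader); // [ 'a', 'b', 'c' ] [ 'b', 'c' ]
```

Unlike a queue that drops a message once it is consumed, the log keeps every message, which is what makes replay possible.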

5.1 - Vision

Real-time MapReduce                       | Event-driven microservices
------------------------------------------+---------------------------------------------------
Storm, Spark Streaming, Flink             | Kafka Streams API
Central cluster                           | Embedded library in any Java app
Custom packaging, deployment & monitoring | Just Kafka and your app
Suitable for analytics-type use cases     | Makes stream processing accessible to any use case

6 - Event Centric

7 - Example

Streams of data

  • user activity on a website
  • sensor readings from devices (IoT)
  • order delivery

8 - Documentation / Reference

