Data Processing - (Pipeline | Compose | Chain)

About

A pipeline is a finite or infinite automaton (also known as a stream) in which the output of one operation becomes the input of the next.

A pipeline creates a composition relationship.

A pipeline is also known as:

  • Compose
  • Chain (for instance Chain of command) - Daisy Chain ;)

A pipeline follows a compositional structure known as a cascade of operations.

Flow vs Pipeline

A dataflow (data workflow) is:

Model

A pipeline is a combination of:

  • an input source
  • an output destination
  • a sequence of data, where each element may be a byte or an object. In the general sense, an element is called a message.
  • operations that may be chained to form a pipeline
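
The model above can be sketched in Python. This is only a minimal illustration under stated assumptions: the names source, non_empty, to_upper and run_pipeline are hypothetical, messages are assumed to be strings, and a plain list stands in for the output destination.

```python
from typing import Callable, Iterable, Iterator, List

Message = str  # assumption: each message is a string
Operation = Callable[[Iterable[Message]], Iterator[Message]]


def source() -> Iterator[Message]:
    """Input source: yields messages one at a time."""
    yield from ["alpha", "", "beta"]


def non_empty(messages: Iterable[Message]) -> Iterator[Message]:
    """Operation: drop empty messages."""
    return (m for m in messages if m)


def to_upper(messages: Iterable[Message]) -> Iterator[Message]:
    """Operation: transform each message to upper case."""
    return (m.upper() for m in messages)


def run_pipeline(
    src: Iterable[Message],
    operations: List[Operation],
    sink: List[Message],
) -> None:
    """Chain the operations and write every resulting message to the sink."""
    stream = src
    for operation in operations:
        stream = operation(stream)  # output of one step feeds the next
    sink.extend(stream)


sink: List[Message] = []  # output destination (a list, for illustration)
run_pipeline(source(), [non_empty, to_upper], sink)
print(sink)  # ['ALPHA', 'BETA']
```

Each operation takes a stream of messages and returns a new stream, which is what allows any number of operations to be chained between the source and the destination.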

Type

Imperative

The pipeline is executed step by step: each step runs as soon as it is reached.
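
As a rough illustration of the imperative style (a hypothetical example, not taken from a specific library), each step below runs immediately and produces a complete intermediate result before the next step starts.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Step 1 runs now and materializes a full list.
evens = [n for n in numbers if n % 2 == 0]

# Step 2 runs now on the result of step 1.
squares = [n * n for n in evens]

# Step 3 runs now on the result of step 2.
total = sum(squares)

print(total)  # 56
```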

Declarative

The pipeline is executed only when the terminal operation is called.

The steps build up a composite type known as an algebraic data type.
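
Python generators can illustrate this deferred execution. The sketch below is an assumption-laden example: read, double and the log list are hypothetical names used only to show when the work actually happens.

```python
from typing import Iterator, List

log: List[str] = []  # records when work actually happens


def read(values: List[int]) -> Iterator[int]:
    """Source step: emits each value and logs the moment it does so."""
    for v in values:
        log.append(f"read {v}")
        yield v


def double(values: Iterator[int]) -> Iterator[int]:
    """Operation step: doubles each value and logs the moment it does so."""
    for v in values:
        log.append(f"double {v}")
        yield v * 2


# Building the pipeline only describes the steps; no element is processed yet.
pipeline = double(read([1, 2]))
steps_before = len(log)

# The terminal operation pulls the elements through every step.
result = list(pipeline)

print(steps_before)  # 0
print(result)        # [2, 4]
print(log)           # ['read 1', 'double 1', 'read 2', 'double 2']
```

The log shows that nothing ran while the pipeline was being composed, and that each element then travelled through all the steps before the next element was read.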

Example

Shell

In an OS shell (Dos, bash), a series of commands connected by pipe operators forms a pipeline. See Shell Data Processing - Pipeline.

Code

By returning the calling object from a function, you can compose (or chain) function calls. See Design Pattern - (Object) Builder. When operations are composed (chained), the output of one operation becomes the input of the next, and the operations are applied from left to right.
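
Both ideas can be sketched in Python. The QueryBuilder class and the pipe helper are hypothetical names introduced for this illustration; they are not part of any particular library.

```python
from functools import reduce
from typing import Callable, List


class QueryBuilder:
    """Hypothetical builder: each method returns self so calls can be chained."""

    def __init__(self) -> None:
        self._parts: List[str] = []

    def select(self, columns: str) -> "QueryBuilder":
        self._parts.append(f"SELECT {columns}")
        return self  # returning the calling object enables chaining

    def from_table(self, table: str) -> "QueryBuilder":
        self._parts.append(f"FROM {table}")
        return self

    def build(self) -> str:
        return " ".join(self._parts)


query = QueryBuilder().select("name").from_table("users").build()
print(query)  # SELECT name FROM users


def pipe(*functions: Callable[[int], int]) -> Callable[[int], int]:
    """Compose functions so that they are applied from left to right."""
    return lambda value: reduce(lambda acc, f: f(acc), functions, value)


add_one_then_double = pipe(lambda x: x + 1, lambda x: x * 2)
print(add_one_then_double(3))  # 8
```

The builder chains methods on one object, while pipe chains independent functions; in both cases the result of one step is handed to the next, reading from left to right.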

MapReduce

MapReduce - Pipeline

Library

Documentation / Reference







