(Relation|Table) - Tabular data
About
A relation is a logical data structure composed of rows (tuples) and columns (attributes).
The following data structures are relations:
- a table or a materialized view (a stored query), both of which store data
A relation also models either:
- an entity
- or a relationship,
but not both.
In the ISO SQL standard, a relation is a collection of zero or more rows, where each row is a sequence of one or more column values.
A relation is a bag (multiset) of tuples, that is, rows whose columns may each have a different SQL data type. It is not strictly a set, because a set does not allow duplicates whereas a multiset (bag) does.
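The difference between a bag and a set can be sketched in a few lines of Java. This is only an illustration under simple assumptions: a tuple is modeled as an immutable list of column values, and the names BagVersusSet, row, sampleRelation and distinct are hypothetical helpers, not part of any library.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BagVersusSet {
    // A tuple is modeled as an immutable list of column values.
    // List.of rejects null, so this sketch ignores SQL NULL.
    static List<Object> row(Object... values) {
        return List.of(values);
    }

    // A relation as a bag: a plain list keeps duplicate rows.
    static List<List<Object>> sampleRelation() {
        List<List<Object>> rows = new ArrayList<>();
        rows.add(row("Alice", 30));
        rows.add(row("Bob", 25));
        rows.add(row("Alice", 30)); // duplicate row, allowed in a bag
        return rows;
    }

    // Converting to a set collapses duplicates, similar to SELECT DISTINCT.
    static Set<List<Object>> distinct(List<List<Object>> rows) {
        return new HashSet<>(rows);
    }

    public static void main(String[] args) {
        List<List<Object>> bag = sampleRelation();
        System.out.println("rows in bag: " + bag.size());           // prints rows in bag: 3
        System.out.println("rows in set: " + distinct(bag).size()); // prints rows in set: 2
    }
}
```

The list keeps all three rows, while the set keeps only the two distinct ones.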
Schema
The schema of a relation is its name and its columns, along with all their attributes such as the data type. Because the schema is itself stored as a relation, you can query it.
For more, see SQL - Schema (Metadata).
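As a rough sketch of the idea that a schema is itself a relation, the Java snippet below stores column metadata as rows and filters them like any other data. The row layout (table name, column name, data type) is loosely modeled on the standard INFORMATION_SCHEMA.COLUMNS view but heavily simplified, and the class, method, and sample table names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SchemaAsRelation {
    // Each catalog row is (table_name, column_name, data_type).
    // The sample tables and columns are made up for illustration.
    static List<String[]> catalog() {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"people", "name", "VARCHAR"});
        rows.add(new String[] {"people", "age", "INTEGER"});
        rows.add(new String[] {"orders", "id", "INTEGER"});
        return rows;
    }

    // Similar in spirit to:
    // SELECT column_name FROM catalog WHERE table_name = ?
    static List<String> columnsOf(String tableName) {
        List<String> names = new ArrayList<>();
        for (String[] row : catalog()) {
            if (row[0].equals(tableName)) {
                names.add(row[1]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(columnsOf("people")); // prints [name, age]
    }
}
```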
Equality
We say that R1 = R2 if and only if we can guarantee that the bag of tuples (rows) produced by R1 is the same as the bag of tuples produced by R2.
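Bag equality can be sketched by comparing how many times each row occurs, ignoring row order. The following Java example is a minimal illustration, assuming rows are immutable lists of non-null values; BagEquality, row, multiplicities and bagEquals are hypothetical names. It compares two concrete results only, whereas the definition above requires the guarantee to hold for every possible input.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BagEquality {
    // A tuple as an immutable list of column values (no SQL NULL in this sketch).
    static List<Object> row(Object... values) {
        return List.of(values);
    }

    // Count how many times each row occurs (its multiplicity in the bag).
    static Map<List<Object>, Integer> multiplicities(List<List<Object>> rows) {
        Map<List<Object>, Integer> counts = new HashMap<>();
        for (List<Object> r : rows) {
            counts.merge(r, 1, Integer::sum);
        }
        return counts;
    }

    // Two bags are equal when every row has the same multiplicity in both,
    // regardless of the order in which the rows were produced.
    static boolean bagEquals(List<List<Object>> r1, List<List<Object>> r2) {
        return multiplicities(r1).equals(multiplicities(r2));
    }

    public static void main(String[] args) {
        List<List<Object>> r1 = List.of(row(1, "a"), row(2, "b"), row(1, "a"));
        List<List<Object>> r2 = List.of(row(1, "a"), row(1, "a"), row(2, "b"));
        List<List<Object>> r3 = List.of(row(1, "a"), row(2, "b"));
        System.out.println(bagEquals(r1, r2)); // prints true: same rows, different order
        System.out.println(bagEquals(r1, r3)); // prints false: the duplicate row is missing
    }
}
```

Comparing with a set instead would wrongly report the first and third relations as equal, because the set would discard the duplicate row.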
Implementation
JDBC Rowset
Swing JTable
Guava Table
Spark Data Frame
A Spark DataFrame is a distributed collection of data organized into named columns.
Example:
people.col("age").plus(10); // in Java
Pandas DataFrame
A pandas DataFrame is a 2-dimensional labeled data structure with columns of potentially different types.
You can think of it like a spreadsheet or SQL table, or a dict of Series objects.
DataFrame accepts many different kinds of input:
- Dict of 1D ndarrays, lists, dicts, or Series
- 2-D numpy.ndarray
- Structured or record ndarray
- A Series
- Another DataFrame
- A sequence of (key, value) pairs
- pandas.read_csv, pandas.read_table, pandas.read_clipboard (tab)
R DataFrame
A data frame is a matrix-like structure whose columns may be of differing types (numeric, logical, factor, character, and so on).
In other words, a data frame is a collection of data organized into named columns of different data types.
Derby
In java\client\org\apache\derby\client\am\Cursor.java, the Derby client holds the row data in a byte array.
//-------------Structures for holding and scrolling the data -----------------
public byte[] dataBuffer_;
public ByteArrayOutputStream dataBufferStream_;
public int position_; // This is the read head
public int lastValidBytePosition_;
public boolean hasLobs_; // is there at least one LOB column?
// Current row positioning
protected int currentRowPosition_;
private int nextRowPosition_;
// Let's new up a 2-dimensional array based on fetch-size and reuse so that
protected int[] columnDataPosition_;
// This is the actual, computed lengths of varchar fields, not the max length from query descriptor or DA
protected int[] columnDataComputedLength_;
// populate this for
Engine:
- Types: the org.apache.derby.iapi.types.DataType interface; see also all the SQL type implementations (SQLBinary, SQLBit, SQLBlob, …)
- ResultSet Interface
Java
- https://github.com/martincooper/java-datatable - Immutable table implementation
- https://github.com/jtablesaw/tablesaw - In-memory, column-oriented table implementation (with visualization)