About
A relation is a logical data structure composed of tuples (rows) that share the same attributes (columns).
The following data structures are relations:
- a table or a materialized view (a query whose result is stored); both store data
A relation also models either:
- an entity
- or a relationship,
but not both.
In the ISO SQL standard, a relation is a collection of zero or more rows, where each row is a sequence of one or more column values.
A relation is a bag (multiset) of tuples (i.e. rows whose columns may have different SQL data types). It is not strictly a set, because a set does not allow duplicates whereas a multiset (bag) does.
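The difference between a bag and a set can be shown with a small sketch. It assumes an in-memory SQLite database as a stand-in SQL engine and a hypothetical `person` table; neither comes from the text above.

```python
import sqlite3
from collections import Counter

# In-memory SQLite database, used here only as a convenient SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")  # hypothetical table
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Ann", 30), ("Bob", 25), ("Ann", 30)],  # the ("Ann", 30) tuple is inserted twice
)

# Each row is a tuple whose columns have different types (text, integer).
rows = conn.execute("SELECT name, age FROM person").fetchall()

bag = Counter(rows)   # multiset: remembers how many times each tuple occurs
as_set = set(rows)    # set: collapses the duplicate tuple

print(len(rows), bag[("Ann", 30)], len(as_set))
```

The query returns the duplicate row twice, which a set cannot represent; `SELECT DISTINCT` would be needed to get set semantics.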
Schema
The schema of a relation is its name and its columns, along with their attributes such as the data type. Because the schema is itself stored as relations, you can query it.
For more, see SQL - Schema (Metadata).
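As a minimal sketch of querying the schema, the example below assumes SQLite, which exposes its catalog through the `sqlite_master` table and the `PRAGMA table_info` statement. Other databases use different catalog relations (for example `information_schema`), and the `person` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")  # hypothetical table

# The catalog is itself a relation: list the tables with an ordinary SELECT.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(person)")]

print(table_names)
print(columns)
```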
Equality
We say that R1 = R2 if and only if we can guarantee that the bag of tuples (rows) produced by R1 is the same as the bag of tuples produced by R2.
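Bag equality can be illustrated on two concrete results. This sketch only compares two given lists of tuples; it does not prove that two queries are equivalent for every possible database, which is what the definition above requires. The function name `bag_equal` and the sample rows are hypothetical.

```python
from collections import Counter


def bag_equal(r1, r2):
    """Return True when both relations hold the same tuples with the same multiplicities."""
    return Counter(r1) == Counter(r2)


r1 = [("Ann", 30), ("Bob", 25), ("Ann", 30)]
r2 = [("Bob", 25), ("Ann", 30), ("Ann", 30)]  # same bag, different row order
r3 = [("Ann", 30), ("Bob", 25)]               # one duplicate missing

print(bag_equal(r1, r2))   # row order does not matter
print(bag_equal(r1, r3))   # multiplicity does matter
print(set(r1) == set(r3))  # a set comparison would miss the difference
```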
Implementation
JDBC RowSet
Logical:
Swing JTable
Guava Table
Spark DataFrame
A Spark DataFrame is a distributed collection of data organized into named columns.
Example:
people.col("age").plus(10); // in Java
Pandas DataFrame
A pandas DataFrame is a 2-dimensional labeled data structure with columns of potentially different types.
You can think of it like a spreadsheet or SQL table, or a dict of Series objects.
DataFrame accepts many different kinds of input:
- Dict of 1D ndarrays, lists, dicts, or Series
- 2-D numpy.ndarray
- Structured or record ndarray
- A Series
- Another DataFrame
- sequence of (key, value) pairs
- pandas.read_csv, pandas.read_table, pandas.read_clipboard (tab)
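The "dict of lists" view of a data frame can be sketched in plain Python, without pandas. This is only an illustration of the column-oriented idea, not the pandas API; the helper `rows_to_columns` and the sample data are hypothetical.

```python
def rows_to_columns(column_names, rows):
    """Turn a list of row tuples into a dict mapping column name -> list of values."""
    columns = {name: [] for name in column_names}
    for row in rows:
        if len(row) != len(column_names):
            raise ValueError("row length does not match the number of columns")
        for name, value in zip(column_names, row):
            columns[name].append(value)
    return columns


# Two named columns of different types: text and integer.
people = rows_to_columns(["name", "age"], [("Ann", 30), ("Bob", 25)])

# A column-wise operation touches a single column only.
ages_plus_ten = [age + 10 for age in people["age"]]

print(people)
print(ages_plus_ten)
```

Storing the data by column makes column-wise operations straightforward, which is the access pattern data frame libraries are built around.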
R DataFrame
A data frame is a matrix-like structure whose columns may be of differing types (numeric, logical, factor, character, and so on).
In other words, a data frame is a collection of data organized into named columns of different data types.
Derby
In java\client\org\apache\derby\client\am\Cursor.java, Derby holds the row data in a byte array:
//-------------Structures for holding and scrolling the data -----------------
public byte[] dataBuffer_;
public ByteArrayOutputStream dataBufferStream_;
public int position_; // This is the read head
public int lastValidBytePosition_;
public boolean hasLobs_; // is there at least one LOB column?
// Current row positioning
protected int currentRowPosition_;
private int nextRowPosition_;
// Let's new up a 2-dimensional array based on fetch-size and reuse so that
protected int[] columnDataPosition_;
// This is the actual, computed lengths of varchar fields, not the max length from query descriptor or DA
protected int[] columnDataComputedLength_;
// populate this for
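The general idea of one byte buffer plus per-column positions and lengths can be sketched as follows. This is a simplified illustration in Python and an assumption about the principle only; it is not Derby's actual buffer layout or decoding logic, and the variable names merely echo the Java fields above.

```python
# Hypothetical row with two variable-length text columns packed back to back.
values = ["Ann", "Amsterdam"]
data_buffer = b"".join(v.encode("utf-8") for v in values)

column_data_position = []         # start offset of each column in the buffer
column_data_computed_length = []  # actual byte length of each column value
offset = 0
for v in values:
    encoded_length = len(v.encode("utf-8"))
    column_data_position.append(offset)
    column_data_computed_length.append(encoded_length)
    offset += encoded_length


def read_column(index):
    """Decode one column by slicing the shared buffer with its offset and length."""
    start = column_data_position[index]
    end = start + column_data_computed_length[index]
    return data_buffer[start:end].decode("utf-8")


print(read_column(0), read_column(1))
```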
Engine:
- Types: the org.apache.derby.iapi.types.DataType interface; see also the SQL type implementations (SQLBinary, SQLBit, SQLBlob, …)
- ResultSet Interface
Java
- https://github.com/martincooper/java-datatable - immutable table implementation
- https://github.com/jtablesaw/tablesaw - in-memory column-store implementation (with visualisation)