What is unstructured data? Also known as structure-later, schema-later, or schema-on-read
About
With schema-later data (such as semi-structured data), the schema is applied after the data is read, not when it is written.
The knowledge of the schema is delegated to the code that reads the data.
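To make this concrete, here is a minimal sketch of schema-on-read in Python. The records, the field names, and the `SCHEMA` mapping are all hypothetical and chosen for illustration only. The point is that the stored lines carry no type information and the reading code supplies it.

```python
import json

# Raw records exactly as they were written: nothing enforced a schema
# at write time. Field names and values are made up for illustration.
RAW_LINES = [
    '{"user": "alice", "age": "34", "visits": "7"}',
    '{"user": "bob", "age": "41"}',
]

# Hypothetical schema owned by the reader: field -> (converter, default).
SCHEMA = {
    "user": (str, ""),
    "age": (int, None),
    "visits": (int, 0),
}


def read_with_schema(line, schema):
    """Parse one raw line and apply the schema at read time."""
    raw = json.loads(line)
    record = {}
    for field, (convert, default) in schema.items():
        value = raw.get(field)
        # Missing fields fall back to the reader's default.
        record[field] = convert(value) if value is not None else default
    return record


records = [read_with_schema(line, SCHEMA) for line in RAW_LINES]
```

A different reader could apply a different schema to the same raw lines, which is the flexibility (and the risk) of deferring the schema to read time.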
Unstructured (schema never)
Structured data is organized into a schema, such as a table of rows and columns, whereas unstructured data has no predefined schema and therefore does not fit well into the relational model. Unstructured data is typically text found in various forms.
Unstructured data is typically longer and more verbose than structured data. This verbosity can lead to a loss of context when viewing results. A search over unstructured data explores a number of facets and attributes, not just a single one. Unstructured data is also often geared towards concepts rather than numbers.
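The idea of exploring several facets of the same text can be sketched as follows. This is only an illustration under stated assumptions: the email body is made up, and the `extract_facets` helper and its three facets (addresses, ISO-style dates, a keyword flag) are hypothetical choices, not a standard method.

```python
import re

# A made-up free-form email used only for illustration.
EMAIL = """From: alice@example.com
Subject: Quarterly report

Hi Bob, the quarterly report is due on 2024-03-15.
Please send comments to carol@example.com before then."""


def extract_facets(text):
    """Pull several facets out of the same piece of free text."""
    return {
        # Simplified address pattern, not a full RFC 5322 matcher.
        "emails": sorted(set(re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", text))),
        # Dates written as YYYY-MM-DD only.
        "dates": re.findall(r"\b\d{4}-\d{2}-\d{2}\b", text),
        # A concept-style facet: does the text talk about a report?
        "mentions_report": "report" in text.lower(),
    }


facets = extract_facets(EMAIL)
```

Each facet is derived by the reading code. None of them exists as a column or field in the stored text.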
Examples of unstructured data containers
- Email
- Documents
- Extensible Markup Language (XML) content
- Presentations
- Web content
- Free-form events
- Social media