Collation - String comparison


About

Collation is a general term for the process and function of determining the sorting order of strings of characters (in other words, how a string comparison is performed).

Collation implementations must deal with the complex linguistic conventions for ordering text in specific languages, and provide for common customizations based on user preferences.

Collation can be based on:

  • character position (that varies according to language and culture: Germans, French and Swedes sort the same characters differently)
  • application dictionaries (that may sort differently than phonebooks or book indices)
  • the phonetics or appearance of the character (for non-alphabetic scripts such as East Asian ideographs)
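
As a minimal sketch of how the same strings can sort differently by language, the example below orders three words with two hand-written keys. The names `SWEDISH_ALPHABET`, `swedish_key` and `german_key` are hypothetical, and both keys are simplified assumptions (Swedish placing å, ä, ö after z; German dictionary order treating ä like a), not full collation rules.

```python
import unicodedata

# Assumption: simplified Swedish alphabet, with å, ä, ö placed after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
SWEDISH_RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def swedish_key(word):
    """Rank each letter by its position in the simplified Swedish alphabet."""
    folded = unicodedata.normalize("NFC", word.casefold())
    # Unknown characters sort after every known letter.
    return [SWEDISH_RANK.get(ch, len(SWEDISH_ALPHABET)) for ch in folded]


def german_key(word):
    """Treat accented letters like their base letter (dictionary-style)."""
    folded = unicodedata.normalize("NFC", word.casefold())
    decomposed = unicodedata.normalize("NFD", folded)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The folded word breaks ties between words with the same base letters.
    return (base, folded)


words = ["zon", "ärm", "arm"]
swedish_order = sorted(words, key=swedish_key)  # ['arm', 'zon', 'ärm']
german_order = sorted(words, key=german_key)    # ['arm', 'ärm', 'zon']
```

Real applications would normally rely on a collation library or the database's collation rather than hand-written tables like these.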

Usage

  • sorting a list of strings
  • sorting records in a database (i.e. order by)
  • selecting sets of records with fields within given bounds
  • search. For instance, “v” and “w” sort as if they were the same base letter in Swedish, so a loose search should pick up words with either one of them.
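
The usages above can be sketched with a comparison key. In this hedged example, `sort_key`, `select_between`, `loose_key` and `loose_search` are hypothetical helpers, and a plain case-insensitive key stands in for a real collation.

```python
from bisect import bisect_left, bisect_right


def sort_key(text):
    """Assumed stand-in for a collation: case-insensitive comparison."""
    return text.casefold()


names = ["delta", "Bravo", "alpha", "Charlie", "echo"]

# Sorting a list of strings: code point order puts uppercase first,
# the key-based order ignores case.
by_code_point = sorted(names)          # ['Bravo', 'Charlie', 'alpha', 'delta', 'echo']
ordered = sorted(names, key=sort_key)  # ['alpha', 'Bravo', 'Charlie', 'delta', 'echo']


def select_between(sorted_names, low, high):
    """Select the entries whose key lies within [low, high]."""
    keys = [sort_key(name) for name in sorted_names]
    start = bisect_left(keys, sort_key(low))
    end = bisect_right(keys, sort_key(high))
    return sorted_names[start:end]


def loose_key(text):
    """Loose search key: treat 'w' as the same base letter as 'v'."""
    return text.casefold().replace("w", "v")


def loose_search(query, candidates):
    """Return the candidates that start with the query under the loose key."""
    wanted = loose_key(query)
    return [c for c in candidates if loose_key(c).startswith(wanted)]


in_range = select_between(ordered, "b", "d")             # ['Bravo', 'Charlie']
found = loose_search("va", ["Wagner", "Vagn", "Berg"])   # ['Wagner', 'Vagn']
```

The range selection depends on the list being sorted with the same key that is used for the bounds. This mirrors how a database needs one consistent collation for both ordering and range predicates.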

Parameters

A collation will compare strings according to:

  • the locale of your application (country/language)
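  • case sensitivity:
    • whether a equals A
    • putting uppercase before lowercase (or vice versa)
  • accent sensitivity: when accent sensitive, 'é' does not equal 'e'
  • ignoring punctuation or not
  • width sensitivity (WS): whether full-width and half-width forms of the same character are distinct
  • kana type sensitivity (KS): whether the Japanese Hiragana and Katakana forms of a character are distinct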
  • and any other application property that you can think of
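
Several of these parameters can be approximated with the Python standard library. The function below is a minimal sketch, not a full collation: `collation_key` and its flags are hypothetical names, Unicode normalization is used as an assumed stand-in for accent and width folding, and locale, kana type and uppercase-first ordering are not covered.

```python
import unicodedata


def collation_key(text, case_sensitive=False, accent_sensitive=False,
                  width_sensitive=False, ignore_punctuation=True):
    """Build a comparison key from a few collation-style switches."""
    if not width_sensitive:
        # NFKC maps full-width forms to their regular counterparts
        # (it also folds other compatibility characters).
        text = unicodedata.normalize("NFKC", text)
    if not accent_sensitive:
        # Decompose, then drop the combining marks (the accents).
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    if ignore_punctuation:
        # Unicode general categories starting with "P" are punctuation.
        text = "".join(ch for ch in text
                       if not unicodedata.category(ch).startswith("P"))
    if not case_sensitive:
        text = text.casefold()
    return text


# Accent and case insensitive by default: 'É' compares equal to 'e'.
same_letter = collation_key("É") == collation_key("e")  # True

# Accent sensitive: 'é' no longer equals 'e'.
distinct = (collation_key("é", accent_sensitive=True)
            != collation_key("e", accent_sensitive=True))  # True
```

Two strings compare as equal under the chosen parameters when their keys are equal, and `sorted(values, key=collation_key)` orders a list with the same settings.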

Algorithm




